1 Introduction

The name “statistical geometry” is relatively new, although this geometry has existed for a long time and in various guises. From a geometric viewpoint, a statistical structure is nothing but a Codazzi pair. Starting from locally strongly convex hypersurfaces in the Euclidean space, through locally strongly convex equiaffine hypersurfaces in the affine space \({\mathbf {R}}^{n+1}\) and Lagrangian submanifolds in complex space forms, to Hessian manifolds—all these examples are statistical manifolds. Note that the structures in these subclasses have very different properties and, therefore, the intersections of the subclasses are small. For instance, if a statistical structure can be realized both on a Lagrangian submanifold and on an affine hypersurface, then its Riemannian sectional curvature, its \(\nabla \)-sectional curvature, and its K-sectional curvature are all constant. Note also that the category of statistical structures is much larger than the union of all the specific subclasses mentioned above. In particular, results proved for affine hypersurfaces or for Lagrangian submanifolds are rarely generalizable to the general case of statistical structures.

The aim of this paper is to prove some global and local theorems for statistical manifolds in this general setting, with frequent references to affine hypersurfaces and Lagrangian submanifolds.

By a statistical structure, we mean a pair \((g,\nabla )\), where g is a Riemannian metric tensor field and \(\nabla \) is a torsion-free affine connection such that the cubic form \(\nabla g\) is symmetric. Let \(A=-\frac{1}{2}\nabla g\) and \(\tau \) be the Czebyshev form defined as \(\tau (X)=\mathrm{tr}\, _g A(X,\cdot ,\cdot )\). Set \(K=A^\sharp \). The Levi-Civita connection for g will be denoted by \({\hat{\nabla }}\). \({\hat{R}}\) and \({\widehat{\mathrm{Ric}\,}}\) will stand for the corresponding curvature and Ricci tensors. A statistical structure is called trace-free if \(\tau =0\). If \({\hat{\nabla }} A\) is symmetric, then the statistical structure is called conjugate symmetric. Basic information on statistical structures, affine hypersurfaces, and Lagrangian submanifolds, needed in this paper, is contained in Sect. 2.

In Sect. 3, we prove some algebraic formulas for \(\Vert A\Vert \), \(\Vert \tau \Vert \), \(\tau \), and K. These formulas yield inequalities for the Ricci tensors and scalar curvatures of statistical structures.

The main part of this paper deals with the following Simons’ formula

$$\begin{aligned} \frac{1}{2}\Delta (\Vert s\Vert ^2)=g(\Delta s,s) +\Vert {\hat{\nabla }} s\Vert ^2, \end{aligned}$$
(1)

where s is a tensor field and \(\Delta s\) is a specially defined Laplacian acting on s. The first assumption needed for computing the term \(g(\Delta s,s)\) is the symmetry of \(\hat{\nabla }s\). Therefore, when we apply this formula to the cubic form A, we shall assume that the statistical structure is conjugate symmetric. Depending on which properties of statistical structures are used, this term can be computed in several ways. For example, one gets the following equality, see Theorem 4.15,

$$\begin{aligned} \frac{1}{2}\Delta (\Vert A\Vert ^2)=\Vert \hat{\nabla }A\Vert ^2+ g(\hat{\nabla }^2 \tau ,A)+g({\hat{R}}-R,{\hat{R}})+g(\widehat{\mathrm{Ric}\,},g(K_\cdot ,K_\cdot )), \end{aligned}$$

where R is the curvature tensor of \(\nabla \). Computing this term in another way one gets the following inequalities valid for conjugate symmetric trace-free statistical structures, see (85),

$$\begin{aligned} \frac{1}{2}\Delta \Vert A\Vert ^2&\ge (n+1)H\Vert A\Vert ^2 +\frac{n+1}{n(n-1)}\Vert A\Vert ^4+\Vert {\hat{\nabla }} A\Vert ^2,\\ \frac{1}{2}\Delta \Vert A\Vert ^2&\le (n+1)H\Vert A\Vert ^2+\frac{3}{2}\Vert A\Vert ^4+\Vert {\hat{\nabla }} A\Vert ^2 \end{aligned}$$
(2)

in the case where \(R=HR_0\), \(R_0(X,Y)Z=g(Y,Z)X-g(X,Z)Y\), and \(\dim M=n\). In Sect. 5, we employ (2) for proving some estimates for the functions \(\Vert A\Vert ^2\), \(\Vert {\hat{\nabla }} A\Vert ^2\) in the case where g is complete. A model theorem here is the following result due to Calabi.

Theorem 1.1

For a complete hyperbolic affine Blaschke sphere, the Ricci tensor of g is negative semi-definite and, consequently,

$$\begin{aligned} \Vert A\Vert ^2\le -\rho , \end{aligned}$$
(3)

where \(\rho \) is the affine scalar curvature.

In applications of Simons’ formula (1), the term \(\Vert {\hat{\nabla }} s\Vert ^2\) is usually only estimated from below by zero. We propose new estimates involving also \(\inf \Vert \hat{\nabla }A\Vert ^2\) and \(\sup \Vert \hat{\nabla }A\Vert ^2\), see Theorems 5.5, 5.7, and 5.8. In the proofs of these theorems, we use Yau’s maximum principle. For example, we prove (see Theorem 5.7) that if a trace-free statistical structure with complete metric on an n-dimensional manifold satisfies \(R=HR_0\), where H is a negative number, then

$$\begin{aligned} \inf \, \Vert {\hat{\nabla }} A\Vert ^2\le \frac{n(n^2-1)H^2}{4}. \end{aligned}$$

In the last section, we prove a theorem on conjugate symmetric statistical structures with non-negative sectional curvature of the metric g. The theorem generalizes a theorem known for minimal Lagrangian submanifolds, see [11].

2 Preliminaries

In this section, we introduce the notations and collect basic information on the geometry of statistical structures, affine hypersurfaces, and Lagrangian submanifolds. All details for this section can be found in [4, 12, 13].

In this paper, we consider only torsion-free connections. Where needed, manifolds are assumed to be connected and orientable. If g is a metric tensor field (positive definite) on a manifold M, then \(\mathrm{div}\,\) will stand for the divergence relative to the Levi-Civita connection \(\hat{\nabla }\) of g. In particular, if s is a tensor field of type (1, k), then \(\mathrm{div}\,s\) is a tensor field of type (0, k) defined as

$$\begin{aligned} (\mathrm{div}\,s)(X_1,\ldots ,X_k)=\mathrm{tr}\, \{Y\rightarrow (\hat{\nabla }_Ys )(X_1,\ldots ,X_k)\}. \end{aligned}$$
(4)

We shall use the following Laplacian relative to g acting on tensor fields (usually symmetric) of type (0, k) for various values of k. Namely, we set

$$\begin{aligned} (\Delta s)(X_1,\ldots ,X_k)=\mathrm{tr}\, _g(\hat{\nabla }^2s)(\cdot ,\cdot , X_1,\ldots ,X_k)=\sum _i(\hat{\nabla }^2s)(e_i,e_i, X_1,\ldots ,X_k), \end{aligned}$$
(5)

where \(e_1,\ldots ,e_n\) is an orthonormal basis of \(T_xM\) and \( X_1,\ldots ,X_k\in T_xM\), \(x\in M\). Moreover, \(\hat{\nabla }^2 s=\hat{\nabla }(\hat{\nabla }s)\), where \(\hat{\nabla }s\) is a tensor field of type \((0,k+1)\) given by the formula

$$\begin{aligned} \hat{\nabla }s(X,\ldots )=(\hat{\nabla }_{X} s)(\ldots ). \end{aligned}$$

In the case of differential forms, the above Laplacian is not the Hodge Laplacian (defined as \(d\delta +\delta d\)). The two Laplacians are related by the Weitzenböck formulas, see, e.g., (61). In this paper, the codifferential \(\delta \) of a differential k-form \(\omega \) is defined as

$$\begin{aligned} \delta \omega (X_1,\ldots ,X_{k-1})=+ \mathrm{tr}\, _g(\hat{\nabla }\omega )(\cdot ,\cdot , X_1,\ldots ,X_{k-1}). \end{aligned}$$
(6)

Because of these conventions, Simons’ formula takes the form (1). The Laplacian defined in (5) is usually denoted by \(\hat{\nabla }^*\hat{\nabla }\), but we shall use the symbol \(\Delta \) for simplicity of notation.

2.1 Statistical Structures

By a statistical structure on a manifold M, we mean a pair \((g,\nabla )\), where g is a metric tensor and \(\nabla \) is a torsion-free connection on M such that the cubic form \(\nabla g\) is symmetric. The difference tensor defined by

$$\begin{aligned} K(X,Y)=K_XY=\nabla _XY-\hat{\nabla }_XY \end{aligned}$$
(7)

is symmetric and symmetric relative to g. A statistical structure is called trivial if \(\nabla ={\hat{\nabla }}\), that is, \(K\equiv 0\). It is clear that a statistical structure can also be defined as a pair (g, K), where K is a (1, 2)-tensor field which is symmetric and symmetric relative to g. Namely, the connection \(\nabla \) is then defined by (7). Alternatively, instead of K, one can use the symmetric cubic form \(A(X,Y,Z)=g(K(X,Y),Z)\). The cubic forms \(\nabla g\) and A are related as follows:

$$\begin{aligned} \nabla g=-2A. \end{aligned}$$
(8)

A statistical structure is called trace-free if \(E:=\mathrm{tr}\, _gK=0\). Having a metric tensor g and a connection \(\nabla \) on a manifold M, one defines a conjugate connection \(\overline{\nabla }\) by the formula

$$\begin{aligned} g(\nabla _XY,Z)+g(Y,\overline{\nabla }_XZ)=Xg(Y,Z). \end{aligned}$$
(9)

A pair \((g,\nabla )\) is a statistical structure if and only if \((g,\overline{\nabla })\) is. The pairs are also simultaneously trace-free because if K is the difference tensor for \((g,\nabla )\), then \(-K\) is the difference tensor for \(\overline{\nabla }\).
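The claim about the difference tensor can be checked directly. Subtracting from (9) the metric identity \(g(\hat{\nabla }_XY,Z)+g(Y,\hat{\nabla }_XZ)=Xg(Y,Z)\), we get

$$\begin{aligned} g(K_XY,Z)+g(Y,\overline{\nabla }_XZ-\hat{\nabla }_XZ)=0, \end{aligned}$$

and since the cubic form \(A(X,Y,Z)=g(K(X,Y),Z)\) is totally symmetric, it follows that \(\overline{\nabla }_XZ-\hat{\nabla }_XZ=-K_XZ\), that is, the difference tensor of \((g,\overline{\nabla })\) is \(-K\).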

The curvature tensors for \(\nabla \), \(\overline{\nabla }\), \(\hat{\nabla }\) will be denoted by R, \({\overline{R}}\) and \({\hat{R}}\) respectively. The corresponding Ricci tensors will be denoted by \(\mathrm{Ric}\,\), \({\overline{\mathrm{Ric}\,}}\), and \({\widehat{\mathrm{Ric}\,}}\). In general, the curvature tensor R does not satisfy the equality \(g(R(X,Y)Z,W)=-g(R(X,Y)W,Z)\). If it does, we say that the statistical structure is conjugate symmetric. For a statistical structure, we always have

$$\begin{aligned} g(R(X,Y)Z,W)=-g({\overline{R}}(X,Y)W,Z). \end{aligned}$$
(10)

The following conditions are equivalent: (1) \(R={\overline{R}}\), (2) \(\hat{\nabla }K\) is symmetric (equivalently, \({\hat{\nabla }} A\) is symmetric), (3) \(g(R(X,Y)Z,W)\) is skew-symmetric relative to Z, W.

Hence, each of the above three conditions characterizes a conjugate symmetric statistical structure. The above equivalences follow from the following well-known formula

$$\begin{aligned} R(X,Y)={\hat{R}}(X,Y) +(\hat{\nabla }_XK)_Y-(\hat{\nabla }_YK)_X+[K_X,K_Y]. \end{aligned}$$
(11)

Writing the same equality for \(\overline{\nabla }\) and adding both equalities, we get

$$\begin{aligned} R+{\overline{R}} =2{\hat{R}} +2[K,K] ,\end{aligned}$$
(12)

where [K, K] is a (1, 3)-tensor field defined as \([K,K](X,Y)Z=[K_X,K_Y]Z\). In particular, if \(R={\overline{R}}\), then

$$\begin{aligned} R={\hat{R}} +[K,K]. \end{aligned}$$
(13)
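For completeness, let us note how (12) arises: since the difference tensor of \(\overline{\nabla }\) is \(-K\), formula (11) written for \(\overline{\nabla }\) reads

$$\begin{aligned} {\overline{R}}(X,Y)={\hat{R}}(X,Y) -(\hat{\nabla }_XK)_Y+(\hat{\nabla }_YK)_X+[K_X,K_Y]. \end{aligned}$$

Adding this to (11) gives (12), whereas subtracting it gives \(R(X,Y)-{\overline{R}}(X,Y)=2(\hat{\nabla }_XK)_Y-2(\hat{\nabla }_YK)_X\), which makes the equivalence of \(R={\overline{R}}\) with the symmetry of \(\hat{\nabla }K\) evident.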

The (1, 3)-tensor field [K, K] is a curvature tensor, that is, it satisfies the first Bianchi identity and has the same symmetries as the Riemannian curvature tensor. Define its Ricci tensor \(\mathrm{Ric}\,^K\) as follows: \(\mathrm{Ric}\,^K(Y,Z)=\mathrm{tr}\, \{X\rightarrow [K,K](X,Y)Z\}\). In [13], the following formula was proved

$$\begin{aligned} \mathrm{Ric}\,^K (Y,Z)=\tau (K(Y,Z))-g(K_Y,K_Z), \end{aligned}$$
(14)

where \(\tau \) is the Czebyshev 1-form defined as \(\tau (X)=\mathrm{tr}\, K_X=g(E,X)\). From (11) we obtain, see [13],

$$\begin{aligned} \mathrm{Ric}\,={\widehat{\mathrm{Ric}\,}}+\mathrm{div}\,K-{\hat{\nabla }} \tau +\mathrm{Ric}\,^K. \end{aligned}$$
(15)

The 1-form \(\tau \) is closed (i.e., \({\hat{\nabla }}\tau \) is symmetric) if and only if the Ricci tensor \(\mathrm{Ric}\,\) is symmetric. For a conjugate symmetric statistical structure, the Ricci tensor is symmetric. More generally, if for a statistical structure \(\mathrm{Ric}\,={\overline{\mathrm{Ric}\,}}\), then \(d\tau =0\). Indeed, by writing (15) for \({\overline{\nabla }}\) and comparing with (15), we see that \(\mathrm{Ric}\,={\overline{\mathrm{Ric}\,}}\) if and only if \(\mathrm{div}\,K={\hat{\nabla }}\tau \). Therefore, if \(\mathrm{Ric}\,={\overline{\mathrm{Ric}\,}}\), then \(d\tau =0\) (because \(\mathrm{div}\,K\) is symmetric). Using (15) and the analogous formula for the connection \({\overline{\nabla }}\), one also gets

$$\begin{aligned} \mathrm{Ric}\,+\overline{\mathrm{Ric}\,} =2\widehat{\mathrm{Ric}\,}+2\tau \circ K-2g(K_\cdot ,K_\cdot ). \end{aligned}$$
(16)

In particular, if \((g, \nabla )\) is trace-free, then

$$\begin{aligned} 2\widehat{\mathrm{Ric}\,}\ge \mathrm{Ric}\,+\overline{\mathrm{Ric}\,}.\end{aligned}$$
(17)

Recall now the scalar curvatures for statistical structures. We have the Riemannian scalar curvature \({\hat{\rho }}\) for g and the scalar curvature \(\rho \) for \(\nabla \): \(\rho =\mathrm{tr}\, _g\mathrm{Ric}\,\). It turns out [e.g., by (10)] that if we define the analogous scalar curvature for \(\overline{\nabla }\), then it is equal to \(\rho \). We also have the scalar curvature \(\rho ^K=\mathrm{tr}\, _g \mathrm{Ric}\,^K\). From the above formulas, one easily gets

$$\begin{aligned} \rho ^K&=\Vert E\Vert ^2-\Vert K\Vert ^2, \end{aligned}$$
(18)
$$\begin{aligned} {\hat{\rho }}&=\rho +\Vert K\Vert ^2-\Vert E\Vert ^2. \end{aligned}$$
(19)
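Both formulas can be verified by tracing. Taking \(\mathrm{tr}\, _g\) in (14) gives

$$\begin{aligned} \rho ^K=\sum _i\tau (K(e_i,e_i))-\sum _ig(K_{e_i},K_{e_i})=\tau (E)-\Vert K\Vert ^2=\Vert E\Vert ^2-\Vert K\Vert ^2, \end{aligned}$$

which is (18). Tracing (15) and observing that \(\mathrm{tr}\, _g\mathrm{div}\,K=\mathrm{div}\,E=\mathrm{tr}\, _g({\hat{\nabla }}\tau )\), so that these two terms cancel, we obtain \(\rho ={\hat{\rho }}+\rho ^K\), which is (19).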

Recall now the sectional curvatures in statistical geometry. Of course, we have the usual Riemannian sectional curvature of the metric tensor g. We shall denote it by \({\hat{k}}(\pi )\) if \(\pi \) is a vector plane in a tangent space. In [13] (see also [6]), another sectional curvature was introduced. It was called the sectional \(\nabla \)-curvature. Namely, it was observed that the tensor field \(R+{\overline{R}}\) has all the algebraic properties needed to produce a sectional curvature. However, in general, this sectional curvature does not have the same properties as the Riemannian sectional curvature. For instance, Schur’s lemma does not hold in general. However, it does hold for conjugate symmetric statistical structures. The sectional \(\nabla \)-curvature is defined as follows:

$$\begin{aligned} k(\pi )=\frac{1}{2}g((R+{\overline{R}})(e_1,e_2)e_2,e_1), \end{aligned}$$
(20)

where \(e_1,e_2\) is an orthonormal basis of \(\pi \). Another good curvature tensor, which exists in statistical geometry, is [KK]. It again has all algebraic properties needed to define a sectional curvature. This sectional curvature was introduced and studied in [14]. It was called the sectional K-curvature. If \(\pi \) is a vector plane in a tangent space \(T_xM\), then the sectional K-curvature is equal to \(g([K,K](e_1,e_2)e_2,e_1)\) for any orthonormal basis \(e_1,e_2\) of \(\pi \). Similarly to the case of the sectional \(\nabla \)-curvature, Schur’s lemma holds for conjugate symmetric statistical structures.

A statistical structure is called Hessian if \(\nabla \) is flat, that is, if \(R=0\). As will be noted below, all Hessian structures are locally realizable on parabolic equiaffine spheres. Since \(\mathrm{Ric}\,=0\) for a Hessian structure and \(\nabla \) is torsion-free, the Koszul form \(\beta =\nabla \tau \) is symmetric. A Hessian structure is called Einstein-Hessian if \(\nabla \tau =\lambda g\), see [17]. For a Hessian structure, we have

$$\begin{aligned} \beta = {\hat{\nabla }}\tau -\tau \circ K,\ \ \ \ \mathrm{tr}\, _g\beta = \delta \tau -\Vert \tau \Vert ^2. \end{aligned}$$
(21)

Since \(R=0\) implies \({\overline{R}}=0\) by (10), and hence \(\mathrm{Ric}\,={\overline{\mathrm{Ric}\,}}=0\), formula (16) also gives

$$\begin{aligned} {\widehat{\mathrm{Ric}\,}}=g(K_\cdot ,K_\cdot )-\tau \circ K. \end{aligned}$$
(22)

In geometry, the richest sources of statistical manifolds seem to be the theory of affine hypersurfaces in \({\mathbf {R}}^{n+1}\) and that of Lagrangian submanifolds in complex space forms. We shall briefly recall the basic facts from these theories.

2.2 Equiaffine Locally Strongly Convex Hypersurfaces in \({\mathbf {R}}^{n+1}\)

For the theory of affine hypersurfaces, we refer to [12]. Let \(f:M\rightarrow {\mathbf {R}}^{n+1}\) be an immersed locally strongly convex hypersurface of the affine space \({\mathbf {R}}^{n+1}\). Denote the standard flat affine connection (the operator of the ordinary differentiation) on \({\mathbf {R}}^{n+1}\) by D. Assume that the hypersurface is equipped with a transversal vector field \(\xi \) such that \(D_X\xi \) is tangent to the hypersurface for every \(X\in T_xM\), \(x\in M\). Such a transversal vector field is called equiaffine and a hypersurface endowed with such a transversal vector field is called an equiaffine hypersurface. All hypersurfaces considered in this paper will be equiaffine. By the following formulas of Gauss and Weingarten, one gets the induced connection \(\nabla \), the second fundamental form g, and the shape operator S

$$\begin{aligned} D_Xf_*Y&=f_*(\nabla _XY) +g(X,Y)\xi ,\\ D_X\xi&=-f_*(SX). \end{aligned}$$
(23)

A hypersurface is locally strongly convex if and only if g is definite. If an equiaffine hypersurface is locally strongly convex, the sign of \(\xi \) is chosen in such a way that g is positive definite. It turns out that for an equiaffine hypersurface, the cubic form \(\nabla g\) is symmetric. The symmetry of \(\nabla g\) is the so-called first Codazzi equation for an equiaffine hypersurface. It means that the induced structure \((g,\nabla )\) is a statistical structure. We also have the following Gauss equation

$$\begin{aligned} R(X,Y)Z=g(Y,Z)SX-g(X,Z)SY. \end{aligned}$$
(24)

The so-called Ricci equation for equiaffine immersions states that the Ricci tensor \(\mathrm{Ric}\,\) is symmetric. The symmetry of the Ricci tensor is the first obstruction for a statistical structure to be realizable as the induced structure on a hypersurface in \({\mathbf {R}}^{n+1}\). The second obstruction is that the dual connection \(\overline{\nabla }\) must be projectively flat. Namely, an important theorem on the realizability of statistical structures on equiaffine hypersurfaces is the following fundamental theorem proved in [5]

Theorem 2.1

Let \((g,\nabla )\) be a statistical structure on a simply connected manifold M satisfying the following conditions:

  1. the Ricci tensor of \(\nabla \) is symmetric,

  2. the dual connection \(\overline{\nabla }\) is projectively flat.

Then there is a locally strongly convex immersion \(f:M\rightarrow {\mathbf {R}}^{n+1}\) and its equiaffine transversal vector field \(\xi \) such that \(\nabla \) is the induced connection and g is the second fundamental form for the equiaffine hypersurface \((f,\xi )\).

From now on, we shall always assume that a hypersurface is locally strongly convex without mentioning this each time.

An equiaffine hypersurface is called an equiaffine sphere if the shape operator is a multiple of the identity, i.e., \(S=H \mathrm{id}\,\), where \(\mathrm{id}\,\) is the identity (1, 1)-tensor field on M. In such a case, H must be constant if M is connected. An equiaffine sphere is called elliptic if \(H>0\), hyperbolic if \(H<0\), and parabolic if \(H=0\). For an equiaffine sphere, we have the equality \(R=HR_0\), where \(R_0\) is the curvature tensor defined by \( R_0(X,Y)Z=g(Y,Z)X-g(X,Z)Y. \) It is now clear that the statistical structure of an equiaffine sphere is conjugate symmetric. The converse is also true, that is, if the statistical structure on an equiaffine hypersurface is conjugate symmetric, then the hypersurface must be an equiaffine sphere (Lemma 12.5 in [13]).

It is also known that if we have a statistical structure (on an n-dimensional connected manifold M) whose curvature tensor R satisfies the equality \(R=HR_0\), where H is possibly a function, and \(n>2\), then H must be a constant (Proposition 12.7 in [13]). If \(R=HR_0\), then \(g(R(X,Y)Z,W)\) is skew-symmetric in Z, W; hence \({\overline{R}}=R=HR_0\). If, moreover, H is constant, then \(\overline{\nabla }\) is clearly projectively flat. A statistical structure for which \(R=HR_0\), where H is a constant, was called in [7] a statistical structure of constant curvature. By Theorem 2.1, we now have that statistical structures of constant curvature are (from a local viewpoint) exactly the induced structures on equiaffine spheres. Let us rewrite (13) in the case where \(R=HR_0\):

$$\begin{aligned} HR_0={\hat{R}} +[K,K]. \end{aligned}$$
(25)

Among the equiaffine hypersurfaces, the best known and historically first discovered are those which additionally satisfy the so-called apolarity condition. This condition is equivalent to the trace-freeness of the induced statistical structure (in the terminology we are using in this paper). The importance of this class of equiaffine hypersurfaces follows from the following classical fact: For a locally strongly convex hypersurface, there is a unique (up to a constant multiple) equiaffine transversal vector field for which the induced statistical structure is trace-free. An equiaffine hypersurface whose induced statistical structure is trace-free is called a Blaschke hypersurface. An equiaffine sphere whose induced structure is trace-free is called a Blaschke affine sphere. In contrast to Riemannian geometry, the category of affine spheres is very rich and still not well understood. Trivial Blaschke hypersurfaces are quadrics. In particular, a parabolic affine sphere with vanishing cubic form, that is, whose induced statistical structure is trivial, must be a part of an elliptic paraboloid.

2.3 Lagrangian Submanifolds

We shall now briefly recall some facts and introduce some notations for Lagrangian submanifolds. For this part, we refer to [4]. Let \({\tilde{M}}(4c)\) be a complex space form of holomorphic sectional curvature 4c and let M be its Lagrangian submanifold, that is, M is a real submanifold of \({\tilde{M}}\), \(\dim M=\dim _{\mathbf {C}}{\tilde{M}}\) and the bundle JTM is orthogonal to TM, where J is the complex structure on \({\tilde{M}}\). The induced metric tensor field on M will be denoted by g. If \(\sigma \) is the second fundamental tensor, then we set \(K=J\sigma \). The tensor field K is symmetric and symmetric relative to g. Hence, the pair (g, K) is a statistical structure. We shall call it the induced statistical structure on a Lagrangian submanifold, and for it we shall use the whole apparatus developed above for statistical structures. The Codazzi equation for a Lagrangian submanifold says that the (1, 3)-tensor field \(\hat{\nabla }K\) is symmetric. Hence, the statistical structure on a Lagrangian submanifold of a complex space form is conjugate symmetric. We have the following Gauss equation for a Lagrangian submanifold of \({\tilde{M}}(4c)\)

$$\begin{aligned} cR_0={\hat{R}}-[K,K]. \end{aligned}$$
(26)

This equality is different from the equality (25) holding for equiaffine spheres. By comparing the two equalities, we see that a statistical structure on a Lagrangian submanifold can also be realized (locally) on an equiaffine hypersurface only when the Riemannian sectional curvature of g and the sectional K-curvature are constant.

Instead of the mean curvature vector of a Lagrangian submanifold, one can consider the tangent vector field \(E=\mathrm{tr}\, _gK\). In particular, a Lagrangian submanifold of \({\tilde{M}}(4c)\) is minimal if and only if the induced statistical structure is trace-free. The mean curvature vector \(\mu \) is equal to \(-\frac{JE}{n}\). Hence, it is parallel if and only if \(\hat{\nabla }E=0\). By (26), we have

$$\begin{aligned} {\widehat{\mathrm{Ric}\,}}=c(n-1)g-g(K_\cdot ,K_\cdot )+\tau \circ K \end{aligned}$$
(27)

and

$$\begin{aligned} {\hat{\rho }} =cn(n-1)+\Vert E\Vert ^2-\Vert A\Vert ^2=cn(n-1)+n^2\Vert \mu \Vert ^2-\Vert A\Vert ^2. \end{aligned}$$
(28)
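Formula (28) follows from (27) by taking \(\mathrm{tr}\, _g\):

$$\begin{aligned} {\hat{\rho }}=cn(n-1)-\Vert K\Vert ^2+\tau (E)=cn(n-1)-\Vert A\Vert ^2+\Vert E\Vert ^2, \end{aligned}$$

since \(\Vert K\Vert ^2=\Vert A\Vert ^2\), \(\tau (E)=\Vert E\Vert ^2\), and \(\Vert E\Vert ^2=n^2\Vert \mu \Vert ^2\).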

3 Algebraic and Curvature Inequalities

The notation \(\sum \limits _{i\ne j}\) stands for the sum over indices i, j such that \(i\ne j\), whereas \(\sum \limits _{j;j\ne i}\) means that the sum is taken over the index j such that \(j\ne i\) for a fixed i.

Theorem 3.1

For a statistical structure we have

$$\begin{aligned} (\tau \circ K)(U,U)-g(K_{U}, K_{U})\le \frac{1}{4}\Vert \tau \Vert ^2g(U,U) \end{aligned}$$
(29)

for every vector U. If \(U\ne 0\), the equality holds if and only if \(\tau =0\) and \(K_U=0\).

If \(g(K(U,U),U)=0\), then

$$\begin{aligned} (\tau \circ K)(U,U)-g(K_{U}, K_{U})\le \frac{1}{8}\Vert \tau \Vert ^2g(U,U). \end{aligned}$$
(30)

If U is a unit vector, the equality holds if and only if \(g(K(U,V), W)=0\) for V, W perpendicular to U, E is perpendicular to U, and \(E=4K(U,U)\).

Proof

We shall prove (29) for an arbitrary unit vector \(U\in T_xM\). We extend a given vector U to an orthonormal basis \(e_1=U\), \(e_2,\ldots , e_n\) of \(T_xM\). We set \(A_{ijk}=A(e_i,e_j,e_k)\) and \(a_{ir}=A_{iir}\). We want to show

$$\begin{aligned} 0\le \frac{1}{4}\sum _{ijr}A_{iir}A_{jjr}+\sum _{ir}A_{1ir}^2-\sum _{ir}A_{iir}A_{11r}. \end{aligned}$$
(31)

The last formula can be rewritten as follows:

$$\begin{aligned} 0\le \frac{1}{4}\sum _{ijr}a_{ir}a_{jr} + a_{11}^2 +2\sum _{r=2}^na_{1r}^2+\sum _{i\ne 1, r\ne 1} A_{1ir}^2-\sum _{ir}a_{ir}a_{1r}. \end{aligned}$$
(32)

Since \(\sum _{i\ne 1, r\ne 1} A_{1ir}^2\ge 0\), it is sufficient to prove

$$\begin{aligned} 0\le a_{11}^2+2\sum _{r=2}^n a_{1r}^2+\frac{1}{4}\sum _{ijr}a_{ir}a_{jr}-\sum _{ir}a_{ir}a_{1r}. \end{aligned}$$
(33)

The last inequality is equivalent to

$$\begin{aligned} 0\le a_{11}^2+2\sum _{r=2}^n a_{1r}^2 +\frac{1}{4}\sum _{rj;j\ne 1}a_{1r}a_{jr}+\frac{1}{4}\sum _{r=1}^n\left( \sum _{j=2}^n a_{jr}\right) ^2 -\frac{3}{4}\sum _{ir}a_{ir}a_{1r}. \end{aligned}$$
(34)

For a fixed r, set \(c_r=\sum _{j=2}^na_{jr}\). We have

$$\begin{aligned} 0\le \frac{1}{4}(a_{1r}-c_r)^2=a_{1r}^2+\frac{1}{4}a_{1r}c_r+\frac{1}{4}c_r^2-\frac{3}{4}a_{1r}(c_r+a_{1r}). \end{aligned}$$
(35)

By comparing the last inequality with (34), we obtain (33). Indeed, for \(r=1\) we get in (35)

$$\begin{aligned} 0\le a_{11}^2+\frac{1}{4}a_{11}c_1+\frac{1}{4}c_1^2-\frac{3}{4}a_{11}(c_1+a_{11}), \end{aligned}$$

hence this term appearing also in (34) is non-negative and equal to 0 if and only if \(a_{11}=c_1\). Consider now \(r>1\). For such r, we have in (34) the term

$$\begin{aligned} 2a_{1r}^2+\frac{1}{4}a_{1r}c_r+\frac{1}{4}c_r^2-\frac{3}{4}a_{1r}(c_r+a_{1r}). \end{aligned}$$
(36)

By comparing this with (35), we see that this term is non-negative and equal to 0 if and only if \(a_{1r}=0\) and \(a_{1r}=c_r\). The inequality (29) has been proved and moreover, one sees that the equality holds if and only if \(A_{11r}=0\), \(A_{22r}+\cdots +A_{nnr}=0\) for every \(r>1\), \(A_{111}=A_{221}+\cdots +A_{nn1}\) and \(A_{1ir}=0\) \(\forall i\ne 1, r\ne 1\). In particular, \(A_{1ii}=0\) for \(i>1\). These conditions imply that \(K_{e_1}=0\) and \(\tau =0\).

Assume now that \(g(K(U,U),U)=0\), that is, \(a_{11}=0\). In (31), we replace \(\frac{1}{4}\) by \(\frac{1}{8}\) and we obtain the following formula analogous to (34)

$$\begin{aligned} 0\le 2\sum _{r=2}^n a_{1r}^2 +\frac{1}{8}\sum _{jr;j\ne 1}a_{1r}a_{jr}+\frac{1}{8}\sum _{r=1}^n\left( \sum _{j=2}^n a_{jr}\right) ^2 -\frac{7}{8}\sum _{ir}a_{ir}a_{1r}. \end{aligned}$$
(37)

Again we fix r. The only possibly non-vanishing term of the right-hand side in (37) with \(r=1\) is \(\frac{1}{8}c_1^2\). For \(r>1\) we write

$$\begin{aligned} 0\le \frac{1}{8}(3a_{1r}-c_r)^2=2a_{1r}^2+\frac{1}{8}a_{1r}c_r+\frac{1}{8}c_r^2-\frac{7}{8}a_{1r}(c_r+a_{1r}). \end{aligned}$$
(38)

Comparing (38) with (37), we get the desired inequality.

If we have the equality in (30), then \(c_1=0\). Since \(a_{11}=0\), we have \(g(E,e_1)=0\). By (38), we have \(3A_{11r}=A_{22r}+\cdots +A_{nnr}\). It follows that \(E=4K(e_1,e_1)\). As in the first part of the proof, we have that \(A_{1ij}=0\) for \(i,j>1\). If \(E=0\), then \(K_{e_1}=0\). Assume that \(E\ne 0\) and \(e_2=\frac{E}{\Vert E\Vert }\). If \(K(e_1,e_1)=\lambda e_2\), then \(g(K_{e_1},K_{e_1})=2\lambda ^2\), \(\tau (K(e_1,e_1))=4\lambda ^2\) and \(\Vert \tau \Vert ^2=16\lambda ^2\). \(\square \)

As an immediate consequence of the inequality (29) and the formula (16), we get a generalization of (17)

Theorem 3.2

For any statistical structure, we have

$$\begin{aligned} 2{\widehat{\mathrm{Ric}\,}}\ge \mathrm{Ric}\,+{\overline{\mathrm{Ric}\,}}- \frac{1}{2}\Vert \tau \Vert ^2g. \end{aligned}$$
(39)

For a statistical structure on an equiaffine sphere with \(R=HR_0\), we have

$$\begin{aligned} {\widehat{\mathrm{Ric}\,}}\ge \left( H(n-1)- \frac{1}{4}\Vert \tau \Vert ^2\right) g. \end{aligned}$$
(40)

For a Lagrangian submanifold of a complex space form \(\tilde{M}(4c)\), we have

$$\begin{aligned} {\widehat{\mathrm{Ric}\,}}\le \left( c(n-1)+\frac{n^2}{4}\Vert \mu \Vert ^2\right) g, \end{aligned}$$
(41)

where \(\mu \) is the mean curvature vector of the submanifold.

The last inequality was first proved in [3].

Theorem 3.3

For any statistical structure, we have

$$\begin{aligned} \frac{n+2}{3}\Vert A\Vert ^2-\Vert E\Vert ^2\ge 0. \end{aligned}$$
(42)

Proof

We use the same notation as in the proof of the above theorem. To prove (42), we first compute \(\Vert A\Vert ^2-\Vert E\Vert ^2\). Let \(e_1,\ldots ,e_n\) be any orthonormal basis of \(T_xM\), \(x\in M\). We have

$$\begin{aligned} \Vert E\Vert ^2&=\sum _i\left( \sum _j A_{jji}\right) ^2\\ &=\sum _{i}A_{iii}^2+2\sum _{i\ne j}A_{iii}A_{jji}+\sum _{j\ne i}A_{jji}^2+2\sum _{i\ne j, l\ne i, l< j}A_{jji}A_{lli}\\ &=\sum _{i}a_{ii}^2+2\sum _{i\ne j}a_{ii}a_{ji}+\sum _{j\ne i}a_{ji}^2+2\sum _{i\ne j, l\ne i, l< j}a_{ji}a_{li}. \end{aligned}$$
(43)
$$\begin{aligned} \Vert A\Vert ^2&=\sum _{ijl}A_{ijl}^2=\sum _{i}A_{iii}^2 +3\sum _{i\ne j}A_{jji}^2 +\epsilon =\sum _{i}a_{ii}^2 +3\sum _{i\ne j}a_{ji}^2 +\epsilon , \end{aligned}$$
(44)

where \(\epsilon \) denotes the sum \(\sum A_{ijl}^2\) taken over all pairwise distinct indices \(i,j,l\). We now have

$$\begin{aligned} \Vert A\Vert ^2-\Vert E\Vert ^2=2\sum _{i\ne j}a_{ji}^2-2\sum _{i\ne j, i\ne l, l<j}a_{li}a_{ji}-2\sum _{i\ne j}a_{ii}a_{ji}+\epsilon . \end{aligned}$$
(45)

For any real numbers \(b_1,\ldots , b_k\), the following formula holds

$$\begin{aligned} \sum _{l<j\le k}(b_l-b_j)^2=(k-1)\sum _jb_j^2-2\sum _{l<j\le k}b_lb_j. \end{aligned}$$
(46)

Hence, for a fixed i we have

$$\begin{aligned} (n-2)\sum _{j; j\ne i}a_{ji}^2-2\sum _{jl;j\ne i, l\ne i, l<j}a_{li}a_{ji}\ge 0. \end{aligned}$$
(47)

Denote the left-hand side of the last inequality by \(c_i\) and set \(c=\sum _i c_i\). By (44), (45), and (47), we now obtain

$$\begin{aligned} \frac{n-4}{3}\Vert A\Vert ^2+(\Vert A\Vert ^2-\Vert E\Vert ^2)= \frac{n-4}{3}\sum _i a_{ii}^2-2\sum _{j\ne i}a_{ii}a_{ji}+c+\frac{n-1}{3}\epsilon . \end{aligned}$$
(48)

For a fixed i and a positive number \(\mu \), we have

$$\begin{aligned} 0&\le \sum \limits _{j;j\ne i}\left( \mu a_{ii}-\frac{1}{\mu }a_{ji}\right) ^2=\sum \limits _{j;j\ne i}\left[ (\mu a_{ii})^2-2(\mu a_{ii})\left( \frac{1}{\mu } a_{ji}\right) +\left( \frac{1}{\mu }a_{ji}\right) ^2\right] \\ &= (n-1)\mu ^2 a_{ii}^2-2\sum \limits _{j; j\ne i} a_{ii}a_{ji}+\frac{1}{\mu ^2}\sum \limits _{j; j\ne i}a_{ji}^2. \end{aligned}$$
(49)

We now add to the left-hand side of (48) \(\Vert A\Vert ^2\) and we set \(\mu ^2=\frac{1}{3}\). By (48), (44), and (49), we obtain

$$\begin{aligned} \frac{n+2}{3}\Vert A\Vert ^2 -\Vert E\Vert ^2&=\Vert A\Vert ^2+ \frac{n-4}{3}\Vert A\Vert ^2 +(\Vert A\Vert ^2-\Vert E\Vert ^2)\\ &=\frac{n-1}{3}\sum _ia_{ii}^2-2 \sum _{i\ne j} a_{ii}a_{ji}+3\sum _{i\ne j} a_{ji}^2+\frac{n+2}{3}\epsilon +c\ge 0. \end{aligned}$$
(50)

Observe additionally that in the last line, the equality holds if and only if for every i and j such that \(i\ne j\) we have \(A_{iii}=3A_{jji}\) and \(A_{ijr}=0\) for \(i\ne j, i\ne r, j\ne r\). \(\square \)

Example 3.4

Consider a statistical structure (g, K) on \({\mathbf {R}}^2\), where g is the standard metric and the difference tensor K is defined as follows:

$$\begin{aligned} K(e_1,e_1)=e_2,\quad \ \ \ K(e_1,e_2)=e_1,\quad \ \ \ K(e_2,e_2)=3e_2, \end{aligned}$$
(51)

where \(e_1,e_2\) is the canonical basis of \({\mathbf {R}}^2\). The vector \(U=e_1\) satisfies the equality in (30). We also see that for this structure the equality holds in (42).
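These equalities can be confirmed by direct computation with the basis components. The following numerical sketch (an illustration only, using the identification \(A_{ijk}=g(K(e_i,e_j),e_k)\) valid for the standard metric) checks the equality cases of (30) and (42) for this structure and, additionally, tests the inequality (42) on a randomly generated symmetric cubic form:

```python
import numpy as np

# Difference tensor of Example 3.4 on R^2 with the standard metric:
# K(e1,e1)=e2, K(e1,e2)=K(e2,e1)=e1, K(e2,e2)=3e2.
K = np.zeros((2, 2, 2))            # K[i, j, k] = k-th component of K(e_i, e_j)
K[0, 0] = [0.0, 1.0]
K[0, 1] = K[1, 0] = [1.0, 0.0]
K[1, 1] = [0.0, 3.0]

A = K                              # A_{ijk} = g(K(e_i,e_j), e_k) = K[i,j,k] for the standard g
E = np.einsum('iik->k', K)         # E = tr_g K = (0, 4)
tau = E                            # tau = g(E, .), identified with E in the standard metric

norm_A2 = np.einsum('ijk,ijk->', A, A)   # ||A||^2 = 12
norm_E2 = E @ E                          # ||E||^2 = ||tau||^2 = 16

# Equality in (42): (n+2)/3 ||A||^2 - ||E||^2 = 0 for n = 2.
n = 2
gap_42 = (n + 2) * norm_A2 / 3 - norm_E2

# Equality in (30) for U = e1; note g(K(U,U),U) = A_{111} = 0.
tauK = tau @ K[0, 0]                     # (tau o K)(U,U) = tau(K(e1,e1)) = 4
KU2 = np.einsum('jk,jk->', K[0], K[0])   # g(K_U, K_U) = 2
gap_30 = tauK - KU2 - norm_E2 / 8        # LHS - (1/8)||tau||^2 g(U,U)

# Random test of inequality (42) in dimension 3.
rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3, 3))
S = sum(np.transpose(B, p) for p in      # symmetrize over all index permutations
        [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]) / 6
ES = np.einsum('iik->k', S)
ok_42 = (3 + 2) * np.einsum('ijk,ijk->', S, S) / 3 - ES @ ES >= 0

print(gap_42, gap_30, ok_42)             # 0.0 0.0 True
```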

Theorem 3.5

For any statistical structure, we have

$$\begin{aligned} {\hat{\rho }}-\rho =\Vert A\Vert ^2-\Vert E\Vert ^2\ge -\frac{n-1}{3}\Vert A\Vert ^2, \end{aligned}$$
(52)
$$\begin{aligned} {\hat{\rho }}-\rho =\Vert A\Vert ^2-\Vert E\Vert ^2\ge -\frac{n-1}{n+2}\Vert E\Vert ^2. \end{aligned}$$
(53)

For a statistical structure on a Lagrangian submanifold of a complex space form \({\tilde{M}}(4c)\), we have

$$\begin{aligned}&{\hat{\rho }}\le \frac{n(n-1)}{n+2}\left( c(n+2)+n\Vert \mu \Vert ^2\right) , \end{aligned}$$
(54)
$$\begin{aligned}&{\hat{\rho }}\le \frac{n-1}{3}\left( 3cn+\Vert A\Vert ^2\right) . \end{aligned}$$
(55)

For a statistical structure on an equiaffine sphere with \(R=HR_0\), we have

$$\begin{aligned}&{\hat{\rho }}\ge \frac{n-1}{n+2}\left( Hn(n+2)-\Vert E\Vert ^2\right) , \end{aligned}$$
(56)
$$\begin{aligned}&{\hat{\rho }}\ge \frac{n-1}{3}\left( 3Hn-\Vert A\Vert ^2 \right) . \end{aligned}$$
(57)

4 Using Simons’ Formulas

We shall use the following version of Simons’ formula. Let g be a positive-definite metric tensor field on a manifold M. For any tensor field s on M, we have

$$\begin{aligned} \frac{1}{2}\Delta (\Vert s\Vert ^2)=g(\Delta s,s) +\Vert {\hat{\nabla }} s\Vert ^2, \end{aligned}$$
(58)

where the meaning of the Laplacians is explained in Preliminaries. We shall consider tensor fields of type (0, k). Recall the Ricci identity:

$$\begin{aligned} (\hat{\nabla }^2s)(X,Y,\ldots )-(\hat{\nabla }^2s)(Y,X,\ldots ) =({\hat{R}}(X,Y)\cdot s) (\ldots ). \end{aligned}$$
(59)

In particular, if \({\hat{\nabla }} ^2s\) is symmetric, then \({\hat{R}}(X,Y)\cdot s=0\) for every X, Y.

We shall employ (58) for studying the statistical cubic form A, but first, we shall make some comments on the applications of (58) to 1-forms and bilinear symmetric forms.
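For a scalar function (the simplest tensor), (58) reduces to the classical identity \(\frac{1}{2}\Delta (f^2)=f\Delta f+|\mathrm{grad}\, f|^2\); the following finite-difference sketch (ours) illustrates this on \({\mathbf {R}}^2\).

```python
import math

# Scalar case of Simons' formula (58): (1/2) Lap(f^2) = f*Lap(f) + |grad f|^2,
# checked by central differences for f(x,y) = sin(x)cos(y) at a sample point.
f = lambda x, y: math.sin(x) * math.cos(y)
h, (x, y) = 1e-4, (0.7, -0.3)

def lap(g, x, y):
    # five-point stencil for the flat Laplacian
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h) - 4 * g(x, y)) / h**2

fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
lhs = 0.5 * lap(lambda u, v: f(u, v) ** 2, x, y)
rhs = f(x, y) * lap(f, x, y) + fx ** 2 + fy ** 2
assert abs(lhs - rhs) < 1e-5
```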

4.1 For 1-Forms

If \(\tau \) is a 1-form on M, then we have

$$\begin{aligned} (\hat{\nabla }^2\tau )(X,Y,Z)-(\hat{\nabla }^2\tau )(Y,X,Z)=-\tau ({\hat{R}}(X,Y)Z)=g({\hat{R}}(X,Y)E,Z), \end{aligned}$$
(60)

where \(E=\tau ^\sharp \). From the last formula we immediately get

Proposition 4.1

If \(\tau \) is a closed 1-form and \({\hat{R}}=0\), then the cubic form \(\hat{\nabla }^2\tau \) is symmetric.

Proposition 4.2

Assume that \(\hat{\nabla }^2\tau \) is a symmetric cubic form. Then \(\mathrm{im}\, {\hat{R}}\subset \mathrm{ker}\,\tau \). In particular, \( {\widehat{\mathrm{Ric}\,}}(X,E)=0 \) for every X, hence \(\tau =0\) if \({\widehat{\mathrm{Ric}\,}}\) is non-degenerate.

By (60), we also get

$$\begin{aligned} \mathrm{tr}\, _g(\hat{\nabla }^2 \tau )(\cdot ,\cdot ,X)=({{d}}\delta \tau +\delta {{d}}\tau )(X)+{\widehat{\mathrm{Ric}\,}}(X,E). \end{aligned}$$
(61)

Simons’ formula (58) now yields

$$\begin{aligned} \frac{1}{2}\Delta (\Vert \tau \Vert ^2)=g(({{d}}\delta +\delta {{d}}) \tau ,\tau )+\widehat{\mathrm{Ric}\,}(E,E)+\Vert {\hat{\nabla }}\tau \Vert ^{2}. \end{aligned}$$
(62)

By Proposition 4.2, (61), and (62), we obtain

Theorem 4.3

Let \(\tau \) be a 1-form such that \(\hat{\nabla }^2\tau =0\). Then \({\widehat{\mathrm{Ric}\,}} (X,E)=0\) for every X and \(\tau \) is harmonic. If, additionally, M is compact or \(\Vert \tau \Vert \) is constant, then \({\hat{\nabla }}\tau =0\). If \({\widehat{\mathrm{Ric}\,}}\) is non-degenerate at a point of M, then \(\tau =0\).

Of course, (62) immediately implies the following classical theorem of Bochner.

Theorem 4.4

Let (Mg) be a compact Riemannian manifold. Assume that \(\widehat{\mathrm{Ric}\,}\) is semi-positive definite on M. Then each harmonic 1-form \(\tau \) on M is \({\hat{\nabla }}\)-parallel and \({\widehat{\mathrm{Ric}\,}} (E,E)=0\) on M. If, moreover, \({\widehat{\mathrm{Ric}\,}}\) is positive definite at some point of M, then \(\tau = 0\) on M.

We can apply the above theorems to the Czebyshev form of a statistical structure. For instance, we have

Corollary 4.5

Let \((g,\nabla )\) be a statistical structure on a manifold M and \(\tau \) its Czebyshev form. Assume that \(\hat{\nabla }^2\tau =0\), that \(\Vert \tau \Vert \) is constant or M is compact, and that the Ricci tensor of the metric g is non-degenerate at a point of M. Then the structure is trace-free.

Recall that if \(\dim M=2\), then the Ricci tensor (at a point of M) of a Levi-Civita connection is either non-degenerate or vanishes. In the latter case, the curvature tensor also vanishes at that point. This provides additional information in the above theorems in the case where \(\dim M=2\). We have, for example,

Corollary 4.6

Let \(\tau \) be a 1-form on a 2-dimensional manifold M and \(\hat{\nabla }^2\tau =0\). At each point of M we have \(\tau =0\) or \({\hat{R}}=0\). In the case where M is compact, we have \(\tau =0\) or \({\hat{R}}=0\) on M.

4.2 For Bilinear Forms

If \(\beta \) is a symmetric bilinear form on a Riemannian manifold \((M,g)\), then we have

$$\begin{aligned} ({\hat{R}}(X,Y)\cdot \beta )(U,V)=-\beta ({\hat{R}}(X,Y)U,V)-\beta (U,{\hat{R}}(X,Y)V). \end{aligned}$$

It follows that if \(e_1,\ldots ,e_n\) is an orthonormal basis diagonalizing \(\beta \) and \(\lambda _1,\ldots ,\lambda _n\) are the corresponding eigenvalues, then

$$\begin{aligned} ({\hat{R}}(X,Y)\cdot \beta )(e_i,e_j)=g({\hat{R}}(X,Y)e_i,e_j)(\lambda _i-\lambda _j)=g({\hat{R}}(e_i,e_j)Y,X)(\lambda _i-\lambda _j).\nonumber \\ \end{aligned}$$
(63)

In particular, we have

Proposition 4.7

If \({\hat{\nabla }}^2\beta \) is symmetric and the sectional curvature of g is non-zero for every plane, then \(\beta =\lambda g\) for some function \(\lambda \).
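The pointwise identity (63) behind this proposition is pure linear algebra; the following sketch (ours) verifies it for a random skew-symmetric operator in place of \({\hat{R}}(X,Y)\) and a diagonal \(\beta \).

```python
import numpy as np

# Check of (63): with beta diagonal in an orthonormal basis and S the
# skew-symmetric matrix of R_hat(X,Y), the action
#   (R_hat(X,Y).beta)(e_i,e_j) = -beta(S e_i, e_j) - beta(e_i, S e_j)
# equals g(S e_i, e_j) * (lambda_i - lambda_j).
rng = np.random.default_rng(2)
n = 4
S = rng.standard_normal((n, n))
S = S - S.T                      # skew-symmetric, like R_hat(X,Y)
lam = rng.standard_normal(n)
beta = np.diag(lam)

action = -S.T @ beta - beta @ S  # (i,j) entry: -beta(Se_i,e_j) - beta(e_i,Se_j)
target = np.array([[S[j, i] * (lam[i] - lam[j]) for j in range(n)]
                   for i in range(n)])   # g(Se_i,e_j)(lam_i - lam_j)
assert np.allclose(action, target)
```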

Simons’ formula for symmetric tensor fields of type (0, 2) can be formulated as follows:

Theorem 4.8

Let \(\beta \) be a symmetric tensor field of type (0, 2) on a Riemannian manifold M. If \({\hat{\nabla }}\beta \) is symmetric, then we have

$$\begin{aligned} \frac{1}{2}\Delta \Vert \beta \Vert ^2 = \Vert \hat{\nabla }\beta \Vert ^2&+ \sum _{ijk}{\hat{\nabla }}^2\beta (e_j,e_k, e_i,e_i)\beta _{jk}\nonumber \\&\quad +\sum _{i<k}{\hat{k}}(e_i\wedge e_k)(\lambda _i-\lambda _k)^2, \end{aligned}$$
(64)

where \(\beta _{jk}=\beta (e_j,e_k)\), \(\beta _{jk}=\delta _{jk}\lambda _k\) for some orthonormal basis \(e_1,\ldots ,e_n\) of \(T_xM\), \(x\in M\), and \({\hat{k}}(e_i\wedge e_k)\) is the sectional curvature of g for the plane spanned by \(e_i, e_k\).

Proof

First we have

$$\begin{aligned} \sum _i\hat{\nabla }^2 \beta (e_i,e_i,X,Y)&=\sum _i[\hat{\nabla }^2\beta (X,Y,e_i,e_i)+\beta ({\hat{R}}(X,e_i)Y,e_i)\\&\qquad +\beta (Y,{\hat{R}}(X,e_i)e_i)] \end{aligned}$$

for any orthonormal basis \(e_1,\ldots ,e_n\) and any X, Y. Hence

$$\begin{aligned} g(\Delta \beta ,\beta )= & {} \sum _{kli}\hat{\nabla }^2\beta (e_k,e_l, e_i,e_i)\beta _{kl}\\&+\sum _{kils}\hat{R}_{kils}\beta _{is}\beta _{kl}+\sum _{ksl}\widehat{\mathrm{Ric}\,}_{ks}\beta _{kl}\beta _{ls}. \end{aligned}$$

If \(e_1,\ldots , e_n\) is a basis diagonalizing \(\beta \) with eigenvalues \(\lambda _1,\ldots ,\lambda _n\), then we get

$$\begin{aligned}&\sum _{kils} {\hat{R}}_{kils}\beta _{is}\beta _{kl}+\sum _{ksl}\widehat{\mathrm{Ric}\,}_{ks}\beta _{kl}\beta _{ls} =-\sum _{ik}{\hat{R}}_{ikki}\lambda _i\lambda _k+\sum _k \widehat{\mathrm{Ric}\,}_{kk}\lambda _{k}^2\\&\quad =\sum _{i<k}[-2{\hat{k}}(e_i\wedge e_k)\lambda _i\lambda _k+{\hat{k}}(e_i\wedge e_k)\lambda _i^2 +{\hat{k}}(e_i\wedge e_k)\lambda _k^2]\\&\qquad = \sum _{i<k}{\hat{k}}(e_i\wedge e_k)(\lambda _i-\lambda _k)^2. \end{aligned}$$

It is now sufficient to use (58). \(\square \)
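The curvature-term computation in this proof is again a pointwise algebraic identity; the following sketch (ours, with \({\hat{k}}(e_i\wedge e_k)\) modeled by a symmetric array with zero diagonal) confirms it numerically.

```python
import numpy as np

# Check of the algebraic step in the proof of Theorem 4.8: with
# khat[i,k] = k_hat(e_i ^ e_k) (symmetric, zero diagonal), lam the eigenvalues
# of beta, and Ric_kk = sum_i khat[i,k]:
#   -sum_{i,k} khat[i,k] lam_i lam_k + sum_k Ric_kk lam_k^2
#     = sum_{i<k} khat[i,k] (lam_i - lam_k)^2
rng = np.random.default_rng(3)
n = 5
khat = rng.standard_normal((n, n))
khat = (khat + khat.T) / 2
np.fill_diagonal(khat, 0.0)
lam = rng.standard_normal(n)

lhs = -lam @ khat @ lam + np.sum(khat.sum(axis=0) * lam ** 2)
rhs = sum(khat[i, k] * (lam[i] - lam[k]) ** 2
          for i in range(n) for k in range(i + 1, n))
assert np.isclose(lhs, rhs)
```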

From the above formula, we immediately get

Theorem 4.9

Let \((M,g)\) be a compact Riemannian manifold and \(\beta \) a symmetric (0, 2)-tensor field such that \({\hat{\nabla }}\beta \) is symmetric and the function \(\lambda =\mathrm{tr}\, _g\beta \) is constant on M. If the sectional curvature of g is non-negative on M and positive at some point of M, then \(\beta =\frac{\lambda }{n}g\).

Proof

If the sectional curvature of g is everywhere non-negative, then by (64) we know that \(\hat{\nabla }\beta =0\). The formula (64) also implies that \({\hat{k}}(e_i\wedge e_k)(\lambda _i-\lambda _k)^2=0\) for every pair of indices \(i\ne k\). If at some point of M all sectional curvatures \({\hat{k}}\) are positive, we see that \( \beta =\frac{\lambda }{n}g\) at this point and hence everywhere on M. \(\square \)

In the above theorem, instead of assuming that M is compact, one can assume that \(\Vert \beta \Vert \) is constant on M.

Theorem 4.10

Let \((M,g)\) be a connected Riemannian manifold whose sectional curvature is positive (or negative) at some point of M. If \(\beta \) is a symmetric 2-form and \(\hat{\nabla }\beta =0\), then \(\beta =c g\) for some real constant c.

Corollary 4.11

Let \((M,g)\) be a compact Riemannian manifold whose sectional curvature is non-negative on M and positive at some point of M. If \(\hat{\nabla }{\widehat{\mathrm{Ric}\,}} \) is symmetric and the scalar curvature \({\hat{\rho }} \) is constant, then the Riemannian structure is Einstein.

Corollary 4.12

Let \((M,g)\) be a Riemannian manifold whose sectional curvature is positive at some point of M (or negative at some point of M). If \(\hat{\nabla }{\widehat{\mathrm{Ric}\,}} =0\), then the Riemannian structure is Einstein.

In the geometry of statistical structures, several symmetric bilinear forms appear naturally, for instance \(\tau \circ K\) and \(g(K_\cdot ,K_\cdot )\). The Czebyshev form is closed if and only if the bilinear forms \(\nabla \tau \), \(\mathrm{Ric}\,\), and \({\overline{\mathrm{Ric}\,}}\) are symmetric. In particular, we can formulate the following.

Corollary 4.13

Let \((M,g,\nabla )\) be a connected statistical manifold. If \(\mathrm{Ric}\,\) is symmetric, \(\hat{\nabla }\mathrm{Ric}\,=0\) and the sectional curvature of g is positive (or negative) at some point of M, then \(\mathrm{Ric}\,=\lambda g\) for some constant number \(\lambda \).

Corollary 4.14

Let \((M,g,\nabla )\) be a Hessian manifold and \(\beta =\nabla \tau \) its Koszul form. If \(\hat{\nabla }\beta =0\) and the sectional curvature for g is positive (or negative) at some point of M, then the structure is Einstein-Hessian.

4.3 For cubic forms

We shall first compute a Simons-type formula for symmetric cubic forms on an n-dimensional Riemannian manifold \((M,g)\). Let A be a symmetric cubic form and \(K=A^\sharp \). Assume that \(\hat{\nabla }A\) is symmetric. Thus we have a conjugate symmetric statistical structure. Let \(e_1,\ldots ,e_n\) be an orthonormal basis of a tangent space \(T_xM\). We have

$$\begin{aligned} \hat{\nabla }^2A(e_i,e_i, X,Y,Z)= & {} \hat{\nabla }^2A(X,Y,Z, e_i,e_i)-A({\hat{R}}(e_i,X)Y, Z,e_i)\\&-A(Y,{\hat{R}}(e_i,X)Z,e_i)-A(Y,Z,{\hat{R}}(e_i,X)e_i) \end{aligned}$$

for every i. Hence

$$\begin{aligned} \hat{\nabla }^2A(e_i,e_i, e_j,e_k, e_l)= & {} \hat{\nabla }^2A(e_j,e_k,e_l,e_i,e_i)-g(K(e_l,e_i), {\hat{R}}(e_i,e_j)e_k)\\&-g(K(e_k,e_i),{\hat{R}}(e_i,e_j)e_l)-g(K(e_k,e_l),{\hat{R}}(e_i,e_j)e_i)\\= & {} \hat{\nabla }^2A(e_j,e_k,e_l,e_i,e_i)\\&-\sum _r A_{lir}{\hat{R}}_{ijkr}-\sum _r A_{kir}{\hat{R}}_{ijlr}-\sum _rA_{klr}{\hat{R}}_{ijir} \end{aligned}$$

for every i, j, k, l. We also have

$$\begin{aligned} \hat{\nabla }^2\tau (X,Y,Z)=\sum _i\hat{\nabla }^2 A(X,Y,Z, e_i,e_i). \end{aligned}$$
(65)

We now compute

$$\begin{aligned}&\sum _{ijkl}\hat{\nabla }^2A(e_i,e_i, e_j, e_k, e_l)A_{jkl}\\&\quad =\sum _{ijkl}\hat{\nabla }^2A(e_j,e_k, e_l, e_i, e_i)A_{jkl}\\&\qquad -\sum _{ijklr}[A_{lir}A_{jkl}{\hat{R}}_{ijkr}+ A_{kir}A_{jkl}{\hat{R}}_{ijlr}+A_{klr}A_{jkl}{\hat{R}}_{ijir}]\\&\quad =g(\hat{\nabla }^2 \tau , A)+\sum _{kljr}\widehat{\mathrm{Ric}\,}_{jr}A_{jkl}A_{rkl}\\&\qquad -\sum _{ijklr} A_{irl}A_{jkl}{\hat{R}}_{ijkr}-\sum _{ijklr}A_{irk}A_{jlk}{\hat{R}}_{ijlr}\\&\quad =g(\hat{\nabla }^2 \tau , A)+\sum _{kljr}\widehat{\mathrm{Ric}\,}_{jr}A_{jkl}A_{rkl}+\sum _{ijklr}(A_{ikr}A_{jlr}-A_{ilr}A_{jkr}){\hat{R}}_{ijkl}\\&\quad =g(\hat{\nabla }^2 \tau , A)+ g(\widehat{\mathrm{Ric}\,}, g(K_\cdot , K_\cdot ))-\sum _{ijkl}g([K_{e_i}, K_{e_j}]e_k,e_l){\hat{R}}_{ijkl}. \end{aligned}$$

Since for a conjugate symmetric statistical structure \(R={\hat{R}}+[K,K]\) and \(\mathrm{Ric}\,={\widehat{\mathrm{Ric}\,}}+\tau \circ K-g(K_\cdot ,K_\cdot )\) (see Preliminaries), by using (58), we obtain

Theorem 4.15

For any conjugate symmetric statistical structure, we have

$$\begin{aligned} \frac{1}{2}\Delta (\Vert A\Vert ^2)= & {} \Vert \hat{\nabla }A\Vert ^2+ g(\hat{\nabla }^2 \tau ,A)-g([K,K],{\hat{R}})+g(\widehat{\mathrm{Ric}\,},g(K_\cdot ,K_\cdot )), \end{aligned}$$
(66)
$$\begin{aligned} \frac{1}{2}\Delta (\Vert A\Vert ^2)= & {} \Vert \hat{\nabla }A\Vert ^2+ g(\hat{\nabla }^2 \tau ,A)+g({\hat{R}}-R,{\hat{R}})+g(\widehat{\mathrm{Ric}\,},g(K_\cdot ,K_\cdot )), \end{aligned}$$
(67)
$$\begin{aligned} \frac{1}{2}\Delta (\Vert A\Vert ^2)= & {} \Vert \hat{\nabla }A\Vert ^2+ g(\hat{\nabla }^2 \tau ,A)\nonumber \\&+{\hat{R}}^2+\widehat{\mathrm{Ric}\,}^2-g(R,{\hat{R}})-g(\mathrm{Ric}\,,\widehat{\mathrm{Ric}\,})\nonumber \\&+g(\widehat{\mathrm{Ric}\,}, \tau \circ K). \end{aligned}$$
(68)

Consider now the case where the sectional K-curvature is constant. In [14] the following result was proved.

Theorem 4.16

Let \((g,K)\) be a conjugate symmetric trace-free statistical structure on a manifold M. If the sectional K-curvature is constant (automatically non-positive, because of the trace-freeness), then either \(K=0\), or \({\hat{R}}=0\) and \({\hat{\nabla }} K=0\).

In a more general situation, we get

Theorem 4.17

Let \((g,K)\) be a conjugate symmetric statistical structure with \({\hat{\nabla }}\)-parallel Czebyshev form \(\tau \) and \(\widehat{\mathrm{Ric}\,}\ge 0\). If the sectional K-curvature is constant and non-positive, then \({\hat{\nabla }} K=0\). If the sectional K-curvature is a negative constant, then \(\widehat{\mathrm{Ric}\,}=0\).

Proof

If \([K,K]=\kappa R_0\), then (66) becomes

$$\begin{aligned} \frac{1}{2}\Delta (\Vert A\Vert ^2)=\Vert \hat{\nabla }A\Vert ^2+ g(\hat{\nabla }^2 \tau ,A)-2\kappa {\hat{\rho }} +g(\widehat{\mathrm{Ric}\,},g(K_\cdot ,K_\cdot )). \end{aligned}$$
(69)

Since \(\rho ^K=\Vert E\Vert ^2-\Vert A\Vert ^2\) (see (18)) and \({\hat{\nabla }} E=0\), the function \(\Vert A\Vert \) is constant. Since \(\widehat{\mathrm{Ric}\,}\ge 0\), we have \(g(\widehat{\mathrm{Ric}\,},g(K_\cdot ,K_\cdot ))\ge 0\). The first assertion now follows from (69). If \(\kappa < 0\), then \({\hat{\rho }}=0\). Since \(\widehat{\mathrm{Ric}\,}\ge 0\), we have \(\widehat{\mathrm{Ric}\,}=0\). \(\square \)

For trace-free statistical structures, (68) becomes

$$\begin{aligned} \frac{1}{2}\Delta (\Vert A\Vert ^2)=\Vert \hat{\nabla }A\Vert ^2+{\hat{R}}^2+\widehat{\mathrm{Ric}\,}^2-g(R,{\hat{R}})-g(\mathrm{Ric}\,,\widehat{\mathrm{Ric}\,}). \end{aligned}$$
(70)

The last formula was proved by An-Min Li for hyperbolic affine Blaschke spheres [8]. In that case, \(\tau =0\), \(R=HR_0\), where H is a constant, and \(\mathrm{Ric}\,=(n-1)Hg\). Hence (70) can be written as follows:

$$\begin{aligned} \frac{1}{2}\Delta (\Vert A\Vert ^2)=\Vert \hat{\nabla }A\Vert ^2+ {\hat{R}}^2+{\widehat{\mathrm{Ric}\,}}^2-(n+1)H{\hat{\rho }}. \end{aligned}$$
(71)

Moreover, \(\Vert A\Vert ^2= {\hat{\rho }} -n(n-1)H\), see (19). Hence \(\Vert A\Vert \) is constant if \({\hat{\rho }}\) is constant. If \({\hat{\rho }}=0\), then H must be non-positive, since \(\Vert A\Vert ^2=-n(n-1)H\ge 0\). Thus we have

Theorem 4.18

[10] For a Blaschke affine sphere, if \({\hat{\rho }}=0\), then \({\hat{R}} =0\) and \(\hat{\nabla }A=0\). The sphere is either a part of a trivial paraboloid (an elliptic paraboloid with its trivial affine structure) or the sphere is hyperbolic.

The hyperbolic sphere from the last theorem must be given by the equation \(x_1\cdots x_{n+1}=c\), where c is a positive constant, see [10].

For equiaffine spheres, we can prove

Theorem 4.19

For an equiaffine hyperbolic or parabolic sphere, if \(\hat{\nabla }\tau =0\), \(\widehat{\mathrm{Ric}\,}\ge 0\) and \({\hat{\rho }} \) is constant, then \({\hat{R}}=0\) and \(\hat{\nabla }A=0\).

Proof

Since \(R=HR_0\) and \({\hat{\nabla }} \tau =0\), the formula (67) can be written as follows:

$$\begin{aligned} \frac{1}{2}\Delta (\Vert A\Vert ^2)=\Vert \hat{\nabla }A\Vert ^2+{\hat{R}}^2-2H{\hat{\rho }} +g(\widehat{\mathrm{Ric}\,},g(K_\cdot ,K_\cdot )). \end{aligned}$$
(72)

Moreover, the function \(\Vert E\Vert \) is constant. Since the functions \({\hat{\rho }}\) and \(\rho \) are constant, (19) implies that \(\Vert A\Vert \) is also constant. The assertion now follows from (72). \(\square \)

From (66), we immediately get

Proposition 4.20

Let \((M,g,\nabla )\) be a conjugate symmetric statistical manifold such that \(\hat{\nabla }^2\tau =0\), \({\hat{R}}=0\), and \(\Vert A\Vert \) is constant (or M is compact). Then \(\hat{\nabla }A=0\).

From (72) one gets

Proposition 4.21

For a Hessian structure \((g,A)\), if \(\hat{\nabla }A=0\) and \({\widehat{\mathrm{Ric}\,}}\ge 0\), then \({\hat{R}}=0\).

For a Lagrangian submanifold in a complex space form \({\tilde{M}}(4c)\), we have \( -[K,K]=cR_0-{\hat{R}} \), see (26). Using now (66) we obtain

$$\begin{aligned} \frac{1}{2}\Delta \Vert A\Vert ^2=\Vert {\hat{\nabla }} A\Vert ^2 +g({\hat{\nabla }}^2\tau ,A)-\Vert {\hat{R}}\Vert ^2+2c{\hat{\rho }} +g({\widehat{\mathrm{Ric}\,}},g(K_\cdot , K_\cdot )). \end{aligned}$$
(73)

Theorem 4.22

Let \((g,K)\) be the statistical structure on a Lagrangian submanifold in a complex space form \({\tilde{M}} (4c)\) whose second fundamental tensor K is \({\hat{\nabla }}\)-parallel. If \({\widehat{\mathrm{Ric}\,}}=0\), then \({\hat{R}}=0\). If \({\widehat{\mathrm{Ric}\,}}\le 0\) and \(c\ge 0\), then \({\hat{R}}=0\).

If g has constant curvature \({\hat{k}}\), then using (28) and (73) we get

$$\begin{aligned} \frac{1}{2}\Delta \Vert A\Vert ^2=\Vert {\hat{\nabla }} A\Vert ^2 +g({\hat{\nabla }}^2\tau ,A)+{\hat{k}}[2(\Vert A\Vert ^2-\Vert E\Vert ^2)+(n-1)\Vert A\Vert ^2]. \end{aligned}$$
(74)

Hence, if \({\hat{k}}\ge 0\), then (52) yields

$$\begin{aligned} \frac{1}{2}\Delta \Vert A\Vert ^2\ge \Vert {\hat{\nabla }} A\Vert ^2 +g({\hat{\nabla }}^2\tau ,A)+\frac{n-1}{3}{\hat{k}}\Vert A\Vert ^2. \end{aligned}$$
(75)

This formula immediately yields a theorem proved by Chen and Ogiue in [2], which says that a minimal Lagrangian submanifold of constant positive curvature in a complex space form must be totally geodesic. Indeed, it is sufficient to note that if \(\tau =0\), then, by (28), \(\Vert A\Vert \) is constant. Using (28) and Theorem 4.3, we also get

Theorem 4.23

Let M be a flat Lagrangian submanifold in a complex space form such that \({\hat{\nabla }} ^2\tau =0\). If M is compact or the mean curvature is constant, then the submanifold has parallel second fundamental tensor.

5 Using a Maximum Principle

In this section, we set \(u=\Vert A\Vert ^2\) and we assume that \((g,\nabla )\) is a trace-free statistical structure on an n-dimensional connected manifold M such that \(R=HR_0\). It is automatically conjugate symmetric and therefore, if \(n>2\), then H is constant. If \(n=2\), H can be a function. In this case, the assumption \(R=HR_0\) is equivalent to the assumption that \((g,\nabla )\) is conjugate symmetric. If H is constant, such a statistical structure can be automatically realized (locally) on an affine Blaschke sphere.

We shall now employ two inequalities due to Calabi and An-Min Li. For detailed proofs, we refer to [9, 12]; here we give only a sketch. First define the tensor fields L, P, and Q as follows:

$$\begin{aligned} L(X,Y,W,Z)= & {} g(K(X,Y),K(W,Z)), \end{aligned}$$
(76)
$$\begin{aligned} P(X,Y,W,Z)= & {} L(X,Y,W,Z)-L(W,Y,X,Z), \end{aligned}$$
(77)
$$\begin{aligned} Q(Y,W,Z)= & {} \mathrm{tr}\, _g([K_{\cdot }, K_Y]\cdot A)(\cdot ,W,Z), \end{aligned}$$
(78)

where \(X,Y,Z,W\in T_xM\), \(x\in M\). In the case under consideration, we have \({\hat{R}}=HR_0-[K,K]\) and

$$\begin{aligned} \Delta A (Y,W,Z)=\mathrm{tr}\, _g({\hat{R}}(\cdot ,Y)\cdot A)(\cdot ,W,Z)=H(n+1)A(Y,W,Z) -Q(Y,W,Z). \end{aligned}$$

We have

$$\begin{aligned} \Vert L\Vert ^2=\sum _{ij}a_{ij}^2,\ \ \ \ \ \ \ \Vert P\Vert ^2=\sum _{ijkl}b_{ij;kl}^2, \end{aligned}$$
(79)

where

$$\begin{aligned} a_{ij}&=\sum \limits _{kl}A_{ikl}A_{jkl},\\ b_{ij;kl}&=\sum \limits _m(A_{ijm}A_{klm}-A_{kjm}A_{ilm}). \end{aligned}$$
(80)

The notation in (80) is similar to that in [12] on p. 84, where it was introduced for the cubic form \(C=-2A\). It was proved there that \(-g(Q,A)=\Vert L\Vert ^2+\Vert P\Vert ^2\), that is,

$$\begin{aligned} g(\Delta A,A)=(n+1)Hu+\Vert L\Vert ^2+\Vert P\Vert ^2. \end{aligned}$$
(81)

Moreover, see [12],

$$\begin{aligned} \Vert L\Vert ^2+\Vert P\Vert ^2\ge \frac{n+1}{n(n-1)}u^2, \end{aligned}$$
(82)

where \(u=\Vert A\Vert ^2\). In [12] it was assumed that H is constant, but this assumption is not necessary (although it is automatically satisfied if \(n>2\)). The inequality is actually due to Blaschke (for \(n=2\)) and Calabi (for \(n>2\)). In [9] A.-M. Li proved the following algebraic inequality (Theorem 1 in [9]):

$$\begin{aligned} \Vert L\Vert ^2+\Vert P\Vert ^2\le \frac{3}{2}u^2. \end{aligned}$$
(83)

In particular, in the case where \(n=2\), the bounds in (82) and (83) coincide, and we get

$$\begin{aligned} \Vert L\Vert ^2+\Vert P\Vert ^2=\frac{3}{2}u^2. \end{aligned}$$
(84)

The last equality was also directly proved in [12]. Using the above facts and Simons’ formula (58), one can formulate

Theorem 5.1

Let \((g,\nabla )\) be a trace-free statistical structure on an n-dimensional manifold M and \(R=HR_0\). We have

$$\begin{aligned} (n+1)Hu +\frac{n+1}{n(n-1)}u^2+\Vert {\hat{\nabla }} A\Vert ^2\le \frac{1}{2}\Delta u\le (n+1)Hu+\frac{3}{2}u^2+\Vert {\hat{\nabla }} A\Vert ^2.\nonumber \\ \end{aligned}$$
(85)

If \(n=2\), equality holds throughout (85).

As an immediate consequence of this theorem, we have

Corollary 5.2

Let \((g,\nabla )\) be a trace-free statistical structure on an n-dimensional manifold M and \(R=HR_0\). Assume that \({\hat{\nabla }} A=0\) on M. (1) If \(H\ge 0\), then \(A=0\) on M. (2) If \(H< 0\), then either \(A=0\) on M or

$$\begin{aligned} \frac{2}{3}(n+1)(-H)\le u\le n(n-1)(-H). \end{aligned}$$
(86)

If \(n=2\) and \(\mathrm{sup}\, H=0\), then \(A=0\).
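The purely algebraic inequalities (82) and (83), and the equality (84) for \(n=2\), concern trace-free symmetric cubic forms only, so they can be spot-checked on random data; the following sketch (ours) does so.

```python
import numpy as np
from itertools import permutations

# Check of Calabi's lower bound (82), Li's upper bound (83), and the
# equality (84) for n = 2, for random trace-free totally symmetric cubic forms.
rng = np.random.default_rng(4)
for n in (2, 3, 4):
    for _ in range(50):
        T = rng.standard_normal((n, n, n))
        A = sum(T.transpose(p) for p in permutations(range(3))) / 6  # symmetrize
        tau = np.einsum('ikk->i', A)
        I = np.eye(n)
        # remove the trace part so that sum_k A_ikk = 0
        A = A - (np.einsum('ij,k->ijk', I, tau) + np.einsum('jk,i->ijk', I, tau)
                 + np.einsum('ik,j->ijk', I, tau)) / (n + 2)
        assert np.allclose(np.einsum('ikk->i', A), 0)

        u = np.sum(A ** 2)
        a = np.einsum('ikl,jkl->ij', A, A)                    # the a_ij of (80)
        b = (np.einsum('ijm,klm->ijkl', A, A)
             - np.einsum('kjm,ilm->ijkl', A, A))              # the b_ij;kl of (80)
        LP = np.sum(a ** 2) + np.sum(b ** 2)                  # ||L||^2 + ||P||^2
        assert LP >= (n + 1) / (n * (n - 1)) * u ** 2 - 1e-8  # (82)
        assert LP <= 1.5 * u ** 2 + 1e-8                      # (83)
        if n == 2:
            assert np.isclose(LP, 1.5 * u ** 2)               # (84)
```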

For treating the case where g is complete, we shall use the following maximum principle

Lemma 5.3

[18] Let \((M,g)\) be a complete Riemannian manifold whose Ricci tensor is bounded from below, and let f be a \({\mathcal {C}}^2\)-function bounded from above on M. For every \(\varepsilon >0\), there exists a point \(x\in M\) at which

  1. (i)

    \(f(x)> \mathrm{sup}f -\varepsilon \)

  2. (ii)

    \(\Delta f(x)<\varepsilon \).

The results contained in the following theorem and their generalizations can be found in [1, 10, 15].

Theorem 5.4

Let \((g,\nabla )\) be a trace-free statistical structure with \(R=HR_0\) and complete g on a manifold M. If \(H\ge 0\), then the structure is trivial. If H is constant and negative, then the Ricci tensor of g is non-positive and consequently

$$\begin{aligned} u\le n(n-1)(-H). \end{aligned}$$
(87)

We shall now give a more delicate estimate for u by looking more carefully at the function \(\Vert \hat{\nabla }A\Vert \). Recall here that for a trace-free conjugate symmetric statistical structure we have \({\widehat{\mathrm{Ric}\,}}>\mathrm{Ric}\,\), see (17). Hence, in all theorems below, the Ricci tensor of g is automatically bounded from below.

Theorem 5.5

Let \((g,\nabla )\) be a trace-free statistical structure on an n-dimensional manifold M and \(R=HR_0\), where H is a negative constant. If g is complete and

$$\begin{aligned} \mathrm{sup}\, \Vert {\hat{\nabla }} A\Vert ^2<\frac{H^2(n+1)^2}{6}, \end{aligned}$$
(88)

then

$$\begin{aligned} \inf u\ge \frac{(n+1)(-H)+\sqrt{(n+1)^2H^2-6N_2}}{3}, \end{aligned}$$
(89)

or

$$\begin{aligned} \inf u\le \frac{(n+1)(-H)-\sqrt{(n+1)^2H^2-6N_2}}{3}, \end{aligned}$$
(90)

where \(u=\Vert A\Vert ^2\) and \(N_2=\mathrm{sup}\Vert {\hat{\nabla }} A\Vert ^2\).

Proof

By Theorem 5.1, we have

$$\begin{aligned} \frac{1}{2}\Delta u\le (n+1)Hu+\frac{3}{2}u^2+N_2. \end{aligned}$$
(91)

Set \(N_1=\inf u\) and \({\tilde{u}}=-u\). The function \({\tilde{u}}\) is bounded from above by 0 and \(-N_1=\mathrm{sup}\, {\tilde{u}}\). Take sequences \(\varepsilon _i\) and \(x_i\) satisfying (i) and (ii) from Lemma 5.3 applied to the function \({\tilde{u}}\). Assume that the sequence \(\varepsilon _i\) is decreasing and tends to 0. We have \({\tilde{u}} (x_i)>-N_1-\varepsilon _i>-N_1-\varepsilon _1\) and consequently \(u(x_i)<N_1+\varepsilon _1\) and \(-u^2(x_i)>-(N_1+\varepsilon _1)^2\). By (91), we now have

$$\begin{aligned} \frac{1}{2}(\Delta {\tilde{u}})(x_i)\ge -(n+1)HN_1-\frac{3}{2}(N_1+\varepsilon _1)^2-N_2. \end{aligned}$$
(92)

Consider the following polynomial of degree 2

$$\begin{aligned} p(t)= & {} -\frac{3}{2}(t+\varepsilon _1)^2-(n+1)Ht-N_2\nonumber \\= & {} -\frac{3}{2}t^2+[-3\varepsilon _1-(n+1)H]t-\left[ \frac{3}{2}\varepsilon _1^2+N_2\right] . \end{aligned}$$
(93)

Denote by \(\delta \) its discriminant. We have

$$\begin{aligned} \delta =N^2+6\varepsilon _1(n+1)H<N^2,\end{aligned}$$
(94)

where \(N=\sqrt{(n+1)^2H^2-6N_2}\). Assume that \(\varepsilon _1\) is so small that \(\delta >0\). Then \(\delta =N^2q^2\) for some \(0<q<1\). The relation between q and \(\varepsilon _1\) is the following

$$\begin{aligned} \varepsilon _1=\frac{N^2(1-q^2)}{6(n+1)(-H)}. \end{aligned}$$
(95)

Instead of choosing \(\varepsilon _1\), we can choose q, and \(\varepsilon _1\) tends to 0 if and only if q tends to 1. Let \(t_1\), \(t_2\) be the roots of the polynomial p(t) depending also on q. We have

$$\begin{aligned} t_1(q)&=\frac{(n+1)(-H)-q\sqrt{(n+1)^2H^2-6N_2}}{3}-\frac{N^2(1-q^2)}{6(n+1)(-H)},\\ t_2(q)&=\frac{(n+1)(-H)+q\sqrt{(n+1)^2H^2-6N_2}}{3}-\frac{N^2(1-q^2)}{6(n+1)(-H)}. \end{aligned}$$
(96)

One sees that

$$\begin{aligned} t_1(q)\rightarrow \frac{(n+1)(-H)-\sqrt{(n+1)^2H^2-6N_2}}{3}, \\ t_2(q)\rightarrow \frac{(n+1)(-H)+\sqrt{(n+1)^2H^2-6N_2}}{3} \end{aligned}$$

if q tends to 1. We shall now use the maximum principle. Suppose that

$$\begin{aligned} \frac{(n+1)(-H)-q\sqrt{(n+1)^2H^2-6N_2}}{3}<N_1<\frac{(n+1)(-H)+q\sqrt{(n+1)^2H^2-6N_2}}{3}. \end{aligned}$$
(97)

There is q (sufficiently close to 1) such that \(t_1(q)<N_1<t_2(q)\). Hence \(p(N_1)\), which depends only on n, H, \(N_1\), \(N_2\), and \(\varepsilon _1\) (equivalently q), is positive and, by (92), we cannot have \(\frac{1}{2}\Delta {\tilde{u}}(x_i)< \varepsilon _i\) for \(\varepsilon _i\) tending to 0. This is a contradiction. \(\square \)
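The discriminant and root computations (93)–(96) in the proof above can be verified numerically; the following sketch (ours, with arbitrarily chosen admissible values of n, H, \(N_2\), q) does so.

```python
import numpy as np

# Check of (93)-(96): choosing n, H, N2 with (n+1)^2 H^2 - 6 N2 > 0 and
# q in (0,1), the epsilon_1 of (95) makes the discriminant of p equal to
# N^2 q^2, and the roots of p are the expressions t_1(q), t_2(q) of (96).
n, H, N2, q = 3, -2.0, 1.0, 0.9
N = np.sqrt((n + 1) ** 2 * H ** 2 - 6 * N2)
eps1 = N ** 2 * (1 - q ** 2) / (6 * (n + 1) * (-H))        # relation (95)

# p(t) = -3/2 t^2 + [-3 eps1 - (n+1)H] t - (3/2 eps1^2 + N2),  cf. (93)
coeffs = [-1.5, -3 * eps1 - (n + 1) * H, -(1.5 * eps1 ** 2 + N2)]
delta = coeffs[1] ** 2 - 4 * coeffs[0] * coeffs[2]
assert np.isclose(delta, (N * q) ** 2)                      # (94) with (95)

t1 = (n + 1) * (-H) / 3 - q * N / 3 - N ** 2 * (1 - q ** 2) / (6 * (n + 1) * (-H))
t2 = (n + 1) * (-H) / 3 + q * N / 3 - N ** 2 * (1 - q ** 2) / (6 * (n + 1) * (-H))
roots = np.sort(np.roots(coeffs).real)
assert np.allclose(roots, sorted([t1, t2]))                 # (96)
```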

Corollary 5.6

Let \((g,\nabla )\) be a trace-free statistical structure on an n-dimensional manifold M and \(R=HR_0\), where H is a negative number. If g is complete,

$$\begin{aligned} N_2<\frac{H^2(n+1)^2}{6} \end{aligned}$$
(98)

and

$$\begin{aligned} \inf u> \frac{(n+1)(-H)-\sqrt{(n+1)^2H^2-6N_2}}{3}, \end{aligned}$$
(99)

then

$$\begin{aligned} u\ge \frac{(n+1)(-H)+\sqrt{(n+1)^2H^2-6N_2}}{3} \end{aligned}$$
(100)

on M.

Theorem 5.7

Let \((g,\nabla )\) be a trace-free statistical structure on an n-dimensional manifold M and \(R=HR_0\), where H is a negative number. If g is complete, then

$$\begin{aligned} \inf \, \Vert {\hat{\nabla }} A\Vert ^2\le \frac{n(n^2-1)H^2}{4} \end{aligned}$$
(101)

and the following inequalities hold

$$\begin{aligned}&\mathrm{sup}\, u\le \frac{n(n-1)(-H)+\sqrt{n^2(n-1)^2H^2-4N_4}}{2}, \end{aligned}$$
(102)
$$\begin{aligned}&\mathrm{sup}\, u\ge \frac{n(n-1)(-H)-\sqrt{n^2(n-1)^2H^2-4N_4}}{2}, \end{aligned}$$
(103)

where \(u=\Vert A\Vert ^2\) and \(N_4=\frac{n(n-1)}{n+1}\inf \, \Vert {\hat{\nabla }} A\Vert ^2\).

Proof

By Theorem 5.1, we have

$$\begin{aligned} \frac{1}{2}\Delta u\ge (n+1)H u +\frac{n+1}{n(n-1)}u^2+\Vert {\hat{\nabla }} A\Vert ^2.\end{aligned}$$
(104)

Let \(N_3=\mathrm{sup}\, u\). By Theorem 5.4, \(N_3\) is finite. Take sequences \(\varepsilon _i\) and \(x_i\) satisfying (i) and (ii) from Lemma 5.3 applied to the function u. Assume that \(\varepsilon _i\) is decreasing and tends to 0, and \(N_3-\varepsilon _1>0\). By (104), we have

$$\begin{aligned} \frac{n(n-1)}{2(n+1)}\Delta u(x_i)\ge n(n-1)HN_3 + (N_3-\varepsilon _1)^2+N_4.\end{aligned}$$
(105)

Consider the following polynomial of degree 2

$$\begin{aligned} p(t)&=-n(n-1)(-H)t+(t-\varepsilon _1)^2+N_4\\ &=t^2-[2\varepsilon _1+n(n-1)(-H)]t+(\varepsilon _1^2+N_4). \end{aligned}$$
(106)

Denote by \(\delta \) its discriminant. We have

$$\begin{aligned} \delta =n^2(n-1)^2H^2-4N_4+4n(n-1)(-H)\varepsilon _1.\end{aligned}$$
(107)

If \(n^2(n-1)^2H^2-4N_4<0\), then, by taking \(\varepsilon _1\) sufficiently small, we get \(\delta <0\), and hence \(\Delta u(x_i)\) will be positive and bounded away from zero. This gives a contradiction with the maximum principle. We have proved (101), that is, \(n^2(n-1)^2H^2-4N_4\ge 0\).

Set \(N^2=n^2(n-1)^2H^2-4N_4\). Assume first that \(N=0\). Then \(\delta \) is positive and the roots of the polynomial (106) are

$$\begin{aligned} \frac{n(n-1)(-H)}{2}+\varepsilon _1\pm \sqrt{n(n-1)(-H)\varepsilon _1}. \end{aligned}$$
(108)

If \(\varepsilon _1\) tends to 0, both roots tend to \(\frac{n(n-1)(-H)}{2}\). Suppose that \(N_3>\frac{n(n-1)(-H)}{2}\). There is \(\varepsilon _1\) such that \(N_3>\frac{n(n-1)(-H)}{2}+\varepsilon _1\pm \sqrt{n(n-1)(-H)\varepsilon _1}\), and then all \(\Delta u(x_i)\) are positive and bounded away from zero. This gives a contradiction with the maximum principle. Similar arguments work in the case where \(N_3<\frac{n(n-1)(-H)}{2}\).

Assume now that \(N^2>0\). There is \(Q>1\) such that \(\delta =N^2Q^2\). The relation between \(\varepsilon _1\) and Q is the following

$$\begin{aligned} \varepsilon _1=\frac{N^2(Q^2-1)}{4n(n-1)(-H)}. \end{aligned}$$
(109)

The roots of the polynomial (106) are

$$\begin{aligned} t_1(Q)&=\frac{n(n-1)(-H)(Q^2+1)-2Q\sqrt{n^2(n-1)^2H^2-4N_4}}{4}-\frac{N_4(Q^2-1)}{n(n-1)(-H)},\\ t_2(Q)&=\frac{n(n-1)(-H)(Q^2+1)+2Q\sqrt{n^2(n-1)^2H^2-4N_4}}{4}-\frac{N_4(Q^2-1)}{n(n-1)(-H)}. \end{aligned}$$
(110)

If \(Q\rightarrow 1\), then

$$\begin{aligned} t_1(Q)\rightarrow L_1=\frac{n(n-1)(-H)-\sqrt{n^2(n-1)^2H^2-4N_4}}{2}\ge 0 \end{aligned}$$

and

$$\begin{aligned} t_2(Q)\rightarrow P_1=\frac{n(n-1)(-H)+\sqrt{n^2(n-1)^2H^2-4N_4}}{2}\le n(n-1)(-H). \end{aligned}$$

Suppose now that \(N_3<L_1\) or \(N_3>P_1\). If \(N_3<L_1\), then there is Q (equivalently, a corresponding \(\varepsilon _1\)) such that \(N_3<t_1(Q)\), which means that \(p(N_3)\) is positive and depends only on n, H, \(N_4\), and Q. Hence (105) contradicts the maximum principle. We argue similarly if \(N_3>P_1\). \(\square \)
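The relations (107), (109), and (110) can be checked numerically in the same way as in the previous proof; the sketch below (ours, with arbitrarily chosen admissible values of n, H, \(N_4\), Q) confirms them.

```python
import numpy as np

# Check of (106)-(110): with p(t) = t^2 - [2 eps1 + n(n-1)(-H)] t + (eps1^2 + N4),
# the epsilon_1 of (109) gives discriminant N^2 Q^2, and the roots of p agree
# with the expressions t_1(Q), t_2(Q) of (110).
n, H, N4, Q = 3, -1.0, 5.0, 1.2
N2 = n ** 2 * (n - 1) ** 2 * H ** 2 - 4 * N4          # N^2, assumed positive here
N = np.sqrt(N2)
eps1 = N2 * (Q ** 2 - 1) / (4 * n * (n - 1) * (-H))   # relation (109)

coeffs = [1.0, -(2 * eps1 + n * (n - 1) * (-H)), eps1 ** 2 + N4]
delta = coeffs[1] ** 2 - 4 * coeffs[0] * coeffs[2]
assert np.isclose(delta, N2 * Q ** 2)                 # (107) with (109)

common = (n * (n - 1) * (-H) * (Q ** 2 + 1)) / 4 - N4 * (Q ** 2 - 1) / (n * (n - 1) * (-H))
t1 = common - Q * N / 2
t2 = common + Q * N / 2
roots = np.sort(np.roots(coeffs).real)
assert np.allclose(roots, sorted([t1, t2]))           # (110)
```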

The case where \(n=2\) should be treated separately because in this case H may be a function. Note that \(\mathrm{Ric}\,=Hg\); hence \(\widehat{\mathrm{Ric}}\,\) is bounded from below if H is bounded from below. The same arguments as those used in the above proofs yield

Theorem 5.8

Let \((g,\nabla )\) be a conjugate symmetric trace-free statistical structure with complete g on a 2-dimensional manifold M. Then \(R=HR_0\) for some function H, and if \(-\infty <H_2\le H\le 0\) for some number \(H_2\), then

$$\begin{aligned} \inf {\hat{u}}\le \frac{3}{2}H_2^2 \end{aligned}$$
(111)

and

$$\begin{aligned} -H_2-\sqrt{H_2^2-\frac{2}{3}\inf {\hat{u}}}\le \mathrm{sup}\, u\le -H_2+\sqrt{H_2^2-\frac{2}{3} \inf {\hat{u}}}, \end{aligned}$$
(112)

where \({\hat{u}}=\Vert {\hat{\nabla }} A\Vert ^2\). If \(-\infty <H_2\le H\le H_1\le 0\) for some numbers \(H_1\), \(H_2\) and

$$\begin{aligned} \mathrm{sup}\,{\hat{u}}\le \frac{3}{2}H_1^2, \end{aligned}$$
(113)

then

$$\begin{aligned} \inf \, u\ge -H_1+\sqrt{H_1^2-\frac{2}{3}\mathrm{sup}\,{\hat{u}}} \end{aligned}$$
(114)

or

$$\begin{aligned} \inf \, u\le -H_1-\sqrt{H_1^2-\frac{2}{3}\mathrm{sup}\,{\hat{u}}}. \end{aligned}$$
(115)

6 Using Ros’ Integral Formula

Let s be a tensor of type (0, k), \(k\ge 2\), and let \(g_0\) be the standard scalar product on \({\mathbf {R}}^n\). Define the following 1-form on \(S^{n-1}=\{V\in {\mathbf {R}}^n;\Vert V\Vert =1\}\):

$$\begin{aligned} \alpha _V(e)=s(V,\ldots ,V,e,V,\ldots ,V), \end{aligned}$$

where e stands at a fixed place \(i_0\). Let \(\delta \) denote the codifferential relative to \(g_0\) restricted to \(S^{n-1}\). By a straightforward computation, one gets

$$\begin{aligned} \delta \alpha&=-(n+k-2) s(V,\ldots ,V)\\&\quad +\mathrm{tr}\, _gs(\cdot ,V,\ldots ,V, \cdot ,V,\ldots ,V)+\cdots +\mathrm{tr}\, _gs(V,\ldots ,V,\cdot , V,\ldots ,V, \cdot ), \end{aligned}$$
(116)

where, in each trace term, one of the dots “\(\cdot \)” occupies the fixed \(i_0\)-th place.
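For instance (a check not carried out in the text), specializing (116) to \(k=2\) gives

$$\begin{aligned} \delta \alpha =-n\, s(V,V)+\mathrm{tr}\, _gs, \end{aligned}$$

and integrating over \(S^{n-1}\), where \(\int _{S^{n-1}}\delta \alpha =0\), recovers the classical moment identity \(n\int _{S^{n-1}}s(V,V)=\mathrm{vol}\,(S^{n-1})\,\mathrm{tr}\, _gs\).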

Now let g be a positive-definite metric tensor field on a manifold M. For simplicity, we assume that M is connected and oriented. Let UM denote the unit sphere bundle over M. By (116), we have

Proposition 6.1

For a covariant tensor field s of degree \(k\ge 2\), we have

$$\begin{aligned} (n+k-2)\int _{U_xM} s(V,\ldots ,V) =&\int _{U_xM}\mathrm{tr}\, _gs(\cdot ,V,\ldots ,V, \cdot ,V,\ldots ,V)+\cdots \\ &+\int _{U_xM}\mathrm{tr}\, _gs(V,\ldots ,V,\cdot , V,\ldots ,V, \cdot ) \end{aligned}$$
(117)

for every \(x\in M\).
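Proposition 6.1 can also be checked numerically. The following sketch (not part of the paper; the tensor and the grid are illustrative choices) verifies (117) on \(U_xM=S^1\subset {\mathbf {R}}^2\) with the flat metric, for a random tensor of degree \(k=4\) and the fixed place \(i_0=1\), using a periodic trapezoid rule, which integrates trigonometric polynomials exactly.

```python
import numpy as np

# Numerical sanity check of (117) for n = 2, k = 4, i_0 = 1 (first slot).
rng = np.random.default_rng(0)
s = rng.standard_normal((2, 2, 2, 2))                # components s_{abcd}

theta = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
V = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # points of S^1
w = 2.0 * np.pi / len(theta)                          # periodic trapezoid weight

# Left-hand side: (n + k - 2) * integral of s(V, V, V, V), with n + k - 2 = 4.
lhs = 4.0 * w * np.einsum('abcd,ia,ib,ic,id->i', s, V, V, V, V).sum()

# Right-hand side: the slot i_0 = 1 is traced against each of the remaining
# three slots (repeated index in the einsum subscripts); V fills the others.
rhs = w * (
    np.einsum('aacd,ic,id->i', s, V, V).sum()
    + np.einsum('abad,ib,id->i', s, V, V).sum()
    + np.einsum('abca,ib,ic->i', s, V, V).sum()
)

print(np.isclose(lhs, rhs))  # True: the two sides agree up to round-off
```

Since both sides of (117) are linear in s, this check over a random tensor probes all components at once.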

We have the following integral formula of Ros:

$$\begin{aligned} \int _{UM}\mathrm{tr}\, _g({\hat{\nabla }} s)(\cdot ,\cdot ,V,\ldots ,V)=0, \end{aligned}$$
(118)

where s is a covariant tensor field of degree greater than 1 on a compact manifold M and \(\int _{UM}=\int _{x\in M}\int _{U_xM}\); see [16].
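For completeness, we sketch the standard argument behind (118) (the vector field Z is our auxiliary notation): since parallel transport preserves the unit spheres, covariant differentiation commutes with the fiberwise integral, so \(Z_x=\left( \int _{U_xM}s(\cdot ,V,\ldots ,V)\right) ^\sharp \) satisfies

$$\begin{aligned} \mathrm{div}\, Z=\int _{U_xM}\mathrm{tr}\, _g({\hat{\nabla }} s)(\cdot ,\cdot ,V,\ldots ,V), \end{aligned}$$

and integrating this divergence over the compact oriented manifold M gives (118).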

Theorem 6.2

Let \((g,A)\) be a conjugate symmetric statistical structure on a compact manifold M. If the sectional curvature \({\hat{k}}\ge 0\) on M and \({\hat{\nabla }}^2\tau =0\), then \({\hat{\nabla }} A=0\) on M. If, moreover, \({\hat{k}}>0\) at some point p of M, then the statistical structure is trivial.

Proof

Consider the following tensor field s on M:

$$\begin{aligned} s(X_1,\ldots ,X_7)={\hat{\nabla }} A(X_1,\ldots ,X_4)A(X_5,X_6,X_7). \end{aligned}$$
(119)

By Theorem 4.3, we have \({\hat{\nabla }} \tau =0\), which implies that \(\sum _i{\hat{\nabla }} A (X,Y,e_i,e_i)=0\), where, as usual, \(e_1,\ldots ,e_n\) denotes an orthonormal basis of a tangent space. Using (59), we obtain

$$\begin{aligned}&\sum _i{\hat{\nabla }} s(e_i,e_i,V,\ldots ,V)\nonumber \\&\quad =\sum _i{\hat{\nabla }}^2A(e_i,e_i,V,V,V)A(V,V,V)+\sum _i{\hat{\nabla }} A(e_i,V,V,V)^2\nonumber \\&\quad =\Vert ({\hat{\nabla }} K)(V,V,V)\Vert ^2+\sum _i({\hat{R}}(e_i,V)A)(e_i,V,V)A(V,V,V)\nonumber \\&\quad =\Vert ({\hat{\nabla }} K)(V,V,V)\Vert ^2-\sum _iA({\hat{R}}(e_i,V)e_i,V,V)A(V,V,V)\nonumber \\&\qquad -\sum _i2A(e_i,{\hat{R}}(e_i,V)V,V)A(V,V,V). \end{aligned}$$
(120)

If we define a 1-form \(\alpha \) on \(U_pM\) by \( \alpha _V(e)=A({\hat{R}}(e,V)V,V,V)A(V,V,V) \), then, by (116), we have

$$\begin{aligned} \delta \alpha (V)= & {} \sum _iA({\hat{R}}(e_i,V)e_i,V,V)A(V,V,V)\\&+2\sum _iA({\hat{R}}(e_i,V)V,e_i,V)A(V,V,V)\\&+3\sum _iA({\hat{R}}(e_i,V)V,V,V)A(e_i,V,V), \end{aligned}$$

and consequently

$$\begin{aligned}&\int _{U_pM}\left[ -\sum _iA({\hat{R}}(e_i,V)e_i,V,V)A(V,V,V)-\sum _i 2A(e_i,{\hat{R}}(e_i,V)V,V)A(V,V,V)\right] \\&\quad =3\int _{U_pM}g({\hat{R}} (K(V,V),V)V,K(V,V)). \end{aligned}$$

By (118) and (120), one gets

$$\begin{aligned} 0=\int _{UM}\Vert ({\hat{\nabla }} K)(V,V,V)\Vert ^2+3\int _{UM}g({\hat{R}} (K(V,V),V)V,K(V,V)). \end{aligned}$$
(121)

Since \({\hat{k}}\ge 0\), both integrands in (121) are nonnegative; hence \(({\hat{\nabla }} K)(V,V,V)=0\) for every V and, \({\hat{\nabla }} A\) being totally symmetric, polarization gives \({\hat{\nabla }} A=0\) on M, which is the first assertion. To prove the second one, suppose that \({\hat{k}}>0\) at some point \(p\in M\) and, contrary to the claim, that \(K\ne 0\); since \({\hat{\nabla }} K=0\), K is then non-zero at every point, in particular at p. By (121), \(K(V,V)\) is parallel to V for each \(V\in U_pM\). By Theorem 4.3, we know that E vanishes on M. Consider the function \(U_pM\ni V\rightarrow A(V,V,V)\in {\mathbf {R}}\). It attains its maximum at some vector, say \(e_1\), which is an eigenvector of \(K_{e_1}\); that is, \(K(e_1,e_1)=\lambda _1e_1\) with \(\lambda _1>0\). Let \(e_1,\ldots , e_n\) be an orthonormal eigenbasis of \(K_{e_1}\), so that \(K(e_1, e_i)=\lambda _ie_i\) for some numbers \(\lambda _i\). Since \(E=0\), we have \(\sum _i\lambda _i=0\). Since \(K(e_i,e_i)\) is parallel to \(e_i\) and \(K(e_1,e_i)=\lambda _i e_i\), we get \(0=g(K(e_i,e_i),e_1)=g(K(e_1,e_i), e_i)=\lambda _i\) for every \(i=2,\ldots ,n\). It follows that \(\lambda _1=0\), a contradiction. \(\square \)