1 Introduction

This paper deals with the following general question:

Let \(X^\alpha \in \Gamma (T M)\) be a vector field and \(Y_\beta \in \Gamma (T^* M)\) be a co-vector field on a smooth manifold M. Does there exist a smooth Riemannian metric \(g_{\alpha \beta }\) on M such that \(Y_\beta = g_{\alpha \beta } X^\alpha \)?

Clearly, this is not always true: \(X^\alpha \) and \(Y_\beta \) will have to satisfy some compatibility conditions. Firstly, \(X^\alpha \) and \(Y_\beta \) need to have the same set of zeroes (critical points). Secondly, at all other points \(m \in M\), they need to satisfy \(X^\alpha Y_\alpha |_m > 0\). A third (and slightly less obvious) compatibility condition is obtained by differentiating the equation \(Y_\beta = g_{\alpha \beta } X^\alpha \): at each critical point \(m \in M\) there should exist a scalar product \({\bar{g}}_{\alpha \beta } \in T_m^* M \otimes _{\textrm{S}} T_m^* M\) such that \( \nabla _\alpha Y_\gamma |_m = {\bar{g}}_{\beta \gamma } \nabla _\alpha X^\beta |_m \) for some (equivalently, any) connection \(\nabla _\alpha \). This condition does not hold automatically: it represents a compatibility constraint on \(X^\alpha \) and \(Y_\beta \) with a natural interpretation in some examples below.

While these three conditions are clearly necessary, it is not obvious that they are also sufficient. The main result of this paper shows that this is indeed the case, under mild smoothness and non-degeneracy assumptions; namely, at all critical points, we require non-degeneracy of the derivative of \(Y_\beta \) and we assume that \(X^\alpha \) and \(Y_\beta \) are real analytic in suitable local coordinates; cf. Sect. 2 for the details.

Theorem 1.1

(Main result) Let \(X^\alpha \in \Gamma (T M)\) and \(Y_\beta \in \Gamma (T^* M)\) satisfy Assumption 2.1 below. Then there exists a metric \(g_{\alpha \beta } \in \Gamma (T^* M \otimes _{\textrm{S}} T^* M ) \) satisfying \(Y_\beta = g_{\alpha \beta } X^\alpha \) if and only if the following conditions hold:

  1. (i)

    For all \(m \in M\) with \(Y_\beta |_m \ne 0\) we have \(X^\alpha Y_\alpha |_m > 0\);

  2. (ii)

    For all \(m \in M\) with \(Y_\beta |_m = 0\) we have \(X^\alpha |_m = 0\);

  3. (iii)

    For all \(m \in M\) with \(Y_\beta |_m = 0\) there exists a scalar product \({\bar{g}}_{\alpha \beta } \in T_m^* M \otimes _{\textrm{S}} T_m^* M\) such that

    $$\begin{aligned} \nabla _\alpha Y_\gamma |_m = {\bar{g}}_{\beta \gamma } \nabla _\alpha X^\beta |_m. \end{aligned}$$

The choice of the connection \(\nabla \) in (iii) is arbitrary.

We shall also prove a variant of this result where \(X^\alpha \) and \(Y_\beta \) are of class \(C^{k+1}\) for some \(k \in {\mathbb N}\). In this case, the metric \(g_{\alpha \beta }\) is of class \(C^k\); see Theorem 2.7 below.

While Theorem 1.1 is of independent interest, our motivation comes from an open question on gradient flow structures for dissipative quantum systems, which will be discussed below.

Let us first briefly sketch the structure of the proof. To prove the sufficiency of conditions (i)–(iii), it suffices to construct a local metric around every point of M. The global metric can then be constructed using a partition of unity. Around non-critical points the construction is straightforward: in local coordinates, it corresponds to constructing a positive definite matrix that maps one given vector to another one. However, it is not trivial to construct a smooth metric satisfying \(Y_\beta = g_{\alpha \beta } X^\alpha \) in a neighbourhood of a critical point.

To solve this problem, we assume that the sought metric has a power series expansion in a suitable chart around the critical point. We then derive an infinite hierarchy of tensor equations, which express power series coefficients of degree N in terms of coefficients of degree at most \(N - 1\) for \(N \ge 1\). Solvability of the lowest order equation is guaranteed by compatibility condition (iii). We then prove that higher order equations can be solved iteratively. Moreover, the norms of the solutions are exponentially bounded in the degree, which allows us to construct a convergent power series that satisfies the desired equation in a neighbourhood of the critical point.

1.1 Application to gradient structures

Consider now the special case where \(Y \in \Gamma (T^* M)\) is the derivative of a smooth function \(f \in C^\infty (M)\), i.e., \(Y_\beta = \nabla _\beta f\). Then our question becomes: Does there exist a smooth Riemannian metric \(g_{\alpha \beta }\) such that X is the gradient of f with respect to the metric g, i.e., \(X^\alpha = g^{\alpha \beta } \nabla _\beta f\)? In other words, the question is whether the ODE \(\dot{u} = -X(u)\) on M can be formulated as a gradient flow equation \( \dot{u}(t) = - \nabla f\big (u(t)\big ) \) for a suitable Riemannian metric. Our main result yields necessary and sufficient conditions.

Gradient flows describe motion in the direction of steepest descent of the function f in the geometry defined by the metric g. The identification of an ODE as a gradient flow equation is often fruitful, as there are powerful techniques available for the analysis of gradient flows [1].

As an application of our main result, we address an open question on the gradient flow structure of finite-dimensional dissipative quantum systems. To put this result into context, let us first discuss the corresponding classical setting.

1.1.1 Classical Markov semigroups

Consider an irreducible continuous-time Markov chain on a finite set \(\mathcal {X}\) with transition rates \(q_{xy} \ge 0\) for \(x, y \in \mathcal {X}\) with \(x \ne y\). The associated Markov semigroup \((P_t)_{t \ge 0}\) is a \(C_0\)-semigroup of positive operators on \({\mathbb R}^\mathcal {X}\) that preserves the constant functions. Its infinitesimal generator \(L: {\mathbb R}^\mathcal {X}\rightarrow {\mathbb R}^\mathcal {X}\) is given by

$$\begin{aligned} \big (L \psi \big )(x) := \sum _{y \in \mathcal {X}} q_{xy} \big ( \psi (y) - \psi (x) \big ). \end{aligned}$$

As time evolves, the marginal law of the Markov chain describes a curve \((\mu _t)_{t > 0}\) in \(\mathscr {P}_*(\mathcal {X})\), the set of strictly positive probability densities on \(\mathcal {X}\). It evolves according to the Kolmogorov forward equation (KFE)

$$\begin{aligned} \partial _t \mu _t = L^* \mu _t, \quad \text { where } \big (L^* \mu \big )(x) = \sum _{y \ne x} \big ( \mu (y) q_{yx} - \mu (x) q_{xy} \big ) \end{aligned}$$

for \(\mu \in \mathscr {P}(\mathcal {X})\). Let \(\pi \in \mathscr {P}_*(\mathcal {X})\) be the unique stationary distribution. It is well known and easy to verify that the relative entropy

$$\begin{aligned} {{\,\textrm{Ent}\,}}_\pi (\mu ) := \sum _{x \in \mathcal {X}} \mu (x) \log \Big (\frac{\mu (x)}{\pi (x)}\Big ) \end{aligned}$$

decreases along trajectories of the KFE.
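The monotonicity of the relative entropy can be observed numerically. The following Python sketch is an illustration only: the three-state chain, its rates, the initial law, and the explicit Euler discretisation are assumptions made for this example and are not taken from the text. The rates are chosen so that the uniform distribution is stationary while detailed balance fails, which shows that the decrease of the entropy does not require reversibility.

```python
import math

# Hypothetical 3-state chain (illustrative rates): Q[x][y] = q_xy for x != y.
Q = [[0.0, 2.0, 1.0],
     [1.0, 0.0, 2.0],
     [2.0, 1.0, 0.0]]
# Total inflow equals total outflow at every state, so the uniform law is stationary.
PI = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]


def kfe_rhs(mu):
    """(L^* mu)(x) = sum over y != x of (mu(y) q_yx - mu(x) q_xy)."""
    n = len(mu)
    return [sum(mu[y] * Q[y][x] - mu[x] * Q[x][y] for y in range(n) if y != x)
            for x in range(n)]


def rel_entropy(mu, pi):
    """Relative entropy Ent_pi(mu) = sum_x mu(x) log(mu(x) / pi(x))."""
    return sum(m * math.log(m / p) for m, p in zip(mu, pi) if m > 0)


def simulate(mu0, dt=0.01, steps=200):
    """Explicit Euler steps for the KFE; returns the final law and the entropy values."""
    mu = list(mu0)
    entropies = [rel_entropy(mu, PI)]
    for _ in range(steps):
        rhs = kfe_rhs(mu)
        mu = [m + dt * r for m, r in zip(mu, rhs)]
        entropies.append(rel_entropy(mu, PI))
    return mu, entropies


mu_final, entropies = simulate([0.7, 0.2, 0.1])
```

For the chosen step size each Euler step is multiplication by a stochastic matrix that leaves the uniform law invariant, so the recorded entropies should be non-increasing up to rounding errors.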

Much more is true if the Markov chain is reversible, i.e., the detailed balance condition \(\pi (x) q_{x y} = \pi (y) q_{y x}\) holds for all \(x \ne y\). Equivalently, the generator L is selfadjoint in the Hilbert space \(L^2(\mathcal {X},\pi )\). In this case, it was shown in [17, 18] that the KFE can be written as the gradient flow equation of \({{\,\textrm{Ent}\,}}_\pi \) with respect to a Riemannian metric on \(\mathscr {P}_*(\mathcal {X})\). The associated Riemannian distance is given by a discrete dynamical optimal transport problem, in the spirit of the Benamou–Brenier formulation for the Wasserstein distance [4]. This gradient flow structure is a discrete version of the Wasserstein gradient flow structure for the Fokker–Planck equation discovered by Jordan, Kinderlehrer, and Otto [15]. This construction has been the starting point for the development of discrete Ricci curvature based on geodesic convexity, with applications to functional inequalities [10,11,12,13, 20].

It was shown by Dietert [9] that the reversibility assumption is also necessary: if the KFE can be written as a gradient flow equation for \({{\,\textrm{Ent}\,}}_\pi \) with respect to some Riemannian metric on \(\mathscr {P}_*(\mathcal {X})\), then the underlying Markov chain is necessarily reversible. Combined with the results from [17, 18], this result characterises reversible Markov chains as exactly those that admit a gradient flow structure for the relative entropy \({{\,\textrm{Ent}\,}}_\pi \).

In this paper we provide a noncommutative analogue of this result.

1.1.2 Quantum Markov semigroups

Let \((\mathscr {P}_t)_{t \ge 0}\) be a quantum Markov semigroup on a finite-dimensional \(C^*\)-algebra \(\mathcal {A}\), i.e., \((\mathscr {P}_t)_{t \ge 0}\) is a \(C_0\)-semigroup of linear operators on \(\mathcal {A}\) such that \(\mathscr {P}_t {\textbf{1}}= {\textbf{1}}\) and the operators \(\mathscr {P}_t\) are completely positive, i.e., \(\mathscr {P}_t \otimes I_n\) is a positive operator on \(\mathcal {A}\otimes {\mathbb M}_n({\mathbb C})\) for all \(n \ge 1\). (Here, \({\textbf{1}}\in \mathcal {A}\) denotes the unit element, and \(I_n\) denotes the identity operator on the algebra of \(n \times n\)-matrices \({\mathbb M}_n({\mathbb C})\).) The infinitesimal generator of \((\mathscr {P}_t)_{t \ge 0}\) will be denoted by \(\mathscr {L}\).

Let \((\mathscr {P}_t^\dagger )_{t \ge 0}\) be the adjoint semigroup with respect to the duality pairing \(\langle {A,B}\rangle = {{\,\textrm{Tr}\,}}[A^* B]\). This is a \(C_0\)-semigroup of completely positive and trace-preserving linear operators with generator \(\mathscr {L}^\dagger \). In particular, the operators \(\mathscr {P}_t^\dagger \) map the set of density matrices \({\mathfrak P}:= \{ \rho \in \mathcal {A}\,:\, \rho \ge 0 \text { and } {{\,\textrm{Tr}\,}}[\rho ] = 1 \}\) into itself. Here we restrict our attention to the ergodic setting: we assume that there exists a unique stationary state, i.e., a unique density matrix \(\sigma \in {\mathfrak P}\) satisfying \(\mathscr {L}^\dagger \sigma = 0\). We shall assume that \(\sigma \) is invertible.

The non-commutative analogue of the KFE is the Lindblad equation \(\partial _t \rho _t = \mathscr {L}^\dagger \rho _t\). It is well known [22, 23] that the von Neumann relative entropy

$$\begin{aligned} H_\sigma (\rho ) := {{\,\textrm{Tr}\,}}[\rho (\log \rho - \log \sigma )] \end{aligned}$$

decreases along solutions to this equation. Moreover, following the earlier works [6, 19], it was shown in [7, 21] that the Lindblad equation \(\partial _t \rho = \mathscr {L}^\dagger \rho \) can be written as a gradient flow equation for \(H_\sigma \) under the condition of GNS-detailed balance. This condition means that the generator \(\mathscr {L}\) is selfadjoint with respect to the weighted \(L^2\)-type scalar product

$$\begin{aligned} \langle {A,B}\rangle _\sigma ^{\textsc {gns}} := {{\,\textrm{Tr}\,}}[ \sigma A^* B] \end{aligned}$$

named after Gelfand, Naimark, and Segal. As in the discrete setting above, the associated Riemannian metric is related to a dynamical optimal transport problem.

It is now natural to ask whether the condition of GNS-detailed balance is also necessary for the existence of a gradient flow structure for the von Neumann relative entropy. However, it was shown in [8] that a different symmetry condition is necessary, namely the condition of BKM-detailed balance. This condition corresponds to the selfadjointness of \(\mathscr {L}\) with respect to another weighted \(L^2\)-type scalar product

$$\begin{aligned} \langle {A,B}\rangle _\sigma ^{\textsc {bkm}} := \int _0^1 {{\,\textrm{Tr}\,}}[ \sigma ^{1-s} A^* \sigma ^s B] \; \text {d}s, \end{aligned}$$

named after Bogoliubov, Kubo, and Mori. As the condition of BKM-detailed balance is strictly weaker than GNS-detailed balance [8], there was a gap between the known necessary and sufficient conditions. As an application of Theorem 1.1 we prove the following result, which closes this gap.

Theorem 1.2

Let \(\mathscr {L}\) be the generator of an ergodic quantum Markov semigroup on a finite-dimensional \(C^*\)-algebra \(\mathcal {A}\), and let \(\sigma \in {\mathfrak P}_+\) be its stationary state. The following statements are equivalent:

  1. (1)

    The operator \(\mathscr {L}\) is selfadjoint with respect to the BKM scalar product \(\langle {\cdot , \cdot }\rangle _{\sigma }^{\textsc {bkm}}\).

  2. (2)

    There exists a Riemannian metric on the interior of \({\mathfrak P}\) for which the Lindblad equation \({\dot{\rho }}_t = \mathscr {L}^\dagger \rho _t\) is the gradient flow equation of the von Neumann relative entropy \(H_\sigma \).

The implication (2) \(\Rightarrow \) (1) was proved in [8, Theorem 2.9]. The converse implication is new.
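To see concretely how the GNS and BKM scalar products differ, one can evaluate both in an eigenbasis of \(\sigma \). The Python sketch below assumes that \(\sigma \) is diagonal with positive eigenvalues (an assumption of this illustration; the function names are hypothetical). In that basis the integral over s in the BKM product reduces to the logarithmic mean of two eigenvalues, whereas the GNS product only involves a single eigenvalue.

```python
import math


def log_mean(a, b):
    """Integral of a**(1 - s) * b**s over s in [0, 1], i.e. the logarithmic mean."""
    return a if a == b else (b - a) / (math.log(b) - math.log(a))


def gns(A, B, sigma):
    """Tr[sigma A^* B] for diagonal sigma = diag(sigma[0], ..., sigma[n-1])."""
    n = len(sigma)
    return sum(sigma[i] * A[j][i].conjugate() * B[j][i]
               for i in range(n) for j in range(n))


def bkm(A, B, sigma):
    """Integral over s in [0, 1] of Tr[sigma^(1-s) A^* sigma^s B], sigma diagonal."""
    n = len(sigma)
    return sum(log_mean(sigma[i], sigma[j]) * A[j][i].conjugate() * B[j][i]
               for i in range(n) for j in range(n))


# Illustrative data (not from the text): a qubit state and an off-diagonal matrix unit.
sigma = [0.75, 0.25]
identity = [[1.0, 0.0], [0.0, 1.0]]
E = [[0.0, 1.0], [0.0, 0.0]]

gns_value = gns(E, E, sigma)  # equals the eigenvalue sigma[1]
bkm_value = bkm(E, E, sigma)  # equals the logarithmic mean of the two eigenvalues
```

On matrices that commute with \(\sigma \) the two products coincide; they differ on off-diagonal elements such as the matrix unit used here.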

1.2 Structure of the paper

Section 2 contains the main result and a reformulation of the result in the gradient case. The proof of the main result is contained in Sect. 3, except for the construction of the local metric, which is presented in Sect. 4. Section 5 deals with the construction of a metric of class \(C^k\) under the assumption that the fields \(X^\alpha \) and \(Y_\beta \) are of class \(C^{k+1}\). The application to quantum Markov semigroups is contained in Sect. 6.

2 Main results

Let \(X^\alpha \in \Gamma (T M)\) be a vector field and \(Y_\beta \in \Gamma (T^* M)\) be a co-vector field on a smooth manifold M. Let \({\textsf {N}}_Y:= \{ m \in M \,:\, Y|_m = 0 \}\) be the set of critical points of Y.

In the sequel we impose the following assumptions on the fields \(X^\alpha \) and \(Y_\beta \).

Assumption 2.1

(i):

(Non-degeneracy) The bilinear form \(\nabla _\alpha Y_\beta |_m\) is non-degenerate for all \(m \in {\textsf {N}}_Y\) for some (equivalently, any) connection \(\nabla \).

(ii):

(Real analyticity) For all \(m \in {\textsf {N}}_Y\) there exists a neighbourhood \(U_m \ni m\), an open set \(\Omega \subset {\mathbb R}^n\), and a coordinate chart \(\varphi _m: U_m \rightarrow \Omega \), such that the fields \({\widetilde{X}}^a:= X^a \circ \varphi _m^{-1}: \Omega \rightarrow {\mathbb R}\) and \({\widetilde{Y}}_a:= Y_a \circ \varphi _m^{-1}: \Omega \rightarrow {\mathbb R}\) are real analytic at \(\varphi _m(m)\), i.e., \({\widetilde{X}}^a\) and \({\widetilde{Y}}_a\) have a converging power series expansion around \(\varphi _m(m)\) for all \(a \in \{1, \ldots , n\}\).

The non-degeneracy assumption (i) implies that critical points are isolated. This assumption is necessary, in the sense that it cannot be removed in the statement of our main result; see Remark 2.5 below. In the special case where \(Y_\beta = \nabla _\beta f\) is the differential of a function f, the assumption (i) corresponds to the invertibility of the Hessian of f. The second assumption is a mild regularity condition. Note that its validity depends on the choice of the chart.

Remark 2.2

The choice of the connection in (i) above is irrelevant, since the difference of two connections \(\nabla \) and \({\widetilde{\nabla }}\) satisfies \({\widetilde{\nabla }}_\alpha Y_\beta - \nabla _\alpha Y_\beta = \Gamma _{\alpha \beta }^\gamma Y_\gamma \), where \(\Gamma _{\alpha \beta }^\gamma \) is a (1, 2) tensor. In particular, \({\widetilde{\nabla }}_\alpha Y_\beta |_m = \nabla _\alpha Y_\beta |_m\) for all \(m \in {\textsf {N}}_Y\), since Y vanishes there. For the same reason, the choice of the connection is irrelevant in (iii) in the following result.

Using the notation introduced above, we restate our main result (Theorem 1.1) for the convenience of the reader.

Theorem 2.3

(Main result) Let \(X^\alpha \in \Gamma (T M)\) and \(Y_\beta \in \Gamma (T^* M)\) satisfy Assumption 2.1. Then there exists a smooth metric \(g_{\alpha \beta } \in \Gamma (T^* M \otimes _{\textrm{S}} T^* M )\) satisfying \(Y_\beta = g_{\alpha \beta } X^\alpha \), if and only if the following conditions hold:

  1. (i)

    \(X^\alpha Y_\alpha |_m>0\) for all \(m \in M \setminus {\textsf {N}}_Y\);

  2. (ii)

    \(X^\alpha |_m=0\) for all \(m \in {\textsf {N}}_Y\);

  3. (iii)

    For all \(m \in {\textsf {N}}_Y\) there exists a scalar product \({\bar{g}}_{\alpha \beta } \in T_m^* M \otimes _{\textrm{S}} T_m^* M\), such that

    $$\begin{aligned} \nabla _\alpha Y_\gamma |_m = {\bar{g}}_{\beta \gamma } \nabla _\alpha X^\beta |_m, \end{aligned}$$

    where \(\nabla _\alpha \) is an arbitrary connection.

Remark 2.4

As the necessity of the three conditions has been discussed above, it remains to prove their sufficiency. This will be done in Sect. 3 below.

Remark 2.5

It is worth pointing out that Theorem 2.3 fails when the non-degeneracy condition (i) in Assumption 2.1 is violated.

To see this, we work in \({\mathbb R}^2\) with the standard coordinate chart and write \(x = (x_1, x_2)\) for \(x \in {\mathbb R}^2\). Consider the vector field X and the co-vector field Y defined by

$$\begin{aligned} X(x) = \frac{|x|^2}{\sqrt{2}} \begin{bmatrix} x_1 + x_2 \\ -x_1 + x_2 \end{bmatrix} \quad \textrm{and}\quad Y(x) = {|x|^2}x \, . \end{aligned}$$

These fields are polynomial, hence real analytic. (Furthermore, \(Y = \nabla f\), where \(f(x) = \frac{1}{4}|x|^4\).) Since \({\textsf {N}}_Y = \{0\}\) and \(\langle {X,Y}\rangle (x) = |x|^6/\sqrt{2}\) for all \(x \in {\mathbb R}^2\), it is readily seen that conditions (i) and (ii) from Theorem 2.3 are satisfied. Moreover, since \(\nabla X(0) = 0\) and \(\nabla Y(0) = 0\), condition (iii) from Theorem 2.3 holds as well (for any scalar product). However, the non-degeneracy assumption (i) in Assumption 2.1 is violated. We will show that there does not exist a continuous metric \(g_{\alpha \beta }\) satisfying \(Y_\beta = g_{\alpha \beta } X^\alpha \).

To obtain a contradiction, suppose that such a metric \(g_{\alpha \beta }\) exists. Let \(G_x\) denote the coordinate matrix of the metric \(g_{\alpha \beta }\) at the point x. Using the matrix \(U:=\frac{1}{\sqrt{2}}\big [{\begin{matrix} 1 &{} 1 \\ -1 &{} 1 \end{matrix}}\big ]\), which describes a rotation by \(-\frac{\pi }{4}\), we can write \(X(x) = |x|^2\, U x\). Therefore, the equation \(Y_\beta = g_{\alpha \beta } X^\alpha \) implies that \(x = G_x U x\) for all \(x \in {\mathbb R}^2\). Applying this identity to tx with \(t \in {\mathbb R} \setminus \{0\}\) and dividing by t, we infer that \(x = G_{tx}Ux \). Since we assume continuity of \(g_{\alpha \beta }\), we therefore obtain

$$\begin{aligned} G_0 Ux = \lim _{t\rightarrow 0}G_{tx}Ux = x\,, \end{aligned}$$

hence \(U = G_0^{-1}\). Since \(G_0^{-1}\) is symmetric, but U is not, this is the desired contradiction.
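The two computations used in this remark are easy to check numerically. The following Python sketch (an illustration; the helper names and the sample point are hypothetical) evaluates the pairing \(\langle {X,Y}\rangle \) at a sample point and confirms that the matrix \(U^{-1}\), which continuity would force to coincide with \(G_0\), is not symmetric.

```python
import math

S = 1.0 / math.sqrt(2.0)
U = [[S, S], [-S, S]]  # rotation by -pi/4, as in the remark


def X(x):
    """Vector field X(x) = |x|^2 U x."""
    r2 = x[0] ** 2 + x[1] ** 2
    return [r2 * S * (x[0] + x[1]), r2 * S * (-x[0] + x[1])]


def Y(x):
    """Co-vector field Y(x) = |x|^2 x."""
    r2 = x[0] ** 2 + x[1] ** 2
    return [r2 * x[0], r2 * x[1]]


def pairing(x):
    """The scalar <X, Y>(x), expected to equal |x|^6 / sqrt(2)."""
    return sum(a * b for a, b in zip(X(x), Y(x)))


def inverse_2x2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


# Continuity of the metric at the origin would force G_0 = U^{-1}.
G0_candidate = inverse_2x2(U)
# A symmetric matrix would give 0 here; the value is -sqrt(2) instead.
asymmetry = G0_candidate[0][1] - G0_candidate[1][0]
```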

In the special case where the co-vector field \(Y_\alpha := \nabla _\alpha f \in \Gamma (T^* M)\) is the derivative of a scalar function \(f: M \rightarrow {\mathbb R}\), the above result admits a convenient reformulation. Assuming that f attains its minimum at a unique critical point \({\bar{m}} \in M\), the next result shows that property (iii) above is equivalent to the symmetry and positivity of the linearised map \(\Lambda : T_{{\bar{m}}} M \rightarrow T_{{\bar{m}}} M\), \(Z \mapsto \nabla _Z X\), at the critical point \({\bar{m}}\). The relevant scalar product is given by the Hessian of f.

Corollary 2.6

(Gradient case) Let \(f \in C^\infty (M)\) be a function and \(X^\alpha \in \Gamma (TM)\) be a vector field, such that \(X^\alpha \) and \(Y_\alpha := \nabla _\alpha f\) satisfy Assumption 2.1. Suppose that Y has a unique zero, \({\bar{m}} \in M\), at which f attains its minimum. Then there exists a Riemannian metric \(g_{\alpha \beta } \in \Gamma (T^* M \otimes _{\textrm{S}} T^* M )\) satisfying

$$\begin{aligned} \nabla _\beta f = g_{\alpha \beta } X^\alpha , \end{aligned}$$

if and only if the following conditions hold:

  1. (i)

    \(X^\alpha \nabla _\alpha f |_m > 0\) for all \(m \in M\) with \(m \ne {\bar{m}}\);

  2. (ii)

    \(X^\alpha |_{{\bar{m}}} = 0\);

  3. (iii)

    The linear map \(\Lambda := \nabla _\alpha X^\beta |_{{\bar{m}}}: T_{{\bar{m}}} M \rightarrow T_{{\bar{m}}} M\) is positive and symmetric with respect to the Hessian scalar product \(h_{\alpha \beta }:= \nabla _\alpha \nabla _\beta f|_{{\bar{m}}}\) on \(T_{{\bar{m}}} M\).

Proof

It is clear that the conditions (i) and (ii) match the corresponding conditions in Theorem 2.3.

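Suppose now that condition (iii) from Theorem 2.3 holds for some scalar product \({\bar{g}}_{\alpha \beta }\), and let \({\bar{g}}^{\alpha \beta } \in T_{{\bar{m}}} M \otimes _{\textrm{S}} T_{{\bar{m}}} M\) denote its inverse, so that \(\nabla _\alpha X^\beta |_{{\bar{m}}} = {\bar{g}}^{\beta \gamma } h_{\alpha \gamma }\). We have to show that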

$$\begin{aligned} h_{\alpha \beta } (\Lambda Z)^\alpha W^\beta&= h_{\alpha \beta } Z^\alpha (\Lambda W)^\beta&& \text {for all } Z^\alpha , W^\alpha \in T_{{\bar{m}}} M, \text { and } \\ h_{\alpha \beta } (\Lambda Z)^\alpha Z^\beta&> 0&& \text {for all }Z^\alpha \in T_{{\bar{m}}} M, \ Z^\alpha \ne 0. \end{aligned}$$

To show this, note that \((\Lambda Z)^\alpha = Z^\gamma \nabla _\gamma X^\alpha = Z^\gamma {\bar{g}}^{\alpha \delta } h_{\delta \gamma }\) for \(Z^\alpha \in T_{{\bar{m}}} M\). Hence, for \(W^\alpha \in T_{{\bar{m}}} M\), we see that the expression

$$\begin{aligned} h_{\alpha \beta } (\Lambda Z)^\alpha W^\beta = h_{\alpha \beta } {\bar{g}}^{\alpha \delta } h_{\delta \gamma } Z^\gamma W^\beta \end{aligned}$$

is invariant under interchanging Z and W, which proves the desired symmetry. Moreover, this expression implies that \(h_{\alpha \beta } (\Lambda Z)^\alpha Z^\beta = {\bar{g}}^{\alpha \beta } {\widetilde{Z}}_\alpha {\widetilde{Z}}_\beta \) where \({\widetilde{Z}}_\alpha = h_{\alpha \beta } Z^\beta \). Since \(h_{\alpha \beta }\) is invertible by Assumption 2.1 and \({\bar{g}}^{\alpha \beta }\) is positive definite, it follows that \(h_{\alpha \beta } (\Lambda Z)^\alpha Z^\beta > 0\) whenever \(Z^\alpha \ne 0\).

Conversely, suppose that condition (iii) of the corollary holds. For all \(Z^\alpha , W^\alpha \in T_{{\bar{m}}} M\) it follows that \( h_{\alpha \beta } (\Lambda Z)^\alpha W^\beta = {\widetilde{g}}_{\alpha \beta } Z^\alpha W^\beta \) for a positive and symmetric tensor \({\widetilde{g}}_{\alpha \beta } \in T_{{\bar{m}}}^* M \otimes _{\textrm{S}} T_{{\bar{m}}}^* M\). Since \( h_{\alpha \beta } (\Lambda Z)^\alpha W^\beta = h_{\alpha \beta } Z^\gamma \nabla _\gamma X^\alpha W^\beta \) we infer that \({\widetilde{g}}_{\alpha \beta } = h_{\gamma \beta } \nabla _\alpha X^\gamma \). Now define

$$\begin{aligned} {\bar{g}}^{\alpha \beta } := h^{\alpha \delta } {\widetilde{g}}_{\delta \gamma } h^{\gamma \beta } \in T_{{\bar{m}}} M \otimes _{\textrm{S}} T_{{\bar{m}}} M. \end{aligned}$$

Since \({\widetilde{g}}_{\alpha \beta }\) is positive and symmetric and \( h^{\alpha \delta }\) is invertible, \({\bar{g}}^{\alpha \beta }\) defines a scalar product. Moreover, we have the desired identity \( \nabla _\alpha X^\beta |_{{\bar{m}}} = {\bar{g}}^{\beta \gamma } h_{\alpha \gamma }, \) which completes the proof. \(\square \)
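The linear algebra in this proof can be illustrated on a small example. The Python sketch below assumes a quadratic function \(f(x) = \frac{1}{2}(x_1^2 + 2 x_2^2)\) on \({\mathbb R}^2\) and a linear vector field \(X(x) = Jx\); both are hypothetical choices made for this illustration, as are the helper names. In matrix form, condition (iii) of the corollary says that HJ is symmetric and positive definite, and the scalar product constructed in the proof has inverse \(J H^{-1}\).

```python
from fractions import Fraction as Fr


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def inverse_2x2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def is_spd_2x2(A):
    """Symmetric and positive definite (Sylvester's criterion for 2x2 matrices)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return A[0][1] == A[1][0] and A[0][0] > 0 and det > 0


# Hessian of the assumed function f(x) = (x1^2 + 2 x2^2) / 2 at its minimum 0.
H = [[Fr(1), Fr(0)], [Fr(0), Fr(2)]]
# Assumed linear field X(x) = J x, so that J[b][a] is the derivative d_a X^b.
J = [[Fr(2), Fr(2)], [Fr(1), Fr(3)]]

# Matrix of the bilinear form (Z, W) -> h(Lambda Z, W); condition (iii) asks for SPD.
HJ = matmul(H, J)
# Inverse scalar product from the proof: h^{-1} (h J) h^{-1} = J h^{-1}.
G_inv = matmul(J, inverse_2x2(H))
```

Since f is quadratic and X is linear in this example, the constant matrix `G_inv` satisfies \(X = G^{-1} \nabla f\) on all of \({\mathbb R}^2\), not only to first order at the critical point.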

In the special case where \(Y_\beta \) is the derivative of a scalar function f, the existence of a metric satisfying \(\nabla _\beta f = g_{\alpha \beta } X^\alpha \) was proved in [3] on the complement of the set of critical points. The existence of a metric with the desired property on the whole manifold was stated as an open question [3, Question 1]. Subsequently, under an additional assumption, which corresponds to (iii) in Theorem 2.3, the existence of a continuous extension of \(g_{\alpha \beta }\) to all of M was obtained in [5]; cf. Sect. 5 below for more details. However, the metric constructed in [5] is in general not differentiable, even if the fields \(X^\alpha \) and \(Y_\beta \) are smooth; see Example 5.2 below.

Here we show that \(C^k\)-regularity of the metric can be obtained if the fields \(X^\alpha \) and \(Y_\beta \) are assumed to be of class \(C^{k+1}\).

Theorem 2.7

(Existence of a metric of class \(C^k\)) Let \(X^\alpha \) and \(Y_\beta \) be of class \(C^{k+1}\) on M for some \(k \in {\mathbb N}\) and assume that \(\nabla _\alpha Y_\beta |_m\) is non-degenerate for all \(m \in {\textsf {N}}_Y\) for some (equivalently, any) connection \(\nabla \). Then there exists a metric \(g_{\alpha \beta } \) of class \(C^k\) on M satisfying \(Y_\beta = g_{\alpha \beta } X^\alpha \) if and only if conditions (i), (ii), and (iii) of Theorem 2.3 hold.

The proof of this result will be given in Sect. 5 below. It relies on the construction based on tensor equations that we develop in the proof of Theorem 2.3.

3 Proof of the main result

Our main result (Theorem 2.3) relies on two local versions of this result. First we construct a local solution around any non-critical point \(m \in M \setminus {\textsf {N}}_Y\). In the special case where \(Y_\beta \) is the derivative of a scalar function, a different construction of a metric away from critical points was carried out in [3]; see Sect. 5 below.

Theorem 3.1

(Local solutions around non-critical points) Suppose that \(X^\alpha \in \Gamma (T M)\) and \(Y_\beta \in \Gamma (T^* M)\) satisfy \(X^\alpha Y_\alpha |_{{\bar{m}}}>0\) for some \({\bar{m}} \in M\). Then there exists a neighbourhood U of \({\bar{m}}\) and a smooth local metric \(g_{\alpha \beta }: U \rightarrow T^*M \otimes _{\textrm{S}} T^*M\) such that

$$\begin{aligned} X^\alpha |_m = g^{\alpha \beta } Y_\beta |_m \end{aligned}$$
(3.1)

for all \(m \in U\).

Proof

Since \(X^\alpha Y_\alpha |_{{\bar{m}}} > 0\), we have \(Y_\alpha |_{{\bar{m}}} \ne 0\). Therefore, we can complete the co-vector field \(e^1_\alpha := Y_\alpha \in T^* M\) to a dual frame \(E:= ( e^1_\alpha , \ldots , e^n_\alpha )\) in a neighbourhood V of \({\bar{m}}\), i.e., \(( e^1_\alpha |_m, \ldots , e^n_\alpha |_m )\) is a basis of \(T_m^* M\) for all \(m \in V\). The coordinates of \(X^\alpha \) with respect to this frame are given by \(\bar{X}^j = X^\alpha e^j_\alpha : V \rightarrow {\mathbb R}\) for \(j = 1, \ldots , n\). Since \(\bar{X}^1|_{{\bar{m}}} > 0\), the set \(U:= V \cap \{ \bar{X}^1 > 0 \}\) is still a neighbourhood of \({\bar{m}}\). Let us define \({\bar{X}}': U \rightarrow {\mathbb R}^{n-1}\) and \(f: U \rightarrow {\mathbb R}\) by

$$\begin{aligned} {\bar{X}}' := ({\bar{X}}^2, \ldots , {\bar{X}}^n), \qquad f := \frac{{\bar{X}}^1}{2} + \frac{2}{{\bar{X}}^1} |\bar{X}'|^2. \end{aligned}$$

We then define the bilinear form \(g^{\alpha \beta }\) in coordinates \(G = (g^{i j})_{i,j=1}^n\) as

$$\begin{aligned} G := \begin{bmatrix} {\bar{X}}^1 &{} (\bar{X}')^\intercal \\ \bar{X}' &{} f I_{n-1} \end{bmatrix}, \end{aligned}$$

where \(I_k\) denotes the \(k \times k\) identity matrix. Since the matrix G is symmetric, the bilinear form g is symmetric as well. To verify that \(G > 0\), we write

$$\begin{aligned} G&= \begin{bmatrix} \sqrt{\frac{{\bar{X}}^1}{2}} \\ \sqrt{\frac{2}{{\bar{X}}^1}}\bar{X}' \end{bmatrix} \begin{bmatrix} \sqrt{\frac{{\bar{X}}^1}{2}}&\sqrt{\frac{2}{{\bar{X}}^1}}(\bar{X}')^\intercal \end{bmatrix} + \begin{bmatrix} \frac{{\bar{X}}^1}{2} &{} 0 \\ 0 &{} f I_{n-1} - \frac{2}{{\bar{X}}^1} \bar{X}' (\bar{X}')^\intercal \end{bmatrix} \\&\ge \begin{bmatrix} \frac{{\bar{X}}^1}{2} &{} 0 \\ 0 &{} \big ( f - \frac{2}{{\bar{X}}^1} |\bar{X}'|^2 \big ) I_{n-1} \end{bmatrix} = \frac{{\bar{X}}^1}{2} I_{n} > 0, \end{aligned}$$

as desired. To complete the proof, we compute

$$\begin{aligned} g^{\alpha \beta } Y_\beta e_\alpha ^i = g^{\alpha \beta } e_\beta ^1 e_\alpha ^i = g^{i 1} = {\bar{X}}^i =X^\alpha e_\alpha ^i, \end{aligned}$$

which shows (3.1). \(\square \)
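The pointwise construction in this proof is straightforward to reproduce. The Python sketch below is an illustration under the assumption that the frame coordinates \({\bar{X}}^1, \ldots , {\bar{X}}^n\) of X at a single point are already given; the function names and the sample values are hypothetical. It assembles the matrix G and checks positive definiteness by attempting a Cholesky factorisation. In the chosen frame Y has coordinates \((1, 0, \ldots , 0)\), so the identity (3.1) says that the first column of G equals \({\bar{X}}\).

```python
import math


def inverse_metric_in_frame(xbar):
    """Matrix G = (g^{ij}) from the proof, in a frame whose first co-vector is Y.

    xbar holds the frame coordinates of X; xbar[0] = X^a Y_a must be positive.
    """
    x1, rest = xbar[0], xbar[1:]
    if x1 <= 0:
        raise ValueError("the construction requires X^a Y_a > 0")
    f = x1 / 2 + (2 / x1) * sum(v * v for v in rest)
    n = len(xbar)
    G = [[0.0] * n for _ in range(n)]
    G[0][0] = x1
    for j, v in enumerate(rest, start=1):
        G[0][j] = G[j][0] = v  # first row and column carry the remaining coordinates
        G[j][j] = f            # lower-right block is f times the identity
    return G


def is_positive_definite(G):
    """Try a Cholesky factorisation; in exact arithmetic it succeeds iff G > 0."""
    n = len(G)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = G[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


xbar = [0.5, -3.0, 2.0]  # sample coordinates with xbar[0] > 0
G = inverse_metric_in_frame(xbar)
first_column = [row[0] for row in G]  # equals G applied to Y = (1, 0, 0)
```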

The second local version of Theorem 2.3 concerns the construction of a smooth local metric in a neighbourhood of a critical point.

Theorem 3.2

(Local solutions around critical points) Let \(X^\alpha \in \Gamma (T M)\) and \(Y_\beta \in \Gamma (T^* M)\) satisfy Assumption 2.1. Suppose that \(X^\alpha |_{{\bar{m}}} = Y_\alpha |_{{\bar{m}}} = 0\) for some \({\bar{m}} \in M\), and suppose that there exists a scalar product \({\bar{g}} \in T_{{\bar{m}}} M \otimes _{\textrm{S}} T_{{\bar{m}}} M\), such that

$$\begin{aligned} \nabla _\alpha X^\beta |_{{\bar{m}}} = {\bar{g}}^{\beta \gamma } \nabla _\alpha Y_\gamma |_{{\bar{m}}}. \end{aligned}$$

Then there exists a neighbourhood U of \({\bar{m}}\) and a smooth local metric \(g_{\alpha \beta }: U \rightarrow T^*M \otimes _{\textrm{S}} T^*M\) such that

$$\begin{aligned} X^\alpha |_m = g^{\alpha \beta } Y_\beta |_m \end{aligned}$$

for all \(m \in U\).

The proof of Theorem 3.2 is the main challenge of this paper and will be carried out in Sect. 4.

We now show that the main result (Theorem 2.3) follows readily from the local Theorems 3.1 and 3.2 using a partition of unity argument; see, e.g., [14, Theorem 1.131] for the existence of a partition of unity.

Proof of Theorem 2.3

The local results Theorems 3.1 and 3.2 guarantee that for any \(m \in M\) there exists a neighbourhood \(U_m\) and a local metric on \(U_m\) with inverse \(g_m^{\alpha \beta }\), such that the desired identity

$$\begin{aligned} X^\alpha = g_m^{\alpha \beta } Y_\beta \end{aligned}$$

holds on \(U_m\).

Let \(\{ f_k \}_{k \in {\mathbb N}}\) be a partition of unity subordinated to the cover \(\{ U_m: m \in M \}\) of the manifold M, i.e., there exists a locally finite open covering \(\{V_k\}_{k \in {\mathbb N}}\) of M, such that each \(V_k\) is contained in \(U_{m_k}\) for some \(m_k \in M\), each function \(f_k: M \rightarrow {\mathbb R}\) is nonnegative and smooth and its support is contained in \(V_k\), and we have \( \sum _{k \in {\mathbb N}} f_k(m) = 1 \) for all \(m \in M\) (where the sum is finite for each m). We then define

$$\begin{aligned} g^{\alpha \beta } := \sum _{k \in {\mathbb N}} f_k g_{m_k}^{\alpha \beta }. \end{aligned}$$

As \(g^{\alpha \beta }\) is a finite convex combination of the scalar products \(g^{\alpha \beta }_{m_k}\), it is a scalar product. By linearity, \(g^{\alpha \beta }\) satisfies the desired equation \(X^\alpha = g^{\alpha \beta } Y_\beta \). \(\square \)
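The gluing step rests on two elementary facts at each point: a convex combination of positive definite matrices is positive definite, and the equation \(X^\alpha = g^{\alpha \beta } Y_\beta \) is linear in \(g^{\alpha \beta }\). The following Python sketch checks both for two hypothetical local inverse metrics at a single point; the matrices and the weights, which stand in for the values of two partition functions, are assumptions of this illustration.

```python
from fractions import Fraction as Fr


def apply(G, y):
    """Matrix-vector product G y."""
    return [sum(G[i][j] * y[j] for j in range(len(y))) for i in range(len(G))]


def convex_combination(weights, matrices):
    """Entrywise sum of weights[k] * matrices[k]; weights are nonnegative, sum 1."""
    n = len(matrices[0])
    return [[sum(w * M[i][j] for w, M in zip(weights, matrices))
             for j in range(n)] for i in range(n)]


def is_spd_2x2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return A[0][1] == A[1][0] and A[0][0] > 0 and det > 0


Y = [1, 0]
X = [1, 1]
# Two different positive definite matrices that both map Y to X.
G1 = [[1, 1], [1, 2]]
G2 = [[1, 1], [1, 5]]
weights = [Fr(1, 3), Fr(2, 3)]  # values of two partition functions at one point

G = convex_combination(weights, [G1, G2])
```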

4 Local solutions around critical points

In this section we give the proof of Theorem 3.2, which deals with the construction of the metric around critical points.

Fix \({\bar{m}} \in M\) and let \(\varphi =(\varphi ^1,\dots ,\varphi ^n): U \rightarrow \Omega \) be a coordinate chart which maps a neighbourhood U of \({\bar{m}}\) onto an open set \(\Omega \subseteq {\mathbb R}^n\), where n is the dimension of the manifold M. Using this chart we can identify the vector field \(X^\alpha \in \Gamma (T M)\) defined on \(U \subseteq M\) with the function \({\widetilde{X}}^\alpha : \Omega \rightarrow V:= {\mathbb R}^n\), where \({\widetilde{X}}^\alpha :=({\widetilde{X}}^1,\dots ,{\widetilde{X}}^n)\) with \( {\widetilde{X}}^j:=(\nabla _\alpha \varphi ^j X^\alpha ) \circ \varphi ^{-1}\). Similarly, the co-vector field \(Y_\beta \in \Gamma (T^* M)\) defined on \(U \subseteq M\) can be identified with a function \({\widetilde{Y}}_\beta : \Omega \rightarrow V^*\), and the metric \(g_{\alpha \beta } \in \Gamma (T^* M \otimes _\text {S}T^* M)\) can be identified with a function \({\widetilde{g}}_{\alpha \beta }: \Omega \rightarrow V^* \otimes _\text {S}V^*\). In the remainder of this section, we will work on a fixed chart and remove the tildes to lighten notation.

4.1 Motivation of the tensor equations

Let \({\bar{x}} \in \Omega \) be such that \(Y_\beta |_{{\bar{x}}} = 0\), and suppose that the identity \(X^\alpha = g^{\alpha \beta } Y_\beta \) holds in a neighbourhood of \({\bar{x}}\). For \(N \in {\mathbb N}\) and all indices \(c_1, \ldots , c_N \in \{ 1, \ldots , n \}\) we will derive a system of equations that the partial derivatives \(T_{c_1 \cdots c_N}^{a b}:= \partial _{c_1} \cdots \partial _{c_N} g^{ab}\) satisfy at \(x = {\bar{x}}\).

Taking partial derivatives \(\partial _c\) for \(c \in \{1, \ldots , n\}\) yields

$$\begin{aligned} \partial _c X^a = \partial _c g^{a b} Y_b + g^{a b} \partial _c Y_b. \end{aligned}$$

Since \(Y_b|_{{\bar{x}}} = 0\), we find that

$$\begin{aligned} \partial _c X^a = g^{a b} \partial _c Y_b \end{aligned}$$

at \(x = {\bar{x}}\). Taking second order derivatives, we find, for \(c_1, c_2 \in \{1, \ldots , n\}\),

$$\begin{aligned} \partial _{c_1} \partial _{c_2} X^a = \partial _{c_1} \partial _{c_2} g^{a b} Y_b + \partial _{c_1} g^{a b} \partial _{c_2} Y_b + \partial _{c_2} g^{a b} \partial _{c_1} Y_b + g^{a b} \partial _{c_1} \partial _{c_2} Y_b. \end{aligned}$$

As \(Y_b|_{{\bar{x}}} = 0\), the first term on the right-hand side vanishes, and we infer that the tensor of first-order derivatives \(T_c^{a b}:= \partial _c g^{a b}\) is a solution to the system

$$\begin{aligned} U_{c_2 b} T_{c_1}^{a b} + U_{c_1 b} T_{c_2}^{a b} = R_{c_1 c_2}^a, \end{aligned}$$

where \(U_{a b}:= \partial _a Y_b \) and \(R_{c_1 c_2}^a:= \partial _{c_1} \partial _{c_2} X^a - g^{a b} \partial _{c_1} \partial _{c_2} Y_b. \)

More generally, for \(N = 1, 2, \ldots \), we find

$$\begin{aligned} \partial _{c_1} \cdots \partial _{c_N} X^a = \sum _{ S \subseteq [N] } \partial _{c_S} g^{a b} \partial _{c_{[N] \setminus S}} Y_b, \end{aligned}$$

where we use the shorthand notation \(\partial _{c_S} = \partial _{c_{i_1}} \cdots \partial _{c_{i_k}}\) for \(S = \{i_1, \ldots , i_k\} \subseteq [N]:=\{1, \ldots , N\}\) with \(i_\mu \ne i_\nu \) for \(\mu \ne \nu \). Since \(Y_b|_{{\bar{x}}} = 0\), the term with \(|S| = N\) vanishes at \(x = {\bar{x}}\). Thus, the derivatives of order \(N-1\), given by \(T_{c_1 \cdots c_{N-1}}^{a b}:= \partial _{c_1} \cdots \partial _{c_{N-1}} g^{a b}\), solve the system

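$$\begin{aligned} \sum _{i=1}^N U_{c_i b} \, T_{c_1 \cdots \widehat{c_i} \cdots c_N}^{a b} = R_{c_1 \cdots c_N}^a, \end{aligned}$$
(4.1)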

where \(U_{c b}:= \partial _c Y_b\), and

$$\begin{aligned} R_{c_1 \cdots c_N}^a := \partial _{c_1} \cdots \partial _{c_N} X^a - \sum _{ \begin{array}{c} S \subseteq [N] \\ |S| < N-1 \end{array}} \partial _{c_S} g^{a b} \partial _{c_{[N] \setminus S}} Y_b \end{aligned}$$

depends on (derivatives of) X and Y, and on derivatives of g of order at most \(N-2\). The notation \(\widehat{c_i}\) means that the index \(c_i\) is removed.

The identity (4.1) suggests an iterative scheme to construct a local solution \(g^{\alpha \beta }\) to the equation \(X^\alpha = g^{\alpha \beta } Y_\beta \) around a critical point \({\bar{x}} \in U\) as a power series

$$\begin{aligned} g^{a b}|_{x} := \sum _{N = 0}^\infty \frac{1}{N!} T^{a b}_{c_1 \cdots c_N} (x - {\bar{x}})^{c_1} \cdots (x - {\bar{x}})^{c_N} \end{aligned}$$

with coefficients \(T^{\alpha \beta }_{\gamma _1 \cdots \gamma _N} \in V^{\otimes _\text {S}2} \otimes (V^*)^{\otimes _\text {S}N}\). The idea is to define, for \(N = 0\), \(T^{a b}:= {\bar{g}}^{a b}\), where \({\bar{g}} \in T_{{\bar{x}}}^* M \otimes _\text {S}T_{{\bar{x}}}^* M\) is the scalar product satisfying

$$\begin{aligned} \partial _c X^a|_{{\bar{x}}} = {\bar{g}}^{a b} \ \partial _c Y_b|_{{\bar{x}}}, \end{aligned}$$

which exists by assumption. Higher order Taylor coefficients \(T_{c_1 \ldots c_N}^{a b}\) are then constructed by iteratively solving a system of tensor equations of the form (4.1).

Section 4.2 deals with the existence of a solution to these equations. The construction and the convergence of the iterative scheme are contained in Sect. 4.3.

4.2 Solving the tensor equations

We start by formulating an explicit solution to the tensor Eq. (4.1) of order \(N = 2\). We will make the crucial assumption that \(U_{\alpha \beta }\) is invertible. In our application, this assumption corresponds to the non-degeneracy in Assumption 2.1.

Lemma 4.1

Let V be a finite-dimensional vector space, and let \(R_{\gamma \delta }^\alpha \in V \otimes (V^* \otimes _\text {S}V^*) \) and \(U_{\alpha \beta }\in V^* \otimes V^*\) be given. We assume that \(U_{\alpha \beta }\) is invertible with inverse \(W^{\alpha \beta }\in V \otimes V\), i.e.,

$$\begin{aligned} U_{\alpha \beta }W^{\beta \gamma }=\delta _\alpha ^\gamma . \end{aligned}$$

Then the tensor \(T_\gamma ^{\alpha \beta } \in (V \otimes V) \otimes V^*\) defined by

$$\begin{aligned} T_\gamma ^{\alpha \beta } := \frac{1}{2} \Big ( W^{\beta \delta } R_{\gamma \delta }^\alpha + W^{\alpha \delta } R_{\gamma \delta }^\beta - U_{\gamma \gamma '} W^{\alpha \alpha '} W^{\beta \beta '} R_{\alpha ' \beta '}^{\gamma '} \Big ) \end{aligned}$$

satisfies the equations \(T_\gamma ^{\alpha \beta } = T_\gamma ^{\beta \alpha }\) and

$$\begin{aligned} U_{\delta \beta } T_\gamma ^{\alpha \beta } + U_{\gamma \beta } T_\delta ^{\alpha \beta } = R_{\gamma \delta }^\alpha . \end{aligned}$$
(4.2)

Proof

The fact that \(T_\gamma ^{\alpha \beta } = T_\gamma ^{\beta \alpha }\) follows readily from the definition. To show that (4.2) holds, note that by definition of T,

$$\begin{aligned} 2 U_{\delta \beta } T_\gamma ^{\alpha \beta }&= R_{\gamma \delta }^\alpha + U_{\delta \beta } W^{\alpha \epsilon } R_{\gamma \epsilon }^\beta - U_{\gamma \gamma '} W^{\alpha \alpha '} R_{\alpha ' \delta }^{\gamma '}, \end{aligned}$$
(4.3)
$$\begin{aligned} 2 U_{\gamma \beta } T_\delta ^{\alpha \beta }&= R_{\delta \gamma }^\alpha + U_{\gamma \beta } W^{\alpha \epsilon } R_{\delta \epsilon }^\beta - U_{\delta \delta '} W^{\alpha \alpha '} R_{\alpha ' \gamma }^{\delta '}. \end{aligned}$$
(4.4)

Relabeling indices on the right-hand side and using the symmetry of R, we observe that the second term in (4.3) cancels against the third term in (4.4), and the second term in (4.4) cancels against the third term in (4.3). Summing these identities and using \(R_{\delta \gamma }^\alpha = R_{\gamma \delta }^\alpha \), we thus obtain (4.2). \(\square \)
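As a sanity check that is not part of the proof, the explicit formula of Lemma 4.1 can be tested with exact rational arithmetic. The sketch below is illustrative only: the helper names (`inverse`, `lemma_41_solution`, `random_data`, `satisfies_42`), the dimension and the random test data are assumptions made for this example. The matrix U is chosen strictly diagonally dominant so that it is invertible, and it is deliberately not symmetric.

```python
from fractions import Fraction
from itertools import product
import random

def inverse(U):
    """Gauss-Jordan inverse of a square matrix with Fraction entries."""
    n = len(U)
    A = [list(U[i]) + [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col:
                A[r] = [x - A[r][col] * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]

def lemma_41_solution(U, R):
    """T[a][b][c] = T_c^{ab} from the explicit formula of Lemma 4.1."""
    n = len(U)
    idx = range(n)
    W = inverse(U)  # U_{ab} W^{bc} = delta_a^c
    T = [[[None] * n for _ in idx] for _ in idx]
    for a, b, c in product(idx, repeat=3):
        first = sum(W[b][d] * R[a][c][d] for d in idx)
        second = sum(W[a][d] * R[b][c][d] for d in idx)
        third = sum(U[c][g] * W[a][a2] * W[b][b2] * R[g][a2][b2]
                    for g, a2, b2 in product(idx, repeat=3))
        T[a][b][c] = (first + second - third) / 2
    return T

def random_data(n, seed=0):
    """Hypothetical test data: diagonally dominant U, and R[a][c][d] symmetric in (c, d)."""
    rng = random.Random(seed)
    U = [[Fraction(rng.randint(-3, 3) + (10 if i == j else 0)) for j in range(n)]
         for i in range(n)]
    raw = [[[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)] for _ in range(n)]
    R = [[[Fraction(raw[a][c][d] + raw[a][d][c]) for d in range(n)] for c in range(n)]
         for a in range(n)]
    return U, R

def satisfies_42(U, R, T):
    """Check U_{db} T_c^{ab} + U_{cb} T_d^{ab} = R_{cd}^a exactly."""
    idx = range(len(U))
    return all(
        sum(U[d][b] * T[a][b][c] + U[c][b] * T[a][b][d] for b in idx) == R[a][c][d]
        for a, c, d in product(idx, repeat=3))
```

Calling `lemma_41_solution(*random_data(3))` and passing the result to `satisfies_42` checks both sides of (4.2) entry by entry; because all entries are `Fraction` objects, the comparison is exact rather than up to rounding.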

We also need the following multilinear generalisation.

Lemma 4.2

Fix \(N \ge 2\). Let V be a finite-dimensional vector space, and let \(R_{\gamma _1 \cdots \gamma _N}^\alpha \in V \otimes (V^*)^{\otimes _\text {s}N}\) and \(U_{\alpha \beta } \in V^* \otimes V^*\) be given. We assume that \(U_{\alpha \beta }\) is invertible with inverse \(W^{\alpha \beta } \in V \otimes V\). Then the tensor \(T_{\gamma _1 \cdots \gamma _{N-1}}^{\alpha \beta } \in V^{\otimes _\text {s}2} \otimes (V^*)^{\otimes _\text {s}(N-1)} \) defined by

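$$\begin{aligned} T_{\gamma _1 \cdots \gamma _{N-1}}^{\alpha \beta } := \frac{1}{N} \bigg ( W^{\beta \delta } R_{\gamma _1 \cdots \gamma _{N-1} \delta }^\alpha + W^{\alpha \delta } R_{\gamma _1 \cdots \gamma _{N-1} \delta }^\beta - \frac{1}{N-1} \sum _{j=1}^{N-1} U_{\gamma _j \gamma _j'} W^{\alpha \alpha '} W^{\beta \beta '} R_{\alpha ' \beta ' \gamma _1 \cdots \widehat{\gamma _j} \cdots \gamma _{N-1}}^{\gamma _j'} \bigg ) \end{aligned}$$
(4.5)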

satisfies

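$$\begin{aligned} \sum _{i=1}^N U_{\gamma _i \beta } \, T_{\gamma _1 \cdots \widehat{\gamma _i} \cdots \gamma _N}^{\alpha \beta } = R_{\gamma _1 \cdots \gamma _N}^\alpha . \end{aligned}$$
(4.6)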

Proof

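The fact that T belongs to \(V^{\otimes _\text {s}2} \otimes (V^*)^{\otimes _\text {s}(N-1)} \) follows readily from the definition. To show that (4.6) holds, note that, by the definition of T, the identity \(U_{\alpha \beta } W^{\beta \gamma } = \delta _\alpha ^\gamma \) and the symmetry of R,

$$\begin{aligned} N \sum _{i=1}^N U_{\gamma _i \beta } \, T_{\gamma _1 \cdots \widehat{\gamma _i} \cdots \gamma _N}^{\alpha \beta } = N R_{\gamma _1 \cdots \gamma _N}^\alpha + \sum _{i=1}^N U_{\gamma _i \beta } W^{\alpha \delta } R_{\delta \gamma _1 \cdots \widehat{\gamma _i} \cdots \gamma _N}^\beta - \frac{1}{N-1} \sum _{i=1}^N \sum _{j \ne i} U_{\gamma _j \gamma _j'} W^{\alpha \alpha '} R_{\alpha ' \gamma _1 \cdots \widehat{\gamma _j} \cdots \gamma _N}^{\gamma _j'}. \end{aligned}$$

This yields the result, as the first term has the desired form, and the second term cancels against the third term, as can be seen by renaming indices \((\alpha ', \gamma _j')\) into \((\delta ,\beta )\) and noting that each index j occurs for exactly \(N-1\) values of i. \(\square \)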
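The multilinear system can also be checked numerically. The sketch below is hedged: the closed form coded in `solve_tensor_equation`, with prefactors 1/N and 1/(N(N-1)), is an assumption chosen so that it reduces to the formula of Lemma 4.1 for N = 2; the function names, the fixed 2 x 2 matrix U with its integer inverse W, and the random symmetric tensor R are hypothetical test data. Tensors are stored as dictionaries keyed by index tuples.

```python
from fractions import Fraction
from itertools import product
import random

def symmetric_R(n, N, seed=0):
    """Random R^a_{c_1...c_N}, symmetric in its N lower indices, keyed by (a, c_1, ..., c_N)."""
    rng = random.Random(seed)
    R = {}
    for a in range(n):
        for c in product(range(n), repeat=N):
            key = (a, *sorted(c))
            if key not in R:
                R[key] = Fraction(rng.randint(-5, 5))
            R[(a, *c)] = R[key]
    return R

def solve_tensor_equation(U, W, R, N):
    """Assumed closed form for T^{ab}_{c_1...c_{N-1}}, keyed by (a, b, c_1, ..., c_{N-1})."""
    idx = range(len(U))
    T = {}
    for a, b in product(idx, repeat=2):
        for c in product(idx, repeat=N - 1):
            first = sum(W[b][d] * R[(a, *c, d)] for d in idx)
            second = sum(W[a][d] * R[(b, *c, d)] for d in idx)
            third = sum(
                U[c[j]][g] * W[a][a2] * W[b][b2] * R[(g, a2, b2, *c[:j], *c[j + 1:])]
                for j in range(N - 1)
                for g, a2, b2 in product(idx, repeat=3))
            T[(a, b, *c)] = (first + second - third / (N - 1)) / N
    return T

def residual(U, T, R, N):
    """Largest violation of sum_i U_{c_i b} T^{ab}_{c without c_i} = R^a_{c_1...c_N}."""
    idx = range(len(U))
    return max(
        abs(sum(U[c[i]][b] * T[(a, b, *c[:i], *c[i + 1:])]
                for i in range(N) for b in idx) - R[(a, *c)])
        for a in idx for c in product(idx, repeat=N))

# Hypothetical non-symmetric U with determinant 1, so that its inverse W is integral.
U = [[Fraction(2), Fraction(1)], [Fraction(3), Fraction(2)]]
W = [[Fraction(2), Fraction(-1)], [Fraction(-3), Fraction(2)]]
```

For each order N one builds `R = symmetric_R(2, N)`, solves with `solve_tensor_equation(U, W, R, N)` and evaluates `residual`; a residual of exactly zero indicates that the assumed closed form solves the system for that data.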

4.3 Iterative construction of the power series & Proof of Theorem 3.2

We now place ourselves in the setting of Theorem 3.2. Thus, let \(X^\alpha \in \Gamma (T M)\) and \(Y_\beta \in \Gamma (T^* M)\) satisfy Assumption 2.1, and suppose that \(X^\alpha |_{{\bar{m}}} = Y_\alpha |_{{\bar{m}}} = 0\) for some fixed \({\bar{m}} \in M\). We assume that there exists a scalar product \({\bar{g}} \in T_{{\bar{m}}} M \otimes _{\textrm{S}} T_{{\bar{m}}} M\) satisfying

$$\begin{aligned} \nabla _\alpha X^\beta |_{{\bar{m}}} = {\bar{g}}^{\beta \gamma } \nabla _\alpha Y_\gamma |_{{\bar{m}}}. \end{aligned}$$

Our goal is to construct the local metric \(g^{\alpha \beta }\) around \({\bar{m}}\) as a convergent power series centered at \({\bar{x}} = \varphi ({\bar{m}})\). We now present the definition of its coefficients \(T_{c_1 \cdots c_N}^{a b}\), which is motivated by Eq. (4.1). Our computations will be performed in a fixed chart \(\varphi : U \rightarrow \Omega \) around \({\bar{m}}\) which satisfies Assumption 2.1.

Definition 4.3

(The power series coefficients \(T_{c_1 \cdots c_N}^{a b}\)) Write \(U_{\alpha \beta }:= \nabla _\alpha Y_\beta |_{{\bar{m}}}\) for brevity.

  • Initialisation: We define the initial tensor \(T^{\alpha \beta } \in V \otimes _\text {S}V\) of our iteration as

    $$\begin{aligned} T^{a b} := {\bar{g}}^{a b}. \end{aligned}$$
  • Iterative step (special case \(N=2\)): We first define \(R_{\gamma \delta }^\alpha \in V \otimes (V^* \otimes _\text {S}V^*)\) by

    $$\begin{aligned} R_{c d}^a := \partial _{c} \partial _{d} X^a - T^{a b} \partial _{c} \partial _{d} Y_b \end{aligned}$$

    and then define \(T_\gamma ^{\alpha \beta } \in (V \otimes _\text {S}V) \otimes V^*\) as the solution to the system

    $$\begin{aligned} U_{d b} T_c^{a b} + U_{c b} T_d^{a b} = R_{c d}^a \end{aligned}$$

    constructed in Lemma 4.1.

  • Iterative step  (\(N = 2, 3, \ldots \)): We first define \(R_{\gamma _1 \cdots \gamma _N}^{\alpha } \in V \otimes (V^*)^{\otimes _\text {S}N}\) in terms of the lower order tensors \(T^{\alpha \beta }, T_{\gamma _1}^{\alpha \beta }, \ldots , T_{\gamma _1 \cdots \gamma _{N-2}}^{\alpha \beta }\) by

    $$\begin{aligned} R_{c_1 \cdots c_N}^{a}&:= \partial _{c_1} \cdots \partial _{c_N} X^a - \sum _{ \begin{array}{c} S \subseteq [N] \\ |S| < N-1 \end{array}} T_{c_S}^{a b} \ \partial _{c_{[N] \setminus S}} Y_b. \end{aligned}$$
    (4.7)

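    Here we use the shorthand notation \( T_{c_S}:= T_{c_{i_1} \cdots c_{i_k}} \) for \(S:= \{ i_1, \ldots , i_k\}\) with \(i_\mu \ne i_\nu \) for \(\mu \ne \nu \). Then we define the tensor \(T_{\gamma _1 \cdots \gamma _{N-1}}^{\alpha \beta } \in V^{\otimes _\text {S}2} \otimes (V^*)^{\otimes _\text {S}(N-1)} \) as the solution to the system

    $$\begin{aligned} \sum _{i=1}^N U_{c_i b} \, T_{c_1 \cdots \widehat{c_i} \cdots c_N}^{a b} = R_{c_1 \cdots c_N}^{a} \end{aligned}$$

    constructed in Lemma 4.2, where \(\widehat{c_i}\) indicates that the index \(c_i\) is omitted.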

Remark 4.4

The nondegeneracy assumption on the derivative \(\nabla _\alpha Y_\beta |_{{\bar{m}}}\) from Assumption 2.1 is crucially used in this construction, as the application of Lemmas 4.1 and 4.2 requires the invertibility of \(U_{\alpha \beta }\).
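To illustrate the iteration of Definition 4.3, the following sketch specialises it to dimension \(n = 1\), where every tensor is a number and the system for \(T_{N-1}\) reduces to \(N U T_{N-1} = R_N\). This is a minimal sketch under stated assumptions: the function name `taylor_coefficients_1d` and the choice \(X(x) = x\), \(Y(x) = \sin x\) around \({\bar{x}} = 0\) are hypothetical and serve only as an example.

```python
from fractions import Fraction
from math import comb, factorial

def taylor_coefficients_1d(dX, dY, order):
    """Return [T_0/0!, ..., T_order/order!] for the scalar equation X = g * Y.

    dX[j] and dY[j] are the j-th derivatives of X and Y at the critical point
    (so dX[0] == dY[0] == 0), supplied for j = 0, ..., order + 1.
    """
    U = Fraction(dY[1])            # U = Y'(x_bar), must be non-zero
    T = [Fraction(dX[1]) / U]      # initialisation: T_0 = g_bar
    for N in range(2, order + 2):
        # Scalar form of (4.7): there are comb(N, k) subsets S of [N] with |S| = k <= N - 2.
        R = Fraction(dX[N]) - sum(comb(N, k) * T[k] * dY[N - k] for k in range(N - 1))
        # Scalar form of the tensor equation: N * U * T_{N-1} = R_N.
        T.append(R / (N * U))
    return [T[k] / factorial(k) for k in range(order + 1)]

# Hypothetical example: X(x) = x and Y(x) = sin(x) near x_bar = 0.
dX = [0, 1, 0, 0, 0, 0]
dY = [0, 1, 0, -1, 0, 1]
coefficients = taylor_coefficients_1d(dX, dY, order=4)
```

In this example the only solution of \(X = g Y\) is \(g(x) = x / \sin x\), and the computed coefficients \(1, 0, 1/6, 0, 7/360\) agree with the beginning of its Taylor expansion \(1 + x^2/6 + 7x^4/360 + \cdots \).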

Our next aim is to show that the power series

$$\begin{aligned} g^{a b}|_x := \sum _{N = 0}^\infty \frac{1}{N!} T^{a b}_{c_1 \cdots c_N} (x - {\bar{x}})^{c_1} \cdots (x - {\bar{x}})^{c_N} \end{aligned}$$

converges and defines a Riemannian metric in a neighbourhood of \({\bar{x}}\). For this purpose we equip the spaces \( V^{\otimes \ell } \otimes (V^*)^{\otimes k}\) with the norm

$$\begin{aligned} \big \Vert W_{\alpha _1 \cdots \alpha _k} ^{\beta _1 \ldots \beta _\ell } \big \Vert _\infty :=\max _{ a_1, \ldots , a_k, b_1, \ldots , b_\ell } \big | W_{a_1 \cdots a_k}^{b_1 \cdots b_\ell } \big |, \end{aligned}$$

where \(W_{a_1 \cdots a_k}^{b_1 \cdots b_\ell }\) are the coordinates of \(W_{\alpha _1 \cdots \alpha _k} ^{\beta _1 \ldots \beta _\ell }\) in the standard basis of \({\mathbb R}^n\). For brevity, let us write

$$\begin{aligned} r_N := \Vert R_{\gamma _1 \cdots \gamma _N}^\alpha \Vert _\infty \quad \textrm{and}\quad t_N := \Vert T_{\gamma _1 \cdots \gamma _N}^{\alpha \beta } \Vert _\infty . \end{aligned}$$

We then obtain the following crucial growth bound on the power series coefficients.

Lemma 4.5

There exist constants \(C, p < \infty \) such that \(t_N \le C N! p^N\) for all \(N \ge 1\).

Proof

Recall that we work in a chart for which Assumption 2.1 holds. Therefore, the real analyticity assumption implies that there exist constants \(C', q < \infty \) such that

$$\begin{aligned} \big | \partial _{c_1} \cdots \partial _{c_M} {\widetilde{X}}^a|_{{\bar{x}}} \big | \le C' M! q^M \quad \textrm{and}\quad \big | \partial _{c_1} \cdots \partial _{c_M} {\widetilde{Y}}_a|_{{\bar{x}}} \big |&\le C' M! q^M \end{aligned}$$
(4.8)

for all \(M \in {\mathbb N}\) and all \(c_1, \ldots , c_M \in \{1, \ldots , n\}\); see, e.g., [16, Proposition 2.2.10].

Since \(U_{\alpha \beta }\) is non-degenerate by Assumption 2.1, its inverse \(W^{\alpha \beta }\) is well-defined and we have

$$\begin{aligned} K := \max \big \{ \Vert U_{\alpha \beta }\Vert _\infty , \Vert W^{\alpha \beta }\Vert _\infty \big \} < \infty . \end{aligned}$$

Using the bounds on the power series coefficients from (4.8) and the definitions of T and R from (4.5) and (4.7), we obtain the following relations between the norms \(r_k\) and \(t_k\):

$$\begin{aligned} \frac{r_N}{N!}&\le C' q^N + \frac{C' n}{N!} \sum _{ \begin{array}{c} S \subseteq [N] \\ |S| < N - 1 \end{array}} t_{|S|} q^{N-|S|} \big ( N - |S|\big )! \\ {}&= C' q^N + \frac{C' n}{N!} \sum _{k = 0}^{N - 2} \left( {\begin{array}{c}N\\ k\end{array}}\right) t_{k} q^{N-k} ( N - k)! = C' q^N \bigg ( 1 + n \sum _{k = 0}^{N - 2} \frac{t_{k} }{k! q^{k}} \bigg ) \end{aligned}$$

and

$$\begin{aligned} t_{N-1} \le \frac{1}{N} \Big ( 2 n K r_N + K^3 n^3 r_N \Big ) =: \frac{{\widetilde{K}}}{N} r_N, \end{aligned}$$

where \({\widetilde{K}} < \infty \) depends on K and n. Using these estimates we shall now prove the desired result by induction.

We thus assume, for some \(N \ge 0\), that the desired inequality \(t_k / k! \le C p^k\) holds for all \(k \le N\), with suitable constants \(C, p < \infty \). We will now show that \(t_{N+1} / (N+1)! \le C p^{N+1}\). Indeed, using the inequalities above and the induction assumption, we obtain

$$\begin{aligned} \frac{t_{N + 1}}{(N+1)!} \le \frac{{\widetilde{K}}}{(N + 2)!} r_{N + 2} \le C' {\widetilde{K}} q^{N + 2} \bigg ( 1 + n \sum _{k=0}^N \frac{t_k}{k! q^k} \bigg ) \le C' {\widetilde{K}} q^{N + 2} \bigg ( 1 + C n \sum _{k=0}^N \Big (\frac{p}{q}\Big )^k \bigg ). \end{aligned}$$

Assuming, without loss of generality, that \(C \ge 1\) and \(p > q\), this yields

$$\begin{aligned} \frac{t_{N + 1}}{(N+1)!}&\le C p^{N+1} C' {\widetilde{K}} q \bigg ( \Big (\frac{q}{p}\Big )^{N+1} + n \sum _{k=0}^N \Big (\frac{q}{p}\Big )^{N-k+1} \bigg ) \le C p^{N+1} C' {\widetilde{K}} q \bigg ( \frac{q}{p} + \frac{n q}{p - q} \bigg ). \end{aligned}$$

By choosing p sufficiently large, the last term in brackets can be made smaller than \((C' {\widetilde{K}} q)^{-1}\). This yields the result. \(\square \)
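The induction can be illustrated numerically. In the sketch below, the two estimates from the proof are treated as equalities, which produces worst-case values \(a_N \ge t_N / N!\) as long as \(a_0 \ge t_0\), and these are compared with the claimed bound \(C p^N\). All constants (`c_prime`, `k_tilde`, `q`, `n`, and the starting value) are hypothetical and chosen only for illustration; the function names are likewise assumptions.

```python
from fractions import Fraction

def majorant(a0, c_prime, k_tilde, q, n, steps):
    """Worst-case values a_N obtained by treating both estimates of the proof as equalities."""
    a = [Fraction(a0)]
    for N in range(steps):
        partial = sum(a[k] / q ** k for k in range(N + 1))
        # a_{N+1} = C' K~ q^{N+2} (1 + n sum_{k <= N} a_k / q^k)
        a.append(c_prime * k_tilde * q ** (N + 2) * (1 + n * partial))
    return a

def admissible_p(c_prime, k_tilde, q, n):
    """Double p until C' K~ q (q/p + n q/(p - q)) <= 1, as required in the induction step."""
    p = 2 * q
    while c_prime * k_tilde * q * (q / p + n * q / (p - q)) > 1:
        p *= 2
    return p

# Hypothetical constants, chosen only for illustration.
c_prime, k_tilde, q, n = 2, 3, Fraction(3, 2), 3
a = majorant(1, c_prime, k_tilde, q, n, steps=30)
p = admissible_p(c_prime, k_tilde, q, n)
C = max(Fraction(1), a[0])
```

With these choices one can verify that `a[N] <= C * p**N` for every computed index, in line with the statement of Lemma 4.5; the doubling loop terminates because the bracket tends to zero as p grows.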

Corollary 4.6

There exists a neighbourhood \(U \ni {\bar{x}}\), such that the power series

$$\begin{aligned} g^{a b}|_x := \sum _{N = 0}^\infty \frac{1}{N!} T^{a b}_{c_1 \cdots c_N} (x - {\bar{x}})^{c_1} \cdots (x - {\bar{x}})^{c_N} \end{aligned}$$
(4.9)

converges for all \(x \in U\), its inverse defines a Riemannian metric, and the equality \(X^\alpha |_x = g^{\alpha \beta } Y_\beta |_x\) holds for all \(x \in U\).

Proof

The definitions yield

$$\begin{aligned} \big | T^{a b}_{c_1 \cdots c_N} (x - {\bar{x}})^{c_1} \cdots (x - {\bar{x}})^{c_N} \big | \le n^N \Vert T^{\alpha \beta }_{\gamma _1 \cdots \gamma _N} \Vert _\infty \Vert x - {\bar{x}} \Vert _1^N, \end{aligned}$$

where \(\Vert y \Vert _1:= \sum _a | y^a |\) for \(y \in V\). Since \(\Vert T_{\gamma _1 \cdots \gamma _N}^{\alpha \beta } \Vert _\infty \le C N! p^N\) by Lemma 4.5, we infer that the power series (4.9) converges for \(\Vert x - {\bar{x}}\Vert _1 < 1/(p n)\).

To verify that \(g^{\alpha \beta }\) defines a metric, note first that \(g^{a b} = g^{b a}\) by construction. To show that \(g^{\alpha \beta }\) is positive definite when x is close enough to \({\bar{x}}\), it suffices to note that \(g^{\alpha \beta }|_{{\bar{x}}} = {\bar{g}}^{\alpha \beta }\) is positive definite and the map \(x \mapsto g^{\alpha \beta }|_{x}\) is continuous.

Since the tensor fields \(X^\alpha \), \(Y_\beta \), and \(g^{\alpha \beta }\) are given by convergent power series, and since \(X^\alpha |_{{\bar{x}}} = g^{\alpha \beta } Y_\beta |_{{\bar{x}}} \) by assumption, it is enough to verify that all derivatives at \({\bar{x}}\) coincide, i.e.,

$$\begin{aligned} \partial _{c_1} \cdots \partial _{c_N} X^a = \partial _{c_1} \cdots \partial _{c_N} (g^{a b} Y_b ) \end{aligned}$$

for all \(N\in {\mathbb N}\) and all \(c_1, \ldots , c_N \in \{ 1, \ldots , n \}\). To prove this identity, we use the notation from Definition 4.3, to obtain at \(x = {\bar{x}}\),

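$$\begin{aligned} \partial _{c_1} \cdots \partial _{c_N} (g^{a b} Y_b )&= \sum _{S \subseteq [N]} \partial _{c_S} g^{a b} \, \partial _{c_{[N] \setminus S}} Y_b = \sum _{S \subseteq [N]} T_{c_S}^{a b} \, \partial _{c_{[N] \setminus S}} Y_b \\ {}&= \sum _{i=1}^N U_{c_i b} \, T_{c_1 \cdots \widehat{c_i} \cdots c_N}^{a b} + \partial _{c_1} \cdots \partial _{c_N} X^a - R_{c_1 \cdots c_N}^a = \partial _{c_1} \cdots \partial _{c_N} X^a. \end{aligned}$$
(4.10)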

To obtain the third equality, we use that \({\bar{x}}\) is a critical point, together with the definitions of R, T, and U in Definition 4.3. In the final step we use the tensor Eq. (4.6). \(\square \)

The proof of Theorem 3.2 is now complete, as the metric \(g^{\alpha \beta }\) constructed above can be pulled back to M using the chart \(\varphi \).

5 Construction of a metric of class \(C^k\)

Let \(X^\alpha \) be a vector field and \(Y_\beta \) be a co-vector field on a smooth manifold M. As before, let \({\textsf {N}}_Y:= \{ m \in M \ : \ Y|_m = 0 \}\) be the set of critical points of Y. In this section we weaken the regularity assumptions on X and Y. In Proposition 5.1 these fields are assumed to be merely differentiable. Subsequently we provide the proof of Theorem 2.7, which deals with fields of class \(C^{k+1}\) for \(k \in {\mathbb N}\).

The following result, which does not require an iterative scheme, is known in the special case where \(Y_\beta \) is the derivative of a scalar function [3, 5]. In this setting, the existence of a metric with the desired property away from critical points is proved in [3]. The construction of the metric below is taken from there. It relies on the unique decomposition of vector fields into a component parallel to X and a component annihilating Y, which only works away from critical points. The proof of the existence of a continuous extension to all of M is adapted from [5].

Proposition 5.1

(Existence of a continuous metric) Let \(X^\alpha \) and \(Y_\beta \) be differentiable fields on M and suppose that the bilinear form \(\nabla _\alpha Y_\beta |_m\) is non-degenerate for all \(m \in {\textsf {N}}_Y\) for some (equivalently, any) connection \(\nabla \). Suppose that the following conditions hold:

  (i) \(X^\alpha Y_\alpha |_m>0\) for all \(m \in M \setminus {\textsf {N}}_Y\);

  (ii) \(X^\alpha |_m=0\) for all \(m \in {\textsf {N}}_Y\);

  (iii) For all \(m \in {\textsf {N}}_Y\) there exists a scalar product \({\bar{g}}_m \in T_m^* M \otimes _{\textrm{S}} T_m^* M\), such that

    $$\begin{aligned} \nabla _\alpha Y_\gamma |_m = {\bar{g}}_{\beta \gamma } \nabla _\alpha X^\beta |_m, \end{aligned}$$

    where \(\nabla _\alpha \) is an arbitrary connection.

Then there exists a continuous metric \(g_{\alpha \beta }\) on M satisfying \(Y_\beta = g_{\alpha \beta } X^\alpha \).

Proof

Let \(m \in M \setminus {\textsf {N}}_Y\) be a non-critical point, so that \(Y|_m \ne 0\) by definition of \({\textsf {N}}_Y\) and \(X|_m \ne 0\) by (i). The assumption (i) also implies that we have the direct sum decomposition \(T_m M = Y_m^\perp \oplus {{\,\textrm{span}\,}}\{X_m\}\), hence every vector \(Z \in T_m M\) can be uniquely decomposed as

$$\begin{aligned} Z = Z^{(0)} + Z^{(1)}, \quad \text {with} \quad Z^{(0)} \in Y_m^\perp \quad \textrm{and}\quad Z^{(1)} := \frac{\langle {Z,Y_m}\rangle }{\langle {X_m, Y_m}\rangle } X_m \in {{\,\textrm{span}\,}}\{X_m\}. \end{aligned}$$

Let \(g = g_{\alpha \beta }\) be an arbitrary continuous metric on M satisfying \(g|_m = {\bar{g}}_m\) at all critical points \(m \in {\textsf {N}}_Y\). Following [3], we construct a perturbation \({\widetilde{g}}\) of g on \(M {\setminus } {\textsf {N}}_Y\) as follows:

$$\begin{aligned} {\widetilde{g}}(Z,W) := g(Z^{(0)},W^{(0)}) + \frac{ \langle {Z^{(1)},Y}\rangle \; \langle {W^{(1)},Y}\rangle }{\langle {X, Y}\rangle }, \end{aligned}$$
(5.1)

for \(Z, W \in \Gamma (TM)\). In view of (i), it readily follows that \({\widetilde{g}}\) defines a continuous metric on \(M \setminus {\textsf {N}}_Y\). It remains to show that \({\widetilde{g}}\) can be continuously extended to all of M.

It will be convenient to use abstract index notation. Taking into account that \(\langle {Z^{(1)},Y}\rangle = \langle {Z,Y}\rangle \) and \(\langle {W^{(1)},Y}\rangle = \langle {W,Y}\rangle \), it follows from the definition that

$$\begin{aligned} {\widetilde{g}}_{\alpha \beta }&= g_{\alpha \beta } + \frac{Y_\alpha Y_\beta - g_{\alpha \gamma } X^\gamma Y_\beta - g_{\gamma \beta } X^\gamma Y_\alpha }{X^\delta Y_\delta } + \frac{g_{\gamma \delta } X^\gamma X^\delta Y_\alpha Y_\beta }{(X^\delta Y_\delta )^2}. \end{aligned}$$

Introducing the deficit \(R_\beta := Y_\beta - g_{\alpha \beta } X^\alpha \), we can write

$$\begin{aligned} {\widetilde{g}}_{\alpha \beta }&= g_{\alpha \beta } + \frac{R_\alpha Y_\beta + R_\beta Y_\alpha }{X^\delta Y_\delta } - \frac{R_\gamma X^\gamma Y_\alpha Y_\beta }{(X^\delta Y_\delta )^2}. \end{aligned}$$
(5.2)

Fix a critical point \({\bar{m}} \in {\textsf {N}}_Y\). Using assumptions (ii) and (iii) we shall show that \({\widetilde{g}}|_m \rightarrow g|_{{\bar{m}}}\) as \(m \rightarrow {\bar{m}}\), following the arguments in [5]. Using the notation from Sect. 4, we shall perform a Taylor expansion of the terms in (5.2) in a fixed chart, where \({\bar{m}} \in M\) corresponds to \({\bar{x}} \in {\mathbb R}^n\). As X and Y are differentiable, and \({\bar{x}}\) is a critical point, it follows from (ii) that

$$\begin{aligned} X^a(x) = \nabla _c X^a({\bar{x}}) (x - {\bar{x}})^c + o \big (|x-{\bar{x}}|\big ) \quad \textrm{and}\quad Y_b(x) = \nabla _c Y_b({\bar{x}}) (x - {\bar{x}})^c + o \big (|x-{\bar{x}}|\big ). \end{aligned}$$
(5.3)

Since \({\bar{g}}_{a b}({\bar{x}})\) is a scalar product, there exists \(\kappa > 0\) such that \({\bar{g}}_{a b}({\bar{x}}) v^a v^b \ge \kappa |v|^2\) for all \(v \in {\mathbb R}^n\). Furthermore, \(\nabla _b X^a\) is non-degenerate by assumption (iii) and the non-degeneracy assumption on \(\nabla _b Y_a\). Therefore, \(|\nabla _b X^a v^b |^2 \ge {\widetilde{\kappa }} | v |^2\) for some constant \({\widetilde{\kappa }} > 0\). Using these inequalities, together with (iii), yields

$$\begin{aligned} \begin{aligned} X^a Y_a(x)&= \nabla _b X^a({\bar{x}}) \nabla _c Y_a({\bar{x}}) (x - {\bar{x}})^b (x - {\bar{x}})^c + o \big (|x-{\bar{x}}|^2\big ) \\ {}&= {\bar{g}}_{a d}({\bar{x}}) \nabla _b X^a({\bar{x}}) \nabla _c X^d({\bar{x}}) (x - {\bar{x}})^b (x - {\bar{x}})^c + o \big (|x-{\bar{x}}|^2\big ) \\ {}&\ge \kappa \big | \nabla _b X^a({\bar{x}}) (x - {\bar{x}})^b \big |^2 + o \big (|x-{\bar{x}}|^2\big ) \\ {}&\ge \kappa {\widetilde{\kappa }} | x - {\bar{x}} |^2 + o \big (|x-{\bar{x}}|^2\big ), \end{aligned} \end{aligned}$$
(5.4)

which bounds the denominator in (5.2) from below. As for the terms in the numerator, we first note that \(X^a(x) = O\big (|x-{\bar{x}}|\big )\) and \(Y_b(x) = O\big (|x-{\bar{x}}|\big )\). These bounds trivially imply that \(R_b(x) = O\big (|x-{\bar{x}}|\big )\) as well, but this is not sufficient. The key point of the proof is that this bound can be improved. Indeed, using (iii) and the continuity of g at \({\bar{x}}\), we obtain

$$\begin{aligned} \begin{aligned} R_b(x)&= \big (Y_b - g_{a b} X^a\big )(x) \\ {}&= \nabla _c Y_b({\bar{x}}) (x - {\bar{x}})^c - g_{a b}(x) \nabla _c X^a({\bar{x}}) (x - {\bar{x}})^c + o (|x-{\bar{x}}|) \\ {}&= \big ({\bar{g}}_{a b}({\bar{x}}) - g_{a b}(x) \big ) \nabla _c X^a({\bar{x}}) (x - {\bar{x}})^c + o (|x-{\bar{x}}|) \\ {}&= o \big (|x-{\bar{x}}|\big ). \end{aligned} \end{aligned}$$
(5.5)

It now follows from (5.4) and (5.5) together with the bounds on X and Y, that the fractions in (5.2) vanish as \(x \rightarrow {\bar{x}}\). This shows that \({\widetilde{g}}\) can be continuously extended to M by setting \({\widetilde{g}}_{a b}({\bar{x}}):= {\bar{g}}_{a b}({\bar{x}})\). \(\square \)

While the metric \(\widetilde{g}\) constructed in the proof of Proposition 5.1 is continuous, it is not in general differentiable, even if the background metric \(g_{\alpha \beta }\) and the vector fields \(X^\alpha \) and \(Y_\beta \) are smooth. Here is an explicit counterexample.

Example 5.2

Let M be the open unit ball in \({\mathbb R}^2\). We work in cartesian coordinates. Set \(X(x) = Y(x) = x\) for \(x \in M\), and consider the background metric \(g_{\alpha \beta }\) defined by

$$\begin{aligned} g_{a b}(x):= \begin{bmatrix} 1 +x_2 &{} 0 \\ 0 &{} 1 \end{bmatrix} \end{aligned}$$

for \(x = (x_1, x_2) \in M\). Since g is smooth and \(g|_0 = I\), it is a valid background metric. An explicit computation using the identity \(R(x) =\Big [ \begin{matrix} - x_1 x_2 \\ 0 \end{matrix}\Big ]\) yields

$$\begin{aligned} \widetilde{g}_{1 1}(x) = 1 + \frac{x_2^5}{(x_1^2 + x_2^2)^2} \quad \textrm{and}\quad \nabla _1 \widetilde{g}_{1 1}(x) = -4\frac{ x_1 x_2^5}{(x_1^2 + x_2^2)^3} \end{aligned}$$

for \(x \in M \setminus \{ 0 \}\). The latter is a non-constant function that is homogeneous of degree zero and, therefore, its limit as \(x \rightarrow 0\) does not exist. We conclude that \({\widetilde{g}}_{\alpha \beta }\) does not extend to a \(C^1\) metric on M.
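The computation in Example 5.2 can be reproduced with exact rational arithmetic. The sketch below evaluates \(\widetilde{g}_{11}\) directly from the formula (5.2) and compares it with the closed form stated above; it then evaluates the stated derivative along two rays through the origin. The function names and sample points are assumptions made for this illustration.

```python
from fractions import Fraction

def g_tilde_11(x1, x2):
    """Component (1,1) of the perturbed metric (5.2) for X = Y = x and g = diag(1 + x2, 1)."""
    X = Y = (x1, x2)
    g = ((1 + x2, 0), (0, 1))
    XY = x1 * x1 + x2 * x2                                  # X^delta Y_delta = |x|^2
    R = (Y[0] - g[0][0] * X[0], Y[1] - g[1][1] * X[1])      # deficit, equals (-x1 x2, 0)
    RX = R[0] * X[0] + R[1] * X[1]
    return g[0][0] + 2 * R[0] * Y[0] / XY - RX * Y[0] * Y[0] / XY ** 2

def closed_form(x1, x2):
    """Stated expression 1 + x2^5 / |x|^4."""
    return 1 + x2 ** 5 / (x1 ** 2 + x2 ** 2) ** 2

def d1_closed_form(x1, x2):
    """Stated x1-derivative -4 x1 x2^5 / |x|^6."""
    return -4 * x1 * x2 ** 5 / (x1 ** 2 + x2 ** 2) ** 3

# Hypothetical sample points inside the unit ball, away from the origin.
points = [(Fraction(1, 3), Fraction(1, 5)), (Fraction(-1, 2), Fraction(1, 7))]
on_diagonal = [d1_closed_form(t, t) for t in (Fraction(1, 10), Fraction(1, 1000))]
on_axis = [d1_closed_form(t, Fraction(0)) for t in (Fraction(1, 10), Fraction(1, 1000))]
```

Along the diagonal \(x_1 = x_2\) the derivative is constantly \(-1/2\), whereas along the axis \(x_2 = 0\) it is constantly 0, so the two rays give different limits at the origin.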

Theorem 2.7 shows that better regularity properties can be obtained by a careful choice of the background metric \(g_{\alpha \beta }\). In the following proof we define \(g_{\alpha \beta }\) by making use of the construction in Sect. 4, which yields improved bounds on the deficit \(R_\beta := Y_\beta - g_{\alpha \beta } X^\alpha \) around critical points. This allows us to construct a metric \(\widetilde{g}_{\alpha \beta }\) of class \(C^k\) whenever \(X^\alpha \) and \(Y_\beta \) are of class \(C^{k+1}\).

Proof of Theorem 2.7

First we note that the necessity of conditions (i) and (ii) was already observed in the introduction. The necessity of (iii) follows, even when g is assumed to be merely continuous, from the expansions for X and Y in (5.3) and the expansion \(g(x) = g({\bar{x}}) + o(1)\) in local coordinates around a critical point \({\bar{x}}\). Therefore it remains to show that these three conditions are also sufficient.

As in Proposition 5.1, we construct a metric of the form (5.2) on the non-critical set \(M {\setminus } {\textsf {N}}_Y\):

$$\begin{aligned} {\widetilde{g}}_{\alpha \beta }&= g_{\alpha \beta } + \frac{R_\alpha Y_\beta + R_\beta Y_\alpha }{X^\delta Y_\delta } - \frac{R_\gamma X^\gamma Y_\alpha Y_\beta }{(X^\delta Y_\delta )^2}, \end{aligned}$$
(5.6)

where \(R_\beta := Y_\beta - g_{\alpha \beta } X^\alpha \) denotes the deficit, and \(g_{\alpha \beta }\) is a background metric on M that will be carefully chosen below. As noted before, it is immediate to verify that the desired identity \(Y_\beta = \widetilde{g}_{\alpha \beta } X^\alpha \) holds on \(M \setminus {\textsf {N}}_Y\).
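The identity \(Y_\beta = \widetilde{g}_{\alpha \beta } X^\alpha \) is pointwise and purely algebraic, so it can be checked at a single non-critical point with exact arithmetic. The following sketch does this for the formula (5.6); the function names and the sample values of g, X and Y are hypothetical and chosen only so that g is symmetric positive definite and \(X^\delta Y_\delta > 0\).

```python
from fractions import Fraction

def perturbed_metric(g, X, Y):
    """Return the matrix of g~ from (5.6) at a single non-critical point."""
    idx = range(len(X))
    XY = sum(X[a] * Y[a] for a in idx)                            # X^delta Y_delta, must be > 0
    R = [Y[b] - sum(g[a][b] * X[a] for a in idx) for b in idx]    # deficit R_b
    RX = sum(R[c] * X[c] for c in idx)                            # R_gamma X^gamma
    return [[g[a][b] + (R[a] * Y[b] + R[b] * Y[a]) / XY - RX * Y[a] * Y[b] / XY ** 2
             for b in idx] for a in idx]

def lower_index(metric, X):
    """Compute the co-vector metric_{ab} X^a."""
    idx = range(len(X))
    return [sum(metric[a][b] * X[a] for a in idx) for b in idx]

# Hypothetical data at one point: a symmetric positive definite g and fields with X.Y = 4.
g = [[Fraction(v) for v in row] for row in ([2, 1, 0], [1, 3, 1], [0, 1, 2])]
X = [Fraction(v) for v in (1, 2, -1)]
Y = [Fraction(v) for v in (3, 1, 1)]
g_tilde = perturbed_metric(g, X, Y)
```

Here `lower_index(g, X)` differs from `Y`, so the background metric alone does not relate the two fields, while `lower_index(g_tilde, X)` reproduces `Y` exactly and `g_tilde` remains symmetric.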

Construction of the background metric. Fix \(\bar{m} \in {\textsf {N}}_Y\). As in Sect. 4 we work in a fixed coordinate chart where \(\bar{m}\) corresponds to \(\bar{x} \in {\mathbb R}^n\). In these local coordinates we then define the background metric by

$$\begin{aligned} g_{\bar{m}}^{a b}(x) := \sum _{N = 0}^k \frac{1}{N!} T^{a b}_{c_1 \cdots c_N} (x - \bar{x})^{c_1} \cdots (x - \bar{x})^{c_N} \end{aligned}$$

for x in a small neighbourhood around \(\bar{x}\). It is crucial that we use the tensors \(T^{\alpha \beta }_{\gamma _1 \cdots \gamma _N}\) that were constructed in Definition 4.3. Note that \(T^{\alpha \beta }_{\gamma _1 \cdots \gamma _N}\) is indeed well defined for \(N\le k\) due to our assumption that \(X^\alpha \) and \(Y_\beta \) are \(k+1\) times continuously differentiable. As \(T^{\alpha \beta }\) is positive definite, it follows that \((g_{{\bar{m}}})_{\alpha \beta }\) defines a metric in a neighbourhood of \({\bar{x}}\).

For each critical point \({\bar{m}}\), this construction yields a Riemannian metric in an open neighbourhood \(\mathcal {V}_{\bar{m}}\) of \({\bar{m}}\). By the non-degeneracy assumption, we may assume that the sets \(\{ \mathcal {V}_{\bar{m}}\}_{{\bar{m}} \in {\textsf {N}}_Y}\) are pairwise disjoint. Let \(\mathcal {U}_{\bar{m}}\) be an open neighbourhood of \({\bar{m}}\) satisfying \(\overline{\mathcal {U}_{\bar{m}}} \subseteq \mathcal {V}_{\bar{m}}\) and let \(f_{{\bar{m}}}: M \rightarrow [0,1]\) be a smooth function on M satisfying \(f_{{\bar{m}}}|_{\mathcal {U}_{\bar{m}}} = 1\) and \(f_{{\bar{m}}}|_{M\setminus \mathcal {V}_{\bar{m}}} = 0\). Using an arbitrary metric \((g_*)_{\alpha \beta }\) on M and the function \({\widetilde{f}}:= 1 - \sum _{{\bar{m}}\in {\textsf {N}}_Y} f_{{\bar{m}}}\), we define

$$\begin{aligned} g_{\alpha \beta } := \sum _{{\bar{m}}\in {\textsf {N}}_Y} f_{{\bar{m}}}(g_{\bar{m}})_{\alpha \beta } + {\widetilde{f}} (g_*)_{\alpha \beta }, \end{aligned}$$
(5.7)

which yields a \(C^k\) metric \(g_{\alpha \beta }\) on M satisfying \(g_{\alpha \beta }|_m = (g_{\bar{m}})_{\alpha \beta }|_m\) for all \(\bar{m}\in {\textsf {N}}_Y\) and \(m\in \mathcal {U}_{\bar{m}}\).

The crucial property of this background metric g, which will be used below, is that the deficit \(R_\beta := Y_\beta - g_{\alpha \beta } X^\alpha \) satisfies

$$\begin{aligned} \partial _{c_1}\cdots \partial _{c_p} R_\beta |_{\bar{m}} = 0 \end{aligned}$$
(5.8)

for all \(\bar{m} \in {\textsf {N}}_Y\) and \(p\le k+1\). This follows from the definition of the tensors \(T^{a b}_{c_1 \cdots c_N}\) using the computation (4.10).

Differentiability of the metric. To verify that \({\widetilde{g}}_{\alpha \beta }\) is k times continuously differentiable, we will show that the partial derivatives

$$\begin{aligned} U_{\alpha \beta \, c_1\dots c_p} := \partial _{c_1} \cdots \partial _{c_p} \frac{R_\alpha Y_\beta }{X^\delta Y_\delta } \quad \textrm{and}\quad V_{\alpha \beta \, c_1\dots c_p} := \partial _{c_1} \cdots \partial _{c_p} \frac{R_\gamma X^\gamma Y_\alpha Y_\beta }{(X^\delta Y_\delta )^2} \end{aligned}$$

can be continuously extended from \(M \setminus {\textsf {N}}_Y\) to all of M for \(p\le k\). In view of (5.6) this yields the desired result.

We use the notation from Definition 4.3, thus \(\partial _{c_S} = \partial _{c_{i_1}} \cdots \partial _{c_{i_q}}\) for \(S = \{ i_1, \ldots , i_q \} \subseteq \{1, \ldots , p\}\) with \(i_\mu \ne i_\nu \) for \(\mu \ne \nu \). With this notation we have

$$\begin{aligned}&U_{\alpha \beta \, c_1\dots c_p} = \sum _{\ell =0}^p \sum _{\{S_1,\dots ,S_\ell ,A,B\} \in \mathcal {X}_p} \frac{(-1)^\ell \ell !}{\left( X^\delta Y_\delta \right) ^{\ell +1}} \partial _{c_{S_1}} \! \big (X^\delta Y_\delta \big ) \cdots \partial _{c_{S_\ell }} \! \big (X^\delta Y_\delta \big ) \, \partial _{c_{A}} R_\alpha \, \partial _{c_{B}} Y_\beta , \\&V_{\alpha \beta \, c_1\dots c_p} = \sum _{\ell =0}^p \sum _{\{S_1,\dots ,S_\ell ,A,B\} \in \mathcal {X}_p} \! \frac{(-1)^\ell \ell !}{\big (X^\delta Y_\delta \big )^{2(\ell +1)}}\partial _{c_{S_1}} \! \big (X^\delta Y_\delta \big )^2 \cdots \partial _{c_{S_\ell }} \! \big (X^\delta Y_\delta \big )^2 \partial _{c_{A}} \! R_\gamma \, \partial _{c_{B}} \! \big (X^\gamma Y_\alpha Y_\beta \big ), \end{aligned}$$

where \(\mathcal {X}_p\) is the collection of all possible partitions of \(\{1,\dots ,p\}\).

Let us fix a critical point \({\bar{m}} \in {\textsf {N}}_Y\) and let \(\bar{x}\) be the corresponding point in \({\mathbb R}^n\). Recall from (5.4) that

$$\begin{aligned} \big (X^\delta Y_\delta \big )^{-1}(x) = O \big (|x-{\bar{x}}|^{-2}\big ). \end{aligned}$$

Furthermore, since \(X^\alpha |_{\bar{x}} = 0\) and \(Y_\alpha |_{\bar{x}} = 0\), Taylor’s formula yields, for any \(S \subseteq \{1, \ldots , p\}\),

$$\begin{aligned} \partial _{c_S} \big (X^\delta Y_\delta \big )(x) = O \big ( |x-{\bar{x}}|^{(2 - |S|)_+} \big ),{} & {} &\partial _{c_S} Y_\beta (x) = O \big ( |x-{\bar{x}}|^{(1 - |S|)_+} \big ),\\ \partial _{c_S} \big (X^\delta Y_\delta \big )^2(x) = O \big (|x-{\bar{x}}|^{(4-|S|)_+}\big ),{} & {} &\partial _{c_S} \big ( X^\gamma Y_\alpha Y_\beta \big )(x) = O \big ( |x-{\bar{x}}|^{(3 - |S|)_+} \big ) . \end{aligned}$$

To estimate \(\partial _{c_S} R_\alpha (x)\) we use the crucial point, observed in (5.8), that our background metric is constructed so that \(\partial _{c_S} R_\beta ({\bar{x}}) = 0\) when \(|S| \le k+1\). This ensures that

$$\begin{aligned} \partial _{c_S} R_\alpha (x) = O \big (|x-{\bar{x}}|^{k+2-|S|}\big ). \end{aligned}$$

Combining these bounds, we estimate the right-hand sides of \(U_{\alpha \beta \, c_1\dots c_p}\) and \(V_{\alpha \beta \, c_1\dots c_p}\) as follows:

$$\begin{aligned} \frac{1}{\left( X^\delta Y_\delta \right) ^{\ell +1}} \partial _{c_{S_1}} \! \big (X^\delta Y_\delta \big ) \cdots \partial _{c_{S_\ell }} \! \big (X^\delta Y_\delta \big ) \, \partial _{c_{A}} R_\alpha \, \partial _{c_{B}} Y_\beta = O \big (|x-{\bar{x}}|^u\big ), \\ \frac{1}{\left( X^\delta Y_\delta \right) ^{2(\ell +1)}} \, \partial _{c_{S_1}} \! \big ( X^\delta Y_\delta \big )^2 \cdots \partial _{c_{S_\ell }} \! \big ( X^\delta Y_\delta \big )^2 \, \partial _{c_{A}} \! R_\gamma \, \partial _{c_{B}} \! \big ( X^\gamma Y_\alpha Y_\beta \big ) = O\big ( |x-{\bar{x}}|^{ v } \big ), \end{aligned}$$

where the exponents u and v satisfy

$$\begin{aligned} u&= -2 (\ell + 1) + \big ( 2 - |S_1| \big )_+ + \ldots + \big ( 2 - |S_\ell | \big )_+ + \big ( k + 2 - |A| \big ) + \big ( 1 - |B| \big )_+ \,, \\ v&= - 4 (\ell + 1) + \big (4-|S_1|\big )_+ + \ldots + \big (4-|S_\ell |\big )_+ + \big ( k + 2 - |A| \big ) + \big ( 3 - |B| \big )_+\,. \end{aligned}$$

Since \(|S_1|+ \cdots + |S_\ell | + |A| + |B| = p\) for \(\{S_1, \ldots , S_\ell , A, B\} \in \mathcal {X}_p\), we obtain \( u \ge k - p + 1 \ge 1 \) and \( v \ge k - p + 1 \ge 1 \), which shows that

$$\begin{aligned} U_{\alpha \beta \, c_1\dots c_p} = O\big ( |x - {\bar{x}} | \big ) \quad \textrm{and}\quad V_{\alpha \beta \, c_1\dots c_p} = O\big ( |x - {\bar{x}} | \big ). \end{aligned}$$
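The lower bounds on u and v used above can be spelled out with the elementary inequality \((a)_+ \ge a\). Here \(|A|\) and \(|B|\) denote the cardinalities of the index sets A and B, and the final step \(k - p + 1 \ge 1\) presupposes \(p \le k\), which is how we read the range of p under consideration:

$$\begin{aligned} u&\ge -2(\ell + 1) + \sum _{i=1}^{\ell } \big ( 2 - |S_i| \big ) + \big ( k + 2 - |A| \big ) + \big ( 1 - |B| \big ) = k + 1 - \Big ( \sum _{i=1}^{\ell } |S_i| + |A| + |B| \Big ) = k - p + 1, \\ v&\ge -4(\ell + 1) + \sum _{i=1}^{\ell } \big ( 4 - |S_i| \big ) + \big ( k + 2 - |A| \big ) + \big ( 3 - |B| \big ) = k + 1 - \Big ( \sum _{i=1}^{\ell } |S_i| + |A| + |B| \Big ) = k - p + 1. \end{aligned}$$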

Therefore \(U_{\alpha \beta \, c_1\dots c_p}\) and \(V_{\alpha \beta \, c_1\dots c_p}\) can be extended continuously to all of M by assigning the value zero at each \({\bar{m}} \in {\textsf {N}}_Y\). \(\square \)

6 Application to quantum Markov semigroups (QMS)

In this section we prove Theorem 1.2 by an application of Corollary 2.6. As in Sect. 1, let \(\mathscr {L}\) be the generator of an ergodic quantum Markov semigroup \((\mathscr {P}_t)_{t \ge 0}\) on a finite dimensional \(C^*\)-algebra \(\mathcal {A}\) with stationary state \(\sigma \in {\mathfrak P}_+\). The manifold under consideration is the set of strictly positive density matrices

$$\begin{aligned} {\mathfrak P}_+ = \{ \rho \in {\mathfrak P}\ : \ \rho > 0 \}. \end{aligned}$$

Note that \({\mathfrak P}_+\) is a relatively open subset of the affine space \(\sigma + T \subseteq \mathcal {A}\), where

$$\begin{aligned} T := \{A\in \mathcal {A}\ : \ A = A^*, \ {{\,\textrm{Tr}\,}}[A] = 0 \}. \end{aligned}$$

Therefore, the tangent space of \({\mathfrak P}_+\) can be naturally identified with T. We will apply Corollary 2.6 to the triple \((M, f, X)\) where \(M:= {\mathfrak P}_+\) and

$$\begin{aligned} f&: {\mathfrak P}_+ \rightarrow {\mathbb R},{} & {} f(\rho ) := H_\sigma (\rho ) = {{\,\textrm{Tr}\,}}[\rho (\log \rho - \log \sigma )], \\ X&: {\mathfrak P}_+ \rightarrow T,{} & {} X(\rho ) := \mathscr {L}^\dagger \rho . \end{aligned}$$

The functional \(H_\sigma \) is everywhere strictly positive, except at its global minimum \(\sigma \). Moreover, a standard computation shows that, for \(\rho \in {\mathfrak P}_+\) and \(A \in T\),

$$\begin{aligned} \partial _\varepsilon \big |_{\varepsilon = 0} H_\sigma (\rho + \varepsilon A) = {{\,\textrm{Tr}\,}}[(\log \rho - \log \sigma )A]. \end{aligned}$$
(6.1)

Therefore, the differential of \(H_\sigma \) is everywhere non-zero except at \(\sigma \), so that we are in a position to apply Corollary 2.6.
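The claim that the differential vanishes only at \(\sigma \) can be sketched as follows. We assume here that the trace is faithful on \(\mathcal {A}\), so that the selfadjoint elements orthogonal to T are exactly the real multiples of the identity, and that density matrices have unit trace; the symbols \(c\) and \(\textbf{1}\) are introduced only for this illustration. If the right-hand side of (6.1) vanishes for all \(A \in T\), then

$$\begin{aligned} \log \rho - \log \sigma = c \, \textbf{1} \ \text { for some } c \in {\mathbb R}\quad \Longrightarrow \quad \rho = e^{c} \sigma \quad \Longrightarrow \quad 1 = {{\,\textrm{Tr}\,}}[\rho ] = e^{c} \, {{\,\textrm{Tr}\,}}[\sigma ] = e^{c} \quad \Longrightarrow \quad \rho = \sigma . \end{aligned}$$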

Recall that we are interested in the bkm-scalar product on \(\mathcal {A}\) given by

$$\begin{aligned} \langle {A,B}\rangle _\sigma ^{\textsc {bkm}} := {{\,\textrm{Tr}\,}}[ A^* \mathscr {M}_\sigma (B)], \quad \text {where } \mathscr {M}_\sigma (B) := \int _0^1 \sigma ^{1-s} B\sigma ^s \; \text {d}s, \end{aligned}$$

for \(A, B \in \mathcal {A}\). We refer to [2] for a recent study of this scalar product. It is natural to also consider the inner product on \(\mathcal {A}\) defined in terms of the inverse operator \(\mathscr {M}_\sigma ^{-1}: \mathcal {A}\rightarrow \mathcal {A}\) given by

$$\begin{aligned} \langle {A,B}\rangle _\sigma ^{\widetilde{\textsc {bkm}}} := {{\,\textrm{Tr}\,}}[ A^* \mathscr {M}_\sigma ^{-1} (B)], \quad \text {where } \mathscr {M}_\sigma ^{-1}(B) := \int _0^\infty (t + \sigma )^{-1} B (t + \sigma )^{-1} \; \text {d}t. \end{aligned}$$
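As a consistency check, one can verify in a matrix representation that the two integral formulas above are indeed inverse to each other. The following is a sketch under the assumption that \(\sigma \) is diagonal with eigenvalues \(\sigma _i > 0\) and that \(B_{ij}\) denotes the entries of B in the corresponding eigenbasis; this notation is introduced only for the illustration. For \(\sigma _i \ne \sigma _j\),

$$\begin{aligned} \big ( \mathscr {M}_\sigma (B) \big )_{ij}&= B_{ij} \int _0^1 \sigma _i^{1-s} \sigma _j^{s} \; \text {d}s = \frac{\sigma _i - \sigma _j}{\log \sigma _i - \log \sigma _j} \, B_{ij}, \\ \bigg ( \int _0^\infty (t + \sigma )^{-1} B (t + \sigma )^{-1} \; \text {d}t \bigg )_{ij}&= B_{ij} \int _0^\infty \frac{\text {d}t}{(t + \sigma _i)(t + \sigma _j)} = \frac{\log \sigma _i - \log \sigma _j}{\sigma _i - \sigma _j} \, B_{ij}. \end{aligned}$$

The two multipliers are reciprocal to each other; for \(\sigma _i = \sigma _j\) they reduce to \(\sigma _i\) and \(1/\sigma _i\) respectively.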

We will use the following simple result.

Lemma 6.1

For a linear operator \(\mathscr {K}: \mathcal {A}\rightarrow \mathcal {A}\) the following assertions are equivalent:

  (1) \(\mathscr {K}\) is selfadjoint with respect to the inner product \(\langle {\cdot ,\cdot }\rangle _\sigma ^{\textsc {bkm}}\).

  (2) \(\mathscr {K}^\dagger \) is selfadjoint with respect to the inner product \(\langle {\cdot ,\cdot }\rangle _\sigma ^{\widetilde{\textsc {bkm}}}\).

Proof

It is readily seen that both assertions are equivalent to \(\mathscr {M}_\sigma \mathscr {K}= \mathscr {K}^\dagger \mathscr {M}_\sigma \). \(\square \)
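The equivalence used in the proof of Lemma 6.1 can be sketched as follows, assuming that \(\dagger \) denotes the adjoint with respect to the trace pairing, i.e. \({{\,\textrm{Tr}\,}}[A^* \mathscr {K}B] = {{\,\textrm{Tr}\,}}[(\mathscr {K}^\dagger A)^* B]\), which is how we read the notation of Sect. 1. With all identities understood to hold for every \(A, B \in \mathcal {A}\),

$$\begin{aligned} \langle {A,\mathscr {K}B}\rangle _\sigma ^{\textsc {bkm}} = \langle {\mathscr {K}A,B}\rangle _\sigma ^{\textsc {bkm}}&\iff {{\,\textrm{Tr}\,}}[ A^* \mathscr {M}_\sigma \mathscr {K}B ] = {{\,\textrm{Tr}\,}}[ A^* \mathscr {K}^\dagger \mathscr {M}_\sigma B ] \iff \mathscr {M}_\sigma \mathscr {K}= \mathscr {K}^\dagger \mathscr {M}_\sigma , \\ \langle {A,\mathscr {K}^\dagger B}\rangle _\sigma ^{\widetilde{\textsc {bkm}}} = \langle {\mathscr {K}^\dagger A,B}\rangle _\sigma ^{\widetilde{\textsc {bkm}}}&\iff {{\,\textrm{Tr}\,}}[ A^* \mathscr {M}_\sigma ^{-1} \mathscr {K}^\dagger B ] = {{\,\textrm{Tr}\,}}[ A^* \mathscr {K}\mathscr {M}_\sigma ^{-1} B ] \iff \mathscr {M}_\sigma ^{-1} \mathscr {K}^\dagger = \mathscr {K}\mathscr {M}_\sigma ^{-1}. \end{aligned}$$

Composing the last identity with \(\mathscr {M}_\sigma \) from the left and from the right turns it into \(\mathscr {K}^\dagger \mathscr {M}_\sigma = \mathscr {M}_\sigma \mathscr {K}\), which is the condition obtained in the first line.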

The entropy production functional \(I_\sigma : {\mathfrak P}_+ \rightarrow {\mathbb R}\) is defined by

$$\begin{aligned} I_\sigma (\rho ) = - {{\,\textrm{Tr}\,}}[ (\log \rho - \log \sigma ) \mathscr {L}^\dagger \rho ] \end{aligned}$$

for \(\rho \in {\mathfrak P}_+\). Note that indeed \( \frac{\text {d}}{\text {d}t}H_\sigma (\mathscr {P}_t^\dagger \rho ) = - I_\sigma (\mathscr {P}_t^\dagger \rho ) \). The functional \(I_\sigma \) is nonnegative and convex [22, 23]. The following result shows the strict positivity of the entropy production (except at stationarity) under the assumption of bkm-detailed balance.

Proposition 6.2

Let \(\mathscr {L}\) be the generator of an ergodic quantum Markov semigroup on a finite dimensional \(C^*\)-algebra \(\mathcal {A}\), with invariant state \(\sigma \in {\mathfrak P}_+\). If bkm-detailed balance holds, then \(I_\sigma (\rho ) > 0\) for all \(\rho \in {\mathfrak P}_+\) with \(\rho \ne \sigma \).

Proof

As remarked above, \(I_\sigma \) is nonnegative and convex. Therefore, it suffices to show that \(I_\sigma \) is strictly convex at its minimum \(\sigma \). Take \(A \in T\) with \(A \ne 0\).

For \(\rho \in {\mathfrak P}_+\) we set \(\rho _\varepsilon := \rho + \varepsilon A\) for \(|\varepsilon |\) sufficiently small to ensure that \(\rho _\varepsilon \in {\mathfrak P}_+\). Using the standard identities

$$\begin{aligned} \partial _\varepsilon \big |_{\varepsilon = 0} \log \rho _\varepsilon = \int _0^\infty (t + \rho )^{-1} A (t + \rho )^{-1} \; \text {d}t \quad \textrm{and}\quad \partial _\varepsilon \big |_{\varepsilon = 0} (s + \rho _\varepsilon )^{-1} = - (s + \rho )^{-1} A (s + \rho )^{-1} \end{aligned}$$

for \(s > 0\), we obtain

$$\begin{aligned} \partial _\varepsilon \big |_{\varepsilon = 0} I_\sigma (\rho _\varepsilon ) = - {{\,\textrm{Tr}\,}}[ (\log \rho - \log \sigma ) \mathscr {L}^\dagger A ] - {{\,\textrm{Tr}\,}}\bigg [ \int _0^\infty (t + \rho )^{-1} A (t + \rho )^{-1} \mathscr {L}^\dagger \rho \; \text {d}t \bigg ], \end{aligned}$$

and

$$\begin{aligned} \partial _\varepsilon ^2\big |_{\varepsilon = 0} I_\sigma (\rho _\varepsilon )&= - 2 {{\,\textrm{Tr}\,}}\bigg [ \int _0^\infty (t + \rho )^{-1} A (t + \rho )^{-1} \mathscr {L}^\dagger A \; \text {d}t \bigg ] \\ {}&\qquad + 2 {{\,\textrm{Tr}\,}}\bigg [ \int _0^\infty (t + \rho )^{-1} A (t + \rho )^{-1} A (t + \rho )^{-1} \mathscr {L}^\dagger \rho \; \text {d}t \bigg ]. \end{aligned}$$

In particular, for \(\sigma _\varepsilon := \sigma + \varepsilon A\), the second term vanishes since \(\mathscr {L}^\dagger \sigma = 0\), and we obtain

$$\begin{aligned} \partial _\varepsilon ^2\big |_{\varepsilon = 0} I_\sigma (\sigma _\varepsilon )&= - 2 {{\,\textrm{Tr}\,}}\bigg [ \int _0^\infty (t + \sigma )^{-1} A (t + \sigma )^{-1} \mathscr {L}^\dagger A \; \text {d}t \bigg ] = - 2 \langle {A, \mathscr {L}^\dagger A}\rangle _\sigma ^{\widetilde{\textsc {bkm}}}. \end{aligned}$$

Since \(I_\sigma \) is convex, this identity implies that \(\langle {A, \mathscr {L}^\dagger A}\rangle _\sigma ^{\widetilde{\textsc {bkm}}} \le 0\).

On the other hand, \(\mathscr {L}^\dagger \) is selfadjoint with respect to \(\langle {\cdot ,\cdot }\rangle _\sigma ^{\widetilde{\textsc {bkm}}}\) by Lemma 6.1 and the assumption of bkm-detailed balance. Moreover, the restriction of \(\mathscr {L}^\dagger \) to T is invertible by the ergodicity assumption. Since a selfadjoint, negative semidefinite operator with trivial kernel is negative definite, it follows that \(\langle {A, \mathscr {L}^\dagger A}\rangle _\sigma ^{\widetilde{\textsc {bkm}}} \ne 0\).

We thus conclude that \(\langle {A, \mathscr {L}^\dagger A}\rangle _\sigma ^{\widetilde{\textsc {bkm}}} < 0\), that is, \(\partial _\varepsilon ^2\big |_{\varepsilon = 0} I_\sigma (\sigma _\varepsilon ) > 0\), which yields the result. \(\square \)

Proof of Theorem 1.2

First we will translate condition (iii) of Corollary 2.6, namely the selfadjointness of the linearised operator \(\Lambda \) with respect to the Hessian scalar product h. We claim that this is exactly the assumption of bkm-detailed balance in our setting.

Indeed, since \(\mathscr {L}^\dagger \) is a linear operator, its linearisation \(\Lambda : T \rightarrow T\) appearing in condition (iii) is simply given by \(\Lambda := \mathscr {L}^\dagger \). Moreover, the Hessian of \(\rho \mapsto H_\sigma (\rho )\) at \(\rho = \sigma \) is given by

$$\begin{aligned} h(A, B) := \partial _\varepsilon \big |_{\varepsilon = 0} \partial _\eta \big |_{\eta = 0} H_\sigma \big ( \sigma + \varepsilon A + \eta B \big ) = \int _0^\infty {{\,\textrm{Tr}\,}}\Big [ \frac{1}{s + \sigma } A \frac{1}{s + \sigma } B \Big ] \; \text {d}s {=} \langle {A,B}\rangle _\sigma ^{\widetilde{\textsc {bkm}}} \end{aligned}$$

for \(A, B \in T\). Hence the Hessian scalar product in condition (iii) is the \({\widetilde{\textsc {bkm}}}\)-scalar product. Thus, condition (iii) is the \({\widetilde{\textsc {bkm}}}\)-selfadjointness of \(\mathscr {L}^\dagger \). By Lemma 6.1 this corresponds to the \({{\textsc {bkm}}}\)-selfadjointness of \(\mathscr {L}\), which is the assumption of \({{\textsc {bkm}}}\)-detailed balance.

This argument shows that the necessity of bkm-detailed balance for the gradient flow structure follows from Corollary 2.6. To show that bkm-detailed balance is also sufficient, we note first that condition (ii) of Corollary 2.6 is simply the stationarity condition \(\mathscr {L}^\dagger \sigma = 0\), which holds by assumption. Thus, it remains to show that condition (i) of Corollary 2.6 is implied by the assumption of \({{\textsc {bkm}}}\)-detailed balance. Then the existence of the gradient flow structure follows by applying Corollary 2.6 in the opposite direction.

For this purpose, recall that \(f = H_\sigma \) and \(X = \mathscr {L}^\dagger \), so that

$$\begin{aligned} \nabla _{X} f = {{\,\textrm{Tr}\,}}[(\log \rho - \log \sigma ) \mathscr {L}^\dagger \rho ] = - I_\sigma . \end{aligned}$$

Hence, condition (i) is the strict positivity of the entropy production \(I_\sigma (\rho )\) for \(\rho \ne \sigma \), which follows from the assumption of \({{\textsc {bkm}}}\)-detailed balance by Proposition 6.2. \(\square \)