Banach manifold structure and infinite-dimensional analysis for causal fermion systems

Finster, Felix; Lottner, Magdalena

doi:10.1007/s10455-021-09775-4

Published: 31 May 2021

Volume 60, pages 313–354, (2021)

Annals of Global Analysis and Geometry

Abstract

A mathematical framework is developed for the analysis of causal fermion systems in the infinite-dimensional setting. It is shown that the regular spacetime point operators form a Banach manifold endowed with a canonical Fréchet-smooth Riemannian metric. The so-called expedient differential calculus is introduced with the purpose of treating derivatives of functions on Banach spaces which are differentiable only in certain directions. A chain rule is proven for Hölder continuous functions which are differentiable on expedient subspaces. These results are made applicable to causal fermion systems by proving that the causal Lagrangian is Hölder continuous. Moreover, Hölder continuity is analyzed for the integrated causal Lagrangian.

1 Introduction

The theory of causal fermion systems is a recent approach to fundamental physics (see the basics in Sect. 2, the reviews [11, 12, 16], the textbook [10] or the website [1]). In this approach, spacetime and all objects therein are described by a measure $\rho $ on a set ${\mathcal{F}}$ of linear operators of rank at most 2n on a Hilbert space $({\mathcal{H}}, \langle .|. \rangle _{{\mathcal{H}}})$. The physical equations are formulated via the so-called causal action principle, a nonlinear variational principle where an action ${\mathcal{S}}$ is minimized under variations of the measure $\rho $. If the Hilbert space ${\mathcal{H}}$ is finite-dimensional, the set ${\mathcal{F}}$ is a locally compact topological space. Making essential use of this fact, it was shown in [9] that the causal action principle is well defined and that minimizers exist. Moreover, as is worked out in detail in [15], the interior of ${\mathcal{F}}$ (consisting of the so-called regular points; see Definition 3.1) has a smooth manifold structure. Taking these structures as the starting point, causal variational principles were formulated and studied as a mathematical generalization of the causal action principle, where an action of the form

$$\begin{aligned} {\mathcal{S}}= \int _{{\mathcal{F}}} \mathrm{d}\rho (x) \int _{{\mathcal{F}}} \mathrm{d}\rho (y)\, {\mathcal{L}}(x,y) \end{aligned}$$

is minimized for a given lower-semicontinuous Lagrangian ${\mathcal{L}}: {\mathcal{F}}\times {\mathcal{F}}\rightarrow {\mathbb{R}}^+_0$ on an (in general non-compact) manifold ${\mathcal{F}}$ under variations of $\rho $ within the class of regular Borel measures, keeping the total volume $\rho ({\mathcal{F}})$ fixed. We refer the reader interested in causal variational principles to [19, Section 1 and 2] and the references therein.

This article is devoted to the case that the Hilbert space ${\mathcal{H}}$ is infinite-dimensional and separable. While the finite-dimensional setting seems suitable for describing physical spacetime on a fundamental level (where spacetime can be thought of as being discrete on a microscopic length scale usually associated with the Planck length), an infinite-dimensional Hilbert space arises in mathematical extrapolations where spacetime is continuous and has infinite volume. Most notably, infinite-dimensional Hilbert spaces come up in the examples of causal fermion systems describing Minkowski space (see [10, Section 1.2] or [26]) or a globally hyperbolic Lorentzian manifold (see for example [11]), and they are also needed for analyzing the limiting case of a classical interaction (the so-called continuum limit; see [10, Section 1.5.2 and Chapters 3-5]). A workaround to avoid infinite-dimensional analysis is to restrict attention to locally compact variations, as is done in [14, Section 2.3]. Nevertheless, in view of the importance of the examples and physical applications, it is a task of growing significance to analyze causal fermion systems systematically in the infinite-dimensional setting. It is the objective of this paper to put this analysis on a sound mathematical basis.

We now outline the main points of our constructions and explain our main results. Extending methods and results in [15] to the infinite-dimensional setting, we endow the set of all regular points of ${\mathcal{F}}$ with the structure of a Banach manifold (see Definition 3.1 and Theorem 3.4). To this end, we construct an atlas formed of so-called symmetric wave charts (see Definition 3.3). We also show that the Hilbert–Schmidt norm on finite-rank operators on ${\mathcal{H}}$ gives rise to a Fréchet-smooth Riemannian metric on this Banach manifold. More precisely, in Theorems 3.11 and 3.12, we prove that ${\mathcal{F}}^{\mathrm{reg}}$ is a smooth Banach submanifold of the Hilbert space ${\mathscr{S}}({\mathcal{H}})$ of selfadjoint Hilbert–Schmidt operators, with the Riemannian metric given by

$$\begin{aligned} g_x \, :\, T_x^{{\mathscr{S}}} {\mathcal{F}}^{\mathrm{reg}} \times T_x^{{\mathscr{S}}} {\mathcal{F}}^{\mathrm{reg}} \rightarrow {\mathbb{R}},\qquad g_x(A,B) := {{\,\mathrm{tr}\,}}(AB) . \end{aligned}$$

In order to introduce higher derivatives at a regular point $p \in {\mathcal{F}}$, our strategy is to always work in the distinguished symmetric wave chart around this point. This has the advantage that we can avoid the analysis of differentiability properties under coordinate transformations. The remaining difficulty is that the causal Lagrangian ${\mathcal{L}}$ and other derived functions are not differentiable. Instead, directional derivatives exist only in certain directions. In general, these directions do not form a vector space. As a consequence, the derivative is not a linear mapping, and the usual product and chain rules cease to hold. On the other hand, these computation rules are needed in the applications, and it is often sensible to assume that they do hold. This motivates our strategy of looking for a vector space on which the function under consideration is differentiable. Clearly, in this way, we lose information on the differentiability in certain directions which do not lie in such a vector space. But this shortcoming is outweighed by the benefit that we can avoid the subtleties of non-smooth analysis, which, at least for most applications we have in mind, would be impractical and inappropriately technical. Clearly, we want the subspace to be as large as possible, and moreover, it should be defined canonically without making any arbitrary choices. These requirements lead us to the notion of expedient subspaces (see Definition 4.2). In general, the expedient subspace is neither dense nor closed. On these expedient subspaces, the function is Gâteaux differentiable, the derivative is a linear mapping, and higher derivatives are multilinear.

The differential calculus on expedient subspaces is compatible with the chain rule in the following sense: If f is locally Hölder continuous and $\gamma $ is a smooth curve whose derivatives up to sufficiently high order lie in the expedient differentiable subspace of f, then the composition $f \circ \gamma $ is differentiable and the chain rule holds (see Proposition 4.4), i.e.,

$$\begin{aligned} (f\circ \gamma )'(t_0) = D^{{\mathcal{E}}}f|_{x_0}\, \gamma '(t_0) , \end{aligned}$$

where the index ${\mathcal{E}}$ denotes the derivative on the expedient subspace. We also prove a chain rule for higher derivatives (see Proposition 4.5). The requirement of Hölder continuity is a crucial assumption needed in order to control the error term of the linearization. The most general statement is Theorem 5.8 where Hölder continuity is required only on a subspace which contains the curve $\gamma $ locally.

We also work out how the differential calculus on expedient subspaces applies to the setting of causal fermion systems. In order to establish the chain rule, we prove that the causal Lagrangian is indeed locally Hölder continuous with uniform Hölder exponent (Theorem 5.1), and we analyze how the Hölder constant depends on the base point (Theorem 5.3). Moreover, we prove that for all $x,y \in {\mathcal{F}}$, there is a neighborhood $U\subseteq {\mathcal{F}}$ of y with (see (5.9))

$$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) | \le c(n, y) \Vert x\Vert ^2\, \Vert \tilde{y}-y\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,\tilde{y}\in U \end{aligned}$$

(where 2n is the maximal rank of the operators in ${\mathcal{F}}$). Relying on these results, we can generalize the jet formalism as introduced in [17] for causal variational principles to the infinite-dimensional setting (Sect. 5.2). We also work out the chain rule for the Lagrangian (Theorem 5.6) and for the function $\ell $ obtained by integrating one of the arguments of the Lagrangian (Theorem 5.9),

$$\begin{aligned} \ell (x) = \int _M {\mathcal{L}}(x,y)\, \mathrm{d}\rho (y) - \mathfrak {s}\end{aligned}$$

(1.1)

(where $\mathfrak{s}$ is a positive constant).

The paper is organized as follows. Section 2 provides the necessary preliminaries on causal fermion systems and infinite-dimensional analysis. In Sect. 3, an atlas of symmetric wave charts is constructed, and it is shown that this atlas endows the regular points of ${\mathcal{F}}$ with the structure of a Fréchet-smooth Banach manifold. Moreover, it is shown that the Hilbert–Schmidt norm induces a Fréchet-smooth Riemannian metric. In Sect. 4, the differential calculus on expedient subspaces is developed. In Sect. 5, this differential calculus is applied to causal fermion systems. Appendix 1 gives some more background information on the Fréchet derivative. Finally, Appendix 2 provides details on what the Riemannian metric looks like in different charts.

We finally point out that in order to address a coherent readership, concrete applications of our methods and results for example to physical spacetimes have not been included here. The example of causal fermion systems in Minkowski space will be worked out separately in [25].

2 Preliminaries

2.1 Causal fermion systems and the causal action principle

We now recall the basic definitions of a causal fermion system and the causal action principle.

Definition 2.1

(causal fermion system) Given a separable complex Hilbert space ${\mathcal{H}}$ with scalar product $\langle .|. \rangle _{{\mathcal{H}}}$ and a parameter $n \in {\mathbb{N}}$ (the “spin dimension”), we let ${\mathcal{F}}\subseteq \mathrm{L}({\mathcal{H}})$ be the set of all selfadjoint operators on ${\mathcal{H}}$ of finite rank, which (counting multiplicities) have at most n positive and at most n negative eigenvalues. On ${\mathcal{F}}$, we are given a positive measure $\rho $ (defined on a $\sigma $-algebra of subsets of ${\mathcal{F}}$), the so-called universal measure. We refer to $({\mathcal{H}}, {\mathcal{F}}, \rho )$ as a causal fermion system.

A causal fermion system describes a spacetime together with all structures and objects therein. In order to single out the physically admissible causal fermion systems, one must formulate physical equations. To this end, we impose that the universal measure should be a minimizer of the causal action principle, which we now introduce.

For any $x, y \in {\mathcal{F}}$, the product xy is an operator of rank at most 2n. However, in general, it is no longer a selfadjoint operator because $(xy)^* = yx$, and this is different from xy unless x and y commute. As a consequence, the eigenvalues of the operator xy are in general complex. We denote these eigenvalues counting algebraic multiplicities by $\lambda ^{xy}_1, \ldots , \lambda ^{xy}_{2n} \in {\mathbb{C}}$ (more specifically, denoting the rank of xy by $k \le 2n$, we choose $\lambda ^{xy}_1, \ldots , \lambda ^{xy}_{k}$ as all the nonzero eigenvalues and set $\lambda ^{xy}_{k+1}, \ldots , \lambda ^{xy}_{2n}=0$). We introduce the Lagrangian and the causal action by

$$\begin{aligned} {\text{Lagrangian:}} \qquad {\mathcal{L}}(x,y)&= \frac{1}{4n} \sum _{i,j=1}^{2n} \left( \left|\lambda ^{xy}_i \right| - \left|\lambda ^{xy}_j \right| \right)^2 \end{aligned}$$

(2.1)

$$\begin{aligned} {\text{causal action:}} \qquad {\mathcal{S}}(\rho )&= \iint _{{\mathcal{F}}\times {\mathcal{F}}} {\mathcal{L}}(x,y)\, d\rho (x)\, d\rho (y) . \end{aligned}$$

(2.2)

The causal action principle is to minimize ${\mathcal{S}}$ by varying the measure $\rho $ under the following constraints:

$$\begin{aligned}&{\text{volume constraint:}} \qquad \rho ({\mathcal{F}}) = \text{const} \quad \;\; \end{aligned}$$

(2.3)

$$\begin{aligned}&{\text{trace constraint:}} \qquad \int _{{\mathcal{F}}} {{\,\mathrm{tr}\,}}(x)\, \mathrm{d}\rho (x) = \text{const} \end{aligned}$$

(2.4)

$$\begin{aligned}&{\text{boundedness constraint:}} \qquad \iint _{{\mathcal{F}}\times {\mathcal{F}}} |xy|^2 \, d\rho (x)\, d\rho (y) \le C , \end{aligned}$$

(2.5)

where C is a given parameter, ${{\,\mathrm{tr}\,}}$ denotes the trace of a linear operator on ${\mathcal{H}}$, and the absolute value of xy is the so-called spectral weight,

$$\begin{aligned} |xy| := \sum _{j=1}^{2n} \left|\lambda ^{xy}_j \right| . \end{aligned}$$
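The definitions of the Lagrangian (2.1) and the spectral weight can be made concrete in the smallest possible toy model. The following sketch assumes spin dimension $n=1$ and the Hilbert space ${\mathbb{C}}^2$, so that the operators are Hermitian $2 \times 2$ matrices with at most one positive and one negative eigenvalue. The matrices `x`, `y`, `z` and all function names are illustrative choices, not taken from the article.

```python
import cmath

N = 1  # spin dimension of this toy model; operators act on H = C^2


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def eigenvalues2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)


def lagrangian(x, y):
    """Causal Lagrangian (2.1) for N = 1."""
    lam = [abs(ev) for ev in eigenvalues2(matmul2(x, y))]
    return sum((a - b) ** 2 for a in lam for b in lam) / (4 * N)


def spectral_weight(x, y):
    """Spectral weight |xy|: sum of the absolute values of the eigenvalues."""
    return sum(abs(ev) for ev in eigenvalues2(matmul2(x, y)))


x = ((1, 0), (0, -1))
y = ((2, 0), (0, -1))  # commutes with x: xy has the real eigenvalues 2 and 1
z = ((0, 1), (1, 0))   # does not commute with x: xz has eigenvalues +i and -i

print(lagrangian(x, y), spectral_weight(x, y))
print(lagrangian(x, z), spectral_weight(x, z))
```

In the second pair, the product is not selfadjoint and its eigenvalues form a complex conjugate pair of equal absolute value, so the Lagrangian vanishes although the spectral weight does not.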

This variational principle is mathematically well posed if ${\mathcal{H}}$ is finite-dimensional. For the existence theory and the analysis of general properties of minimizing measures, we refer to [3, 8, 9]. In the existence theory, one varies in the class of regular Borel measures (with respect to the topology on $\mathrm{L}({\mathcal{H}})$ induced by the operator norm), and the minimizing measure is again in this class. With this in mind, here, we always assume that

$$\begin{aligned} \rho \ \text{is a regular Borel measure}. \end{aligned}$$

Let $\rho $ be a minimizing measure. Spacetime is defined as the support of this measure,

$$\begin{aligned} M := {{\,\mathrm{supp}\,}}\rho . \end{aligned}$$

Thus, the spacetime points are selfadjoint linear operators on ${\mathcal{H}}$. These operators contain a lot of additional information which, if interpreted correctly, gives rise to spacetime structures like causal and metric structures, spinors and interacting fields. We refer the interested reader to [10, Chapter 1].

The only results on the structure of minimizing measures which will be needed here concern the treatment of the trace constraint and the boundedness constraint. As a consequence of the trace constraint, for any minimizing measure $\rho $, the local trace is constant in spacetime, i.e., there is a real constant $c \ne 0$ such that (see [10, Proposition 1.4.1])

$$\begin{aligned} {{\,\mathrm{tr}\,}}x = c \qquad \text{for all}\,x \in M . \end{aligned}$$

Restricting attention to operators with fixed trace, the trace constraint (2.4) is equivalent to the volume constraint (2.3) and may be disregarded. The boundedness constraint, on the other hand, can be treated with a Lagrange multiplier. Indeed, as is made precise in [3, Theorem 1.3], for every minimizing measure $\rho $, there is a Lagrange multiplier $\kappa >0$ such that $\rho $ is a local minimizer of the causal action with the Lagrangian replaced by

$$\begin{aligned} {\mathcal{L}}_{\kappa }(x,y) := {\mathcal{L}}(x,y) + \kappa \, |xy|^2 , \end{aligned}$$

leaving out the boundedness constraint.

2.2 Fréchet and Gâteaux derivatives

We now recall a few basic concepts from the differential calculus on normed vector spaces. In what follows, we let $(E, \Vert .\Vert _E)$ and $(F, \Vert .\Vert _F)$ be real normed vector spaces. The most common concept is that of the Fréchet derivative.

Definition 2.2

Let $U \subseteq E$ be open and $f : U \rightarrow F$ be an F-valued function on U. The function f is Fréchet-differentiable in $x_0 \in U$ if there is a bounded linear mapping $A \in \mathrm{L}(E, F)$ such that

$$\begin{aligned} f(x) = f(x_0) + A\, (x-x_0) + r(x) , \end{aligned}$$

where the error term $r : U \rightarrow F$ goes to zero faster than linearly, i.e.,

$$\begin{aligned} \lim _{x \rightarrow x_0, x \ne x_0} \frac{\Vert r(x)\Vert _F}{\Vert x-x_0\Vert _E} = 0 . \end{aligned}$$

The linear operator A is the Fréchet derivative, also denoted by $Df|_{x_0}$. A function is Fréchet-differentiable in U if it is Fréchet-differentiable at every point of U.
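As a finite-dimensional illustration of Definition 2.2, the following sketch checks numerically that the remainder goes to zero faster than linearly. It assumes the example $E = {\mathbb{R}}^2$, $F = {\mathbb{R}}$ and $f(x) = \Vert x\Vert ^2$, for which the candidate derivative at $x_0$ is $A\,h = 2 \langle x_0, h \rangle $. The function names and the chosen base point are hypothetical.

```python
import math


def f(x):
    """Squared Euclidean norm on R^2."""
    return x[0] ** 2 + x[1] ** 2


def candidate_derivative(x0, h):
    """Candidate Frechet derivative A h = 2 <x0, h>."""
    return 2 * (x0[0] * h[0] + x0[1] * h[1])


def remainder_ratio(x0, h):
    """Quotient ||r(x)|| / ||x - x0|| for x = x0 + h."""
    x = (x0[0] + h[0], x0[1] + h[1])
    r = f(x) - f(x0) - candidate_derivative(x0, h)
    return abs(r) / math.hypot(h[0], h[1])


x0 = (1.0, 2.0)
ratios = [remainder_ratio(x0, (t, -t)) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

For this particular f, the remainder equals $\Vert h\Vert ^2$, so the printed quotients shrink proportionally to $\Vert h\Vert $.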

The Fréchet derivative is uniquely defined. Moreover, the concept can be iterated to define higher derivatives. Indeed, if f is differentiable in U, its derivative Df is a mapping

$$\begin{aligned} Df \, :\, U \rightarrow \mathrm{L}(E,F) . \end{aligned}$$

Since $\mathrm{L}(E,F)$ is a normed vector space (with the operator norm), we can apply Definition 2.2 once again to define the second derivative at a point $x_0$ by

$$\begin{aligned} D^2f|_{x_0} = D\left( Df \right) \big|_{x_0} \;\in \; \mathrm{L}\left( E, \mathrm{L}(E,F) \right) . \end{aligned}$$

The second derivative can also be viewed as a bilinear mapping from $E \times E$ to F,

$$\begin{aligned} D^2f|_{x_0} : E \times E \rightarrow F,\qquad D^2f|_{x_0}(u,v) := \left( D\left( Df \right) \big|_{x_0} u \right) v . \end{aligned}$$

It is by definition bounded, meaning that there is a constant $c>0$ such that

$$\begin{aligned} \big \Vert D^2f|_{x_0}(u,v) \big \Vert _F \le c \, \Vert u\Vert _E\, \Vert v\Vert _E \qquad \text{for all}\,u, v \in E. \end{aligned}$$

By iteration, one obtains similarly the Fréchet derivatives of order $p \in {\mathbb{N}}$ as multilinear operators

$$\begin{aligned} D^pf|_{x_0} : \underbrace{E \times \cdots \times E}_{p\ \mathrm{factors}} \rightarrow F . \end{aligned}$$

A function is Fréchet-smooth on U if it is Fréchet-differentiable to every order.

Lemma 2.3

If the function $f : U \subseteq E \rightarrow F$ is p times Fréchet-differentiable in $x_0 \in U$, then its $p^{\mathrm{th}}$ Fréchet derivative is symmetric, i.e., for any $u_1, \ldots , u_p \in E$ and any permutation $\sigma \in {{\mathcal{S}}}_p$,

$$\begin{aligned} D^pf|_{x_0}\left(u_1, \ldots , u_p \right) = D^pf|_{x_0} \left(u_{\sigma (1)}, \ldots , u_{\sigma (p)} \right) . \end{aligned}$$

We omit the proof, which can be found for example in [5, Section 4.4]. For the Fréchet derivative, most concepts familiar from the finite-dimensional setting carry over immediately. In particular, the composition of Fréchet-differentiable functions is again Fréchet-differentiable. Moreover, the chain and product rules hold. We refer for the details to [5, Sections 2.2 and 2.3], [6, Chapter 8] and Appendix 1.

A weaker concept of differentiability which we will use here is Gâteaux differentiability.

Definition 2.4

Let $U \subseteq E$ be open and $f : U \rightarrow F$ be an F-valued function on U. The function f is Gâteaux differentiable in $x_0 \in U$ in the direction $u \in E$ if the limit of the difference quotient exists,

$$\begin{aligned} d_u f(x_0) := \lim _{h \rightarrow 0, h \ne 0} \frac{f(x_0+h u) - f(x_0)}{h} . \end{aligned}$$

The resulting vector $d_u f(x_0) \in F$ is the Gâteaux derivative.

By definition, the Gâteaux derivative is homogeneous of degree one, i.e.,

$$\begin{aligned} d_{\lambda u} f(x_0) = \lambda \, d_u f(x_0) \qquad \text{for all}\,\lambda \in {\mathbb{R}}. \end{aligned}$$

Moreover, if f is Fréchet-differentiable in $x_0$, then it is also Gâteaux differentiable in any direction $u \in E$ and

$$\begin{aligned} d_u f(x_0) = Df|_{x_0} u . \end{aligned}$$

However, the converse is not true because, even if the Gâteaux derivatives exist for any $u \in E$, it is in general not possible to represent them by a bounded linear operator. As a consequence, the chain and product rules in general do not hold for Gâteaux derivatives. We shall come back to this issue in Sect. 5.
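A standard textbook example (not taken from the article) shows how Gâteaux derivatives can exist in every direction without being linear in the direction. The sketch below assumes the function $f(x,y) = x^2 y / (x^2+y^2)$ on ${\mathbb{R}}^2$ with $f(0,0)=0$ and evaluates the difference quotient of Definition 2.4 at the origin. The step size `h` is an arbitrary choice.

```python
def f(x, y):
    """Continuous on R^2, homogeneous of degree one, but not linear."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x * x + y * y)


def gateaux(u, h=1e-6):
    """Difference quotient (f(0 + h u) - f(0)) / h at the origin."""
    return (f(h * u[0], h * u[1]) - f(0.0, 0.0)) / h


d_e1 = gateaux((1.0, 0.0))   # derivative in direction e1
d_e2 = gateaux((0.0, 1.0))   # derivative in direction e2
d_sum = gateaux((1.0, 1.0))  # derivative in direction e1 + e2

print(d_e1, d_e2, d_sum)
```

The derivatives in the directions $e_1$ and $e_2$ both vanish, whereas the derivative in the direction $e_1+e_2$ is close to $1/2$. Hence $u \mapsto d_u f(0)$ is homogeneous but not additive, and no linear operator can represent it.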

2.3 Banach manifolds

We recall the basic definition of a smooth Banach manifold (for more details see for example [29, Chapter 73]).

Definition 2.5

Let B be a Hausdorff topological space and $(E, \Vert .\Vert _E)$ a Banach space. A chart $(U, \phi )$ is a pair consisting of an open subset $U \subseteq B$ and a homeomorphism $\phi $ of U to an open subset $V := \phi (U)$ of E, i.e.,

$$\begin{aligned} \phi \, :\, U \overset{\mathrm{open}}{\subseteq } B \rightarrow V \overset{\mathrm{open}}{\subseteq } E . \end{aligned}$$

A smooth atlas ${{\mathcal{A}}} = ( \phi _i, U_i, E)_{i \in I}$ is a collection of charts (for a general index set I) with the properties that the domains of the charts cover B,

$$\begin{aligned} B = \bigcup _{i \in I} U_i \end{aligned}$$

and that for any $i, j \in I$, the transition map

$$\begin{aligned} \phi _j \circ \phi _i^{-1} \, :\, \phi _i \left( U_i \cap U_j \right) \subseteq E \rightarrow \phi _j \left( U_i \cap U_j \right) \end{aligned}$$

is Fréchet-smooth. Two atlases $( \phi _i, U_i, E)_{i \in I}$ and $( \psi _j, V_j, E)_{j \in J}$ are called equivalent if all the transition maps $\psi _j \circ \phi _i^{-1}$ and $\phi _i \circ \psi _j^{-1}$ are Fréchet-smooth. We denote the corresponding equivalence class by $[{\mathcal{A}}]$. The union of the charts of all atlases in $[{\mathcal{A}}]$ is called the maximal atlas ${\mathcal{A}}_{\mathrm{max}}$. The triple $(B, E, {{\mathcal{A}}})$ is referred to as a smooth Banach manifold with differentiable structure provided by ${\mathcal{A}}_{\mathrm{max}}$.

Definition 2.6

Just as in the case of finite-dimensional manifolds, we call a function $f: U\subseteq A \rightarrow B$ between two smooth Banach manifolds $(A, E, {\mathcal{A}})$ and $(B, G, {\mathcal{B}})$ (with $U\subseteq A$ open) n-times (Fréchet) differentiable (resp. smooth) if for all combinations of charts $\phi _a:U_a \rightarrow V_a$ and $\phi _b: U_b \rightarrow V_b$ of some (and thus all) atlases $\tilde{{\mathcal{A}}}$ in $[{\mathcal{A}}]$, respectively, $\tilde{{\mathcal{B}}}$ in $[{\mathcal{B}}]$, the mapping $\phi _b \circ f\circ \phi _a^{-1}: V_a\rightarrow V_b$ is n-times (Fréchet) differentiable (resp. smooth).

3 Smooth Banach manifold structure of ${\mathcal{F}}^{\mathrm{reg}}$

In the definition of causal fermion systems, the number of positive or negative eigenvalues of the operators in ${\mathcal{F}}$ can be strictly smaller than n. This is important because it makes ${\mathcal{F}}$ a closed subspace of $\mathrm{L}({\mathcal{H}})$ (with respect to the norm topology), which in turn is crucial for the general existence results for minimizers of the causal action principle (see [9] or [18]). However, in most physical examples in Minkowski space or in a Lorentzian spacetime, all the operators in M do have exactly n positive and exactly n negative eigenvalues. This motivates the following definition (see also [10, Definition 1.1.5]).

Definition 3.1

An operator $x \in {\mathcal{F}}$ is said to be regular if it has the maximal possible rank, i.e., $\dim x({\mathcal{H}}) = 2n$. Otherwise, the operator is called singular. A causal fermion system is regular if all its spacetime points are regular.

In what follows, we restrict attention to regular causal fermion systems. Moreover, it is convenient to also restrict attention to all those operators in ${\mathcal{F}}$ which are regular,

$$\begin{aligned} {\mathcal{F}}^{\mathrm{reg}} := \big \{ x \in {\mathcal{F}}\,|\, x\ \text{is regular} \big \} . \end{aligned}$$

${\mathcal{F}}^{\mathrm{reg}}$ is a dense open subset of ${\mathcal{F}}$ (again with respect to the norm topology on $\mathrm{L}({\mathcal{H}})$).

3.1 Wave charts and symmetric wave charts

We now choose specific charts and prove that the resulting atlas endows ${\mathcal{F}}^{\mathrm{reg}}$ with the structure of a smooth Banach manifold (see Definition 2.5). In the finite-dimensional setting, these charts were introduced in [15]. We now recall their definition and generalize the constructions to the infinite-dimensional setting.

Given $x \in {\mathcal{F}}^{\mathrm{reg}}$, we denote the image of x by $I:=x({\mathcal{H}})$. We consider I as a 2n-dimensional Hilbert space with the scalar product induced from $\langle .|. \rangle _{{\mathcal{H}}}$. Denoting its orthogonal complement by $J:=I^{\perp }$, we obtain the orthogonal sum decomposition

$$\begin{aligned} {\mathcal{H}}= I \oplus J . \end{aligned}$$

This also gives rise to a corresponding decomposition of operators, like for example

$$\begin{aligned} \mathrm{L}({\mathcal{H}}, I) = \mathrm{L}(I,I) \oplus \mathrm{L}(J,I). \end{aligned}$$

(3.1)

Given an operator $\psi \in \mathrm{L}({\mathcal{H}}, I)$, we denote its adjoint by $\psi ^{\dagger } \in \mathrm{L}(I, {\mathcal{H}})$; it is defined by the relation

$$\begin{aligned} \langle u \,|\, \psi \,v \rangle _I = \langle \psi ^{\dagger } \,u \,|\, v \rangle _{{\mathcal{H}}} \qquad \text{for all}\,u \in I\ \text{and}\,v \in {\mathcal{H}}. \end{aligned}$$

We now form the operator

$$\begin{aligned} R_x(\psi ) := \psi ^{\dagger } \,x\, \psi \in \mathrm{L}({\mathcal{H}}) . \end{aligned}$$

(3.2)

By construction, this operator is symmetric and has at most n positive and at most n negative eigenvalues. Therefore, it is an operator in ${\mathcal{F}}$. Using (3.1), we conclude that $R_x$ is a mapping

$$\begin{aligned} R_x \, :\, \mathrm{L}(I,I) \oplus \mathrm{L}(J,I) \rightarrow {\mathcal{F}}. \end{aligned}$$

(3.3)

Before going on, it is useful to rewrite the operator $R_x(\psi )$ in a slightly different way. On I, one can also introduce the indefinite inner product

$$\begin{aligned} \prec .|. \succ _x \, :\, S_x \times S_x \rightarrow {\mathbb{C}}, \qquad \prec u | v \succ _x = -\langle u | x v \rangle _{{\mathcal{H}}} , \end{aligned}$$

(3.4)

referred to as the spin inner product. For conceptual clarity, we denote I endowed with the spin inner product by $(S_x, \prec .|. \succ _x)$ and refer to it as the spin space at x (for more details on the spin spaces, we refer for example to [10, Section 1.1]). It is an indefinite inner product space of signature (n, n). We denote the adjoint with respect to the spin inner product by a star. More specifically, for a linear operator $A \in \mathrm{L}(S_x)$, the adjoint is defined by

$$\begin{aligned} \prec \phi \,|\, A\, \tilde{\phi } \succ _x = \prec A^* \,\phi \,|\, \tilde{\phi } \succ _x \qquad \text{for all}\,\phi , \tilde{\phi } \in S_x. \end{aligned}$$

Using again the definition of the spin inner product (3.4), we can rewrite this equation as

$$\begin{aligned} -\langle \phi \,|\, X\,A \tilde{\phi } \rangle _{{\mathcal{H}}} = -\langle A^* \phi \,|\,X \tilde{\phi } \rangle _{{\mathcal{H}}} , \end{aligned}$$

where we introduced the short notation

$$\begin{aligned} X := x|_{S_x} \, :\, S_x \rightarrow S_x . \end{aligned}$$

(3.5)

Taking adjoints in the Hilbert space ${\mathcal{H}}$ gives

$$\begin{aligned} -\langle X^{-1}\, A^{\dagger }\,X \phi \,|\,X \tilde{\phi } \rangle _{{\mathcal{H}}} = -\langle A^* \phi \,|\,X \tilde{\phi } \rangle _{{\mathcal{H}}} \end{aligned}$$

(note that the operator X is invertible because $S_x$ is by definition its image). We thus obtain the relation

$$\begin{aligned} A^* = X^{-1}\, A^{\dagger }\,X . \end{aligned}$$

(3.6)
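Relation (3.6) can be checked numerically in a toy example. The sketch below assumes $n=1$ and $S_x = {\mathbb{C}}^2$ with $X = \mathrm{diag}(1,-1)$, so that the spin inner product (3.4) has signature (1, 1). The operator `A` and the vectors `phi`, `phi_tilde` are arbitrary illustrative choices.

```python
def matmul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(a, v):
    """Matrix-vector product."""
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]


def dagger(a):
    """Hilbert space adjoint: conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def inner(u, v):
    """Scalar product on C^2, antilinear in the first argument."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


X = [[1, 0], [0, -1]]      # restriction of x to its image
X_INV = [[1, 0], [0, -1]]  # X is its own inverse in this toy example


def spin_inner(u, v):
    """Spin inner product (3.4): -<u | X v>."""
    return -inner(u, apply(X, v))


def spin_adjoint(a):
    """Relation (3.6): A^* = X^{-1} A^dagger X."""
    return matmul(X_INV, matmul(dagger(a), X))


A = [[1 + 2j, 3], [1j, 2 - 1j]]
phi = [1 + 1j, 2]
phi_tilde = [3, -1j]

lhs = spin_inner(phi, apply(A, phi_tilde))
rhs = spin_inner(apply(spin_adjoint(A), phi), phi_tilde)
print(lhs, rhs)
```

Both printed numbers agree, which is the defining property of the adjoint with respect to the spin inner product. Since the spin inner product is indefinite, $A^*$ differs from $A^{\dagger }$ whenever $A^{\dagger }$ does not commute with X.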

Using such transformations, one readily verifies that identifying the image of $\psi $ with a subspace of $S_x$, the right side of (3.2) can be written as $-\psi ^* \psi $ (for details, see [15, Lemma 2.2]). Thus, with this identification, the operator $R_x$ can be written instead of (3.2) and (3.3) in the equivalent form

$$\begin{aligned} R_x \, :\, \mathrm{L}(I,S_x) \oplus \mathrm{L}(J,S_x) \rightarrow {\mathcal{F}}, \qquad R_x(\psi ) = -\psi ^* \psi , \end{aligned}$$

(3.7)

where $\psi ^*$ is the adjoint with respect to the corresponding inner products, i.e.,

$$\begin{aligned} \prec \phi \,|\, \psi \,u \succ _x = \langle \psi ^* \phi \,|\, u \rangle _{{\mathcal{H}}} \qquad \text{for}\,u \in {\mathcal{H}}\ \text{and}\,\phi \in S_x. \end{aligned}$$

We want to use the operator $R_x$ in order to construct local parametrizations of ${\mathcal{F}}^{\mathrm{reg}}$. The main difficulty is that the operator $R_x$ is not injective. For an explanation of this point in the context of local gauge freedom, we refer to [15]. Here, we merely explain how to arrange that $R_x$ becomes injective. We let $\mathrm{Symm}(S_x) \subseteq \mathrm{L}(S_x)$ be the real vector space of all operators A on $S_x$ which are symmetric with respect to the spin inner product, i.e.,

$$\begin{aligned} \prec \phi | A \tilde{\phi } \succ _x = \prec A \phi | \tilde{\phi } \succ _x \qquad \text{for all}\,\phi , \tilde{\phi } \in S_x. \end{aligned}$$

We now restrict the operator $R_x$ in (3.3) and (3.7) to

$$\begin{aligned} R_x^{\mathrm{symm}} := R_x|_{\mathrm{Symm}(S_x) \oplus \mathrm{L}(J,S_x)} \, :\, \mathrm{Symm}(S_x) \oplus \mathrm{L}(J,S_x) \rightarrow {\mathcal{F}},\qquad R_x^{\mathrm{symm}}(\psi ) = -\psi ^* \psi . \end{aligned}$$

(3.8)

We write the direct sum decomposition as

$$\begin{aligned} \psi = \psi _I + \psi _J \qquad \text{with} \qquad \psi _I \in \text{Symm}(S_x),\; \psi _J \in \mathrm{L}(J,S_x). \end{aligned}$$

Extending the analysis in [15, Section 6.1] to the infinite-dimensional setting, one finds that this mapping is a local parametrization of ${\mathcal{F}}^{\mathrm{reg}}$:

Theorem 3.2

There is an open neighborhood $W_x$ of $(\mathrm{id}_{S_x}, 0) \in \mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)$ such that the restriction of $R_x^{\mathrm{symm}}$ maps to an open subset $\Omega _x :=R_x^{\mathrm{symm}}(W_x)$ of ${\mathcal{F}}^{\mathrm{reg}}$,

$$\begin{aligned} R_x^{\mathrm{symm}}|_{W_x} \, :\, W_x \rightarrow \Omega _x \overset{\mathrm{open}}{\subseteq } {\mathcal{F}}^{\mathrm{reg}} , \end{aligned}$$

and is a homeomorphism to its image (always with respect to the topology induced by the operator norm on $\mathrm{L}({\mathcal{H}})$).

Proof

The estimate

$$\begin{aligned}&\Vert R_x^{\mathrm{symm}}(\psi ) - R_x^{\mathrm{symm}}(\tilde{\psi }) \Vert _{\mathrm{L}({\mathcal{H}})} \nonumber \\&\quad = \big \Vert \psi ^* \psi - \tilde{\psi }^* \tilde{\psi } \big \Vert _{\mathrm{L}({\mathcal{H}})} \le \big \Vert \psi ^* \psi - \psi ^* \tilde{\psi } \big \Vert _{\mathrm{L}({\mathcal{H}})} +\big \Vert \psi ^* \tilde{\psi } - \tilde{\psi }^* \tilde{\psi } \big \Vert _{\mathrm{L}({\mathcal{H}})} \nonumber \\&\quad \le \Vert \psi ^*\Vert _{\mathrm{L}({\mathcal{H}})} \, \big \Vert \psi - \tilde{\psi } \big \Vert _{\mathrm{L}({\mathcal{H}})} + \big \Vert \psi ^* - \tilde{\psi }^* \big \Vert _{\mathrm{L}({\mathcal{H}})}\, \Vert \tilde{\psi }\Vert _{\mathrm{L}({\mathcal{H}})} \end{aligned}$$

(3.9)

shows that $R_x^{\mathrm{symm}}$ is continuous. Since the point $R_x^{\mathrm{symm}}(\mathrm{id}_{S_x}, 0)=x \in {\mathcal{F}}^{\mathrm{reg}}$ is regular and ${\mathcal{F}}^{\mathrm{reg}}$ is open in ${\mathcal{F}}$, by continuity, we may choose an open neighborhood $W_x$ of $(\mathrm{id}_{S_x}, 0)$ such that $R_x^{\mathrm{symm}}$ maps $W_x$ to ${\mathcal{F}}^{\mathrm{reg}}$.

In order to show that $R_x^{\mathrm{symm}}$ is bijective, we begin with the formula for $\phi _x$ as derived in [15, Proposition 6.6], which will turn out to be the inverse of $R_x^{\mathrm{symm}}$. It has the form

$$\begin{aligned} \phi _x(y) = \left( P(x,x)^{-1}\, A_{xy}\, P(x,x)^{-1} \right)^{-\frac{1}{2}} \, P(x,x)^{-1}\, P(x,y) \, \Psi (y) \;\in \; \mathrm{L}({\mathcal{H}}, S_x) , \end{aligned}$$

(3.10)

where P(x, y) (the kernel of the fermionic projector) and $A_{xy}$ (the closed chain) are defined by

$$\begin{aligned} P(x,y) := \pi _x y|_{S_y} \, :\, S_y \rightarrow S_x ,\qquad A_{xy} := P(x,y)\, P(y,x) \, :\, S_x \rightarrow S_x . \end{aligned}$$

(3.11)

Our task is to show that for a sufficiently small open neighborhood $\Omega _x$ of x, this formula defines a continuous mapping

$$\begin{aligned} \phi _x \, :\, \Omega _x \subseteq {\mathcal{F}}^{\mathrm{reg}} \rightarrow \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, S_x) , \end{aligned}$$

and that the compositions

$$\begin{aligned} \phi _x \circ R_x^{\mathrm{symm}}|_{W_x} \qquad \text{and} \qquad R_x^{\mathrm{symm}} \circ \phi _x \end{aligned}$$

(3.12)

are both the identity (showing that $\phi _x$ is indeed the inverse of $R_x^{\mathrm{symm}}$).

In preparation, we rewrite the formula (3.10) as

$$\begin{aligned} \phi _x(y) = \left(X^{-1} \,\pi _x y\pi _y x X^{-1}\right)^{-\frac{1}{2}}X^{-1} \,\pi _x y \pi _y = \left(X^{-1} \, \pi _x y |_{S_x}\right)^{-\frac{1}{2}}X^{-1}\,\pi _x y , \end{aligned}$$

(3.13)

where we again used the notation (3.5). Choosing $y=x$, the operator $X^{-1} \,\pi _x y |_{S_x}$ is the identity on $S_x$. We first choose an open neighborhood $\tilde{\Omega }_x$ of x so small that for any $y \in \tilde{\Omega }_x$,

$$\begin{aligned} \big \Vert \mathrm{id}_{S_x}-X^{-1}\pi _x y |_{S_x} \big \Vert _{\mathrm{L}({\mathcal{H}})}< \frac{1}{2} . \end{aligned}$$

(3.14)

Then, the square root as well as the inverse square root of $A=X^{-1}\pi _x y|_{S_x}$ are well defined for all $y\in \tilde{\Omega }_x$ by the respective power series,

$$\begin{aligned} A^{\frac{1}{2}}:=\sum _{n=0}^{\infty } (-1)^n \left( {\begin{array}{c}1/2\\ n\end{array}}\right) (\mathrm{id}_{S_x}-A)^n\;,\;\;\; A^{-\frac{1}{2}}:=\sum _{n=0}^{\infty } (-1)^n \left( {\begin{array}{c}-1/2\\ n\end{array}}\right) (\mathrm{id}_{S_x}-A)^n\;, \end{aligned}$$

with the generalized binomial coefficients given for $\beta \in \mathbb{R}$ and $n\in \mathbb{N}_0$ by

$$\begin{aligned} \left( {\begin{array}{c}\beta \\ n\end{array}}\right) := \left\{ \begin{array}{cl} \displaystyle \frac{1}{n!}\;\beta \cdot (\beta - 1) \cdots (\beta -n+1)\quad &{} \text{if}\,n>0 \\ 1\quad &{}\text{if}\,n=0 \end{array}\right. \end{aligned}$$

as for both power series the radius of convergence equals one. Moreover, all square roots, inverse square roots, etc., appearing in the following are well defined, as they are always applied to operators within the radius of convergence. We conclude that the mapping $\phi _x$ is well defined and continuous on $\tilde{\Omega }_x$. Now, by possibly shrinking $W_x$, we can arrange that $\Omega _x:=R_x^{\mathrm{symm}}(W_x)$ lies in $\tilde{\Omega }_x$. Note that it now suffices to show that $\phi _x|_{\Omega _x}$ is the inverse of $R_x^{\mathrm{symm}}|_{W_x}$, because then the set $\Omega _x=(\phi _x|_{\tilde{\Omega }_x})^{-1}(W_x)$ is open.
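The two power series can be tried out numerically. The following is only a minimal sketch for real $2 \times 2$ matrices in plain Python; the example matrix `A`, the truncation order `TERMS` and all helper names are illustrative assumptions and do not appear in the paper. The binomial coefficient for $n=0$ is taken to be one (the empty product).

```python
# Sketch: binomial power series for A^{1/2} and A^{-1/2} on 2x2 real matrices.
# Assumptions (not from the paper): the example matrix A, the truncation
# order TERMS, and the convention binom(beta, 0) = 1.

TERMS = 80
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def gen_binom(beta, n):
    """Generalized binomial coefficient beta*(beta-1)*...*(beta-n+1)/n!."""
    result = 1.0  # empty product, i.e. the value for n = 0
    for k in range(n):
        result *= (beta - k) / (k + 1)
    return result


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def power_series(a, beta, terms=TERMS):
    """Truncation of sum_n (-1)^n binom(beta, n) (id - a)^n."""
    d = [[IDENTITY[i][j] - a[i][j] for j in range(2)] for i in range(2)]
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = IDENTITY  # (id - a)^0
    for n in range(terms):
        coeff = (-1) ** n * gen_binom(beta, n)
        total = [[total[i][j] + coeff * power[i][j] for j in range(2)]
                 for i in range(2)]
        power = mat_mul(power, d)
    return total


def max_abs_diff(a, b):
    """Largest entrywise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


# Example with ||id - A|| < 1/2 (the Frobenius norm of id - A is about 0.32).
A = [[1.2, 0.1], [-0.2, 0.9]]
sqrt_A = power_series(A, 0.5)
inv_sqrt_A = power_series(A, -0.5)

err_square = max_abs_diff(mat_mul(sqrt_A, sqrt_A), A)
err_inverse = max_abs_diff(mat_mul(sqrt_A, inv_sqrt_A), IDENTITY)
print(err_square, err_inverse)
```

Both deviations should be at the level of rounding errors, reflecting that the series for $A^{\frac{1}{2}}$ squares to $A$ and that the two series are inverse to each other.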

In order to verify that $\phi _x$ maps into $\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)$, we restrict $\phi _x(y)$ to $S_x$,

$$\begin{aligned} \phi _x(y) \big|_I=\left( \left(X^{-1} \,\pi _x \,y \big|_{S_x}\right)^{-\frac{1}{2}}X^{-1} \,\pi _x \,y \right)\Big|_I \nonumber \\= \left(X^{-1} \,\pi _x \,y \big|_{S_x}\right)^{-1/2} X^{-1} \,\pi _x \,y|_{S_x} =\left(X^{-1} \,\pi _x \,y \,\pi _x \big|_{S_x}\right)^{\frac{1}{2}} . \end{aligned}$$

(3.15)

A direct computation using (3.6) shows that the operator $X^{-1}\pi _x y \pi _x|_{S_x}$ and hence also its square root are symmetric on $S_x$.

It remains to compute the compositions in (3.12). First,

$$\begin{aligned} \phi _x \circ R_x^{\mathrm{symm}}(\psi )&= \phi _x(\psi ^{\dagger }X\psi ) =(X^{-1}\underbrace{\pi _x \psi ^{\dagger }X}_{=\psi _I^{\dagger }X} \underbrace{\psi |_{S_x}}_{\psi _I})^{-\frac{1}{2}}X^{-1} \underbrace{\pi _x \psi ^{\dagger }X}_{=\psi _I^{\dagger }X}\psi \\&= \left( \underbrace{ X^{-1}\psi _I^{\dagger }X}_{=\psi _I} \psi _I\right)^{-\frac{1}{2}}\underbrace{ X^{-1}\psi _I^{\dagger }X}_{=\psi _I} \psi = \left(\psi _I^2 \right)^{-\frac{1}{2}}\,\psi _I\,\psi =\psi , \end{aligned}$$

where in the last line, we applied (3.6) and used that $\psi _I$ is symmetric on $S_x$. Moreover,

$$\begin{aligned} R_x^{\mathrm{symm}} \circ \phi _x(y)&= \phi _x(y)^{\dagger } X \phi _x(y) \\&= y\,\pi _x\, X^{-1}\,\left(\pi _x y \pi _x\,X^{-1}\, \right)^{-\frac{1}{2}} \,X\, \left(X^{-1}\,\pi _x y \pi _x|_{S_x} \right)^{-\frac{1}{2}} X^{-1}\,\pi _x \,y . \end{aligned}$$

Since the spectral calculus is invariant under similarity transformations, we know that for any invertible operator B on $S_x$,

$$\begin{aligned} X^{-1} B^{-\frac{1}{2}} X = \left( X^{-1} B X \right)^{-\frac{1}{2}} . \end{aligned}$$

Hence,

$$\begin{aligned} R_x^{\mathrm{symm}} \circ \phi _x(y)&= y\,\pi _x\, \left(X^{-1} \,\pi _x y \pi _x|_{S_x}\right)^{-\frac{1}{2}}\,\left(X^{-1} \,\pi _x y \pi _x|_{S_x} \right)^{-\frac{1}{2}} X^{-1}\,\pi _x \,y \\&= y\,\pi _x\, \left(X^{-1}\,\pi _x y \pi _x|_{S_x}\right)^{-1}\,X^{-1}\,\pi _x \,y \\&= y\,\pi _x\, \left( \pi _x \,y \pi _x |_{S_x} \right)^{-1}\, \pi _x \,y = y\,x \left( \pi _x \,y x|_{S_x} \right)^{-1}\, \pi _x \,y \\&= y\,P(y,x)\,\left( P(x,y)\, P(y,x) \right)^{-1} \, P(x,y) = y \end{aligned}$$

(note that $P(x,y) : S_y \rightarrow S_x$ is invertible in view of (3.14)). This concludes the proof. $\square $
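The similarity invariance used in the last step of the proof can be illustrated numerically. The sketch below works with real $2 \times 2$ matrices and computes the inverse square root by the binomial series around the identity; the matrices `B` and `X` (with `X` chosen indefinite) and the helper names are hypothetical examples, not objects from the paper.

```python
# Sketch: check X^{-1} B^{-1/2} X = (X^{-1} B X)^{-1/2} for 2x2 real matrices.
# Assumptions (not from the paper): the matrices B and X below and the
# truncation order; B^{-1/2} is computed by the binomial series around id.


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_inv(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def inv_sqrt(a, terms=80):
    """Truncation of sum_n (-1)^n binom(-1/2, n) (id - a)^n."""
    d = [[(1.0 if i == j else 0.0) - a[i][j] for j in range(2)]
         for i in range(2)]
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    coeff = 1.0  # (-1)^0 * binom(-1/2, 0)
    for n in range(terms):
        total = [[total[i][j] + coeff * power[i][j] for j in range(2)]
                 for i in range(2)]
        power = mat_mul(power, d)
        # (-1)^(n+1) binom(-1/2, n+1) = (-1)^n binom(-1/2, n) * (n + 1/2)/(n + 1)
        coeff *= (n + 0.5) / (n + 1)
    return total


def max_abs_diff(a, b):
    """Largest entrywise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


B = [[1.1, 0.05], [0.0, 0.95]]   # close to the identity
X = [[2.0, 0.0], [0.0, -1.0]]    # invertible and indefinite
X_inv = mat_inv(X)

lhs = mat_mul(X_inv, mat_mul(inv_sqrt(B), X))
conjugated = mat_mul(X_inv, mat_mul(B, X))
rhs = inv_sqrt(conjugated)

err_similarity = max_abs_diff(lhs, rhs)
# rhs should indeed be an inverse square root of X^{-1} B X:
err_root = max_abs_diff(mat_mul(mat_mul(rhs, rhs), conjugated),
                        [[1.0, 0.0], [0.0, 1.0]])
print(err_similarity, err_root)
```

The identity holds term by term in the power series, because conjugation with $X$ commutes with taking powers; this is why the truncated sums already agree up to rounding errors.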

The mapping $\phi _x$, which already appeared in the proof of the previous theorem, can also be introduced abstractly to define the chart.

Definition 3.3

Setting

$$\begin{aligned} \phi _x := R_x^{\mathrm{symm}} \big|_{W_x}^{-1} \, :\, \Omega _x \rightarrow \mathrm{Symm}(S_x) \oplus \mathrm{L}(J,S_x) , \end{aligned}$$

we obtain a chart $(\phi _x, \Omega _x)$, referred to as the symmetric wave chart about the point $x \in {\mathcal{F}}^{\mathrm{reg}}$.

We remark that more general charts can be obtained by restricting $R_x$ to another subspace of $\mathrm{L}(I,S_x) \oplus \mathrm{L}(J,S_x)$, i.e., in generalization of (3.8),

$$\begin{aligned} R_x^E := R_x|_{E \oplus \mathrm{L}(J,S_x)} \, :\, E \oplus \mathrm{L}(J,S_x) \rightarrow {\mathcal{F}},\qquad R_x^E(\psi ) = -\psi ^* \psi , \end{aligned}$$

where E is a subspace of $\mathrm{L}(S_x)$ which has the same dimension as $\mathrm{Symm}(S_x)$. The resulting charts $\phi ^E_x$ are obtained by composition with a unitary operator $U_x$ on $S_x$, i.e.,

$$\begin{aligned} \phi ^E_x = U_x \circ \phi _x \qquad \text{with} \qquad U_x \in \mathrm{U}(S_x) \end{aligned}$$

(for details and the connection to local gauge transformations, see [15, Section 6.1]). Since linear transformations are irrelevant for the question of differentiability, in what follows, we may restrict attention to symmetric wave charts.

3.2 A Fréchet smooth atlas

The goal of this section is to prove that the symmetric wave charts $(\phi _x, \Omega _x)$ form a smooth atlas of ${\mathcal{F}}^{\mathrm{reg}}$.

Theorem 3.4

(Symmetric wave atlas) The collection of all symmetric wave charts on ${\mathcal{F}}^{\mathrm{reg}}$ defines a Fréchet-smooth atlas of ${\mathcal{F}}^{\mathrm{reg}}$, endowing ${\mathcal{F}}^{\mathrm{reg}}$ with the structure of a smooth Banach manifold (see Definition 2.5).

Proof

We first verify that for any $x\in {\mathcal{F}}^{\mathrm{reg}}$, the vector space $\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)$ together with the operator norm of $\mathrm{L}({\mathcal{H}},I)=\mathrm{L}({\mathcal{H}},S_x)$ is a Banach space. To this end, we note that this vector space coincides with the kernel of the mapping $\psi \mapsto (X^{-1}\psi ^{\dagger } \pi _x X - \psi |_I)$ on $\mathrm{L}({\mathcal{H}}, I)$. Since this mapping is continuous on $\mathrm{L}({\mathcal{H}},I)$ (as one verifies by an estimate similar to (3.9)), its kernel is closed. As a consequence, the vector space $\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)$ is a closed subspace of $\mathrm{L}({\mathcal{H}},I)$ and thus indeed a Banach space.

We saw in Theorem 3.2 that for any $x \in {\mathcal{F}}^{\mathrm{reg}}$, $(\phi _x, \Omega _x)$ defines a chart on ${\mathcal{F}}^{\mathrm{reg}}$. Since the $\Omega _x$ clearly cover ${\mathcal{F}}^{\mathrm{reg}}$, it remains to show that all transition mappings are Fréchet-smooth. To this end, we first note that, for any $x,y \in {\mathcal{F}}^{\mathrm{reg}}$ and $\psi \in \phi _x(\Omega _x \cap \Omega _y)$,

$$\begin{aligned} \phi _y \circ \phi _x^{-1}(\psi ) = \phi _y \left(\psi ^{\dagger } \,X\, \psi \right) = \left(Y^{-1} \,\pi _y\, \psi ^{\dagger } \,X\, \psi |_{S_y}\right)^{-\frac{1}{2}} \,Y^{-1} \,\pi _y\,\psi ^{\dagger }\, X\, \psi . \end{aligned}$$

Next, we define the mappings

$$\begin{aligned}&B_{xy}: \mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x) \rightarrow \mathrm{L}({\mathcal{H}},S_y), \quad \psi \mapsto Y^{-1} \,\pi _y \,\psi ^{\dagger } \,X\, \psi \;,\\&\tilde{B}_{xy}: \mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x) \rightarrow \mathrm{L}(S_y), \quad \psi \mapsto Y^{-1} \,\pi _y \,\psi ^{\dagger } \,X\, \psi |_{S_y}\;,\\&W: B_{\frac{1}{2}}(0)\subseteq \mathrm{L}(S_y) \rightarrow \mathrm{L}(S_y), \quad B\mapsto (\mathrm{id}_{S_y}-B)^{-\frac{1}{2}} = \sum _{n=0}^{\infty }(-1)^n \left( {\begin{array}{c}-1/2\\ n\end{array}}\right) \,B^n \end{aligned}$$

(where the radius of the ball $B_{1/2}(0)$ is taken with respect to the operator norm).

Recall that in the proof of Theorem 3.2 (more precisely (3.14)), we chose $\Omega _y$ so small that $\Vert \mathrm{id}_{S_y}- Y^{-1}\pi _yz|_{S_y} \Vert <1/2$ for any $z\in \Omega _y$. Thus, since for any $\psi \in \phi _x(\Omega _x \cap \Omega _y)$ we have $\psi ^{\dagger }X\psi =\phi ^{-1}_x(\psi )\in \Omega _y$, we obtain $\tilde{B}_{xy}(\phi _x(\Omega _x \cap \Omega _y)) \subseteq B_{1/2}(\mathrm{id}_{S_y})$. Therefore, we can write the transition mapping $\phi _y \circ \phi _x^{-1}$ as

$$\begin{aligned} \phi _y \circ \phi _x^{-1}(\psi ) = W\left(\mathrm{id}_{S_y} -\tilde{B}_{xy}(\psi )\right) \circ B_{xy}(\psi ). \end{aligned}$$

Note that, for the Fréchet derivative, we consider all vector spaces here as real Banach spaces, but still with the canonical operator norm induced by $\Vert .\Vert _{{\mathcal{H}}}$. In view of the chain rule for Fréchet derivatives (for details, see Lemma 6.2 in Appendix 1) and the properties of the Fréchet derivative in Lemma 6.1 in Appendix 1, it remains to show that the mappings W, $B_{xy}$ and $\tilde{B}_{xy}$ are Fréchet-smooth (note that the composition of $\mathbb{R}$-linear mappings is also Fréchet-smooth, as it defines a bounded $\mathbb{R}$-bilinear map, and that the affine map $\mathrm{L}(S_y)\ni B \mapsto \mathrm{id}_{S_y}-B\in \mathrm{L}(S_y)$ is clearly Fréchet-smooth as well). For W, this is clear due to [21, pp. 40–42] (note that $\mathrm{L}(S_y)$ is a finite-dimensional unital Banach algebra). Moreover, the mappings $B_{xy}$ and $\tilde{B}_{xy}$ are quadratic in $\psi $, i.e., they arise by inserting $\psi $ into both arguments of a bounded $\mathbb{R}$-bilinear map, and are thus Fréchet-smooth. $\square $

3.3 The tangent bundle

Having endowed ${\mathcal{F}}^{\mathrm{reg}}$ with a canonical smooth Banach manifold structure, the next step is to consider its tangent bundle. For finite-dimensional manifolds, the tangent space can be defined either by equivalence classes of curves or by derivations, and these two definitions coincide (see for example [24, Chapter 2]). In infinite dimensions, however, this is no longer the case: In general, the derivation-tangent vectors (usually called operational tangent vectors) form a larger class than the curve-tangent vectors (called kinematic tangent vectors). There might even be operational tangent vectors that depend on higher-order derivatives of the inserted function (while the kinematic tangent vectors interpreted as directional derivatives only involve the first derivatives); for details on such issues, see for example [22, Sections 28 and 29] or [2, pp. 3–6]. It turns out that for the applications we have in mind, it is preferable to define tangent vectors as equivalence classes of curves. Indeed, as we shall see, with this definition, the usual computation rules remain valid. More specifically, the tangent vectors of ${\mathcal{F}}^{\mathrm{reg}}$ are compatible with the Fréchet derivative, and each fiber of the corresponding tangent bundle can be identified with the underlying Banach space

$$\begin{aligned} V_x :=\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x) \end{aligned}$$

with respect to the chart $\phi _x$.

Following [22, p. 284], we begin with the abstract definition of the (kinematic) tangent bundle, which makes it easier to see the topological structure. Afterward, we will show that this notion indeed agrees with equivalence classes of curves. Given $x' \in {\mathcal{F}}^{\mathrm{reg}}$, we consider the set $\Omega _{x'} \times V_{x'} \times \{x'\}$ (endowed with the topology inherited from the direct sum of Banach spaces). We take the disjoint union

$$\begin{aligned} \bigcup \limits _{x' \in {\mathcal{F}}^{\mathrm{reg}}} \Omega _{x'} \times V_{x'} \times \{x'\} \end{aligned}$$

and introduce the equivalence relation

$$\begin{aligned} (x,\mathbf{v },x') \sim (y,\mathbf{w },y') \qquad \Longleftrightarrow \qquad x=y \quad \mathrm{and} \quad (\phi _{x'}\circ \phi _{y'}^{-1})'|_{\phi _{y'}(x)}\mathbf{w }= \mathbf{v }. \end{aligned}$$

For clarity, we point out that the first entry represents the point of the Banach manifold ${\mathcal{F}}^{\mathrm{reg}}$, whereas the third entry labels the chart.

Definition 3.5

We define the tangent bundle $T{\mathcal{F}}^{\mathrm{reg}}$ as the quotient space with respect to this equivalence relation,

$$\begin{aligned} T{\mathcal{F}}^{\mathrm{reg}} := \left(\bigcup \limits _{x' \in {\mathcal{F}}^{\mathrm{reg}}} \Omega _{x'} \times V_{x'} \times \{x'\} \right)\Big / \sim . \end{aligned}$$

The canonical projection is given by

$$\begin{aligned} \pi : T{\mathcal{F}}^{\mathrm{reg}} \rightarrow {\mathcal{F}}^{\mathrm{reg}}\;,\;\;\;\pi ([x,\mathbf{v },x']) = x. \end{aligned}$$

For every $x \in {\mathcal{F}}^{\mathrm{reg}}$, the tangent space at x is defined by

$$\begin{aligned} T_x{\mathcal{F}}^{\mathrm{reg}}:=\pi ^{-1}(x). \end{aligned}$$

Note that, each $T_x{\mathcal{F}}^{\mathrm{reg}}$ has a canonical vector space structure in the following sense: Since all equivalence classes in $T_x{\mathcal{F}}^{\mathrm{reg}}$ have a representative of the form $[x,\mathbf{v },x]$, this representative can be identified with $\mathbf{v }\in V_x$. In this way, we obtain an identification of $T_x{\mathcal{F}}^{\mathrm{reg}}$ with $V_x$.

The tangent bundle is again a Banach manifold, as we now explain. For any $x \in {\mathcal{F}}^{\mathrm{reg}}$, the mapping

$$\begin{aligned} (\phi _x, D\phi _x): \pi ^{-1}(\Omega _x) \rightarrow W_x \times V_x\;, \;\;\;[y,\mathbf{v },z] \mapsto \left(\phi _x(y), D\left( \phi _x \circ \phi _z^{-1} \right) \big|_{\phi _z(y)}\mathbf{v }\right) \end{aligned}$$

has the inverse

$$\begin{aligned} (\phi _x, D\phi _x)^{-1}: W_x \times V_x \rightarrow \pi ^{-1}(\Omega _x)\;,\;\;\; (\psi ,\mathbf{v }) \mapsto [\phi _x^{-1}(\psi ),\mathbf{v },x] . \end{aligned}$$

On $T{\mathcal{F}}^{\mathrm{reg}}$, we choose the coarsest topology with the property that the natural projections of these mappings to $W_x$ and $V_x$ are both continuous (where on $W_x$ and $V_x$, we choose the topology induced by the norm topology of $\mathrm{L}({\mathcal{H}})$). With this topology, the mapping $(\phi _x, D\phi _x)$ defines a chart of $T{\mathcal{F}}^{\mathrm{reg}}$. For any $(\psi ,\mathbf{v })\in (\phi _y, D\phi _y)\left(\pi ^{-1}(\Omega _x)\cap \pi ^{-1}(\Omega _y)\right)$, the transition mappings are given by

$$\begin{aligned} (\phi _x, D\phi _x) \circ (\phi _y, D\phi _y)^{-1}(\psi , \mathbf{v })&= (\phi _x, D\phi _x)([\phi _y^{-1}(\psi ), \mathbf{v },y])\\&= \left((\phi _x \circ \phi _y^{-1})(\psi ), D \left(\phi _x \circ \phi _y^{-1} \right) \big|_{\psi }\mathbf{v }\right) . \end{aligned}$$

Proposition 3.6

$T{\mathcal{F}}^{\mathrm{reg}}$ is again a Banach manifold.

Proof

We need to show that the transition maps are Fréchet-smooth; their fiberwise linearity is evident from the formula above. Smoothness is clear for the first component because the transition mappings $\phi _x \circ \phi _y^{-1}$ are Fréchet-smooth. The second component can be considered as the composition of the insertion map

$$\begin{aligned} \mathrm{L}(V_y,V_x)\times V_y \ni (A,\mathbf{v }) \mapsto A(\mathbf{v })\in V_x \end{aligned}$$

(which is obviously continuous and bilinear and thus Fréchet-smooth; for details, see Lemma 6.1 in Appendix 1) with the mapping $W_y\times V_y \ni (\psi ,\mathbf{v }) \mapsto ((\phi _x \circ \phi _y^{-1})'|_{\psi },\mathbf{v })\in \mathrm{L}(V_y,V_x)\times V_y$, which is Fréchet-smooth due to the Fréchet-smoothness of the transition mappings. $\square $

In what follows, we will sometimes use the notation

$$\begin{aligned} D\phi _x([y,\mathbf{v },z]) := D\left( \phi _x \circ \phi _z^{-1} \right) \big|_{\phi _z(y)} \,\mathbf{v }\qquad \forall x\in {\mathcal{F}}^{\mathrm{reg}},\; [y,\mathbf{v },z]\in \pi ^{-1}(\Omega _x)\;, \end{aligned}$$

which also clarifies the independence of the choice of representatives.

Lemma 3.7

For any $x \in {\mathcal{F}}^{\mathrm{reg}}$, the mapping

$$\begin{aligned} \psi _x: \Omega _x \times V_x \rightarrow \pi ^{-1}(\Omega _x)\;,\qquad (y, \mathbf{v }) \mapsto [y,\mathbf{v },x] \end{aligned}$$

is a local trivialization.

Proof

We need to verify the properties of a local trivialization. Clearly, the mapping $\pi \circ \psi _x$ is the projection to the first component, and for fixed $y \in \Omega _x$, the mapping $\mathbf{v }\mapsto \psi _x(y,\mathbf{v })=[y,\mathbf{v },x]=[y, (\phi _y \circ \phi _x^{-1})'|_{\phi _x(y)}\mathbf{v },y]$ corresponds to $\mathbf{v }\mapsto (\phi _y \circ \phi _x^{-1})'|_{\phi _x(y)}\mathbf{v }$ (by the identification of $T_y{\mathcal{F}}^{\mathrm{reg}}$ with $V_y$ from before), which is an isomorphism of vector spaces in view of Lemma 6.1 (vi). $\square $

To summarize, the Banach manifold ${\mathcal{F}}^{\mathrm{reg}}$ has similar properties as in the finite-dimensional case.

We now explain how the above definition of tangent vectors relates to the equivalence classes of curves (following [22, p. 285]):

Remark 3.8

(equivalence classes of curves) On curves $\gamma , \tilde{\gamma } \in C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}})$, we consider the equivalence relation $\gamma \sim \tilde{\gamma }$ defined by the conditions that $\gamma (0) = \tilde{\gamma }(0)$ and that in a chart $\phi _x$ with $\gamma (0) \in \Omega _x$, the relation $(\phi _x\circ \gamma )'|_0= (\phi _x\circ \tilde{\gamma })'|_0$ holds. Note that, if the last relation holds in one chart, then it also holds in any other chart $\phi _y$ with $\gamma (0)\in \Omega _y$ because, due to the chain rule,

$$\begin{aligned} (\phi _y\circ \gamma )'|_0&= (\phi _y \circ \phi _x^{-1} \circ \phi _x \circ \gamma )'|_0 = (\phi _y \circ \phi _x^{-1})'|_{\phi _x (\gamma (0))}(\phi _x\circ \gamma )'|_0\\&= (\phi _y \circ \phi _x^{-1})'|_{\phi _x (\gamma (0))}(\phi _x\circ \tilde{\gamma })'|_0 =(\phi _y \circ \phi _x^{-1} \circ \phi _x \circ \tilde{\gamma })'|_0 =(\phi _y\circ \tilde{\gamma })'|_0. \end{aligned}$$

Now we can identify $C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}}) / \sim $ with $T{\mathcal{F}}^{\mathrm{reg}}$ via the mapping

$$\begin{aligned} C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}}) / \sim&\rightarrow T{\mathcal{F}}^{\mathrm{reg}} \nonumber \\ {[}\gamma ]&\mapsto \Big [\gamma (0),\, (\phi _{\gamma (0)}\circ \gamma )'|_0 ,\, \gamma (0)\Big ]\;, \end{aligned}$$

(3.16)

which is bijective with inverse (for details, see [22, p. 285])

$$\begin{aligned} {[}x,\mathbf{v },x'] \mapsto \Big [t\mapsto \phi _{x'}^{-1}\left(\phi _{x'}(x) +t \,\xi _{\mathbf{v }}(t)\,\mathbf{v }\right) \Big ]\;, \end{aligned}$$

where $\xi _{\mathbf{v }} \in C_0^{\infty }(\mathbb{R})$ is a smooth cutoff function with $0\le \xi _{\mathbf{v }} \le 1$, $\mathrm{supp}(\xi _{\mathbf{v }})\subseteq (-\varepsilon ,\varepsilon )$ and $\xi _{\mathbf{v }}|_{(-\varepsilon /2,\varepsilon /2)}\equiv 1$, with $\varepsilon >0$ chosen so small that

$$\begin{aligned} B_{\varepsilon \Vert \mathbf{v }\Vert }\left( \phi _{x'}(x) \right) \subseteq W_{x'} . \end{aligned}$$

Note that, in (3.16), the tangent vector at $\gamma (0)$ was expressed in the specific chart $(\phi _{\gamma (0)}, \Omega _{\gamma (0)})$. However, the tangent vector can also be represented in another chart as follows. Let $x \in {\mathcal{F}}^{\mathrm{reg}}$ and $[x,\mathbf{v },z] \in T_x{\mathcal{F}}^{\mathrm{reg}}$ be arbitrary. We say that a curve $\gamma \in C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}})$ represents $[x,\mathbf{v },z]$ if in one chart $\phi _y$ with $x \in \Omega _y$ (and thus any chart, as one can show using the chain rule just as before) it holds that

$$\begin{aligned} {[}x,\mathbf{v },z] = [\gamma (0),(\phi _y \circ \gamma )'|_0, y]. \end{aligned}$$

(3.17)

In order to show independence of y, let $w\in {\mathcal{F}}^{\mathrm{reg}}$ with $x \in \Omega _w$. Then,

$$\begin{aligned} (\phi _w\circ \gamma )'|_0 = (\phi _w \circ \phi _y^{-1} \circ \phi _y \circ \gamma )'|_0 = (\phi _w \circ \phi _y^{-1})'|_{\phi _y(x)}(\phi _y\circ \gamma )'|_0\;, \end{aligned}$$

and thus,

$$\begin{aligned} {[}\gamma (0), (\phi _w\circ \gamma )'|_0, w] =[\gamma (0),(\phi _y \circ \gamma )'|_0, y] = [x,\mathbf{v },z] . \end{aligned}$$

Hence, if (3.17) holds in one chart, it also holds in any other chart around x. $\square $

Remark 3.9

(directional derivatives) Let $\gamma \in C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}})$ be a curve that represents $[x,\mathbf{v },z]$. We define the directional derivative of a Fréchet-differentiable function $f: {\mathcal{F}}^{\mathrm{reg}} \rightarrow \mathbb{R}$ at x in the direction $[x,\mathbf{v },z]$ as

$$\begin{aligned} D_{[x,\mathbf{v },z]}f|_x := \frac{\mathrm{d}}{\mathrm{d}t}(f\circ \gamma )|_{t=0}. \end{aligned}$$

This definition is independent of the choice of the curve $\gamma $. Indeed, for any chart $\phi _w$ around x, we have

$$\begin{aligned}&\frac{\mathrm{d}}{\mathrm{d}t}(f\circ \gamma )|_{t=0} = (f\circ \phi _w^{-1} \circ \phi _w \circ \phi _z^{-1} \circ \phi _z \circ \gamma )'(0) \\&\quad = D(f\circ \phi _w^{-1})|_{\phi _w(x)}\, D(\phi _w \circ \phi _z^{-1})|_{\phi _z(x)}\, (\phi _z \circ \gamma )'(0) \\&\quad =D(f\circ \phi _w^{-1})|_{\phi _w(x)}\, D(\phi _w \circ \phi _z^{-1})|_{\phi _z(x)} \,\mathbf{v } =D(f\circ \phi _w^{-1})|_{\phi _w(x)}\, D\phi _w([x,\mathbf{v },z]) . \end{aligned}$$

$\square $

We close this subsection with one last definition:

Definition 3.10

(Tangent vector fields) A tangent vector field on the Banach manifold ${\mathcal{F}}^{\mathrm{reg}}$ is, similar to the finite-dimensional case, a Fréchet-smooth map $\mathbf{v }: {\mathcal{F}}^{\mathrm{reg}} \rightarrow T{\mathcal{F}}^{\mathrm{reg}}$ such that $\mathbf{v }(x) \in T_x{\mathcal{F}}^{\mathrm{reg}}$ (i.e. $\pi (\mathbf{v }(x)) = x$) for all $x \in {\mathcal{F}}^{\mathrm{reg}}$. We denote the set of all tangent vector fields of ${\mathcal{F}}^{\mathrm{reg}}$ by $\Gamma ({\mathcal{F}}^{\mathrm{reg}},T{\mathcal{F}}^{\mathrm{reg}})$.

We note that, according to this definition, multiplying a vector field by a Fréchet-smooth real-valued function again gives a vector field. In other words, the space of all tangent vector fields forms a module over the ring of Fréchet-smooth functions from ${\mathcal{F}}^{\mathrm{reg}}$ to ${\mathbb{R}}$.

3.4 A Riemannian metric

In this section, we show that the Hilbert–Schmidt scalar product gives rise to a canonical Riemannian metric on ${\mathcal{F}}^{\mathrm{reg}}$. For the constructions, it is most convenient to recover ${\mathcal{F}}^{\mathrm{reg}}$ as a Banach submanifold of the real Hilbert space ${\mathscr{S}}({\mathcal{H}})$ of all selfadjoint Hilbert–Schmidt operators on ${\mathcal{H}}$ endowed with the scalar product (${\mathscr{S}}$ because of the second Schatten class; for details, see [7, Section XI.6])

$$\begin{aligned} \langle A, B\rangle _{{\mathscr{S}}({\mathcal{H}})} := {{\,\mathrm{tr}\,}}\left(A B \right) . \end{aligned}$$
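As a finite-dimensional illustration of this scalar product, one may replace the selfadjoint Hilbert–Schmidt operators by real symmetric matrices. The following sketch makes exactly this simplifying assumption; the matrices `A`, `B` and the function names are hypothetical examples. For symmetric matrices, the trace of the product coincides with the sum of the entrywise products, which makes symmetry and positivity evident.

```python
# Sketch: tr(AB) as a scalar product on real symmetric 3x3 matrices.
# Assumption (not from the paper): real symmetric matrices serve as a
# finite-dimensional stand-in for selfadjoint Hilbert-Schmidt operators.


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def hs_product(a, b):
    """The pairing <a, b> = tr(a b)."""
    return trace(mat_mul(a, b))


A = [[2.0, 1.0, 0.0],
     [1.0, -1.0, 0.5],
     [0.0, 0.5, 3.0]]
B = [[0.0, -2.0, 1.0],
     [-2.0, 1.0, 0.0],
     [1.0, 0.0, 4.0]]

# For symmetric matrices, tr(AB) equals the sum of entrywise products.
entrywise = sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

pairing_AB = hs_product(A, B)
pairing_BA = hs_product(B, A)
norm_sq_A = hs_product(A, A)
print(pairing_AB, pairing_BA, entrywise, norm_sq_A)
```

In this toy setting, `norm_sq_A` is the sum of the squared entries of `A`, so the pairing is positive definite on symmetric matrices.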

Theorem 3.11

${\mathcal{F}}^{\mathrm{reg}}$ is a smooth Fréchet submanifold of ${\mathscr{S}}({\mathcal{H}})$ in the following sense. Given $x \in {\mathcal{F}}^{\mathrm{reg}}$, we choose $\psi _0 \in \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I)$ with $x = -\psi _0^* \psi _0$. Then, the mapping

$$\begin{aligned} \mathscr{R} \, :\, \left( \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I) \right) \oplus {\mathscr{S}}(J)&\rightarrow {\mathscr{S}}({\mathcal{H}}) \\ (\;\;\psi \;\;,\;\; B\;\;)\,&\mapsto -\psi ^* \psi + \begin{pmatrix} 0 &{} 0 \\ 0 &{} B \end{pmatrix} \end{aligned}$$

(where the last matrix denotes a block operator on ${\mathcal{H}}=I \oplus J$) is a local Fréchet-diffeomorphism at $(\psi _0, 0)$. Its local inverse takes the form

$$\begin{aligned} \Phi := ({\mathscr{R}}|_{\hat{W}})^{-1} \, :\, {\mathscr{S}}({\mathcal{H}}) \cap \hat{\Omega }_x&\rightarrow \hat{W} \subseteq \left( \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I) \right) \oplus {\mathscr{S}}(J) \\ E&\mapsto \bigg ( \phi _x(\pi _x E), \pi _J \left( E + \left(\phi _x(\pi _x E) \right)^* \phi _x(\pi _x E) \right) \Big|_J \bigg ) , \end{aligned}$$

where $\hat{W} = W_x \oplus {\mathscr{S}}(J)$, $\hat{\Omega }_x:=\mathscr{R}(\hat{W})=\Omega _x+{\mathscr{S}}(J)$ (with $W_x$ and $\Omega _x$ as in Theorem 3.2), and $\phi _x(\pi _x E)$ is defined in analogy to (3.13) by

$$\begin{aligned} \phi _x \left(\pi _x E \right) := \left(X^{-1} \,\pi _x E |_{S_x} \right)^{-\frac{1}{2}}X^{-1}\,\pi _x E \,\in \, \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I) \end{aligned}$$

(the fact that this maps to the symmetric operators on $S_x$ is verified as in (3.15)).

Proof

A direct computation shows that ${\mathscr{R}}$ and $\Phi $ are inverses of each other: In order to compute $\mathscr{R}\circ \Phi $, we use the block operator notation

$$\begin{aligned} E= \begin{pmatrix} E_{II} &{} E_{IJ} \\ E_{JI} &{} E_{JJ} \end{pmatrix} \in {\mathscr{S}}({\mathcal{H}}) \cap \hat{\Omega }_x . \end{aligned}$$

Then, there exist operators $\tilde{E}_J, \hat{E}_J \in {\mathscr{S}}(J)$ such that $E_{JJ}=\tilde{E}_J+\hat{E}_J$, and the operator

$$\begin{aligned} \tilde{E}:= \begin{pmatrix} E_{II} &{} E_{IJ} \\ E_{JI} &{} \tilde{E}_J \end{pmatrix} \end{aligned}$$

is contained in $\Omega _x$. Note that $\pi _x E = \pi _x\tilde{E}$ and therefore $-\phi _x(\pi _x E)^*\phi _x(\pi _x E)=\tilde{E}$. We conclude that

$$\begin{aligned} \mathscr{R}\circ \Phi (E) = \begin{pmatrix} E_{II} &{} E_{IJ} \\ E_{JI} &{} \tilde{E}_J+E_{JJ}-\tilde{E}_J \end{pmatrix} =E . \end{aligned}$$

In order to compute $\Phi \circ \mathscr{R}$, we take $(\psi , B)\in \hat{W}$ arbitrary and note that, due to the definition of $\phi _x$ in (3.13) and Theorem 3.2, we have

$$\begin{aligned} \phi _x(\mathscr{R}(\psi ,B)) = \phi _x(-\pi _x\psi ^*\psi ) = \phi _x(-\psi ^*\psi )=\psi \end{aligned}$$

(note that, the first two mappings $\phi _x$ are the ones defined in this theorem, whereas the third mapping is the one from (3.13)). We thus obtain

$$\begin{aligned} \Phi \circ \mathscr{R}(\psi ,B) = \left(\psi , \, \pi _J\left(-\psi ^*\psi +\begin{pmatrix} 0 &{} 0 \\ 0 &{} B \end{pmatrix} + \psi ^*\psi \right)\big|_J \right) = (\psi , B). \end{aligned}$$

Next, the mappings ${\mathscr{R}}$ and $\Phi $ are Fréchet-smooth because for operators of finite rank (namely rank at most 2n), the operator norm is equivalent to the Hilbert–Schmidt norm. Indeed, for an operator $A \, :\, {\mathcal{H}} \rightarrow I$ mapping to a finite-dimensional Hilbert space I,

$$\begin{aligned} \Vert A\Vert ^2 \le \Vert A^{\dagger } A \Vert \le {{\,\mathrm{tr}\,}}(A^{\dagger } A) =\Vert A\Vert _{{\mathscr{S}}({\mathcal{H}}, I)}^2 \le \dim (I)\, \Vert A\Vert ^2 . \end{aligned}$$

This concludes the proof. $\square $
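The norm equivalence used in the proof can be checked on a small example. The sketch below assumes a real $2 \times 3$ matrix as a stand-in for an operator $A : {\mathcal{H}} \rightarrow I$ with $\dim (I)=2$; the matrix and the helper names are illustrative assumptions. The squared operator norm is computed as the largest eigenvalue of the $2 \times 2$ matrix $A A^{\dagger }$, using the closed formula for symmetric $2 \times 2$ matrices.

```python
# Sketch: ||A||^2 <= tr(A^T A) <= dim(I) * ||A||^2 for a real 2x3 matrix.
# Assumption (not from the paper): the example matrix A with dim(I) = 2.
import math


def gram(a):
    """Return the 2x2 matrix a a^T for a 2 x n real matrix a."""
    cols = len(a[0])
    return [[sum(a[i][k] * a[j][k] for k in range(cols)) for j in range(2)]
            for i in range(2)]


def operator_norm_sq(a):
    """Largest eigenvalue of a a^T, which equals the squared operator norm."""
    (p, q), (_, r) = gram(a)
    return (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)


def hilbert_schmidt_sq(a):
    """tr(a^T a), i.e. the sum of the squared entries."""
    return sum(entry * entry for row in a for entry in row)


A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, -1.0]]
DIM_I = 2

op_sq = operator_norm_sq(A)
hs_sq = hilbert_schmidt_sq(A)
print(op_sq, hs_sq, DIM_I * op_sq)
```

For this matrix, $A A^{\dagger }$ has the eigenvalues 6 and 1, so that the three quantities in the chain of inequalities are 6, 7 and 12.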

We consider a smooth curve

$$\begin{aligned}&\gamma \, :\, (-\delta , \delta ) \rightarrow {\mathcal{F}}^{\mathrm{reg}} \qquad \text{with} \qquad \gamma (0)=x \qquad \text{and} \\&\frac{\mathrm{d}}{\mathrm{d}\tau } \left(\phi _y \circ \gamma (\tau ) \right)\big|_{\tau =0} = \mathbf{v }\in V_y . \end{aligned}$$

The corresponding equivalence class defines a tangent vector $[x,\mathbf{v },y] \in T_x {\mathcal{F}}^{\mathrm{reg}}$. On the other hand, considering $\gamma $ as a curve in ${\mathscr{S}}$, it has the tangent vector

$$\begin{aligned} \frac{\mathrm{d} \gamma (\tau )}{\mathrm{d}\tau } \Big|_{\tau =0} \in {\mathscr{S}}. \end{aligned}$$

In the chart $\phi _x$ and setting $\psi _0 = \phi _x(x)$, the curve is parametrized by $\psi (\tau ) := \phi _x \circ \gamma (\tau )$ with

$$\begin{aligned} \gamma (\tau ) = \phi _x^{-1} \circ \psi (\tau )=-\psi (\tau )^* \psi (\tau ) \end{aligned}$$

and thus

$$\begin{aligned} \frac{\mathrm{d} \gamma (\tau )}{\mathrm{d}\tau } \Big|_{\tau =0} = D\phi _x^{-1}|_{\psi _0} \mathbf{v }= -\mathbf{v }^* \psi _0 - \psi _0^* \mathbf{v }\qquad \text{with} \qquad \mathbf{v }\in V_x . \end{aligned}$$

As $\psi _0=\phi _x(x)=\pi _x$, a direct computation (for details, see the proof of Lemma 7.6 in Appendix 2) shows that the map $V_x \ni \mathbf{v }\mapsto -\mathbf{v }^*\psi _0-\psi ^*_0\mathbf{v }= -\mathbf{v }^*\pi _x-\pi ^*_x \mathbf{v }$ is injective. This makes it possible to write the tangent space as

$$\begin{aligned} T_x {\mathcal{F}}^{\mathrm{reg}} \simeq T_x^{{\mathscr{S}}} {\mathcal{F}}^{\mathrm{reg}} := \big \{ -\psi ^* \psi _0 - \psi _0^* \psi \,\big|\, \psi \in \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I) \big \} \subseteq {\mathscr{S}}({\mathcal{H}}) . \end{aligned}$$

(3.18)
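The derivative formula for the curve $\gamma (\tau ) = -\psi (\tau )^* \psi (\tau )$ is an instance of the product rule and can be compared with a difference quotient. The sketch below uses real matrices and writes $-\psi ^*\psi $ in the form $\psi ^{\dagger } X \psi $ as in the proof of Theorem 3.2; the concrete matrices `X`, `PSI0`, `V` and the step size are hypothetical choices. The direction `V` is arbitrary, since the check does not depend on the symmetry condition on $V_x$.

```python
# Sketch: derivative of psi -> psi^T X psi versus a difference quotient.
# Assumptions (not from the paper): real matrices, X = diag(1, -1), and the
# specific choices of PSI0, V and the step size T.


def transpose(a):
    """Transpose of a matrix given as nested lists."""
    return [list(col) for col in zip(*a)]


def mat_mul(a, b):
    """Product of an (m x k) and a (k x n) matrix."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))]
            for i in range(len(a))]


X = [[1.0, 0.0], [0.0, -1.0]]
PSI0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
V = [[0.3, 0.1, 0.2], [-0.1, -0.4, 0.5]]
T = 1e-6


def R(psi):
    """The quadratic map psi -> psi^T X psi."""
    return mat_mul(transpose(psi), mat_mul(X, psi))


def derivative_formula(psi0, v):
    """Product rule: v^T X psi0 + psi0^T X v."""
    return mat_add(mat_mul(transpose(v), mat_mul(X, psi0)),
                   mat_mul(transpose(psi0), mat_mul(X, v)))


def difference_quotient(psi0, v, t):
    """(R(psi0 + t v) - R(psi0)) / t, computed entrywise."""
    shifted = [[psi0[i][j] + t * v[i][j] for j in range(len(v[0]))]
               for i in range(len(v))]
    r1, r0 = R(shifted), R(psi0)
    return [[(r1[i][j] - r0[i][j]) / t for j in range(len(r0))]
            for i in range(len(r0))]


exact = derivative_formula(PSI0, V)
approx = difference_quotient(PSI0, V, T)
max_error = max(abs(exact[i][j] - approx[i][j])
                for i in range(3) for j in range(3))
print(max_error)
```

The remaining deviation is of the order of the step size, because the difference quotient differs from the product-rule expression exactly by the term $\tau \, \mathbf{v }^{\dagger } X \mathbf{v }$.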

Theorem 3.12

Using the identification (3.18), the mapping

$$\begin{aligned} g_x \, :\, T_x^{{\mathscr{S}}} {\mathcal{F}}^{\mathrm{reg}} \times T_x^{{\mathscr{S}}} {\mathcal{F}}^{\mathrm{reg}} \rightarrow {\mathbb{R}}, \qquad g_x(A,B) := {{\,\mathrm{tr}\,}}(AB) \end{aligned}$$

defines a Fréchet-smooth Riemannian metric on ${\mathcal{F}}^{\mathrm{reg}}$. Moreover, the topology on ${\mathcal{F}}^{\mathrm{reg}}$ induced by the operator norm coincides with the topology induced by the Riemannian metric.

Proof

This follows immediately because $g_x$ is the restriction of the Hilbert space scalar product to the smooth Fréchet submanifold ${\mathcal{F}}^{\mathrm{reg}}$. $\square $

We finally remark that the symmetric wave charts are related to Gaussian charts (see the formulas in [15, Sections 5 and 6.2], which apply to the infinite-dimensional case as well). Detailed computations for the Riemannian metric in symmetric wave charts are given in Appendix 2.

4 Differential calculus on expedient subspaces

If all functions arising in the analysis were Fréchet-smooth, all the methods and notions from the finite-dimensional setting could be adapted in a straightforward way to the infinite-dimensional setting. However, this procedure is not sufficient for our purposes, because the Lagrangian is not Fréchet-smooth. Therefore, we need to develop a differential calculus on Banach spaces for functions which are only Hölder continuous. Clearly, in general, such functions are not even Fréchet-differentiable, but the Gâteaux derivative may exist in certain directions. The disadvantage of Gâteaux derivatives is that the differentiable directions in general do not form a vector space. As a consequence, the usual computation rules like the linearity of the derivative or the chain and product rules cease to hold. Our strategy for preserving the usual computation rules is to work on suitable linear subspaces of the star-shaped set of all Gâteaux-differentiable directions, referred to as expedient differentiable subspaces.

4.1 The expedient differentiable subspaces

In this section, E and F denote Banach spaces.

Definition 4.1

Let $U \subseteq E$ be open and $f : U \rightarrow F$ an F-valued function. Moreover, let V be a subspace of E. The function f is k times V-differentiable at $x_0 \in U$ if for every finite-dimensional subspace $H \subseteq V$, the restriction of f to the affine subspace $H+x_0$ denoted by

$$\begin{aligned} g^H : H \rightarrow F ,\qquad g^H(h) = f(x_0+h) \end{aligned}$$

is k-times continuously differentiable at $h=0$. If this condition holds, the subspace V is called k-admissible at $x_0$.

Thus, a function f is once V-differentiable at $x_0$ if for every finite-dimensional subspace $H \subseteq V$ and for every $h_0$ in a small neighborhood of the origin,

$$\begin{aligned} g^H(h) = g^H(h_0) + Dg^H|_{h_0} (h-h_0) + o(h-h_0) \qquad \text{for all}\,h \in H , \end{aligned}$$

and if $Dg^H|_{h_0}$ is continuous in the variable $h_0$ at $h_0=0$. Equivalently, choosing a basis $e_1, \ldots , e_L$ of H, this condition can be stated that all partial derivatives

$$\begin{aligned} \frac{\partial }{\partial \alpha _i} g^H \left( \alpha _1 e_1 + \cdots + \alpha _L e_L \right) \end{aligned}$$

exist and are continuous at $\alpha _1,\ldots , \alpha _L=0$. The higher differentiability of $g^H$ can be defined inductively or, equivalently, by demanding that all partial derivatives up to the order k, i.e., all the functions

$$\begin{aligned} \frac{\partial ^p}{\partial \alpha _{i_1} \cdots \partial \alpha _{i_p}} g^H(\alpha _1 e_1 + \cdots + \alpha _L e_L) \end{aligned}$$

with $i_1,\ldots , i_p \in \{1,\ldots , L\}$ and $p \le k$, exist and are continuous at $\alpha _1,\ldots , \alpha _L=0$.

An admissible subspace V is maximal if there are no admissible proper extensions $\tilde{V} \supsetneq V$. The existence of maximal admissible subspaces is guaranteed by Zorn’s lemma, but maximal subspaces are in general not unique. In order to obtain a canonical subspace, we take the intersection of all maximal admissible subspaces:

Definition 4.2

The expedient k-differentiable subspace ${\mathcal{E}}^k(f,x_0)$ of f at $x_0$ is defined as the intersection

$$\begin{aligned} {\mathcal{E}}^k(f,x_0) := \bigcap \big \{ V \,\big|\, V \subseteq E\ k\text{-admissible at}\,x_0\ \text{and maximal} \big \} . \end{aligned}$$

Since the expedient differentiable subspace is again admissible at $x_0$, we obtain a corresponding derivative as follows. Given $k\in {\mathbb{N}}$ and vectors $h_1, \ldots , h_k \in {\mathcal{E}}^k(f,x_0)$, we choose H as a finite-dimensional subspace of ${\mathcal{E}}^k(f,x_0)$ which contains these vectors. We set

$$\begin{aligned} D^{k,{\mathcal{E}}} f|_{x_0}(h_1,\ldots , h_k) := D^k g^H|_0(h_1, \ldots , h_k) \end{aligned}$$

(4.1)

(where again $g^H(h):=f(x_0+h)$).

Lemma 4.3

This procedure defines $D^{k,{\mathcal{E}}} f|_{x_0}$ canonically as a symmetric, multilinear mapping

$$\begin{aligned} D^{k,{\mathcal{E}}} f|_{x_0} \, :\, \underbrace{{\mathcal{E}}^k(f,x_0) \times \cdots \times {\mathcal{E}}^k(f,x_0)}_{k\ \mathrm{factors}} \rightarrow F . \end{aligned}$$

Proof

In order to show that $D^{k,{\mathcal{E}}} f|_{x_0}$ is well defined, let H and $\tilde{H}$ be two finite-dimensional subspaces of ${\mathcal{E}}^k(f,x_0)$ which contain the vectors $h_1, \ldots , h_k$. Then, expressing the derivatives in terms of partial derivatives, it follows that

$$\begin{aligned} D^k g^H|_0(h_1, \ldots , h_k)&= \frac{\partial ^k}{\partial \alpha _1 \cdots \partial \alpha _k} f(x_0 + \alpha _1 h_1 + \cdots + \alpha _k h_k) \Big|_{\alpha _1=\cdots =\alpha _k=0} \\&= D^k g^{\tilde{H}}|_0(h_1, \ldots , h_k) . \end{aligned}$$

This shows that the definition (4.1) does not depend on the choice of H.

The symmetry and homogeneity follow immediately from the corresponding properties of $D^k g^H$ in (4.1). In order to prove additivity, we let $h_1, \ldots , h_k \in {\mathcal{E}}^k(f,x_0)$ and $\tilde{h}_1, \ldots , \tilde{h}_k \in {\mathcal{E}}^k(f,x_0)$. We let H be the span of all these vectors and use that the corresponding operator $D^k g^H|_0$ in (4.1) applied to $h_1+\tilde{h}_1, \ldots , h_k +\tilde{h}_k$ is multilinear. $\square $

Note that the operator $D^{k,{\mathcal{E}}} f|_{x_0}$ is in general not bounded. Moreover, ${\mathcal{E}}^k(f,x_0)$ will in general be neither a closed nor a dense subspace of E.
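To illustrate why the intersection in Definition 4.2 is taken, the following minimal sketch considers the toy function $f(x,y)=\sqrt{|xy|}$ on $E={\mathbb{R}}^2$ at $x_0=0$ (an example chosen here for illustration, not taken from the text). Its one-sided difference quotients agree along the coordinate axes but not along the diagonal, so the differentiable directions are not closed under addition. In this example, the two coordinate axes are the maximal admissible subspaces, and their intersection ${\mathcal{E}}(f,0)$ is trivial.

```python
import math

def f(x, y):
    # Toy function (assumption for illustration): continuous, but not differentiable at 0.
    return math.sqrt(abs(x * y))

def one_sided_quotients(direction, t=1e-6):
    """Return the right and left difference quotients of f at the origin along `direction`."""
    dx, dy = direction
    right = (f(t * dx, t * dy) - f(0.0, 0.0)) / t
    left = (f(-t * dx, -t * dy) - f(0.0, 0.0)) / (-t)
    return right, left

# The quotients coincide along e1 and e2, but differ in sign along e1 + e2.
for name, d in [("e1", (1.0, 0.0)), ("e2", (0.0, 1.0)), ("e1+e2", (1.0, 1.0))]:
    right, left = one_sided_quotients(d)
    print(f"{name}: right={right:.3f}, left={left:.3f}")
```

Along the diagonal the right and left quotients have opposite signs, so no derivative exists in the direction $e_1+e_2$ although it exists in the directions $e_1$ and $e_2$.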

4.2 Derivatives along smooth curves

We now analyze under which assumptions directional derivatives exist. To this end, we let I be an interval and $\gamma : I \rightarrow E$ a smooth curve (here, the notions of Fréchet and Gâteaux smoothness coincide). Moreover, let $t_0 \in I$ with $x_0:=\gamma (t_0) \in U$ and $U\subseteq E$ open. Given a function $f : U \rightarrow F$, we consider the composition

$$\begin{aligned} f \circ \gamma \, :\, I \rightarrow F . \end{aligned}$$

Proposition 4.4

(chain rule) Assume that f is locally Hölder continuous at $x_0$, meaning that there is a neighborhood $V \subseteq U$ of $x_0$ as well as constants $\alpha , c>0$ such that

$$\begin{aligned} \Vert f(x) - f(x') \Vert _F \le c\, \Vert x-x'\Vert _E^{\alpha } \qquad \text{for all}\,x,x' \in V. \end{aligned}$$

(4.2)

Moreover, assume that all the derivatives of $\gamma $ at $t_0$ up to the order

$$\begin{aligned} p := \bigg \lceil \frac{1}{\alpha } \bigg \rceil \end{aligned}$$

(4.3)

(where $\lceil \cdot \rceil $ is the ceiling function) lie in the expedient differentiable subspace at $x_0$, i.e.,

$$\begin{aligned} \gamma ^{(n)}(t_0) \in {\mathcal{E}}(f,x_0) \qquad \text{for all}\,n \in \{1, \ldots , p\} . \end{aligned}$$

Then, the function $f \circ \gamma $ is differentiable at $t_0$ and

$$\begin{aligned} (f\circ \gamma )'(t_0) = D^{{\mathcal{E}}}f|_{x_0}\, \gamma '(t_0) . \end{aligned}$$

Proof

We consider the polynomial approximation of $\gamma $

$$\begin{aligned} \gamma _p(t) := \sum _{n=0}^p \frac{\gamma ^{(n)}(t_0)}{n!}\, (t-t_0)^n . \end{aligned}$$

(4.4)

By assumption, this curve lies in the affine subspace ${\mathcal{E}}(f,x_0)+x_0$. Using that the restriction of f to this subspace is continuously differentiable, it follows that

$$\begin{aligned} (f\circ \gamma _p)'(t_0) = D^{{\mathcal{E}}} f|_{x_0}\, \gamma '(t_0) . \end{aligned}$$

It remains to control the error term of the polynomial approximation. Using that f is locally Hölder continuous, we know that

$$\begin{aligned} \big \Vert (f\circ \gamma )(t) - (f\circ \gamma _p)(t) \big \Vert _F \le c\, \Vert \gamma (t) - \gamma _p(t)\Vert _E^{\alpha } . \end{aligned}$$

Using that $\gamma $ is smooth, it follows that

$$\begin{aligned} \big \Vert (f\circ \gamma )(t) - (f\circ \gamma _p)(t) \big \Vert _F \le \big \Vert o \left( (t-t_0)^p \right) \big \Vert _E^{\alpha } = o \left( (t-t_0)^{\alpha p} \right) . \end{aligned}$$

(4.5)

According to (4.3), we know that $\alpha p \ge 1$. Therefore, the error term is of the order $o(t-t_0)$, which shows that also the function $t\mapsto (f\circ \gamma )(t) - (f\circ \gamma _p)(t)$ is differentiable with vanishing derivative. This proves the desired result. $\square $
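The role of the order $p$ in (4.3) can be seen in a toy example (an assumption made here for illustration): take $E={\mathbb{R}}^2$ and $f(x,y)=\sqrt{|y|}$, which is Hölder continuous with $\alpha =1/2$, so that $p=2$, and whose expedient differentiable subspace at the origin is the x-axis. For $\gamma (t)=(t,t^3)$ both $\gamma '(0)$ and $\gamma ''(0)$ lie in the x-axis and $f\circ \gamma $ is differentiable at 0, whereas for $\gamma (t)=(t,t^2)$ the second derivative leaves the x-axis and differentiability fails. A minimal numerical sketch:

```python
import math

def f(x, y):
    # Toy Hölder function with exponent 1/2 (illustrative assumption).
    return math.sqrt(abs(y))

def good_curve(t):
    # gamma'(0) = (1, 0) and gamma''(0) = (0, 0) both lie in the x-axis.
    return (t, t ** 3)

def bad_curve(t):
    # gamma''(0) = (0, 2) does not lie in the x-axis.
    return (t, t ** 2)

def quotients(curve, t):
    """Right and left difference quotients of f composed with `curve` at t0 = 0."""
    base = f(*curve(0.0))
    right = (f(*curve(t)) - base) / t
    left = (f(*curve(-t)) - base) / (-t)
    return right, left

# For the good curve both quotients shrink with t; for the bad curve they stay at +1 and -1.
for t in (1e-2, 1e-4, 1e-6):
    print(t, quotients(good_curve, t), quotients(bad_curve, t))
```

For the first curve the quotients tend to zero, in agreement with $D^{{\mathcal{E}}}f|_{0}\, \gamma '(0) = 0$; for the second curve one has $f\circ \gamma (t)=|t|$, which is not differentiable at $t=0$.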

This result immediately generalizes to higher derivatives:

Proposition 4.5

(higher order chain rule) Let $q \in {\mathbb{N}}$ and assume that f is locally Hölder continuous at $x_0$ (see (4.2)). Moreover, assume that all the derivatives of $\gamma $ at $t_0$ up to the order

$$\begin{aligned} p := \bigg \lceil \frac{q}{\alpha } \bigg \rceil \end{aligned}$$

(4.6)

lie in the expedient differentiable subspace at $x_0$, i.e.,

$$\begin{aligned} \gamma ^{(n)}(t_0) \in {\mathcal{E}}^q(f,x_0) \qquad \text{for all}\,n \in \{1, \ldots , p\} . \end{aligned}$$

Then, the function $f \circ \gamma $ is q-times differentiable at $t_0$, and the derivative can be computed with the usual product and chain rules (formula of Faà di Bruno).

Proof

We again consider f along the polynomial approximation $\gamma _p$ (4.4) of the curve $\gamma $. By assumption, this curve lies in a finite-dimensional subspace of the affine space

$$\begin{aligned} {\mathcal{E}}^q(f,x_0)+x_0 \;\subset \; E . \end{aligned}$$

Using that the restriction of f to this subspace is q times continuously differentiable, we know that $f\circ \gamma _p$ is q times continuously differentiable at $t=t_0$, and the derivatives can be computed with the formula of Faà di Bruno,

$$\begin{aligned} (f\circ \gamma _p)^{(q)}(t_0)&= D^{q,{\mathcal{E}}} f|_{x_0} \left(\gamma '(t_0), \ldots , \gamma '(t_0) \right) \\&\quad + \frac{q(q-1)}{2}\, D^{q-1,{\mathcal{E}}} f|_{x_0} \left(\gamma ''(t_0), \gamma '(t_0), \ldots , \gamma '(t_0) \right) + \cdots . \end{aligned}$$

Using (4.5) and (4.6), we conclude that

$$\begin{aligned} (f\circ \gamma )(t) - (f\circ \gamma _p)(t) = o \left( (t-t_0)^q \right) . \end{aligned}$$

It follows that also this function is q-times differentiable and that all its derivatives vanish. This concludes the proof. $\square $
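For orientation, the first nontrivial cases of the formula of Faà di Bruno used in the above proof read as follows (written in the notation of (4.1), with all derivatives of $\gamma $ evaluated at $t_0$). These are the standard second- and third-order chain rules, stated here only as a worked instance:

$$\begin{aligned} (f\circ \gamma _p)''(t_0)&= D^{2,{\mathcal{E}}} f|_{x_0} \left(\gamma ', \gamma ' \right) + D^{{\mathcal{E}}} f|_{x_0}\, \gamma '' , \\ (f\circ \gamma _p)'''(t_0)&= D^{3,{\mathcal{E}}} f|_{x_0} \left(\gamma ', \gamma ', \gamma ' \right) + 3\, D^{2,{\mathcal{E}}} f|_{x_0} \left(\gamma '', \gamma ' \right) + D^{{\mathcal{E}}} f|_{x_0}\, \gamma ''' . \end{aligned}$$

In both cases the coefficient of the second term equals $q(q-1)/2$, namely 1 for $q=2$ and 3 for $q=3$.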

5 Application to causal fermion systems in infinite dimensions

5.1 Local Hölder continuity of the causal Lagrangian

The goal of this section is to prove the following result.

Theorem 5.1

The Lagrangian is locally Hölder continuous in the sense that for all $x,y_0 \in {\mathcal{F}}$ there is a neighborhood $U \subseteq {\mathcal{F}}$ of $y_0$ and a constant $c>0$ such that

$$\begin{aligned} \left| {\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) \right| \le c\, \Vert y-\tilde{y}\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,y,\tilde{y} \in U, \end{aligned}$$

(5.1)

where n is the spin dimension. Moreover, the integrand of the boundedness constraint is locally Hölder continuous in the sense that

$$\begin{aligned} \left| |x y|^2 - |x\tilde{y}|^2 \right| \le c\, \Vert y-\tilde{y}\Vert ^{\frac{1}{2n}} \qquad \text{for all}\,y,\tilde{y} \in U. \end{aligned}$$

(5.2)

We begin with a preparatory lemma.

Lemma 5.2

(Hölder continuity of roots) Let

$$\begin{aligned} {{\mathcal{P}}}(\lambda ) := \lambda ^g + c_{g-1}\, \lambda ^{g-1} + \cdots + c_0 = \prod _{i=1}^g (\lambda - \lambda _i) \end{aligned}$$

be a complex monic polynomial of degree g with roots $\lambda _1, \ldots , \lambda _g$. Then, there are constants $C, \varepsilon >0$ such that any complex monic polynomial $\tilde{{\mathcal{P}}}(\lambda ) =\lambda ^g + \tilde{c}_{g-1}\, \lambda ^{g-1} + \cdots + \tilde{c}_0$ of degree g which is close to ${{\mathcal{P}}}$ in the sense that

$$\begin{aligned} \Vert \tilde{{\mathcal{P}}} - {{\mathcal{P}}}\Vert := \max _{\ell \in \{0,\ldots ,g-1\} } \left| \tilde{c}_{\ell } - c_{\ell } \right| < \varepsilon \end{aligned}$$

can be written as $\tilde{{\mathcal{P}}}(\lambda ) = \prod _{i=1}^g (\lambda - \tilde{\lambda }_i)$ with

$$\begin{aligned} |\lambda _i - \tilde{\lambda }_i| \le C\, \Vert \tilde{{\mathcal{P}}} - {{\mathcal{P}}} \Vert ^{\frac{1}{p_i}} \qquad \text{for all}\,i=1,\ldots , g, \end{aligned}$$

where $p_i$ is the multiplicity of the root $\lambda _i$.

This lemma is proven in a more general context in [4, Theorem 2]. For self-consistency, we here give a simple proof based on Rouché’s theorem:

Proof of Lemma 5.2

After the rescaling $\lambda \rightarrow \nu \lambda $ and $\lambda _i \rightarrow \nu \lambda _i$ with $\nu >0$, we can assume that all the roots $\lambda _i$ are in the unit ball. Then, the polynomial $\Delta {{\mathcal{P}}} := \tilde{{\mathcal{P}}} - {{\mathcal{P}}}$ is bounded in the ball of radius two by

$$\begin{aligned} |\Delta {{\mathcal{P}}}(\lambda )| \le g\,2^g\, \Vert \Delta {{\mathcal{P}}}\Vert \qquad \text{for all}\,\lambda \ \text{with}\,|\lambda | \le 2. \end{aligned}$$

(5.3)

We denote the minimal distance of distinct roots by

$$\begin{aligned} D := \min _{\lambda _i \ne \lambda _j} |\lambda _i - \lambda _j| . \end{aligned}$$

Since there is a finite number of roots, it clearly suffices to prove the lemma for one of them. Given $i \in \{1, \ldots , g\}$, we choose

$$\begin{aligned} \delta = \bigg ( \frac{g\,2^{2g-p_i+1}}{D^{g-p_i}} \, \Vert \Delta {{\mathcal{P}}} \Vert \bigg )^{\frac{1}{p_i}} . \end{aligned}$$

(5.4)

Next, we choose $\varepsilon $ so small that $\delta <D/2$. We consider the ball $\Omega = B_{\delta }(\lambda _i)$. Then, for any $\lambda \in \partial \Omega $, the polynomial ${{\mathcal{P}}}$ satisfies the bound

$$\begin{aligned} |{{\mathcal{P}}}(\lambda ) | \ge (D/2)^{g-p_i} \,\delta ^{p_i} \ge g\,2^{g+1}\, \Vert \Delta {{\mathcal{P}}} \Vert > |\Delta {{\mathcal{P}}}(\lambda ) | , \end{aligned}$$

where we used (5.4) and (5.3). Therefore, Rouché’s theorem (see for example [27, Theorem 10.36]) implies that the polynomials ${{\mathcal{P}}}$ and $\tilde{{\mathcal{P}}}$ have the same number of roots in the ball $\Omega $. Thus, after a suitable ordering of the roots,

$$\begin{aligned} |\lambda _i - \tilde{\lambda }_i| \le \delta . \end{aligned}$$

Using (5.4) gives the result. $\square $
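The exponent $1/p_i$ in Lemma 5.2 can be observed already for $g=2$. The following sketch (a hypothetical illustration; the helper names are not from the text) perturbs the constant coefficient of $(\lambda -1)^2$, which has a double root, and of $\lambda ^2-1$, which has two simple roots, and compares the displacement of the roots with $\Vert \Delta {{\mathcal{P}}}\Vert ^{1/2}$ and $\Vert \Delta {{\mathcal{P}}}\Vert $, respectively.

```python
import cmath

def monic_quadratic_roots(c1, c0):
    """Roots of lambda^2 + c1*lambda + c0 via the quadratic formula."""
    disc = cmath.sqrt(c1 * c1 - 4.0 * c0)
    return ((-c1 + disc) / 2.0, (-c1 - disc) / 2.0)

def max_root_shift(c1, c0, eps):
    """Largest distance from a root of P to the nearest root of the perturbed polynomial.

    Only the constant coefficient is perturbed, so ||P_tilde - P|| = eps.
    """
    roots = monic_quadratic_roots(c1, c0)
    perturbed = monic_quadratic_roots(c1, c0 + eps)
    return max(min(abs(r - s) for s in perturbed) for r in roots)

for eps in (1e-4, 1e-6, 1e-8):
    double = max_root_shift(-2.0, 1.0, eps)   # P = (lambda - 1)^2, multiplicity 2
    simple = max_root_shift(0.0, -1.0, eps)   # P = lambda^2 - 1, multiplicity 1
    print(f"eps={eps:.0e}  double/eps^(1/2)={double / eps ** 0.5:.4f}  simple/eps={simple / eps:.4f}")
```

The ratio for the double root stays bounded only when the displacement is compared with $\Vert \Delta {{\mathcal{P}}}\Vert ^{1/2}$, whereas for the simple roots the displacement is proportional to $\Vert \Delta {{\mathcal{P}}}\Vert $ itself.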

Proof of Theorem 5.1

Let $x, y \in {\mathcal{F}}$. Since both operators x and y vanish on the orthogonal complement of the span of their images, $J :=\text{span}(S_x, S_y)$, it suffices to compute the eigenvalues on the finite-dimensional subspace J. Choosing an orthonormal basis of $S_x=x({\mathcal{H}})$ and extending it to an orthonormal basis of J, the matrix $x y|_J- \lambda {\mathbb {1}}_J$ has the block matrix form

$$\begin{aligned} \begin{pmatrix} x y \pi _x - \lambda {\mathbb {1}} &{} * \\ 0 &{} -\lambda {\mathbb {1}} \end{pmatrix} . \end{aligned}$$

Therefore, its characteristic polynomial is given by

$$\begin{aligned} \det \nolimits _J (x y- \lambda {\mathbb {1}}_J) = (-\lambda )^{\dim J -\dim x({\mathcal{H}})} \det \nolimits _{x({\mathcal{H}})} \left(x y \pi _x - \lambda {\mathbb {1}}_{x({\mathcal{H}})} \right) . \end{aligned}$$

This consideration shows that it suffices to analyze the operators $x y \pi _x$ and similarly $x \tilde{y} \pi _x$ on the finite-dimensional Hilbert space $x({\mathcal{H}})$. We denote the corresponding characteristic polynomials by ${{\mathcal{P}}}$ and $\tilde{{\mathcal{P}}}$, respectively. They are monic polynomials of degree $g:= \dim x({\mathcal{H}})$. The difference of these polynomials can be estimated in terms of operator norms on $\mathrm{L}({\mathcal{H}})$ as follows,

$$\begin{aligned} \Vert \tilde{{\mathcal{P}}} - {{\mathcal{P}}}\Vert \le c\left(g, \Vert x\Vert , \Vert y\Vert \right) \, \big \Vert x \tilde{y} \pi _x - x y \pi _x \big \Vert \le c'\left(g, \Vert x\Vert , \Vert y\Vert \right) \, \big \Vert \tilde{y}-y \big \Vert , \end{aligned}$$

valid for all $\tilde{y}$ with $\Vert \tilde{y}\Vert \le 2 \,\Vert y\Vert $. According to Lemma 5.2, for sufficiently small $\Vert y-\tilde{y}\Vert $, the eigenvalues of these matrices can be arranged to satisfy the inequalities

$$\begin{aligned} |\lambda _i - \tilde{\lambda }_i| \le C\, \Vert \tilde{{\mathcal{P}}} -{{\mathcal{P}}} \Vert ^{\frac{1}{p_i}} \le C'\left(x, y \right) \, \big \Vert \tilde{y}-y \big \Vert ^{\frac{1}{p_i}} . \end{aligned}$$

In order to prove (5.2), we consider the estimate

$$\begin{aligned} \left| |x y|^2 - |x \tilde{y}|^2 \right|&\le \sum _{i=1}^g \left| |\lambda _i|^2 -|\tilde{\lambda }_i|^2 \right| \nonumber \\&\le \sum _{i=1}^g |\lambda _i - \tilde{\lambda }_i| \, \left( |\lambda _i| +|\tilde{\lambda }_i| \right) \le \tilde{C}(x, y) \, \big \Vert \tilde{y}-y \big \Vert ^{\frac{1}{g}} \end{aligned}$$

(5.5)

and use that $g \le 2n$.

It remains to prove (5.1). In the case $g<2n$, a simple estimate similar to (5.5) gives the result. In the remaining case $g=2n$, using the abbreviation $\Delta \lambda _i := \tilde{\lambda }_i - \lambda _i$, we obtain

$$\begin{aligned} \left| {\mathcal{L}}(x, \tilde{y}) - {\mathcal{L}}(x,y) \right|&\le \frac{1}{g} \sum _{i,j=1}^g \left| |\tilde{\lambda }_i - \tilde{\lambda }_j |^2 -|\lambda _i - \lambda _j |^2 \right| \\&\le \frac{1}{g} \sum _{i,j=1}^g \left( 2\, |\Delta \lambda _i - \Delta \lambda _j |\,|\lambda _i - \lambda _j | + |\Delta \lambda _i - \Delta \lambda _j |^2 \right) \\&\le c_2(x, y) \sum _{i,j=1}^g \left( \big \Vert \tilde{y}-y \big \Vert ^{\min \left(\frac{1}{p_i}, \frac{1}{p_j} \right) }\,|\lambda _i -\lambda _j | + \big \Vert \tilde{y}-y \big \Vert ^{\frac{2}{g}} \right) \\&\le c_3(x, y) \sum _{i,j=1}^g \left( \big \Vert \tilde{y}-y \big \Vert ^{\frac{1}{g-1}} + \big \Vert \tilde{y}-y \big \Vert ^{\frac{2}{g}} \right) , \end{aligned}$$

where in the last step we used that whenever $\lambda _i \ne \lambda _j$, the multiplicities of both roots are at most $g-1$. The inequality

$$\begin{aligned} \frac{2}{g} = \frac{1}{n} \ge \frac{1}{2n-1} = \frac{1}{g-1} , \end{aligned}$$

yields the desired Hölder inequality with exponent $1/(2n-1)$. Finally, it is clear from the construction that the constant depends continuously on y. This concludes the proof. $\square $

In the case of spin dimension one, the Lagrangian is Lipschitz continuous, in agreement with the findings in [20]. If the spin dimension is larger, one still has Hölder continuity, but the Hölder exponent becomes smaller if the spin dimension is increased. This can be understood from the fact that the higher the spin dimension is, the higher the degeneracies of the eigenvalues of xy can be.

We next prove a global Hölder continuity result.

Theorem 5.3

(Global Hölder continuity) There is a constant c(n) which depends only on the spin dimension such that for all $x,y \in {\mathcal{F}}$ with $y \ne 0$, there is a neighborhood $U\subseteq {\mathcal{F}}$ of y with

$$\begin{aligned} | {\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) | \le c(n) \,\Vert y\Vert ^{2-\frac{1}{2n-1}} \, \Vert x\Vert ^2\, \Vert \tilde{y}-y\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,\tilde{y}\in U. \end{aligned}$$

(5.6)

Proof

Without loss of generality, we can assume that $x \ne 0$. Moreover, using that both sides of the inequality (5.6) have the same scaling behavior under the rescaling

$$\begin{aligned} x \rightarrow \frac{x}{\Vert x\Vert }\;, \quad y \rightarrow \frac{y}{\Vert y\Vert } \;, \quad \tilde{y} \rightarrow \frac{\tilde{y}}{\Vert y\Vert }\;, \end{aligned}$$

it suffices to consider the case that $\Vert x\Vert =\Vert y\Vert =1$.

Next, choosing a fixed 4n-dimensional subspace $I \subseteq {\mathcal{H}}$, we can always find a unitary transformation $U: {\mathcal{H}}\rightarrow {\mathcal{H}}$ such that $UxU^{-1}({\mathcal{H}}), UyU^{-1}({\mathcal{H}}) \subseteq I$. Since the Lagrangian and the operator norms are invariant under such joint unitary transformations (as they leave the eigenvalues of xy invariant), we can assume that both x and y map into the fixed finite-dimensional subspace I.

After these transformations, the operators x and y can be considered as operators in $\mathrm{L}(I)$. Therefore, they lie in the compact set $\overline{B_1(0)} \subseteq \mathrm{L}(I)$. Since the Hölder constant for the local Hölder continuity depends continuously on x and y, a compactness argument shows that we can choose the Hölder constant uniformly in x and y: As the previous arguments show, the local Hölder constant can be written as a continuous function $c: \mathrm{L}(I)\times \mathrm{L}(I) \rightarrow {\mathbb{R}}^+,\, (x,y) \mapsto c(x,y)$. Since $\overline{B_1(0)} \times \overline{B_1(0)} \subseteq \mathrm{L}(I)\times \mathrm{L}(I)$ is compact, the local Hölder constant function c is bounded on this set by a constant $c_{\mathrm{max}}>0$, which can then be taken as the desired global Hölder constant. $\square $
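The scaling step at the beginning of the proof can be spelled out as follows, under the assumption (consistent with the bound in Remark 5.4 (3)) that the Lagrangian is homogeneous of degree two in each argument, ${\mathcal{L}}(sx,ty)=s^2 t^2\, {\mathcal{L}}(x,y)$ for $s,t>0$. Replacing x by sx and $y, \tilde{y}$ by $ty, t\tilde{y}$, the two sides of (5.6) transform as

$$\begin{aligned} | {\mathcal{L}}(sx,ty) - {\mathcal{L}}(sx,t\tilde{y}) |&= s^2 t^2\, | {\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) | , \\ \Vert ty\Vert ^{2-\frac{1}{2n-1}} \, \Vert sx\Vert ^2\, \Vert t\tilde{y}-ty\Vert ^{\frac{1}{2n-1}}&= s^2 t^2\, \Vert y\Vert ^{2-\frac{1}{2n-1}} \, \Vert x\Vert ^2\, \Vert \tilde{y}-y\Vert ^{\frac{1}{2n-1}} , \end{aligned}$$

so that the inequality for normalized operators implies the general case with the same constant c(n).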

Remark 5.4

(1)
Since the Lagrangian is symmetric, Theorem 5.3 also gives rise to global Hölder continuity with respect to the other argument. Thus, for all $x,y \in {\mathcal{F}}$ with $x \ne 0$, there is a neighborhood $U \subseteq {\mathcal{F}}$ of x such that
$$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(\tilde{x},y)| \le c(n) \Vert x\Vert ^{2-\frac{1}{2n-1}} \Vert y\Vert ^2 \Vert \tilde{x}-x\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,\tilde{x}\in U. \end{aligned}$$
(5.7)
(2)
As explained in the proof of Theorem 5.1, the Lagrangian ${\mathcal{L}}(x,y)$ depends only on the nonzero eigenvalues of xy and these coincide with the eigenvalues of $xy\pi _x$. Thus, denoting
$$\begin{aligned} J:=\mathrm{span}(S_x,S_{\tilde{x}}) , \end{aligned}$$
we immediately obtain the following strengthened version of (5.7): Every $x\ne 0$ has a neighborhood $U \subset {\mathcal{F}}$ such that the inequality
$$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(\tilde{x},y)|&= |{\mathcal{L}}(x,\pi _J \,y\,\pi _J) -{\mathcal{L}}(\tilde{x},\pi _J \,y\, \pi _J)| \nonumber \\&\le c(n)\, \Vert x\Vert ^{2-\frac{1}{2n-1}} \,\Vert \pi _J \,y\, \pi _J\Vert ^2 \,\Vert \tilde{x} -x\Vert ^{\frac{1}{2n-1}} , \end{aligned}$$
(5.8)
holds for all $\tilde{x} \in U$ and all $y \in {\mathcal{F}}$. This estimate will be needed for the proof of the chain rule for the integrated Lagrangian $\ell $ in Theorem 5.9.
(3)
In the case $y=0$, a direct estimate of the eigenvalues shows that one has Hölder continuity with the improved exponent two,
$$\begin{aligned} \left| {\mathcal{L}}(x,\tilde{y}) \right| \le c(n) \, \Vert x\Vert ^2\, \Vert \tilde{y}\Vert ^2 . \end{aligned}$$
This inequality can be combined with the result of Theorem 5.3 to the statement that for all x, y there is a neighborhood $U\subseteq {\mathcal{F}}$ of y with
$$\begin{aligned} | {\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) | \le c(n, y) \Vert x\Vert ^2\, \Vert \tilde{y}-y\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,\tilde{y}\in U. \end{aligned}$$
(5.9)
Likewise, (5.8) generalizes to
$$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(\tilde{x},y)| \le c(n,x) \,\Vert \pi _J \,y\, \pi _J\Vert ^2 \,\Vert \tilde{x}-x\Vert ^{\frac{1}{2n-1}} . \end{aligned}$$
(5.10)
This inequality will be used in the proof of Theorem 5.9.

$\square $

5.2 Definition of jet spaces

For the analysis of causal variational principles, the jet formalism was developed in [17]; see also [13, Section 2]. We now generalize the definition of the jet spaces to causal fermion systems in the infinite-dimensional setting. Our method is to work with the expedient subspaces, where for convenience derivatives at x are always computed in the corresponding chart $\phi _x$. For example, for analyzing the differentiability of a real-valued function f at a point $x \in {\mathcal{F}}^{\mathrm{reg}}$, we consider the composition

$$\begin{aligned} f \circ \phi _x^{-1} \, :\, \Omega _x \subseteq \mathrm{Symm}(S_x) \oplus \mathrm{L}(J,I) \rightarrow {\mathbb{R}}. \end{aligned}$$

We introduce $\Gamma ^{\mathrm{diff}}_{\rho }$ as the linear space of all vector fields for which the directional derivative of the function $\ell $ exists in the sense of expedient subspaces (see Definition 4.2),

$$\begin{aligned} \Gamma ^{\mathrm{diff}}_{\rho } =\Big \{ \mathbf{u }\in C^{\infty }(M, T{\mathcal{F}}^{\mathrm{reg}}) \;\big|\; \mathbf{u }(x) \in {\mathcal{E}}\left( \ell \circ \phi _x^{-1}, \phi _x(x) \right)\ \text{for all}\,x \in M \Big \} . \end{aligned}$$

This gives rise to the jet space

$$\begin{aligned} \mathfrak {J}^{\mathrm{diff}}_{\rho } := C^{\infty }(M, {\mathbb{R}}) \oplus \Gamma ^{\mathrm{diff}}_{\rho } \;\subseteq \; \mathfrak {J}_{\rho } . \end{aligned}$$

(5.11)

We choose a linear subspace $\mathfrak {J}^{\mathrm{test}}_{\rho } \subseteq \mathfrak {J}^{\mathrm{diff}}_{\rho }$ with the property that its scalar and vector components are both vector spaces,

$$\begin{aligned} \mathfrak {J}^{\mathrm{test}}_{\rho } = C^{\mathrm{test}}(M, {\mathbb{R}}) \oplus \Gamma ^{\mathrm{test}}_{\rho } \;\subseteq \; \mathfrak {J}^{\mathrm{diff}}_{\rho } , \end{aligned}$$

(5.12)

and the scalar component is nowhere trivial in the sense that

$$\begin{aligned} \text{for all}\,x \in M\ \text{there is}\,a \in C^{\mathrm{test}}(M, {\mathbb{R}})\ \text{with}\,a(x) \ne 0. \end{aligned}$$

It is convenient to consider a pair $\mathfrak {u}:= (a, \mathbf{u })$ consisting of a real-valued function a on M and a vector field $\mathbf{u }$ on $T{\mathcal{F}}^{\mathrm{reg}}$ along M and to denote the combination of multiplication and directional derivative by

$$\begin{aligned} \nabla _{\mathfrak {u}} \ell (x) := a(x)\, \ell (x) + \left(D_{\mathbf{u }} \ell \right)(x) . \end{aligned}$$

(5.13)
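As a finite-dimensional toy illustration of (5.13), the following sketch evaluates $a(x)\, \ell (x) + (D_{\mathbf{u }} \ell )(x)$ for a smooth function on ${\mathbb{R}}^2$, approximating the directional derivative by a central difference. The function `ell`, the scalar component `a` and the vector field `u` are hypothetical stand-ins chosen for the example; in the setting of the text, $\ell $ is defined on ${\mathcal{F}}^{\mathrm{reg}}$ and the derivative is taken in the sense of expedient subspaces.

```python
def jet_derivative(ell, a, u, x, h=1e-6):
    """Toy version of (5.13): a(x) * ell(x) + directional derivative of ell along u(x).

    The directional derivative is approximated by a central difference,
    which is only meaningful if ell is differentiable in the direction u(x).
    """
    ux = u(x)
    forward = [xi + h * ui for xi, ui in zip(x, ux)]
    backward = [xi - h * ui for xi, ui in zip(x, ux)]
    directional = (ell(forward) - ell(backward)) / (2.0 * h)
    return a(x) * ell(x) + directional

# Hypothetical example data (not from the text).
ell = lambda x: x[0] ** 2 + x[1]
a = lambda x: 2.0
u = lambda x: (1.0, 0.0)

print(jet_derivative(ell, a, u, [1.0, 3.0]))
```

At the point $(1,3)$ the exact value is $2\cdot 4 + 2 = 10$, since $\ell (1,3)=4$ and the derivative of $\ell $ in the direction $(1,0)$ equals 2.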

For the Lagrangian, being a function of two variables $x,y \in {\mathcal{F}}^{\mathrm{reg}}$, we always work in charts $\phi _x$ and $\phi _y$, giving rise to the mapping

$$\begin{aligned} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right) ={\mathcal{L}}\left(\phi _x^{-1}(.), \phi _y^{-1}(.) \right) \, :\, \Omega _x \times \Omega _y \subseteq E \rightarrow {\mathbb{R}}, \end{aligned}$$

(5.14)

where E is the Cartesian product of Banach spaces

$$\begin{aligned} E := \left( \mathrm{Symm}(S_x) \oplus \mathrm{L}(J_x,I_x) \right) \times \left( \mathrm{Symm}(S_y) \oplus \mathrm{L}(J_y,I_y) \right) \end{aligned}$$

with the norm

$$\begin{aligned} \Vert (\psi _x, \psi _y)\Vert _E := \max \left( \Vert \psi _x \Vert _{\mathrm{L}({\mathcal{H}})}, \Vert \psi _y\Vert _{\mathrm{L}({\mathcal{H}})} \right) \end{aligned}$$

(where the subscripts x and y clarify the dependence on the base points, i.e., $I_x = x({\mathcal{H}})$, $J_x = I_x^{\perp } \subseteq {\mathcal{H}}$ and similarly at y). We denote partial derivatives acting on the first and second arguments by subscripts 1 and 2, respectively. Throughout this paper, we use the following conventions for partial derivatives and jet derivatives:

${\blacktriangleright}$:: Partial and jet derivatives with an index $i \in \{ 1,2 \}$, as for example in (5.15), only act on the respective variable of the function ${\mathcal{L}}$. This implies, for example, that the derivatives commute,
$$\begin{aligned} \nabla _{1,\mathfrak {v}} \nabla _{1,\mathfrak {u}} {\mathcal{L}}(x,y) = \nabla _{1,\mathfrak {u}} \nabla _{1,\mathfrak {v}} {\mathcal{L}}(x,y) . \end{aligned}$$
${\blacktriangleright}$:: The partial or jet derivatives which do not carry an index act as partial derivatives on the corresponding argument of the Lagrangian. This implies, for example, that
$$\begin{aligned} \nabla _{\mathfrak {u}} \int _{{\mathcal{F}}} \nabla _{1,\mathfrak {v}} \, {\mathcal{L}}(x,y) \, \mathrm{d}\rho (y) =\int _{{\mathcal{F}}} \nabla _{1,\mathfrak {u}} \nabla _{1,\mathfrak {v}}\, {\mathcal{L}}(x,y) \, \mathrm{d}\rho (y) . \end{aligned}$$

Definition 5.5

For any $\ell \in {\mathbb{N}}_0 \cup \{\infty \}$, the jet space $\mathfrak {J}_{\rho }^{\ell } \subseteq \mathfrak {J}_{\rho }$ is defined as the vector space of test jets with the following properties:

(i)
The directional derivatives up to order $\ell $ exist in the sense that
$$\begin{aligned} \mathfrak {J}^{\ell }_{\rho }&\subseteq \Big \{ (b,\mathbf{v }) \in \mathfrak {J}_{\rho } \,\Big|\, \left(\mathbf{v }(x), \mathbf{v }(y) \right) \in \Gamma ^{\ell }_{\rho }(x,y) \\&\qquad \text{for all}\,y \in M\ \text{and}\,x\ \text{in an open neighborhood of}\,M \subseteq {\mathcal{F}}^{\mathrm{reg}}\Big \} , \end{aligned}$$
where
$$\begin{aligned} \Gamma ^{\ell }_{\rho }(x,y) := {\mathcal{E}}^{\ell } \left({\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right), \left(\phi _x(x), \phi _y(y) \right) \right) . \end{aligned}$$
The higher jet derivatives are defined by using (5.13) and multiplying out, keeping in mind that the partial derivatives act only on the Lagrangian, i.e.,
$$\begin{aligned}&\nabla ^{p, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))} \left( \left(\mathfrak {v}_1(x), \mathfrak {v}_1(y) \right), \ldots , \left(\mathfrak {v}_p(x), \mathfrak {v}_p(y) \right) \right) \\&\quad := D^{p, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))} \left( \left(\mathbf{v }_1(x), \mathbf{v }_1(y) \right), \ldots , \left(\mathbf{v }_p(x), \mathbf{v }_p(y) \right) \right) \\&\qquad \; + \left(b_1(x)+b_1(y) \right) \, D^{p-1, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))}\\&\qquad \qquad \times \left( \left(\mathbf{v }_2(x), \mathbf{v }_2(y) \right), \ldots , \left(\mathbf{v }_p(x), \mathbf{v }_p(y) \right) \right) \\&\qquad \; + \left(b_2(x)+b_2(y) \right)\, D^{p-1, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))} \\&\qquad \qquad \times \left( \left(\mathbf{v }_1(x), \mathbf{v }_1(y) \right), \left(\mathbf{v }_3(x), \mathbf{v }_3(y) \right), \ldots , \left(\mathbf{v }_p(x), \mathbf{v }_p(y) \right) \right) \\&\qquad \; + \cdots + \left(b_1(x)+b_1(y) \right) \cdots \left(b_p(x)+b_p(y) \right)\, {\mathcal{L}}(x,y). \end{aligned}$$
(ii)
The functions
$$\begin{aligned}&\left( \nabla _{1, \mathfrak {v}_1} + \nabla _{2, \mathfrak {v}_1} \right) \cdots \left( \nabla _{1, \mathfrak {v}_p} + \nabla _{2, \mathfrak {v}_p} \right) {\mathcal{L}}(x,y) \nonumber \\&\qquad := \nabla ^{p, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right) \big|_{(\phi _x(x), \phi _y(y))} \left( \left(\mathfrak {v}_1(x), \mathfrak {v}_1(y) \right), \ldots , \left(\mathfrak {v}_p(x), \mathfrak {v}_p(y) \right) \right) \end{aligned}$$
(5.15)
are $\rho $-integrable in the variable y, giving rise to locally bounded functions in x. More precisely, these functions are in the space
$$\begin{aligned} L^{\infty }_{\mathrm{loc}}\left( M, L^1\left(M, d\rho (y) \right); d\rho (x) \right) . \end{aligned}$$
(iii)
Integrating the expression (5.15) in y over M with respect to the measure $\rho $, the resulting function g (defined for all x in an open neighborhood of M) is continuously differentiable in the direction of every jet $\mathfrak {u}\in \mathfrak {J}^{\mathrm{test}}_{\rho }$, i.e.,
$$\begin{aligned} \Gamma ^{\mathrm{test}}_x \subseteq {\mathcal{E}}(g, x) \qquad \text{for all}\,x \in M. \end{aligned}$$

5.3 Derivatives of ${\mathcal{L}}$ and $\ell $ along smooth curves

In this section, we use the chain rule in Proposition 4.4 in order to differentiate the Lagrangian ${\mathcal{L}}$ and the function $\ell $ along smooth curves.

Theorem 5.6

Let $\gamma _1$ and $\gamma _2$ be two smooth curves in ${\mathcal{F}}^{\mathrm{reg}}$,

$$\begin{aligned} \gamma _1, \gamma _2 \in C^{\infty }((-\delta , \delta ), {\mathcal{F}}^{\mathrm{reg}}) . \end{aligned}$$

Setting $x=\gamma _1(0)$ and $y=\gamma _2(0)$, we assume that the tangent vectors up to the order $p=2n-1$ denoted by

$$\begin{aligned} \mathbf{v }_1^{(1)}&:= (\phi _x \circ \gamma _1)'(0) , \ldots , \, \mathbf{v }_1^{(p)} := (\phi _x \circ \gamma _1)^{(p)}(0) \\ \mathbf{v }_2^{(1)}&:= (\phi _y \circ \gamma _2)'(0) , \ldots , \, \mathbf{v }_2^{(p)} := (\phi _y \circ \gamma _2)^{(p)}(0) \end{aligned}$$

are in the expedient differentiable subspace of the Lagrangian, i.e.,

$$\begin{aligned} \left( \mathbf{v }^{(1)}_1, \mathbf{v }^{(1)}_2 \right), \ldots , \left( \mathbf{v }^{(p)}_1, \mathbf{v }^{(p)}_2 \right) \in \Gamma _{\rho }(x,y) . \end{aligned}$$

Then, the function ${\mathcal{L}}(\gamma _1(\tau ), \gamma _2(\tau ))$ is $\tau $-differentiable at $\tau =0$ and the chain rule holds, i.e.,

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\tau } {\mathcal{L}}\left(\gamma _1(\tau ), \gamma _2(\tau ) \right)\big|_{\tau =0}&= D^{{\mathcal{E}}} \left({\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\right) \big|_{(\phi _x(x), \phi _y(y))} \left(\mathbf{v }_1^{(1)}, \mathbf{v }_2^{(1)} \right) \\&\equiv \left( D_{1, \gamma _1'(0)} + D_{2, \gamma _2'(0)} \right) {\mathcal{L}}(x,y) . \end{aligned}$$

Proof

We again consider the Lagrangian in the charts $\phi _x$ and $\phi _y$, (5.14). In order to show that this function is locally Hölder continuous on E, we begin with the estimate

$$\begin{aligned}&\left| {\mathcal{L}}\left(\phi ^{-1}_x(\tilde{\psi }_x), \phi ^{-1}_y(\tilde{\psi }_y) \right) -{\mathcal{L}}(x,y) \right| \\&\quad \le \left| {\mathcal{L}}\left( \phi ^{-1}_x(\tilde{\psi }_x), \phi ^{-1}_y(\tilde{\psi }_y) \right) - {\mathcal{L}}\left(\phi ^{-1}_x(\tilde{\psi }_x), y \right) \right| + \left| {\mathcal{L}}\left(\phi ^{-1}_x(\tilde{\psi }_x), y \right) - {\mathcal{L}}(x,y) \right| \\&\quad \le c \left( \Vert \phi ^{-1}_x(\tilde{\psi }_x) - x\Vert ^{\alpha }_{\mathrm{L}({\mathcal{H}})} + \Vert \phi ^{-1}_y(\tilde{\psi }_y) - y\Vert ^{\alpha }_{\mathrm{L}({\mathcal{H}})} \right) . \end{aligned}$$

Noting that the function

$$\begin{aligned} \phi _x^{-1}(\tilde{\psi }_x) = -\tilde{\psi }_x^* \tilde{\psi }_x \end{aligned}$$

is bilinear and therefore Fréchet-smooth, it follows that

$$\begin{aligned}&\left| {\mathcal{L}}\left(\phi ^{-1}_x(\tilde{\psi }_x), \phi ^{-1}_y(\tilde{\psi }_y) \right) -{\mathcal{L}}(x,y) \right| \\&\quad \le c C \left( \Vert \tilde{\psi }_x - \psi _x \Vert ^{\alpha }_{\mathrm{L}({\mathcal{H}}, I)} +\Vert \tilde{\psi }_y - \psi _y\Vert ^{\alpha }_{\mathrm{L}({\mathcal{H}}, I)} \right) \le 2 c C \,\big \Vert (\tilde{\psi }_x, \tilde{\psi }_y) -(\psi _x, \psi _y) \big \Vert ^{\alpha }_E , \end{aligned}$$

where $\psi _x := \phi ^{-1}_x(x)$ and $\psi _y := \phi ^{-1}_y(y)$. This proves local Hölder continuity on E. Applying Proposition 4.4 gives the result. $\square $

We remark that using Proposition 4.5, the above method could be generalized in a straightforward manner to higher derivatives.

Definition 5.7

We call $\ell $ Hölder continuous with Hölder exponent $\alpha $ along a smooth curve $\gamma : I \rightarrow {\mathcal{F}}$ (with I an open interval) if for any $t_0 \in I$ with $x_0 = \gamma (t_0)$ there exist a subspace $E_0 \subseteq \mathrm{Symm}S_{x_0} \oplus \mathrm{L}(J_{x_0},I_{x_0})$ and $\delta >0$ such that the mapping

$$\begin{aligned} \gamma _{x_0}: (t_0-\delta , t_0+\delta ) \rightarrow E_0, \quad t \mapsto \phi _{x_0}\circ \gamma (t) - ({\mathbb {1}},0)\;, \end{aligned}$$

is well defined and locally Hölder continuous with Hölder exponent $\alpha $.

Theorem 5.8

Let $\gamma : I \rightarrow {\mathcal{F}}$ be a smooth curve and $\ell $ Hölder continuous along $\gamma $ with Hölder exponent $\alpha $. For $t_0 \in I$ with $x_0 = \gamma (t_0)$, we set

$$\begin{aligned} \ell _{x_0}: E_0 \rightarrow {\mathbb{R}},\quad \ell _{x_0}(x) =\ell \circ \phi _{x_0}^{-1} \left(x+({\mathbb {1}},0) \right) . \end{aligned}$$

If for any $t_0\in I$, the derivatives of $\gamma _{x_0}$ up to the order $p:=\lceil q/\alpha \rceil $ lie in the expedient differentiable subspace at $x_0$, i.e.,

$$\begin{aligned} (\gamma _{x_0})^{(n)}(t_0) \in {\mathcal{E}}^q\left(\ell _{x_0}, 0\right) \quad \mathrm{for\;all\;}n\in \{1, \dots , p\}\;, \end{aligned}$$

then the function $\ell \circ \gamma = \ell _{x_0} \circ \gamma _{x_0}$ is q-times differentiable at $t_0$. Moreover, the usual product and chain rules hold for $\ell _{x_0} \circ \gamma _{x_0}$.

Proof

Applying Proposition 4.5 to $\ell _{x_0}$ and $\gamma _{x_0}$ yields the claim, since the assumptions of that proposition are clearly fulfilled. $\square $

We now give a sufficient condition which ensures that $\ell $ is Hölder continuous along $\gamma $. This condition needs to be verified in the applications; see for example [25].

Theorem 5.9

Let $\gamma $ be a smooth curve in ${\mathcal{F}}$ with

$$\begin{aligned} \int _M \big \Vert P(\gamma (\tau ), y) \big \Vert ^4 \, \big \Vert Y^{-1} \big \Vert ^2 \,\mathrm{d}\rho (y) < C \qquad \text{for all}\,\tau \in (-\delta ,\delta ) , \end{aligned}$$

where P(x, y) is again the kernel of the fermionic projector (3.11) and Y is (similar to (3.5)) the invertible operator

$$\begin{aligned} Y := y|_{S_y} \, :\, S_y \rightarrow S_y . \end{aligned}$$

Then the integrated Lagrangian $\ell $ defined by (1.1) is Hölder continuous along $\gamma $ with Hölder exponent $\frac{1}{2n-1}$.

Proof

The idea of the proof is to integrate the estimate (5.10) over M. To this end, it is crucial to estimate the factor $\Vert \pi _J y \pi _J \Vert $. We let $(\tilde{\phi }_i)_{i\in \{1, \dots , m\}}$ be an orthonormal basis of J and denote the orthogonal projection on $\mathrm{span}(\tilde{\phi }_i)$ by $\pi _i$. Since on the finite-dimensional vector space L(J) all norms are equivalent, we can work with the Hilbert–Schmidt norm of $\pi _J y\pi _J$, i.e., for a suitable constant $C=C(n)$,

$$\begin{aligned} \Vert \pi _J \,y\, \pi _J \Vert ^2 = \Vert \pi _J \,y\,Y^{-1}\,y \pi _J \Vert ^2 \le \Vert \pi _J \,y\Vert ^2 \,\Vert Y^{-1}\Vert ^2 \, \Vert y \,\pi _J \Vert ^2 = \Vert \pi _J \,y\Vert ^4 \,\Vert Y^{-1}\Vert ^2 , \end{aligned}$$

where in the last step, we used that the norm of an operator is the same as the norm of its adjoint. Combining this inequality with the estimate

$$\begin{aligned} \big \Vert \pi _J \,y \,\psi \big \Vert ^2&\le \left( \big \Vert \pi _x \,y \,\psi \big \Vert +\big \Vert \pi _{\tilde{x}} \,y \,\psi \big \Vert \right)^2 \le 2 \,\big \Vert \pi _x \,y \,\psi \big \Vert ^2 + 2\,\big \Vert \pi _{\tilde{x}} \,y\, \psi \big \Vert ^2\;, \end{aligned}$$

we obtain

$$\begin{aligned} \Vert \pi _J \,y\, \pi _J \Vert ^2&\le 2\,C(n) \,\left( \big \Vert \pi _x \,y\, \psi \big \Vert ^2 + \big \Vert \pi _{\tilde{x}} \,y\, \psi \big \Vert ^2 \right)^2 \,\big \Vert Y^{-1} \big \Vert ^2 \\&\le 4\, C(n) \,\big \Vert Y^{-1} \big \Vert ^2 \,\left( \big \Vert \pi _x \,y\, \psi \big \Vert ^4 + \big \Vert \pi _{\tilde{x}} \,y\, \psi \big \Vert ^4 \right) \\&= 4\, C(n) \,\big \Vert Y^{-1} \big \Vert ^2 \, \left( \big \Vert P(x,y) \, \psi \big \Vert ^4 + \big \Vert P(\tilde{x},y) \,\psi \big \Vert ^4 \right) . \end{aligned}$$

Using this estimate when integrating (5.10) over M and noting that $\phi _{x}^{-1}$ is locally Lipschitz (since it is Fréchet-smooth) yields the claim. $\square $

Notes

1. In this reference, everything is worked out in the case of Banach spaces, but the completeness is not needed for these results.
2. As explained in Remark 7.1, the trace operator for the finite-rank operators $x\mathbf{u }x\mathbf{u }$, $x\mathbf{u }\mathbf{u }^{\dagger }x$ and $\mathbf{u }^{\dagger }x^2\mathbf{u }$ can indeed be calculated like that as they all map into $(S_x+\mathbf{u }^{\dagger }(S_x))$.
3. Where a Riemannian metric on a Banach manifold is defined just as in the finite-dimensional case but with smoothness with respect to the Fréchet derivative.

References

[1] Link to web platform on causal fermion systems: www.causal-fermion-system.com
[2] Beltiţă, D., Goliński, T., Tumpach, A.-B.: Queer Poisson brackets. J. Geom. Phys. 132, 358–362 (2018). arXiv:math-ph/1710.03057 [math.FA]
[3] Bernard, Y., Finster, F.: On the structure of minimizers of causal variational principles in the non-compact and equivariant settings. Adv. Calc. Var. 7(1), 27–57 (2014). arXiv:1205.0403 [math-ph]
[4] Brink, D.: Hölder continuity of roots of complex and $p$-adic polynomials. Comm. Algebra 38(5), 1658–1662 (2010)
[5] Coleman, R.: Calculus on Normed Vector Spaces. Universitext. Springer, New York (2012)
[6] Dieudonné, J.: Foundations of Modern Analysis, Academic Press, New York-London (1969). Enlarged and corrected printing, Pure and Applied Mathematics, Vol. 10-I
[7] Dunford, N., Schwartz, J.T.: Linear Operators. Part II: Spectral Theory. Self Adjoint Operators in Hilbert Space, with the Assistance of William G. Bade and Robert G. Bartle. Wiley, New York (1963)
[8] Finster, F.: A variational principle in discrete space–time: existence of minimizers. Calc. Var. Partial Differential Equations 29(4), 431–453 (2007). arXiv:0503069 [math-ph]
[9] Finster, F.: Causal variational principles on measure spaces. J. Reine Angew. Math. 646, 141–194 (2010). arXiv:0811.2666 [math-ph]
[10] Finster, F.: The Continuum Limit of Causal Fermion Systems, Fundamental Theories of Physics, vol. 186. Springer, Berlin (2016). arXiv:1605.04742 [math-ph]
[11] Finster, F.: Causal fermion systems: a primer for Lorentzian geometers. J. Phys. Conf. Ser. 968, 012004 (2018). arXiv:1709.04781 [math-ph]
[12] Finster, F., Jokel, M.: Causal fermion systems: an elementary introduction to physical ideas and mathematical concepts. In: Finster, F., Giulini, D., Kleiner, J., Tolksdorf, J. (eds.) Progress and Visions in Quantum Theory in View of Gravity, pp. 63–92. Birkhäuser, Basel (2020). arXiv:1908.08451 [math-ph]
[13] Finster, F., Kamran, N.: Complex structures on jet spaces and bosonic Fock space dynamics for causal variational principles. Pure Appl. Math. Q. (2021). (to appear) arXiv:1808.03177 [math-ph]
[14] Finster, F., Kamran, N., Oppio, M.: The linear dynamics of wave functions in causal fermion systems. J. Differential Equations (2021). (to appear) arXiv:2101.08673 [math-ph]
[15] Finster, F., Kindermann, S.: A gauge fixing procedure for causal fermion systems. J. Math. Phys. 61, 082301 (2020). arXiv:1908.08445 [math-ph]
[16] Finster, F., Kleiner, J.: Causal fermion systems as a candidate for a unified physical theory. J. Phys. Conf. Ser. 626, 012020 (2015). arXiv:1502.03587 [math-ph]
[17] Finster, F., Kleiner, J.: A Hamiltonian formulation of causal variational principles. Calc. Var. Partial Differential Equations 56:73(3), 33 (2017). arXiv:1612.07192 [math-ph]
[18] Finster, F., Kleiner, J., Treude, J.-H.: An Introduction to the Fermionic Projector and Causal Fermion Systems, in preparation, www.causal-fermion-system.com/intro-public.pdf
[19] Finster, F., Langer, C.: Causal variational principles in the $\sigma $-locally compact setting: existence of minimizers. Adv. Calc. Var. (2021). (to appear) arXiv:2002.04412 [math-ph]
[20] Finster, F., Schiefeneder, D.: On the support of minimizers of causal variational principles. Arch. Ration. Mech. Anal. 210, 321–364 (2013). arXiv:1012.1589 [math-ph]
[21] Hilgert, J., Neeb, K.-H.: Structure and Geometry of Lie Groups. Springer Monographs in Mathematics. Springer, New York (2012)
[22] Kriegl, A., Michor, P.W.: The Convenient Setting of Global Analysis, Mathematical Surveys and Monographs, vol. 53. American Mathematical Society, Providence (1997)
[23] Lax, P.D.: Functional Analysis, Pure and Applied Mathematics (New York). Wiley-Interscience, New York (2002)
[24] Lee, J.M.: Riemannian Manifolds: An Introduction to Curvature, Graduate Texts in Mathematics, vol. 176. Springer, New York (1997)
[25] Oppio, M.: Hölder continuity of the integrated causal Lagrangian in Minkowski space, in preparation
[26] Oppio, M.: On the mathematical foundations of causal fermion systems in Minkowski space. Ann. Henri Poincaré 22(3), 873–949 (2021). arXiv:1909.09229 [math-ph]
[27] Rudin, W.: Real and Complex Analysis, 3rd edn. McGraw-Hill Book Co., New York (1987)
[28] Werner, D.: Funktionalanalysis, 8th edn. Springer, Berlin (2018)
[29] Zeidler, E.: Nonlinear Functional Analysis and its Applications. IV, Springer-Verlag, New York, (1988), Applications to mathematical physics, Translated from the German and with a preface by Jürgen Quandt

Acknowledgements

We are grateful to Olaf Müller, Marco Oppio, Johannes Wurm and the referee for helpful discussions. M.L. acknowledges support by the Studienstiftung des deutschen Volkes.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Fakultät für Mathematik, Universität Regensburg, 93040, Regensburg, Germany
Felix Finster & Magdalena Lottner

Corresponding author

Correspondence to Felix Finster.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Properties of the Fréchet derivative

This appendix lists properties and computation rules for Fréchet derivatives which are needed for the direct computations in Appendix 2. It turns out that most differentiation rules known from the finite-dimensional case generalize to Fréchet derivatives in a straightforward way.

Lemma 6.1

(Properties of the Fréchet derivative) Let V, W, Z be real normed vector spaces. Then the following Fréchet derivative rules hold:

(i)
Let $U \subseteq V$ be open and $f: U \rightarrow W$ be Fréchet-differentiable at $x_0\in U$. Then f is continuous at $x_0$ and $Df|_{x_0}$ is well defined.
(ii)
Let $f\in \mathrm{L}(V,W)$ be linear and bounded, then it is Fréchet-smooth at any $x_0 \in V$ and $Df|_{x_0} = f$.
(iii)
A continuous bilinear map $B: V \times W \rightarrow Z$ is Fréchet-smooth at any $(v,w)\in V \times W$ and
$$\begin{aligned} DB|_{(v,w)}(h_v,h_w) = B(v,h_w) + B(h_v,w)\;,\;\;\; \forall \, (h_v,h_w)\in V \times W. \end{aligned}$$
(iv)
Chain rule: Let $U_V \subseteq V$ and $U_W \subseteq W$ be open, $f: U_V \rightarrow W$, $g: U_W \rightarrow Z$ such that $f(U_V)\subseteq U_W$. If f is Fréchet-differentiable at $x_0\in U_V$ and g is Fréchet-differentiable at $f(x_0)\in U_W$, then $g\circ f$ is also Fréchet-differentiable at $x_0$ and
$$\begin{aligned} D(g\circ f)|_{x_0} = Dg|_{f(x_0)} \circ Df|_{x_0}. \end{aligned}$$
(v)
Let $W_1,...,W_n$ be real normed vector spaces, $W:=W_1 \times W_2 \times ...\times W_n$ the product space, $U\subseteq V$ open and $f=(f_1,...,f_n): U \rightarrow W$ with $f_i: U \rightarrow W_i$ for $i=1,\dots ,n$. Then f is Fréchet-differentiable at $x_0\in U$ if and only if each $f_i$ is Fréchet-differentiable at $x_0$. Moreover, in this case, we have $Df|_{x_0} = (Df_1|_{x_0},...,Df_n|_{x_0})$.
(vi)
Let $U_V\subseteq V$ and $U_W \subseteq W$ be open and $f: U_V \rightarrow U_W$ a homeomorphism with inverse $g: U_W \rightarrow U_V$. If f is Fréchet-differentiable at $x_0\in U_V$ and g is Fréchet-differentiable in $y_0=f(x_0)\in U_W$, then $Df|_{x_0}$ is an isomorphism with inverse $Dg|_{y_0}$.

Proof

(i)
See [5, Prop. 2.2, Chapter 2.2].
(ii)
f is clearly Fréchet-differentiable with $Df|_{x_0}=f$ for any $x_0 \in V$ as $\Vert f(x+h) -f(x) -fh\Vert _W=0$ for all $x,h \in V$ (see also [6, pp. 149–150], and note that the completeness of the vector spaces is not needed for this result). Moreover, as $Df: V \rightarrow \mathrm{L}(V,W)$ is constant, all higher Fréchet derivatives of f vanish (and in particular f is Fréchet-smooth).
(iii)
B is Fréchet differentiable with the stated Fréchet derivative as
$$\begin{aligned}&\Vert B(v+h_v,w+h_w)-B(v,w)-B(v,h_w)-B(h_v,w)\Vert _Z=\Vert B(h_v,h_w)\Vert _Z \\&\le C \Vert h_v\Vert \cdot \Vert h_w\Vert \le C \left(\mathrm{max}(\Vert h_v\Vert ,\Vert h_w\Vert )\right)^2\;, \end{aligned}$$
for a fixed $C>0$ (as B is continuous and bilinear), see also [6, pp. 149–150] (again the completeness is not needed). Moreover, since the map
$$\begin{aligned} DB: V \times W&\rightarrow \mathrm{L}(V\times W, Z)\\ (v,w)&\mapsto \left( (h_v,h_w) \mapsto B(v,h_w) + B(h_v,w) \right)\;, \end{aligned}$$
is clearly bounded linear, B is Fréchet-smooth due to part (ii).
(iv)
See [5, Theorem 2.1, Chapter 2.3].
(v)
See [6, pp. 149–151] (again completeness is not needed).
(vi)
This follows immediately from the chain rule and part (ii) since
$$\begin{aligned} \mathrm{id}_V \overset{(ii)}{=} D(\mathrm{id}_V)|_{x_0} = D(g \circ f)|_{x_0} \overset{(iv)}{=} Dg|_{y_0} \circ Df|_{x_0}\;, \end{aligned}$$
and similarly $\mathrm{id}_W = Df|_{x_0} \circ Dg|_{y_0}$.

$\square $
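As a numerical illustration of part (iii), the following sketch takes the matrix product $B(v,w)=vw$ on real $3\times 3$ matrices as a hypothetical example of a continuous bilinear map. It checks that the remainder $B(v+h_v,w+h_w)-B(v,w)-DB|_{(v,w)}(h_v,h_w)$ coincides with $B(h_v,h_w)$ and therefore shrinks quadratically with the increment. The choice of map, the dimension and the random seed are assumptions made only for this illustration.

```python
import numpy as np

def B(v, w):
    """Hypothetical example of a continuous bilinear map: the matrix product."""
    return v @ w

def DB(v, w, hv, hw):
    """Frechet derivative of B at (v, w) applied to (hv, hw), as in Lemma 6.1 (iii)."""
    return B(v, hw) + B(hv, w)

def remainder_norm(v, w, hv, hw):
    """Operator norm of B(v+hv, w+hw) - B(v, w) - DB|_(v,w)(hv, hw)."""
    r = B(v + hv, w + hw) - B(v, w) - DB(v, w, hv, hw)
    return np.linalg.norm(r, 2)

rng = np.random.default_rng(0)
v, w, hv, hw = (rng.standard_normal((3, 3)) for _ in range(4))

# Scaling the increment by t scales the remainder by t**2.
for t in (1e-1, 1e-2, 1e-3):
    print(t, remainder_norm(v, w, t * hv, t * hw))
```

Each tenfold reduction of the increment should reduce the printed remainder by a factor of roughly one hundred, which is the quadratic decay used in the proof.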

Lemma 6.2

Let V, W and Z be real normed spaces, $n\in \mathbb{N}$ arbitrary and $U_V \subseteq V, U_W \subseteq W$ open subsets, $f: U_V \rightarrow W$ n-times Fréchet-differentiable (Fréchet-smooth) and $g: U_W \rightarrow Z$ n-times Fréchet-differentiable (Fréchet-smooth) such that $f(U_V)\subseteq U_W$. Then, $g\circ f$ is also n-times Fréchet-differentiable (respectively, Fréchet-smooth).

Proof

We show the result by induction over n following [6, p. 183]: The case $n=1$ follows from the chain rule. Now, let $n \ge 2$ be arbitrary and suppose that the claim holds for $n-1$. Then, the induction hypothesis yields that the mapping $x \mapsto D(g\circ f)|_{x} = Dg|_{f(x)}\circ Df|_{x}$ is $(n-1)$-times Fréchet-differentiable, because Dg, f and Df are at least $(n-1)$-times Fréchet-differentiable and the composition operator $\circ $ is even Fréchet-smooth (as it is bounded bilinear; see Lemma 6.1 (iii)). Thus, $g\circ f$ is n-times Fréchet-differentiable.

The smoothness result follows immediately from the result for n-times differentiability. $\square $

The following lemma gives a useful computation rule for higher Fréchet derivatives (see also [6, pp. 179, 181]):

Lemma 6.3

Let V, W be real normed spaces, $U\subseteq V$ open and $f: U \rightarrow W$ n-times differentiable. Then, for any $x_0 \in U$ and $v_1,\cdots , v_n \in V$,

$$\begin{aligned} D\left(D^{(n-1)}f|_{.}(v_1,\cdots ,v_{n-1})\right)\big|_{x_0}v_n =D^{(n)}f|_{x_0}(v_1,\cdots ,v_{n}). \end{aligned}$$

(6.1)

In particular, the map $U\ni x \mapsto D^{(n-1)}f|_{x}(v_1,\cdots ,v_{n-1})\in W$ is Fréchet-differentiable.

Proof

We follow the idea of the proof given in [6, pp. 179, 181] and also make use of the symmetry result in Lemma 2.3. We first fix $v_1,\dots ,v_n \in V$ and define a linear map by

$$\begin{aligned} E_{v_1,\dots ,v_{n-1}}: \mathrm{L}(V,W)_{n-1}&\rightarrow W\\ A&\mapsto (\dots ((Av_1)v_2)\dots )v_{n-1}\;, \end{aligned}$$

which simply inserts all the $v_1,\dots ,v_{n-1}$ into an $A\in \mathrm{L}(V,W)_{n-1}$. Note that $E_{v_1,\dots ,v_{n-1}}$ is clearly bounded linear and thus Fréchet-smooth. So if we use the representation of $D^{(n-1)}f$ as an element of

$$\begin{aligned} \underbrace{\mathrm{L}(V,\mathrm{L}(V,\dots \mathrm{L}(V}_{(n-1)\mathrm{ times}},W)\dots ))\;, \end{aligned}$$

the composition $E_{v_1,\dots ,v_{n-1}}\circ D^{(n-1)}f$ is also Fréchet-differentiable at any $x_0\in U$ with

$$\begin{aligned} \left( E_{v_1,\dots ,v_{n-1}}\circ D^{(n-1)}f \right)'\Big|_{x_0}v_n&= E_{v_1,\dots ,v_{n-1}} \circ D^{n}f|_{x_0}v_n =(\dots ((D^{n}f|_{x_0}v_n)v_1)\dots )v_{n-1} \\&=D^{n}f|_{x_0}(v_n, v_1,\dots ,v_{n-1}) = D^{n}f|_{x_0}(v_1,\dots ,v_{n-1}, v_n)\;, \end{aligned}$$

where in the first step, we used the chain rule and Lemma 6.1 (ii). In the second step, we used the definition of $E_{v_1,\dots ,v_{n-1}}$, whereas in the third step, we re-identified $D^{n}f|_{x_0}$ with the corresponding multilinear mapping from $V^n$ to W. Finally, in the last step, we used the symmetry of $D^{n}f|_{x_0}$. $\square $

We finally state one last computation rule:

Lemma 6.4

Let V, W and Z be normed vector spaces, $U\subseteq V$ open, $f:U\rightarrow W$ n-times Fréchet-differentiable and $A \in \mathrm{L}(W,Z)$. Then also the function $A\circ f$ is n-times Fréchet-differentiable and

$$\begin{aligned} D^n(A\circ f)|_{x_0}(v_1,\cdots ,v_n)=A\left( D^n f|_{x_0}(v_1,\cdots ,v_n) \right) \qquad \forall \, x_0\in U, v_1,\cdots , v_n \in V. \end{aligned}$$

(6.2)

Proof

The Fréchet-differentiability follows immediately from Lemma 6.2, using that A is Fréchet-smooth. We show the identity (6.2) by induction over n: The case $n=1$ follows immediately by the chain rule and Lemma 6.1 (ii). Now, let $n\ge 2$ and assume that the statement holds for $n-1$. Using the previous lemma in the first step, the induction hypothesis (i.e. (6.2) for the $(n-1)$-st derivative) in the second step, as well as the chain rule, Lemma 6.1 (ii) and the symmetry of $D^n f$ in the last step, for all $x_0\in U$ and $v_1,\cdots , v_n \in V$ we obtain

$$\begin{aligned}&D^n(A\circ f)|_{x_0}(v_1,\cdots ,v_n)\overset{(6.1)}{=}D \left( D^{(n-1)}(A\circ f)|_{.}(v_1,\cdots ,v_{n-1}) \right)\Big|_{x_0}v_n\\&\quad \overset{(IH)}{=} D\left( A\left( D^{(n-1)}f|_{.}(v_1,\cdots ,v_{n-1}) \right) \right)\Big|_{x_0}v_n = A\left( D^n f|_{x_0}(v_1,\cdots ,v_{n-1},v_n)\right) . \end{aligned}$$

$\square $

Appendix 2: The Riemannian metric in symmetric wave charts

In this appendix, we give a detailed computation of the Riemannian metric introduced in Sect. 3.4 in terms of the symmetric wave charts. In doing so, we adapt the methods in [15, Section 4] to the infinite-dimensional setting.

We begin by defining a distance function on ${\mathcal{F}}^{\mathrm{reg}}$ by

$$\begin{aligned} d: {\mathcal{F}}^{\mathrm{reg}} \times {\mathcal{F}}^{\mathrm{reg}} \rightarrow \mathbb{R}^{+}_0\;,\;\;\; (x,y) \mapsto \sqrt{\mathrm{tr}((x-y)^2)}. \end{aligned}$$

The trace operator involved here is well defined and can be expressed in any orthonormal basis $(e_i)_{i\in \mathbb{N}}$ of ${\mathcal{H}}$ by (for details, see for example [23, Section 30.2])

$$\begin{aligned} \mathrm{tr}(A) = \sum _{i=1}^{\infty } \langle e_i|Ae_i\rangle _{{\mathcal{H}}} \qquad \text{for}\ A \in \mathrm{L}({\mathcal{H}})\ \text{of finite rank}. \end{aligned}$$

(7.1)

Moreover, note that d does indeed define a distance function on ${\mathcal{F}}^{\mathrm{reg}}$, because for any two $x,y \in {\mathcal{F}}^{\mathrm{reg}}$ we have $d(x,y)=\Vert x-y\Vert _{{\mathscr{S}}({\mathcal{H}})}$, where $\Vert .\Vert _{{\mathscr{S}}({\mathcal{H}})}$ denotes the Hilbert–Schmidt norm (see for example [7, Section XI.6] or [28, pp. 321–322, 309–310]).
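The identification of d with the Hilbert–Schmidt norm can be checked numerically in a finite-dimensional stand-in. The sketch below is only an illustration under stated assumptions: the Hilbert space is replaced by $\mathbb{C}^6$, the operators are random Hermitian matrices of rank at most two, and the helper `random_symmetric_finite_rank` is a hypothetical name introduced here.

```python
import numpy as np

def random_symmetric_finite_rank(rng, dim, rank):
    """Hypothetical helper: a Hermitian dim x dim matrix of rank at most `rank`."""
    a = rng.standard_normal((dim, rank)) + 1j * rng.standard_normal((dim, rank))
    signs = np.diag(rng.choice([-1.0, 1.0], size=rank))
    return a @ signs @ a.conj().T

def d(x, y):
    """Distance sqrt(tr((x - y)^2)) for Hermitian matrices x and y."""
    diff = x - y
    return np.sqrt(np.trace(diff @ diff).real)

rng = np.random.default_rng(1)
x = random_symmetric_finite_rank(rng, dim=6, rank=2)
y = random_symmetric_finite_rank(rng, dim=6, rank=2)

# For a Hermitian difference, tr((x - y)^2) is the squared Frobenius
# (Hilbert-Schmidt) norm, so both printed numbers should agree.
print(d(x, y), np.linalg.norm(x - y, "fro"))
```

Since the Frobenius norm is a norm, the metric axioms for d follow directly from this identification.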

The following remark is a reminder of a calculation rule for the trace operator acting on operators with finite rank.

Remark 7.1

Let $A\in \mathrm{L}({\mathcal{H}})$ be of finite rank and $V \subseteq {\mathcal{H}}$ a finite-dimensional subspace containing the image of A, i.e., $A({\mathcal{H}})\subseteq V$. Moreover, let $(e_i)_{1\le i\le k}$ be an orthonormal basis of V and $(\tilde{e}_i)_{i\in \mathbb{N}}$ an orthonormal basis of $V^{\bot }$. Then we obtain an orthonormal basis $(\hat{e}_i)_{i\in \mathbb{N}}$ of ${\mathcal{H}}$ by setting $\hat{e}_i:=e_i$ for $i=1,\cdots , k$ and $\hat{e}_{k+j}:=\tilde{e}_j$ for $j\in \mathbb{N}$. Using this basis in (7.1), the trace of A reduces to:

$$\begin{aligned} \mathrm{tr}(A) = \sum _{i=1}^k \langle \hat{e}_i|A\hat{e}_i\rangle _{{\mathcal{H}}} =\sum _{i=1}^k \langle e_i|Ae_i\rangle _{{\mathcal{H}}}. \end{aligned}$$

$\square $

The next lemma is mostly based on [28, Satz VI.5.8] and states some further properties of the trace operator.

Lemma 7.2

(Properties of the trace)

(i)
Linearity: The trace operator $\mathrm{tr}$ is linear.
(ii)
Boundedness: For a finite-dimensional subspace $V\subseteq {\mathcal{H}}$, consider the corresponding subspace $V_{\mathrm{L}}:=\{A \in \mathrm{L}({\mathcal{H}})\, | \, A({\mathcal{H}}) \subseteq V \} \subseteq \mathrm{L}({\mathcal{H}})$. Then, $\mathrm{tr}|_{V_{\mathrm{L}}}$ is bounded.
(iii)
Cyclic Permutation: For $x,y \in \mathrm{L}({\mathcal{H}})$ with x of finite rank it holds that:
$$\begin{aligned} \mathrm{tr}(xy)=\mathrm{tr}(yx). \end{aligned}$$
(iv)
Trace of adjoint: For any $x \in \mathrm{L}({\mathcal{H}})$ of finite rank also $x^{\dagger }$ is of finite rank and:
$$\begin{aligned} \mathrm{tr}(x^{\dagger })=\overline{\mathrm{tr}(x)}. \end{aligned}$$

Proof

(i): Follows from the definition of tr by (7.1), see also [28, Satz VI.5.8 (a)].

(iii) and (iv): See [28, Satz VI.5.8 (c),(b)].

(ii): Let $A \in V_{\mathrm{L}}$. Then, as explained in Remark 7.1, choosing an orthonormal basis $(e_i)_{1\le i\le k}$ of V (so $\dim (V)=k$), we can estimate:

$$\begin{aligned} |\mathrm{tr}(A)| = \left|\sum _{i=1}^k\langle e_i| Ae_i\rangle _{{\mathcal{H}}}\right| \le \sum _{i=1}^k\left|\langle e_i| Ae_i\rangle _{{\mathcal{H}}}\right| \le \sum _{i=1}^k\Vert A\Vert _{\mathrm{L}({\mathcal{H}})} =k\,\Vert A\Vert _{\mathrm{L}({\mathcal{H}})}. \end{aligned}$$

This concludes the proof. $\square $
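The trace rules of Remark 7.1 and Lemma 7.2 can be tried out in a small matrix model. The following sketch is an illustration only: it assumes ${\mathcal{H}}=\mathbb{C}^8$, a randomly chosen three-dimensional subspace V, and random complex matrices; the helper `cplx` and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
dim, k = 8, 3

def cplx(shape):
    """Hypothetical helper: random complex matrix of the given shape."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# The columns of q form an orthonormal basis of a k-dimensional subspace V.
q, _ = np.linalg.qr(cplx((dim, k)))
A = q @ cplx((k, dim))      # the image of A lies in V, so A is in V_L
y = cplx((dim, dim))        # arbitrary bounded operator

tr_full = np.trace(A)
# Remark 7.1: it suffices to sum over an orthonormal basis of V.
tr_reduced = sum(np.vdot(q[:, i], A @ q[:, i]) for i in range(k))
op_norm = np.linalg.norm(A, 2)

print(abs(tr_full - tr_reduced))                           # Remark 7.1
print(abs(tr_full), k * op_norm)                           # (ii): |tr A| <= k ||A||
print(abs(np.trace(A @ y) - np.trace(y @ A)))              # (iii): cyclicity
print(abs(np.trace(A.conj().T) - np.conj(np.trace(A))))    # (iv): adjoint
```

The first, third and fourth printed values should vanish up to rounding, and in the second line the first number should not exceed the second.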

In the following lemma, we consider differentiability properties of a mapping E which corresponds to the square of the distance function d. Later, we want to use it to introduce the Riemannian metric as second Fréchet-derivative of E.

Lemma 7.3

The mappings:

$$\begin{aligned} E: {\mathcal{F}}^{\mathrm{reg}} \times {\mathcal{F}}^{\mathrm{reg}} \rightarrow \mathbb{R}\;,\;\;\; (x,y) \mapsto \mathrm{tr}((x-y)^2)\;, \end{aligned}$$

and for any fixed $x \in {\mathcal{F}}^{\mathrm{reg}}$:

$$\begin{aligned} E_x: {\mathcal{F}}^{\mathrm{reg}} \rightarrow \mathbb{R}\;,\;\;\; y \mapsto \mathrm{tr}((x-y)^2)\;, \end{aligned}$$

are Fréchet-smooth. Moreover, for all $x,y\in {\mathcal{F}}^{\mathrm{reg}}$ with $x \in \Omega _y$ and all $\mathbf{u }, \mathbf{v }\in V_y$,

$$\begin{aligned} D \left( E_x \circ \phi _y^{-1} \right)\big|_{\phi _y(x)}&=0 \qquad \mathrm{and}\nonumber \\ D^2\left( E_x \circ \phi _y^{-1} \right) \big|_{\phi _y(x)}(\mathbf{v },\mathbf{u })&=4 {{\,\mathrm{Re}\,}}\left(\mathrm{tr}(y\phi _y(x) \mathbf{v }^{\dagger }y\phi _y(x)\mathbf{u }^{\dagger }) +\mathrm{tr}(y\mathbf{u }\mathbf{v }^{\dagger }y\phi _y(x)\phi _y(x)^{\dagger })\right) \,. \end{aligned}$$

(7.2)

Proof

First we have to show that $E\circ (\phi _x^{-1}, \phi _y^{-1})$ is Fréchet-smooth for all $ x,y \in {\mathcal{F}}^{\mathrm{reg}}$.

To this end first consider the following calculation for arbitrary $\varphi \in W_x, \psi \in W_y$:

$$\begin{aligned}&E \circ (\phi _x^{-1}, \phi _y^{-1})|_{(\varphi , \psi )} =\mathrm{tr}\left((\varphi ^{\dagger }x\varphi -\psi ^{\dagger }y\psi )^2\right)\\&\quad = \mathrm{tr}(\varphi ^{\dagger }x\varphi \varphi ^{\dagger }x\varphi ) -\mathrm{tr}(\varphi ^{\dagger }x\varphi \psi ^{\dagger }y\psi ) -\mathrm{tr}(\psi ^{\dagger }y\psi \varphi ^{\dagger }x\varphi ) +\mathrm{tr}(\psi ^{\dagger }y\psi \psi ^{\dagger }y\psi )\\&\quad = \mathrm{tr}(x\varphi \varphi ^{\dagger }x\varphi \varphi ^{\dagger }) - \mathrm{tr}(x\varphi \psi ^{\dagger }y\psi \varphi ^{\dagger }) -\mathrm{tr}(y\psi \varphi ^{\dagger }x\varphi \psi ^{\dagger }) +\mathrm{tr}(y\psi \psi ^{\dagger }y\psi \psi ^{\dagger }) \;,\\&\quad =\mathrm{tr}|_{{S_x}_{\mathrm{L}}}(x\varphi \varphi ^{\dagger }x\varphi \varphi ^{\dagger }) -\mathrm{tr}|_{{S_x}_{\mathrm{L}}}(x\varphi \psi ^{\dagger }y\psi \varphi ^{\dagger }) -\mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\psi \varphi ^{\dagger }x\varphi \psi ^{\dagger }) +\mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\psi \psi ^{\dagger }y\psi \psi ^{\dagger })\;, \end{aligned}$$

where in the second step, we used the linearity of the trace and in the third step, the cyclic permutation property (which can be applied as all factors and summands obviously have finite rank). The last line is clearly a sum of compositions of Fréchet-smooth mappings in $(\varphi ,\psi )$, which proves the Fréchet-smoothness of $E\circ (\phi _x^{-1}, \phi _y^{-1})$.

For calculating the Fréchet derivative of $E_x$, consider the expansion

$$\begin{aligned} E_x \circ \phi _y^{-1}(\psi )&= \mathrm{tr}((x-\psi ^{\dagger }y\psi )^2) \\&= \mathrm{tr}(x^2) -\mathrm{tr}(x\psi ^{\dagger }y\psi ) -\mathrm{tr}(\psi ^{\dagger }y\psi x) + \mathrm{tr}(\psi ^{\dagger }y\psi \psi ^{\dagger }y\psi ) \\&= \mathrm{tr}(x^2) -2\mathrm{tr}(x\psi ^{\dagger }y\psi ) +\mathrm{tr}(y\psi \psi ^{\dagger }y\psi \psi ^{\dagger })\\&= \mathrm{tr}|_{{S_x}_{\mathrm{L}}}(x^2) -2\mathrm{tr}|_{{S_x}_{\mathrm{L}}}(x\psi ^{\dagger }y\psi ) +\mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\psi \psi ^{\dagger }y\psi \psi ^{\dagger }) , \end{aligned}$$

which is again a sum of compositions of Fréchet-smooth functions showing that also $E_x \circ \phi _y^{-1}$ is Fréchet-smooth.

Applying the computation rule from Lemma 6.4 together with the Fréchet derivative rule for bilinear functions in Lemma 6.1 (iii) (multiple times and together with the chain rule) we obtain:

$$\begin{aligned} D\left( E_x \circ \phi _y^{-1} \right) \big|_{\psi }\mathbf{v }&= -2\mathrm{tr}|_{{S_x}_{\mathrm{L}}} (x\mathbf{v }^{\dagger }y\psi ) -2\mathrm{tr}|_{{S_x}_{\mathrm{L}}}(x\psi ^{\dagger }y\mathbf{v }) +\mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\mathbf{v }\psi ^{\dagger }y\psi \psi ^{\dagger }) \\&\quad + \mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\psi \mathbf{v }^{\dagger }y\psi \psi ^{\dagger }) + \mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\psi \psi ^{\dagger }y\mathbf{v }\psi ^{\dagger }) + \mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\psi \psi ^{\dagger }y\psi \mathbf{v }^{\dagger }). \end{aligned}$$

Using Lemma 7.2 (iii) and (iv), this simplifies to

$$\begin{aligned}&D\left( E_x \circ \phi _y^{-1} \right) \big|_{\psi }\mathbf{v }\nonumber \\&\quad = -2\mathrm{tr}(x\mathbf{v }^{\dagger }y\psi )-2\mathrm{tr}(x\psi ^{\dagger }y\mathbf{v }) +2\mathrm{tr}(\psi \psi ^{\dagger }y\mathbf{v }\psi ^{\dagger }y) +2\mathrm{tr}(y\psi \mathbf{v }^{\dagger }y\psi \psi ^{\dagger }) \nonumber \\&\quad = 4\cdot {{\,\mathrm{Re}\,}}\left(\mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\psi \psi ^{\dagger }y \psi \mathbf{v }^{\dagger }) - \mathrm{tr}|_{{S_x}_{\mathrm{L}}}(x\mathbf{v }^{\dagger }y\psi ) \right). \end{aligned}$$

(7.3)

$$\begin{aligned} = 4 \cdot {{\,\mathrm{Re}\,}}\left(\mathrm{tr}(\psi ^{\dagger }y\psi \mathbf{v }^{\dagger }y\psi ) -\mathrm{tr}(x\mathbf{v }^{\dagger }y\psi )\right) \end{aligned}$$

(7.4)

In the case $\psi ^{\dagger }y\psi =\phi _y^{-1}(\psi )=x$, the terms in (7.4) cancel each other, showing that

$$\begin{aligned} D( E_x \circ \phi _y^{-1} ) \big|_{\phi _y(x)} = 0. \end{aligned}$$

Moreover, proceeding from (7.3) a straightforward computation using the properties of the Fréchet derivative and the trace operator as before gives

$$\begin{aligned}&D^2\left( E_x \circ \phi _y^{-1} \right) \big|_{\psi }(\mathbf{v },\mathbf{u }) \nonumber \\&\quad =4 \cdot {{\,\mathrm{Re}\,}}\left(\mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\mathbf{u }\psi ^{\dagger }y\psi \mathbf{v }^{\dagger }) + \mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\psi \mathbf{u }^{\dagger }y\psi \mathbf{v }^{\dagger }) +\mathrm{tr}|_{{S_y}_{\mathrm{L}}}(y\psi \psi ^{\dagger }y\mathbf{u }\mathbf{v }^{\dagger }) -\mathrm{tr}|_{{S_x}_{\mathrm{L}}}(x\mathbf{v }^{\dagger }y\mathbf{u })\right) \end{aligned}$$

(7.5)

$$\begin{aligned}&\quad = 4 \cdot {{\,\mathrm{Re}\,}}\left(\mathrm{tr}(\psi ^{\dagger }y\psi \mathbf{v }^{\dagger }y\mathbf{u }) + \mathrm{tr}(y\psi \mathbf{u }^{\dagger }y\psi \mathbf{v }^{\dagger }) +\mathrm{tr}(y\psi \psi ^{\dagger }y\mathbf{u }\mathbf{v }^{\dagger }) -\mathrm{tr}(x\mathbf{v }^{\dagger }y\mathbf{u })\right) . \end{aligned}$$

(7.6)

In the case $\psi ^{\dagger }y\psi =\phi _y^{-1}(\psi )=x$, the first and the last term in (7.6) cancel each other, and we obtain

$$\begin{aligned}&D^2 \left( E_x \circ \phi _y^{-1} \right) \big|_{\phi _y(x)}(\mathbf{v },\mathbf{u })\\&\quad = 4 \cdot {{\,\mathrm{Re}\,}}\left(\mathrm{tr}(y\phi _y(x)\mathbf{u }^{\dagger }y\phi _y(x) \mathbf{v }^{\dagger })+\mathrm{tr}(y\phi _y(x)\phi _y(x)^{\dagger }y\mathbf{u }\mathbf{v }^{\dagger })\right) , \end{aligned}$$

which concludes the proof. $\square $
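The first-derivative formula (7.4) can be compared against a finite-difference quotient in a matrix model. The sketch below is an illustration under simplifying assumptions: ${\mathcal{H}}$ is replaced by $\mathbb{C}^5$, y is a random Hermitian matrix, $\psi $ ranges over all matrices (the chart domain is ignored), and x is constructed as $\psi _0^{\dagger }y\psi _0$ for a hypothetical reference point `psi0`. All function names are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 5

def cplx(shape):
    """Hypothetical helper: random complex matrix with moderate entries."""
    return 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def dag(a):
    """Adjoint (conjugate transpose)."""
    return a.conj().T

h = cplx((dim, dim))
y = h + dag(h)                   # Hermitian stand-in for y
psi0 = cplx((dim, dim))
x = dag(psi0) @ y @ psi0         # chosen such that psi0^dagger y psi0 = x

def E(psi):
    """E_x o phi_y^{-1}(psi) = tr((x - psi^dagger y psi)^2)."""
    diff = x - dag(psi) @ y @ psi
    return np.trace(diff @ diff).real

def DE(psi, v):
    """Right-hand side of (7.4)."""
    f = dag(psi) @ y @ psi
    return 4.0 * (np.trace(f @ dag(v) @ y @ psi) - np.trace(x @ dag(v) @ y @ psi)).real

def central_difference(psi, v, eps=1e-5):
    """Symmetric difference quotient of E at psi in the direction v."""
    return (E(psi + eps * v) - E(psi - eps * v)) / (2.0 * eps)

psi, v = cplx((dim, dim)), cplx((dim, dim))
print(DE(psi, v), central_difference(psi, v))   # should agree closely
print(DE(psi0, v))                              # vanishes where psi^dagger y psi = x
```

The second printed line reflects the first identity in (7.2): the first derivative vanishes at the point representing x, which is what later makes the second derivative chart-independent.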

Lemma 7.4

$D^2( E_x \circ \phi _y^{-1})|_{\phi _y(x)}$ is independent of the choice of chart (i.e., the choice of y) as long as $y\in {\mathcal{F}}^{\mathrm{reg}}$ is chosen such that $x \in \Omega _y$. Moreover, for all tangent vector fields $\mathbf{v },\mathbf{u }\in \Gamma ({\mathcal{F}}^{\mathrm{reg}},T{\mathcal{F}}^{\mathrm{reg}})$ and any $y\in {\mathcal{F}}^{\mathrm{reg}}$ with $x \in \Omega _y$

$$\begin{aligned} D_{\mathbf{v }(x)} \left( D_{\mathbf{u }(.)}E_x(.) \right)=D^2 \left( E_x \circ \phi _y^{-1} \right) \big|_{\phi _y(x)}\left(D\phi _y(\mathbf{u }(x)) ,D\phi _y(\mathbf{v }(x))\right)\;, \end{aligned}$$

(7.7)

where the derivatives act on the arguments containing a dot.

This lemma also shows that the order of differentiation of $E_x$ with respect to the two vector fields does not matter. The proof shows that this is due to the fact that the first derivative of $E_x$ vanishes at x.

Proof

Let $\mathbf{v },\mathbf{u }\in \Gamma ({\mathcal{F}}^{\mathrm{reg}},T{\mathcal{F}}^{\mathrm{reg}})$ and $x,y\in {\mathcal{F}}^{\mathrm{reg}}$ with $x \in \Omega _y$ be arbitrary. As we have seen before, for the first directional derivative, we have

$$\begin{aligned} D_{\mathbf{u }(\tilde{x})} E_x= D \left( E_x \circ \phi _y^{-1} \right) \big|_{\phi _y(\tilde{x})}D\phi _y(\mathbf{u }(\tilde{x})). \end{aligned}$$

It follows for the second directional derivative that

$$\begin{aligned}&D_{\mathbf{v }(\tilde{x})} \left( D_{\mathbf{u }(.)}E_x(.) \right)=D^2 \left(E_x \circ \phi _y^{-1} \right) \big|_{\phi _y(\tilde{x})}\left(D\phi _y (\mathbf{v }(\tilde{x})),D\phi _y(\mathbf{u }(\tilde{x}))\right) \\&\qquad + D \left(E_x \circ \phi _y^{-1} \right) \big|_{\phi _y(\tilde{x})} \,D \left( D \phi _y \left(\mathbf{u }(\phi _y^{-1}(.)) \right) \right) \Big|_{\phi _y(\tilde{x})} \,D\phi _y(\mathbf{v }(\tilde{x}))\;, \end{aligned}$$

where we applied the Fréchet derivative rule for $\mathbb{R}$-bilinear maps in Lemma 6.1 (iii) together with the chain rule. Evaluating this expression at $\tilde{x}=x$, the second summand vanishes in view of (7.2). We thus obtain

$$\begin{aligned} D_{\mathbf{v }(\tilde{x})} \left( D_{\mathbf{u }(.)}E_x(.) \right) \Big|_{\tilde{x}=x} =D^2 \left(E_x \circ \phi _y^{-1} \right) \big|_{\phi _y(x)} \left(D\phi _y(\mathbf{v }(x)),D\phi _y(\mathbf{u }(x))\right). \end{aligned}$$

Using the symmetry of the second Fréchet derivatives gives the result. $\square $

Remark 7.5

Equation (7.7) also shows that $D_{\mathbf{v }(x)}( D_{\mathbf{u }(.)}E_x(.))$ only depends on the values of the vector fields $\mathbf{u }$ and $\mathbf{v }$ at the point x. Moreover, since for arbitrary $x \in {\mathcal{F}}^{\mathrm{reg}}$ and $\mathbf{u },\mathbf{v }\in T_x{\mathcal{F}}^{\mathrm{reg}}$ one can always find smooth tangent vector fields with $\mathbf{v }(x)=\mathbf{v }$, $\mathbf{u }(x)=\mathbf{u }$ (e.g., using a suitable bump function in a chart around x), we can consider the expression

$$\begin{aligned} D_{2}^2E_x|_{x}(\mathbf{u },\mathbf{v }):=D^2(E_x \circ \phi _y^{-1})|_{\phi _y(x)} \left(D\phi _y(\mathbf{u }),D\phi _y(\mathbf{v })\right)\;, \end{aligned}$$

(7.8)

as a well-defined, coordinate-invariant (in the sense that the right-hand side of equation (7.8) returns the same values for any $y \in {\mathcal{F}}^{\mathrm{reg}}$ with $x \in \Omega _y$) and symmetric bilinear form. $\square $

It is now convenient to compute (7.8) in the chart $\phi _x$. Then we have $\phi _x(x)=\pi _x$ and thus we obtain for any $\mathbf{u },\mathbf{v }\in V_x$:

$$\begin{aligned}&D^2\left( E_x \circ \phi _x^{-1} \right)|_{\phi _x(x)}(\mathbf{v },\mathbf{u }) =4 \cdot {{\,\mathrm{Re}\,}}\left(\mathrm{tr}(x\pi _x\mathbf{u }^{\dagger }x\pi _x \mathbf{v }^{\dagger }) +\mathrm{tr}(x\pi _x\pi _x^{\dagger }x\mathbf{u }\mathbf{v }^{\dagger })\right)\\&\quad =4 \cdot {{\,\mathrm{Re}\,}}\left(\mathrm{tr}(x\mathbf{u }^{\dagger }x \mathbf{v }^{\dagger }) +\mathrm{tr}(x^2\mathbf{u }\mathbf{v }^{\dagger })\right) =4 \cdot {{\,\mathrm{Re}\,}}\left(\mathrm{tr}(x\mathbf{v }x\mathbf{u })+\mathrm{tr}( x\mathbf{u }\mathbf{v }^{\dagger }x)\right). \end{aligned}$$
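
The individual steps of this computation can be spelled out as follows. This is a short worked derivation, assuming (as in the main text) that $S_x$ is the image of the selfadjoint operator $x$, so that $x\pi _x = x = \pi _x x$, and that $\pi _x^{\dagger }=\pi _x$. The second equality merely drops the projections. The third equality combines the cyclicity of the trace with the fact that ${{\,\mathrm{Re}\,}}\,\mathrm{tr}(A)={{\,\mathrm{Re}\,}}\,\mathrm{tr}(A^{\dagger })$ for any operator $A$ of finite rank:

$$\begin{aligned} {{\,\mathrm{Re}\,}}\,\mathrm{tr}(x\mathbf{u }^{\dagger }x\mathbf{v }^{\dagger })&={{\,\mathrm{Re}\,}}\,\mathrm{tr}\left( (x\mathbf{u }^{\dagger }x\mathbf{v }^{\dagger })^{\dagger }\right) ={{\,\mathrm{Re}\,}}\,\mathrm{tr}(\mathbf{v }x\mathbf{u }x) ={{\,\mathrm{Re}\,}}\,\mathrm{tr}(x\mathbf{v }x\mathbf{u })\;, \\ \mathrm{tr}(x^2\mathbf{u }\mathbf{v }^{\dagger })&=\mathrm{tr}(x\mathbf{u }\mathbf{v }^{\dagger }x). \end{aligned}$$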

Motivated by this, for any $x \in {\mathcal{F}}^{\mathrm{reg}}$ we set:

$$\begin{aligned} \tilde{g}_x: V_x \times V_x \rightarrow \mathbb{R}\;,\;\;\; (\mathbf{u },\mathbf{v }) \mapsto 4 \cdot {{\,\mathrm{Re}\,}}\left(\mathrm{tr}(x\mathbf{v }x\mathbf{u }) +\mathrm{tr}(x\mathbf{u }\mathbf{v }^{\dagger }x)\right). \end{aligned}$$

Due to the properties of the trace operator, $\tilde{g}_x$ defines a symmetric, real-valued bilinear form on $V_x$, which is even positive-definite as the following lemma shows:

Lemma 7.6

The symmetric bilinear form $\tilde{g}_x$ is positive definite and thus defines a real valued inner product on $V_x$.

Proof

Let $\mathbf{u }\in V_x$ be arbitrary, choose an orthonormal basis $(e_i)_{i=1,\ldots ,k}$ of the finite-dimensional vector space $(S_x+\mathbf{u }^{\dagger }(S_x))$ and compute:^{Footnote 2}

$$\begin{aligned} \mathrm{tr}(x\mathbf{u }x\mathbf{u })&= \sum _{i=1}^{k}\langle e_i| x\mathbf{u }x\mathbf{u }e_i\rangle _{{\mathcal{H}}} =\sum _{i=1}^{k}\langle \mathbf{u }^{\dagger }xe_i| x\mathbf{u }e_i\rangle _{{\mathcal{H}}}\;, \\ \Rightarrow {{\,\mathrm{Re}\,}}\left( \mathrm{tr}(x\mathbf{u }x\mathbf{u })\right)&=\frac{1}{2}\sum _{i=1}^{k}\left(\langle \mathbf{u }^{\dagger }xe_i| x\mathbf{u }e_i\rangle _{{\mathcal{H}}} +\overline{\langle \mathbf{u }^{\dagger }xe_i| x\mathbf{u }e_i\rangle _{{\mathcal{H}}} }\right) \\&= \frac{1}{2}\sum _{i=1}^{k}\left(\langle \mathbf{u }^{\dagger }xe_i| x\mathbf{u }e_i\rangle _{{\mathcal{H}}} +\langle x\mathbf{u }e_i| \mathbf{u }^{\dagger }xe_i\rangle _{{\mathcal{H}}}\right)\;,\\ \mathrm{tr}(x\mathbf{u }\mathbf{u }^{\dagger }x)&=\sum _{i=1}^{k} \langle e_i| x\mathbf{u }\mathbf{u }^{\dagger }xe_i\rangle _{{\mathcal{H}}} =\sum _{i=1}^{k}\langle \mathbf{u }^{\dagger }xe_i| \mathbf{u }^{\dagger }xe_i\rangle _{{\mathcal{H}}}\;,\\ \mathrm{tr}(x\mathbf{u }\mathbf{u }^{\dagger }x)&= \mathrm{tr}(\mathbf{u }^{\dagger }x^2\mathbf{u }) =\sum _{i=1}^{k}\langle e_i| \mathbf{u }^{\dagger }x^2\mathbf{u }e_i\rangle _{{\mathcal{H}}} =\sum _{i=1}^{k}\langle x\mathbf{u }e_i| x\mathbf{u }e_i\rangle _{{\mathcal{H}}}\;,\\ \Rightarrow {{\,\mathrm{Re}\,}}\left( \mathrm{tr}(x\mathbf{u }\mathbf{u }^{\dagger }x) \right)&={{\,\mathrm{Re}\,}}\left( \frac{1}{2} \sum _{i=1}^{k}\left( \langle \mathbf{u }^{\dagger }xe_i| \mathbf{u }^{\dagger }xe_i\rangle _{{\mathcal{H}}} + \langle x \mathbf{u }e_i| x\mathbf{u }e_i\rangle _{{\mathcal{H}}}\right)\right) \\&= \frac{1}{2} \sum _{i=1}^{k}\left( \langle \mathbf{u }^{\dagger }xe_i| \mathbf{u }^{\dagger }xe_i\rangle _{{\mathcal{H}}} + \langle x\mathbf{u }e_i| x\mathbf{u }e_i\rangle _{{\mathcal{H}}}\right)\;, \end{aligned}$$

where in the last step, we used that $\langle \mathbf{u }^{\dagger }xe_i| \mathbf{u }^{\dagger }xe_i\rangle _{{\mathcal{H}}}= \Vert \mathbf{u }^{\dagger }xe_i \Vert ^2_{{\mathcal{H}}}$ and $\langle x\mathbf{u }e_i| x\mathbf{u }e_i\rangle _{{\mathcal{H}}}= \Vert x\mathbf{u }e_i\Vert ^2_{{\mathcal{H}}}$ are already real (for $i=1,\dots ,k$), so we can leave out the “${{\,\mathrm{Re}\,}}$.”

Combining this we obtain:

$$\begin{aligned} {{\,\mathrm{Re}\,}}\left( \mathrm{tr}(x\mathbf{u }x\mathbf{u }) + \mathrm{tr}(x\mathbf{u }\mathbf{u }^{\dagger }x) \right)&= \frac{1}{2}\sum _{i=1}^{k}\left(\langle (\mathbf{u }^{\dagger }x +x\mathbf{u })e_i| x \mathbf{u }e_i\rangle _{{\mathcal{H}}} + \langle (x\mathbf{u }+\mathbf{u }^{\dagger }x)e_i| \mathbf{u }^{\dagger }xe_i\rangle _{{\mathcal{H}}}\right)\\&=\frac{1}{2}\sum _{i=1}^{k}\langle (\mathbf{u }^{\dagger }x +x\mathbf{u })e_i| (\mathbf{u }^{\dagger }x +x\mathbf{u }) e_i\rangle _{{\mathcal{H}}} \ge 0. \end{aligned}$$

This shows the positive semi-definiteness of $\tilde{g}_x$. Moreover, we see that $\tilde{g}_x(\mathbf{u },\mathbf{u })$ vanishes if and only if

$$\begin{aligned} 0=\left(\mathbf{u }^{\dagger }x +x\mathbf{u }\right)|_{S_x+\mathbf{u }^{\dagger }(S_x)} . \end{aligned}$$

But as $(\mathbf{u }^{\dagger }x +x\mathbf{u })$ is obviously selfadjoint and its image is contained in $S_x+\mathbf{u }^{\dagger }(S_x)$, it vanishes on the orthogonal complement of $S_x+\mathbf{u }^{\dagger }(S_x)$ anyhow, so the previous equation is equivalent to

$$\begin{aligned} 0= \mathbf{u }^{\dagger }x +x\mathbf{u }. \end{aligned}$$

(7.9)

Moreover, denoting by $\pi _I := \pi _x$ the orthogonal projection onto $I := S_x$ and by $ \pi _J$ the orthogonal projection onto $J:=(S_x)^{\bot }$, we can write

$$\begin{aligned} \mathbf{u }= \mathbf{u }\pi _I + \mathbf{u }\pi _J. \end{aligned}$$

Plugging this in equation (7.9) yields:

$$\begin{aligned} 0 = (\mathbf{u }\pi _I + \mathbf{u }\pi _J)^{\dagger }x + x(\mathbf{u }\pi _I + \mathbf{u }\pi _J) =\pi _I\mathbf{u }^{\dagger }x + x \mathbf{u }\pi _I + \pi _J\mathbf{u }^{\dagger }x + x \mathbf{u }\pi _J . \end{aligned}$$

(7.10)

Using $\mathbf{u }|_I \in \mathrm{Symm}(S_x)$, we conclude

$$\begin{aligned} X^{-1}\pi _I\mathbf{u }^{\dagger }X=X^{-1}(\mathbf{u }|_I)^{\dagger }X = \mathbf{u }|_I \;,\;\;\;\Rightarrow \pi _I\mathbf{u }^{\dagger }X = X\mathbf{u }|_I. \end{aligned}$$

As x is selfadjoint, this also yields

$$\begin{aligned} \pi _I\mathbf{u }^{\dagger }x = x\mathbf{u }\pi _I. \end{aligned}$$

Inserting this in (7.10) gives:

$$\begin{aligned} 0 = 2x \mathbf{u }\pi _I + \pi _J\mathbf{u }^{\dagger }x + x \mathbf{u }\pi _J . \end{aligned}$$

(7.11)

Using a block operator notation for the orthogonal decomposition ${\mathcal{H}}= I \oplus ^{\bot } J$, this equation can be visualized as

$$\begin{aligned} \left( \begin{array}{cc} 0 &{} 0\\ 0 &{} 0 \end{array} \right) = \left( \begin{array}{cc} 2x \mathbf{u }\pi _I &{} x \mathbf{u }\pi _J\\ \pi _J\mathbf{u }^{\dagger }x &{} 0 \end{array} \right) . \end{aligned}$$

This notation can be justified by “testing” equation (7.11) with $(v,0),(0,w)\in {\mathcal{H}}=I\oplus ^{\bot } J$ with $v\in I$ and $w\in J$ arbitrary.
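
Explicitly, this test can be carried out as follows. This is a sketch that uses only that the selfadjoint operator $x$ has image $I$ and hence vanishes on $J$. For $v\in I$ and $w\in J$, equation (7.11) gives

$$\begin{aligned} 0&=\underbrace{2x\mathbf{u }\,v}_{\in I}+\underbrace{\pi _J\mathbf{u }^{\dagger }x\,v}_{\in J}\;, \\ 0&=x\mathbf{u }\,w\;, \end{aligned}$$

where in the second line the terms $2x\mathbf{u }\pi _I w$ and $\pi _J\mathbf{u }^{\dagger }x w$ drop out because $\pi _I w=0$ and $xw=0$. Since $I$ and $J$ are orthogonal, the two summands in the first line must vanish separately.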

Thus, we see that each of the operators $2x\mathbf{u }\pi _I$, $\pi _J\mathbf{u }^{\dagger }x$ and $x\mathbf{u }\pi _J$ must vanish individually. Furthermore, as $x|_I$ has full rank and $\mathbf{u }$ maps into $S_x=I$, this yields

$$\begin{aligned} \mathbf{u }\pi _I = 0\;,\;\;\; \mathbf{u }\pi _J = 0\;, \end{aligned}$$

and therefore also

$$\begin{aligned} \mathbf{u }=\mathbf{u }(\pi _I+\pi _J)=0. \end{aligned}$$

This proves the positive definiteness of $\tilde{g}_x(\mathbf{u },\mathbf{u }) =4 \cdot {{\,\mathrm{Re}\,}}\left( \mathrm{tr}(x\mathbf{u }x\mathbf{u }) + \mathrm{tr}(x\mathbf{u }\mathbf{u }^{\dagger }x) \right)$. $\square $
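
The identity at the heart of this proof, ${{\,\mathrm{Re}\,}}\left( \mathrm{tr}(x\mathbf{u }x\mathbf{u }) + \mathrm{tr}(x\mathbf{u }\mathbf{u }^{\dagger }x) \right) = \frac{1}{2}\,\mathrm{tr}(A^{\dagger }A)$ with $A=\mathbf{u }^{\dagger }x+x\mathbf{u }$, can be checked numerically in a finite-dimensional toy model. The following is a minimal sketch under stated assumptions and not part of the paper: the Hilbert space is taken to be $\mathbb{C}^3$, the matrices `x`, `u`, `v` are arbitrary illustrative choices (with `x` selfadjoint of rank two and `u`, `v` mapping into its image), and all function names are hypothetical helpers written in plain Python.

```python
# Finite-dimensional sanity check of the identity behind Lemma 7.6.
# Matrices are nested lists of complex numbers; every name is illustrative.

def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose (adjoint)."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def add(a, b):
    """Entrywise sum."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def g_tilde(x, u, v):
    """4 * Re( tr(x v x u) + tr(x u v^dagger x) )."""
    first = trace(matmul(matmul(x, v), matmul(x, u)))
    second = trace(matmul(matmul(x, u), matmul(dagger(v), x)))
    return 4 * (first + second).real

def half_hs_norm_sq(x, u):
    """(1/2) * tr(A^dagger A) for A = u^dagger x + x u."""
    a = add(matmul(dagger(u), x), matmul(x, u))
    return 0.5 * trace(matmul(dagger(a), a)).real

# Assumed toy data: x is selfadjoint with image span(e_1, e_2),
# with one positive and one negative eigenvalue.
x = [[1 + 0j, 0j, 0j],
     [0j, -2 + 0j, 0j],
     [0j, 0j, 0j]]
# u and v map into the image of x (their third rows vanish).
u = [[0.5 + 0j, 0.3 - 1j, 2 - 1j],
     [1j, -1.5 + 0j, 0.7 + 0.2j],
     [0j, 0j, 0j]]
v = [[-1 + 0j, 2j, 0.4 + 0j],
     [0.1 - 0.3j, 1 + 0j, -1j],
     [0j, 0j, 0j]]

diagonal_value = g_tilde(x, u, u)
sum_of_squares = 4 * half_hs_norm_sq(x, u)
symmetry_gap = abs(g_tilde(x, u, v) - g_tilde(x, v, u))
print(diagonal_value, sum_of_squares, symmetry_gap)
```

In this sketch the two printed values `diagonal_value` and `sum_of_squares` should agree up to rounding, and `symmetry_gap` should be numerically zero. The check covers the semi-definiteness identity and the symmetry of $\tilde{g}_x$ only. Strict positivity on $V_x$ relies on the additional structure of $V_x$ used in the second half of the proof, which the toy data does not attempt to model.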

Now, we can finally introduce a Riemannian metric on ${\mathcal{F}}^{\mathrm{reg}}$:

Lemma 7.7

Setting pointwise for any $x\in {\mathcal{F}}^{\mathrm{reg}}$:

$$\begin{aligned} g_x: T_x{\mathcal{F}}^{\mathrm{reg}} \times T_x{\mathcal{F}}^{\mathrm{reg}} \rightarrow \mathbb{R}\;,\;\;\; (\mathbf{u },\mathbf{v }) \mapsto D_2^2E_x|_x(\mathbf{u },\mathbf{v })\;, \end{aligned}$$

we obtain a well-defined Riemannian metric on ${\mathcal{F}}^{\mathrm{reg}}$.^{Footnote 3}

Proof

First of all, $g_x$ is well defined due to Lemma 7.4 as explained in Remark 7.5.

Moreover, choosing representatives $[x,\mathbf{u },x]$, $[x,\mathbf{v },x] \in T_x{\mathcal{F}}^{\mathrm{reg}}$, we have:

$$\begin{aligned}&g_x([x,\mathbf{u },x],[x,\mathbf{v },x])=D_{2}^2E_x|_{x}([x,\mathbf{u },x],[x,\mathbf{v },x])\\&\quad =D^2(E_x \circ \phi _x^{-1})|_{\phi _x(x)} \left(\underbrace{D\phi _x([x,\mathbf{u },x])}_{=\mathbf{u }}, \underbrace{D\phi _x([x,\mathbf{v },x])}_{=\mathbf{v }}\right)=\tilde{g}_x(\mathbf{u },\mathbf{v })\;, \end{aligned}$$

and since we have already seen that for any $x \in {\mathcal{F}}^{\mathrm{reg}}$, $\tilde{g}_x$ defines a symmetric positive-definite bilinear form, so does $g_x$.

Thus, it only remains to show that g is Fréchet-smooth. But due to the (coordinate invariant) definition of $D_2^2E_x|_x$ in (7.8), this follows immediately from the Fréchet-smoothness of $D^2(E_x \circ \phi _y^{-1})$ (for this see Lemma 7.3, in particular equation (7.5)). More precisely, for any two smooth vector fields $\mathbf{u },\mathbf{v }\in \Gamma ({\mathcal{F}}^{\mathrm{reg}},T{\mathcal{F}}^{\mathrm{reg}})$ and any chart $\phi _y$ with $x \in \Omega _y$, the maps $D\phi _y\circ \mathbf{u }\circ \phi _y^{-1}$ and $D\phi _y\circ \mathbf{v }\circ \phi _y^{-1}$ are also smooth, and we have

$$\begin{aligned}&g_{\phi _y^{-1}(\psi )}(\mathbf{u }\circ \phi _y^{-1}(\psi ),\mathbf{v }\circ \phi _y^{-1}(\psi )) = D_{2}^2E_x|_{x}(\mathbf{u },\mathbf{v })\\&\quad =D^2(E_x \circ \phi _y^{-1})|_{\psi }\left(D\phi _y \left(\mathbf{u }\circ \phi _y^{-1}(\psi )\right),D\phi _y\left(\mathbf{v }\circ \phi _y^{-1}(\psi )\right)\right)\;,\;\;\;\forall \, \psi \in W_y\;, \end{aligned}$$

which is Fréchet-smooth as a composition of Fréchet-smooth maps. In detail, introducing the mappings

$$\begin{aligned}&B_1: \mathrm{L}(V_y,\mathbb{R})_2 \times V_y \rightarrow \mathrm{L}(V_y, \mathbb{R}) \;, \quad (A,\mathbf{v })\mapsto A\mathbf{v }\;,\\&B_2: \mathrm{L}(V_y,\mathbb{R}) \times V_y \rightarrow \mathbb{R}\;, \quad (A',\mathbf{v }')\mapsto A'\mathbf{v }'\;, \end{aligned}$$

which are both obviously $\mathbb{R}$-bilinear and continuous (and thus Fréchet-smooth), we can rewrite the previous equation as

$$\begin{aligned}&g_{\phi _y^{-1}(\psi )}(\mathbf{u }\circ \phi _y^{-1}(\psi ),\mathbf{v }\circ \phi _y^{-1}(\psi )) \\&\quad = B_2\left(B_1\left((d^2(x,\phi _y^{-1}(.)))^{(2)}|_{\psi }, \,D\phi _y (\mathbf{u }\circ \phi _y^{-1}(\psi ))\right),\, D\phi _y(\mathbf{v }\circ \phi _y^{-1}(\psi ))\right)\;, \end{aligned}$$

which is now clearly a composition of Fréchet-smooth maps. $\square $

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Finster, F., Lottner, M. Banach manifold structure and infinite-dimensional analysis for causal fermion systems. Ann Glob Anal Geom 60, 313–354 (2021). https://doi.org/10.1007/s10455-021-09775-4

Received: 04 February 2021
Accepted: 27 April 2021
Published: 31 May 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s10455-021-09775-4
