Abstract
A mathematical framework is developed for the analysis of causal fermion systems in the infinite-dimensional setting. It is shown that the regular spacetime point operators form a Banach manifold endowed with a canonical Fréchet-smooth Riemannian metric. The so-called expedient differential calculus is introduced with the purpose of treating derivatives of functions on Banach spaces which are differentiable only in certain directions. A chain rule is proven for Hölder continuous functions which are differentiable on expedient subspaces. These results are made applicable to causal fermion systems by proving that the causal Lagrangian is Hölder continuous. Moreover, Hölder continuity is analyzed for the integrated causal Lagrangian.
1 Introduction
The theory of causal fermion systems is a recent approach to fundamental physics (see the basics in Sect. 2, the reviews [11, 12, 16], the textbook [10] or the website [1]). In this approach, spacetime and all objects therein are described by a measure \(\rho \) on a set \({\mathcal{F}}\) of linear operators of rank at most 2n on a Hilbert space \(({\mathcal{H}}, \langle .|. \rangle _{{\mathcal{H}}})\). The physical equations are formulated via the so-called causal action principle, a nonlinear variational principle where an action \({\mathcal{S}}\) is minimized under variations of the measure \(\rho \). If the Hilbert space \({\mathcal{H}}\) is finite-dimensional, the set \({\mathcal{F}}\) is a locally compact topological space. Making essential use of this fact, it was shown in [9] that the causal action principle is well defined and that minimizers exist. Moreover, as is worked out in detail in [15], the interior of \({\mathcal{F}}\) (consisting of the so-called regular points; see Definition 3.1) has a smooth manifold structure. Taking these structures as the starting point, causal variational principles were formulated and studied as a mathematical generalization of the causal action principle, where an action of the form
\[ {\mathcal{S}}(\rho ) = \int _{{\mathcal{F}}} d\rho (x) \int _{{\mathcal{F}}} d\rho (y)\, {\mathcal{L}}(x,y) \]
is minimized for a given lower-semicontinuous Lagrangian \({\mathcal{L}}: {\mathcal{F}}\times {\mathcal{F}}\rightarrow {\mathbb{R}}^+_0\) on an (in general non-compact) manifold \({\mathcal{F}}\) under variations of \(\rho \) within the class of regular Borel measures, keeping the total volume \(\rho ({\mathcal{F}})\) fixed. We refer the reader interested in causal variational principles to [19, Section 1 and 2] and the references therein.
This article is devoted to the case that the Hilbert space \({\mathcal{H}}\) is infinite-dimensional and separable. While the finite-dimensional setting seems suitable for describing physical spacetime on a fundamental level (where spacetime can be thought of as being discrete on a microscopic length scale usually associated to the Planck length), an infinite-dimensional Hilbert space arises in mathematical extrapolations where spacetime is continuous and has infinite volume. Most notably, infinite-dimensional Hilbert spaces come up in the examples of causal fermion systems describing Minkowski space (see [10, Section 1.2] or [26]) or a globally hyperbolic Lorentzian manifold (see for example [11]), and they are also needed for analyzing the limiting case of a classical interaction (the so-called continuum limit; see [10, Section 1.5.2 and Chapters 3-5]). A workaround to avoid infinite-dimensional analysis is to restrict attention to locally compact variations, as is done in [14, Section 2.3]. Nevertheless, in view of the importance of the examples and physical applications, it is a task of growing significance to analyze causal fermion systems systematically in the infinite-dimensional setting. It is the objective of this paper to put this analysis on a sound mathematical basis.
We now outline the main points of our constructions and explain our main results. Extending methods and results in [15] to the infinite-dimensional setting, we endow the set of all regular points of \({\mathcal{F}}\) with the structure of a Banach manifold (see Definition 3.1 and Theorem 3.4). To this end, we construct an atlas formed of so-called symmetric wave charts (see Definition 3.3). We also show that the Hilbert–Schmidt norm on finite-rank operators on \({\mathcal{H}}\) gives rise to a Fréchet-smooth Riemannian metric on this Banach manifold. More precisely, in Theorems 3.11 and 3.12, we prove that \({\mathcal{F}}^{\mathrm{reg}}\) is a smooth Banach submanifold of the Hilbert space \({\mathscr{S}}({\mathcal{H}})\) of selfadjoint Hilbert–Schmidt operators, with the Riemannian metric given by
\[ g_x : T_x{\mathcal{F}}^{\mathrm{reg}} \times T_x{\mathcal{F}}^{\mathrm{reg}} \rightarrow {\mathbb{R}}, \qquad g_x(A,B) := {{\,\mathrm{tr}\,}}(AB) . \]
In order to introduce higher derivatives at a regular point \(p \in {\mathcal{F}}\), our strategy is to always work in the distinguished symmetric wave chart around this point. This has the advantage that we can avoid the analysis of differentiability properties under coordinate transformations. The remaining difficulty is that the causal Lagrangian \({\mathcal{L}}\) and other derived functions are not differentiable. Instead, directional derivatives exist only in certain directions. In general, these directions do not form a vector space. As a consequence, the derivative is not a linear mapping, and the usual product and chain rules cease to hold. On the other hand, these computation rules are needed in the applications, and it is often sensible to assume that they do hold. This motivates our strategy of looking for a vector space on which the function under consideration is differentiable. Clearly, in this way, we lose information on the differentiability in certain directions which do not lie in such a vector space. But this shortcoming is outweighed by the benefit that we can avoid the subtleties of non-smooth analysis, which, at least for most of the applications we have in mind, would be impractical and inappropriately technical. Clearly, we want the subspace to be as large as possible, and moreover, it should be defined canonically without making any arbitrary choices. These requirements lead us to the notion of expedient subspaces (see Definition 4.2). In general, the expedient subspace is neither dense nor closed. On these expedient subspaces, the function is Gâteaux differentiable, the derivative is a linear mapping, and higher derivatives are multilinear.
The differential calculus on expedient subspaces is compatible with the chain rule in the following sense: If f is locally Hölder continuous and \(\gamma \) is a smooth curve whose derivatives up to sufficiently high order lie in the expedient differentiable subspace of f, then the composition \(f \circ \gamma \) is differentiable and the chain rule holds (see Proposition 4.4), i.e.,
\[ (f \circ \gamma )'(t_0) = D^{{\mathcal{E}}} f|_{\gamma (t_0)} \,\gamma '(t_0) , \]
where the index \({\mathcal{E}}\) denotes the derivative on the expedient subspace. We also prove a chain rule for higher derivatives (see Proposition 4.5). The requirement of Hölder continuity is a crucial assumption needed in order to control the error term of the linearization. The most general statement is Theorem 5.8 where Hölder continuity is required only on a subspace which contains the curve \(\gamma \) locally.
We also work out how the differential calculus on expedient subspaces applies to the setting of causal fermion systems. In order to establish the chain rule, we prove that the causal Lagrangian is indeed locally Hölder continuous with uniform Hölder exponent (Theorem 5.1), and we analyze how the Hölder constant depends on the base point (Theorem 5.3). Moreover, we prove that for all \(x,y \in {\mathcal{F}}\), there is a neighborhood \(U\subseteq {\mathcal{F}}\) of y with (see (5.9))
(where 2n is the maximal rank of the operators in \({\mathcal{F}}\)). Relying on these results, we can generalize the jet formalism as introduced in [17] for causal variational principles to the infinite-dimensional setting (Sect. 5.2). We also work out the chain rule for the Lagrangian (Theorem 5.6) and for the function \(\ell \) obtained by integrating one of the arguments of the Lagrangian (Theorem 5.9),
(where is a positive constant).
The paper is organized as follows. Section 2 provides the necessary preliminaries on causal fermion systems and infinite-dimensional analysis. In Sect. 3, an atlas of symmetric wave charts is constructed, and it is shown that this atlas endows the regular points of \({\mathcal{F}}\) with the structure of a Fréchet-smooth Banach manifold. Moreover, it is shown that the Hilbert–Schmidt norm induces a Fréchet-smooth Riemannian metric. In Sect. 4, the differential calculus on expedient subspaces is developed. In Sect. 5, this differential calculus is applied to causal fermion systems. Appendix 1 gives some more background information on the Fréchet derivative. Finally, Appendix 2 provides details on what the Riemannian metric looks like in different charts.
We finally point out that in order to address a coherent readership, concrete applications of our methods and results for example to physical spacetimes have not been included here. The example of causal fermion systems in Minkowski space will be worked out separately in [25].
2 Preliminaries
2.1 Causal fermion systems and the causal action principle
We now recall the basic definitions of a causal fermion system and the causal action principle.
Definition 2.1
(causal fermion system) Given a separable complex Hilbert space \({\mathcal{H}}\) with scalar product \(\langle .|. \rangle _{{\mathcal{H}}}\) and a parameter \(n \in {\mathbb{N}}\) (the “spin dimension”), we let \({\mathcal{F}}\subseteq \mathrm{L}({\mathcal{H}})\) be the set of all selfadjoint operators on \({\mathcal{H}}\) of finite rank, which (counting multiplicities) have at most n positive and at most n negative eigenvalues. On \({\mathcal{F}}\), we are given a positive measure \(\rho \) (defined on a \(\sigma \)-algebra of subsets of \({\mathcal{F}}\)), the so-called universal measure. We refer to \(({\mathcal{H}}, {\mathcal{F}}, \rho )\) as a causal fermion system.
A causal fermion system describes a spacetime together with all structures and objects therein. In order to single out the physically admissible causal fermion systems, one must formulate physical equations. To this end, we impose that the universal measure should be a minimizer of the causal action principle, which we now introduce.
For any \(x, y \in {\mathcal{F}}\), the product xy is an operator of rank at most 2n. However, in general, it is no longer a selfadjoint operator because \((xy)^* = yx\), and this is different from xy unless x and y commute. As a consequence, the eigenvalues of the operator xy are in general complex. We denote these eigenvalues counting algebraic multiplicities by \(\lambda ^{xy}_1, \ldots , \lambda ^{xy}_{2n} \in {\mathbb{C}}\) (more specifically, denoting the rank of xy by \(k \le 2n\), we choose \(\lambda ^{xy}_1, \ldots , \lambda ^{xy}_{k}\) as all the nonzero eigenvalues and set \(\lambda ^{xy}_{k+1}, \ldots , \lambda ^{xy}_{2n}=0\)). We introduce the Lagrangian and the causal action by
\[ \text{Lagrangian:} \qquad {\mathcal{L}}(x,y) = \frac{1}{4n} \sum _{i,j=1}^{2n} \Big( \big| \lambda ^{xy}_i \big| - \big| \lambda ^{xy}_j \big| \Big)^2 \]
\[ \text{causal action:} \qquad {\mathcal{S}}(\rho ) = \iint _{{\mathcal{F}}\times {\mathcal{F}}} {\mathcal{L}}(x,y)\, d\rho (x)\, d\rho (y) . \]
The causal action principle is to minimize \({\mathcal{S}}\) by varying the measure \(\rho \) under the following constraints:
\[ \text{volume constraint:} \qquad \rho ({\mathcal{F}}) = \text{const} \tag{2.3} \]
\[ \text{trace constraint:} \qquad \int _{{\mathcal{F}}} {{\,\mathrm{tr}\,}}(x)\, d\rho (x) = \text{const} \tag{2.4} \]
\[ \text{boundedness constraint:} \qquad \iint _{{\mathcal{F}}\times {\mathcal{F}}} |xy|^2\, d\rho (x)\, d\rho (y) \le C , \]
where C is a given parameter, \({{\,\mathrm{tr}\,}}\) denotes the trace of a linear operator on \({\mathcal{H}}\), and the absolute value of xy is the so-called spectral weight,
\[ |xy| = \sum _{i=1}^{2n} \big| \lambda ^{xy}_i \big| . \]
This variational principle is mathematically well posed if \({\mathcal{H}}\) is finite-dimensional. For the existence theory and the analysis of general properties of minimizing measures, we refer to [3, 8, 9]. In the existence theory, one varies in the class of regular Borel measures (with respect to the topology on \(\mathrm{L}({\mathcal{H}})\) induced by the operator norm), and the minimizing measure is again in this class. With this in mind, here, we always assume that
\[ \rho \text{ is a regular Borel measure.} \]
Let \(\rho \) be a minimizing measure. Spacetime is defined as the support of this measure,
\[ M := \mathrm{supp}\, \rho . \]
Thus, the spacetime points are selfadjoint linear operators on \({\mathcal{H}}\). These operators contain a lot of additional information which, if interpreted correctly, gives rise to spacetime structures like causal and metric structures, spinors and interacting fields. We refer the interested reader to [10, Chapter 1].
The only results on the structure of minimizing measures which will be needed here concern the treatment of the trace constraint and the boundedness constraint. As a consequence of the trace constraint, for any minimizing measure \(\rho \), the local trace is constant in spacetime, i.e., there is a real constant \(c \ne 0\) such that (see [10, Proposition 1.4.1])
\[ {{\,\mathrm{tr}\,}}(x) = c \qquad \text{for all } x \in M . \]
Restricting attention to operators with fixed trace, the trace constraint (2.4) is equivalent to the volume constraint (2.3) and may be disregarded. The boundedness constraint, on the other hand, can be treated with a Lagrange multiplier. Indeed, as is made precise in [3, Theorem 1.3], for every minimizing measure \(\rho \), there is a Lagrange multiplier \(\kappa >0\) such that \(\rho \) is a local minimizer of the causal action with the Lagrangian replaced by
\[ {\mathcal{L}}_\kappa (x,y) := {\mathcal{L}}(x,y) + \kappa \, |xy|^2 , \]
leaving out the boundedness constraint.
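The definitions of this section can be made concrete in the smallest possible case. The following minimal sketch evaluates the Lagrangian, the spectral weight and the Lagrangian with Lagrange multiplier for \(2 \times 2\) matrices, i.e., for the toy setting \({\mathcal{H}}={\mathbb{C}}^2\) and \(n=1\). It assumes the standard forms \({\mathcal{L}}(x,y) = \frac{1}{4n} \sum _{i,j} (|\lambda ^{xy}_i| - |\lambda ^{xy}_j|)^2\), \(|xy| = \sum _i |\lambda ^{xy}_i|\) and \({\mathcal{L}}_\kappa = {\mathcal{L}} + \kappa \,|xy|^2\); all function names are hypothetical and chosen for illustration only.

```python
import cmath

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def eigenvalues2(m):
    """Eigenvalues of a 2x2 matrix, computed from trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr / 4 - det)
    return (tr / 2 + root, tr / 2 - root)

def spectral_weight(x, y):
    """Assumed form |xy| = sum of the absolute values of the eigenvalues of xy."""
    return sum(abs(lam) for lam in eigenvalues2(matmul2(x, y)))

def lagrangian(x, y, n=1):
    """Assumed form L(x,y) = 1/(4n) * sum_{i,j} (|lam_i| - |lam_j|)^2."""
    lams = eigenvalues2(matmul2(x, y))
    return sum((abs(a) - abs(b)) ** 2 for a in lams for b in lams) / (4 * n)

def lagrangian_kappa(x, y, kappa, n=1):
    """Assumed form L_kappa(x,y) = L(x,y) + kappa * |xy|^2."""
    return lagrangian(x, y, n) + kappa * spectral_weight(x, y) ** 2

# Two selfadjoint matrices with one positive and one negative eigenvalue each.
x = ((1, 0), (0, -1))
y_diag = ((2, 0), (0, -1))      # xy has the real eigenvalues 2 and 1
y_offdiag = ((0, 1), (1, 0))    # xy has the eigenvalues +i and -i

print(lagrangian(x, y_diag))        # 0.5
print(lagrangian(x, y_offdiag))     # 0.0
print(spectral_weight(x, y_diag))   # 3.0
```

In the second example, the eigenvalues of xy form a complex conjugate pair of equal absolute value, so the Lagrangian vanishes although x and y do not commute.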
2.2 Fréchet and Gâteaux derivatives
We now recall a few basic concepts from the differential calculus on normed vector spaces. In what follows, we let \((E, \Vert .\Vert _E)\) and \((F, \Vert .\Vert _F)\) be real normed vector spaces. The most common concept is that of the Fréchet derivative.
Definition 2.2
Let \(U \subseteq E\) be open and \(f : U \rightarrow F\) be an F-valued function on U. The function f is Fréchet-differentiable in \(x_0 \in U\) if there is a bounded linear mapping \(A \in \mathrm{L}(E, F)\) such that
\[ f(x) = f(x_0) + A\, (x-x_0) + r(x) , \]
where the error term \(r : U \rightarrow F\) goes to zero faster than linearly, i.e.,
\[ \lim _{x \rightarrow x_0,\, x \ne x_0} \frac{\Vert r(x)\Vert _F}{\Vert x-x_0\Vert _E} = 0 . \]
The linear operator A is the Fréchet derivative, also denoted by \(Df|_{x_0}\). A function is Fréchet-differentiable in U if it is Fréchet-differentiable at every point of U.
The Fréchet derivative is uniquely defined. Moreover, the concept can be iterated to define higher derivatives. Indeed, if f is differentiable in U, its derivative Df is a mapping
\[ Df : U \rightarrow \mathrm{L}(E,F) . \]
Since \(\mathrm{L}(E,F)\) is a normed vector space (with the operator norm), we can apply Definition 2.2 once again to define the second derivative at a point \(x_0\) by
\[ D^2 f|_{x_0} := D(Df)|_{x_0} \in \mathrm{L}\big( E, \mathrm{L}(E,F) \big) . \]
The second derivative can also be viewed as a bilinear mapping from E to F,
\[ D^2 f|_{x_0} : E \times E \rightarrow F , \qquad D^2 f|_{x_0}(u,v) := \big( D(Df)|_{x_0}\, u \big)\, v . \]
It is by definition bounded, meaning that there is a constant \(c>0\) such that
\[ \big\Vert D^2 f|_{x_0}(u,v) \big\Vert _F \le c\, \Vert u\Vert _E\, \Vert v\Vert _E \qquad \text{for all } u,v \in E . \]
By iteration, one obtains similarly the Fréchet derivatives of order \(p \in {\mathbb{N}}\) as multilinear operators
\[ D^p f|_{x_0} : \underbrace{E \times \cdots \times E}_{p \text{ factors}} \rightarrow F . \]
A function is Fréchet-smooth on U if it is Fréchet-differentiable to every order.
Lemma 2.3
If the function \(f : U \subseteq E \rightarrow F\) is p times Fréchet-differentiable in \(x_0 \in U\), then its \(p^{\mathrm{th}}\) Fréchet derivative is symmetric, i.e., for any \(u_1, \ldots , u_p \in E\) and any permutation \(\sigma \in {{\mathcal{S}}}_p\),
\[ D^p f|_{x_0}(u_1, \ldots , u_p) = D^p f|_{x_0}\big( u_{\sigma (1)}, \ldots , u_{\sigma (p)} \big) . \]
We omit the proof, which can be found for example in [5, Section 4.4]. For the Fréchet derivative, most concepts familiar from the finite-dimensional setting carry over immediately. In particular, the composition of Fréchet-differentiable functions is again Fréchet-differentiable. Moreover, the chain and product rules hold. We refer for the details to [5, Sections 2.2 and 2.3], [6, Chapter 8] and Appendix 1.
A weaker concept of differentiability which we will use here is Gâteaux differentiability.
Definition 2.4
Let \(U \subseteq E\) be open and \(f : U \rightarrow F\) be an F-valued function on U. The function f is Gâteaux differentiable in \(x_0 \in U\) in the direction \(u \in E\) if the limit of the difference quotient exists,
\[ d_u f(x_0) := \lim _{h \rightarrow 0,\, h \ne 0} \frac{f(x_0 + hu) - f(x_0)}{h} . \]
The resulting vector \(d_u f(x_0) \in F\) is the Gâteaux derivative.
By definition, the Gâteaux derivative is homogeneous of degree one, i.e.,
\[ d_{\lambda u} f(x_0) = \lambda \, d_u f(x_0) \qquad \text{for all } \lambda \in {\mathbb{R}} . \]
Moreover, if f is Fréchet-differentiable in \(x_0\), then it is also Gâteaux differentiable in any direction \(u \in E\) and
\[ d_u f(x_0) = Df|_{x_0}\, u . \]
However, the converse is not true because, even if the Gâteaux derivatives exist for any \(u \in E\), it is in general not possible to represent them by a bounded linear operator. As a consequence, the chain and product rules in general do not hold for Gâteaux derivatives. We shall come back to this issue in Sect. 5.
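A standard finite-dimensional example (not taken from the applications treated in this paper) illustrates the failure of linearity. The function \(f(x,y) = x^3/(x^2+y^2)\) with \(f(0,0)=0\) has a Gâteaux derivative at the origin in every direction \(u=(u_1,u_2)\), namely \(d_u f(0) = u_1^3/(u_1^2+u_2^2)\). This expression is homogeneous of degree one, but it is not additive in u, so that it cannot be represented by a linear operator. The following sketch checks this numerically with a difference quotient; the function names and the step size are assumptions made for illustration.

```python
def f(x, y):
    """Homogeneous function of degree one which is not linear."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 3 / (x * x + y * y)

def gateaux_at_origin(u, h=1e-6):
    """Difference quotient (f(h*u) - f(0)) / h approximating d_u f(0)."""
    return (f(h * u[0], h * u[1]) - f(0.0, 0.0)) / h

d_e1 = gateaux_at_origin((1.0, 0.0))    # approximately 1
d_e2 = gateaux_at_origin((0.0, 1.0))    # exactly 0
d_sum = gateaux_at_origin((1.0, 1.0))   # approximately 1/2

# Additivity fails: d_{e1+e2} f(0) differs from d_{e1} f(0) + d_{e2} f(0).
print(d_e1 + d_e2, d_sum)
```

Since f is homogeneous of degree one, the difference quotient does not depend on the step size h, apart from rounding errors.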
2.3 Banach manifolds
We recall the basic definition of a smooth Banach manifold (for more details see for example [29, Chapter 73]).
Definition 2.5
Let B be a Hausdorff topological space and \((E, \Vert .\Vert _E)\) a Banach space. A chart \((U, \phi )\) is a pair consisting of an open subset \(U \subseteq B\) and a homeomorphism \(\phi \) of U to an open subset \(V := \phi (U)\) of E, i.e.,
\[ \phi : U \rightarrow V \subseteq E . \]
A smooth atlas \({{\mathcal{A}}} = ( \phi _i, U_i, E)_{i \in I}\) is a collection of charts (for a general index set I) with the properties that the domains of the charts cover B,
\[ B = \bigcup _{i \in I} U_i , \]
and that for any \(i, j \in I\), the transition map
\[ \phi _j \circ \phi _i^{-1} : \phi _i(U_i \cap U_j) \rightarrow \phi _j(U_i \cap U_j) \]
is Fréchet-smooth. Two atlases \(( \phi _i, U_i, E)_{i \in I}\) and \(( \psi _j, V_j, E)_{j \in J}\) are called equivalent if all the transition maps \(\psi _j \circ \phi _i^{-1}\) and \(\phi _i \circ \psi _j^{-1}\) are Fréchet-smooth. We denote the corresponding equivalence class by \([{\mathcal{A}}]\). The union of the charts of all atlases in \([{\mathcal{A}}]\) is called the maximal atlas \({\mathcal{A}}_{\mathrm{max}}\). The triple \((B, E, {{\mathcal{A}}})\) is referred to as a smooth Banach manifold with differentiable structure provided by \({\mathcal{A}}_{\mathrm{max}}\).
Definition 2.6
Just as in the case of finite-dimensional manifolds, we call a function \(f: U\subseteq A \rightarrow B\) between two smooth Banach manifolds \((A, E, {\mathcal{A}})\) and \((B, G, {\mathcal{B}})\) (with \(U\subseteq A\) open) n-times (Fréchet) differentiable (resp. smooth) if for all combinations of charts \(\phi _a:U_a \rightarrow V_a\) and \(\phi _b: U_b \rightarrow V_b\) of some (and thus all) atlases \(\tilde{{\mathcal{A}}}\) in \([{\mathcal{A}}]\), respectively, \(\tilde{{\mathcal{B}}}\) in \([{\mathcal{B}}]\), the mapping \(\phi _b \circ f\circ \phi _a^{-1}: V_a\rightarrow V_b\) is n-times (Fréchet) differentiable (resp. smooth).
3 Smooth Banach manifold structure of \({\mathcal{F}}^{\mathrm{reg}}\)
In the definition of causal fermion systems, the number of positive or negative eigenvalues of the operators in \({\mathcal{F}}\) can be strictly smaller than n. This is important because it makes \({\mathcal{F}}\) a closed subspace of \(\mathrm{L}({\mathcal{H}})\) (with respect to the norm topology), which in turn is crucial for the general existence results for minimizers of the causal action principle (see [9] or [18]). However, in most physical examples in Minkowski space or in a Lorentzian spacetime, all the operators in M do have exactly n positive and exactly n negative eigenvalues. This motivates the following definition (see also [10, Definition 1.1.5]).
Definition 3.1
An operator \(x \in {\mathcal{F}}\) is said to be regular if it has the maximal possible rank, i.e., \(\dim x({\mathcal{H}}) = 2n\). Otherwise, the operator is called singular. A causal fermion system is regular if all its spacetime points are regular.
In what follows, we restrict attention to regular causal fermion systems. Moreover, it is convenient to also restrict attention to all those operators in \({\mathcal{F}}\) which are regular,
\[ {\mathcal{F}}^{\mathrm{reg}} := \big\{ x \in {\mathcal{F}} \;\big|\; x \text{ is regular} \big\} . \]
\({\mathcal{F}}^{\mathrm{reg}}\) is a dense open subset of \({\mathcal{F}}\) (again with respect to the norm topology on \(\mathrm{L}({\mathcal{H}})\)).
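The notions of regular and singular points can be illustrated in the toy case \({\mathcal{H}}={\mathbb{C}}^2\) and \(n=1\), where an operator in \({\mathcal{F}}\) is regular precisely if it has one positive and one negative eigenvalue. The sketch below (with hypothetical helper names and an assumed numerical tolerance) shows that the singular operator \(\mathrm{diag}(1,0)\) is approximated in the operator norm by the regular operators \(\mathrm{diag}(1,-\varepsilon )\), in accordance with the density statement above.

```python
import math

def hermitian_eigenvalues(x):
    """Real eigenvalues of a selfadjoint 2x2 matrix given as nested tuples."""
    a = complex(x[0][0]).real
    d = complex(x[1][1]).real
    b = complex(x[0][1])
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return (mean + radius, mean - radius)

def is_regular(x, n=1, tol=1e-12):
    """Regular means exactly n positive and n negative eigenvalues (rank 2n)."""
    eigs = hermitian_eigenvalues(x)
    positive = sum(1 for lam in eigs if lam > tol)
    negative = sum(1 for lam in eigs if lam < -tol)
    return positive == n and negative == n

singular_point = ((1.0, 0.0), (0.0, 0.0))
eps = 1e-3
nearby_point = ((1.0, 0.0), (0.0, -eps))   # operator-norm distance eps

print(is_regular(singular_point))   # False
print(is_regular(nearby_point))     # True
```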
3.1 Wave charts and symmetric wave charts
We now choose specific charts and prove that the resulting atlas endows \({\mathcal{F}}^{\mathrm{reg}}\) with the structure of a smooth Banach manifold (see Definition 2.5). In the finite-dimensional setting, these charts were introduced in [15]. We now recall their definition and generalize the constructions to the infinite-dimensional setting.
Given \(x \in {\mathcal{F}}^{\mathrm{reg}}\), we denote the image of x by \(I:=x({\mathcal{H}})\). We consider I as a 2n-dimensional Hilbert space with the scalar product induced from \(\langle .|. \rangle _{{\mathcal{H}}}\). Denoting its orthogonal complement by \(J:=I^{\perp }\), we obtain the orthogonal sum decomposition
\[ {\mathcal{H}} = I \oplus J . \]
This also gives rise to a corresponding decomposition of operators, like for example
\[ \mathrm{L}({\mathcal{H}}, I) = \mathrm{L}(I, I) \oplus \mathrm{L}(J, I) . \]
Given an operator \(\psi \in \mathrm{L}({\mathcal{H}}, I)\), we denote its adjoint by \(\psi ^{\dagger } \in \mathrm{L}(I, {\mathcal{H}})\); it is defined by the relation
\[ \langle \phi \,|\, \psi \, u \rangle _{{\mathcal{H}}} = \langle \psi ^{\dagger } \phi \,|\, u \rangle _{{\mathcal{H}}} \qquad \text{for all } u \in {\mathcal{H}},\; \phi \in I . \]
We now form the operator
\[ R_x(\psi ) := \psi ^{\dagger }\, x\, \psi \in \mathrm{L}({\mathcal{H}}) . \tag{3.2} \]
By construction, this operator is symmetric and has at most n positive and at most n negative eigenvalues. Therefore, it is an operator in \({\mathcal{F}}\). Using (3.1), we conclude that \(R_x\) is a mapping
\[ R_x : \mathrm{L}(I, I) \oplus \mathrm{L}(J, I) \rightarrow {\mathcal{F}} . \tag{3.3} \]
Before going on, it is useful to rewrite the operator \(R_x(\psi )\) in a slightly different way. On I, one can also introduce the indefinite inner product
\[ \prec .|. \succ _x \,:\, I \times I \rightarrow {\mathbb{C}} , \qquad \prec u \,|\, v \succ _x = -\langle u \,|\, x\, v \rangle _{{\mathcal{H}}} , \tag{3.4} \]
referred to as the spin inner product. For conceptual clarity, we denote I endowed with the spin inner product by \((S_x, \prec .|. \succ _x)\) and refer to it as the spin space at x (for more details on the spin spaces, we refer for example to [10, Section 1.1]). It is an indefinite inner product space of signature (n, n). We denote the adjoint with respect to the spin inner product by a star. More specifically, for a linear operator \(A \in \mathrm{L}(S_x)\), the adjoint is defined by
\[ \prec \phi \,|\, A\, \tilde{\phi } \succ _x = \prec A^* \phi \,|\, \tilde{\phi } \succ _x \qquad \text{for all } \phi , \tilde{\phi } \in S_x . \]
Using again the definition of the spin inner product (3.4), we can rewrite this equation as
\[ \langle \phi \,|\, X A\, \tilde{\phi } \rangle _{{\mathcal{H}}} = \langle A^* \phi \,|\, X\, \tilde{\phi } \rangle _{{\mathcal{H}}} , \]
where we introduced the short notation
\[ X := x|_{S_x} : S_x \rightarrow S_x . \tag{3.5} \]
Taking adjoints in the Hilbert space \({\mathcal{H}}\) gives
\[ X A = (A^*)^{\dagger }\, X \]
(note that the operator X is invertible because \(S_x\) is by definition its image). We thus obtain the relation
\[ A^* = X^{-1} A^{\dagger } X . \tag{3.6} \]
Using such transformations, one readily verifies that identifying the image of \(\psi \) with a subspace of \(S_x\), the right side of (3.2) can be written as \(-\psi ^* \psi \) (for details, see [15, Lemma 2.2]). Thus, with this identification, the operator \(R_x\) can be written instead of (3.2) and (3.3) in the equivalent form
\[ R_x : \mathrm{L}({\mathcal{H}}, S_x) \rightarrow {\mathcal{F}} , \qquad R_x(\psi ) := -\psi ^* \psi , \tag{3.7} \]
where \(\psi ^*\) is the adjoint with respect to the corresponding inner products, i.e.,
\[ \prec \phi \,|\, \psi \, u \succ _x = \langle \psi ^* \phi \,|\, u \rangle _{{\mathcal{H}}} \qquad \text{for all } u \in {\mathcal{H}},\; \phi \in S_x . \]
We want to use the operator \(R_x\) in order to construct local parametrizations of \({\mathcal{F}}^{\mathrm{reg}}\). The main difficulty is that the operator \(R_x\) is not injective. For an explanation of this point in the context of local gauge freedom, we refer to [15]. Here, we merely explain how to arrange that \(R_x\) becomes injective. We let \(\mathrm{Symm}(S_x) \subseteq \mathrm{L}(S_x)\) be the real vector space of all operators A on \(S_x\) which are symmetric with respect to the spin inner product, i.e.,
\[ \prec A\, u \,|\, v \succ _x = \prec u \,|\, A\, v \succ _x \qquad \text{for all } u, v \in S_x . \]
We now restrict the operator \(R_x\) in (3.3) and (3.7) to
\[ R_x^{\mathrm{symm}} := R_x|_{\mathrm{Symm}(S_x) \oplus \mathrm{L}(J,S_x)} : \mathrm{Symm}(S_x) \oplus \mathrm{L}(J,S_x) \rightarrow {\mathcal{F}} . \tag{3.8} \]
We write the direct sum decomposition as
\[ \psi = \psi _I + \psi _J \qquad \text{with } \psi _I \in \mathrm{Symm}(S_x),\; \psi _J \in \mathrm{L}(J,S_x) . \]
Extending the analysis in [15, Section 6.1] to the infinite-dimensional setting, one finds that this mapping is a local parametrization of \({\mathcal{F}}^{\mathrm{reg}}\):
Theorem 3.2
There is an open neighborhood \(W_x\) of \((\mathrm{id}_{S_x}, 0) \in \mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)\) such that the restriction of \(R_x^{\mathrm{symm}}\) maps to an open subset \(\Omega _x :=R_x^{\mathrm{symm}}(W_x)\) of \({\mathcal{F}}^{\mathrm{reg}}\),
\[ R_x^{\mathrm{symm}}|_{W_x} : W_x \rightarrow \Omega _x \subseteq {\mathcal{F}}^{\mathrm{reg}} , \]
and is a homeomorphism to its image (always with respect to the topology induced by the operator norm on \(\mathrm{L}({\mathcal{H}})\)).
Proof
The estimate
shows that \(R_x^{\mathrm{symm}}\) is continuous. Since the point \(R_x^{\mathrm{symm}}(\mathrm{id}_{S_x}, 0)=x \in {\mathcal{F}}^{\mathrm{reg}}\) is regular, by continuity, we may choose an open neighborhood \(W_x\) of \((\mathrm{id}_{S_x}, 0)\) such that \(R_x\) maps to \({\mathcal{F}}^{\mathrm{reg}}\).
In order to show that \(R_x^{\mathrm{symm}}\) is bijective, we begin with the formula for \(\phi _x\) as derived in [15, Proposition 6.6], which will turn out to be the inverse of \(R_x^{\mathrm{symm}}\). It has the form
where P(x, y) (the kernel of the fermionic projector) and \(A_{xy}\) (the closed chain) are defined by
\[ P(x,y) := \pi _x\, y|_{S_y} : S_y \rightarrow S_x , \qquad A_{xy} := P(x,y)\, P(y,x) : S_x \rightarrow S_x . \]
Our task is to show that for a sufficiently small open neighborhood \(\Omega _x\) of x, this formula defines a continuous mapping
and that the compositions
are both the identity (showing that \(\phi _x\) is indeed the inverse of \(R_x^{\mathrm{symm}}\)).
In preparation, we rewrite the formula (3.10) as
where we again used the notation (3.5). Choosing \(y=x\), the operator \(X^{-1} \,\pi _x y |_{S_x}\) is the identity on \(S_x\). We first choose an open neighborhood \(\tilde{\Omega }_x\) of x so small that for any \(y \in \tilde{\Omega }_x\),
\[ \big\Vert \mathrm{id}_{S_x} - X^{-1}\, \pi _x\, y|_{S_x} \big\Vert < \frac{1}{2} . \tag{3.14} \]
Then, the square root as well as the inverse square root of \(A=X^{-1}\pi _x y\) are well defined for all \(y\in \tilde{\Omega }_x\) by the respective power series,
\[ \sqrt{A} = \sum _{n=0}^\infty \left( {\begin{array}{c} 1/2 \\ n \end{array}} \right) \big( A - \mathrm{id}_{S_x} \big)^n , \qquad A^{-\frac{1}{2}} = \sum _{n=0}^\infty \left( {\begin{array}{c} -1/2 \\ n \end{array}} \right) \big( A - \mathrm{id}_{S_x} \big)^n , \]
with the generalized binomial coefficients given for \(\beta \in \mathbb{R}\) and \(n\in \mathbb{N}\) by
\[ \left( {\begin{array}{c} \beta \\ n \end{array}} \right) := \frac{\beta \, (\beta -1) \cdots (\beta -n+1)}{n!} \quad \text{if } n \ge 1 , \qquad \left( {\begin{array}{c} \beta \\ 0 \end{array}} \right) := 1 , \]
as for both power series the radius of convergence equals one. Moreover, note that all square roots, inverse square roots, etc., appearing in the following are well defined as they are always applied to operators within their radius of convergence. We conclude that the mapping \(\phi _x\) is well defined and continuous on \(\tilde{\Omega }_x\). Now, by possibly shrinking \(W_x\), we can arrange that \(\Omega _x:=R_x^{\mathrm{symm}}(W_x)\) lies in \(\tilde{\Omega }_x\). Note that it now suffices to show that \(\phi _x|_{\Omega _x}\) is the inverse of \(R_x^{\mathrm{symm}}|_{W_x}\), because then the set \(\Omega _x=(\phi _x|_{\tilde{\Omega }_x})^{-1}(W_x)\) is open.
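The definition of the square root by a power series can be tested numerically. The following minimal sketch sums the binomial series \(\sum _k \binom{\beta }{k} (A-\mathrm{id})^k\) for a real \(2 \times 2\) matrix A close to the identity and checks that the results for \(\beta =1/2\) and \(\beta =-1/2\) behave like a square root and an inverse square root. The truncation order, the test matrix and all function names are assumptions made for illustration; the sketch does not replace the convergence argument given above.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def gen_binom(beta, k):
    """Generalized binomial coefficient beta (beta-1) ... (beta-k+1) / k!."""
    result = 1.0
    for j in range(k):
        result *= (beta - j) / (j + 1)
    return result

def binomial_power(a, beta, terms=80):
    """Truncated series sum_k binom(beta, k) (a - id)^k, assuming ||a - id|| < 1."""
    n = len(a)
    delta = [[a[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)  # holds (a - id)^k during the k-th iteration
    for k in range(terms):
        coeff = gen_binom(beta, k)
        for i in range(n):
            for j in range(n):
                total[i][j] += coeff * power[i][j]
        power = mat_mul(power, delta)
    return total

A = [[1.2, 0.1], [0.1, 0.9]]          # operator norm of A - id is below 1/2
root = binomial_power(A, 0.5)
inverse_root = binomial_power(A, -0.5)

print(mat_mul(root, root))            # approximately A
print(mat_mul(root, inverse_root))    # approximately the identity
```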
In order to verify that \(\phi _x\) maps into \(\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)\), we restrict \(\phi _x(y)\) to \(S_x\),
A direct computation using (3.6) shows that the operator \(X^{-1}\pi _x y \pi _x|_{S_x}\) and hence also its square root are symmetric on \(S_x\).
It remains to compute the compositions in (3.12). First,
where in the last line, we applied (3.6) and used that \(\psi _I\) is symmetric on \(S_x\). Moreover,
Since the spectral calculus is invariant under similarity transformations, we know that for any invertible operator B on \(S_x\),
Hence,
(note that \(P(x,y) : S_y \rightarrow S_x\) is invertible in view of (3.14)). This concludes the proof. \(\square \)
The mapping \(\phi _x\), which already appeared in the proof of the previous theorem, can also be introduced abstractly to define the chart.
Definition 3.3
Setting
we obtain a chart \((\phi _x, \Omega _x)\), referred to as the symmetric wave chart about the point \(x \in {\mathcal{F}}^{\mathrm{reg}}\).
We remark that more general charts can be obtained by restricting \(R_x\) to another subspace of \(\mathrm{L}(I,S_x) \oplus \mathrm{L}(J,S_x)\), i.e., in generalization of (3.8),
where E is a subspace of \(\mathrm{L}(S_x)\) which has the same dimension as \(\mathrm{Symm}(S_x)\). The resulting charts \(\phi ^E_x\) are obtained by composition with a unitary operator \(U_x\) on \(S_x\), i.e.,
(for details and the connection to local gauge transformations, see [15, Section 6.1]). Since linear transformations are irrelevant for the question of differentiability, in what follows, we may restrict attention to symmetric wave charts.
3.2 A Fréchet smooth atlas
The goal of this section is to prove that the symmetric wave charts \((\phi _x, \Omega _x)\) form a smooth atlas of \({\mathcal{F}}^{\mathrm{reg}}\).
Theorem 3.4
(Symmetric wave atlas) The collection of all symmetric wave charts on \({\mathcal{F}}^{\mathrm{reg}}\) defines a Fréchet-smooth atlas of \({\mathcal{F}}^{\mathrm{reg}}\), endowing \({\mathcal{F}}^{\mathrm{reg}}\) with the structure of a smooth Banach manifold (see Definition 2.5).
Proof
We first verify that for any \(x\in {\mathcal{F}}^{\mathrm{reg}}\), the vector space \(\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)\) together with the operator norm of \(\mathrm{L}({\mathcal{H}},I)=\mathrm{L}({\mathcal{H}},S_x)\) is a Banach space. To this end, we note that this vector space coincides with the kernel of the mapping \(\psi \mapsto (X^{-1}\psi ^{\dagger } \pi _x X - \psi |_I)\) on \(\mathrm{L}({\mathcal{H}}, I)\). Since this mapping is continuous on \(\mathrm{L}({\mathcal{H}},I)\) (as one verifies by an estimate similar to (3.9)), its kernel is closed. As a consequence, the vector space \(\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)\) is a closed subspace of \(\mathrm{L}({\mathcal{H}},I)\) and thus indeed a Banach space.
We saw in Theorem 3.2 that for any \(x \in {\mathcal{F}}^{\mathrm{reg}}\), \((\phi _x, \Omega _x)\) defines a chart on \({\mathcal{F}}^{\mathrm{reg}}\). Since the \(\Omega _x\) clearly cover \({\mathcal{F}}^{\mathrm{reg}}\), it remains to show that all transition mappings are Fréchet-smooth. To this end, we first note that, for any \(x,y \in {\mathcal{F}}^{\mathrm{reg}}\) and \(\psi \in \phi _x(\Omega _x \cap \Omega _y)\),
Next, we define the mappings
(where the radius of the ball \(B_{1/2}(0)\) is taken with respect to the operator norm).
Recall that in the proof of Theorem 3.2 (more precisely (3.14)), we chose \(\Omega _y\) so small that \(\Vert \mathrm{id}_{S_y}- Y^{-1}\pi _yz|_{S_y} \Vert <1/2\) for any \(z\in \Omega _y\). Thus, since for any \(\psi \in \phi _x(\Omega _x \cap \Omega _y)\) we have \(\psi ^{\dagger }X\psi =\phi ^{-1}_x(\psi )\in \Omega _y\), we obtain \(\tilde{B}_{xy}(\phi _x(\Omega _x \cap \Omega _y)) \subseteq B_{1/2}(\mathrm{id}_{S_y})\). Therefore, we can write the transition mapping \(\phi _y \circ \phi _x^{-1}\) as
Now note that for the Fréchet derivative, we consider all vector spaces here as real Banach spaces, but still with the canonical operator norm induced by \(\Vert .\Vert _{{\mathcal{H}}}\). In view of the chain rule for Fréchet derivatives (for details, see Lemma 6.2 in Appendix 1) and the properties of the Fréchet derivative in Lemma 6.1 in Appendix 1, it remains to show that the mappings W, \(B_{xy}\) and \(\tilde{B}_{xy}\) are Fréchet-smooth (note that the composition operator of \(\mathbb{R}\)-linear mappings is also always Fréchet-smooth as it defines a bounded \(\mathbb{R}\)-bilinear map and the map \(\mathrm{L}(S_y)\ni y \mapsto \mathrm{id}_{S_y}-y\in \mathrm{L}(S_y)\) is clearly Fréchet-smooth as well). For W, this is clear due to [21, pp. 40–42] (note that \(\mathrm{L}(S_y)\) obviously defines a finite-dimensional unital Banach algebra). Moreover, the mappings \(B_{xy}\) and \(\tilde{B}_{xy}\) are obviously \(\mathbb{R}\)-bilinear and bounded and thus Fréchet-smooth. \(\square \)
3.3 The tangent bundle
Having endowed \({\mathcal{F}}^{\mathrm{reg}}\) with a canonical smooth Banach manifold structure, the next step is to consider its tangent bundle. For finite-dimensional manifolds, the tangent space can be defined either by equivalence classes of curves or by derivations, and these two definitions coincide (see for example [24, Chapter 2]). In infinite dimensions, however, this is no longer the case: In general, the derivation-tangent vectors (usually called operational tangent vectors) form a larger class than the curve-tangent vectors (called kinematic tangent vectors). There might even be operational tangent vectors that depend on higher-order derivatives of the inserted function (while the kinematic tangent vectors interpreted as directional derivatives only involve the first derivatives); for details on such issues, see for example [22, Sections 28 and 29] or [2, pp. 3–6]. It turns out that for the applications we have in mind, it is preferable to define tangent vectors as equivalence classes of curves. Indeed, as we shall see, with this definition, the usual computation rules remain valid. More specifically, the tangent vectors of \({\mathcal{F}}^{\mathrm{reg}}\) are compatible with the Fréchet derivative, and each fiber of the corresponding tangent bundle can be identified with the underlying Banach space
\[ V_x := \mathrm{Symm}(S_x) \oplus \mathrm{L}(J,S_x) \]
with respect to the chart \(\phi _x\).
Following [22, p. 284], we begin with the abstract definition of the (kinematic) tangent bundle, which makes it easier to see the topological structure. Afterward, we will show that this notion indeed agrees with equivalence classes of curves. Given \(x' \in {\mathcal{F}}^{\mathrm{reg}}\), we consider the set \(\Omega _{x'} \times V_{x'} \times \{x'\}\) (endowed with the topology inherited from the direct sum of Banach spaces). We take the disjoint union
and introduce the equivalence relation
For clarity, we point out that the first entry represents the point of the Banach manifold \({\mathcal{F}}^{\mathrm{reg}}\), whereas the third entry labels the chart.
Definition 3.5
We define the tangent bundle \(T{\mathcal{F}}^{\mathrm{reg}}\) as the quotient space with respect to this equivalence relation,
The canonical projection is given by
For every \(x \in {\mathcal{F}}^{\mathrm{reg}}\), the tangent space at x is defined by
Note that, each \(T_x{\mathcal{F}}^{\mathrm{reg}}\) has a canonical vector space structure in the following sense: Since all equivalence classes in \(T_x{\mathcal{F}}^{\mathrm{reg}}\) have a representative of the form \([x,\mathbf{v },x]\), this representative can be identified with \(\mathbf{v }\in V_x\). In this way, we obtain an identification of \(T_x{\mathcal{F}}^{\mathrm{reg}}\) with \(V_x\).
The tangent bundle is again a Banach manifold, as we now explain. For any \(x \in {\mathcal{F}}^{\mathrm{reg}}\), the mapping
has the inverse
On \(T{\mathcal{F}}^{\mathrm{reg}}\), we choose the coarsest topology with the property that the natural projections of these mappings to \(\Omega _x\) and \(V_x\) are both continuous (where on \(\Omega _x\) and \(V_x\), we choose the topology induced by the norm topology of \(\mathrm{L}({\mathcal{H}})\)). With this topology, the mapping \((\phi _x, D\phi _x)\) defines a chart of \(T{\mathcal{F}}^{\mathrm{reg}}\). For any \((\psi ,\mathbf{v })\in (\phi _y, D\phi _y)\left(\pi ^{-1}(\Omega _x)\cap \pi ^{-1}(\Omega _y)\right)\), the transition mappings are given by
Proposition 3.6
\(T{\mathcal{F}}^{\mathrm{reg}}\) is again a Banach manifold.
Proof
We need to show that the transition maps are Fréchet-smooth. This is clear for the first component because the transition mappings \(\phi _x \circ \phi _y^{-1}\) are Fréchet-smooth and fiberwise linear. The second component can be considered as the composition of the insertion map
(which is obviously continuous and bilinear and thus Fréchet-smooth, for details, see Lemma 6.1 in Appendix 1) with the mapping \(W_y\times V_y \ni (\psi ,\mathbf{v }) \mapsto ((\phi _x \circ \phi _y^{-1})'|_{\psi },\mathbf{v })\in \mathrm{L}(V_x,V_y)\times V_y\), which is Fréchet-smooth due to the Fréchet-smoothness of the transition mappings. \(\square \)
In what follows, we will sometimes use the notation
which also clarifies the independence of the choice of representatives.
Lemma 3.7
For any \(x \in {\mathcal{F}}^{\mathrm{reg}}\), the mapping
is a local trivialization.
Proof
We need to verify the properties of a local trivialization. Clearly, the operator \(\pi \circ \psi _x\) is the projection to the first component, and for fixed \(y \in \Omega _x\), the mapping \(\mathbf{v }\mapsto \psi _x(y,\mathbf{v })=[y,\mathbf{v },x]=[y, (\phi _y \circ \phi _x^{-1})'|_{\phi _x(x)}\mathbf{v },y]\) corresponds to \(\mathbf{v }\mapsto (\phi _y \circ \phi _x^{-1})'|_{\phi _x(x)}\mathbf{v }\) (by the identification of \(T_y{\mathcal{F}}^{\mathrm{reg}}\) with \(V_y\) from before), which is obviously an isomorphism of vector spaces in view of Lemma 6.1 (vi). \(\square \)
To summarize, the Banach manifold \({\mathcal{F}}^{\mathrm{reg}}\) has similar properties as in the finite-dimensional case.
We now explain how the above definition of tangent vectors relates to the equivalence classes of curves (following [22, p. 285]):
Remark 3.8
(equivalence classes of curves) On curves \(\gamma , \tilde{\gamma } \in C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}})\), we consider the equivalence relation \(\gamma \sim \tilde{\gamma }\) defined by the conditions that \(\gamma (0) = \tilde{\gamma }(0)\) and that in a chart \(\phi _x\) with \(\gamma (0) \in \Omega _x\), the relation \((\phi _x\circ \gamma )'|_0= (\phi _x\circ \tilde{\gamma })'|_0\) holds. Note that, if the last relation holds in one chart, then it also holds in any other chart \(\phi _y\) with \(\gamma (0)\in \Omega _y\) because, due to the chain rule,
Now we can identify \(C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}}) / \sim \) with \(T{\mathcal{F}}^{\mathrm{reg}}\) via the mapping
which is bijective with inverse (for details, see [22, p. 285])
where \(\xi _{\mathbf{v }} \in C_0^{\infty }(\mathbb{R})\) is a smooth cutoff function with \(0\le \xi _{\mathbf{v }} \le 1\). Moreover, \(\mathrm{supp}(\xi _{\mathbf{v }})\subseteq (-\varepsilon ,\varepsilon )\) and \(\xi _{\mathbf{v }}|_{(-\varepsilon /2,\varepsilon /2)}\equiv 1\) with \(\varepsilon >0\) chosen so small that
Note that, in (3.16), the tangent vector at \(\gamma (0)\) was expressed in the specific chart \((\phi _{\gamma (0)}, \Omega _{\gamma (0)})\). However, the tangent vector can also be represented in another chart as follows. Let \(x \in {\mathcal{F}}^{\mathrm{reg}}\) and \([x,\mathbf{v },z] \in T_x{\mathcal{F}}^{\mathrm{reg}}\) be arbitrary. We say that a curve \(\gamma \in C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}})\) represents \([x,\mathbf{v },z]\) if in one chart \(\phi _y\) with \(x \in \Omega _y\) (and thus any chart, as one can show using the chain rule just as before) it holds that
In order to show independence of y, let \(w\in {\mathcal{F}}^{\mathrm{reg}}\) with \(x \in \Omega _w\). Then,
and thus,
Hence, if (3.17) holds in one chart, it also holds in any other chart around x. \(\square \)
Remark 3.9
(directional derivatives) Let \(\gamma \in C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}})\) be a curve that represents \([x,\mathbf{v },z]\). We define the directional derivative of a Fréchet-differentiable function \(f: {\mathcal{F}}^{\mathrm{reg}} \rightarrow \mathbb{R}\) at x in the direction \([x,\mathbf{v },z]\) as
This definition is independent of the choice of the curve \(\gamma \). Indeed, for any chart \(\phi _w\) around x, we have
\(\square \)
We close this subsection with one last definition:
Definition 3.10
(Tangent vector fields) A tangent vector field on the Banach manifold \({\mathcal{F}}^{\mathrm{reg}}\) is, as in the finite-dimensional case, a Fréchet-smooth map \(\mathbf{v }: {\mathcal{F}}^{\mathrm{reg}} \rightarrow T{\mathcal{F}}^{\mathrm{reg}}\) such that \(\mathbf{v }(x) \in T_x{\mathcal{F}}^{\mathrm{reg}}\) (i.e. \(\pi (\mathbf{v }(x)) = x\)) for all \(x \in {\mathcal{F}}^{\mathrm{reg}}\). We denote the set of all tangent vector fields of \({\mathcal{F}}^{\mathrm{reg}}\) by \(\Gamma ({\mathcal{F}}^{\mathrm{reg}},T{\mathcal{F}}^{\mathrm{reg}})\).
We note that, according to this definition, multiplying a vector field by a Fréchet-smooth real-valued function again gives a vector field. In other words, the space of all tangent vector fields forms a module over the ring of Fréchet-smooth functions from \({\mathcal{F}}^{\mathrm{reg}}\) to \({\mathbb{R}}\).
3.4 A Riemannian metric
In this section, we show that the Hilbert–Schmidt scalar product gives rise to a canonical Riemannian metric on \({\mathcal{F}}^{\mathrm{reg}}\). For the constructions, it is most convenient to recover \({\mathcal{F}}^{\mathrm{reg}}\) as a Banach submanifold of the real Hilbert space \({\mathscr{S}}({\mathcal{H}})\) of all selfadjoint Hilbert–Schmidt operators on \({\mathcal{H}}\) endowed with the scalar product (\({\mathscr{S}}\) because of the second Schatten class; for details, see [7, Section XI.6])
Theorem 3.11
\({\mathcal{F}}^{\mathrm{reg}}\) is a smooth Fréchet submanifold of \({\mathscr{S}}({\mathcal{H}})\) in the following sense. Given \(x \in {\mathcal{F}}^{\mathrm{reg}}\), we choose \(\psi _0 \in \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I)\) with \(x = -\psi _0^* \psi _0\). Then, the mapping
(where the last matrix denotes a block operator on \({\mathcal{H}}=I \oplus J\)) is a local Fréchet-diffeomorphism at \((\psi _0, 0)\). Its local inverse takes the form
where \(\hat{W} = W_x \oplus {\mathscr{S}}(J)\), \(\hat{\Omega }_x:=\mathscr{R}(\hat{W})=\Omega _x+{\mathscr{S}}(J)\) (with \(W_x\) and \(\Omega _x\) as in Theorem 3.2), and \(\phi _x(\pi _x E)\) is defined in analogy to (3.13) by
(the fact that this maps to the symmetric operators on \(S_x\) is verified as in (3.15)).
Proof
A direct computation shows that \({\mathscr{R}}\) and \(\Phi \) are inverses of each other: In order to compute \(\mathscr{R}\circ \Phi \), we use the block operator notation
Then, there exist operators \(\tilde{E}_J, \hat{E}_J \in {\mathscr{S}}(J)\) such that \(E_{JJ}=\tilde{E}_J+\hat{E}_J\), and the operator
is contained in \(\Omega _x\). Note that, \(\phi _x E = \pi _x\tilde{E}\) and therefore \(-\phi _x(\pi _x E)^*\phi _x(\pi _x E)=\tilde{E}\). We conclude that
In order to compute \(\Phi \circ \mathscr{R}\), we take \((\psi , B)\in \hat{W}\) arbitrary and note that, due to the definition of \(\phi _x\) in (3.13) and Theorem 3.2, we have
(note that, the first two mappings \(\phi _x\) are the ones defined in this theorem, whereas the third mapping is the one from (3.13)). We thus obtain
Next, the mappings \({\mathscr{R}}\) and \(\Phi \) are Fréchet-smooth because for operators of finite rank (namely rank at most 2n), the operator norm is equivalent to the Hilbert–Schmidt norm. Indeed, for an operator \(A \, :\, H \rightarrow I\) mapping to a finite-dimensional Hilbert space I,
This concludes the proof. \(\square \)
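The norm equivalence used in the last step can be checked numerically on a small example. The following sketch is only an illustration and not part of the proof. It assumes real matrices, approximates the operator norm by a plain power iteration (a method chosen here for simplicity, which requires a start vector not orthogonal to the top singular direction), and compares the result with the Hilbert–Schmidt norm via the standard bounds \(\Vert A\Vert \le \Vert A\Vert _{\mathrm{HS}} \le \sqrt{r}\, \Vert A\Vert \) for an operator of rank at most r. The function names are hypothetical.

```python
import math

def hs_norm(a):
    """Hilbert-Schmidt (Frobenius) norm of a real matrix given as a list of rows."""
    return math.sqrt(sum(entry * entry for row in a for entry in row))

def op_norm(a, iterations=500):
    """Approximate the operator norm by power iteration on A^T A."""
    rows, cols = len(a), len(a[0])
    v = [1.0] * cols  # generic start vector (assumption of this sketch)
    for _ in range(iterations):
        av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(a[i][j] * av[i] for i in range(rows)) for j in range(cols)]
        length = math.sqrt(sum(x * x for x in w))
        if length == 0.0:
            return 0.0
        v = [x / length for x in w]
    av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in av))

# A rank-2 matrix mapping R^3 into R^2 (the finite-dimensional target space).
A = [[3.0, 0.0, 0.0],
     [0.0, 4.0, 0.0]]
rank = 2
print(op_norm(A), hs_norm(A), math.sqrt(rank) * op_norm(A))
```

For operators of rank at most 2n, the same bounds hold with r = 2n, so both norms generate the same topology.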
We consider a smooth curve
The corresponding equivalence class defines a tangent vector \([x,\mathbf{v },y] \in T_x {\mathcal{F}}^{\mathrm{reg}}\). On the other hand, considering \(\gamma \) as a curve in \({\mathscr{S}}\), it has the tangent vector
In the chart \(\phi _x\) and setting \(\psi _0 = \phi _x(x)\), the curve is parametrized by \(\psi (\tau ) := \phi _x \circ \gamma (\tau )\) with
and thus
As \(\psi _0=\phi _x(x)=\pi _x\), a direct computation (for details, see the proof of Lemma 7.6 in Appendix 2) shows that the map \(V_x \ni \mathbf{v }\mapsto -\mathbf{v }^*\psi _0-\psi ^*_0\mathbf{v }= -\mathbf{v }^*\pi _x-\pi ^*_x \mathbf{v }\) is injective. This makes it possible to write the tangent space as
Theorem 3.12
Using the identification (3.18), the mapping
defines a Fréchet-smooth Riemannian metric on \({\mathcal{F}}^{\mathrm{reg}}\). Moreover, the topology on \({\mathcal{F}}^{\mathrm{reg}}\) induced by the operator norm coincides with the topology induced by the Riemannian metric.
Proof
This follows immediately because \(g_x\) is the restriction of the Hilbert space scalar product to the smooth Fréchet submanifold \({\mathcal{F}}^{\mathrm{reg}}\). \(\square \)
We finally remark that the symmetric wave charts are related to Gaussian charts (see the formulas in [15, Sections 5 and 6.2], which apply to the infinite-dimensional case as well). Detailed computations for the Riemannian metric in symmetric wave charts are given in Appendix 2.
4 Differential calculus on expedient subspaces
If all functions arising in the analysis were Fréchet-smooth, all the methods and notions from the finite-dimensional setting could be adapted in a straightforward way to the infinite-dimensional setting. However, this procedure is not sufficient for our purposes, because the Lagrangian is not Fréchet-smooth. Therefore, we need to develop a differential calculus on Banach spaces for functions which are only Hölder continuous. Clearly, in general, such functions are not even Fréchet-differentiable, but the Gâteaux derivative may exist in certain directions. The disadvantage of Gâteaux derivatives is that the differentiable directions in general do not form a vector space. As a consequence, the usual computation rules like the linearity of the derivative or the chain and product rules cease to hold. Our strategy for preserving the usual computation rules is to work on suitable linear subspaces of the star-shaped set of all Gâteaux-differentiable directions, referred to as the expedient differentiable subspace.
4.1 The expedient differentiable subspaces
In this section, E and F denote Banach spaces.
Definition 4.1
Let \(U \subseteq E\) be open and \(f : U \rightarrow F\) an F-valued function. Moreover, let V be a subspace of E. The function f is k times V-differentiable at \(x_0 \in U\) if for every finite-dimensional subspace \(H \subseteq V\), the restriction of f to the affine subspace \(H+x_0\) denoted by
is k-times continuously differentiable at \(h=0\). If this condition holds, the subspace V is called k-admissible at \(x_0\).
Thus, a function f is once V-differentiable at \(x_0\) if for every finite-dimensional subspace \(H \subseteq V\), for every \(h_0\) in a small neighborhood of the origin,
and if \(Dg^H|_{h_0}\) is continuous in the variable \(h_0\) at \(h_0=0\). Equivalently, choosing a basis \(e_1, \ldots , e_L\) of H, this condition can be stated as the requirement that all partial derivatives
exist and are continuous at \(\alpha _1,\ldots , \alpha _L=0\). The higher differentiability of \(g^H\) can be defined inductively or, equivalently, by demanding that all partial derivatives up to the order k, i.e., all the functions
with \(i_1,\ldots , i_p \in \{1,\ldots , L\}\) and \(p \le k\), exist and are continuous at \(\alpha _1,\ldots , \alpha _L=0\).
An admissible subspace V is maximal if there are no admissible proper extensions \(\tilde{V} \supsetneq V\). The existence of maximal admissible subspaces is guaranteed by Zorn’s lemma, but maximal subspaces are in general not unique. In order to obtain a canonical subspace, we take the intersection of all maximal admissible subspaces:
Definition 4.2
The expedient k-differentiable subspace \({\mathcal{E}}^k(f,x_0)\) of f at \(x_0\) is defined as the intersection
Since the expedient differentiable subspace is again admissible at \(x_0\), we obtain a corresponding derivative as follows. Given \(k\in {\mathbb{N}}\) and vectors \(h_1, \ldots , h_k \in {\mathcal{E}}(f,x_0)\), we choose H as a finite-dimensional subspace which contains these vectors. We set
(where again \(g^H(h):=f(x_0+h)\)).
Lemma 4.3
This procedure defines \(D^{k,{\mathcal{E}}} f|_{x_0}\) canonically as a symmetric, multilinear mapping
Proof
In order to show that \(D^{k,{\mathcal{E}}} f|_{x_0}\) is well defined, let H and \(\tilde{H}\) be two finite-dimensional subspaces of \({\mathcal{E}}(f,x_0)\) which contain the vectors \(h_1, \ldots , h_k\). Then, expressing the derivatives in terms of partial derivatives, it follows that
This shows that the definition (4.1) does not depend on the choice of H.
The symmetry and homogeneity follow immediately from the corresponding properties of \(D^k g^H\) in (4.1). In order to prove additivity, we let \(h_1, \ldots , h_k \in {\mathcal{E}}^k(f,x_0)\) and \(\tilde{h}_1, \ldots , \tilde{h}_k \in {\mathcal{E}}^k(f,x_0)\). We let H be the span of all these vectors and use that the corresponding operator \(D^k g^H|_0\) in (4.1) applied to \(h_1+\tilde{h}_1, \ldots , h_k +\tilde{h}_k\) is multilinear. \(\square \)
Note that, the operator \(D^{k,{\mathcal{E}}} f|_{x_0}\) is in general not bounded. Moreover, \({\mathcal{E}}^k(f,x_0)\) will in general not be a closed subspace of E nor will it in general be dense.
4.2 Derivatives along smooth curves
We now analyze under which assumptions directional derivatives exist. To this end, we let I be an interval and \(\gamma : I \rightarrow E\) a smooth curve (here, the notions of Fréchet and Gâteaux smoothness coincide). Moreover, let \(t_0 \in I\) with \(x_0:=\gamma (t_0) \in U\) and \(U\subseteq E\) open. Given a function \(f : U \rightarrow F\), we consider the composition
Proposition 4.4
(chain rule) Assume that f is locally Hölder continuous at \(x_0\), meaning that there is a neighborhood \(V \subseteq U\) of \(x_0\) as well as constants \(\alpha , c>0\) such that
Moreover, assume that all the derivatives of \(\gamma \) at \(t_0\) up to the order
(where \(\lceil \cdot \rceil \) is the ceiling function) lie in the expedient differentiable subspace at \(x_0\), i.e.,
Then, the function \(f \circ \gamma \) is differentiable at \(t_0\) and
Proof
We consider the polynomial approximation of \(\gamma \)
By assumption, this curve lies in the affine subspace \({\mathcal{E}}(f,x_0)+x_0\). Using that the restriction of f to this subspace is continuously differentiable, it follows that
It remains to control the error term of the polynomial approximation. Using that f is locally Hölder continuous, we know that
Using that \(\gamma \) is smooth, it follows that
According to (4.3), we know that \(\alpha p \ge 1\). Therefore, the error term is of the order \(o(t-t_0)\), which shows that also the function \(t\mapsto (f\circ \gamma )(t) - (f\circ \gamma _p)(t)\) is differentiable with vanishing derivative. This proves the desired result. \(\square \)
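The role of the higher derivatives of \(\gamma \) can be seen in a two-dimensional toy example, sketched below under assumptions that are not taken from the text. We take \(E={\mathbb{R}}^2\), \(F={\mathbb{R}}\) and the hypothetical function \(f(x,y)=x+\sqrt{|y|}\), which is Hölder continuous with exponent \(\alpha =1/2\) and whose only nontrivial admissible subspace at the origin is the x-axis. Reading the order in (4.3) as \(p=2\) for this exponent (an assumption of this sketch), the curve \((t,t^3)\) satisfies the hypothesis, whereas \((t,t^2)\) does not because its second derivative leaves the x-axis.

```python
import math

def f(x, y):
    """Hoelder continuous with exponent 1/2; not differentiable in y at y = 0."""
    return x + math.sqrt(abs(y))

def gamma_good(t):
    """First and second derivatives at t = 0 are (1, 0) and (0, 0)."""
    return (t, t ** 3)

def gamma_bad(t):
    """Second derivative at t = 0 is (0, 2), which leaves the x-axis."""
    return (t, t ** 2)

def quotient(curve, t):
    """Difference quotient of f along the curve between 0 and t."""
    return (f(*curve(t)) - f(*curve(0.0))) / t

for t in (1e-2, 1e-4, 1e-6):
    print(t,
          quotient(gamma_good, t), quotient(gamma_good, -t),
          quotient(gamma_bad, t), quotient(gamma_bad, -t))
```

Along the first curve, the one-sided difference quotients both approach 1, which is the derivative of f in the direction \((1,0)\). Along the second curve, they approach 2 and 0, so the composition is not differentiable at \(t=0\).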
This result immediately generalizes to higher derivatives:
Proposition 4.5
(higher order chain rule) Assume that f is locally Hölder continuous at \(x_0\) (see (4.2)). Moreover, assume that all the derivatives of \(\gamma \) at \(t_0\) up to the order
lie in the expedient differentiable subspace at \(x_0\), i.e.,
Then, the function \(f \circ \gamma \) is q-times differentiable at \(t_0\), and the derivative can be computed with the usual product and chain rules (formula of Faà di Bruno).
Proof
We again consider f along the polynomial approximation \(\gamma _p\) (4.4) of the curve \(\gamma \). By assumption, this curve lies in a finite-dimensional subspace of the affine space
Using that the restriction of f to this subspace is continuously differentiable, we know that \(f\circ \gamma _p\) is q times continuously differentiable at \(t=t_0\), and the derivatives can be computed with the formula of Faà di Bruno,
Using (4.5) and (4.6), we conclude that
It follows that also this function is q-times differentiable and that all its derivatives vanish. This concludes the proof. \(\square \)
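For \(q=2\), the formula of Faà di Bruno reduces to \((f\circ \gamma )''(t_0) = D^2f|_{x_0}(\gamma '(t_0),\gamma '(t_0)) + Df|_{x_0}(\gamma ''(t_0))\). The following sketch checks this identity numerically for a smooth polynomial on \({\mathbb{R}}^2\). The function and the curve are hypothetical choices made for illustration, and since f is smooth here, every direction is differentiable, so the example only illustrates the computation rule and not the restriction to expedient subspaces.

```python
def f(x, y):
    """Smooth test function on R^2 (hypothetical choice)."""
    return x * x * y

def gamma(t):
    """Smooth curve with gamma(0) = (1, 0), gamma'(0) = (1, 1), gamma''(0) = (0, 2)."""
    return (1.0 + t, t + t * t)

def second_difference(h):
    """Central second difference of f along gamma at t = 0."""
    g = lambda t: f(*gamma(t))
    return (g(h) - 2.0 * g(0.0) + g(-h)) / (h * h)

def faa_di_bruno_second():
    """D^2 f(gamma', gamma') + D f(gamma'') at gamma(0), with derivatives computed by hand."""
    x, y = gamma(0.0)
    d1 = (1.0, 1.0)  # gamma'(0)
    d2 = (0.0, 2.0)  # gamma''(0)
    grad = (2.0 * x * y, x * x)
    hess = ((2.0 * y, 2.0 * x), (2.0 * x, 0.0))
    quad = sum(hess[i][j] * d1[i] * d1[j] for i in range(2) for j in range(2))
    lin = sum(grad[i] * d2[i] for i in range(2))
    return quad + lin

print(faa_di_bruno_second(), second_difference(1e-3))
```

Both numbers agree up to the discretization error of the finite difference.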
5 Application to causal fermion systems in infinite dimensions
5.1 Local Hölder continuity of the causal Lagrangian
The goal of this section is to prove the following result.
Theorem 5.1
The Lagrangian is locally Hölder continuous in the sense that for all \(x,y_0 \in {\mathcal{F}}\) there is a neighborhood \(U \subseteq {\mathcal{F}}\) of \(y_0\) and a constant \(c>0\) such that
where n is the spin dimension. Moreover, the integrand of the boundedness constraint is locally Lipschitz continuous in the sense that
We begin with a preparatory lemma.
Lemma 5.2
(Hölder continuity of roots) Let
be a complex monic polynomial of degree g with roots \(\lambda _1, \ldots , \lambda _g\). Then, there are constants \(C, \varepsilon >0\) such that any complex monic polynomial \(\tilde{{\mathcal{P}}}(\lambda ) =\lambda ^g + \tilde{c}_{g-1}\, \lambda ^{g-1} + \cdots + \tilde{c}_0\) of degree g which is close to \({{\mathcal{P}}}\) in the sense that
can be written as \(\tilde{{\mathcal{P}}}(\lambda ) = \prod _{i=1}^g (\lambda - \tilde{\lambda }_i)\) with
where \(p_i\) is the multiplicity of the root \(\lambda _i\).
This lemma is proven in a more general context in [4, Theorem 2]. To keep the presentation self-contained, we here give a simple proof based on Rouché’s theorem:
Proof of Lemma 5.2
After the rescaling \(\lambda \rightarrow \nu \lambda \) and \(\lambda _i \rightarrow \nu \lambda _i\) with \(\nu >0\), we can assume that all the roots \(\lambda _i\) are in the unit ball. Then, the polynomial \(\Delta {{\mathcal{P}}} := \tilde{{\mathcal{P}}} - {{\mathcal{P}}}\) is bounded in the ball of radius two by
We denote the minimal distance of distinct roots by
Since there is a finite number of roots, it clearly suffices to prove the lemma for one of them. Given \(i \in \{1, \ldots , g\}\), we choose
Next, we choose \(\varepsilon \) so small that \(\delta <D/2\). We consider the ball \(\Omega = B_{\delta }(\lambda _i)\). Then, for any \(\lambda \in \partial \Omega \), the polynomial \({{\mathcal{P}}}\) satisfies the bound
where we used (5.4) and (5.3). Therefore, Rouché’s theorem (see for example [27, Theorem 10.36]) implies that the polynomials \({{\mathcal{P}}}\) and \(\tilde{{\mathcal{P}}}\) have the same number of roots in the ball \(\Omega \). Thus, after a suitable ordering of the roots,
Using (5.4) gives the result. \(\square \)
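The dependence of the Hölder exponent on the multiplicity can be made explicit for a double root. The sketch below (an illustration with hypothetical function names, not part of the proof) perturbs the constant coefficient of \((\lambda -1)^2\) by \(\varepsilon \) and measures how far the roots move. The displacement equals \(\varepsilon ^{1/2}\), consistent with an exponent of one over the multiplicity, which is how we read the statement of the lemma.

```python
import cmath

def quadratic_roots(c1, c0):
    """Roots of the monic polynomial lambda^2 + c1*lambda + c0."""
    disc = cmath.sqrt(c1 * c1 - 4.0 * c0)
    return ((-c1 + disc) / 2.0, (-c1 - disc) / 2.0)

def root_displacement(eps):
    """Largest distance from the double root 1 of (lambda - 1)^2
    to the roots of the perturbed polynomial (lambda - 1)^2 - eps."""
    return max(abs(r - 1.0) for r in quadratic_roots(-2.0, 1.0 - eps))

for eps in (1e-2, 1e-4, 1e-6):
    d = root_displacement(eps)
    # The ratio d / eps**0.5 stays close to 1, i.e. d scales like eps**(1/2).
    print(eps, d, d / eps ** 0.5)
```

A Lipschitz bound in \(\varepsilon \) cannot hold here, because the ratio of the displacement to \(\varepsilon \) grows like \(\varepsilon ^{-1/2}\) as \(\varepsilon \rightarrow 0\).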
Proof of Theorem 5.1
Let \(x, y \in {\mathcal{F}}\). Since both operators x and y vanish on the orthogonal complement of the span of their images, \(J :=\text{span}(S_x, S_y)\), it suffices to compute the eigenvalues on the finite-dimensional subspace J. Choosing an orthonormal basis of \(S_x=x({\mathcal{H}})\) and extending it to an orthonormal basis of J, the matrix \(x y|_J- {\mathbb {1}}_J\) has the block matrix form
Therefore, its characteristic polynomial is given by
This consideration shows that it suffices to analyze the operators \(x y \pi _x\) and similarly \(x \tilde{y} \pi _x\) on the finite-dimensional Hilbert space \(x({\mathcal{H}})\). We denote the corresponding characteristic polynomials by \({{\mathcal{P}}}\) and \(\tilde{{\mathcal{P}}}\), respectively. They are monic polynomials of degree \(g:= \dim x({\mathcal{H}})\). The difference of these polynomials can be estimated in terms of operator norms on \(\mathrm{L}({\mathcal{H}})\) as follows,
valid for all \(\tilde{y}\) with \(\Vert \tilde{y}\Vert \le 2 \,\Vert y\Vert \). According to Lemma 5.2, for sufficiently small \(\Vert y-\tilde{y}\Vert \), the eigenvalues of these matrices can be arranged to satisfy the inequalities
In order to prove (5.2), we consider the estimate
and use that \(g \le 2n\).
It remains to prove (5.1). In the case \(g<2n\), a simple estimate similar to (5.5) gives the result. In the remaining case \(g=2n\), using the abbreviation \(\Delta \lambda _i := \tilde{\lambda }_i - \lambda _i\), we obtain
where in the last step we used that whenever \(\lambda _i \ne \lambda _j\), the multiplicities of both roots are at most \(g-1\). The inequality
yields the desired Hölder inequality with exponent \(1/(2n-1)\). Finally, it is clear from the construction that the constant depends continuously on y. This concludes the proof. \(\square \)
In the case of spin dimension one, the Lagrangian is Lipschitz continuous, in agreement with the findings in [20]. If the spin dimension is larger, one still has Hölder continuity, but the Hölder exponent becomes smaller if the spin dimension is increased. This can be understood from the fact that the higher the spin dimension is, the higher the degeneracies of the eigenvalues of xy can be.
We next prove a global Hölder continuity result.
Theorem 5.3
(Global Hölder continuity) There is a constant c(n) which depends only on the spin dimension such that for all \(x,y \in {\mathcal{F}}\) with \(y \ne 0\), there is a neighborhood \(U\subseteq {\mathcal{F}}\) of y with
Proof
Without loss of generality, we can assume that \(x \ne 0\). Moreover, using that both sides of the inequality (5.6) have the same scaling behavior under the rescaling
it suffices to consider the case that \(\Vert x\Vert =\Vert y\Vert =1\).
Next, choosing a fixed 4n-dimensional subspace \(I \subseteq {\mathcal{H}}\), we can always find a unitary transformation \(U: {\mathcal{H}}\rightarrow {\mathcal{H}}\) such that \(UxU^{-1}({\mathcal{H}}), UyU^{-1}({\mathcal{H}}) \subseteq I\). Since the Lagrangian and the operator norms are invariant under such joint unitary transformations (as they leave the eigenvalues of xy invariant), we can assume that both x and y map into the fixed finite-dimensional subspace I.
After these transformations, the operators x and y can be considered as operators in \(\mathrm{L}(I)\). Therefore, they lie in the compact set \(\overline{B_1(0)} \subseteq \mathrm{L}(I)\). Since the Hölder constant for the local Hölder continuity depends continuously on x and y, a compactness argument shows that we can choose the Hölder constant uniformly in x and y: As the previous arguments show, the local Hölder constant can be written as a continuous function \(c: \mathrm{L}(I)\times \mathrm{L}(I) \rightarrow {\mathbb{R}}^+,\, (x,y) \mapsto c(x,y)\). Since \(\overline{B_1(0)} \times \overline{B_1(0)} \subseteq \mathrm{L}(I)\times \mathrm{L}(I)\) is compact, the local Hölder constant function c is bounded on this set by a constant \(c_{\mathrm{max}}>0\), which can then be taken as the desired global Hölder constant. \(\square \)
Remark 5.4
(1) Since the Lagrangian is symmetric, Theorem 5.3 also gives rise to global Hölder continuity with respect to the other argument. Thus, for all \(x,y \in {\mathcal{F}}\) with \(x \ne 0\), there is a neighborhood \(U \subseteq {\mathcal{F}}\) of x such that
$$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(\tilde{x},y)| \le c(n) \Vert x\Vert ^{2-\frac{1}{2n-1}} \Vert y\Vert ^2 \Vert \tilde{x}-x\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,\tilde{x}\in U. \end{aligned}$$(5.7)
(2) As explained in the proof of Theorem 5.3, the Lagrangian \({\mathcal{L}}(x,y)\) depends only on the nonzero eigenvalues of xy and these coincide with the eigenvalues of \(xy\pi _x\). Thus, denoting
$$\begin{aligned} J:=\mathrm{span}(S_x,S_{\tilde{x}}) , \end{aligned}$$
we immediately obtain the following strengthened version of (5.7): Every \(x\ne 0\) has a neighborhood \(U \subset {\mathcal{F}}\) such that the inequality
$$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(\tilde{x},y)|&= |{\mathcal{L}}(x,\pi _J \,y\,\pi _J) -{\mathcal{L}}(\tilde{x},\pi _J \,y\, \pi _J)| \nonumber \\&\le c(n)\, \Vert x\Vert ^{2-\frac{1}{2n-1}} \,\Vert \pi _J \,y\, \pi _J\Vert ^2 \,\Vert \tilde{x} -x\Vert ^{\frac{1}{2n-1}} , \end{aligned}$$(5.8)
holds for all \(\tilde{x} \in U\) and all \(y \in {\mathcal{F}}\). This estimate will be needed for the proof of the chain rule for the integrated Lagrangian \(\ell \) in Theorem 5.9.
(3) In the case \(y=0\), a direct estimate of the eigenvalues shows that one has Hölder continuity with the improved exponent two,
$$\begin{aligned} \left| {\mathcal{L}}(x,\tilde{y}) \right| \le c(n) \, \Vert x\Vert ^2\, \Vert \tilde{y}\Vert ^2 . \end{aligned}$$
This inequality can be combined with the result of Theorem 5.3 to the statement that for all x, y there is a neighborhood \(U\subseteq {\mathcal{F}}\) of y with
$$\begin{aligned} | {\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) | \le c(n, y) \Vert x\Vert ^2\, \Vert \tilde{y}-y\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,\tilde{y}\in U. \end{aligned}$$(5.9)
Likewise, (5.8) generalizes to
$$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(\tilde{x},y)| \le c(n,x) \,\Vert \pi _J \,y\, \pi _J\Vert ^2 \,\Vert \tilde{x}-x\Vert ^{\frac{1}{2n-1}} . \end{aligned}$$(5.10)
This inequality will be used in the proof of Theorem 5.9.
\(\square \)
5.2 Definition of jet spaces
For the analysis of causal variational principles, the jet formalism was developed in [17]; see also [13, Section 2]. We now generalize the definition of the jet spaces to causal fermion systems in the infinite-dimensional setting. Our method is to work with the expedient subspaces, where for convenience derivatives at x are always computed in the corresponding chart \(\phi _x\). For example, for analyzing the differentiability of a real-valued function f at a point \(x \in {\mathcal{F}}^{\mathrm{reg}}\), we consider the composition
We introduce \(\Gamma ^{\mathrm{diff}}_{\rho }\) as the linear space of all vector fields for which the directional derivative of the function \(\ell \) exists in the sense of expedient subspaces (see Definition 4.2),
This gives rise to the jet space
We choose a linear subspace \(\mathfrak {J}^{\mathrm{test}}_{\rho } \subseteq \mathfrak {J}^{\mathrm{diff}}_{\rho }\) with the property that its scalar and vector components are both vector spaces,
and the scalar component is nowhere trivial in the sense that
It is convenient to consider a pair \(\mathfrak {u}:= (a, \mathbf{u })\) consisting of a real-valued function a on M and a vector field \(\mathbf{u }\) on \(T{\mathcal{F}}^{\mathrm{reg}}\) along M and to denote the combination of multiplication and directional derivative by
For the Lagrangian, being a function of two variables \(x,y \in {\mathcal{F}}^{\mathrm{reg}}\), we always work in charts \(\phi _x\) and \(\phi _y\), giving rise to the mapping
where E is the Cartesian product of Banach spaces
with the norm
(where the subscripts x and y clarify the dependence on the base points, i.e., \(I_x = x({\mathcal{H}})\), \(J_x = I_x^{\perp } \subseteq {\mathcal{H}}\) and similarly at y). We denote partial derivatives acting on the first and second arguments by subscripts 1 and 2, respectively. Throughout this paper, we use the following conventions for partial derivatives and jet derivatives:
\({\blacktriangleright}\) Partial and jet derivatives with an index \(i \in \{ 1,2 \}\), as for example in (5.15), only act on the respective variable of the function \({\mathcal{L}}\). This implies, for example, that the derivatives commute,
$$\begin{aligned} \nabla _{1,\mathfrak {v}} \nabla _{1,\mathfrak {u}} {\mathcal{L}}(x,y) = \nabla _{1,\mathfrak {u}} \nabla _{1,\mathfrak {v}} {\mathcal{L}}(x,y) . \end{aligned}$$
\({\blacktriangleright}\) The partial or jet derivatives which do not carry an index act as partial derivatives on the corresponding argument of the Lagrangian. This implies, for example, that
$$\begin{aligned} \nabla _{\mathfrak {u}} \int _{{\mathcal{F}}} \nabla _{1,\mathfrak {v}} \, {\mathcal{L}}(x,y) \, \mathrm{d}\rho (y) =\int _{{\mathcal{F}}} \nabla _{1,\mathfrak {u}} \nabla _{1,\mathfrak {v}}\, {\mathcal{L}}(x,y) \, \mathrm{d}\rho (y) . \end{aligned}$$
Definition 5.5
For any \(\ell \in {\mathbb{N}}_0 \cup \{\infty \}\), the jet space \(\mathfrak {J}_{\rho }^{\ell } \subseteq \mathfrak {J}_{\rho }\) is defined as the vector space of test jets with the following properties:
(i) The directional derivatives up to order \(\ell \) exist in the sense that
$$\begin{aligned} \mathfrak {J}^{\ell }_{\rho }&\subseteq \Big \{ (b,\mathbf{v }) \in \mathfrak {J}_{\rho } \,\Big|\, \left(\mathbf{v }(x), \mathbf{v }(y) \right) \in \Gamma ^{\ell }_{\rho }(x,y) \\&\qquad \text{for all}\,y \in M\ \text{and}\,x\ \text{in an open neighborhood of}\,M \subseteq {\mathcal{F}}^{\mathrm{reg}}\Big \} , \end{aligned}$$
where
$$\begin{aligned} \Gamma ^{\ell }_{\rho }(x,y) := {\mathcal{E}}^{\ell } \left({\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right), \left(\phi _x(x), \phi _y(y) \right) \right) . \end{aligned}$$
The higher jet derivatives are defined by using (5.13) and multiplying out, keeping in mind that the partial derivatives act only on the Lagrangian, i.e.,
$$\begin{aligned}&\nabla ^{p, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))} \left( \left(\mathfrak {v}_1(x), \mathfrak {v}_1(y) \right), \ldots , \left(\mathfrak {v}_p(x), \mathfrak {v}_p(y) \right) \right) \\&\quad := D^{p, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))} \left( \left(\mathbf{v }_1(x), \mathbf{v }_1(y) \right), \ldots , \left(\mathbf{v }_p(x), \mathbf{v }_p(y) \right) \right) \\&\qquad \; + \left(b_1(x)+b_1(y) \right) \, D^{p-1, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))}\\&\qquad \qquad \times \left( \left(\mathbf{v }_2(x), \mathbf{v }_2(y) \right), \ldots , \left(\mathbf{v }_p(x), \mathbf{v }_p(y) \right) \right) \\&\qquad \; + \left(b_2(x)+b_2(y) \right)\, D^{p-1, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))} \\&\qquad \qquad \times \left( \left(\mathbf{v }_1(x), \mathbf{v }_1(y) \right), \left(\mathbf{v }_3(x), \mathbf{v }_3(y) \right), \ldots , \left(\mathbf{v }_p(x), \mathbf{v }_p(y) \right) \right) \\&\qquad \; + \cdots + \left(b_1(x)+b_1(y) \right) \cdots \left(b_p(x)+b_p(y) \right)\, {\mathcal{L}}(x,y). \end{aligned}$$
(ii) The functions
$$\begin{aligned}&\left( \nabla _{1, \mathfrak {v}_1} + \nabla _{2, \mathfrak {v}_1} \right) \cdots \left( \nabla _{1, \mathfrak {v}_p} + \nabla _{2, \mathfrak {v}_p} \right) {\mathcal{L}}(x,y) \nonumber \\&\qquad := \nabla ^{p, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right) \big|_{(\phi _x(x), \phi _y(y))} \left( \left(\mathfrak {v}_1(x), \mathfrak {v}_1(y) \right), \ldots , \left(\mathfrak {v}_p(x), \mathfrak {v}_p(y) \right) \right) \end{aligned}$$(5.15)
are \(\rho \)-integrable in the variable y, giving rise to locally bounded functions in x. More precisely, these functions are in the space
$$\begin{aligned} L^{\infty }_{\mathrm{loc}}\left( M, L^1\left(M, d\rho (y) \right); d\rho (x) \right) . \end{aligned}$$
(iii) Integrating the expression (5.15) in y over M with respect to the measure \(\rho \), the resulting function g (defined for all x in an open neighborhood of M) is continuously differentiable in the direction of every jet \(\mathfrak {u}\in \mathfrak {J}^{\mathrm{test}}_{\rho }\), i.e.,
$$\begin{aligned} \Gamma ^{\mathrm{test}}_x \subseteq {\mathcal{E}}(g, x) \qquad \text{for all}\,x \in M. \end{aligned}$$
5.3 Derivatives of \({\mathcal{L}}\) and \(\ell \) along smooth curves
In this section, we use the chain rule in Proposition 4.4 in order to differentiate the Lagrangian \({\mathcal{L}}\) and the function \(\ell \) along smooth curves.
Theorem 5.6
Let \(\gamma _1\) and \(\gamma _2\) be two smooth curves in \({\mathcal{F}}^{\mathrm{reg}}\),
Setting \(x=\gamma _1(0)\) and \(y=\gamma _2(0)\), we assume that the tangent vectors up to the order \(p=2n-1\) denoted by
are in the expedient differentiable subspace of the Lagrangian, i.e.,
Then, the function \({\mathcal{L}}(\gamma _1(\tau ), \gamma _2(\tau ))\) is \(\tau \)-differentiable at \(\tau =0\) and the chain rule holds, i.e.,
Proof
We again consider the Lagrangian in the charts \(\phi _x\) and \(\phi _y\), (5.14). In order to show that this function is locally Hölder continuous on E, we begin with the estimate
Noting that the function
is bilinear and therefore Fréchet-smooth, it follows that
where \(\psi _x := \phi ^{-1}_x(x)\) and \(\psi _y := \phi ^{-1}_y(y)\). This proves local Hölder continuity on E. Applying Proposition 4.4 gives the result. \(\square \)
We remark that using Proposition 4.5, the above method could be generalized in a straightforward manner to higher derivatives.
Definition 5.7
We call \(\ell \) Hölder continuous with Hölder exponent \(\alpha \) along a smooth curve \(\gamma : I \rightarrow {\mathcal{F}}\) (with I an open interval) if for any \(t_0 \in I\) with \(x_0 = \gamma (t_0)\) there exists a subspace \(E_0 \subseteq \mathrm{Symm}S_{x_0} \oplus {\mathcal{L}}(J_{x_0},I_{x_0})\) and \(\delta >0\) such that the mapping
is well defined and locally Hölder continuous with Hölder exponent \(\alpha \).
Theorem 5.8
Let \(\gamma : I \rightarrow {\mathcal{F}}\) be a smooth curve and \(\ell \) Hölder continuous along \(\gamma \) with Hölder exponent \(\alpha \). For \(t_0 \in I\) with \(x_0 = \gamma (t_0)\), we set
If for any \(t_0\in I\), the derivatives of \(\gamma _{x_0}\) up to the order \(p:=\lceil q/\alpha \rceil \) lie in the expedient differentiable subspace at \(x_0\), i.e.,
then the function \(\ell \circ \gamma = \ell _{x_0} \circ \gamma _{x_0}\) is q-times differentiable at \(t_0\). Moreover, the usual product and chain rules hold for \(\ell _{x_0} \circ \gamma _{x_0}\).
Proof
Applying Proposition 4.5 to \(\ell _{x_0}\) and \(\gamma _{x_0}\) yields the claim, since the assumptions of that proposition are clearly fulfilled. \(\square \)
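As a quick arithmetic illustration of the order \(p=\lceil q/\alpha \rceil \) (a worked example added here, assuming the Hölder exponent \(\alpha = 1/(2n-1)\) stated in Theorem 5.9), note that \(q/\alpha = q\,(2n-1)\) is already an integer, so that
$$\begin{aligned} p = \left\lceil \frac{q}{\alpha } \right\rceil = q\,(2n-1), \qquad \text{hence}\quad p = 2n-1 \;\;\text{for}\;\; q=1 \quad \text{and}\quad p = 6 \;\;\text{for}\;\; n=2,\; q=2. \end{aligned}$$
In particular, for a first derivative (\(q=1\)) the required order coincides with the order \(p=2n-1\) appearing in Theorem 5.6.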
We now give a sufficient condition which ensures that \(\ell \) is Hölder continuous along \(\gamma \). This condition needs to be verified in the applications; see for example [25].
Theorem 5.9
Let \(\gamma \) be a smooth curve in \({\mathcal{F}}\) with
where P(x, y) is again the kernel of the fermionic projector (3.11) and Y is (similar to (3.5)) the invertible operator
Then the integrated Lagrangian \(\ell \) defined by (1.1) is Hölder continuous along \(\gamma \) with Hölder exponent \(\frac{1}{2n-1}\).
Proof
The idea of the proof is to integrate the estimate (5.10) over M. To this end, it is crucial to estimate the factor \(\Vert \pi _J y \pi _J \Vert \). We let \((\tilde{\phi }_i)_{i=1, \dots , m}\) be an orthonormal basis of J and denote the orthogonal projection on \(\mathrm{span}(\tilde{\phi }_i)\) by \(\pi _i\). Since on the finite-dimensional vector space L(J) all norms are equivalent, we can work with the Hilbert–Schmidt norm of \(\pi _J y\pi _J\), i.e., for a suitable constant \(C=C(n)\),
where in the last step, we used that the norm of an operator is the same as the norm of its adjoint. Combining this inequality with the estimate
we obtain
Using this estimate when integrating (5.10) over M and noting that \(\phi _{x}^{-1}\) is locally Lipschitz (since it is Fréchet-smooth) yields the claim. \(\square \)
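To illustrate why Hölder exponents of the form \(1/k\) are natural for quantities built from eigenvalues, we recall a standard example. It is not taken from the proof above, and the matrix \(A_{\varepsilon }\) is introduced here only for illustration. For \(k \ge 2\) and \(\varepsilon \ge 0\), let \(A_{\varepsilon }\) be the \(k \times k\) matrix with ones on the superdiagonal, the entry \(\varepsilon \) in the lower left corner and zeros elsewhere. Then \(\Vert A_{\varepsilon } - A_0 \Vert = \varepsilon \) in the operator norm, all eigenvalues of \(A_0\) vanish, and
$$\begin{aligned} \det \left( \lambda - A_{\varepsilon } \right) = \lambda ^k - \varepsilon , \qquad \text{so that}\qquad \left| \lambda _j(\varepsilon ) - \lambda _j(0) \right| = \varepsilon ^{\frac{1}{k}} = \Vert A_{\varepsilon } - A_0 \Vert ^{\frac{1}{k}} \quad \text{for}\; j=1,\ldots ,k. \end{aligned}$$
Hence, in this example the eigenvalues are Hölder continuous in the matrix with exponent \(1/k\), but with no larger exponent. General Hölder estimates for the roots of polynomials are given in [4].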
Notes
In this reference, everything is worked out in the case of Banach spaces, but the completeness is not needed for these results.
As explained in Remark 7.1, the trace of the finite-rank operators \(x\mathbf{u }x\mathbf{u }\), \(x\mathbf{u }\mathbf{u }^{\dagger }x\) and \(\mathbf{u }^{\dagger }x^2\mathbf{u }\) can indeed be computed in this way, since they all map into \((S_x+\mathbf{u }^{\dagger }(S_x))\).
Here a Riemannian metric on a Banach manifold is defined just as in the finite-dimensional case, but with smoothness understood with respect to the Fréchet derivative.
References
Link to web platform on causal fermion systems: www.causal-fermion-system.com
Beltiţă, D., Goliński, T., Tumpach, A.-B.: Queer Poisson brackets. J. Geom. Phys. 132, 358–362 (2018). arXiv:math-ph/1710.03057 [math.FA]
Bernard, Y., Finster, F.: On the structure of minimizers of causal variational principles in the non-compact and equivariant settings. Adv. Calc. Var. 7(1), 27–57 (2014). arXiv:1205.0403 [math-ph]
Brink, D.: Hölder continuity of roots of complex and \(p\)-adic polynomials. Comm. Algebra 38(5), 1658–1662 (2010)
Coleman, R.: Calculus on Normed Vector Spaces. Universitext. Springer, New York (2012)
Dieudonné, J.: Foundations of Modern Analysis, Academic Press, New York-London (1969). Enlarged and corrected printing, Pure and Applied Mathematics, Vol. 10-I
Dunford, N., Schwartz, J.T.: Linear Operators. Part II: Spectral Theory. Self Adjoint Operators in Hilbert Space, with the Assistance of William G. Bade and Robert G. Bartle. Wiley, New York (1963)
Finster, F.: A variational principle in discrete space–time: existence of minimizers. Calc. Var. Partial Differential Equations 29(4), 431–453 (2007). arXiv:math-ph/0503069
Finster, F.: Causal variational principles on measure spaces. J. Reine Angew. Math. 646, 141–194 (2010). arXiv:0811.2666 [math-ph]
Finster, F.: The Continuum Limit of Causal Fermion Systems, Fundamental Theories of Physics, vol. 186. Springer, Berlin (2016). arXiv:1605.04742 [math-ph]
Finster, F.: Causal fermion systems: a primer for Lorentzian geometers. J. Phys. Conf. Ser. 968, 012004 (2018). arXiv:1709.04781 [math-ph]
Finster, F., Jokel, M.: Causal fermion systems: an elementary introduction to physical ideas and mathematical concepts. In: Finster, F., Giulini, D., Kleiner, J., Tolksdorf, J. (eds.) Progress and Visions in Quantum Theory in View of Gravity, pp. 63–92. Birkhäuser, Basel (2020). arXiv:1908.08451 [math-ph]
Finster, F., Kamran, N.: Complex structures on jet spaces and bosonic Fock space dynamics for causal variational principles. Pure Appl. Math. Q. (2021). (to appear) arXiv:1808.03177 [math-ph]
Finster, F., Kamran, N., Oppio, M.: The linear dynamics of wave functions in causal fermion systems. J. Differential Equations (2021). (to appear) arXiv:2101.08673 [math-ph]
Finster, F., Kindermann, S.: A gauge fixing procedure for causal fermion systems. J. Math. Phys. 61, 082301 (2020). arXiv:1908.08445 [math-ph]
Finster, F., Kleiner, J.: Causal fermion systems as a candidate for a unified physical theory. J. Phys. Conf. Ser. 626, 012020 (2015). arXiv:1502.03587 [math-ph]
Finster, F., Kleiner, J.: A Hamiltonian formulation of causal variational principles. Calc. Var. Partial Differential Equations 56:73(3), 33 (2017). arXiv:1612.07192 [math-ph]
Finster, F., Kleiner, J., Treude, J.-H.: An Introduction to the Fermionic Projector and Causal Fermion Systems, in preparation, www.causal-fermion-system.com/intro-public.pdf
Finster, F., Langer, C.: Causal variational principles in the \(\sigma \)-locally compact setting: existence of minimizers. Adv. Calc. Var. (2021). (to appear) arXiv:2002.04412 [math-ph]
Finster, F., Schiefeneder, D.: On the support of minimizers of causal variational principles. Arch. Ration. Mech. Anal. 210, 321–364 (2013). arXiv:1012.1589 [math-ph]
Hilgert, J., Neeb, K.-H.: Structure and Geometry of Lie Groups. Springer Monographs in Mathematics. Springer, New York (2012)
Kriegl, A., Michor, P.W.: The Convenient Setting of Global Analysis, Mathematical Surveys and Monographs, vol. 53. American Mathematical Society, Providence (1997)
Lax, P.D.: Functional Analysis, Pure and Applied Mathematics (New York). Wiley-Interscience, New York (2002)
Lee, J.M.: Riemannian Manifolds: An Introduction to Curvature, Graduate Texts in Mathematics, vol. 176. Springer, New York (1997)
Oppio, M.: Hölder continuity of the integrated causal Lagrangian in Minkowski space, in preparation
Oppio, M.: On the mathematical foundations of causal fermion systems in Minkowski space. Ann. Henri Poincaré 22(3), 873–949 (2021). arXiv:1909.09229 [math-ph]
Rudin, W.: Real and Complex Analysis, 3rd edn. McGraw-Hill Book Co., New York (1987)
Werner, D.: Funktionalanalysis, 8th edn. Springer, Berlin (2018)
Zeidler, E.: Nonlinear Functional Analysis and its Applications. IV, Springer-Verlag, New York, (1988), Applications to mathematical physics, Translated from the German and with a preface by Jürgen Quandt
Acknowledgements
We are grateful to Olaf Müller, Marco Oppio, Johannes Wurm and the referee for helpful discussions. M.L. acknowledges support by the Studienstiftung des deutschen Volkes.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Appendices
Appendix 1: Properties of the Fréchet derivative
This appendix lists a set of properties and computation rules for Fréchet derivatives which are needed for the direct computations in Appendix 2. It turns out that most differentiation rules known from the finite-dimensional case generalize to Fréchet derivatives in a straightforward way.
Lemma 6.1
(Properties of the Fréchet derivative) Let V, W, Z be real normed vector spaces. Then the following Fréchet derivative rules hold:
-
(i)
Let \(U \subseteq V\) be open and \(f: U \rightarrow W\) be Fréchet-differentiable at \(x_0\in U\). Then f is continuous at \(x_0\) and \(Df|_{x_0}\) is well defined.
-
(ii)
Let \(f\in \mathrm{L}(V,W)\) be linear and bounded, then it is Fréchet-smooth at any \(x_0 \in V\) and \(Df|_{x_0} = f\).
-
(iii)
A continuous bilinear map \(B: V \times W \rightarrow Z\) is Fréchet-smooth at any \((v,w)\in V \times W\) and
$$\begin{aligned} DB|_{(v,w)}(h_v,h_w) = B(v,h_w) + B(h_v,w)\;,\;\;\; \forall \, (h_v,h_w)\in V \times W. \end{aligned}$$ -
(iv)
Chain rule: Let \(U_V \subseteq V\) and \(U_W \subseteq W\) be open, and let \(f: U_V \rightarrow W\), \(g: U_W \rightarrow Z\) be such that \(f(U_V)\subseteq U_W\). If f is Fréchet-differentiable at \(x_0\in U_V\) and g is Fréchet-differentiable at \(f(x_0)\in U_W\), then \(g\circ f\) is also Fréchet-differentiable at \(x_0\) and
$$\begin{aligned} D(g\circ f)|_{x_0} = Dg|_{f(x_0)} \circ Df|_{x_0}. \end{aligned}$$ -
(v)
Let \(W_1,...,W_n\) be real normed vector spaces, \(W:=W_1 \times W_2 \times ...\times W_n\) the product space, \(U\subseteq V\) open and \(f=(f_1,...,f_n): U \rightarrow W\) with \(f_i: U \rightarrow W_i\) for \(i=1,\dots ,n\). Then f is Fréchet-differentiable at \(x_0\in U\) if and only if each \(f_i\) is Fréchet-differentiable at \(x_0\). Moreover, in this case, we have \(Df|_{x_0} = (Df_1|_{x_0},...,Df_n|_{x_0})\).
-
(vi)
Let \(U_V\subseteq V\) and \(U_W \subseteq W\) be open and \(f: U_V \rightarrow U_W\) a homeomorphism with inverse \(g: U_W \rightarrow U_V\). If f is Fréchet-differentiable at \(x_0\in U_V\) and g is Fréchet-differentiable at \(y_0=f(x_0)\in U_W\), then \(Df|_{x_0}\) is an isomorphism with inverse \(Dg|_{y_0}\).
Proof
-
(i)
See [5, Prop. 2.2, Chapter 2.2].
-
(ii)
f is clearly Fréchet-differentiable with \(Df|_{x_0}=f\) for any \(x_0 \in V\), as \(\Vert f(x+h) -f(x) -fh\Vert _W=0\) for all \(x,h \in V\) (see also [6, pp. 149–150], and note that the completeness of the vector spaces is not needed for this result). Moreover, as \(Df: V \rightarrow L(V,W)\) is constant, all higher Fréchet derivatives of f vanish (and in particular f is Fréchet-smooth).
-
(iii)
B is Fréchet differentiable with the stated Fréchet derivative as
$$\begin{aligned}&\Vert B(v+h_v,w+h_w)-B(v,w)-B(v,h_w)-B(h_v,w)\Vert _Z=\Vert B(h_v,h_w)\Vert _Z \\&\le C \Vert h_v\Vert \cdot \Vert h_w\Vert \le C \left(\mathrm{max}(\Vert h_v\Vert ,\Vert h_w\Vert )\right)^2\;, \end{aligned}$$
for a fixed \(C>0\) (as B is continuous and bilinear); see also [6, pp. 149–150] (again, completeness is not needed). Moreover, since
$$\begin{aligned} DB: V \times W&\rightarrow L(V\times W, Z)\\ (v,w)&\mapsto \left( (h_v,h_w) \mapsto B(v,h_w) + B(h_v,w) \right)\;, \end{aligned}$$
is clearly bounded linear, B is Fréchet-smooth due to part (ii).
-
(iv)
See [5, Theorem 2.1, Chapter 2.3].
-
(v)
See [6, pp. 149–151] (again completeness is not needed).
-
(vi)
This follows immediately from the chain rule and part (ii) since
$$\begin{aligned} \mathrm{id}_V \overset{(ii)}{=} D(\mathrm{id}_V)|_{x_0} = D(g \circ f)|_{x_0} \overset{(iv)}{=} Dg|_{y_0} \circ Df|_{x_0}\;, \end{aligned}$$
and similarly \(\mathrm{id}_W = Df|_{x_0} \circ Dg|_{y_0}\).
\(\square \)
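As a worked instance of the rule for bilinear maps in part (iii) (an illustration added here; the symbols S, T, \(h_S\), \(h_T\) are ad hoc), take \(V=W=Z=\mathrm{L}({\mathcal{H}})\) with the operator norm and the composition \(B(S,T) := S\,T\). This map is bilinear and continuous because \(\Vert S\,T\Vert \le \Vert S\Vert \, \Vert T\Vert \). The rule then gives
$$\begin{aligned} DB|_{(S,T)}(h_S,h_T)&= S\,h_T + h_S\,T, \\ B(S+h_S,T+h_T)-B(S,T)-DB|_{(S,T)}(h_S,h_T)&= h_S\,h_T, \qquad \Vert h_S\,h_T\Vert \le \Vert h_S\Vert \, \Vert h_T\Vert . \end{aligned}$$
The remainder is thus quadratic in the increment, which is the estimate used in the proof with constant equal to one.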
Lemma 6.2
Let V, W and Z be real normed spaces, \(n\in \mathbb{N}\) arbitrary and \(U_V \subseteq V, U_W \subseteq W\) open subsets, \(f: U_V \rightarrow W\) n-times Fréchet-differentiable (Fréchet-smooth) and \(g: U_W \rightarrow Z\) n-times Fréchet-differentiable (Fréchet-smooth) such that \(f(U_V)\subseteq U_W\). Then, \(g\circ f\) is also n-times Fréchet-differentiable (respectively, Fréchet-smooth).
Proof
We show the result by induction over n following [6, p. 183]. The case \(n=1\) follows from the chain rule. Now, let \(n \ge 2\) be arbitrary and suppose that the claim holds for \(n-1\). Then the induction hypothesis yields that the mapping \(x \mapsto D(g\circ f)|_{x} = Dg|_{f(x)}\circ Df|_{x}\) is \((n-1)\)-times Fréchet-differentiable, because Dg, f and Df are at least \((n-1)\)-times Fréchet-differentiable and the composition operator \(\circ \) is even Fréchet-smooth (as it is bounded and bilinear; see Lemma 6.1 (iii)). Thus, \(g\circ f\) is n-times Fréchet-differentiable.
The smoothness result follows immediately from the result for n-times differentiability. \(\square \)
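For \(n=2\) the induction step can be made explicit. The following is a sketch, assuming that f and g are twice Fréchet-differentiable and that \(x \in U_V\) and \(u, v \in V\) are arbitrary. Differentiating \(x \mapsto Dg|_{f(x)}\circ Df|_{x}\) with the rule for bilinear maps (Lemma 6.1 (iii)) and the chain rule gives
$$\begin{aligned} D^2(g\circ f)|_{x}(u,v) = D^2g|_{f(x)}\left( Df|_{x}\,u,\, Df|_{x}\,v \right) + Dg|_{f(x)}\left( D^2f|_{x}(u,v) \right) . \end{aligned}$$
Both terms on the right involve only derivatives of f and g up to second order, in agreement with the statement of the lemma.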
The following lemma gives a useful computation rule for higher Fréchet derivatives (see also [6, pp. 179, 181]):
Lemma 6.3
Let V, W be real normed spaces, \(U\subseteq V\) open and \(f: U \rightarrow W\) n-times differentiable. Then, for any \(x_0 \in U\) and \(v_1,\cdots , v_n \in V\),
In particular, the map \(U\ni x \mapsto D^{(n-1)}f|_{x}(v_1,\cdots ,v_{n-1})\in W\) is Fréchet-differentiable.
Proof
We follow the idea of the proof given in [6, pp. 179, 181] and also make use of the symmetry result in Lemma 2.3. We first fix \(v_1,\dots ,v_n \in V\) and define a linear map by
which simply inserts all the \(v_1,\dots ,v_{n-1}\) in an \(A\in \mathrm{L}(V,W)\). Note that, \(E_{v_1,\dots ,v_{n-1}}\) is clearly bounded linear and thus Fréchet-smooth. So if we use the representation of \(D^{(n-1)}f\) as element of
the composition \(E_{v_1,\dots ,v_{n-1}}\circ D^{(n-1)}f\) is also Fréchet-differentiable at any \(x_0\in U\) with
where in the first step, we used the chain rule and Lemma 6.1 (ii). In the second step, we used the definition of \(E_{v_1,\dots ,v_{n-1}}\), whereas in the third step, we re-identified \(D^{n}f|_{x_0}\) with the corresponding multilinear mapping from \(V^n\) to W. Finally, in the last step, we used the symmetry of \(D^{n}f|_{x_0}\). \(\square \)
We finally state one last computation rule:
Lemma 6.4
Let V, W and Z be normed vector spaces, \(U\subseteq V\) open, \(f:U\rightarrow W\) n-times Fréchet-differentiable and \(A \in \mathrm{L}(W,Z)\). Then also the function \(A\circ f\) is n-times Fréchet-differentiable and
Proof
The Fréchet-differentiability follows immediately from Lemma 6.2, using that A is Fréchet-smooth. We show the identity (6.2) by induction over n: The case \(n=1\) follows immediately by the chain rule and Lemma 6.1 (ii). Now, let \(n\ge 2\) and assume that the statement holds for \(n-1\). Using the previous lemma (first step), the induction hypothesis (i.e. (6.2) for the \((n-1)\)-st derivative) in the second step, as well as the chain rule, Lemma 6.1 (ii) and the symmetry of \(f^{(n)}\), for all \(x_0\in U\) and \(v_1,\cdots , v_n \in V\) we obtain
\(\square \)
Appendix 2: The Riemannian metric in symmetric wave charts
In this appendix, we give a detailed computation of the Riemannian metric introduced in Sect. 3.4 in terms of the symmetric wave charts. Hereby, we adapt the methods in [15, Section 4] to the infinite-dimensional setting.
We begin by defining a distance function on \({\mathcal{F}}^{\mathrm{reg}}\) by
The trace operator involved here is well defined and can be expressed in any orthonormal basis \((e_i)_{i\in \mathbb{N}}\) of \({\mathcal{H}}\) by (for details, see for example [23, Section 30.2])
Moreover, note that d does indeed define a distance function on \({\mathcal{F}}^{\mathrm{reg}}\), since for any two \(x,y \in {\mathcal{F}}^{\mathrm{reg}}\) we have \(d(x,y)=\Vert x-y\Vert _{{\mathscr{S}}({\mathcal{H}})}\), where \(\Vert .\Vert _{{\mathscr{S}}({\mathcal{H}})}\) denotes the Hilbert–Schmidt norm (see for example [7, Section XI.6] or [28, pp. 321–322, 309–310]).
The following remark is a reminder of a calculation rule for the trace operator acting on operators with finite rank.
Remark 7.1
Let \(A\in \mathrm{L}({\mathcal{H}})\) be of finite rank and \(V \subseteq {\mathcal{H}}\) a finite-dimensional subspace containing the image of A, i.e., \(A({\mathcal{H}})\subseteq V\). Moreover, let \((e_i)_{1\le i\le k}\) be an orthonormal basis of V and \((\tilde{e}_i)_{i\in \mathbb{N}}\) an orthonormal basis of \(V^{\bot }\). Then we obtain an orthonormal basis \((\hat{e}_i)_{i\in \mathbb{N}}\) of \({\mathcal{H}}\) by setting \(\hat{e}_i:=e_i\) for \(i=1,\cdots , k\) and \(\hat{e}_{k+j}:=\tilde{e}_j\) for \(j\in \mathbb{N}\). Using this basis in (7.1), the trace of A reduces to:
\(\square \)
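The reduction described in Remark 7.1 can be checked directly on a rank-one operator. The following sketch assumes that (7.1) is the usual trace \(\mathrm{tr}(A) = \sum _i \langle \hat{e}_i | A \hat{e}_i \rangle _{{\mathcal{H}}}\) and that the scalar product is linear in its second argument; the vectors \(\varphi \) and \(\psi \) are introduced only for illustration. For \(\psi \in {\mathcal{H}}\) and \(\varphi \ne 0\), let \(A u := \langle \psi | u \rangle _{{\mathcal{H}}}\, \varphi \). Its image lies in \(V = \mathrm{span}(\varphi )\), so that \(k=1\) and \(e_1 = \varphi / \Vert \varphi \Vert \). Every basis vector \(\tilde{e}_j\) of \(V^{\bot }\) satisfies \(\langle \tilde{e}_j | A \tilde{e}_j \rangle _{{\mathcal{H}}} = \langle \psi | \tilde{e}_j \rangle _{{\mathcal{H}}}\, \langle \tilde{e}_j | \varphi \rangle _{{\mathcal{H}}} = 0\), so only the term with \(e_1\) contributes:
$$\begin{aligned} \mathrm{tr}(A) = \langle e_1 | A e_1 \rangle _{{\mathcal{H}}} = \langle \psi | e_1 \rangle _{{\mathcal{H}}}\, \langle e_1 | \varphi \rangle _{{\mathcal{H}}} = \frac{\langle \psi | \varphi \rangle _{{\mathcal{H}}}\, \langle \varphi | \varphi \rangle _{{\mathcal{H}}}}{\Vert \varphi \Vert ^2} = \langle \psi | \varphi \rangle _{{\mathcal{H}}} . \end{aligned}$$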
The next lemma is mostly based on [28, Satz VI.5.8] and states some more properties of the trace operator.
Lemma 7.2
(Properties of the trace)
-
(i)
Linearity: The trace operator \(\mathrm{tr}\) is linear.
-
(ii)
Boundedness: For a finite-dimensional subspace \(V\subseteq {\mathcal{H}}\), consider the corresponding subspace \(V_{\mathrm{L}}:=\{A \in \mathrm{L}({\mathcal{H}})\, | \, A({\mathcal{H}}) \subseteq V \} \subseteq \mathrm{L}({\mathcal{H}})\). Then, \(\mathrm{tr}|_{V_{\mathrm{L}}}\) is bounded.
-
(iii)
Cyclic Permutation: For \(x,y \in \mathrm{L}({\mathcal{H}})\) with x of finite rank it holds that:
$$\begin{aligned} \mathrm{tr}(xy)=\mathrm{tr}(yx). \end{aligned}$$ -
(iv)
Trace of adjoint: For any \(x \in \mathrm{L}({\mathcal{H}})\) of finite rank also \(x^{\dagger }\) is of finite rank and:
$$\begin{aligned} \mathrm{tr}(x^{\dagger })=\overline{\mathrm{tr}(x)}. \end{aligned}$$
Proof
(i): Follows from the definition of tr by (7.1), see also [28, Satz VI.5.8 (a)].
(iii) and (iv): See [28, Satz VI.5.8 (c),(b)].
(ii): Let \(A \in V_{\mathrm{L}}\). Then, as explained in Remark 7.1, choosing an orthonormal basis \((e_i)_{1\le i\le k}\) of V (so \(\dim (V)=k\)), we can estimate:
This concludes the proof. \(\square \)
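The cyclic permutation property and the formula for the trace of the adjoint in Lemma 7.2 can be verified by hand for rank-one operators. This is a sketch, assuming that the scalar product is linear in its second argument and using the standard fact that the operator \(u \mapsto \langle \psi | u \rangle _{{\mathcal{H}}}\, \varphi \) has trace \(\langle \psi | \varphi \rangle _{{\mathcal{H}}}\); the vectors a and b are introduced only for illustration. Let \(x u := \langle b | u \rangle _{{\mathcal{H}}}\, a\) and \(y \in \mathrm{L}({\mathcal{H}})\). Then \((xy)u = \langle y^{\dagger } b | u \rangle _{{\mathcal{H}}}\, a\), \((yx)u = \langle b | u \rangle _{{\mathcal{H}}}\, y a\) and \(x^{\dagger } u = \langle a | u \rangle _{{\mathcal{H}}}\, b\), so that
$$\begin{aligned} \mathrm{tr}(xy)&= \langle y^{\dagger } b | a \rangle _{{\mathcal{H}}} = \langle b | y a \rangle _{{\mathcal{H}}} = \mathrm{tr}(yx), \\ \mathrm{tr}(x^{\dagger })&= \langle a | b \rangle _{{\mathcal{H}}} = \overline{\langle b | a \rangle _{{\mathcal{H}}}} = \overline{\mathrm{tr}(x)} . \end{aligned}$$
By linearity of the trace, the same identities extend to finite sums of such rank-one operators.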
In the following lemma, we consider differentiability properties of a mapping E which corresponds to the square of the distance function d. Later, we want to use it to introduce the Riemannian metric as second Fréchet-derivative of E.
Lemma 7.3
The mappings:
and for any fixed \(x \in {\mathcal{F}}^{\mathrm{reg}}\):
are Fréchet-smooth. Moreover, for all \(x,y\in {\mathcal{F}}^{\mathrm{reg}}\) with \(x \in \Omega _y\) and all \(\mathbf{u }, \mathbf{v }\in V_y\),
Proof
First we have to show that \(E\circ (\phi _x^{-1}, \phi _y^{-1})\) is Fréchet-smooth for all \( x,y \in {\mathcal{F}}^{\mathrm{reg}}\).
To this end first consider the following calculation for arbitrary \(\varphi \in W_x, \psi \in W_y\):
where in the second step, we used the linearity of the trace and in the third step, the cyclic permutation property (which can be applied as all factors and summands obviously have finite rank). The last line is clearly a sum of compositions of Fréchet-smooth mappings in \((\varphi ,\psi )\), which proves the Fréchet-smoothness of \(E\circ (\phi _x^{-1}, \phi _y^{-1})\).
For calculating the Fréchet derivative of \(E_x\), consider the expansion
which is again a sum of compositions of Fréchet-smooth functions showing that also \(E_x \circ \phi _y^{-1}\) is Fréchet-smooth.
Applying the computation rule from Lemma 6.4 together with the Fréchet derivative rule for bilinear functions in Lemma 6.1 (iii) (multiple times and together with the chain rule) we obtain:
Using Lemma 7.2 (iii) and (iv), this simplifies to
In the case \(\psi ^{\dagger }y\psi =\phi _y^{-1}(\psi )=x\), the terms in (7.4) cancel each other, showing that
Moreover, proceeding from (7.3) a straightforward computation using the properties of the Fréchet derivative and the trace operator as before gives
As for \(\psi ^{\dagger }y\psi =\phi _y^{-1}(\psi )=x\), the first and the last term cancel each other and we obtain
which concludes the proof. \(\square \)
Lemma 7.4
\(D^2( E_x \circ \phi _y^{-1})|_{\phi _y(x)}\) is independent of the choice of chart (i.e., the choice of y) as long as \(y\in {\mathcal{F}}^{\mathrm{reg}}\) is chosen such that \(x \in \Omega _y\). Moreover, for all tangent vector fields \(\mathbf{v },\mathbf{u }\in \Gamma ({\mathcal{F}}^{\mathrm{reg}},T{\mathcal{F}}^{\mathrm{reg}})\) and any \(y\in {\mathcal{F}}^{\mathrm{reg}}\) with \(x \in \Omega _y\)
where the derivatives act on the arguments containing a dot.
This lemma also shows that the order of differentiation of \(E_x\) with respect to the two vector fields does not matter. The proof shows that this is due to the fact that the first derivative of \(E_x\) vanishes.
Proof
Let \(\mathbf{v },\mathbf{u }\in \Gamma ({\mathcal{F}}^{\mathrm{reg}},T{\mathcal{F}}^{\mathrm{reg}})\) and \(x,y\in {\mathcal{F}}^{\mathrm{reg}}\) with \(x \in \Omega _y\) be arbitrary. As we have seen before, for the first directional derivative, we have
It follows for the second directional derivative that
where we applied the Fréchet derivative rule for \(\mathbb{R}\)-bilinear maps in Lemma 6.1 (iii) together with the chain rule. Evaluating this expression at \(\tilde{x}=x\), the second summand vanishes in view of (7.2). We thus obtain
Using the symmetry of the second Fréchet derivatives gives the result. \(\square \)
Remark 7.5
Equation (7.7) also shows that \(D_{\mathbf{v }(x)}( D_{\mathbf{u }(.)}E_x(.))\) only depends on the values of the vector fields \(\mathbf{u }\) and \(\mathbf{v }\) at the point x. Moreover, since for arbitrary \(x \in {\mathcal{F}}^{\mathrm{reg}}\) and \(\mathbf{u },\mathbf{v }\in T_x{\mathcal{F}}^{\mathrm{reg}}\) one can always find smooth tangent vector fields with \(\mathbf{v }(x)=\mathbf{v }\), \(\mathbf{u }(x)=\mathbf{u }\) (e.g., using a suitable bump function in a chart around x), we can consider the expression
as a well-defined, coordinate invariant—in the sense that the right hand side of equation (7.8) returns the same values for any \(y \in {\mathcal{F}}^{\mathrm{reg}}\) with \(x \in \Omega _y\)—and symmetric bilinear form. \(\square \)
Now it is convenient to compute (7.8) in the chart \(\phi _x\). Then we have \(\phi _x(x)=\pi _x\) and thus we obtain for any \(\mathbf{u },\mathbf{v }\in V_x\):
Motivated by this, for any \(x \in {\mathcal{F}}^{\mathrm{reg}}\) we set:
Due to the properties of the trace operator, \(\tilde{g}_x\) defines a symmetric, real-valued bilinear form on \(V_x\), which is even positive-definite as the following lemma shows:
Lemma 7.6
The symmetric bilinear form \(\tilde{g}_x\) is positive definite and thus defines a real valued inner product on \(V_x\).
Proof
Let \(\mathbf{u }\in V_x\) be arbitrary, choose an orthonormal basis \((e_i)_{i=1,\cdots ,k}\) of the finite-dimensional vector space \((S_x+\mathbf{u }^{\dagger }(S_x))\) and compute (see footnote 2):
where in the last step, we used that \(\langle \mathbf{u }^{\dagger }xe_i| \mathbf{u }^{\dagger }xe_i\rangle _{{\mathcal{H}}}= \Vert \mathbf{u }^{\dagger }xe_i \Vert ^2_{{\mathcal{H}}}\) and \(\langle x\mathbf{u }e_i| x\mathbf{u }e_i\rangle _{{\mathcal{H}}}= \Vert x\mathbf{u }e_i\Vert ^2_{{\mathcal{H}}}\) are already real (for \(i=1,\dots ,k\)), so we can leave out the “\({{\,\mathrm{Re}\,}}\).”
Combining this we obtain:
This shows the positive semi-definiteness of \(\tilde{g}_x\). Moreover, we see that \(\tilde{g}_x(\mathbf{u },\mathbf{u })\) vanishes if and only if
But as \((\mathbf{u }^{\dagger }x +x\mathbf{u })\) is obviously selfadjoint and its image is contained in \(S_x+\mathbf{u }^{\dagger }(S_x)\), it vanishes on the orthogonal complement of \(S_x+\mathbf{u }^{\dagger }(S_x)\) anyhow,
so the previous equation is equivalent to
Moreover, denoting \(\pi _I := \pi _x\) as the orthogonal projection on \(S_x = I\) and \( \pi _J\) as the orthogonal projection on \(J=(S_x)^{\bot }\), we can write
Plugging this in equation (7.9) yields:
Using \(\mathbf{u }|_I \in \mathrm{Symm}(S_x)\), we conclude
As x is selfadjoint, this also yields
Inserting this in (7.10) gives:
Using a block operator notation for the orthogonal decomposition \({\mathcal{H}}= I \oplus ^{\bot } J\), this equation can be visualized as
This notation can be justified by “testing” equation (7.11) with \((v,0),(0,w)\in {\mathcal{H}}=I\oplus ^{\bot } J\) with \(v\in I\) and \(w\in J\) arbitrary.
Thus, we see that each of the operators \(2x\mathbf{u }\pi _I\), \(\pi _J\mathbf{u }^{\dagger }x\) and \(x\mathbf{u }\pi _J\) must vanish individually. Furthermore, as \(x|_I\) has full rank and \(\mathbf{u }\) maps into \(S_x=I\), this yields
and therefore also
This proves the positive definiteness of \(\tilde{g}_x(\mathbf{u },\mathbf{u }) ={{\,\mathrm{Re}\,}}\left( \mathrm{tr}(x\mathbf{u }x\mathbf{u }) + \mathrm{tr}(x\mathbf{u }\mathbf{u }^{\dagger }x) \right)\). \(\square \)
Now, we can finally introduce a Riemannian metric on \({\mathcal{F}}^{\mathrm{reg}}\):
Lemma 7.7
Setting pointwise for any \(x\in {\mathcal{F}}^{\mathrm{reg}}\):
we obtain a well-defined Riemannian metric on \({\mathcal{F}}^{\mathrm{reg}}\) (see footnote 3).
Proof
First of all, \(g_x\) is well defined due to Lemma 7.4 as explained in Remark 7.5.
Moreover, choosing representatives \([x,\mathbf{u },x]\), \([x,\mathbf{v },x] \in T_x{\mathcal{F}}^{\mathrm{reg}}\), we have:
and since we have already seen that for any \(x \in {\mathcal{F}}^{\mathrm{reg}}\), \(\tilde{g}_x\) defines a symmetric positive-definite bilinear form, so does \(g_x\).
Thus, it only remains to show that g is Fréchet-smooth. But due to the (coordinate invariant) definition of \(D_2^2E_x|_x\) in (7.8), this follows immediately from the Fréchet-smoothness of \(D^2(E_x \circ \phi _y^{-1})\) (for this see Lemma 7.3, in particular equation (7.5)). More precisely, since for any two smooth vector fields \(\mathbf{u },\mathbf{v }\in \Gamma ({\mathcal{F}}^{\mathrm{reg}},T{\mathcal{F}}^{\mathrm{reg}})\) for any chart \(\phi _y\) with \(x \in \Omega _y\) also \(D\phi _y\circ \mathbf{u }\circ \phi _y^{-1}\) and \(D\phi _y\circ \mathbf{v }\circ \phi _y^{-1}\) are smooth, we have
which is Fréchet-smooth as composition of Fréchet-smooth maps. More precisely, introducing the mappings
which are both obviously \(\mathbb{R}\)-bilinear and continuous (and thus Fréchet-smooth), we can rewrite the previous equation to
which is now clearly a composition of Fréchet-smooth maps. \(\square \)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Finster, F., Lottner, M. Banach manifold structure and infinite-dimensional analysis for causal fermion systems. Ann Glob Anal Geom 60, 313–354 (2021). https://doi.org/10.1007/s10455-021-09775-4