Banach Manifold Structure and Infinite-Dimensional Analysis for Causal Fermion Systems

A mathematical framework is developed for the analysis of causal fermion systems in the infinite-dimensional setting. It is shown that the regular spacetime point operators form a Banach manifold endowed with a canonical Fréchet-smooth Riemannian metric. The so-called expedient differential calculus is introduced with the purpose of treating derivatives of functions on Banach spaces which are differentiable only in certain directions. A chain rule is proven for Hölder continuous functions which are differentiable on expedient subspaces. These results are made applicable to causal fermion systems by proving that the causal Lagrangian is Hölder continuous. Moreover, Hölder continuity is analyzed for the integrated causal Lagrangian.


Introduction
The theory of causal fermion systems is a recent approach to fundamental physics (see the basics in Section 2, the reviews [11,12,16], the textbook [10] or the website [1]). In this approach, spacetime and all objects therein are described by a measure $\rho$ on a set $F$ of linear operators of rank at most $2n$ on a Hilbert space $(H, \langle .|. \rangle_H)$. The physical equations are formulated via the so-called causal action principle, a nonlinear variational principle where an action $S$ is minimized under variations of the measure $\rho$. If the Hilbert space $H$ is finite-dimensional, the set $F$ is a locally compact topological space. Making essential use of this fact, it was shown in [9] that the causal action principle is well-defined and that minimizers exist. Moreover, as is worked out in detail in [15], the interior of $F$ (consisting of the so-called regular points; see Definition 3.1) has a smooth manifold structure. Taking these structures as the starting point, causal variational principles were formulated and studied as a mathematical generalization of the causal action principle, where an action of the form
\[ S = \int_F d\rho(x) \int_F d\rho(y)\, L(x,y) \]
is minimized for a given lower-semicontinuous Lagrangian $L : F \times F \to \mathbb{R}^+_0$ on an (in general non-compact) manifold $F$ under variations of $\rho$ within the class of regular Borel measures, keeping the total volume $\rho(F)$ fixed. We refer the reader interested in causal variational principles to [19, Sections 1 and 2] and the references therein.
This article is devoted to the case that the Hilbert space $H$ is infinite-dimensional and separable. While the finite-dimensional setting seems suitable for describing physical spacetime on a fundamental level (where spacetime can be thought of as being discrete on a microscopic length scale usually associated with the Planck length), an infinite-dimensional Hilbert space arises in mathematical extrapolations where spacetime is continuous and has infinite volume. Most notably, infinite-dimensional Hilbert spaces come up in the examples of causal fermion systems describing Minkowski space (see [10, Section 1.2] or [26]) or a globally hyperbolic Lorentzian manifold (see for example [11]), and they are also needed for analyzing the limiting case of a classical interaction (the so-called continuum limit; see [10, Section 1.5.2 and Chapters 3-5]). A workaround to avoid infinite-dimensional analysis is to restrict attention to locally compact variations, as is done in [14, Section 2.3]. Nevertheless, in view of the importance of the examples and physical applications, it is a task of growing significance to analyze causal fermion systems systematically in the infinite-dimensional setting. It is the objective of this paper to put this analysis on a sound mathematical basis.
We now outline the main points of our constructions and explain our main results. Extending methods and results in [15] to the infinite-dimensional setting, we endow the set of all regular points of $F$ with the structure of a Banach manifold (see Definition 3.1 and Theorem 3.4). To this end, we construct an atlas formed of so-called symmetric wave charts (see Definition 3.3). We also show that the Hilbert-Schmidt norm on finite-rank operators on $H$ gives rise to a Fréchet-smooth Riemannian metric on this Banach manifold. More precisely, in Theorems 3.11 and 3.12 we prove that $F^{\mathrm{reg}}$ is a smooth Banach submanifold of the Hilbert space $S(H)$ of selfadjoint Hilbert-Schmidt operators, with the Riemannian metric given by
\[ g_x : T_x F^{\mathrm{reg}} \times T_x F^{\mathrm{reg}} \to \mathbb{R}, \qquad g_x(A, B) := \operatorname{tr}(AB) . \]

In order to introduce higher derivatives at a regular point $p \in F$, our strategy is to always work in the distinguished symmetric wave chart around this point. This has the advantage that we can avoid the analysis of differentiability properties under coordinate transformations. The remaining difficulty is that the causal Lagrangian $L$ and other derived functions are not differentiable. Instead, directional derivatives exist only in certain directions. In general, these directions do not form a vector space. As a consequence, the derivative is not a linear mapping, and the usual product and chain rules cease to hold. On the other hand, these computation rules are needed in the applications, and it is often sensible to assume that they do hold. This motivates our strategy of looking for a vector space on which the function under consideration is differentiable. Clearly, in this way we lose information on the differentiability in certain directions which do not lie in such a vector space. But this shortcoming is outweighed by the benefit that we can avoid the subtleties of non-smooth analysis, which, at least for most applications in mind, would be impractical and inappropriately technical.
Clearly, we want the subspace to be as large as possible, and moreover it should be defined canonically without making any arbitrary choices. These requirements lead us to the notion of expedient subspaces (see Definition 4.2). In general, the expedient subspace is neither dense nor closed. On these expedient subspaces, the function is Gâteaux differentiable, the derivative is a linear mapping, and higher derivatives are multilinear.
The differential calculus on expedient subspaces is compatible with the chain rule in the following sense: If $f$ is locally Hölder continuous and $\gamma$ is a smooth curve whose derivatives up to sufficiently high order lie in the expedient differentiable subspace of $f$, then the composition $f \circ \gamma$ is differentiable and the chain rule holds (see Proposition 4.4),
\[ (f \circ \gamma)'(t_0) = D^{\mathcal{E}} f|_{x_0}\, \gamma'(t_0) \qquad \text{with } x_0 = \gamma(t_0) , \]
where the index $\mathcal{E}$ denotes the derivative on the expedient subspace. We also prove a chain rule for higher derivatives (see Proposition 4.5). The requirement of Hölder continuity is a crucial assumption needed in order to control the error term of the linearization. The most general statement is Theorem 5.8 where Hölder continuity is required only on a subspace which contains the curve $\gamma$ locally.

We also work out how the differential calculus on expedient subspaces applies to the setting of causal fermion systems. In order to establish the chain rule, we prove that the causal Lagrangian is indeed locally Hölder continuous with uniform Hölder exponent (Theorem 5.1), and we analyze how the Hölder constant depends on the base point (Theorem 5.3). Moreover, we prove that for all $x, y \in F$ there is a neighborhood $U \subseteq F$ of $y$ on which a Hölder estimate in the second argument holds (see (5.9); here $2n$ is the maximal rank of the operators in $F$). Relying on these results, we can generalize the jet formalism as introduced in [17] for causal variational principles to the infinite-dimensional setting (Section 5.2). We also work out the chain rule for the Lagrangian (Theorem 5.6) and for the function $\ell$ obtained by integrating one of the arguments of the Lagrangian (Theorem 5.9).

The paper is organized as follows. Section 2 provides the necessary preliminaries on causal fermion systems and infinite-dimensional analysis. In Section 3 an atlas of symmetric wave charts is constructed, and it is shown that this atlas endows the regular points of $F$ with the structure of a Fréchet-smooth Banach manifold.
Moreover, it is shown that the Hilbert-Schmidt norm induces a Fréchet-smooth Riemannian metric. In Section 4 the differential calculus on expedient subspaces is developed. In Section 5, this differential calculus is applied to causal fermion systems. Appendix A gives some more background information on the Fréchet derivative. Finally, Appendix B provides details on what the Riemannian metric looks like in different charts.
We finally point out that, in order to address a coherent readership, concrete applications of our methods and results for example to physical spacetimes have not been included here. The example of causal fermion systems in Minkowski space will be worked out separately in [25].

Preliminaries
2.1. Causal Fermion Systems and the Causal Action Principle. We now recall the basic definitions of a causal fermion system and the causal action principle.
Definition 2.1. (causal fermion system) Given a separable complex Hilbert space $H$ with scalar product $\langle .|. \rangle_H$ and a parameter $n \in \mathbb{N}$ (the "spin dimension"), we let $F \subseteq L(H)$ be the set of all selfadjoint operators on $H$ of finite rank, which (counting multiplicities) have at most $n$ positive and at most $n$ negative eigenvalues. On $F$ we are given a positive measure $\rho$ (defined on a $\sigma$-algebra of subsets of $F$), the so-called universal measure. We refer to $(H, F, \rho)$ as a causal fermion system.

A causal fermion system describes a spacetime together with all structures and objects therein. In order to single out the physically admissible causal fermion systems, one must formulate physical equations. To this end, we impose that the universal measure should be a minimizer of the causal action principle, which we now introduce.
For any $x, y \in F$, the product $xy$ is an operator of rank at most $2n$. However, in general it is no longer a selfadjoint operator because $(xy)^* = yx$, and this is different from $xy$ unless $x$ and $y$ commute. As a consequence, the eigenvalues of the operator $xy$ are in general complex. We denote these eigenvalues counting algebraic multiplicities by $\lambda^{xy}_1, \ldots, \lambda^{xy}_{2n} \in \mathbb{C}$ (more specifically, denoting the rank of $xy$ by $k \leq 2n$, we choose $\lambda^{xy}_1, \ldots, \lambda^{xy}_k$ as all the nonzero eigenvalues and set $\lambda^{xy}_{k+1}, \ldots, \lambda^{xy}_{2n} = 0$). We introduce the Lagrangian and the causal action by
\[ \text{Lagrangian:} \qquad L(x,y) = \frac{1}{4n} \sum_{i,j=1}^{2n} \Big( \big|\lambda^{xy}_i\big| - \big|\lambda^{xy}_j\big| \Big)^2 \]
\[ \text{causal action:} \qquad S(\rho) = \iint_{F \times F} L(x,y)\, d\rho(x)\, d\rho(y) . \]
The causal action principle is to minimize $S$ by varying the measure $\rho$ under the following constraints:
\[ \text{volume constraint:} \qquad \rho(F) = \text{const} \]
\[ \text{trace constraint:} \qquad \int_F \operatorname{tr}(x)\, d\rho(x) = \text{const} \]
\[ \text{boundedness constraint:} \qquad \iint_{F \times F} |xy|^2\, d\rho(x)\, d\rho(y) \leq C , \]
where $C$ is a given parameter, $\operatorname{tr}$ denotes the trace of a linear operator on $H$, and the absolute value of $xy$ is the so-called spectral weight,
\[ |xy| := \sum_{j=1}^{2n} \big|\lambda^{xy}_j\big| . \]
This variational principle is mathematically well-posed if $H$ is finite-dimensional. For the existence theory and the analysis of general properties of minimizing measures we refer to [8,9,3]. In the existence theory, one varies in the class of regular Borel measures (with respect to the topology on $L(H)$ induced by the operator norm), and the minimizing measure is again in this class. With this in mind, here we always assume that $\rho$ is a regular Borel measure.
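As a quick consistency check on the Lagrangian, one can expand the square. The following sketch assumes the standard form of the causal Lagrangian, namely $\frac{1}{4n}$ times the sum of the squared differences of the absolute values of the eigenvalues; the shorthand $a_i$ is introduced only for this computation and does not appear elsewhere in the text.

```latex
% Sketch: a_i := |\lambda^{xy}_i| is a shorthand used only here.
\begin{align*}
L(x,y)
  &= \frac{1}{4n} \sum_{i,j=1}^{2n} (a_i - a_j)^2
   = \frac{1}{4n} \Big( 2 \cdot 2n \sum_{i=1}^{2n} a_i^2
       - 2 \Big( \sum_{i=1}^{2n} a_i \Big)^2 \Big) \\
  &= \sum_{i=1}^{2n} a_i^2 - \frac{1}{2n} \Big( \sum_{i=1}^{2n} a_i \Big)^2 .
\end{align*}
% Consequences: L(x,y) >= 0 (it is a sum of squares), and
% L(x,y) = 0 if and only if a_1 = ... = a_{2n}.
```

In this form the Lagrangian is the difference of a quadratic expression in the $a_i$ and $\frac{1}{2n}$ times the square of the spectral weight, which makes it apparent that its regularity is governed by how the absolute values of the eigenvalues depend on $x$ and $y$.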
Let $\rho$ be a minimizing measure. Spacetime is defined as the support of this measure,
\[ M := \operatorname{supp} \rho \subseteq F . \]
Thus the spacetime points are selfadjoint linear operators on $H$. These operators contain a lot of additional information which, if interpreted correctly, gives rise to spacetime structures like causal and metric structures, spinors and interacting fields. We refer the interested reader to [10, Chapter 1]. The only results on the structure of minimizing measures which will be needed here concern the treatment of the trace constraint and the boundedness constraint. As a consequence of the trace constraint, for any minimizing measure $\rho$ the local trace is constant in spacetime, i.e. there is a real constant $c \neq 0$ such that $\operatorname{tr}(x) = c$ for all $x \in M$ (see [

where the error term $r : U \to F$ goes to zero faster than linearly, i.e.
\[ \lim_{x \to x_0,\; x \neq x_0} \frac{\|r(x)\|_F}{\|x - x_0\|_E} = 0 . \]
The linear operator $A$ is the Fréchet derivative, also denoted by $Df|_{x_0}$. The Fréchet derivative is uniquely defined. Moreover, the concept can be iterated to define higher derivatives. Indeed, if $f$ is differentiable in $U$, its derivative $Df$ is a mapping $Df : U \to L(E, F)$.
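Since $L(E, F)$ is a normed vector space (with the operator norm), we can apply Definition 2.2 once again to define the second derivative at a point $x_0$ by
\[ D^2 f|_{x_0} := D(Df)|_{x_0} \in L\big(E, L(E,F)\big) . \]
The second derivative can also be viewed as a bilinear mapping from $E$ to $F$,
\[ D^2 f|_{x_0} : E \times E \to F , \qquad (u, v) \mapsto \big( D(Df)|_{x_0}\, u \big)\, v . \]
It is by definition bounded, meaning that there is a constant $c > 0$ such that
\[ \big\| D^2 f|_{x_0}(u, v) \big\|_F \leq c\, \|u\|_E\, \|v\|_E \qquad \text{for all } u, v \in E . \]
By iteration, one obtains similarly the Fréchet derivatives of order $p \in \mathbb{N}$ as multilinear operators
\[ D^p f|_{x_0} : \underbrace{E \times \cdots \times E}_{p \text{ factors}} \to F . \]
A function is Fréchet-smooth on $U$ if it is Fréchet-differentiable to every order.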
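To make the iterated Fréchet derivatives concrete, here is a worked example for a quadratic map of the type $\psi \mapsto \psi^\dagger x\, \psi$. The symbols $Q$, $K$ and $\psi_0$ are chosen for this illustration only: $K$ is a Hilbert space, $x$ a bounded selfadjoint operator on $K$, and $L(H, K)$ is regarded as a real Banach space, because $u \mapsto u^\dagger$ is not complex linear.

```latex
% Illustration only: Q(\psi) := \psi^\dagger x \psi for \psi \in L(H,K),
% with x a bounded selfadjoint operator on K (hypothetical notation).
\begin{align*}
Q(\psi_0 + u) &= Q(\psi_0)
   + \underbrace{u^\dagger x\, \psi_0 + \psi_0^\dagger x\, u}_{=:\, A u}
   + \underbrace{u^\dagger x\, u}_{=:\, r} , \\
\|A u\| &\leq 2\, \|x\|\, \|\psi_0\|\, \|u\| , \qquad
\|r\| \leq \|x\|\, \|u\|^2 .
\end{align*}
% Hence A is a bounded real-linear operator and r = o(\|u\|), so that
\begin{align*}
DQ|_{\psi_0}\, u &= u^\dagger x\, \psi_0 + \psi_0^\dagger x\, u , \\
D^2 Q|_{\psi_0}(u, v) &= u^\dagger x\, v + v^\dagger x\, u , \\
D^p Q|_{\psi_0} &= 0 \quad \text{for all } p \geq 3 .
\end{align*}
```

The second derivative is manifestly symmetric in $u$ and $v$ and bounded by $2 \|x\| \|u\| \|v\|$, so $Q$ is Fréchet-smooth in the sense defined above.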
If the function $f : U \subseteq E \to F$ is $p$ times Fréchet-differentiable in $x_0 \in U$, then its $p$th Fréchet derivative is symmetric, i.e. for any $u_1, \ldots, u_p \in E$ and any permutation $\sigma \in S_p$,
\[ D^p f|_{x_0}(u_1, \ldots, u_p) = D^p f|_{x_0}\big(u_{\sigma(1)}, \ldots, u_{\sigma(p)}\big) . \]
We omit the proof, which can be found for example in [5, Section 4.4]. For the Fréchet derivative, most concepts familiar from the finite-dimensional setting carry over immediately. In particular, the composition of Fréchet-differentiable functions is again Fréchet-differentiable. Moreover, the chain and product rules hold. We refer for the details to [

A weaker concept of differentiability which we will use here is Gâteaux differentiability. The function $f$ is Gâteaux differentiable in $x_0 \in U$ in the direction $u \in E$ if the limit of the difference quotient exists,
\[ d_u f(x_0) := \lim_{h \to 0,\; h \neq 0} \frac{f(x_0 + h u) - f(x_0)}{h} . \]
The resulting vector $d_u f(x_0) \in F$ is the Gâteaux derivative.
By definition, the Gâteaux derivative is homogeneous of degree one, i.e. $d_{\lambda u} f(x_0) = \lambda\, d_u f(x_0)$ for all $\lambda \in \mathbb{R}$.
Moreover, if $f$ is Fréchet-differentiable in $x_0$, then it is also Gâteaux differentiable in any direction $u \in E$ and
\[ d_u f(x_0) = Df|_{x_0}\, u . \]
However, the converse is not true because, even if the Gâteaux derivatives exist for any $u \in E$, it is in general not possible to represent them by a bounded linear operator. As a consequence, the chain and product rules in general do not hold for Gâteaux derivatives. We shall come back to this issue in Section 5.
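A standard finite-dimensional example illustrates why Gâteaux derivatives in all directions need not be representable by a linear operator. The function below is not taken from the text; it is a textbook-style illustration on $\mathbb{R}^2$.

```latex
% Illustration on E = R^2, F = R (not from the source):
% f(x,y) := x^3 / (x^2 + y^2) for (x,y) \neq 0, and f(0,0) := 0.
\begin{align*}
d_u f(0) &= \lim_{h \to 0} \frac{f(h u)}{h}
  = \lim_{h \to 0} \frac{h^3 u_1^3}{h \cdot h^2 (u_1^2 + u_2^2)}
  = \frac{u_1^3}{u_1^2 + u_2^2}
  \qquad \text{for } u = (u_1, u_2) \neq 0 .
\end{align*}
% Homogeneity of degree one holds: d_{\lambda u} f(0) = \lambda \, d_u f(0).
% Additivity fails:
\begin{align*}
d_{(1,0)} f(0) = 1 , \qquad d_{(0,1)} f(0) = 0 , \qquad
d_{(1,1)} f(0) = \tfrac{1}{2} \neq 1 + 0 .
\end{align*}
```

All directional derivatives exist at the origin, but $u \mapsto d_u f(0)$ is not linear, so $f$ is not Fréchet-differentiable there.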

Banach Manifolds.
We recall the basic definition of a smooth Banach manifold (for more details see for example [29, Chapter 73]). A smooth atlas $A = (\phi_i, U_i, E)_{i \in I}$ is a collection of charts (for a general index set $I$) with the properties that the domains of the charts cover $B$, and that for any $i, j \in I$, the transition maps
\[ \phi_i \circ \phi_j^{-1} : \phi_j(U_i \cap U_j) \to \phi_i(U_i \cap U_j) \]
are Fréchet-smooth. We denote the corresponding equivalence class by $[A]$. The union of the charts of all atlases in $[A]$ is called maximal atlas $A_{\max}$. The triple $(B, E, A)$ is referred to as a smooth Banach manifold with differentiable structure provided by $A_{\max}$.

Smooth Banach Manifold Structure of F reg
In the definition of causal fermion systems, the number of positive or negative eigenvalues of the operators in $F$ can be strictly smaller than $n$. This is important because it makes $F$ a closed subspace of $L(H)$ (with respect to the norm topology), which in turn is crucial for the general existence results for minimizers of the causal action principle (see [9] or [18]). However, in most physical examples in Minkowski space or in a Lorentzian spacetime, all the operators in $M$ do have exactly $n$ positive and exactly $n$ negative eigenvalues. This motivates the following definition (see also [10

Definition 3.1. An operator $x \in F$ is said to be regular if it has the maximal possible rank, i.e. $\dim x(H) = 2n$. Otherwise, the operator is called singular. A causal fermion system is regular if all its spacetime points are regular.
In what follows, we restrict attention to regular causal fermion systems. Moreover, it is convenient to also restrict attention to all those operators in $F$ which are regular,
\[ F^{\mathrm{reg}} := \{ x \in F \,|\, x \text{ is regular} \} . \]

3.1. Wave Charts and Symmetric Wave Charts. We now choose specific charts and prove that the resulting atlas endows $F^{\mathrm{reg}}$ with the structure of a smooth Banach manifold (see Definition 2.5). In the finite-dimensional setting, these charts were introduced in [15]. We now recall their definition and generalize the constructions to the infinite-dimensional setting.
Given $x \in F^{\mathrm{reg}}$ we denote the image of $x$ by $I := x(H)$. We consider $I$ as a $2n$-dimensional Hilbert space with the scalar product induced from $\langle .|. \rangle_H$. Denoting its orthogonal complement by $J := I^\perp$, we obtain the orthogonal sum decomposition
\[ H = I \oplus J . \]
This also gives rise to a corresponding decomposition of operators, like for example
\[ L(H, I) = L(I, I) \oplus L(J, I) . \tag{3.1} \]
Given an operator $\psi \in L(H, I)$, we denote its adjoint by $\psi^\dagger \in L(I, H)$; it is defined by the relation $\langle u \,|\, \psi v \rangle_I = \langle \psi^\dagger u \,|\, v \rangle_H$ for all $u \in I$ and $v \in H$. We now form the operator
\[ R_x(\psi) := \psi^\dagger\, x\, \psi \in L(H) . \tag{3.2} \]
By construction, this operator is symmetric and has at most $n$ positive and at most $n$ negative eigenvalues. Therefore, it is an operator in $F$. Using (3.1), we conclude that $R_x$ is a mapping
\[ R_x : L(I, I) \oplus L(J, I) \to F . \tag{3.3} \]

Before going on, it is useful to rewrite the operator $R_x(\psi)$ in a slightly different way. On $I$, one can also introduce the indefinite inner product
\[ \prec u \,|\, v \succ_x \,:= -\langle u \,|\, x\, v \rangle_H , \tag{3.4} \]
referred to as the spin inner product. For conceptual clarity, we denote $I$ endowed with the spin inner product by $(S_x, \prec .|. \succ_x)$ and refer to it as the spin space at $x$ (for more details on the spin spaces we refer for example to [10, Section 1.1]). It is an indefinite inner product space of signature $(n, n)$. We denote the adjoint with respect to the spin inner product by a star. More specifically, for a linear operator $A \in L(S_x)$, the adjoint is defined by
\[ \prec u \,|\, A v \succ_x \,=\, \prec A^* u \,|\, v \succ_x \qquad \text{for all } u, v \in S_x . \]
Using again the definition of the spin inner product (3.4), we can rewrite this equation as
\[ \langle u \,|\, X A\, v \rangle_H = \langle X A^* u \,|\, v \rangle_H , \]
where we introduced the short notation
\[ X := x|_{S_x} : S_x \to S_x . \tag{3.5} \]
Taking adjoints in the Hilbert space $H$ gives $X A^* = A^\dagger X$ (note that the operator $X$ is invertible because $S_x$ is by definition its image). We thus obtain the relation
\[ A^* = X^{-1} A^\dagger X . \tag{3.6} \]
Using such transformations, one readily verifies that, identifying the image of $\psi$ with a subspace of $S_x$, the right side of (3.2) can be written as $-\psi^* \psi$ (for details see [15, Lemma 2.2]).
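Thus, with this identification, the operator $R_x$ can be written instead of (3.2) and (3.3) in the equivalent form
\[ R_x : L(I, S_x) \oplus L(J, S_x) \to F , \qquad R_x(\psi) = -\psi^* \psi , \tag{3.7} \]
where $\psi^*$ is the adjoint with respect to the corresponding inner products, i.e.
\[ \langle \psi^* u \,|\, v \rangle_H \,=\, \prec u \,|\, \psi v \succ_x \qquad \text{for all } u \in S_x \text{ and } v \in H . \]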
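The relation between the two adjoints, and the resulting equivalence of the two forms of $R_x$, can be verified in a few lines. The following sketch assumes the sign convention $\prec u | v \succ_x = -\langle u | x v \rangle_H$ for the spin inner product and the notation $X = x|_{S_x}$; only the selfadjointness and invertibility of $X$ on $S_x$ are used.

```latex
% Sketch under the stated convention; X = x|_{S_x} is selfadjoint and
% invertible on S_x, and A \in L(S_x).
\begin{align*}
\prec u \,|\, A v \succ_x
  &= -\langle u \,|\, X A\, v \rangle_H
   = -\langle A^\dagger X u \,|\, v \rangle_H
   = -\langle X^{-1} A^\dagger X u \,|\, X v \rangle_H \\
  &= \,\prec X^{-1} A^\dagger X\, u \,|\, v \succ_x
  \qquad \Longrightarrow \qquad A^* = X^{-1} A^\dagger X .
\end{align*}
% The same computation for \psi : H \to S_x gives the adjoint \psi^* : S_x \to H,
\begin{align*}
\langle \psi^* u \,|\, v \rangle_H
  = \,\prec u \,|\, \psi v \succ_x
  = -\langle u \,|\, X \psi\, v \rangle_H
  = -\langle \psi^\dagger X u \,|\, v \rangle_H
  \qquad \Longrightarrow \qquad \psi^* = -\psi^\dagger X ,
\end{align*}
% and therefore
\begin{align*}
-\psi^* \psi = \psi^\dagger X\, \psi = \psi^\dagger\, x\, \psi .
\end{align*}
```

The minus sign in $-\psi^* \psi$ is thus a direct consequence of the minus sign in the spin inner product.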
We want to use the operator $R_x$ in order to construct local parametrizations of $F^{\mathrm{reg}}$. The main difficulty is that the operator $R_x$ is not injective. For an explanation of this point in the context of local gauge freedom we refer to [15]. Here we merely explain how to arrange that $R_x$ becomes injective. We let $\mathrm{Symm}(S_x) \subseteq L(S_x)$ be the real vector space of all operators $A$ on $S_x$ which are symmetric with respect to the spin inner product, i.e.
\[ \prec u \,|\, A v \succ_x \,=\, \prec A u \,|\, v \succ_x \qquad \text{for all } u, v \in S_x . \]
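We now restrict the operator $R_x$ in (3.3) and (3.7) to the subspace
\[ \mathrm{Symm}(S_x) \oplus L(J, S_x) \subseteq L(I, S_x) \oplus L(J, S_x) . \tag{3.8} \]
We write the direct sum decomposition as $\psi = \psi_I + \psi_J$ with $\psi_I \in \mathrm{Symm}(S_x)$ and $\psi_J \in L(J, S_x)$. Extending the analysis in [15, Section 6.1] to the infinite-dimensional setting, one finds that this mapping is a local parametrization of $F^{\mathrm{reg}}$: and is a homeomorphism to its image (always with respect to the topology induced by the operator norm on $L(H)$).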

Proof. The estimate
In order to show that $R^{\mathrm{symm}}_x$ is bijective, we begin with the formula for $\phi_x$ as derived in [15, Proposition 6.6], which will turn out to be the inverse of $R^{\mathrm{symm}}_x$. It has the form where $P(x, y)$ (the kernel of the fermionic projector) and $A_{xy}$ (the closed chain) are defined by
\[ P(x, y) := \pi_x\, y|_{S_y} : S_y \to S_x \qquad \text{and} \qquad A_{xy} := P(x, y)\, P(y, x) : S_x \to S_x . \]
Our task is to show that for a sufficiently small open neighborhood $\Omega_x$ of $x$, this formula defines a continuous mapping and that the compositions . In preparation, we rewrite the formula (3.10) as where we again used the notation (3.5). Choosing $y = x$, the operator $X^{-1} \pi_x\, y|_{S_x}$ is the identity on $S_x$. We first choose an open neighborhood $\tilde\Omega_x$ of $x$ so small that for any $y \in \tilde\Omega_x$,
\[ \big\| \mathrm{id}_{S_x} - X^{-1} \pi_x\, y|_{S_x} \big\| < \frac{1}{2} . \tag{3.14} \]
Then the square root as well as the inverse square root of $A = X^{-1} \pi_x\, y|_{S_x}$ are well-defined for all $y \in \tilde\Omega_x$ by the respective power series,
\[ \sqrt{A} = \sum_{n=0}^\infty \binom{1/2}{n} \big( A - \mathrm{id}_{S_x} \big)^n \qquad \text{and} \qquad A^{-\frac{1}{2}} = \sum_{n=0}^\infty \binom{-1/2}{n} \big( A - \mathrm{id}_{S_x} \big)^n , \]
with the generalized binomial coefficients given for $\beta \in \mathbb{R}$ and $n \in \mathbb{N}$ by
\[ \binom{\beta}{n} := \frac{\beta (\beta - 1) \cdots (\beta - n + 1)}{n!} , \]
as for both power series the radius of convergence equals one. Moreover note that all square roots, inverse square roots, etc. appearing in the following are well-defined as they are always applied to operators within their radius of convergence. We conclude that the mapping $\phi_x$ is well-defined and continuous on $\tilde\Omega_x$. Now by possibly shrinking $W_x$ we can arrange that $\Omega_x := R^{\mathrm{symm}}_x(W_x) \subseteq \tilde\Omega_x$.

To verify that $\phi_x$ maps into $\mathrm{Symm}(S_x) \oplus L(J, S_x)$, we note that a direct computation using (3.6) shows that the operator $X^{-1} \pi_x\, y\, \pi_x|_{S_x}$, and hence also its square root, are symmetric on $S_x$. It remains to compute the compositions in (3.12). First, where in the last line we applied (3.6) and used that $\psi_I$ is symmetric on $S_x$. Moreover, Since the spectral calculus is invariant under similarity transformations, we know that for any invertible operator $B$ on $S_x$, (3.14)). This concludes the proof.
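The convergence of the two power series used in the proof can be made quantitative. The following estimate is a sketch which assumes the smallness condition $\| A - \mathrm{id}_{S_x} \| < \frac{1}{2}$ referred to as (3.14); the bound on the binomial coefficients is elementary and is spelled out in the comments.

```latex
% For \beta = \pm 1/2 and n \geq 1 one has
%   |\binom{\beta}{n}| = \prod_{k=1}^{n} |\beta - k + 1| / k ,
% and every factor is at most 1:
%   \beta =  1/2 :  |3/2 - k| / k \leq 1   for all k \geq 1,
%   \beta = -1/2 :  (k - 1/2) / k  <   1   for all k \geq 1.
% Hence |\binom{\pm 1/2}{n}| \leq 1 for all n \geq 0, and
\begin{align*}
\Big\| \sum_{n=0}^\infty \binom{\pm 1/2}{n}
   \big( A - \mathrm{id}_{S_x} \big)^n \Big\|
  \leq \sum_{n=0}^\infty \big\| A - \mathrm{id}_{S_x} \big\|^n
  \leq \sum_{n=0}^\infty 2^{-n} = 2 .
\end{align*}
```

Both series therefore converge absolutely in operator norm, uniformly on the set where the smallness condition holds, which is one way to see that the resulting maps depend continuously on $A$.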
The mapping φ x , which already appeared in the proof of the previous lemma, can also be introduced abstractly to define the chart.
we obtain a chart (φ x , Ω x ), referred to as the symmetric wave chart about the point x ∈ F reg .
We remark that more general charts can be obtained by restricting R x to another subspace of L(I, S x ) ⊕ L(J, S x ), i.e. in generalization of (3.8), where E is a subspace of L(S x ) which has the same dimension as Symm(S x ). The resulting charts φ E x are obtained by composition with a unitary operator U (for details and the connection to local gauge transformations see [15, Section 6.1]). Since linear transformations are irrelevant for the question of differentiability, in what follows we may restrict attention to symmetric wave charts.

3.2. A Fréchet-Smooth Atlas. The goal of this section is to prove that the symmetric wave charts $(\phi_x, \Omega_x)$ form a smooth atlas of $F^{\mathrm{reg}}$.
Theorem 3.4 (Symmetric wave atlas). The collection of all symmetric wave charts on F reg defines a Fréchet-smooth atlas of F reg , endowing F reg with the structure of a smooth Banach manifold (see Definition 2.5).
Proof. We first verify that for any x ∈ F reg , the vector space Symm(S x ) ⊕ L(J, S x ) together with the operator norm of L(H, I) = L(H, S x ) is a Banach space. To this end, we note that this vector space coincides with the kernel of the mapping ψ → Since this mapping is continuous on L(H, I) (as one verifies by an estimate similar to (3.9)), its kernel is closed. As a consequence, the vector space Symm(S x ) ⊕ L(J, S x ) is a closed subspace of L(H, I) and thus indeed a Banach space. We saw in Theorem 3.2 that for any x ∈ F reg , (φ x , Ω x ) defines a chart on F reg . Since the Ω x clearly cover F reg , it remains to show that all transition mappings are Fréchet-smooth. To this end, we first note that for any x, y ∈ F reg and ψ ∈ φ x (Ω x ∩Ω y ), Next, we define the mappings (where the radius of the ball B 1/2 (0) is taken with respect to the operator norm).
Recall that in the proof of Theorem 3.2 (more precisely (3.14)) we chose $\Omega_y$ so small that $\| \mathrm{id}_{S_y} - Y^{-1} \pi_y\, z|_{S_y} \| < 1/2$ for any $z \in \Omega_y$. Thus, since for any . Now note that for the Fréchet derivative, we consider all vector spaces here as real Banach spaces, but still with the canonical operator norm induced by $\| . \|_H$. In view of the chain rule for Fréchet derivatives (for details see Lemma A.2 in Appendix A) and the properties of the Fréchet derivative in Lemma A.1 in Appendix A, it remains to show that the mappings $W$, $B_{xy}$ and $\tilde B_{xy}$ are Fréchet-smooth (note that the composition operator of $\mathbb{R}$-linear mappings is also always Fréchet-smooth as it defines a bounded $\mathbb{R}$-bilinear map and the map $L(S_y) \ni y \mapsto \mathrm{id}_{S_y} - y \in L(S_y)$ is clearly Fréchet-smooth as well). For $W$ this is clear due to [21, p. 40-42] (note that $L(S_y)$ obviously defines a finite-dimensional unital Banach algebra). Moreover, the mappings $B_{xy}$ and $\tilde B_{xy}$ are obviously $\mathbb{R}$-bilinear and bounded and thus Fréchet-smooth.
3.3. The Tangent Bundle. Having endowed $F^{\mathrm{reg}}$ with a canonical smooth Banach manifold structure, the next step is to consider its tangent bundle. For finite-dimensional manifolds, the tangent space can be defined either by equivalence classes of curves or by derivations, and these two definitions coincide (see for example [24, Chapter 2]). In infinite dimensions, however, this is no longer the case: In general, the derivation-tangent vectors (usually called operational tangent vectors) form a larger class than the curve-tangent vectors (called kinematic tangent vectors). There might even be operational tangent vectors that depend on higher-order derivatives of the inserted function (while the kinematic tangent vectors interpreted as directional derivatives only involve the first derivatives); for details on such issues see for example [22, Sections 28 and 29] or [2, p. 3-6]. It turns out that for our applications in mind, it is preferable to define tangent vectors as equivalence classes of curves. Indeed, as we shall see, with this definition the usual computation rules remain valid. More specifically, the tangent vectors of $F^{\mathrm{reg}}$ are compatible with the Fréchet derivative, and each fiber of the corresponding tangent bundle can be identified with the underlying Banach space
\[ V_x := \mathrm{Symm}(S_x) \oplus L(J, S_x) . \]

Following [22, p. 284], we begin with the abstract definition of the (kinematic) tangent bundle, which makes it easier to see the topological structure. Afterward, we will show that this notion indeed agrees with equivalence classes of curves. Given $x' \in F^{\mathrm{reg}}$, we consider the set $\Omega_{x'} \times V_{x'} \times \{x'\}$ (endowed with the topology inherited from the direct sum of Banach spaces). We take the disjoint union
\[ \bigsqcup_{x' \in F^{\mathrm{reg}}} \Omega_{x'} \times V_{x'} \times \{x'\} . \]
For clarity, we point out that the first entry represents the point of the Banach manifold $F^{\mathrm{reg}}$, whereas the third entry labels the chart.
Definition 3.5. We define the tangent bundle T F reg as the quotient space with respect to this equivalence relation, The canonical projection is given by For every x ∈ F reg the tangent space at x is defined by Note that each T x F reg has a canonical vector space structure in the following sense: Since all equivalence classes in T x F reg have a representative of the form [x, v, x], this representative can be identified with v ∈ V x . In this way, we obtain an identification The tangent bundle is again a Banach manifold, as we now explain. For any x ∈ F reg , the mapping On T F reg we choose the coarsest topology with the property that the natural projections of these mappings to Ω x and V x are both continuous (where on Ω x and V x we choose the topology induced by the norm topology of L(H)). With this topology, the mapping (φ x , Dφ x ) defines a chart of T F reg . For any (ψ, v) ∈ (φ y , Dφ y ) π −1 (Ω x ) ∩ π −1 (Ω y ) , the transition mappings are given by Proof. We need to show that transition maps are Fréchet-smooth. This is clear for the first component because the transition mappings φ x • φ −1 y are Fréchet-smooth and fiberwise linear. The second component can be considered as the composition of the insertion map which is obviously continuous and bilinear and thus Fréchet-smooth, for details see In what follows, we will sometimes use the notation which also clarifies the independence of the choice of representatives.
Lemma 3.7. For any x ∈ F reg , the mapping is a local trivialization.
Proof. We need to verify the properties of a local trivialization. Clearly, the operator π • ψ x is the projection to the first component, and for fixed y ∈ Ω x , the mapping , which is obviously an isomorphism of vector spaces in view of Lemma A.1 (vi).
To summarize, the Banach manifold F reg has similar properties as in the finite-dimensional case.
We now explain how the above definition of tangent vectors relates to the equivalence classes of curves (following [22, p. 285]):

Remark 3.8. (equivalence classes of curves) On curves $\gamma, \tilde\gamma \in C^\infty(\mathbb{R}, F^{\mathrm{reg}})$, we consider the equivalence relation $\gamma \sim \tilde\gamma$ defined by the conditions that $\gamma(0) = \tilde\gamma(0)$ and that in a chart $\phi_x$ with $\gamma(0) \in \Omega_x$, the relation $(\phi_x \circ \gamma)'|_0 = (\phi_x \circ \tilde\gamma)'|_0$ holds. Note that if the last relation holds in one chart, then it also holds in any other chart $\phi_y$ with $\gamma(0) \in \Omega_y$ because, due to the chain rule, which is bijective with inverse (for details see [22, p. 285]) Note that in (3.16) the tangent vector at $\gamma(0)$ was expressed in the specific chart $(\phi_{\gamma(0)}, \Omega_{\gamma(0)})$. However, the tangent vector can also be represented in another chart as follows. Let $x \in F^{\mathrm{reg}}$ and $[x, v, z] \in T_x F^{\mathrm{reg}}$ be arbitrary. We say that a curve $\gamma \in C^\infty(\mathbb{R}, F^{\mathrm{reg}})$ represents $[x, v, z]$ if in one chart $\phi_y$ with $x \in \Omega_y$ (and thus any chart, as one can show using the chain rule just as before) it holds that (3.17) In order to show independence of $y$, let $w \in F^{\mathrm{reg}}$ with $x \in \Omega_w$. Then Hence if (3.17) holds in one chart, it also holds in any other chart around $x$. ♦

Remark 3.9. (directional derivatives) Let $\gamma \in C^\infty(\mathbb{R}, F^{\mathrm{reg}})$ be a curve that represents $[x, v, z]$. We define the directional derivative of a Fréchet-differentiable function f : This definition is independent of the choice of the curve $\gamma$. Indeed, for any chart $\phi_w$ around $x$, we have d dt We close this subsection with one last definition:

Definition 3.10. (Tangent vector fields) A tangent vector field on a Banach manifold is, similar to the finite-dimensional case, a Fréchet-smooth map $v : F^{\mathrm{reg}} \to T F^{\mathrm{reg}}$ such that $v(x) \in T_x F^{\mathrm{reg}}$ (i.e. $\pi(v(x)) = x$) for all $x \in F^{\mathrm{reg}}$. We denote the set of all tangent vector fields of $F^{\mathrm{reg}}$ by $\Gamma(F^{\mathrm{reg}}, T F^{\mathrm{reg}})$.
We note that, according to this definition, multiplying a vector field by a Fréchet-smooth real-valued function gives again a vector field. In other words, the space of all tangent vector fields forms a module over the ring of Fréchet-smooth functions from $F^{\mathrm{reg}}$ to $\mathbb{R}$.
and Ω x as in Theorem 3.2), and φ x (π x E) is defined in analogy to (3.13) by (the fact that this maps to the symmetric operators on S x is verified as in (3.15)).
Proof. A direct computation shows that $R$ and $\Phi$ are inverses of each other: In order to compute $R \circ \Phi$, we use the block operator notation Then there exist operators $\tilde E_J, \hat E_J \in S(J)$ such that $E_{JJ} = \tilde E_J + \hat E_J$, and the operator In order to compute $\Phi \circ R$, we take $(\psi, B) \in \hat W$ arbitrary and note that, due to the definition of $\phi_x$ in (3.13) and Theorem 3.2, we have (note that the first two mappings $\phi_x$ are the ones defined in this theorem, whereas the third mapping is the one from (3.13)). We thus obtain Next, the mappings $R$ and $\Phi$ are Fréchet-smooth because for operators of finite rank (namely rank at most $2n$), the operator norm is equivalent to the Hilbert-Schmidt norm. Indeed, for an operator $A : H \to I$ mapping to a finite-dimensional Hilbert space $I$,
\[ \|A\| \leq \|A\|_{\mathrm{HS}} \leq \sqrt{\dim I}\; \|A\| . \]
This concludes the proof.
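The equivalence of the operator norm and the Hilbert-Schmidt norm for operators with finite-dimensional target can be seen from the singular values. The following short derivation is a sketch; the symbols $m$ and $s_i$ are introduced only for this computation.

```latex
% Sketch: A : H \to I bounded, m := \dim I < \infty.
% The operator A A^\dagger : I \to I is positive semi-definite; denote its
% eigenvalues by s_1^2, ..., s_m^2 with s_i \geq 0 (hypothetical notation).
\begin{align*}
\|A\|^2 &= \|A A^\dagger\| = \max_{1 \leq i \leq m} s_i^2 , \\
\|A\|_{\mathrm{HS}}^2 &= \operatorname{tr}(A^\dagger A)
   = \operatorname{tr}(A A^\dagger) = \sum_{i=1}^m s_i^2 , \\
\max_i s_i^2 \;\leq\; \sum_{i=1}^m s_i^2 &\;\leq\; m\, \max_i s_i^2
 \qquad \Longrightarrow \qquad
 \|A\| \leq \|A\|_{\mathrm{HS}} \leq \sqrt{m}\; \|A\| .
\end{align*}
```

For $m = 2n$ the constant is $\sqrt{2n}$, independent of the dimension of $H$, which is why the argument carries over unchanged to the infinite-dimensional setting.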
We consider a smooth curve The corresponding equivalence class defines a tangent vector $[x, v, y] \in T_x F^{\mathrm{reg}}$. On the other hand, considering $\gamma$ as a curve in $S$, it has the tangent vector In the chart $\phi_x$ and setting $\psi_0 = \phi_x(x)$, the curve is parametrized by ψ As $\psi_0 = \phi_x(x) = \pi_x$, a direct computation (for details see the proof of Lemma B.6 in Appendix B) shows that the map sending $v \in V_x$ to this tangent vector in $S$ is injective. This makes it possible to write the tangent space as

Theorem 3.12. Using the identification (3.18), the mapping
\[ g_x : T_x F^{\mathrm{reg}} \times T_x F^{\mathrm{reg}} \to \mathbb{R} , \qquad g_x(A, B) := \operatorname{tr}(AB) \]
defines a Fréchet-smooth Riemannian metric on $F^{\mathrm{reg}}$. Moreover, the topology on $F^{\mathrm{reg}}$ induced by the operator norm coincides with the topology induced by the Riemannian metric.
Proof. This follows immediately because $g_x$ is the restriction of the Hilbert space scalar product to the Fréchet-smooth submanifold $F^{\mathrm{reg}}$.
We finally remark that the symmetric wave charts are related to Gaussian charts (see the formulas in [15, Sections 5 and 6.2], which apply to the infinite-dimensional case as well). Detailed computations for the Riemannian metric in symmetric wave charts are given in Appendix B.

Differential Calculus on Expedient Subspaces
If all functions arising in the analysis were Fréchet-smooth, all the methods and notions from the finite-dimensional setting could be adapted in a straightforward way to the infinite-dimensional setting. However, this procedure is not sufficient for our purposes, because the Lagrangian is not Fréchet-smooth. Therefore, we need to develop a differential calculus on Banach spaces for functions which are only Hölder continuous. Clearly, in general such functions are not even Fréchet-differentiable, but the Gâteaux derivative may exist in certain directions. The disadvantage of Gâteaux derivatives is that the differentiable directions in general do not form a vector space. As a consequence, the usual computation rules like the linearity of the derivative or the chain and product rules cease to hold. Our strategy for preserving the usual computation rules is to work on suitable linear subspaces of the star-shaped set of all Gâteaux-differentiable directions, referred to as the expedient differentiable subspace.
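A simple two-dimensional example shows how the set of Gâteaux-differentiable directions can fail to be a vector space. The function below is not taken from the text and serves only as an illustration; it is locally Hölder continuous with exponent $\frac{1}{2}$ but not differentiable at the origin.

```latex
% Illustration on E = R^2, F = R (not from the source):
% f(x,y) := \sqrt{|x y|}, base point x_0 = 0.
\begin{align*}
\frac{f(h u) - f(0)}{h}
  = \frac{|h|\, \sqrt{|u_1 u_2|}}{h}
  = \operatorname{sgn}(h)\, \sqrt{|u_1 u_2|}
  \qquad \text{for } u = (u_1, u_2),\; h \neq 0 .
\end{align*}
% The limit h \to 0 exists if and only if u_1 u_2 = 0. Hence the set of
% Gateaux-differentiable directions at the origin is
\begin{align*}
\{ u \in \mathbb{R}^2 \,|\, u_1 = 0 \} \,\cup\, \{ u \in \mathbb{R}^2 \,|\, u_2 = 0 \} ,
\end{align*}
% which is star-shaped about the origin but not closed under addition:
% (1,0) and (0,1) are differentiable directions, whereas (1,1) is not.
```

On each coordinate axis separately the function vanishes identically and is therefore smooth, so each axis is a linear subspace on which the usual computation rules hold; this is the kind of subspace the constructions of this section single out.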
Let $E$ and $F$ be Banach spaces, $U \subseteq E$ open, $f : U \to F$ a function, $x_0 \in U$ and $V \subseteq E$ a linear subspace. The function $f$ is called $k$-times $V$-differentiable at $x_0$ if for every finite-dimensional subspace $H \subseteq V$, the function $g_H(h) := f(x_0 + h)$, defined for $h \in H$ in a small neighborhood of the origin, is $k$-times continuously differentiable at $h = 0$. If this condition holds, the subspace $V$ is called $k$-admissible at $x_0$.

Thus a function $f$ is once $V$-differentiable at $x_0$ if for every finite-dimensional subspace $H \subseteq V$ the derivative $Dg_H|_{h_0}$ exists for every $h_0$ in a small neighborhood of the origin, and if $Dg_H|_{h_0}$ is continuous in the variable $h_0$ at $h_0 = 0$. Equivalently, choosing a basis $e_1, \ldots, e_L$ of $H$, this condition can be stated by demanding that all partial derivatives $\frac{\partial}{\partial \alpha_i}\, g_H(\alpha_1 e_1 + \cdots + \alpha_L e_L)$ exist and are continuous at $\alpha_1 = \cdots = \alpha_L = 0$. The higher differentiability of $g_H$ can be defined inductively or, equivalently, by demanding that all partial derivatives up to the order $k$, i.e. all the functions
$$\frac{\partial^p}{\partial \alpha_{i_1} \cdots \partial \alpha_{i_p}}\, g_H(\alpha_1 e_1 + \cdots + \alpha_L e_L) \quad \text{with } i_1, \ldots, i_p \in \{1, \ldots, L\} \text{ and } p \le k,$$
exist and are continuous at $\alpha_1 = \cdots = \alpha_L = 0$.

An admissible subspace $V$ is maximal if there are no admissible proper extensions $\tilde V \supsetneq V$. The existence of maximal admissible subspaces is guaranteed by Zorn's lemma, but maximal subspaces are in general not unique. In order to obtain a canonical subspace, we take the intersection of all maximal admissible subspaces. This intersection is the expedient differentiable subspace, denoted by $E^k(f, x_0)$ (or simply $E(f, x_0)$ when the order is clear from the context).

Since the expedient differentiable subspace is again admissible at $x_0$, we obtain a corresponding derivative as follows. Given $k \in \mathbb{N}$ and vectors $h_1, \ldots, h_k \in E^k(f, x_0)$, we choose $H$ as a finite-dimensional subspace which contains these vectors. We set
$$D^{k,E} f|_{x_0}(h_1, \ldots, h_k) := D^k g_H|_0(h_1, \ldots, h_k). \qquad (4.1)$$

Lemma 4.3. This procedure defines $D^{k,E} f|_{x_0}$ canonically as a symmetric, multilinear mapping from $k$ copies of $E^k(f, x_0)$ to $F$.

Proof. In order to show that $D^{k,E} f|_{x_0}$ is well-defined, let $H$ and $\tilde H$ be two finite-dimensional subspaces of $E^k(f, x_0)$ which contain the vectors $h_1, \ldots, h_k$. Then, expressing the derivatives in terms of partial derivatives, it follows that
$$D^k g_H|_0(h_1, \ldots, h_k) = \frac{\partial^k}{\partial \alpha_1 \cdots \partial \alpha_k}\, f(x_0 + \alpha_1 h_1 + \cdots + \alpha_k h_k)\Big|_{\alpha = 0} = D^k g_{\tilde H}|_0(h_1, \ldots, h_k).$$
This shows that the definition (4.1) does not depend on the choice of $H$. The symmetry and homogeneity follow immediately from the corresponding properties of $D^k g_H$ in (4.1). In order to prove additivity, we let $h_1, \ldots, h_k \in E^k(f, x_0)$ and $\tilde h_1, \ldots, \tilde h_k \in E^k(f, x_0)$. We let $H$ be the span of all these vectors and use that the corresponding operator $D^k g_H|_0$ in (4.1), applied to $h_1 + \tilde h_1, \ldots, h_k + \tilde h_k$, is multilinear.

Note that the operator $D^{k,E} f|_{x_0}$ is in general not bounded. Moreover, $E^k(f, x_0)$ will in general not be a closed subspace of $E$, nor will it in general be dense.
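The non-uniqueness of maximal admissible subspaces can already be seen in two dimensions. The following worked example is an illustration added here, not taken from the text; the choice $E = \mathbb{R}^2$, $F = \mathbb{R}$, $x_0 = 0$ and the function $f$ are assumptions made for this sketch.

```latex
% Illustration on E = \mathbb{R}^2, F = \mathbb{R}, x_0 = 0 (hypothetical example).
\begin{align*}
f(x,y) &= \sqrt{|x\,y|} , \\
f(\alpha, 0) &= 0 = f(0,\alpha)
  && \text{smooth along } \mathrm{span}(e_1) \text{ and along } \mathrm{span}(e_2), \\
f(t a, t b) &= |t|\,\sqrt{|a\,b|}
  && \text{not differentiable at } t = 0 \text{ if } a\,b \neq 0 .
\end{align*}
% Every subspace containing a vector (a,b) with ab \neq 0 fails to be admissible.
% Hence the admissible subspaces at 0 are \{0\}, span(e_1) and span(e_2);
% both lines are maximal, and their intersection is \{0\}.
```

In this example $f$ is locally Hölder continuous with exponent $1/2$, the two coordinate axes are distinct maximal admissible subspaces, and the intersection of all maximal admissible subspaces is trivial.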

Derivatives Along Smooth Curves.
We now analyze under which assumptions directional derivatives exist. To this end, we let $I$ be an interval and $\gamma : I \to E$ a smooth curve (here the notions of Fréchet and Gâteaux smoothness coincide). Moreover, let $U \subseteq E$ be open and $t_0 \in I$ with $x_0 := \gamma(t_0) \in U$. Given a function $f : U \to F$, we consider the composition $f \circ \gamma : I \to F$.

Proposition 4.4. (chain rule)
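Assume that $f$ is locally Hölder continuous at $x_0$, meaning that there is a neighborhood $V \subseteq U$ of $x_0$ as well as constants $\alpha, c > 0$ such that
$$\|f(x) - f(x')\| \le c\, \|x - x'\|^{\alpha} \quad \text{for all } x, x' \in V.$$
Moreover, assume that the derivatives of $\gamma$ at $t_0$ up to an order $p$ with $\alpha p \ge 1$ lie in the expedient differentiable subspace $E(f, x_0)$; this is condition (4.3). Then the function $f \circ \gamma$ is differentiable at $t_0$, and its derivative is given by the chain rule
$$(f \circ \gamma)'(t_0) = D^{1,E} f|_{x_0}\, \gamma'(t_0).$$

Proof. We consider the polynomial approximation of $\gamma$,
$$\gamma_p(t) := \sum_{n=0}^{p} \frac{\gamma^{(n)}(t_0)}{n!}\, (t - t_0)^n. \qquad (4.4)$$
By assumption, this curve lies in the affine subspace $E(f, x_0) + x_0$. Using that the restriction of $f$ to this subspace is continuously differentiable, it follows that $f \circ \gamma_p$ is differentiable at $t_0$ with
$$(f \circ \gamma_p)'(t_0) = D^{1,E} f|_{x_0}\, \gamma_p'(t_0) = D^{1,E} f|_{x_0}\, \gamma'(t_0).$$
It remains to control the error term of the polynomial approximation. Using that $f$ is locally Hölder continuous, we know that
$$\|(f \circ \gamma)(t) - (f \circ \gamma_p)(t)\| \le c\, \|\gamma(t) - \gamma_p(t)\|^{\alpha} = O\big(|t - t_0|^{\alpha (p+1)}\big).$$
According to (4.3), we know that $\alpha p \ge 1$ and thus $\alpha(p+1) > 1$. Therefore, the error term is of the order $o(t - t_0)$, which shows that also the function $t \mapsto (f \circ \gamma)(t) - (f \circ \gamma_p)(t)$ is differentiable at $t_0$ with vanishing derivative. This proves the desired result.

This result immediately generalizes to higher derivatives:

Proposition 4.5. Let $q \in \mathbb{N}$. Assume that $f$ is locally Hölder continuous at $x_0$ with exponent $\alpha$ as above, and that the derivatives of $\gamma$ at $t_0$ up to the order $p := \lceil q/\alpha \rceil$ lie in the expedient differentiable subspace $E^q(f, x_0)$; these are the conditions (4.5) and (4.6). Then the function $f \circ \gamma$ is $q$-times differentiable at $t_0$, and the derivative can be computed with the usual product and chain rules (formula of Faà di Bruno).

Proof. We again consider $f$ along the polynomial approximation $\gamma_p$ (4.4) of the curve $\gamma$. By assumption, this curve lies in a finite-dimensional subspace of the affine space $E^q(f, x_0) + x_0$. Using that the restriction of $f$ to this subspace is continuously differentiable, we know that $f \circ \gamma_p$ is $q$ times continuously differentiable at $t = t_0$, and the derivatives can be computed with the formula of Faà di Bruno. Using (4.5) and (4.6), we conclude that
$$\|(f \circ \gamma)(t) - (f \circ \gamma_p)(t)\| \le c\, \|\gamma(t) - \gamma_p(t)\|^{\alpha} = O\big(|t - t_0|^{\alpha(p+1)}\big) = o\big(|t - t_0|^{q}\big).$$
It follows that also this function is $q$-times differentiable at $t_0$ and that all its derivatives up to order $q$ vanish. This concludes the proof.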
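To see how the order $p$ of the polynomial approximation and the Hölder exponent $\alpha$ fit together, it may help to spell out the exponent count in the error estimate. The sketch below assumes that $\gamma_p$ is the Taylor polynomial of $\gamma$ of degree $p$ at $t_0$ and that $C$ is a bound for its remainder; the constant $C$ and the sample values of $n$ and $q$ are hypothetical and chosen only for illustration.

```latex
% Assumption: \gamma_p is the Taylor polynomial of degree p of \gamma at t_0, and
% C is a constant with \|\gamma(t)-\gamma_p(t)\| \le C |t-t_0|^{p+1} near t_0.
\begin{align*}
\|f(\gamma(t)) - f(\gamma_p(t))\|
  &\le c\,\|\gamma(t)-\gamma_p(t)\|^{\alpha}
   \le c\,C^{\alpha}\,|t-t_0|^{\alpha(p+1)} . \\
\intertext{For $\alpha = \tfrac{1}{2n-1}$ and $p = q\,(2n-1) = q/\alpha$:}
\alpha\,(p+1) &= q + \tfrac{1}{2n-1} > q ,
\qquad\text{e.g. } n=2,\ q=1:\quad \alpha = \tfrac13,\ p = 3,\ \alpha(p+1) = \tfrac43 .
\end{align*}
```

With these values the error term decays faster than $|t - t_0|^q$, which is the property used in both proofs above.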

Local Hölder Continuity of the Causal Lagrangian.
The goal of this section is to prove the following result.
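Theorem 5.1. The Lagrangian is locally Hölder continuous in the sense that for all $x, y_0 \in F$ there is a neighborhood $U \subseteq F$ of $y_0$ and a constant $c > 0$ such that
$$|L(x, y) - L(x, \tilde y)| \le c\, \|y - \tilde y\|^{\frac{1}{2n-1}} \quad \text{for all } y, \tilde y \in U, \qquad (5.1)$$
where $n$ is the spin dimension. Moreover, the integrand of the boundedness constraint is locally Lipschitz continuous in the sense that it satisfies the analogous inequality (5.2) with exponent one for all $y, \tilde y \in U$.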
We begin with a preparatory lemma (Lemma 5.2) on the dependence of the roots of a monic polynomial on its coefficients.

This lemma is proven in a more general context in [4, Theorem 2]. For self-consistency we here give a simple proof based on Rouché's theorem.

Proof of Lemma 5.2. After the rescaling $\lambda \to \nu\lambda$ and $\lambda_i \to \nu\lambda_i$ with $\nu > 0$, we can assume that all the roots $\lambda_i$ are in the unit ball. Then the polynomial $\Delta P := \tilde P - P$ is bounded in the ball of radius two by the estimate (5.3). We denote the minimal distance of distinct eigenvalues by $D$. Since there is a finite number of roots, it clearly suffices to prove the lemma for one of them. Given $i \in \{1, \ldots, g\}$ we choose the radius $\delta$ according to (5.4). Next, we choose $\varepsilon$ so small that $\delta < D/2$. We consider the ball $\Omega = B_\delta(\lambda_i)$. Then for any $\lambda \in \partial\Omega$, the polynomial $P$ satisfies a lower bound which dominates $|\Delta P(\lambda)|$, where we used (5.4) and (5.3). Therefore, Rouché's theorem (see for example [27, Theorem 10.36]) implies that the polynomials $P$ and $\tilde P$ have the same number of roots in the ball $\Omega$. Thus, after a suitable ordering of the roots, $|\tilde\lambda_i - \lambda_i| \le \delta$. Using (5.4) gives the result.
Proof of Theorem 5.1. Let $x, y \in F$. Since both operators $x$ and $y$ vanish on the orthogonal complement of the span of their images, $J := \mathrm{span}(S_x, S_y)$, it suffices to compute the eigenvalues on the finite-dimensional subspace $J$. Choosing an orthonormal basis of $S_x = x(H)$ and extending it to an orthonormal basis of $J$, the matrix $xy|_J - \lambda\, \mathbb{1}_J$ has the block matrix form
$$\begin{pmatrix} xy\pi_x - \lambda & * \\ 0 & -\lambda \end{pmatrix},$$
because $xy$ maps into $S_x$. Therefore, its characteristic polynomial is given by
$$\det\big(xy|_J - \lambda\big) = (-\lambda)^{\dim J - g}\, \det\big(xy\pi_x|_{x(H)} - \lambda\big), \qquad g := \dim x(H).$$
This consideration shows that it suffices to analyze the operators $xy\pi_x$ and similarly $x\tilde y\pi_x$ on the finite-dimensional Hilbert space $x(H)$. We denote the corresponding characteristic polynomials by $P$ and $\tilde P$, respectively. They are monic polynomials of degree $g$. The difference of these polynomials can be estimated in terms of operator norms on $L(H)$, with an estimate valid for all $\tilde y$ with $\|\tilde y\| \le 2\|y\|$. According to Lemma 5.2, for sufficiently small $\|y - \tilde y\|$ the eigenvalues of these matrices can be arranged to satisfy the corresponding Hölder inequalities. In order to prove (5.2), we consider the estimate (5.5) and use that $g \le 2n$.

It remains to prove (5.1). In the case $g < 2n$, a simple estimate similar to (5.5) gives the result. In the remaining case $g = 2n$, we use the abbreviation $\Delta\lambda_i := \tilde\lambda_i - \lambda_i$ and estimate the difference of the Lagrangians term by term, where in the last step we use that, whenever $\lambda_i \neq \lambda_j$, the multiplicities of both roots are at most $g - 1$. This yields the desired Hölder inequality with exponent $1/(2n - 1)$. Finally, it is clear from the construction that the constant depends continuously on $y$. This concludes the proof.
In the case of spin dimension one, the Lagrangian is Lipschitz continuous, in agreement with the findings in [20]. If the spin dimension is larger, one still has Hölder continuity, but the Hölder exponent becomes smaller if the spin dimension is increased. This can be understood from the fact that the higher the spin dimension is, the higher the degeneracies of the eigenvalues of xy can be.
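A standard example, added here for illustration and not taken from the text, shows why degeneracies lower the Hölder exponent for roots of polynomials. The perturbation parameter $\varepsilon > 0$ and the degree $g$ are hypothetical choices.

```latex
% Illustration (assumption: \varepsilon > 0 small, g \ge 1 the degree).
\begin{align*}
P(\lambda) &= \lambda^{g}, &
\tilde P(\lambda) &= \lambda^{g} - \varepsilon , \\
\lambda_k &= 0, &
\tilde\lambda_k &= \varepsilon^{1/g}\, e^{2\pi i k/g}, \qquad k = 1,\dots,g , \\
|\tilde\lambda_k - \lambda_k| &= \varepsilon^{1/g} , &
|\tilde\lambda_k| &= \varepsilon^{1/g} \ \text{ for all } k .
\end{align*}
```

The coefficients change by $\varepsilon$, while the $g$-fold root moves by $\varepsilon^{1/g}$, so no exponent better than $1/g$ is possible for the roots themselves. In this example all perturbed roots have the same absolute value, which appears to be in line with the observation in the proof of Theorem 5.1 that only roots of multiplicity at most $g - 1$ matter when $\lambda_i \neq \lambda_j$.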
We next prove a global Hölder continuity result (Theorem 5.3), namely the inequality (5.6).

Proof. Without loss of generality we can assume that $x \neq 0$. Moreover, using that both sides of the inequality (5.6) have the same scaling behavior under rescalings of $x$ and $y$, it suffices to consider the case that $\|x\| = \|y\| = 1$. Next, choosing a fixed $4n$-dimensional subspace $I \subseteq H$, we can always find a unitary transformation $U : H \to H$ such that $U x U^{-1}(H),\, U y U^{-1}(H) \subseteq I$. Since the Lagrangian and the operator norms are invariant under such joint unitary transformations (as they leave the eigenvalues of $xy$ invariant), we can assume that both $x$ and $y$ map into the fixed finite-dimensional subspace $I$.

After these transformations, the operators $x$ and $y$ can be considered as operators in $L(I)$. Therefore, they lie in the compact set $\overline{B_1(0)} \subseteq L(I)$. Since the Hölder constant for the local Hölder continuity depends continuously on $x$ and $y$, a compactness argument shows that we can choose the Hölder constant uniformly in $x$ and $y$: As the previous arguments show, the local Hölder constant can be written as a continuous function of $x$ and $y$.

(2) As explained in the proof of Theorem 5.3, the Lagrangian $L(x, y)$ depends only on the nonzero eigenvalues of $xy$, and these coincide with the eigenvalues of $xy\pi_x$. Thus, denoting $J := \mathrm{span}(S_x, S_{\tilde x})$, we immediately obtain the following strengthened version of (5.7): Every $x \neq 0$ has a neighborhood $U \subset F$ such that the strengthened inequality holds for all $\tilde x \in U$ and all $y \in F$. This estimate will be needed for the proof of the chain rule for the integrated Lagrangian $\ell$ in Theorem 5.9.
(3) In the case y = 0, a direct estimate of the eigenvalues shows that one has Hölder continuity with the improved exponent two, This inequality can be combined with the result of Theorem 5. This inequality will be used in the proof of Theorem 5.9. ♦

Definition of Jet Spaces.
For the analysis of causal variational principles, the jet formalism was developed in [17]; see also [13,Section 2]. We now generalize the definition of the jet spaces to causal fermion systems in the infinite-dimensional setting.
Our method is to work with the expedient subspaces, where for convenience derivatives at $x$ are always computed in the corresponding chart $\phi_x$. For example, for analyzing the differentiability of a real-valued function $f$ at a point $x \in F^{\mathrm{reg}}$, we consider the composition of $f$ with the inverse chart, regarded as a mapping $\Omega_x \subseteq \mathrm{Symm}(S_x) \oplus L(J, I) \to \mathbb{R}$. We introduce $\Gamma^{\mathrm{diff}}_\rho$ as the linear space of all vector fields for which the directional derivative of the function $\ell$ exists in the sense of expedient subspaces (see Definition 4.2), for all $x \in M$. This gives rise to the jet space $J^{\mathrm{diff}}_\rho$.

We choose a linear subspace $J^{\mathrm{test}}_\rho \subseteq J^{\mathrm{diff}}_\rho$ with the property that its scalar and vector components are both vector spaces (5.12), and that the scalar component is nowhere trivial. It is convenient to consider a pair $\mathfrak{u} := (a, u)$ consisting of a real-valued function $a$ on $M$ and a vector field $u$ on $T F^{\mathrm{reg}}$ along $M$, and to denote the combination of multiplication and directional derivative by
$$\nabla_{\mathfrak{u}} \ell(x) := a(x)\, \ell(x) + (D_u \ell)(x). \qquad (5.13)$$

For the Lagrangian, being a function of two variables $x, y \in F^{\mathrm{reg}}$, we always work in charts $\phi_x$ and $\phi_y$, giving rise to the mapping (5.14) defined on an open subset of $E$, where $E$ is the Cartesian product of the Banach spaces $\mathrm{Symm}(S_x) \oplus L(J_x, I_x)$ and $\mathrm{Symm}(S_y) \oplus L(J_y, I_y)$ with a corresponding norm (here the subscripts $x$ and $y$ clarify the dependence on the base points, i.e. $I_x = x(H)$, $J_x = I_x^\perp \subseteq H$ and similarly at $y$). We denote partial derivatives acting on the first and second arguments by subscripts 1 and 2, respectively. Throughout this paper, we use the following conventions for partial derivatives and jet derivatives:

◮ Partial and jet derivatives with an index $i \in \{1, 2\}$, as for example in (5.15), only act on the respective variable of the function $L$. This implies, for example, that the derivatives commute.

◮ The partial or jet derivatives which do not carry an index act as partial derivatives on the corresponding argument of the Lagrangian.

Definition 5.5. For any $\ell \in \mathbb{N}_0 \cup \{\infty\}$, the jet space $J^\ell_\rho \subseteq J_\rho$ is defined as the vector space of test jets with the following properties:

(i) The directional derivatives of the Lagrangian up to order $\ell$ exist for all $y \in M$ and all $x$ in an open neighborhood of $M \subseteq F^{\mathrm{reg}}$. The higher jet derivatives are defined by using (5.13) and multiplying out, keeping in mind that the partial derivatives act only on the Lagrangian.

(ii) The resulting functions are $\rho$-integrable in the variable $y$, giving rise to locally bounded functions in $x$. More precisely, these functions are required to lie in a suitable space of locally bounded functions of $x$.

Theorem 5.6. Let $\gamma_1$ and $\gamma_2$ be two smooth curves in $F^{\mathrm{reg}}$. Setting $x = \gamma_1(0)$ and $y = \gamma_2(0)$, we assume that the tangent vectors of the curves at $\tau = 0$ up to the order $p = 2n - 1$ lie in the expedient differentiable subspace of the Lagrangian at $(x, y)$, computed in the charts $\phi_x$ and $\phi_y$. Then the function $L(\gamma_1(\tau), \gamma_2(\tau))$ is $\tau$-differentiable at $\tau = 0$ and the chain rule holds.

Proof. We again consider the Lagrangian in the charts $\phi_x$ and $\phi_y$, (5.14). In order to show that this function is locally Hölder continuous on $E$, we begin with a Hölder estimate for the Lagrangian. Using that the mapping appearing in the inverse charts is bilinear and therefore Fréchet-smooth, this estimate carries over to a neighborhood of $(\psi_x, \psi_y)$ in $E$, where $\psi_x := \phi_x^{-1}(x)$ and $\psi_y := \phi_y^{-1}(y)$. This proves local Hölder continuity on $E$. Applying Proposition 4.4 gives the result.
We remark that, using Proposition 4.5, the above method could be generalized in a straightforward manner to higher derivatives.
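Definition 5.7. We call $\ell$ Hölder continuous with Hölder exponent $\alpha$ along a smooth curve $\gamma : I \to F$ (with $I$ an open interval) if for any $t_0 \in I$ with $x_0 = \gamma(t_0)$ there exist a subspace $E_0 \subseteq \mathrm{Symm}(S_{x_0}) \oplus L(J_{x_0}, I_{x_0})$ and $\delta > 0$ such that the mapping $\ell_{x_0}$, given by $\ell$ in the chart $\phi_{x_0}$ on a $\delta$-neighborhood of $(\mathbb{1}, 0)$ in $E_0 + (\mathbb{1}, 0)$, is well defined and locally Hölder continuous with Hölder exponent $\alpha$.

Theorem 5.8. Let $\gamma : I \to F$ be a smooth curve and $\ell$ Hölder continuous along $\gamma$ with Hölder exponent $\alpha$. For $t_0 \in I$ with $x_0 = \gamma(t_0)$ we denote the curve $\gamma$ in the chart $\phi_{x_0}$ by $\gamma_{x_0}$. If for any $t_0 \in I$ the derivatives of $\gamma_{x_0}$ up to the order $p := \lceil q/\alpha \rceil$ lie in the expedient differentiable subspace at $x_0$, then $\ell \circ \gamma$ is $q$-times differentiable, and the derivatives can be computed with the usual product and chain rules.

Proof. Applying Proposition 4.5 to $\ell_{x_0}$ and $\gamma_{x_0}$ yields the claim, as the assumptions of this proposition are clearly fulfilled.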
We now give a sufficient condition which ensures that ℓ is Hölder continuous along γ. This condition needs to be verified in the applications; see for example [25].
Theorem 5.9. Let $\gamma$ be a smooth curve in $F$ satisfying an integrability condition formulated in terms of $P(x, y)$ and $Y$, where $P(x, y)$ is again the kernel of the fermionic projector (3.11) and $Y$ is (similar to (3.5)) the invertible operator $Y := y|_{S_y} : S_y \to S_y$. Then the integrated Lagrangian $\ell$ defined by (1.1) is Hölder continuous along $\gamma$ with Hölder exponent $\frac{1}{2n-1}$.

Proof. The idea of the proof is to integrate the estimate (5.10) over $M$. To this end, it is crucial to estimate the factor $\|\pi_J\, y\, \pi_J\|$. We let $(\phi_i)_{i \in \{1, \ldots, m\}}$ be an orthonormal basis of $J$ and denote the orthogonal projection on $\mathrm{span}(\phi_i)$ by $\pi_i$. Since on the finite-dimensional vector space $L(J)$ all norms are equivalent, we can work with the Hilbert-Schmidt norm of $\pi_J\, y\, \pi_J$. This norm can be bounded with a suitable constant $C = C(n)$, where in the last step we use that the norm of an operator is the same as the norm of its adjoint. Combining this inequality with the estimate
$$\|\pi_J\, y\, \psi\|^2 \le \big( \|\pi_x\, y\, \psi\| + \|\pi_{\tilde x}\, y\, \psi\| \big)^2 \le 2\, \|\pi_x\, y\, \psi\|^2 + 2\, \|\pi_{\tilde x}\, y\, \psi\|^2,$$
we obtain a bound for $\|\pi_J\, y\, \pi_J\|$. Using this bound when integrating (5.10) over $M$ and noting that $\phi_x^{-1}$ is locally Lipschitz (since it is Fréchet-smooth) yields the claim.

(ii) See [6, pp. 149-150], and note that the completeness of the vector spaces is not needed for this result. Moreover, as $Df : U \to L(V, W)$ is constant, it is clear that all higher Fréchet derivatives of $f$ vanish (and in particular $f$ is Fréchet-smooth).

(iii) $B$ is Fréchet-differentiable with the stated Fréchet derivative, as $\|B(h_1, h_2)\| \le C\, \|h_1\|\, \|h_2\|$ for a fixed $C > 0$ (as $B$ is continuous and bilinear); see also [6, pp. 149-150] (again the completeness is not needed). And since the mapping $(v, w) \mapsto DB|_{(v, w)}$ is clearly bounded linear, $B$ is Fréchet-smooth due to part (ii).
Lemma A.2. Let $V$, $W$ and $Z$ be real normed spaces, $n \in \mathbb{N}$ arbitrary and $U_V \subseteq V$, $U_W \subseteq W$ open subsets, $f : U_V \to W$ $n$-times Fréchet-differentiable (Fréchet-smooth) with $f(U_V) \subseteq U_W$, and $g : U_W \to Z$ $n$-times Fréchet-differentiable (Fréchet-smooth). Then $g \circ f$ is also $n$-times Fréchet-differentiable (respectively Fréchet-smooth).

Proof. We show the result by induction over $n$ following [6, p. 183]: The case $n = 1$ follows from the chain rule. Now let $n \ge 2$ be arbitrary and suppose that the claim holds for $n - 1$. Then the induction hypothesis yields that the mapping
$$x \mapsto D(g \circ f)|_x = Dg|_{f(x)} \circ Df|_x$$
is $(n-1)$-times Fréchet-differentiable, because $Df$, $f$ and $Dg$ are at least $(n-1)$-times Fréchet-differentiable and the composition operator $\circ$ is even Fréchet-smooth (as it is bounded and bilinear). Thus $g \circ f$ is $n$-times Fréchet-differentiable.
The smoothness result follows immediately from the result for n-times differentiability.
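The following lemma gives a useful computation rule for higher Fréchet derivatives (see also [6, pp. 179, 181]):

Lemma A.3. Let $V$, $W$ be real normed spaces, $U \subseteq V$ open and $f : U \to W$ $n$-times differentiable. Then for any $x_0 \in U$ and $v_1, \ldots, v_n \in V$,
$$D^n f|_{x_0}(v_1, \ldots, v_n) = D\big( D^{n-1} f|_{(\cdot)}(v_1, \ldots, v_{n-1}) \big)\big|_{x_0}\, v_n.$$

Proof. We follow the idea of the proof given in [6, pp. 179, 181] and also make use of the symmetry result in Lemma 2.3. We first fix $v_1, \ldots, v_n \in V$ and define a linear map $E_{v_1, \ldots, v_{n-1}}$ which simply inserts the vectors $v_1, \ldots, v_{n-1}$ into its argument $A$. Note that $E_{v_1, \ldots, v_{n-1}}$ is clearly bounded linear and thus Fréchet-smooth. Using the representation $D^{n-1} f|_x(v_1, \ldots, v_{n-1}) = E_{v_1, \ldots, v_{n-1}}\big(D^{n-1} f|_x\big)$, we obtain
$$D\big( E_{v_1, \ldots, v_{n-1}} \circ D^{n-1} f \big)\big|_{x_0} v_n = E_{v_1, \ldots, v_{n-1}}\big( D^n f|_{x_0} v_n \big) = \big( D^n f|_{x_0} v_n \big)(v_1, \ldots, v_{n-1}) = D^n f|_{x_0}(v_n, v_1, \ldots, v_{n-1}) = D^n f|_{x_0}(v_1, \ldots, v_n),$$
where in the first step we used the chain rule and Lemma A.1 (ii). In the second step, we used the definition of $E_{v_1, \ldots, v_{n-1}}$, whereas in the third step we re-identified $D^n f|_{x_0}$ with the corresponding multilinear mapping from $V^n$ to $W$. Finally, in the last step we used the symmetry of $D^n f|_{x_0}$.

We finally state one last computation rule:

Lemma A.4. Let $V$, $W$ and $Z$ be normed vector spaces, $U \subseteq V$ open, $f : U \to W$ $n$-times Fréchet-differentiable and $A \in L(W, Z)$. Then also the function $A \circ f$ is $n$-times Fréchet-differentiable and
$$D^n (A \circ f)|_{x_0}(v_1, \ldots, v_n) = A\big( D^n f|_{x_0}(v_1, \ldots, v_n) \big). \qquad (A.2)$$

Proof. The Fréchet-differentiability follows immediately from Lemma A.2, using that $A$ is Fréchet-smooth. We show the identity (A.2) by induction over $n$: The case $n = 1$ follows immediately from the chain rule and Lemma A.1 (ii). Now let $n \ge 2$ and assume that the statement holds for $n - 1$. Using the previous lemma (first step), the induction hypothesis (i.e. (A.2) for the $(n-1)$-st derivative) in the second step, as well as the chain rule, Lemma A.1 (ii) and the symmetry of $D^n f$, for all $x_0 \in U$ and $v_1, \ldots, v_n \in V$ we obtain
$$D^n (A \circ f)|_{x_0}(v_1, \ldots, v_n) = D\big( D^{n-1}(A \circ f)|_{(\cdot)}(v_1, \ldots, v_{n-1}) \big)\big|_{x_0} v_n = D\big( A\, D^{n-1} f|_{(\cdot)}(v_1, \ldots, v_{n-1}) \big)\big|_{x_0} v_n = A\big( D^n f|_{x_0}(v_1, \ldots, v_n) \big).$$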

Appendix B. The Riemannian Metric in Symmetric Wave Charts
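In this appendix we give a detailed computation of the Riemannian metric introduced in Section 3.4 in terms of the symmetric wave charts. Hereby we adapt the methods in [15, Section 4] to the infinite-dimensional setting.

We begin by defining a distance function on $F^{\mathrm{reg}}$ by
$$d(x, y) := \sqrt{\mathrm{tr}\big( (x - y)^2 \big)}.$$
The trace operator involved here is well-defined and can be expressed in any orthonormal basis $(e_i)_{i \in \mathbb{N}}$ of $H$ by
$$\mathrm{tr}(A) = \sum_{i \in \mathbb{N}} \langle e_i | A e_i \rangle_H \qquad (B.1)$$
(for details see for example [23, Section 30]). Moreover, note that $d$ does indeed define a distance function on $F^{\mathrm{reg}}$, as for any two $x, y \in F^{\mathrm{reg}}$ we have $d(x, y) = \|x - y\|_{S(H)}$, where $\|\cdot\|_{S(H)}$ denotes the Hilbert-Schmidt norm (see for example [7, Section XI.6] or [28, pp. 321-322, 309-310]).

The following remark is a reminder of a calculation rule for the trace operator acting on operators with finite rank. Let $A$ be an operator of finite rank whose image lies in a finite-dimensional subspace $V \subseteq H$, let $(e_i)_{i = 1, \ldots, k}$ be an orthonormal basis of $V$ and $(\tilde e_i)_{i \in \mathbb{N}}$ an orthonormal basis of $V^\perp$. Then we obtain an orthonormal basis $(\hat e_i)_{i \in \mathbb{N}}$ of $H$ by setting $\hat e_i := e_i$ for $i = 1, \ldots, k$ and $\hat e_{k+j} := \tilde e_j$ for $j \in \mathbb{N}$. Using this basis in (B.1), the trace of $A$ reduces to
$$\mathrm{tr}(A) = \sum_{i=1}^{k} \langle e_i | A e_i \rangle_H,$$
because $A \tilde e_j \in V$ is orthogonal to $\tilde e_j$ for all $j$. This concludes the proof.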
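As a quick check of the rule that the trace of a finite-rank operator $A$ can be computed in an orthonormal basis of a finite-dimensional subspace containing the image of $A$, consider a rank-one operator. The vectors $\phi, \psi \in H$ with $\phi \neq 0$ are hypothetical choices made for illustration, and the scalar product is assumed to be linear in its second argument.

```latex
% Assumptions: \phi, \psi \in H with \phi \neq 0; \langle .|. \rangle_H linear in the second argument;
% tr(A) = \sum_i \langle e_i | A e_i \rangle_H as in the trace formula for an orthonormal basis.
\begin{align*}
A\,v &:= \phi\,\langle \psi | v \rangle_H , \qquad
V := \mathrm{span}(\phi), \qquad e_1 := \phi / \|\phi\| , \\
\mathrm{tr}(A) &= \langle e_1 | A e_1 \rangle_H
  = \langle e_1 | \phi \rangle_H \, \langle \psi | e_1 \rangle_H
  = \|\phi\| \cdot \frac{\langle \psi | \phi \rangle_H}{\|\phi\|}
  = \langle \psi | \phi \rangle_H .
\end{align*}
```

All basis vectors orthogonal to $\phi$ contribute zero, since $A$ maps every vector to a multiple of $\phi$, so the sum over an orthonormal basis of $H$ collapses to the single term above.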
In the following lemma we consider differentiability properties of a mapping E which corresponds to the square of the distance function d. Later we want to use it to introduce the Riemannian metric as second Fréchet-derivative of E.
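Lemma B.3. The mapping $E : F^{\mathrm{reg}} \times F^{\mathrm{reg}} \to \mathbb{R}$, $E(x, y) := d(x, y)^2$, and for any fixed $x \in F^{\mathrm{reg}}$ the mapping $E_x : F^{\mathrm{reg}} \to \mathbb{R}$, $E_x(y) := E(x, y)$, are Fréchet-smooth. Moreover, for all $x, y \in F^{\mathrm{reg}}$ with $x \in \Omega_y$ and all $u, v \in V_y$, the first derivative vanishes,
$$D\big(E_x \circ \phi_y^{-1}\big)\big|_{\phi_y(x)}\, u = 0, \qquad (B.2)$$
and the second derivative $D^2(E_x \circ \phi_y^{-1})|_{\phi_y(x)}(u, v)$ is given by an explicit trace formula.

Proof. First we have to show that $E \circ (\phi_x^{-1}, \phi_y^{-1})$ is Fréchet-smooth for all $x, y \in F^{\mathrm{reg}}$. To this end, first consider the following calculation for arbitrary $\varphi \in W_x$, $\psi \in W_y$:
$$E\big(\phi_x^{-1}(\varphi), \phi_y^{-1}(\psi)\big) = \mathrm{tr}\big( (\varphi^\dagger x \varphi - \psi^\dagger y \psi)^2 \big) = \mathrm{tr}\big( (\varphi^\dagger x \varphi)^2 \big) - \mathrm{tr}\big( \varphi^\dagger x \varphi\, \psi^\dagger y \psi \big) - \mathrm{tr}\big( \psi^\dagger y \psi\, \varphi^\dagger x \varphi \big) + \mathrm{tr}\big( (\psi^\dagger y \psi)^2 \big) = \mathrm{tr}\big( (\varphi^\dagger x \varphi)^2 \big) - 2\, \mathrm{tr}\big( \varphi^\dagger x \varphi\, \psi^\dagger y \psi \big) + \mathrm{tr}\big( (\psi^\dagger y \psi)^2 \big),$$
where in the second step we used the linearity of the trace and in the third step the cyclic permutation property (which can be applied as all factors and summands obviously have finite rank). The last line is clearly a sum of compositions of Fréchet-smooth mappings in $(\varphi, \psi)$, which proves the Fréchet-smoothness of $E \circ (\phi_x^{-1}, \phi_y^{-1})$. For calculating the Fréchet derivative of $E_x$, consider the expansion
$$E_x \circ \phi_y^{-1}(\psi) = \mathrm{tr}(x^2) - 2\, \mathrm{tr}\big( x\, \psi^\dagger y \psi \big) + \mathrm{tr}\big( (\psi^\dagger y \psi)^2 \big), \qquad (B.3)$$
which is again a sum of compositions of Fréchet-smooth functions, showing that also $E_x \circ \phi_y^{-1}$ is Fréchet-smooth. Applying the computation rule from Lemma A.4 together with the Fréchet derivative rule for bilinear functions in Lemma A.1 (iii) (multiple times and together with the chain rule), we obtain the first derivative. Using Lemma B.2 (iii) and (iv), this simplifies to the expression (B.4). In the case $\psi^\dagger y \psi = \phi_y^{-1}(\psi) = x$ the terms in (B.4) cancel each other, showing that $D(E_x \circ \phi_y^{-1})|_{\phi_y(x)} = 0$. Moreover, proceeding from (B.3), a straightforward computation using the properties of the Fréchet derivative and the trace operator as before gives the second derivative (B.5). As for $\psi^\dagger y \psi = \phi_y^{-1}(\psi) = x$ the first and the last term cancel each other, we obtain the stated formula, which concludes the proof.

Lemma B.4. For $x \in F^{\mathrm{reg}}$ and $u, v \in T_x F^{\mathrm{reg}}$, the expression $D^2(E_x \circ \phi_y^{-1})|_{\phi_y(x)}\big( D\phi_y(u), D\phi_y(v) \big)$ is independent of the choice of chart (i.e. the choice of $y$) as long as $y \in F^{\mathrm{reg}}$ is chosen such that $x \in \Omega_y$. Moreover, for all tangent vector fields $v, u \in \Gamma(F^{\mathrm{reg}}, T F^{\mathrm{reg}})$ and any $y \in F^{\mathrm{reg}}$ with $x \in \Omega_y$,
$$D_{v(x)}\, D_{u(\cdot)}\, E_x(\cdot) = D^2\big(E_x \circ \phi_y^{-1}\big)\big|_{\phi_y(x)}\big( D\phi_y(u(x)),\, D\phi_y(v(x)) \big), \qquad (B.7)$$
where the derivatives act on the arguments containing a dot.

This lemma also shows that the order of differentiation of $E_x$ with respect to the two vector fields does not matter. The proof shows that this is due to the fact that the first derivative of $E_x$ vanishes.

Proof. Let $v, u \in \Gamma(F^{\mathrm{reg}}, T F^{\mathrm{reg}})$ and $x, y \in F^{\mathrm{reg}}$ with $x \in \Omega_y$ be arbitrary. As we have seen before, the first directional derivative at a point $\tilde x$ can be computed in the chart $\phi_y$. Differentiating once more in the direction $D\phi_y(v(x))$ gives two summands, where we applied the Fréchet derivative rule for $\mathbb{R}$-bilinear maps in Lemma A.1 (iii) together with the chain rule. Evaluating this expression at $\tilde x = x$, the second summand vanishes in view of (B.2). We thus obtain
$$D_{v(x)}\, D_{u(\cdot)}\, E_x(\cdot) = D^2\big(E_x \circ \phi_y^{-1}\big)\big|_{\phi_y(x)}\big( D\phi_y(v(x)),\, D\phi_y(u(x)) \big).$$
Using the symmetry of the second Fréchet derivatives gives the result.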
Remark B.5. Equation (B.7) also shows that $D_{v(x)}\big(D_{u(\cdot)} E_x(\cdot)\big)$ only depends on the values of the vector fields $u$ and $v$ at the point $x$. Moreover, since for arbitrary $x \in F^{\mathrm{reg}}$ and $u, v \in T_x F^{\mathrm{reg}}$ one can always find smooth tangent vector fields with $v(x) = v$, $u(x) = u$ (for example by using a suitable bump function in a chart around $x$), we can consider the expression
$$D^2 E_x|_x(u, v) := D_{v(x)}\big( D_{u(\cdot)} E_x(\cdot) \big) \qquad (B.8)$$
as a well-defined, coordinate invariant (in the sense that the right-hand side of (B.8) returns the same value for any $y \in F^{\mathrm{reg}}$ with $x \in \Omega_y$) and symmetric bilinear form. ♦

Now it is convenient to compute (B.8) in the chart $\phi_x$. Then we have $\phi_x(x) = \pi_x$, and thus we obtain for any $u, v \in V_x$:
$$D^2 E_x|_x(u, v) = 4\, \mathrm{Re}\big( \mathrm{tr}(x u^\dagger x v^\dagger) + \mathrm{tr}(x^2 u v^\dagger) \big) = 4\, \mathrm{Re}\big( \mathrm{tr}(x v x u) + \mathrm{tr}(x u v^\dagger x) \big).$$

Motivated by this, for any $x \in F^{\mathrm{reg}}$ we set
$$\tilde g_x : V_x \times V_x \to \mathbb{R}, \qquad (u, v) \mapsto 4\, \mathrm{Re}\big( \mathrm{tr}(x v x u) + \mathrm{tr}(x u v^\dagger x) \big).$$

Due to the properties of the trace operator, $\tilde g_x$ defines a symmetric, real-valued bilinear form on $V_x$, which is even positive definite, as the following lemma shows:

Lemma B.6. The symmetric bilinear form $\tilde g_x$ is positive definite and thus defines a real-valued inner product on $V_x$.
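Proof. Let $u \in V_x$ be arbitrary, choose an orthonormal basis $(e_i)_{i = 1, \ldots, k}$ of the finite-dimensional vector space $S_x + u^\dagger(S_x)$ and compute
$$\mathrm{Re}\, \mathrm{tr}(x u x u) = \mathrm{Re} \sum_{i=1}^{k} \langle u^\dagger x e_i | x u e_i \rangle_H = \frac{1}{2} \sum_{i=1}^{k} \Big( \langle u^\dagger x e_i | x u e_i \rangle_H + \langle x u e_i | u^\dagger x e_i \rangle_H \Big),$$
$$\mathrm{tr}(x u u^\dagger x) = \frac{1}{2}\, \mathrm{Re}\, \mathrm{tr}(x u u^\dagger x) + \frac{1}{2}\, \mathrm{Re}\, \mathrm{tr}(u^\dagger x^2 u) = \frac{1}{2} \sum_{i=1}^{k} \Big( \langle u^\dagger x e_i | u^\dagger x e_i \rangle_H + \langle x u e_i | x u e_i \rangle_H \Big),$$
where in the last step we used that $\langle u^\dagger x e_i | u^\dagger x e_i \rangle_H = \|u^\dagger x e_i\|_H^2$ and $\langle x u e_i | x u e_i \rangle_H = \|x u e_i\|_H^2$ are already real (for $i = 1, \ldots, k$), so we can leave out the "Re". As explained in Remark B.1, the trace of the finite-rank operators $xuxu$, $xuu^\dagger x$ and $u^\dagger x^2 u$ can indeed be calculated in this basis, as they all map into $S_x + u^\dagger(S_x)$. Combining these identities, we obtain
$$\mathrm{Re}\, \mathrm{tr}(x u x u) + \mathrm{tr}(x u u^\dagger x) = \frac{1}{2} \sum_{i=1}^{k} \big\| (u^\dagger x + x u)\, e_i \big\|_H^2.$$
This shows the positive semi-definiteness of $\tilde g_x$. Moreover, we see that $\tilde g_x(u, u)$ vanishes if and only if $(u^\dagger x + x u)\, e_i = 0$ for all $i = 1, \ldots, k$. But as $(u^\dagger x + x u)$ is obviously selfadjoint and its image is contained in $S_x + u^\dagger(S_x)$, it vanishes on the orthogonal complement of $S_x + u^\dagger(S_x)$ anyhow, so the previous equation is equivalent to
$$0 = u^\dagger x + x u. \qquad (B.9)$$
Moreover, denoting by $\pi_I := \pi_x$ the orthogonal projection on $S_x = I$ and by $\pi_J$ the orthogonal projection on $J = (S_x)^\perp$, we can write $u = u \pi_I + u \pi_J$.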
As x is selfadjoint this also yields π I u † x = xuπ I .
This notation can be justified by "testing" equation (B.11) with $(v, 0), (0, w) \in H = I \oplus J$, where $v \in I$ and $w \in J$ are arbitrary. Thus we see that each of the operators $2 x u \pi_I$, $\pi_J u^\dagger x$ and $x u \pi_J$ must vanish individually. Furthermore, as $x|_I$ has full rank and $u$ maps into $S_x = I$, this yields $u \pi_I = 0$ and $u \pi_J = 0$, and therefore also $u = u (\pi_I + \pi_J) = 0$.
This proves the positive definiteness of $\tilde g_x(u, u) = 4\, \big( \mathrm{Re}\, \mathrm{tr}(x u x u) + \mathrm{tr}(x u u^\dagger x) \big)$.

Now we can finally introduce a Riemannian metric on $F^{\mathrm{reg}}$:

Lemma B.7. Setting pointwise for any $x \in F^{\mathrm{reg}}$ the bilinear form $g_x$ on $T_x F^{\mathrm{reg}}$ in terms of the second derivative $D^2 E_x|_x$ from (B.8), we obtain a well-defined Riemannian metric on $F^{\mathrm{reg}}$.

Proof. First of all, $g_x$ is well defined due to Lemma B.4, as explained in Remark B.5. Moreover, choosing representatives $[x, u, x], [x, v, x] \in T_x F^{\mathrm{reg}}$, the value of $g_x$ on these representatives is expressed through $\tilde g_x(u, v)$, and since we have already seen that for any $x \in F^{\mathrm{reg}}$, $\tilde g_x$ defines a symmetric positive-definite bilinear form, so does $g_x$.

Thus it only remains to show that $g$ is Fréchet-smooth. But due to the (coordinate invariant) definition of $D^2 E_x|_x$ in (B.8), this follows immediately from the Fréchet-smoothness of $D^2(E_x \circ \phi_y^{-1})$ (for this see Lemma B.3, in particular equation (B.5)). More precisely, since for any two smooth vector fields $u, v \in \Gamma(F^{\mathrm{reg}}, T F^{\mathrm{reg}})$ and any chart $\phi_y$ with $x \in \Omega_y$ also $D\phi_y \circ u \circ \phi_y^{-1}$ and $D\phi_y \circ v \circ \phi_y^{-1}$ are smooth, the metric evaluated on $u$ and $v$ is, as a function of $\psi \in W_y$, a composition of Fréchet-smooth maps. In more detail, introducing two mappings $B_1$ and $B_2$ which are both obviously $\mathbb{R}$-bilinear and continuous (and thus Fréchet-smooth), we can rewrite this function as
$$B_2\Big( B_1\Big( \big(d^2(x, \phi_y^{-1}(\cdot))\big)^{(2)}\big|_{\psi},\; D\phi_y\big(u \circ \phi_y^{-1}(\psi)\big) \Big),\; D\phi_y\big(v \circ \phi_y^{-1}(\psi)\big) \Big),$$
which is now clearly a composition of Fréchet-smooth maps.