The parabolic Anderson model on Riemann surfaces

We show well-posedness for the parabolic Anderson model on 2-dimensional closed Riemannian manifolds. To this end we extend the notion of regularity structures to curved space, and explicitly construct the minimal structure required for this equation. A central ingredient is the appropriate re-interpretation of the polynomial model, which we build up to any order.


Introduction
The last few years have seen an explosion of literature on singular stochastic partial differential equations (singular SPDEs). The simplest instance of such an equation is the parabolic Anderson model in two dimensions, formally written as
$$\partial_t u = \Delta u + u\,\xi. \qquad \text{(PAM)}$$
Here u : [0, T] × D → R is the unknown, where D is some 2-dimensional domain, and ξ is (time-independent) white noise on the domain D. This equation is formally ill-posed (or "singular"), since u is not expected to be regular enough for the product uξ to be well-defined analytically. The standard tool of stochastic calculus, the Itō integral, is also of no use here, since the white noise is constant in time.
With the breakthrough results of Hairer [9] and Gubinelli, Imkeller and Perkowski [7] a large class of such equations has become amenable to analysis. Let us sketch the approach of [9], since this is the one we shall use in this work.
• assume that u "looks like" the solution ν to the additive-noise equation
$$\partial_t \nu = \Delta \nu + \xi,$$
which is classically well-defined via convolution with the heat semigroup P_t
• under this assumption, if we somehow define ν · ξ, then the framework defines u · ξ automatically
• close the fixpoint argument, i.e.
  1. u "looks like" ν
  2. w := P_t u_0 + ∫_0^t P_{t−s}[u_s ξ] ds
  3. then w "looks like" ν

It then only remains to define the missing ingredient "ν · ξ". This can be done probabilistically and is actually the only place in this theory that is not deterministic. Using this procedure, it is shown in [9] that (PAM) possesses a unique solution for D = T^2, the two-dimensional torus.
In this work we show that the theory can be adapted to work for D = M, a 2-dimensional closed Riemannian manifold. The theory of regularity structures is intrinsically a local theory (as opposed to the theory of paracontrolled distributions, which, at least at first sight, is global in spirit). It is hence natural to expect that it can be applied to general geometries. It turns out, however, that the implementation of this heuristic is not straightforward.
At least two hurdles need indeed to be bypassed. On the one hand, at the core of Euclidean regularity structures stands the space of polynomials, encoding classical Taylor expansions at any point. The operation of re-expansion from a point to another leads to a morphism from (R d , +) to a space of unipotent matrices. On a manifold, one would need to look for such a space, encoding Taylor expansion and enjoying a similar structure. On the other hand, as usual for fixpoint arguments of (S)PDEs, one needs to estimate the improvement of the heat kernel in adequate spaces, which is a global operation (Schauder estimates).
To solve the first issue, we show that the space of polynomials on the tangent space of the manifold is a suitable candidate for a canonical regularity structure that allows one to encode Hölder functions. This choice enforces a modified definition of a regularity structure. In particular one has to abandon the idea of one fixed vector space and work with vector bundles instead. For our definition of a model, there is no unipotent structure anymore and re-expansions are only approximately compatible. Within this new framework, when considering the parabolic Anderson model on a surface, we give a weak version of a Schauder estimate with elementary tools and heat kernel estimates.
This exposition does not require any previous knowledge of regularity structures from the reader. In this sense it is self-contained, apart from a reference to the reconstruction theorem of Hairer in our Theorem 21 and in the construction of the Gaussian model in Sect. 8. Its proof, which uses wavelet analysis, would gain nothing from being reproduced here, and we believe that the validity of that reconstruction theorem, which we use in coordinates, is readily accepted.
We follow a very hands-on approach. Instead of trying to set up a general theory of regularity structures on manifolds, we work with the smallest structure that is necessary to solve PAM. We show the Schauder estimates explicitly. Apart from introducing for the first time regularity structures on manifolds, we believe our work also has a pedagogical value. Since everything is laid out explicitly and covers the flat case M = T 2 , it can serve as a gentle introduction to the general theory.
In future work we will investigate the algebraic foundation necessary for studying general equations, without having to build the regularity structure "by hand". For general equations a new proof of the Schauder estimates has also to be found.
During the writing of the present article, a different approach has been put forward in [2], where the notion of paracontrolled products using semigroups is developed on general metric spaces. The advantage of the paracontrolled approach is that it requires less machinery. On the downside, the class of equations that can be covered is currently strictly smaller than in the setting of regularity structures. Let us point to [3] though, which pushes the framework to more general equations.
The outline of this paper is as follows. After presenting notational conventions, we give in Sect. 2 the notion of distributions on a manifold that we shall use in this work. Moreover we introduce Hölder spaces on manifolds. In Sect. 3 we introduce the notions of regularity structure, model and modelled distribution on a manifold. We show how these objects behave nicely under diffeomorphisms and use this fact to show the reconstruction theorem. In Sect. 4 we give the simplest non-trivial example of a regularity structure on a manifold: the regularity structure for "linear polynomials". This forms the basis for the regularity structure for PAM, which is constructed in Sect. 5. As input it takes the product νξ alluded to before. This is constructed in Sect. 8 via renormalization. Section 6 gives the Schauder estimate for modelled distributions in the setting of PAM and finally Sect. 7 solves the corresponding fixpoint equation. In Sect. 9 we show how the construction of Sect. 4 can be extended to "polynomials" of arbitrary order.

Notation
In all that follows M will be a d-dimensional closed Riemannian manifold. When we specialize to the parabolic Anderson model (PAM), the dimension will be d = 2. Denote by δ > 0 the radius of injectivity of M. For p ∈ M we denote by exp_p : T_pM → M the exponential map. It is a diffeomorphism on B_{T_pM}(0_p, δ) := {x ∈ T_pM : |x|_{T_pM} < δ}, with inverse exp_p^{−1}. For a function ϕ supported in B_{T_pM}(0_p, δ) define for λ ∈ (0, 1], p ∈ M,
$$\varphi^\lambda_p(q) := \lambda^{-d}\, \varphi\big(\lambda^{-1} \exp_p^{-1}(q)\big), \qquad (2)$$
extended to all of M by setting it to zero outside of the image of B_{T_pM}(0_p, δ) under exp_p.
For τ ∈ G, where G is a graded normed vector bundle with grading A, we denote by ‖τ‖_a the size of the component of τ in the a-th level, a ∈ A.
The differential of a smooth enough function f : M → R at a point p will be denoted d| p f ∈ T * p M. Similarly for higher order derivatives (see Sect. 9) ∇ | p f ∈ T * p M ⊗ . For the action on vectors W ∈ T p M ⊗ , we shall write either ∇ | p f , W or ∇ W f . For p ∈ M, r , δ > 0 denote Here r will depend on the situation, and will always be large enough so that the distributions under consideration can act on ϕ.
We shall use p, q for points in M and x, y, z to denote points in R^d. For x ∈ R^d, ϕ : R^d → R we write
$$\varphi^\lambda_x(y) := \lambda^{-d}\, \varphi\big(\lambda^{-1}(y - x)\big),$$
which is consistent with the notation introduced above when considering R^d as a Riemannian manifold with the standard metric. We also define, analogously to above, B_{R^d}(0, δ) and B^{r,δ}_{R^d}. Balls in M are denoted by B_M(p, δ) := {q ∈ M : d(q, p) < δ}. For γ ∈ R we denote by [γ] the largest integer strictly smaller than γ. For a pairing of a distribution T with a test function ϕ we write ⟨T, ϕ⟩. For two quantities f, g we write f ≲ g if there exists a constant C > 0 such that f ≤ Cg. To make explicit the dependence of C on a quantity h, we sometimes write f ≲_h g.
We denote the positive natural numbers by N and the non-negative ones by N 0 .
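The bracket convention [γ] above differs from the usual floor function when γ is an integer. The following minimal sketch makes this concrete; the function name `strict_floor` is ours and purely illustrative.

```python
import math


def strict_floor(gamma: float) -> int:
    """Largest integer strictly smaller than gamma (the bracket [gamma] above).

    Illustrative helper, not part of the paper: for non-integer gamma this
    agrees with math.floor, for integer gamma it returns gamma - 1.
    """
    return math.ceil(gamma) - 1
```

For a negative regularity index such as α = −1.2 this gives [α] = −2, so that a choice of the form r = −[α] yields r = 2.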

Definition 10
Let M be a closed Riemannian manifold. Let a finite partition of unity (φ i ) i∈I be given on M, subordinate to a finite atlas For γ > 0, an equivalent characterization of C γ (M) will be shown in Theorem 92. We now give one in the case γ ≤ 0.

Lemma 11
For γ ≤ 0, M a closed Riemannian manifold, an equivalent norm on C γ (M) is given by where we recall that ϕ λ p is defined in (2).
Proof Fix a finite atlas ( i , U i ) i∈I with subordinate partition of unity φ i . Denote Here the functionsφ i are chosen such that suppφ i ⊂ U i and φ iφi = φ i . Now, ϕ λ p is supported in a ball of radius δλ around p, hence for some for some z i ∈ R d . Here, the last inclusion follows from the fact that the atlas is finite andφ i is strictly contained in U i , so that the Lipschitz norm of i and −1 i are uniformly bounded in the regions of interest. By the same reasoning, the derivatives up to order r of exp −1 p • −1 i are uniformly bounded on the relevant regions. Hence and the result follows from Remark 7.
(C 1 C 2 ): By Remark 6, we have to show for some λ 0 > 0 and for all i ∈ I , Since φ i is supported away from the boundary of U i , there exists λ 0 > 0 such that for all x ∈ R d there is some p ∈ M such that supp ϕ i;λ 0 ;x ⊂ B M ( p, δ). 1 Since the atlas is finite, λ 0 can be chosen uniformly for all i ∈ I .
Now, one checks that ϕ i;λ;x •exp −1 i (x) falls under Remark 7 and hence this expression is indeed bounded by a constant times C 2 λ γ .
As an immediate consequence we get the following statement.

Corollary 12
Let (¯ j ,Ū j ) j∈J be another finite atlas with subordinate partition of unity (φ j ) j∈J . Then for γ ≤ 0 with equivalent norms.

We now give our definition of a regularity structure and a model on a manifold M. For concrete incarnations of these abstract definitions we refer the reader to Sect. 4 for the implementation of a first order "polynomial" structure; to Sect. 9 for a structure implementing "polynomials" of any order; and to the paragraph right before Lemma 36 for the structure used for the parabolic Anderson model.

Definition 13 (Regularity structure) A regularity structure is a graded vector bundle G on M, with a finite grading A = A(G) ⊂ R. For a ∈ A, G_a denotes the vector bundle of homogeneity a; it is assumed to be finite dimensional. We denote the fiber at p ∈ M by G|_p and the fiber of homogeneity a at p by G_a|_p. For p ∈ M, τ ∈ G|_p, a ∈ A we write proj_{G_a|_p} τ for the projection of τ onto G_a|_p. We assume that G comes equipped with a norm ‖·‖. We denote the fiber norm of the restriction to homogeneity a by ‖τ‖_a := ‖proj_{G_a|_p} τ‖.

Definition 14 (Model) Let a collection of open sets
be given. We assume there is for every compactum K ⊂ M a constant δ K = δ K ( , , {U q } q ) > 0, such that p←q is defined for p, q ∈ K, d( p, q) < δ K and for q ∈ K, exp q | B Tp M (0 p ,δ K ) is a diffeomorphism and exp q (B T p M (0 p , δ K )) ⊂ U q . Assume moreover p← p = id for every p ∈ M. Given β ∈ R, we say that ( , ) is a model with transport precision β if the following entity is finite for every compactum where we recall that the set of test functions B r ,δ T p M was defined in (3).

Remark 15
The additional restrictions on distance and support in the second supremum are necessary, since otherwise the action of Π_q τ on ϕ^λ_p might not be well-defined.

Remark 16
Note that the conditions on a model do not pin down the global regularity of Π_q τ. Without loss of generality we will assume that Π_q τ ∈ C^α(U_q) for all q ∈ M, τ ∈ G|_q and α := min A(G).
Our definitions of a regularity structure and of a corresponding model are slightly more general than the original formulation by Hairer [9]. This extension is necessary to accommodate the "polynomial regularity structure", which will be constructed up to first order in Sect. 4 and up to any order in Sect. 9. Let us point out the key differences.
• Derivatives of functions on a general manifold M can only be stored in a coordinate-invariant way in a fibered space. Hence the regularity structure has to be a vector bundle and not a fixed vector space.
• For this reason there cannot be a fixed structure group G in which the transport maps Γ_{p←q} take values.
• The transport maps Γ_{p←q} can also act "upwards", see Remark 83.
• The distributions Π_p τ as well as the transports Γ_{p←q} only make sense locally.
• The identities Γ_{p←q} Γ_{q←r} = Γ_{p←r} and Π_p Γ_{p←q} = Π_q do not hold. An approximate version of the latter is incorporated into the norm (transport precision β). The former is, in the flat case, used in an extension argument ([9, Proposition 3.31]), which we do not need here.
It turns out that the theory can handle these slight extensions. In particular the reconstruction theorem still holds, Theorem 23. Finally, we remark that our regularity structure does not include time and that the parabolic Anderson model will be treated by considering functions in time, valued in modelled distributions (Definition 18) on a manifold.
As in Lemma 8 we know how Π_p τ acts on a more general class of functions: Note that in the sum |λ_K z| < δ_K/2. Hence, by assumption q := exp_p(λ_K z) ∈ K. Hence by definition of a model, ‖Γ_{p←q} τ‖_m ≤ ‖Π, Γ‖_{β,K} ‖τ‖ d(p, q)^{(ℓ−m)∨0} = ‖Π, Γ‖_{β,K} ‖τ‖ |λ_K z|^{(ℓ−m)∨0}, for τ ∈ G_ℓ|_q. Then for those z Moreover, the model being of transport precision β, we get

Definition 18 Let G be a regularity structure and (Π, Γ) a model of precision β ∈ R. Define for γ > sup_{α∈A(G)} |α| the space of modelled distributions Here δ_K is the distance of points in K for which Γ_{p←q} makes sense, see Definition 14. Note that the precision of transport β plays no role here.

Remark 19
As usual for Hölder norms, for every compactum K an equivalent norm is obtained by replacing in the supremum, for any δ ∈ (0, δ K ], the condition d( p, q) < δ K with the condition d( p, q) < δ .
Lemma 20 (Push-forward) Let M, N be Riemannian manifolds and let : M → N be a diffeomorphism. Let G be a regularity structure on M with model ( , ) with transport precision β ∈ R. Definē U q := (U −1 (q) ), Then,Ḡ is a regularity structure on N with grading¯A = A and (¯ ,¯ ) is a model with transport precision β. Moreover for every compactum C ⊂ N and all compacta Proof Since has derivative uniformly bounded below and above for every compactum K ⊂ C, one can choose for every compactum K a constantδ K as in the definition of a model, such that¯ p←q is well-defined for p, q ∈ K and d( p, q) < δ K as well as exp N q (B T q M (0 q , δ K ) ⊂Ū q . Here exp N denotes the exponential map on N . 1. Let q ∈ K ⊂ N and τ ∈G a | q and ϕ ∈ B r ,δ K again by Remark 7. Finally for p, q ∈ K ⊂ N with d( p, q) < δ K and τ ∈G a | q , we have and similarly for the distance of two modelled distributions. .
Here ϕ ∈ B r ,δ K T p M , r = −[α], (so that the action of x f (x) is well-defined) and K is the closure of the δ K thickening of K.

Remark 22
Uniqueness actually holds in the class of operators R that satisfy (5) with γ replaced by any θ > 0.

Proof Existence
We will apply [9, Proposition 3.25]. This proposition is formulated for R^d, but the statement is local and also holds for M ⊂ R^d. So we have to verify for ζ uniformly over x, y ∈ K, n ≥ n_0, n_0 = log_2(δ_K) ∨ 0 and 2^{−n} ≤ |x − y| ≤ δ_K. In [9, Proposition 3.25] the upper bound 1 is chosen on |x − y|, but any upper bound works, so we choose δ_K, since we need Γ_{x←y} to be well-defined. Here ϕ^n_x := 2^{nd/2} ϕ(2^n(· − x)), and ϕ is a scaling function for a wavelet basis of regularity r > |α|. We have chosen n_0 also such that for n ≥ n_0 and x ∈ K, τ ∈ G|_x the expression ⟨Π_x τ, ϕ^n_x⟩ is well-defined. First, (7) follows from the fact that α is the lowest homogeneity in A(G) (note that ϕ^n_x is scaled to preserve the L^2-norm, whereas the scaling in the definition of a model preserves the L^1-norm). Now We bound the first term as The second term is bounded as where we again used the assumption 2^{−n} ≤ |x − y|. This proves (6) and an application of [9, Proposition 3.25] gives the existence of R f satisfying the bound (5). The preceding argument is valid for α < 0. For α = 0, one can run the argument for some α < 0 and get unique existence of R f ∈ C^α with the claimed properties. In Corollary 24 below it is shown that actually R f ∈ C^0.
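The remark on the two normalizations in the proof above can be checked numerically. The following sketch works in dimension d = 1 and is only an illustration under stated assumptions: the bump function `phi` stands in for a generic test function (it is not a wavelet scaling function), and the quadrature helper `integrate` and all other names are ours.

```python
import math


def phi(y: float) -> float:
    """Smooth bump supported in (-1, 1); a stand-in for a generic test function."""
    return math.exp(-1.0 / (1.0 - y * y)) if abs(y) < 1.0 else 0.0


def integrate(f, a: float, b: float, n: int = 20000) -> float:
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def l1_norm(f, a: float = -1.0, b: float = 1.0) -> float:
    return integrate(lambda y: abs(f(y)), a, b)


def l2_norm(f, a: float = -1.0, b: float = 1.0) -> float:
    return math.sqrt(integrate(lambda y: f(y) ** 2, a, b))


d = 1              # dimension of the toy example (assumption)
n_level = 3        # dyadic level n
lam = 2.0 ** (-n_level)


def model_scaled(y: float) -> float:
    """lambda^{-d} phi(y / lambda): the L^1-preserving scaling used for models."""
    return lam ** (-d) * phi(y / lam)


def wavelet_scaled(y: float) -> float:
    """2^{nd/2} phi(2^n y): the L^2-preserving scaling used for wavelets."""
    return 2.0 ** (n_level * d / 2) * phi(2.0 ** n_level * y)
```

With λ = 2^{−n} the two rescalings differ exactly by the factor λ^{d/2}, which is the power of 2^{−n} one has to account for when translating between bounds stated in the two normalizations.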
Lemma 23 (Reconstruction for M a closed Riemannian manifold) Let M be a closed Riemannian manifold with regularity structure G and ( , ) a model with transport precision β ∈ R. Let γ > 0, and f ∈ D γ (M, G) and assume β ≥ γ . Denote α := inf A. Assume either that α < 0 or that α = 0 and that the lowest homogeneity in G is given by the constant distribution (of the polynomial regularity structure).
Then, there exists a unique distribution R f ∈ C α (M) such that

Proof By a cutting up procedure, it is enough to show (8) for ϕ ∈ B r ,δ T p M with δ ∈ (0, δ M ] to be chosen. Let ( i , U i ) i∈I be a finite atlas with subordinate partition of unity (φ i ) i∈I . On each chart, we push forward the regularity structure, model and f to i (U i ), with corresponding reconstruction operationR i , model˜ i and modelled distributionf i . For each i ∈ I , fix a compactum K i ⊂ U i such that supp φ i is strictly contained in K i . By Lemma 20, Now reconstruct in each coordinate chart asT i :=R ifi using Theorem 21. Define have φ i ϕ λ p = 0, so the summand vanishes. Otherwise, with z := i ( p) Summing over i gives (8).

Corollary 24
In the setting of the previous theorem, assume that the lowest homogeneity in G is 0 and that it is given by the constant (as in the polynomial regularity structure of Sect. 4). Then R f is given by projection onto that homogeneity, i.e.
Recall that the projection proj is defined in Definition 13. The last term is bounded by a constant times λ^η, where η is the smallest homogeneity strictly larger than 0.
For the second to last term we first write By the properties of a model Hence, by Remark 22,R = R.
We want to apply Lemma 23 to the terms in the heat kernel asymptotics (Theorem 43). The problem is that their support will be of order 1 (and not of order λ as for ϕ^λ_x). Hence we need the following refinement, which is similar to Lemma 8.

Lemma 25
In the setting of Lemma 23, let ϕ satisfy the assumptions of Lemma 8 with the additional condition supp ϕ ⊂ B T p M (0 p , δ K /4). Then Proof Let ϕ z,λ be given as in the Proof of Lemma 17 with K := M.
Note that in the sum |λ M z| < δ M /2. Hence exp p (λ M z) ∈ M is well-defined. Now the first summand can be written as The second summand is bounded as

Linear "polynomials" on a Riemannian manifold
The regularity structure for linear "polynomials" on the Riemannian manifold M will be built on the vector bundle (M × R) ⊕ T * M. For readability introduce the symbol 1 and decree that it forms a basis for R. Define the graded vector bundle q M be the fiber at q. A generic element of T q will be written as Note that, since R1 is a trivial fiber bundle, it is enough to specify it on the basis element 1. This is not possible on T * M. Note also that q ω is chosen to have value 0 and differential ω at q.
Finally define the re-expansion maps Γ_{p←q} : T_q → T_p as which is well-defined for d(p, q) < δ. Π and Γ together form the polynomial model, where we take δ_M = δ in Definition 14.
Remark 26 Note that in Euclidean space with ω = dx i we have the classical linear polynomials "based" at q. Moreover so we recover Hairer's definition [9].
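In the flat case of Remark 26 the re-expansion is exact, which can be checked by a few lines of code. The sketch below assumes the standard Euclidean convention of [9] for linear polynomials: an element τ = a·1 + ω is stored as a pair `(a, omega)`, it is realized at base point q as x ↦ a + ω(x − q), and it is transported by Γ_{p←q}(a, ω) = (a + ω(p − q), ω). The names `Pi` and `Gamma` and this data layout are our own illustration, not the construction on a general manifold.

```python
from typing import Callable, Tuple

Vec = Tuple[float, ...]
Elem = Tuple[float, Vec]  # (coefficient of 1, covector omega)


def dot(u: Vec, v: Vec) -> float:
    return sum(a * b for a, b in zip(u, v))


def sub(u: Vec, v: Vec) -> Vec:
    return tuple(a - b for a, b in zip(u, v))


def Pi(q: Vec, tau: Elem) -> Callable[[Vec], float]:
    """Realization at base point q: x -> a + omega(x - q) (flat case only)."""
    a, omega = tau
    return lambda x: a + dot(omega, sub(x, q))


def Gamma(p: Vec, q: Vec, tau: Elem) -> Elem:
    """Transport from base point q to base point p (flat case only)."""
    a, omega = tau
    return (a + dot(omega, sub(p, q)), omega)
```

In this flat setting both Π_p Γ_{p←q} τ = Π_q τ and Γ_{p←q} Γ_{q←r} = Γ_{p←r} hold exactly; on a curved manifold only an approximate version of the first identity survives, which is what the transport precision β quantifies.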
The transport of ω ∈ T^*_q M is chosen such that Π_q ω and Π_p Γ_{p←q} ω have, at p, the same value and the same first derivative. Our re-expansion is not exact, i.e. we do not have Π_q τ = Π_p Γ_{p←q} τ, but we have the following.
, d| p f = d| q g and hence the statement follows from Taylor's theorem.

Remark 28
In the setting of the previous Lemma, not only f ( p) = g( p) but also f (q) = g(q). Indeed, for two points p, q ∈ M, at distance smaller than the cut locus and ω q ∈ T * q M, where the tangent map satisfies indeed is the unique path from p to q, with length and speed both equal to d( p, q), staying within the cut-locus from y, that is (exp p (tv p )) 0≤t≤1 : in other words, for any 0 ≤ t ≤ 1, Hence, The next lemma follows from Lemma 27 and is shown in more generality in Theorem 91.

Lemma 29
The above is a model of transport precision β = 2.
As a sanity check for our construction, we mention the following lemma, which is almost immediate in the flat case (see [9,Lemma 2.12]). We will prove it in Sect. 9 in a more general setting.

The regularity structure for PAM on a manifold
In the next five sections M is a 2-dimensional closed manifold.
The regularity structure for PAM will be built on two copies of the vector bundle, M × R 2 ⊕T * M. We denote these two copies by V and W. In order to distinguish the different elements of these bundles we introduce the symbols {1, , I[ ], I[ ] } and decree that they form a basis for R 4 . We then write where T * M is simply another copy of T * M. Formally we have, V = W . As before we will let T | p , V| p , and W| p denote the fibers of these bundles over p ∈ M.
The vector bundles V and W are graded, with gradings is the projection taking an element to its β -component. To be concrete, generic elements τ ∈ V| p , τ ∈ W| p are of the form And then for example All the graded fibers have a canonical norm, where on the cotangent space we use the norm induced by the Riemannian metric. For β ∈ A, τ ∈ V| p (or τ ∈ W| p ) we write as before, in a slight abuse of notation, ||τ || β := || proj β τ ||.
The model we shall use for the parabolic Anderson model will be time dependent, so we need slight extensions of our definitions.
Definition 31 For G = V, W, assume we are given a family of models where || t , t || β,M is defined in Definition 14. Note that for fixed t, the model comes with a reconstruction operator (Theorem 23), which we shall denote R t .
Definition 32 (Time-dependent modelled distributions) For G = V, W, given a family of models ( t , t ) parametrized by t ∈ [0, T ], denote by D t,γ (G) = D t,γ (M, G) the corresponding spaces of modelled distributions. That is, as defined in Definition 18, For N > 0, define the modified norm

Remark 33
The modified norms with scaling parameter N are necessary for the fixpoint argument, see Remark 39. As usual with Hölder-type spaces on compact domains, these spaces are complete metric spaces.
Remark 34 Note that we do not need transport in time, as opposed to the definition in [11, Definition 2.4]. The price we pay is that Hölder regularity in time of the reconstruction of a solution is established in a roundabout way: namely, by first verifying time regularity of the 0 component of the solution (Theorem 38) and then checking that reconstruction is given by projection onto that component (Corollary 24).
In [11] it follows from the definition of a controlled distribution and their reconstruction theorem, Theorem 2.11.
We now build the model for the structures V, W. As input we need realizations of and I[ ] .
Definition 35 Assume for T > 0 we are given ξ ∈ C α (M) and a family of distributions where the action of the heat kernel p on ξ is well-defined by Theorem 37. Define where r := −[α] and δ is the radius of injectivity of M.
In our application to white-noise forcing, ξ will be the white noise on M and Z will be constructed via Gaussian renormalization in Sect. 8. Now define the models for V and W as By definition . Regarding transport, both the transport of and I[ ] are exact by definition and where we used Lemma 27 for the last step. Finally by the Schauder estimate Theorem 37, and Analogously, one gets the bounds for W with β = 2.

Schauder estimates
Let p be the heat kernel on M. We start with a Schauder estimate for distributions, as a warm-up to the one for modelled distributions.
Proof As in the proof of the next theorem we shall focus on the singular part p N in the decomposition p = p N + R N using heat kernel asymptotics, Theorem 43, for arbitrarily large N. Let us set ‖F‖ = sup_{t≤T} ‖F_t‖_{C^α(M)}, d = d(p, q) and let γ be a geodesic path from p to q of constant speed d. We single out the singularity of the heat kernel considering separately the cases t ≤ d^2 and t > d^2 where we bound the integral over Close to the singularity, using the first item of Lemma 44, When t > d^2, we write on the other interval, where • stands for the variable that is being differentiated and · is the variable in the distribution's pairing. Then, according to the second item of Lemma 44, for any u ∈ (0, 1), The latter four inequalities yield the claim for the singular part of the heat kernel. Now, according to Theorem 43, for N large enough, This concludes the proof.
We now prove an extension of this classical result to the space of modelled distributions. For 4 The well-definedness of these terms is part of the following theorem.

Remark 39
Here we can see why we introduced the modified norm of Definition 32. Without it, i.e. with N ≡ 1, the factor on the right-hand side cannot be made small, which is necessary for the fixpoint argument.
Remark 40 Contrary to classical Schauder estimates, we only get an "improvement of 4/3 derivatives". In order to get an "improvement of 2 derivatives" one has to include quadratic polynomials in the regularity structure. This is also the reason why we have to choose γ, γ 0 in such a specific way. Note that an improvement by 4/3 will be enough to set up the fixpoint argument.
To be specific, in order to get an "improvement of 2 derivatives" the complete list of symbols necessary is, ordered by homogeneity, where i, j = 2, 3 stand for the space-directions. These symbols would be the building blocks for the regularity structure on flat space. On a manifold the polynomials would represent the respective symmetric covariant tensor bundles, as laid out in Sect. 9. The Schauder estimate has to be shown on the level of each of these symbols, and hence a treatment "by hand", as we do here, would be cumbersome.

Remark 41
The following proof based on the heat kernel (almost) being a scaled test function goes back, in the flat case, to [4]. A proof splitting up the heat kernel into a sum of smooth, compactly supported kernels (following the strategy of [9]) is also possible, but less convenient.

Proof of Theorem 38
Once the second statement is established, the first one follows from the definition of h 0 and the fact that reconstruction of modelled distributions taking values only in positive homogeneities is given by the projection onto homogeneity 0, see Corollary 24.
Recall that δ_M = δ, the radius of injectivity. By Remark 19 we can, and will, only consider points at distance less than δ/4.
Introduce the short notation Note that ||ξ || C α (M) ≤ C . We shall need the following facts. Since we have where we used the classical Schauder estimate Theorem 37.
Moreover for a function ϕ satisfying the assumptions of Lemmas 17 and 25 and similarly With these estimates at hand, we shall now control each term in the definition of the , using the decomposition of the heat kernel p = p N +R N , from Theorem 43, for N large enough, as we did above for the classical Schauder estimate.

Space regularity
Homogeneity 0 This term can be written as Regarding the contribution of the regular part R N of the heat kernel, we write where ∇ acts on the dummy variable • and convolution acts on · and γ is the geodesic connecting q to p. Since this expression is well-defined for N large enough and of order We now treat the term involving p N . Denoting by g(t, s) the integrand of the above integral, for s ∈ [t − d( p, q) 2 , t], The first term we bound as where we used (10) together with Lemma 44 (i), as well as the Hölder continuity of f α in space (9) and in time.
The second we bound as where we used (10) together with Lemma 44 (i) as well as the Hölder continuity of f α in time.
The last one we bound as where we used (10) together with Lemma 44 (ii) as well as the Hölder continuity of f α in space (9) and in time. Hence and then by Lemma 42 Then the following are upper bounds toγ 2α + 4 − 2ε, α + 2γ 0 + 2 − 2ε.
Both are satisfied under our assumptions. Now consider s ∈ [0, t − d( p, q) 2 ]. By Theorem 62 we have where γ (r ) := exp q (r v), v := exp −1 q ( p), for any r ∈ [0, 1], and ∇ 2 is acting on the first variable of p N . Now The first term we bound as where we used (10) together with Lemma 44. 7 The second term we bound as where we used Lemma 44 and the Hölder continuity of f α in time.
Hence by Lemma 42 Then the following are upper bounds toγ 2α + 4 − 2ε, 2 + α + 2γ 0 − 2ε. 7 In coordinates, where are the Christoffel symbols. This gives the quadratic factor in |γ (r ) | = d( p, q). The blowup in t − s follows from an application of Lemma 44 (i), (ii) to the components here.
Both are satisfied under our assumptions. Hence Homogeneity α + 2 which is satisfied under our assumptions.

Homogeneity 1
We only treat here the terms involving p N within We write it as t 0 g(t, s)ds, with .
It is enough to bound this expression acting on X p ∈ T p M. Write For s ∈ [t − d( p, q) 2 , t] we bound (• denotes the dummy variable on which X p is acting, · denotes the dummy variable in the distribution-pairing) where we used (10) together with Lemma 44 (ii), as well as the Hölder continuity of f α in time. Now where we used (10) together with Lemma 44 (ii) with Y p := d| p exp −1 q (z) X p , as well as the Hölder continuity of f α in time.
Hence by Lemma 42 Then the following are upper bounds toγ − 1 Both are satisfied under our assumptions.
Again it is enough to bound the term acting on some X p ∈ T p M. For notational simplicity let v(z) := d| z p N t−s (z, ·) d| p exp −1 z X p and ζ s p := R s f (s) − f α (s, p)ξ . We then write the term to bound as where we used (10) together with Lemma 44 (iii). Similarly where we used Lemma 44 (iii) and the Hölder continuity of f α in time. Finally where we used Lemma 44 (ii) and the Hölder continuity of f α in space (9). Hence by Lemma 42 Then the following are upper bounds forγ − 1 Both are satisfied under our assumptions. Then

Time regularity
Our definition requires only to bound the time increment in homogeneity 0: Let us consider first the regular part of the heat kernel. According to Theorem 43, for N ≥ 4, Together with the bound sup 0≤r ≤T R r f (r ) C α < ∞, using a partition of unity, since α > −2, it yields sup m∈{0,1},0<r <t≤T Up to a multiplicative constant, we can now bound the contribution of R N by We now treat the term involving p N . Using (11) and Lemma 44 (i) Further, again using (11) and Lemma 44 (i) We then needγ Both are satisfied under our assumptions. Then We used the following results.
Lemma 42 Let ρ_1, ρ_2 ∈ R, g : R^2 → [0, ∞) and assume for A ≤ t ≤ T ,

The following result on heat kernel asymptotics is classical and its proof can be found for example in [1, Theorem 2.30]; see also [15, Section 3.2]. In these references the norm || · || C (M×M) is defined via a partition of unity as in Definition 10. There is a slight difference to our notation. In the cited references, C^1 for example means "continuously differentiable", while in our notation it only means "Lipschitz continuous". But it is enough to know that our norm is dominated by the norm in the references.

Theorem 43 Let M be a d-dimensional, closed Riemannian manifold and p be the heat kernel on M. Then there exist smooth functions
Moreover for all p ∈ M 0 ( p, p) = 1.

Let p ∈ M and define for z in the range of exp −1 p , Y p ∈ T p M a tangent vector and Z ∈ (T M) a vector field
(Y p acting on * , Z * acting on •).
(Note that because of the small support of p N , these are globally well-defined smooth functions by continuation with zero outside of the range of exp −1 p .) Then for any multiindex k, any n ≥ 0 and = 0, 1.
Proof The summands of p N are of the same form, apart from the factors t i , i = 0, . . . , N . Since for i ≥ 1 they improve the singularity at t = 0, it is enough to treat N = 0. Then exp p (z)).
Since z → ( p, exp p (z)) is smooth, uniformly in p, with support in B T p M (0 p , δ/4) and the factor 1/4 in the exponential is irrelevant, we consider where we abuse notation and keep the same name. Now this is the Schwartz function z → exp(−z 2 ) scaled by a factor of √ t, and so part (i) with = 0 follows from Remark 9. Now The first term is treated as above, now having the additional prefactor t −1 = √ t −2 .
We write the second term as where φ(s) := s 2 exp(−s 2 ) is Schwartz. By Remark 9 part (i) with = 1 is proven. For the second statement The first term has worse blowup in t and the factor 1/4 in the exponential is irrelevant, so it is enough to consider f (z)g(z) where
The third statement follows in a similar fashion from Lemmas 45 and 46.
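The rescaling step in the proof above can be checked numerically in the flat model case. The following is a minimal sketch, assuming the Euclidean heat kernel on $\mathbb{R}^2$, $(4\pi t)^{-1}\exp(-|z|^2/(4t))$, as a stand-in for the leading term of the kernel; the function names are hypothetical and not part of the paper. It confirms that the kernel at time $t$ is the fixed Schwartz profile at time $1$ rescaled by $\sqrt{t}$.

```python
import math

def heat_kernel_flat(t, z):
    """Euclidean heat kernel on R^2 at time t, at displacement z = (z1, z2)."""
    r2 = z[0] ** 2 + z[1] ** 2
    return math.exp(-r2 / (4.0 * t)) / (4.0 * math.pi * t)

def phi(x):
    """Fixed Schwartz profile: the flat heat kernel at time t = 1."""
    r2 = x[0] ** 2 + x[1] ** 2
    return math.exp(-r2 / 4.0) / (4.0 * math.pi)

def rescaled_profile(t, z):
    """phi rescaled by sqrt(t): t^{-d/2} * phi(z / sqrt(t)) with d = 2."""
    s = math.sqrt(t)
    return phi((z[0] / s, z[1] / s)) / t

def max_scaling_defect(times, points):
    """Largest relative difference between the kernel and the rescaled profile."""
    worst = 0.0
    for t in times:
        for z in points:
            a = heat_kernel_flat(t, z)
            b = rescaled_profile(t, z)
            worst = max(worst, abs(a - b) / max(abs(a), 1e-300))
    return worst
```

On a manifold the same scaling is only expected to hold for the leading term in exponential coordinates, up to the smooth prefactor discussed in the proof.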

Lemma 45 Let Y p ∈ T p M act on the first component of d 2 as follows
Then |g(z)| |z||Y p | |D β g(z)| |Y p |, for any multiindex β.
Proof Since ( p, q) → d 2 ( p, q) is smooth, we only need to show g(z) |z||Y p |.

Lemma 46 For Y p ∈ T p M and a vector field Z let
Then, for any multi-index β,

Proof This follows from the fact that $(p, q) \mapsto d^2(p, q)$ is smooth.

Proof This can be verified using the Faà di Bruno formula.

Fixpoint argument
The following lemma follows from a direct application of the definition of modelled distributions.

Theorem 49 Let $u_0 \in C^\infty(\mathbb{R}^2)$. Define $v_t := P_t u_0$ and lift it to the regularity structure as

Lemma 48
Let (ξ, Z ) be given as in Definition 35 and let t,G p , t,G p←q be the corresponding models given by Lemma 36, G = V, W. Let α ∈ (−4/3, −1), γ 0 := α/2 + 1 and γ ∈ (4/3, 2α + 4). Then there exists T > 0 and a unique u ∈ D Proof We follow a standard fixpoint argument. Denote N). Indeed, by Theorem 38 and Lemma 48, for a constant c > 0 possibly changing from line to line, Hence for T small enough and N large enough, (B(R, N)) ⊂ B (R, N), for any R > 0. Let us show that is a contraction on B (R, N): for any f , f ∈ B (R, N), Hence for T small enough and N large enough, is a contraction on B(R, N) for any R > 0. We therefore get unique existence of a solution for small T > 0.
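The mechanism "contraction for $T$ small enough" can be illustrated on a toy problem. The sketch below is an analogy only and not the map used in the proof: it assumes the scalar linear equation $u' = a u$, whose mild formulation $u(t) = u_0 + \int_0^t a\,u(s)\,ds$ is discretized with a left Riemann sum. All names are hypothetical. In the sup-norm the discrete map is a contraction with constant at most $|a|T$, so shrinking $T$ makes the Picard iteration converge.

```python
import numpy as np

def picard_map(u, u0, a, h):
    """Discrete analogue of u -> u0 + integral from 0 to t of a*u(s) ds (left Riemann sum)."""
    integral = np.concatenate(([0.0], np.cumsum(u[:-1]) * h))
    return u0 + a * integral

def iterate(u0, a, T, n_steps, n_iter):
    """Run n_iter Picard iterations on [0, T]; return the iterate and successive sup-norm gaps."""
    h = T / n_steps
    u = np.full(n_steps + 1, u0, dtype=float)  # start from the constant function u0
    gaps = []
    for _ in range(n_iter):
        new = picard_map(u, u0, a, h)
        gaps.append(float(np.max(np.abs(new - u))))
        u = new
    return u, gaps
```

With $a = 2$ and $T = 0.25$ the contraction constant is at most $1/2$, and the iterates converge to the fixed point of the discrete map, which is the explicit Euler solution $u_0 (1 + a h)^k$.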
To apply this theorem to white noise forcing, the only ingredient missing is "lifting" it to a model (Definition 35). This is done in the next section.

The Gaussian model
Let $\xi$ be a white noise on $M$. We recall that $\xi$ is a Gaussian process associated with the Hilbert space $L^2(M, \mathrm{vol}_M)$, defined on a probability space $(\Omega, \mathcal{B}, \mathbb{P})$.
Proof For any coordinate chart ψ defined on an open subset U ⊂ M, and ρ a positive function with support in U, ξ U = ρ • ψ −1 ψ * ξ is a Gaussian process associated to the Hilbert space L 2 (R 2 , ρ 2 • ψ −1 det(g • ψ −1 )). Note that ξ U has the same law as ην, with η := ρ • ψ −1 g • ψ −1 and ν a white-noise on R d . According to [9, Lemma 10.2] ν has a version which is almost surely in C α (R d ) and hence ξ U ∈ C α (R d ).
Let now $(\rho_i)_{1 \le i \le n}$ be a partition of unity subordinate to an atlas $(U_i, \psi_i)_{1 \le i \le n}$. Then there is a realization of $(\xi_{U_i})_{1 \le i \le n}$ such that, almost surely, $\xi_{U_i} \in C^\alpha(\mathbb{R}^2)$ for all $\alpha < -1$ and all $i \in \{1, \dots, n\}$. Consequently, $\sum_{i=1}^n \psi_i^* \xi_{U_i}$ is a realization of $\xi$ belonging almost surely to $C^\alpha(M)$.
Thanks to this realization, we can already define the transport map used in the following Lemma (point (i)).
Lemma 51 Let ξ be the white noise on M and Z t p , p ∈ M, t ∈ [0, T ] be a collection of random distributions on M such that for some α ∈ (−4/3, −1), some κ, δ > 0, Proof For t > s ≥ 0, define for a chart ( , U) Note thatZ s,t x , x ∈ (U) andξ are elements of D ( (U)). Then where we denote S s, One can then apply [9, Proposition 3.32] to get for every compactum Let now ( i , U i ) be a finite atlas with subordinate partition of unity φ i and δ > 0 be the radius of injectivity of M. Then for s, t ∈ [0, T ], p ∈ M, ϕ ∈ B r ,δ ). We can apply Remark 7 to φ i ϕ λ p • −1 i and can estimate, using (15), Here Then, from the argument before, for any s, t ≥ 0 and q large enough, we have The result now follows from the Kolmogorov continuity theorem.
A simple way to define $Z^t_p$ here is to recenter the terms involving one product of distributions by their mean; this is an instance of a Wick product, see for instance [8]. For any $t > 0$, we denote the heat kernel and the heat operator by $p_t : M^2 \to \mathbb{R}$ and $P_t$, respectively, and we write $q_t(p) = p_t(p, p)$ for $p \in M$. By Lemma 50 and Theorem 43, we can regard $P_t(\xi)$ as a function, and the map $t \in \mathbb{R}_{>0} \mapsto P_t(\xi) \in C^\infty(M)$ is continuous.
We set for any p ∈ M, t ∈ R ≥0 and any function ϕ ∈ C ∞ (M), Z t p := t 0 (ξ P s (ξ ) − ξ P s (ξ )( p))ds, where for any s > 0 and ϕ ∈ C ∞ (M), Note that for any s > 0, For any $t \ge 0$, let us consider the operator $K_t = \int_0^t P_s \, ds$ and, for any $p, q \in M$ with $p \neq q$, set $k_t(p, q) = \int_0^t p_s(p, q) \, ds$. Let us note that the operator has a continuous kernel according to Theorem 43, that we shall denote k 2,t .
Remark 52 Note that, in (17) we subtract a function depending on space. This is different than the renormalization in the flat case, [9, Section 9.1], where just a constant is subtracted. This could be avoided, by realizing that the only factor contributing to the blowup of t 0 ds q s (·), is, by Theorem 43, which is independent of space. Since we do not need the renormalization to be independent of space (or time, for that matter), we do not pursue this.

Proposition 53
For any t ∈ R ≥0 , almost surely for any p ∈ M and ϕ ∈ C ∞ (M), Z t p , ϕ is well-defined and there exists a modification of the process given by ( Z t p , ϕ ) p∈M,ϕ∈C ∞ (M),t≥0 such that almost surely (13) holds true. 9

Proof of Proposition 53
It is enough to prove the assumption of Lemma 51. Let us fix p ∈ M and T > 0. Let δ be the radius of injectivity of M. Let us first check that for any ϕ ∈ C ∞ (M), Z t x (ϕ) is well defined for 0 ≤ t ≤ T . To this end, let us recall (see Theorem 43) that The Wick formulas imply for any 0 < s < T , It follows that Besides, E ξ, ϕ 2 P s ξ( p) 2 = q 2s ( p) ϕ 2 2 + P s (ϕ)( p) 2 ≤ s(L + 1) ϕ 2 ∞ , so that Z t p (ϕ) is well defined. (Footnote 9: In particular, almost surely, for all ϕ ∈ C ∞ (M) and p ∈ M, t → t p (τ ), ϕ is measurable and bounded.) We shall now prove that for any κ < 0,
which together with Lemma 50 shall yield the claim. We now fix κ < 0. Let us first prove that the expectation of the second integrand in t p (I( ) ) is almost surely of homogeneity κ. Indeed, according to Theorem 43, there exists $C_T > 0$ such that, for all $0 < t < T$ and $p, q \in M$ with $p \neq q$, where the second line follows from the Cauchy-Schwarz inequality. It follows from Lemma 54 below that there exists C > 0 such that for any ϕ ∈ B r ,δ T p M , t ∈ [0, T ], Hence for any λ ∈ (0, 1] and ϕ ∈ B 2,δ T p M ,

Lemma 54 For any ν > η > 0, T ≥ 0, there exists C > 0, such that for any q ∈ M, t ∈ [0, T ],

Proof On the one hand, according to (18) and Theorem 43, the left-hand side of (22) is uniformly bounded by $C_T t$, for all $t \in [0, T]$, for some $C_T > 0$. On the other hand, the estimate (22) would hold true, with η = 0, if $K_t$ were replaced by a $C^2$ symmetric function on $M^2$. Indeed, if $K : M^2 \to \mathbb{R}$ is a $C^2$ symmetric function, where the index below the connection symbol indicates the variable on which the latter is acting, and γ is a geodesic from p to q. According to Theorem 43, one can therefore consider K 2,N t = t 0 s P N s ds in place of K 2 t , as soon as N is large enough. This same theorem ensures that there exists a smooth function : [0, T ] × M 2 → R ≥0 such that for all τ ∈ (0, T ], p, q ∈ M, Let us set q τ (r ) = 1 2πτ e − r 2 τ , for any r , τ > 0. We shall apply (23) to K t,ε = t ε s P N s ds, for any fixed ε > 0. Up to a constant, the integrand of the right-hand side of (23) is bounded by d( p, q). The first term can be bounded by and the second by for some constant $C_T > 0$. These two bounds, once integrated in (23), imply that for any $\alpha > 0$, the left-hand side of (22) is bounded by $C_T \, d(p, q)^{2-\alpha}$, uniformly in $p, q \in M$ and $t \in [0, T]$. Using the bound $\min\{a, b\} \le a^\eta b^{1-\eta}$, for $a, b, \eta \in (0, 1)$, gives (22).
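The interpolation inequality used in the last step holds because $\min\{a,b\} = \min\{a,b\}^{\eta} \min\{a,b\}^{1-\eta} \le a^{\eta} b^{1-\eta}$. The sketch below is a plain numerical sanity check on random samples, with hypothetical function names. In the proof it is presumably applied with $a$ proportional to $t$ and $b$ proportional to $d(p,q)^{2-\alpha}$, the two competing bounds on the left-hand side of (22); that reading is an assumption, since the display (22) is not reproduced here.

```python
import random

def interpolation_gap(a, b, eta):
    """Return a**eta * b**(1 - eta) - min(a, b), which should be nonnegative."""
    return a ** eta * b ** (1.0 - eta) - min(a, b)

def worst_gap(n_samples=10000, seed=0):
    """Smallest gap observed over random a, b, eta in (0, 1)."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(n_samples):
        a = rng.uniform(1e-6, 1.0)
        b = rng.uniform(1e-6, 1.0)
        eta = rng.uniform(0.0, 1.0)
        worst = min(worst, interpolation_gap(a, b, eta))
    return worst
```

A nonnegative value of `worst_gap()` (up to rounding) is consistent with the inequality; it is of course not a proof.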

Higher order covariant derivatives
We want to mirror as closely as we can the flat-space polynomial model described above in the general context of a closed $d$-dimensional Riemannian manifold. In order to do this, we need to store higher order derivatives of functions $f : M \to \mathbb{R}$ in a coordinate-independent fashion. There is a canonical way to do this on a Riemannian manifold by making use of the associated Levi-Civita connection.
We recall the notion of higher order covariant derivatives of functions f : M → R on a Riemannian manifold with Levi-Civita 12 connection ∇ (see for example [14,Lemma 4.6]).
and then inductively by where $X_1, \dots, X_\ell$ are arbitrary vector fields on $M$.
A few remarks are in order.
1. As the notation suggests, $\nabla^\ell|_p f$ is indeed tensorial, i.e. the right-hand side of the previously displayed equation really only depends on the vector fields $\{X_i\}_{i=1}^\ell$ through their values at $p$.
2. In the literature, $\nabla f$ sometimes denotes the gradient of $f$. We never use the gradient of a function in this work.
3. We shall also sometimes write For a curve γ in M we will write for parallel translation along γ . For a tensor field W along γ we write for the covariant derivative of W along γ .

(Footnote 12: In general, ∇ can be any affine connection.)

Lemma 57 If f is an -times continuously differentiable function in a neighborhood of p ∈ M, v ∈ T p M, and γ v (t)
More generally, if , n ∈ N 0 , f is an ( + n + 1)-times continuously differentiable function in a neighborhood of p ∈ M and W t : Proof Let γ v (t) := exp p (tv) so that γ v (t) solves the geodesic differential equation, The case k = 1 amounts to the definition that For the induction step we have by the product rule; wherein the last equality we have again used the product rule to conclude that Definition 58 (Symmetrizations) If V is a real vector space and ∈ N, we let Sym : V ⊗ → V ⊗ denote the symmetrization projection uniquely determined by where S is the permutation group on {1, 2, . . . , } . Often we will simply write Sym for Sym as it will typically be clear what is from the argument put into the symmetrization function.
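The symmetrization projection of Definition 58 can be made concrete for tensors stored as arrays. The displayed formula is not reproduced in this text, so the sketch below assumes the standard one, $\mathrm{Sym}_\ell(v_1 \otimes \cdots \otimes v_\ell) = \frac{1}{\ell!} \sum_{\sigma \in S_\ell} v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(\ell)}$, and represents an element of $V^{\otimes \ell}$ as a NumPy array with $\ell$ axes of equal length. The function name `sym` is hypothetical.

```python
import itertools
import math

import numpy as np

def sym(tensor):
    """Average a tensor over all permutations of its axes (assumes tensor.ndim >= 1)."""
    ell = tensor.ndim
    total = np.zeros_like(tensor, dtype=float)
    for perm in itertools.permutations(range(ell)):
        # Permuting axes corresponds to permuting the tensor factors.
        total += np.transpose(tensor, perm)
    return total / math.factorial(ell)
```

Applying `sym` twice gives the same result as applying it once, which is the projection property, and a pure power $v \otimes v \otimes v$ is left unchanged.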
As usual we let V * denote the dual space to a vector space V and let ·, · denote the pairing between a vector space and its dual. We will often identify (V * ) ⊗ with V ⊗ * where the identification is uniquely determined by Proof Let g (t) := ∇ W t f and recall that the standard Taylor's theorem with remainder states; The results now follow by using Lemma 57 in order to compute the g (k) (t) for 1 ≤ k ≤ n + 1.

Remark 63
Since parallel translation is isometric it follows (continuing the notation in Theorem 62) that ∇ +n+1 Since M is a compact Riemannian manifold it is necessarily complete and therefore, by the Hopf-Rinow theorem, for each q ∈ M we may find at least one v ∈ T p M such that q = exp p (v) and d (q, p) = |v| . Using these remarks we can reformulate (30) as follows.
Corollary 64 If $f$ is $(n + 1)$-times continuously differentiable on $M$, $p, q \in M$, and $v \in T_p M$ is chosen so that $q = \exp_p(v)$ and $d(q, p) = |v|$, then Furthermore if $f$ is $n$-times continuously differentiable on $M$ then

Definition 65 (Taylor approximations) Suppose that $U \subset M$ is an open subset of $M$, $p \in U$, $f$ is an $n$-times continuously differentiable function on $M$, and $\varepsilon > 0$ is sufficiently small so that $B_M(p, \varepsilon) \subset U$ and $\varepsilon$ is smaller than the injectivity radius of $M$. We then define $\mathrm{Tay}^n_p f \in C^\infty(B_M(p, \varepsilon))$ by

Remark 66 With this notation, Corollary 64 reads as In the case $M = \mathbb{R}^d$ and $f$ a polynomial of degree at most $n$, it follows by Taylor's theorem that $f = \mathrm{Tay}^n_p f$ for all $p \in \mathbb{R}^d$. So in the flat case the error term here is no longer present.
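The flat-case statement of Remark 66 is easy to verify on an example. The sketch below assumes $M = \mathbb{R}^2$, where $\exp_p(v) = p + v$ and covariant derivatives reduce to ordinary ones, so that the Taylor approximation is taken to be $\mathrm{Tay}^n_p f(q) = \sum_{k \le n} \frac{1}{k!} D^k f(p)[v^{\otimes k}]$ with $v = q - p$. The polynomial and all function names are hypothetical choices for illustration.

```python
import numpy as np

def f(x):
    """A polynomial of degree 2 on R^2."""
    return 1.0 + 2.0 * x[0] - x[1] + 3.0 * x[0] * x[1] + x[0] ** 2

def grad_f(x):
    """Gradient of f, computed by hand."""
    return np.array([2.0 + 3.0 * x[1] + 2.0 * x[0], -1.0 + 3.0 * x[0]])

def hess_f(x):
    """Hessian of f (constant, since f is quadratic)."""
    return np.array([[2.0, 3.0], [3.0, 0.0]])

def taylor1_flat(p, q):
    """First-order Taylor polynomial of f at p, evaluated at q."""
    v = q - p
    return f(p) + grad_f(p) @ v

def taylor2_flat(p, q):
    """Second-order Taylor polynomial of f at p, evaluated at q."""
    v = q - p
    return taylor1_flat(p, q) + 0.5 * v @ hess_f(p) @ v
```

For this quadratic $f$, the second-order approximation reproduces $f$ exactly at every base point, while the first-order one leaves the quadratic error $\tfrac12 D^2 f(p)[v, v]$. On a curved manifold the analogous error term does not vanish in general, which is the content of Corollary 64.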
Lemma 67 If $f$ is an $n$-times continuously differentiable function on $M$ and $f(q) = o(d(p, q)^n)$, then $(Vf)(p) = 0$ for any $n$th-order differential operator $V$; in particular, $\nabla^k|_p f = 0$ for all $0 \le k \le n$.
Proof Let ( , U ) be a chart on $M$ with p ∈ U and ( p) = 0 and define F := f • −1 ∈ C n Ũ := (U ) . Then the given assumption implies $F(x) = o(|x|^n)$ and therefore, for any $x \in \mathbb{R}^d$ and $t \in \mathbb{R}$ small, we have $F(tx) = o(t^n)$, from which it easily follows that As $D^k F(0)$ is symmetric and $x \in \mathbb{R}^d$ was arbitrary, we may conclude that $D^k F(0) = 0 \in ((\mathbb{R}^d)^*)^{\otimes k}$ for $0 \le k \le n$. As any n th -order differential operator U on C n (M) may be written locally as

Symmetric parts of covariant derivatives determine all derivatives
We will now make Corollary 68 more precise.

Definition 70
If $(x, U := \mathrm{dom}(x))$ is a chart on $M$, let $D^x$ denote the flat covariant derivative on $TU$ determined by $D^x \frac{\partial}{\partial x^j} = 0$ for $1 \le j \le d$.
Using D x ∂ ∂ x j = 0, it easily follows that for all ∈ N and any -times continuously differentiable function f we have Proof Let D = D x and be the End (T U ) -valued connection one form on T U so that ∇ = D + . It is enough to verify that (32) holds on a basis for T p U ⊗n . To this end, let i j ∈ {1, 2 . . . , d} , for 1 ≤ j ≤ n and let V j = ∂ ∂ x i j . Then, which shows that (32) holds for n = 1. For the sake of completing the proof by induction, let us now assume that (32) holds at level n − 1 and below. In particular we assume On one hand, while on the other hand (using the induction hypothesis, the product rule, and DV k = 0 for all k), Comparing the last two displayed equations shows, From this expression it follows that ∇ n V n ⊗···⊗V 1 f may be expressed in the form claimed in (32).
Corollary 73 Let us continue the notation in Lemma 72. Then, there exists Q ,n ∈ Hom T U ⊗n , T U ⊗ , for 1 ≤ ≤ n, such that Q n,n = id and for all n-times continuously differentiable functions f ,

Proof The proof is again by induction on n. For n = 1, we have D W f = W f = ∇ W f , so there is nothing to prove. For the inductive step, suppose that (33) holds at level n − 1 and below. From (32) with W replaced by Sym n W , it follows that, where in the last equality we have used that D n f is already symmetric. From the previous equation along with the inductive hypothesis, we conclude that D n f , W may be expressed as described in (33).

Corollary 74 If ∇ is a covariant derivative on T M, then there exists
such that Q ∇ n,n = id and for all n-times continuously differentiable functions f , We note the following corollary for completeness.

Corollary 75
If ∇ is a covariant derivative on T M and L is a linear n th -order differential operator on C ∞ (M) , then there exists smooth sections, W ∈ T M for 0 ≤ ≤ n such that where each Q ∇ ,n is constructed from certain combinations of covariant derivatives of the torsion and curvature tensor of ∇.

The polynomial regularity structure and model
We are now ready to set up the regularity structure for "polynomials" up to order n on a manifold.

Remark 83
For n ≥ 2 this transport will in general also go "upwards." That is, if τ ∈ T α | y for some α < n, then in general x←y τ will have components in homogeneities strictly larger than α. This is not allowed in the original formulation of a regularity structure by Hairer [9, Definition 2.1]. As we have seen in the main text, this poses no problem, since our modified definition of a model (Definition 14) allows for it. We moreover believe that any transport for which the following lemma holds for a "polynomial model" is forced to do this.
For the proof of the first half of Theorem 92 below, it is convenient to introduce as in [5] the notion of a parallelism on a vector bundle, E, over M.
If U is only defined on a local diagonal domain, we refer to U as a local parallelism.
Example 87 (Parallel translation and parallelisms) One natural example of a parallelism when (M, g) is a Riemannian manifold and E is equipped with a covariant derivative, ∇ E , is to define where p, q ∈ M are "close enough" so there is a unique vector v p with minimum length such that q = exp p v p and // E (·) denotes the parallel translation operator on E relative to ∇ E . For our purposes below E will be a bundle associated to T M and ∇ E will be the induced connection on this bundle associated to the Levi-Civita covariant derivative on (M, g) .
Example 88 (Charts and parallelisms) Each chart ( , U) induces a local parallelism on (T * M) ⊗ for any ∈ N as follows. If A ∈ (T * p M) ⊗ is expressed as In other words, U (q, p) is uniquely determined by requiring With the aid of a parallelism, we can now define the notion of γ -Hölder section, S, on E. In what follows we assume that E is equipped with a smoothly varying inner product, ·, · E . We do not necessarily assume that ∇ E is compatible with ·, · E or that U ( p, q) is unitary for all ( p, q) ∈ D.
Lemma 89 Let S be a continuous section of a vector bundle E. Let (U , D), (U , D) be parallelisms on E. Then for every compactum K ⊂ D Proof We work in a local trivialization. Let U , U : The statement then follows from smoothness of U , U , the fact that they coincide at x, x and local boundedness of S.
Lemma 90 Let f ∈ C (M) , γ > 0 and n = γ ∈ N 0 . Then f ∈ C γ (M) (as in Definition 10) iff f is an n-times continuously differentiable function on M and for any (local) parallelism U on the vector bundle n T * M,

Proof Recall from Definition 10, for every coordinate chart ( , U). These conditions are equivalent to f being n-times continuously differentiable and the n th -derivatives of f • −1 being locally (γ − n)-Hölder on (U) . The latter condition may be expressed as saying where D = D is the flat connection defined in Definition 70. From Lemma 72 and Corollary 73 we may express where L is a linear differential operator of order at most n − 1. As L f is continuously differentiable it follows that is continuously differentiable and vanishes at q = p and therefore (by the fundamental theorem of calculus) From (41) and (42) it follows that (40) is equivalent to Lastly, using Lemma 89, we conclude that the estimates in (43) and (39) are also equivalent.
Theorem 91 Fix n ∈ N 0 and construct T and ( , ) as above. Then T is a regularity structure (in the sense of Definition 13) and ( , ) is a model of transport precision n + 1 (in the sense of Definition 14).

Proof
The fact that T is a regularity structure is immediate. Let us now set δ M = δ to be the injectivity radius of M and for q ∈ M, let U q := exp q (B T q M (0 q , δ M )).
We have to check that The homogeneity estimate, | p τ, ϕ λ p | λ , for τ ∈ T | p follows from the fact that p τ is a monomial of order in exp −1 p -coordinates. Lemma 84 gives the transport precision, i.e.
Let D be the covariant derivative induced by the chart exp −1 q . Using Lemma 72 we get and hence || p←q τ || m d( p, q) −m , which finishes the proof.
We are finally able to characterize $C^\gamma(M)$ in terms of the "polynomial" regularity structure. (Footnote 13: The space of modelled distributions was defined in Definition 18.)
i.e.f ( p) := Sym[∇ | p f ] for 0 ≤ ≤ γ =: n. We have to check thatf ∈ D γ (M, T ), i.e. for all ≤ γ and d( p, q) < δ or equivalently, using the definition of q← p , if = n: Recall γ − n ∈ (0, 1]. Now the term to bound in (44) reads as By Lemma 80 and since the expression is smooth in q we can focus on Define on the vector bundle n T * M the parallelism Then by Lemma 90 so for = n we are done. = 0, . . . , n − 1: We need to show (44). It is enough to bound for w ∈ T p M, with v := exp −1 p (q), Here For this purpose, define Therefore by Taylor's theorem and the fact that ∇ m | p F p = 0 for 0 ≤ m ≤ n, 14 we have Since g is smooth we apply the fundamental theorem of calculus to find Using this estimate, it follows that As shown in the step = n, we then get , p) γ −n |v| n− |w| = C |v| γ − |w| .
and hence Plugging this estimate back into (46) shows, which completes the proof of (44).
Step 1: We will show that f is n-times differentiable and 1 ! Sym[∇ f ] =f for = 0, . . . , n. This will be done by induction. So assume for some = 0, . . . , n − 1 we know that By Taylor's theorem (Theorem 62) Now by assumption Plugging this into (47) and using the fact that |γ v (t)| = |v| we get Now, since g is smooth and ∇ dtγ v (t) = 0, we have and therefore by Taylor's theorem (in one variable) together with Lemma 57 A simple integration by parts argument shows Combining (48), (49) and (50), we get As v → exp p (v) is a local diffeomorphism, it now follows from (51) that f is + 1 times differentiable at p and moreover since, we may conclude, using Lemma 57 that Then by Remark 59 it follows that 1 ( + 1)! Sym[∇ +1 f ] p =f +1 ( p).
Step 2: So far we have shown that f is n-times continuously differentiable and that Sym[∇ | p f ] =f ( p) for = 0, . . . , n. Then with U defined in (45)  The second to last term is of order d(q, p) γ −n by assumption. Moreover, for ≤ n − 1, by Corollary 80, we have ∇ n | p pf ( p) = 0. Hence the last term is of order d(q, p) d(q, p) γ −n . By Lemma 90 we hence get that f ∈ C γ (M).

A reformulation: the jet bundle
In this section we briefly outline that the "polynomial" regularity structure is in fact (isomorphic to) the jet bundle. In anticipation of possible future work with vector bundle valued equations, we formulate this in the general setting of a vector bundle E π → M with model fiber being a real finite dimensional vector space V . The connection to the previous sections is proven in Theorem 99. We then conclude by showing Taylor's theorem in this setting, Theorem 103. We leave the complete construction of a polynomial regularity structure and its model in this fibered setting to future work. For background on vector bundles, we refer to [13,Chapter 10] and [16].
For each m ∈ M, let (m) be the germ of C ∞ -local sections of E whose domain contains m. Fixing an integer n ∈ N 0 , we define an equivalence relation on (m) as follows. Let (x, u) be a chart and local frame such that m ∈ dom (x) = dom (u). We say S, T ∈ (m) are equivalent and write S n ∼ T provided where u −1 ( p) := u ( p) −1 : E p → V is the inverse of the linear operator, u ( p) : V → E p . It is well known and easy to check that " n ∼" is an equivalence relation which (by the chain and product rules) is independent of the choice of chart and local frame (x, u) as above.
The equivalence relation in (52) may also be written as S n ∼ T provided where for an open subset, U ⊂ R d , a ∈ U , and g ∈ C ∞ (U , V ) , D k g (a) is the k-linear form on R d defined by which shows tay n m is surjective and completes the proof.
In order to write out tay n m more explicitly, let ∇ denote the covariant derivative constructed on any of the bundles T * M ⊗k ⊗ E which is constructed from ∇ E and ∇ M in such a way that the product rule holds. Note that ∇ k S is then a section of is an element of E m . This is consistent with earlier notation, where for E = M × R the pairing was real valued.
Proposition 100 If S ∈ (m) and v ∈ T m M, then

Proof We will show by induction that The case k = 0 is trivial, and the case k = 1 holds, since For the induction step we compute where we have used

In the case of the trivial bundle $E = M \times \mathbb{R}$, Theorem 99 shows that the jet bundle is just another representation of the "polynomial" regularity structure of the preceding section. We finish this section by showing Taylor's theorem in the general vector bundle setting presented here.