The general relativistic constraint equations


We present the state-of-the-art concerning the relativistic constraints, which describe the geometry of hypersurfaces in a spacetime subject to the Einstein field equations. We review a variety of solvability results, the construction of several classes of solutions of special relevance and place results in the broader context of mathematical general relativity. Apart from providing an overview of the subject, this paper includes a selection of open questions, as well as a few complements to some significant contributions in the literature.


Let \((L^{1+n},\gamma )\) be a Lorentzian manifold of dimension \(n+1\ge 2\) and signature

$$\begin{aligned}(-\underbrace{+\cdots +}_{n\,{\rm slots}})\end{aligned}$$

solving the Einstein field equations

$$\begin{aligned} {\mathrm {Ric}}_{\gamma }-\frac{1}{2}R_{\gamma }\gamma +\varLambda \gamma = \kappa T. \end{aligned}$$

Here, we let \({\mathrm {Ric}}_{\gamma }\) denote the Ricci curvature of \(\gamma \), \(R_{\gamma }\) its scalar curvature and we employ the letter \(\kappa \) for a positive constant, whose value (and physical dimensions) depends on the specific conventions one adopts. For instance, when \(n=3\) and adopting ‘nongeometrised units’ one would find \(\kappa =8\pi G/c^4\), as can be checked by studying the so-called Newtonian limit of (1.1), see e.g., Misner et al. (1973) or Wald (1984). In addition, as is customary in the literature, T stands for the stress-energy tensor of the sources while \(\varLambda \in {\mathbb {R}}\) stands for a cosmological constant.

Now, consider a spacelike hypersurface \(M^n\) in L. Its intrinsic geometry, as encoded by a notion of distance between points on M, is determined by the so-called first fundamental form g (that is nothing but the restriction of \(\gamma \) to the tangent bundle TM, a restriction which we assume to be Riemannian i.e., positive definite). On the other hand, the extrinsic geometry of M, that is the way M is bent inside L, is described by the so-called second fundamental form k. It is then a well-known fact that the triple \((M,g,k)\) solves a system of equations that takes the form

$$\begin{aligned} \left\{ \begin{aligned} &R_g -\Vert k\Vert ^2_g+(tr_g k)^2=2\kappa \mu +2\varLambda \\& div_g(k-(tr_g k)g)=\kappa J \end{aligned}\right. \end{aligned}$$

where \(\mu , J\) are suitable components of T. More precisely, if we let V be a timelike unit normal vector field to M we have set \(\mu =T(V,V)\) and \(J=T(V,\cdot )\), where it is understood that the second slot is allocated for tangent vectors to M. These equations are obtained by combining (1.1) with the Gauss and Codazzi equations for the submanifold M (see e.g., O’Neill 1983 or Petersen 2006), as we will review below in Sect. 1.1. We note that the first equation in (1.2) is known in the literature as the Hamiltonian constraint, while the second one is referred to as the momentum constraint.

A fundamental discovery, often attributed to Lichnerowicz (1944), is that some sort of converse of the previous assertion holds true: going beyond the quest for exact solutions, one can regard the field equations (1.1) not only as a ‘curvature prescription problem’ but also, and more effectively, as an evolution problem whose initial data are not freely specifiable but are subject to certain restrictions, which are indeed those given by (1.2). Building upon this work, it was then rigorously proven in Choquet-Bruhat (1952) that a triple \((M,g,k)\) solving (1.2) gives rise, in a suitable gauge, to a well-posed hyperbolic system of partial differential equations which can indeed be uniquely solved, at least for short times. In fact, one can also prove a suitable global uniqueness theorem, as was further studied and formalised in later joint work Choquet-Bruhat and Geroch (1969). While we do not primarily aim here at a historical reconstruction of the early days of the subject, it is worth mentioning how, according to Choquet-Bruhat (2015), the distinction between ‘constraints’ and ‘evolution equations’ is already present, at least in some form, in work by Darmois dating back to 1927. In any event, the problem of solving (1.1) decouples into that of constructing initial data, on the one hand, and that of studying their evolution, on the other.

The scope of the present review is to describe the state-of-the-art as far as the former theme is concerned, providing a broad-spectrum overview about what is known (and what is not known) concerning the solvability of the general relativistic constraint equations. Before presenting a more detailed outline of the contents of this survey, let us then digress on the way these equations are derived, on some heuristics behind them, and introduce a list of basic questions that may guide the reader through our discussion.

Deriving the Einstein constraint equations

In the simple argument we are about to present and, in fact, throughout this work, we will adopt the terminology and conventions recalled in Appendix A, see also O’Neill (1983) for an extensive introduction to semi-Riemannian geometry.

To get started, we need to recall the fundamental equations describing the interplay between the intrinsic and extrinsic geometry of a submanifold inside any given ambient semi-Riemannian manifold. In the setting described in the previous section, in particular assuming Lorentzian signature, we shall write

$$\begin{aligned} k(X,Y)=\gamma (\nabla _X V,Y)=-\gamma (\nabla _X Y, V) \end{aligned}$$

for every pair X, Y of sections of TM, where V is a (locally defined) timelike, future-pointing, unit normal vector field to M (cf. Definition A.9). If X, Y, Z, W denote smooth sections of the tangent bundle TM then

$$\begin{aligned} {\mathrm {Riem}}_{g}(X,Y,Z,W)={\mathrm {Riem}}_{\gamma }(X,Y,Z,W)-k(X,Z)k(Y,W)+k(X,W)k(Y,Z), \end{aligned}$$

which is the Gauss equation for spacelike hypersurfaces in Lorentzian manifolds; furthermore

$$\begin{aligned} {\mathrm {Riem}}_{\gamma }(X,Y,Z,V)=\nabla _X k(Y,Z)-\nabla _Y k(X,Z), \end{aligned}$$

which is a special case of the general Codazzi–Mainardi equation.

In the setting above, given \(x\in M\), let \(\left\{ E_1,\ldots , E_n\right\} \) be a local orthonormal frame for M which we shall assume (without loss of generality) to be parallel at x. This is completed to a local Lorentzian frame by means of the timelike unit vector \(E_0=V\) (as above).

Let us start by deriving the first constraint: considering the 00-component of the Einstein field equation we get

$$\begin{aligned} ({\mathrm {Ric}}_{\gamma })_{00}+\frac{1}{2}R_{\gamma } -\varLambda = \kappa \mu \end{aligned}$$

but on the other hand, recalling the definition of scalar curvature,

$$\begin{aligned} R_{\gamma }&=-\sum _{i=0}^n ({\mathrm {Riem}}_{\gamma })_{0i0i} -\sum _{i=0}^n ({\mathrm {Riem}}_{\gamma })_{i0i0} +\sum _{i,j=1}^{n}({\mathrm {Riem}}_{\gamma })_{ijij} \\&=-2({\mathrm {Ric}}_{\gamma })_{00}+\sum _{i,j=1}^{n}({\mathrm {Riem}}_{\gamma })_{ijij} \end{aligned}$$

that is

$$\begin{aligned} R_{\gamma }+2({\mathrm {Ric}}_{\gamma })_{00} =\sum _{i,j=1}^{n}({\mathrm {Riem}}_{\gamma })_{ijij} \end{aligned}$$

so that we can use the (Lorentzian) Gauss equation (1.3) to get

$$\begin{aligned} R_{\gamma }+2({\mathrm {Ric}}_{\gamma })_{00}=\sum _{i,j=1}^{n}({\mathrm {Riem}}_{g})_{ijij} +\sum _{i=1}^n k_{ii}\sum _{j=1}^n k_{jj}-\sum _{i,j=1}^{n}k^2_{ij} =R_g+(tr_g k)^2-\Vert k\Vert ^2_g. \end{aligned}$$

As a result, combining (1.5) and (1.6) one derives the first constraint.

On the other hand, if we evaluate the Einstein field equation at \((V,E_j)\) for some index \(1\le j\le n\) then we get

$$\begin{aligned} ({\mathrm {Ric}}_{\gamma })_{0j}=\kappa J(E_j) \end{aligned}$$

and one can rewrite the left-hand side using the Codazzi equation (1.4) as follows

$$\begin{aligned} ({\mathrm {Ric}}_{\gamma })_{0j}&=-({\mathrm {Riem}}_{\gamma })_{000j}+\sum _{i=1}^{n}({\mathrm {Riem}}_{\gamma })_{i0ij} =\sum _{i=1}^{n}({\mathrm {Riem}}_{\gamma })_{iji0}\\&=\nabla _i k_{ji}-\nabla _j k_{ii} =(div_g k)_j- (d(tr_g k))_j = (div_g k)_j- div_g((tr_g k)g)_j \end{aligned}$$

thus the second constraint follows as well.
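As a sanity check of the two constraints just derived, one may consider the following elementary example. It is only an illustrative sketch: the constant \(h\) and the specific choice of data are assumptions made here and are not part of the derivation above. Take \(M={\mathbb {R}}^n\) with the flat metric \(g=\delta \) and \(k=h\,g\) for some constant \(h\in {\mathbb {R}}\). Then \(R_g=0\), \(\Vert k\Vert ^2_g=nh^2\) and \((tr_g k)^2=n^2h^2\), while \(k-(tr_g k)g=(1-n)h\,g\) is parallel, so that

$$\begin{aligned} R_g-\Vert k\Vert ^2_g+(tr_g k)^2=n(n-1)h^2, \qquad div_g(k-(tr_g k)g)=0. \end{aligned}$$

Hence such a triple can only arise as a slice when \(J=0\) and \(2\kappa \mu +2\varLambda =n(n-1)h^2\). For \(n=3\) this reads \(h^2=(\kappa \mu +\varLambda )/3\), which one recognises as the Friedmann equation for a spatially flat cosmological model with expansion rate \(h\).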

At a conceptual level, we need to stress the following: while the existence of some constraints, related to (1.3) and (1.4), has nothing to do with the specific curvature conditions imposed by the field equation (1.1), it is a remarkable feature that the structure of (1.1) allows one to derive a closed differential system for the pair \((g,k)\) (that is to say: a system where the other components of the tensor \(\gamma \) do not appear).

Furthermore, it is appropriate to stress how constraint equations can, more generally, be derived for hypersurfaces of arbitrary causal character (see Mars 2013 and references therein), although their study has attracted interest mostly in the case of spacelike slices, at least on the mathematical side, which is what we will discuss in this review.

The Einstein equations as an evolution problem

For the purposes of our discussion, it may be helpful and enlightening to recall how the constraints come into play in converting the field equations (1.1) into a hyperbolic system. We will therefore briefly recall this fundamental connection.

In doing that, it is appropriate to keep in mind that even in the flat setting of the Euclidean space \({\mathbb {R}}^3\) the question whether the Gauss and Codazzi equations are actually necessary and sufficient conditions for the local realisation of a surface has been extensively studied. In that case, the result is affirmative, as granted by the following classical statement.

Theorem 1.1

(Bonnet) The Gauss and Codazzi equations always locally determine one and only one Euclidean surface up to rigid motions: given g, k defined on an open set \(\varOmega \subset {\mathbb {R}}^2\) and satisfying the Gauss and Codazzi equations, and \(z\in \varOmega \), one can always find an open connected set \(\varOmega _0\subset \varOmega \) containing z and a smooth immersion \(\varphi :\varOmega _0\rightarrow {\mathbb {R}}^3\) with first fundamental form g and second fundamental form k; furthermore if \(\varphi _1,\varphi _2:\varOmega _0\rightarrow {\mathbb {R}}^3\) are two such immersions then there exist \(\varrho \in SO(3)\) and \(b\in {\mathbb {R}}^3\) such that \(\varphi _2=\varphi _1\cdot \varrho +b\).

This is sometimes referred to as the fundamental theorem of the local theory of surfaces, see e.g., Sect. 4.9 in Abate and Tovena (2012) for a complete proof.

That being said, let us get back to our discussion and explain how to turn (1.1) into an evolution problem for which general existence and uniqueness results can be invoked. To avoid unnecessary complications, we shall focus here on the vacuum case, i.e., we consider the field equations with no sources (\(T=0\)) and take \(\varLambda =0\). Due to their geometric nature, the Einstein equations are invariant under the action of the whole diffeomorphism group of M: as a result, when we write the system above in local coordinates we find that the leading symbol is not hyperbolic, so that standard results on hyperbolic equations are not directly applicable. Roughly speaking, this issue is overcome by choosing a gauge, namely a preferred system of spacetime coordinates satisfying some convenient properties. To proceed, we further assume, for the sake of simplicity and expository convenience, the background manifold M to be smoothly diffeomorphic to \({\mathbb {R}}^3\), so that there exist globally defined coordinates \(\left\{ x^1, x^2, x^3\right\} \). We consider the product manifold \({\mathbb {R}}\times M\) and we let \(\gamma \) denote the Lorentzian metric we wish to construct. Let then \(\left\{ x^0,x^1,x^2,x^3\right\} \) be a system of coordinates on \({\mathbb {R}}\times M\) that extend those pre-assigned on \(M\cong \left\{ 0\right\} \times M\). Here and throughout this review, we adopt the following conventions:

  • Greek letters are used to denote indices that vary in \(\left\{ 0,1,2,3\right\} \);

  • Latin letters are used to denote indices that vary in \(\left\{ 1,2,3\right\} \).

Furthermore, we shall always tacitly adopt the standard convention concerning summations over repeated indices (the range of the summation depending on the type of the letter in the obvious fashion); indices employed after the comma refer to partial derivatives in local coordinates.

In local coordinates, the Ricci tensor (cf. Appendix A) takes the form

$$\begin{aligned} R_{\mu \nu }=-\frac{1}{2}\gamma ^{\alpha \beta }\gamma _{\mu \nu ,\alpha \beta } -\frac{1}{2}\gamma ^{\alpha \beta }\left( \gamma _{\alpha \beta ,\mu \nu }-\gamma _{\nu \beta ,\mu \alpha } -\gamma _{\mu \alpha ,\nu \beta }\right) +F_{\mu \nu }(\gamma ,\partial \gamma ) \end{aligned}$$

where \(F(\cdot ,\cdot )\) denotes a suitably smooth function of its entries that vanishes, together with its gradient, at (0, 0). The reader is advised that, in the following discussion, we allow terms of this type to vary from line to line, without changing the corresponding notation.

In order to fix the gauge, we introduce harmonic/wave coordinates. Specifically, let us set

$$\begin{aligned} H^{\mu }{:}{=}\square _\gamma x^{\mu }, \quad \mu =0,1,2,3 \end{aligned}$$

where \(\square _\gamma \) denotes the wave operator in metric \(\gamma \) (that is to say: the Laplace-Beltrami operator with respect to the Lorentzian metric \(\gamma \)).

Expanding the \(\gamma \)-Laplacian and recalling Jacobi's formula for differentiating the determinant of a matrix-valued function we find

$$\begin{aligned} H^{\mu }=\gamma ^{\alpha \mu }_{,\alpha }+\frac{1}{2}\gamma ^{\alpha \mu }\gamma ^{\rho \sigma }\gamma _{\rho \sigma ,\alpha } =-\gamma ^{\alpha \rho }\gamma ^{\mu \sigma }\gamma _{\rho \sigma ,\alpha }+\frac{1}{2}\gamma ^{\alpha \mu } \gamma ^{\rho \sigma }\gamma _{\rho \sigma ,\alpha } \end{aligned}$$

and, as a result, we obtain the following identity

$$\begin{aligned} -\frac{1}{2}\gamma ^{\alpha \beta }\left( \gamma _{\alpha \beta ,\mu \nu }-\gamma _{\nu \beta ,\mu \alpha } -\gamma _{\mu \alpha ,\nu \beta }\right) =-\frac{1}{2}\gamma _{\alpha \mu }H^{\alpha }_{,\nu } -\frac{1}{2}\gamma _{\alpha \nu }H^{\alpha }_{,\mu }+F_{\mu \nu }(\gamma ,\partial \gamma ). \end{aligned}$$
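For the reader's convenience, the intermediate steps behind the expression of \(H^{\mu }\) can be spelled out as follows. This is a sketch under the assumption that \(\square _\gamma \) is written in its standard divergence form, which agrees with the formula for \(H^{\mu }\) given above. Since \(\partial _{\beta }x^{\mu }=\delta ^{\mu }_{\beta }\), one has

$$\begin{aligned} \square _\gamma x^{\mu }=\frac{1}{\sqrt{|\det \gamma |}}\,\partial _{\alpha }\left( \sqrt{|\det \gamma |}\,\gamma ^{\alpha \beta }\partial _{\beta }x^{\mu }\right) =\gamma ^{\alpha \mu }_{,\alpha }+\frac{1}{2}\gamma ^{\alpha \mu }\,\partial _{\alpha }\log |\det \gamma |, \end{aligned}$$

and the two ingredients one needs are Jacobi's formula and the rule for differentiating the inverse of a matrix, namely

$$\begin{aligned} \partial _{\alpha }\log |\det \gamma |=\gamma ^{\rho \sigma }\gamma _{\rho \sigma ,\alpha }, \qquad \gamma ^{\alpha \mu }_{,\alpha }=-\gamma ^{\alpha \rho }\gamma ^{\mu \sigma }\gamma _{\rho \sigma ,\alpha }. \end{aligned}$$

Equivalently, in terms of the Christoffel symbols of \(\gamma \), one finds \(H^{\mu }=-\gamma ^{\alpha \beta }\varGamma ^{\mu }_{\alpha \beta }\). Differentiating the expression of \(H^{\alpha }\) once more, lowering the index and collecting the second-order terms then yields the identity displayed above.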

As suggested by this computation, we introduce the following modified system

$$\begin{aligned} R^{H}_{\mu \nu }{:}{=}R_{\mu \nu }+\frac{1}{2}\gamma _{\alpha \mu }H^{\alpha }_{,\nu } +\frac{1}{2}\gamma _{\alpha \nu }H^{\alpha }_{,\mu }=0. \end{aligned}$$

These equations are sometimes referred to in the literature as reduced Einstein equations; notice that they take the equivalent form

$$\begin{aligned} \frac{1}{2}\gamma ^{\alpha \beta }\gamma _{\mu \nu ,\alpha \beta }+F_{\mu \nu }(\gamma ,\partial \gamma )=0. \end{aligned}$$

Now, the key remark is that if \(H^{\mu }=0\) along the evolution, then the reduced Einstein equations are equivalent to the actual field equations. Hence, we wish to write down a system of differential equations for both the metric \(\gamma \) and \(H^{\mu }\), the latter with homogeneous initial conditions so that the previous observation applies. The way to implement this strategy can then roughly be summarised in three steps, which we now outline.

Step 1 We choose initial data \(\gamma _{\mu \nu }\) and \(\gamma _{\mu \nu ,0}\) so that \(H^{\mu }=0\) at time \(t=0\), namely along the hypersurface \(M\cong \left\{ 0\right\} \times M\subset {\mathbb {R}}\times M\). We consider data of the form

$$\begin{aligned} \gamma _{\mu \nu }=\begin{pmatrix} -1 &{} \quad 0 \\ 0 &{}\quad g_{ij} \end{pmatrix} \end{aligned}$$


$$\begin{aligned} \gamma _{\mu \nu ,0}=\begin{pmatrix} *&{}\quad *\\ *&{}\quad 2k_{ij} \end{pmatrix} \end{aligned}$$

where \(*\) stands for terms we need to fix so that the condition \(H^{\mu }=0, \ \mu =0,1,2,3\) is initially satisfied. With that goal in mind, observe that (straight from the definition) when \(\mu =1,2,3\) one gets

$$\begin{aligned} H^{i}=\gamma ^{ij}\gamma _{0j,0}+ b^{i} \end{aligned}$$

where \(b^{i}\) stands for known terms, namely an expression involving the components of g and k only. Therefore, imposing \(H^{i}=0\) for \(i=1,2,3\) gives a linear \(3\times 3\) system that can be uniquely solved in the unknowns \(\gamma _{01,0}, \gamma _{02,0}, \gamma _{03,0}\). For \(\mu =0\) one similarly finds

$$\begin{aligned} H^{0}=-\gamma _{00,0}-\frac{1}{2}\gamma ^{ij}\gamma _{ij,0}+\frac{1}{2}\gamma _{00,0}+ b^{0} \end{aligned}$$

so that \(H^{0}=0\) determines the value of \(\gamma _{00,0}\). From now onwards, we agree that initial data have been chosen according to this procedure.
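Explicitly, and taking the two displayed expressions for \(H^{i}\) and \(H^{0}\) at face value, the procedure can be sketched as follows. For data of the above block form one has \(\gamma ^{ij}=g^{ij}\) and \(\gamma _{ij,0}=2k_{ij}\), hence \(\frac{1}{2}\gamma ^{ij}\gamma _{ij,0}=tr_g k\), and the conditions \(H^{i}=0\), \(H^{0}=0\) are solved by

$$\begin{aligned} \gamma _{0j,0}=-g_{jl}\,b^{l}, \qquad \gamma _{00,0}=2\left( b^{0}-tr_g k\right) . \end{aligned}$$

In particular, the entries marked by \(*\) are completely determined by g, k and their tangential derivatives, so no additional freedom is left in the choice of the initial data.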

Step 2 Suppose that the metric \(\gamma _{\mu \nu }\) solves the reduced Einstein equations (we will impose this to be the case, as explained below). Then, by the very definition of \(R^H_{\mu \nu }\), we have

$$\begin{aligned} R_{\mu \nu }=-\frac{1}{2}\gamma _{\alpha \mu }H^{\alpha }_{,\nu }-\frac{1}{2}\gamma _{\alpha \nu }H^{\alpha }_{,\mu } \end{aligned}$$

whence we find, for the Einstein tensor

$$\begin{aligned} G_{\gamma }={\mathrm {Ric}}_{\gamma }-\frac{1}{2}R_{\gamma }\gamma , \end{aligned}$$

the expression

$$\begin{aligned} G_{\mu \nu }=-\frac{1}{2}\gamma _{\alpha \mu }H^{\alpha }_{,\nu } -\frac{1}{2}\gamma _{\alpha \nu }H^{\alpha }_{,\mu }+\frac{1}{2}H^{\alpha }_{,\alpha }\gamma _{\mu \nu }. \end{aligned}$$

On the other hand, let us recall that the (in this case: vacuum) constraint equations ensure \(G_{0\mu }=0\) for \(\mu =0,1,2,3\). Thus, we have in particular

$$\begin{aligned} 0=G_{00}=-\frac{1}{2}(-H^0_{,0}+H^{i}_{,i}) \end{aligned}$$

and so, since evaluating along M gives \(H^i_{,i}=0\) for each fixed \(i=1,2,3\) thanks to the previous step, we further conclude \(H^0_{,0}=0\). Then we can consider the other three equations, namely

$$\begin{aligned} 0=G_{0i}=-\frac{1}{2}\gamma _{\alpha i}H^{\alpha }_{,0}=-\frac{1}{2}\gamma _{ij}H^{j}_{,0} \end{aligned}$$

which is a linear \(3\times 3\) homogeneous system, having the sole trivial solution \(H^{1}_{,0}=H^2_{,0}=H^3_{,0}=0\). To summarise: the Einstein constraints ensure that, with respect to a metric \(\gamma \) subject to the reduced Einstein equations we have that

$$\begin{aligned} H^{\mu }=0 \ (\text {zeroth-order vanishing}) \ \Longrightarrow \ H^{\mu }_{,0}=0 \ (\text {first-order vanishing}). \end{aligned}$$

Step 3 The shortcut to conclude is provided by the (contracted) second Bianchi identity, i.e., by the fact that the Einstein tensor G has vanishing divergence. If we express \(G_{\mu \nu }\) in terms of the first derivatives of H, which we wrote in the previous step, we get from that conservation law the equation

$$\begin{aligned} 0=-\frac{1}{2}\gamma ^{\mu \rho }\gamma _{\alpha \nu }H^{\alpha }_{,\mu \rho }-\frac{1}{2} \gamma ^{\mu \rho }\gamma _{\alpha \mu }H^{\alpha }_{,\nu \rho }+\frac{1}{2}\gamma ^{\mu \rho } \gamma _{\mu \nu }H^{\alpha }_{,\alpha \rho } + F_{\nu }(H,\partial H) \end{aligned}$$

hence we just need to observe that two of the summands are equal (hence, having opposite signs, cancel out) and we actually find

$$\begin{aligned} 0=-\frac{1}{2}\gamma ^{\mu \rho }\gamma _{\alpha \nu }H^{\alpha }_{,\mu \rho }+ F_{\nu }(H,\partial H) \end{aligned}$$

or, equivalently, the homogeneous equation

$$\begin{aligned} \square _\gamma H^{\mu }+F^{\mu }(H,\partial H)=0. \end{aligned}$$
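The cancellation invoked in this step can be made explicit as follows (a short verification sketch, using only \(\gamma ^{\mu \rho }\gamma _{\alpha \mu }=\delta ^{\rho }_{\alpha }\) and the symmetry of second partial derivatives):

$$\begin{aligned} -\frac{1}{2}\gamma ^{\mu \rho }\gamma _{\alpha \mu }H^{\alpha }_{,\nu \rho }+\frac{1}{2}\gamma ^{\mu \rho }\gamma _{\mu \nu }H^{\alpha }_{,\alpha \rho } =-\frac{1}{2}H^{\alpha }_{,\nu \alpha }+\frac{1}{2}H^{\alpha }_{,\alpha \nu }=0. \end{aligned}$$

The only surviving second-order term is therefore \(-\frac{1}{2}\gamma _{\alpha \nu }\gamma ^{\mu \rho }H^{\alpha }_{,\mu \rho }\). Contracting with \(-2\gamma ^{\nu \sigma }\) isolates \(\gamma ^{\mu \rho }H^{\sigma }_{,\mu \rho }\), which is the principal part of \(\square _\gamma H^{\sigma }\), while all lower-order contributions are absorbed into the term \(F^{\sigma }(H,\partial H)\).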

At this stage, we wrap everything up together and just consider the differential system given by

$$\begin{aligned} \left\{ \begin{array}{l} \frac{1}{2}\gamma ^{\alpha \beta }\gamma _{\mu \nu ,\alpha \beta }+F_{\mu \nu }(\gamma ,\partial \gamma )=0. \\ \square _\gamma H^{\mu }+F^{\mu }(H,\partial H)=0. \\ \gamma _{\mu \nu }(0,x^1,x^2,x^3)= {\text {as determined in step 1}} \\ \gamma _{\mu \nu , 0}(0, x^1,x^2,x^3)={ \text {as determined in step 1}} \\ H^{\mu }(0, x^1,x^2,x^3)=0 \\ H^{\mu }_{,0}(0, x^1,x^2,x^3)=0 \end{array}\right. \end{aligned}$$

for which one can invoke standard existence and uniqueness results, whose specific form depends on the actual setting we stick to (there are local and global versions, and in the latter case, for non-compact manifolds, weighted spaces may come into play). For a reference, see e.g., Ringström (2009) Chap. 9, specifically Proposition 9.12, for the existence part and Chap. 12, specifically Lemma 12.8, for the uniqueness part (alternatively, see Chap. VI of the monograph Choquet-Bruhat 2009). It is also appropriate to note how the introduction of the harmonic/wave coordinates \(x^{\mu }\) above, and of the associated functions \(H^{\mu }\), is by no means the only way of approaching the problem of reducing the Einstein field equations to a hyperbolic system: for an interesting discussion of the local theory, including the delicate gauge questions and a discussion of various ways of writing the equations as hyperbolic systems, we refer the reader to the first sections of Friedrich and Rendall (2000) (see also, specifically, the discussion given at p. 540 in Friedrich 1985).

In any event, the homogeneity of the equation for \(H^{\mu }\), together with zero initial conditions, ensures that \(H^{\mu }\equiv 0\) on the whole existence domain, whence the reduced Einstein equations are equivalent to (1.1). We are thus given a solution of the field equations at least on a tubular neighbourhood of the form \((-\tau ,\tau )\times M\). Through the conceptual path we just presented, one can prove, in our context, a theorem which mirrors the classical result by Bonnet. Indeed, one can easily turn the argument sketched in this section into a proof of the following existence theorem.

Theorem 1.2

Let \((M,g,k)\) be a triple satisfying the vacuum constraint equations:

$$\begin{aligned} \left\{ \begin{aligned} &R_g-\Vert k\Vert ^2_g +(tr_g k)^2=0 \\ &div_g(k-(tr_g k)g)=0 \end{aligned}\right. \end{aligned}$$

then, there exists a spacetime \((L,\gamma )\) and an embedding \(\iota : M\rightarrow L\) such that the following assertions are true:

  1. the spacetime \((L,\gamma )\) is Ricci-flat;

  2. g is the induced metric by \(\iota \), namely \(\iota ^{*}(\gamma )=g\);

  3. k is the second fundamental form of \(\iota : M\rightarrow L\).

Recall that in the vacuum case, and with zero cosmological constant, the field equations (1.1) reduce to the requirement that the Lorentzian metric \(\gamma \) be Ricci-flat.
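To illustrate the statement, here is a classical example of data to which the theorem applies. It is a sketch, added for illustration, with the parameter \(m>0\) and the function \(u\) introduced here. On \(M={\mathbb {R}}^3\setminus \left\{ 0\right\} \) consider the time-symmetric triple

$$\begin{aligned} g=u^4\delta , \qquad u(x)=1+\frac{m}{2|x|}, \qquad k=0. \end{aligned}$$

The momentum constraint is trivially satisfied, while the Hamiltonian constraint reduces to \(R_g=0\). In dimension three the scalar curvature of a conformally flat metric of this form is given by

$$\begin{aligned} R_g=-8u^{-5}\varDelta u, \end{aligned}$$

where \(\varDelta \) denotes the Euclidean Laplacian, and \(u\) is harmonic away from the origin, so the vacuum constraints hold. The resulting Ricci-flat development is (a portion of) the Schwarzschild spacetime of mass \(m\), in which M sits as a time-symmetric slice described in isotropic coordinates.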

As we already recalled at the very beginning of this introduction, Theorem 1.2 is a pioneering result of Choquet-Bruhat (1952), later refined in Choquet-Bruhat and Geroch (1969) (cf. Sbierski 2016) to prove existence of a maximal, globally hyperbolic development (see the second part of Appendix A). All in all, one can then simply assert that the Einstein constraint equations provide a necessary and sufficient condition for the embeddability of a Riemannian hypersurface inside a spacetime solving the Einstein field equations. Obviously, there are also more general versions of the theorem stated above, which apply to the case when sources are present and whatever the sign and value of the cosmological constant \(\varLambda \).

To phrase things a bit differently, the moral of the story is that if we wish to think of the field equations (1.1) as an actual hyperbolic evolution problem, the price to pay is a set of additional restrictions on the admissible pairs one can assign as initial data. In order to clarify the point, the reader may want to contrast that situation with those associated to two classes of (unconstrained) differential equations that play a fundamental role in classical physics.

The first, and by far simplest model, is that of a Cauchy problem for a second-order ODE in normal form, i.e.,

$$\begin{aligned} \left\{ \begin{aligned} &y''(t)=f(t,y,y') \\ &y(t_0)=y_0 \\ &y'(t_0)=y_1 \end{aligned}\right. \end{aligned}$$

to be solved, classically, for \(y\in C^2(I;{\mathbb {R}}^k)\), where \(I\subset {\mathbb {R}}\) is an open interval containing \(t_0\). The Cauchy–Lipschitz theorem ensures the local solvability of this problem for any pair of initial data \((y_0, y_1)\in {\mathbb {R}}^k\times {\mathbb {R}}^k\) under mild regularity assumptions on f. For instance, in the case of Newton’s equations of motion one typically takes \(k=3\), and \(y_0\) (respectively \(y_1\)) is the position (respectively the velocity) of the particle whose motion is being described. In the context of this finite-dimensional analogy, the constraints (1.2) should be thought of as taking the form of a submanifold defined by

$$\begin{aligned} P(y_0, y_1)=0 \end{aligned}$$

for some suitably smooth \(P: {\mathbb {R}}^k\times {\mathbb {R}}^k \rightarrow {\mathbb {R}}^{d}\), \(d\ge 1\).
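A toy instance of this picture, which is purely illustrative and not drawn from the literature discussed here, is obtained by taking \(f(t,y,y')=-y\) and \(d=1\) with

$$\begin{aligned} P(y_0,y_1)=|y_0|^2+|y_1|^2-1. \end{aligned}$$

The admissible data then form the unit sphere in \({\mathbb {R}}^k\times {\mathbb {R}}^k\), and the constraint is preserved by the evolution, since along any solution of \(y''=-y\) one has

$$\begin{aligned} \frac{d}{dt}\left( |y|^2+|y'|^2\right) =2\,y\cdot y'+2\,y'\cdot y''=2\,y\cdot y'-2\,y'\cdot y=0. \end{aligned}$$

This mirrors, in the simplest possible setting, the mechanism seen in Steps 2 and 3 above, where conditions imposed at the initial time persist along the whole evolution.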

The second model, which in fact gets much closer to the nature of (1.1), is instead an inhomogeneous nonlinear wave problem

$$\begin{aligned} \left\{ \begin{aligned}& \Box u=f(t,u,\nabla u) \\ &u(t_0)=u_0 \\& \partial _t u(t_0)=u_1 \end{aligned}\right. \end{aligned}$$

which we decided to formulate, for the sake of simplicity, with respect to a flat background metric (i.e., \(\Box \) denotes the standard wave operator). If the right-hand side has certain reasonable properties (see e.g., Strauss 1989) then one can again prove local existence and uniqueness results for the function u whatever the choice of \((u_0, u_1)\in X_0\times X_1\) where \(X_0, X_1\) are suitable functional spaces depending on the specific setup under consideration. In the simplest possible case, that of a linear wave equation, \(u_0\) (respectively \(u_1\)) can be interpreted as the position (respectively the velocity) of a string at the fixed time \(t_0\). In that respect, one may think of the constraints as some additional requirement, defined by means of a map \(T:X_0\times X_1\rightarrow Y\) and taking the form \(T(u_0, u_1)=0\).
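To make the unconstrained nature of this model concrete, consider the simplest case, stated here as an illustration under the assumptions of one space dimension, \(f=0\) and \(t_0=0\). For any sufficiently regular pair \((u_0,u_1)\), with \(u_1\) prescribing the initial time derivative of u, the solution is given by d'Alembert's formula

$$\begin{aligned} u(t,x)=\frac{1}{2}\left[ u_0(x+t)+u_0(x-t)\right] +\frac{1}{2}\int _{x-t}^{x+t}u_1(s)\,ds, \end{aligned}$$

so that the initial position and velocity can indeed be assigned independently and without any compatibility requirement, in sharp contrast with the pairs \((g,k)\) entering (1.2).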

Some key questions and a roadmap

These very basic preliminaries being recalled, we are now ready to give a first outline of the themes we will present in this review. Summarising things to the extreme, we shall be concerned with the solvability of the Einstein constraint equations, (1.2). Our discussion will then stem from the following, intentionally vague, question:

given sources \(\mu , J\) (possibly unspecified but subject to additional energy conditions) and \(\varLambda \), is (1.2) solvable for an assigned background manifold M?

It is here tacitly understood that one looks for physically meaningful solutions, by which we mean, among other things, that additional a priori boundary conditions and/or conditions at infinity are also imposed. Then, proceeding on the question above: is the answer possibly negative depending on the topology of M? Can we place restrictions on the ‘shapes’ of those spacetimes compatible with a certain physical model just by investigating the problem at the level of spacelike slices? What is the qualitative behaviour of solutions and what does this imply in terms of future evolution of the data in question?

It is also natural to wonder about the set of all solutions for a given background manifold and in a certain physical regime: should we aim at uniqueness or, instead, multiplicity results? Are solutions isolated? If not, can we study the local structure of the space of solutions? Are solutions rigid or flexible (i.e., can they be deformed, either locally or globally)? Can we find effective parametrisations?

Starting from such general research directions, mathematicians can easily produce a number of variations on the theme. To highlight a relevant point, we did not specify a definite degree of differentiability for the solutions, assuming (somewhat naively) these would be smooth (i.e., \(C^{\infty }\)), but one can equally well study this network of questions at a lower level of regularity, and indeed there are various situations where a more robust approach proves to be helpful both for physically natural applications and for numerical simulations.

Along a less predictable road, as we shall see later on, the mathematical investigation that stems from the questions above sometimes leads to highly surprising ‘side effects’, such as the discovery of novel classes of solutions which exhibit very unexpected, counterintuitive properties and are far from what we now consider physically detectable phenomena in whatever reasonable sense. As a striking example, we shall mention the construction of localised solutions to the Einstein equations i.e., of a vast class of non-trivial asymptotically flat spacetimes that are exactly Minkowskian away from a Lorentzian cone.

Many readers, especially those farthest from the PDE community, may expect these questions to find their answer in the context of an impeccable, complete and finalised theory. Yet, as a special incarnation of some more universal principles (cf. Klainerman 2010), one could say no such general theory for the Einstein constraints actually exists and the landscape in front of us is rather a collection of partial (although sometimes striking) results obtained by very diverse methods.

In particular, a large fraction of existence results for Eq. (1.2) has been obtained within two different approaches. The first one to be developed is the so-called conformal method, which basically goes back to an idea in Lichnerowicz (1944) and, somewhat more specifically, was formalised in York (1973) (see also Ó Murchadha and York 1973; Choquet-Bruhat and York 1980 and Isenberg 1995 among others). Roughly speaking, the strategy behind this approach, which relies on a clever covariant decomposition of the space of symmetric tensors, is to freely assign certain data as an Ansatz for parametrisation and solve for the remaining subset of unknowns: one considers special (generalised conformal) deformations so as to make the initial problem (1.2), which is of under-determined character, into a determined elliptic system. Section 2 will be devoted to a systematic presentation of this method, in its basic structure, together with a vast selection of recent developments.

The second method, which relates to the analysis in Fischer and Marsden (1975a) concerning the local deformability of scalar curvature, fits into the realm of what we call gluing methods and, for the specific case of the Einstein constraints, owes a lot to Corvino (2000), Corvino and Schoen (2006) and Chruściel and Delay (2003). As the name suggests, this approach is based on the idea of combining/merging together solutions which may in fact be just defined in some subregion of the background manifold, with the goal of thereby generating new classes of data. The detailed description of this method is the object of Sect. 3, where we will see how these techniques have led, in recent years, to the aforementioned construction of rather surprising solutions to (1.1) that exhibit an exotic localisation property (as first discovered in Carlotto and Schoen 2016). In that respect, we will also see how such gluing methodologies have been employed in constructing other classes of data (e.g., asymptotically hyperbolic ones) or for other field equations (e.g., for linearised relativistic constraints, for higher-spin fields and more general instances). Furthermore, in the last decade new, fascinating perspectives have emerged. Among others, the (Sobolev) extension procedure devised in Czimek (2018) has already made it possible to obtain localised counterparts of the \(L^2\)-boundedness conjecture proven in Klainerman et al. (2015); as we shall discuss, this methodology raises a number of significant open questions for the years to come. Again related to gluing and extension problems are also the proposed definitions of quasi-local mass, according to the scheme envisioned in Bartnik (1989): here we will focus on the construction presented in Mantoulidis and Schoen (2015), which produces a large class of horizon data for which the variational problem which comes into play in the definition of Bartnik mass does not have a solution. In turn, these ideas relate to the general question whether it is possible to desingularize singular loci (e.g., codimension one interfaces) keeping the constraints valid.

Finally, in Sect. 4 we will say something about the intriguing problem of studying spaces of solutions to the constraints both in the small (do we have a locally regular Banach/Hilbert manifold structure or are there singular points?), and in the large, namely with the viewpoint of algebraic topology, so as to describe the global shape of the class of all initial data sets that are compatible with a given physical model. On the one hand, the first class of questions will naturally lead us to say something about the phase space for the Einstein equations, and to rephrase them as a Hamiltonian system, i.e., as a system of ODEs in an infinite-dimensional setup. On the other hand we will, in particular, hint at some of the dramatic advances allowed by the employment of the Ricci flow devised in Hamilton (1982) and crowned by Perelman’s breakthroughs (Perelman 2002, 2003a, b) in the study of spaces of positive scalar curvature metrics, which in turn led to a much better understanding of the class of asymptotically flat initial data sets for (1.1), at least in the maximal case.

Constructive versus descriptive

At the cost of running the risk of producing some sort of caricature of the reality of things, one could assert that the Einstein constraints can be studied from two conceptually distinct (yet deeply intertwined) perspectives. The former one, which is the one we have embraced so far, is primarily concerned with the solvability of the constraints and may thus be called constructive.

Yet, there is also a different viewpoint one can take in studying the Einstein constraints. This perspective, which one may call descriptive, is built around the following general question instead: what sort of physical/geometric conclusions can one draw about a triple \((M,g,k)\) in view of the fact that it solves the Einstein constraints? From this viewpoint, one is not really concerned with the existence of some or many solutions to (1.2), but rather (assuming existence as a postulate) studies the implications coming from the constraints.

A striking example in this direction is given by the positive mass theorem. In the simplest possible setting, so for asymptotically flat time-symmetric data (Schoen and Yau 1979b; Witten 1981) one studies the geometric restrictions imposed by the constraints to deduce non-negativity of the ADM mass (for this class of data) together with a full characterisation of the ground state corresponding to zero mass. Other manifestations of the same principle are, more generally, the study of functional inequalities involving the mass, such as the Penrose inequality (proven in Huisken and Ilmanen 2001 for the case of connected boundary and in Bray 2001 in full generality) or inequalities involving the angular momentum and/or the charge of the data in question (see e.g., Dain 2008 or Khuri et al. 2017 as well as Dain and Gabach-Clement 2018 for a highly pleasant survey on these themes). On a different theme, we also wish to mention some recent advances on Bartnik’s stationarity conjecture, i.e., on the conjectural characterisation of mass-minimising extensions of given Bartnik data (cf. Sect. 3.6) as initial data sets that can be isometrically embedded into a stationary vacuum spacetime, see An (2020) and in particular Huang and Lee (2020a); we will say something more about these results to the extent they relate to some central themes of the present survey.

Now, the reality is that there are lots of connections between these two worlds, the constructive and the descriptive ones, and we cannot reasonably present one and neglect the other. That being said, our main focus here will be on the former aspects, leaving the latter ones as a diffused theme that permeates the whole discussion. There are lots of reasons why we made this choice. Since any sort of reasonable coverage of all of the themes we mentioned above goes beyond the scope and the size of a review, one needs to define some sort of reasonable path within the subject and thus, somewhat arbitrarily, opt for certain topics in spite of others. This is especially difficult in our case because both perspectives have proven very fertile in recent years. On (what we called) the descriptive side, the monumental work Lohkamp (2015) on the one hand and Schoen and Yau (2017) on the other, both aimed at an unconditional proof of the Riemannian positive mass theorem for asymptotically flat manifolds of arbitrary dimension, have also opened the way to finally filling a number of significant gaps in the literature, such as (among others) the analysis of the spacetime positive mass theorem, with its equality case, in all dimensions (see Huang and Lee 2020b, cf. Eichmair et al. 2016) and the corresponding results in the asymptotically hyperbolic case (Chruściel and Delay 2019; Huang et al. 2020). That said, we do see some conceptual and technical unity that is common to these works, which are in fact to a large extent covered in the recent, excellent textbook Lee (2019).

By contrast, as it has already been said above, the landscape on the constructive side is much more fragmented, with lots of results scattered throughout the literature, often with very diverse notation and background. While several other surveys on the subject exist, as we will further describe below in Sect. 1.6, we aim here at a broader-spectrum perspective and a more detailed and unified presentation both for the classical and the modern part of the story. That being said, we have (very briefly) summarised the state of the art about positive mass theorems for asymptotically flat (respectively: asymptotically hyperbolic) data sets in Appendix B (respectively: Appendix C) as relevant background coming into play along the course of our discussion of the Einstein constraints.

Special cases and heuristics

In order to give a more concrete character to the roadmap we presented above, and to get an idea of the sort of problems we will be studying (in their connection to several fundamental themes in Geometric Analysis) it may be appropriate for us to say something about the special case when \(k=0\), in which case the constraints take the form of the single equation

$$\begin{aligned} R_g = 2\kappa \mu + 2\varLambda . \end{aligned}$$

This case is known in the literature as Riemannian or time-symmetric. The latter terminology descends from the fact that the spacetime \((L,\gamma )\) produced by solving the Einstein equations with \((M,g,0)\) as initial data, where M is the background manifold we work with, naturally comes with an isometric involution having M as the set of its fixed points. In particular, let us note that the condition \(k=0\) means, geometrically speaking, that the hypersurface M is totally geodesic inside L.

The very first aspect that is apparent from (1.7) is the under-determined nature of the problem: we have one single equation for \(n(n+1)/2\) unknown functions. This may suggest the idea, to a large extent correct, that solutions to this equation (when unobstructed) should exist in abundance. To indicate one way the under-determined character of the problem could be dealt with, let us first focus on the regime where no physical sources are present: under such an assumption, we need to solve the equation \(R_g=2\varLambda \) so we wonder whether the (pre-assigned) manifold M does support metrics of constant scalar curvature, with sign pre-determined depending on the value of the cosmological constant \(\varLambda \). Even when M is a compact manifold without boundary (which is, in some sense, the simplest possible case), we then face a highly non-trivial mathematical question: Does every closed manifold support metrics of constant scalar curvature?

One way to attack this problem is to consider conformal deformations: given a background metric \(g_0\) on M, we consider all metrics of the form \(g=u^{4/(n-2)}g_0\) where u is a smooth positive factor, and we wonder whether one can find, within this conformal class, a metric of constant scalar curvature. This is the famous Yamabe problem (cf. Yamabe 1960), completely solved in Schoen (1984) through a spectacular inversion argument, reducing the analysis of the cases left open by earlier work by Trudinger and Aubin to a statement about the positivity of the ADM mass (see Lee and Parker 1987 for a detailed account). From a PDE perspective, this matter boils down to understanding the solvability of the (critical) semilinear elliptic equation

$$\begin{aligned} -\frac{4(n-1)}{n-2}\varDelta _{g_0} u + R_{g_0} u= u^{\frac{n+2}{n-2}}f \end{aligned}$$

where f is the datum we wish to assign (in this specific case \(f=2\varLambda \), a constant). The conformal method we alluded to in the previous section can indeed be regarded as a generalisation of this trick to the case of the full constraints.
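As a sanity check on the equation above, one may test it in a case where the answer is known. The following sketch is an illustration added here under explicit assumptions: we take \(n=3\) and let \(g_0\) be the flat metric on \({\mathbb {R}}^3\), so that \(R_{g_0}=0\) and the equation reads \(-8\varDelta u=f u^5\). For \(u=(2/(1+|x|^2))^{1/2}\) the metric \(u^4 g_0\) is the round metric of the unit sphere in stereographic coordinates, hence one expects \(f=6\). The function names are hypothetical and the Laplacian is only approximated by finite differences.

```python
import math

def conformal_factor(r):
    """u(r) = sqrt(2 / (1 + r^2)); u^4 times the flat metric is the round
    unit 3-sphere in stereographic coordinates (assumed example, n = 3)."""
    return math.sqrt(2.0 / (1.0 + r * r))

def radial_laplacian(f, r, h=1e-3):
    """Flat Laplacian on R^3 of a radial function, f'' + (2/r) f',
    approximated by central finite differences with step h."""
    first = (f(r + h) - f(r - h)) / (2.0 * h)
    second = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    return second + 2.0 * first / r

def prescribed_curvature(r):
    """Solve -8 * Laplacian(u) = f * u^5 for f at radius r (here R_{g_0} = 0)."""
    u = conformal_factor(r)
    return -8.0 * radial_laplacian(conformal_factor, r) / u ** 5

if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        # Each value should be close to 6, the scalar curvature of the unit S^3.
        print(r, prescribed_curvature(r))
```

The value returned is independent of the radius, up to discretisation error, which reflects the fact that the conformal metric has constant scalar curvature.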

In spite of the affirmative solution to the Yamabe problem, which ensures the existence of constant scalar curvature metrics in any given conformal class, recall that, when one fixes (a background manifold M and) a conformal class \([g_0]\), the sign of any constant scalar curvature metric is uniquely determined by \([g_0]\) (see Theorem 2.3); thus for pre-assigned \(\varLambda \) there may well be conformal classes for which the equation is not solvable. To be concrete, if we take \(M=S^3\) and \(\varLambda >0\) then, as we shall see, there are plenty of conformal classes for which Eq. (1.8) does not have solutions, which shows the weakness of an Ansatz-based approach like the one described above. In many respects, it is then more fruitful to allow for non-conformal deformations (i.e., deformations that are not restricted to happen within a given conformal class), which is indeed the perspective we shall embrace both in Sect. 3 and, even more explicitly, in Sect. 4.

However, when doing so, one faces new and somewhat unexpected phenomena. Indeed, keeping in mind what we wrote above one may ask: if a manifold M supports a metric of constant scalar curvature, say \(+1\), can we rule out a priori the existence of constant scalar curvature metrics attaining values 0 or \(-1\)? Setting aside the case when \(n=2\), so when M is a closed surface, which can be easily treated with nineteenth century tools such as the Gauss–Bonnet theorem and uniformisation results, for \(n\ge 3\) the story gets a lot more intricate. About this question, the first thing to say is that the answer is negative, for indeed (when \(n\ge 3\)) any closed manifold supports metrics of (constant) negative scalar curvature (see Theorem 4.13). Hence, it is clear, for instance, that the n-dimensional standard sphere supports metrics with both positive and negative constant scalar curvature. On the other hand, there are indeed some restrictions on the sign of a constant scalar curvature metric on a given manifold: for instance, it is a well-known result that n-dimensional tori do not support metrics of non-negative scalar curvature except the flat ones (this statement used to be referred to as the Geroch conjecture and is now more often referred to as the torus rigidity theorem, cf. Schoen and Yau 1979c; Gromov and Lawson 1980a). Significantly enough, a (by now) well-known compactification argument due to Lohkamp (1999) allows one to derive the n-dimensional Riemannian positive mass theorem from the corresponding n-dimensional torus rigidity theorem, which is indeed the approach adopted in Schoen and Yau (2017).

In the first part of this section we have tried to give an idea (sticking to the time-symmetric case) of the way the under-determined problem (1.7) can be turned into a determined one by arbitrarily deciding to work within a conformal class. Let us now instead describe, in the same special case, some heuristics behind gluing methods. The key issue here is to learn how to deform an approximate solution (obtained, for instance, by interpolating between two solutions) into an actual, exact solution of the constraints. From an analytic perspective, the first thing to do is to consider the linearisation, at a given datum g, of the scalar curvature operator. If we do so, we obtain the elliptic operator

$$\begin{aligned} L_g h=-\varDelta _g(tr_g h)+div_g(div_g h)- g(h, {\mathrm {Ric}}_g). \end{aligned}$$

Whenever we work in an asymptotic regime (e.g., locally near a point, or instead near infinity for suitably asymptotically flat or asymptotically hyperbolic data), say \(g=g_0+h\) where \(g_0\) is a reference metric and h is a suitably small additional term, we have that \(R_g=R_{g_0+h}\) shall differ from \(L_{g_0} h\) by higher-order terms, hence the analysis of (1.7) reduces, at least in some respects, to the study of the mapping properties of the operator \(L_g\), as well as of its (formal) dual

$$\begin{aligned} L^{*}_{g} u=-(\varDelta _g u) g + \mathrm {Hess}_g u-u {\mathrm {Ric}}_g. \end{aligned}$$

Loosely speaking, one can exploit the under-determined nature of the equation we wish to solve to prove an a priori estimate, which one may call coercivity estimate, for the over-determined adjoint operator \(L^{*}_g\). By this we mean an estimate of the form

$$\begin{aligned} \Vert L^{*}_{g_0} u\Vert _{Y}\ge C \Vert u\Vert _X \end{aligned}$$

where X, Y denote suitable Banach spaces of tensors of the appropriate type, and we regard \(L^*_g: X \rightarrow Y\). This is nothing but a quantitative, uniform injectivity estimate for \(L^*_g\), whence standard duality arguments allow one, at least formally, to derive surjectivity of \(L_g\), which ultimately means solvability of the linearised equation. The obstruction to this, intentionally oversimplified, approach is the existence of static potentials, i.e., of elements \(u\in X\) belonging to the kernel of the operator \(L^{*}_{g_0}\) (cf. Appendix D): in this case the idea is to work orthogonally to the kernel (equivalently: orthogonally to the cokernel of the linearised scalar curvature operator) and to reduce the solvability of (1.7) to a finite-dimensional problem that can be tackled, for instance, by means of degree-theoretic methods (as was the case in Corvino 2000).
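To make the notion of static potential concrete, the following sketch (an illustration added here, not taken from the references) specialises the formula for \(L^{*}_g\) to flat \({\mathbb {R}}^3\), where \({\mathrm {Ric}}_g=0\) and hence \(L^{*}_g u=-(\varDelta u)\delta +\mathrm {Hess}\, u\). The Hessian is approximated by central finite differences; the names `hessian` and `adjoint_flat`, the step size and the sample point are hypothetical choices. Affine functions lie in the kernel, whereas \(u=|x|^2\) does not, since in that case \(L^{*}_g u=-4\delta \).

```python
def hessian(u, x, h=1e-2):
    """Matrix of second partial derivatives of u at the point x,
    approximated by central finite differences with step h."""
    n = len(x)

    def shifted(i, si, j, sj):
        # Evaluate u at x + si*h*e_i + sj*h*e_j.
        y = list(x)
        y[i] += si * h
        y[j] += sj * h
        return u(y)

    return [[(shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
              - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4.0 * h * h)
             for j in range(n)] for i in range(n)]

def adjoint_flat(u, x):
    """L*_g u = -(Laplacian u) g + Hess u on flat R^n, where Ric = 0
    and g is the identity matrix in Cartesian coordinates."""
    H = hessian(u, x)
    n = len(x)
    laplacian = sum(H[i][i] for i in range(n))
    return [[H[i][j] - (laplacian if i == j else 0.0) for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    point = [0.3, -1.2, 0.7]
    affine = lambda x: 1.0 + 2.0 * x[0] - x[1] + 3.0 * x[2]
    quadratic = lambda x: x[0] ** 2 + x[1] ** 2 + x[2] ** 2
    print(adjoint_flat(affine, point))     # entries close to 0: a static potential
    print(adjoint_flat(quadratic, point))  # close to -4 times the identity
```

In this flat model the kernel of \(L^{*}_g\) is thus spanned by the constants and the coordinate functions, which is the finite-dimensional obstruction one has to work orthogonally to.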

More generally, some of these ideas and techniques also come into play when we study Eq. (1.7) in presence of sources, in which case one is concerned with a scalar curvature prescription problem. Most often, the study of this equation is performed assuming that the physical sources in question satisfy some reasonable energy assumption. In the context of the present review we will typically focus on the case when the dominant energy condition is postulated to hold. That is to say: we assume the stress energy tensor T is such that for any future-pointing causal vector field V the vector field \(-T(V,\cdot )^{\#}\) is itself future-pointing causal. This condition forces, at the level of initial data sets, the inequality

$$\begin{aligned} \mu \ge |J|_g, \end{aligned}$$

which reduces, in the time-symmetric case we are dealing with in this section, to the requirement that the energy density \(\mu \) be non-negative at all points. Hence, when \(\varLambda \ge 0\) (i.e., for positive or negligible cosmological constant) equation (1.7) concerns the study of spaces of positive (in general: non-negative) scalar curvature metrics, a theme that has been thoroughly investigated for decades (cf. e.g., Carlotto 2021; Gromov 2018; Rosenberg and Stolz 2001; Schick 2014 as well as references therein).

We conclude this section by mentioning another special case of (1.2) (in fact: a strict generalisation of the previous one) which, in frequent circumstances, can be investigated by means of similar methods. This is the so-called maximal case, which corresponds to the analysis of the constraints under the assumption that \(tr_g k=0\). Hence, (1.2) takes the special form

$$\begin{aligned} \left\{ \begin{aligned} &R_g -\Vert k\Vert ^2_g=2\kappa \mu +2\varLambda \\ &tr_g(k)=0 \\ &div_g(k)=\kappa J. \end{aligned}\right. \end{aligned}$$

The reason why this terminology is employed is by analogy with the well-known case of minimal surfaces in Riemannian geometry: in a Lorentzian manifold a spacelike hypersurface is called maximal when it is a critical point for the area functional, in which case it does indeed locally maximize area (for fixed boundary). The significant advantage of the maximal case, compared to the time-symmetric one, is that this condition is a lot less restrictive for the resulting spacetime obtained by evolving the corresponding data. In other words, while it is somewhat rare (in a sense to be suitably specified) for a given spacetime to contain a time-symmetric spacelike slice, it is much less so for a maximal spacelike slice (see e.g., Marsden and Tipler 1980 and Bartnik 1984 for classical existence results in this direction). On the other hand, the form of the constraints in the maximal case is still rather special, and reasonably tractable, when compared to (1.2). In particular, if \(\varLambda \ge 0\) and the dominant energy condition is assumed to hold for the sources, then the first equation \(R_g=\Vert k\Vert ^2_g+2\kappa \mu +2\varLambda \) indicates that we are within the (comparatively well-understood) realm of positive scalar curvature metrics.

Other surveys and bibliographical references

The present review is by no means the first survey devoted to the relativistic constraint equations, and does not aim to be a superset of the existing ones. Among very many bibliographical references, we would like to mention, in chronological order and limited to the last twenty years, Bartnik and Isenberg (2004), Corvino and Pollack (2011), Isenberg (2014), Galloway et al. (2015) and (among surveys of broader character) Chruściel (2005) as well as Chruściel et al. (2010b). The monograph Choquet-Bruhat (2009) is also highly recommended, see in particular Chap. VI therein.

The main differences between the present work and most of the ones we listed above are perhaps (besides the scale of this review) a greater emphasis on the PDE aspects of the theory and a special focus on some of the latest developments in the field. That being said, we have tried hard to build a presentation which may, in spite of its self-evident mathematical nature, be accessible and appealing to a large audience (hopefully transversal to the mathematical and physical communities). Precisely for this reason, we have refrained from presenting any results in their greatest possible generality, or under sharp technical assumptions, but have rather decided to set technicalities aside and build a path around a few key ideas in the field. A word on open problems: these have been presented in due course along our discussion with the aim of indicating, in a more or less unified context, some questions that naturally arise in the development of the theory (as it is now). There is no doubt that some of them are well-known, at least to the experts in the field. A selection of ten questions has been especially highlighted and is collected in Table 4.

An important remark we wish to add is that a fundamental cluster of topics has been omitted from this survey: we did not even try to account for the several remarkable contributions to the numerical study of the Einstein constraints. This is partly due to the lack of expertise of the author and, to some extent, to the ascertainment that a proper treatment of this topic would deserve a separate review in its own right (cf. Cook 2000); we refer the interested reader to the monograph Baumgarte and Shapiro (2010) as well as to the corresponding sections, and the remarkably rich, updated bibliography in Holst et al. (2016). In addition, a second theme that comes up a few times in the discussion (see, in particular, Sect. 2.4) but that we decided not to treat in detail is the study of the Einstein constraints under symmetry assumptions (i.e., in presence of one or more spacelike Killing vector fields). In the case when we postulate the presence of exactly one Killing vector field the constraints take the form of a semilinear (coupled) elliptic system on a surface, of Liouville/Toda type, which partly resembles the output of the conformal method (namely: the conformally rephrased constraints, which we present in Sect. 2.2). For interesting, recent contributions along these lines, see Huneau (2016) (first part) and Huneau (2015) (second part); Choquet-Bruhat and Moncrief (2003) and Moncrief (2013) are also suggested for a broader contextualisation.

Needless to say, the literature on the subject of the Einstein constraints is huge and extremely diverse. We have tried to write a reasonable account, without unrealistic claims of completeness, focusing on some key ideas, on significant open problems, and stressing connections between different techniques and applications. This review is inevitably built on some choices, and on the more accurate exposition of some contributions at the expense of others. Such choices are not meant to reflect some pre-defined hierarchies of value, but only serve the purpose of building clear images of some aspects of the constraint landscape, guiding the reader towards a deeper exploration.

Solving the constraints via conformal methods

In this section we start by presenting the conformal method for studying the Einstein constraint equations and discuss, following Isenberg, existence and non-existence of solutions in the so-called CMC case, namely when the mean curvature of the initial data set to be constructed is a constant scalar. This will also be the starting point for the more general discussion to be given in the sequel, and devoted to more recent developments in very many directions (construction of initial data with varying mean curvature, on different classes of background manifolds and possibly in presence of physical sources of several significant types...).

A primer in conformal geometry

We shall first recall the definition of the Yamabe invariant and a simple characterisation thereof. Let \((M,g_0)\) be a compact manifold without boundary and let \([g_0]\) be the conformal class of the Riemannian metric \(g_0\). If \(n\ge 3\) and one sets \(g=u^{4/(n-2)}g_0\) then the scalar curvature changes according to the well-known equation

$$\begin{aligned} R_{g}=-\frac{1}{c(n)}u^{-\frac{n+2}{n-2}}P_{g_0} u \end{aligned}$$

where

$$\begin{aligned} P_{g_0} u =\varDelta _{g_0}u-c(n)R_{g_0} u, \ \ c(n)=\frac{n-2}{4(n-1)}. \end{aligned}$$

Definition 2.1

In the setting above, we define the conformal Yamabe constant of \((M,[g_0])\) as

$$\begin{aligned} Y(M,[g_0])=\inf \left\{ \frac{\int _{M}R_g dV_g}{V(M,g)^{\frac{n-2}{n}}} \ : \ g\in [g_0] \right\} \end{aligned}$$

or equivalently

$$\begin{aligned} Y(M,[g_0])=\inf _{u>0, u\in H^1(M,g_0)} \ \frac{\int _M (c(n)^{-1}|\nabla _{g_0}u|^2+R_{g_0}u^2)\,dV_{g_0}}{\left( \int _{M}u^{\frac{2n}{n-2}}\,dV_{g_0}\right) ^{\frac{n-2}{n}}}. \end{aligned}$$

This number is (tautologically) a conformal invariant and, differently from what happens in dimension two, it encodes more than purely topological information. We note that the functional, defined on smooth Riemannian metrics, by

$$\begin{aligned} E(g)=\frac{\int _{M}R_g dV_g}{V(M,g)^{\frac{n-2}{n}}} \end{aligned}$$

is often referred to in the literature as renormalized Einstein-Hilbert functional. In this language, we remark how the solution of Yamabe’s problem, which we already alluded to in the introduction, followed by showing that (in the setting above) \(E: [g_0] \rightarrow {\mathbb {R}}\) does attain its infimum.

It is very hard, without additional assumptions, to compute the value of \(Y(M,[g_0])\). One instance where this value is known is the case of round spheres (see e.g., Theorem 3.3 in Lee and Parker 1987). The results in Aubin (1976) and then Schoen (1984) allow one to conclude that for any background (closed, connected) manifold and/or conformal class, other than the standard conformal class of the sphere, the value of \(Y(M,[g_0])\) is strictly less than that computable threshold (which is then the essential input to infer the existence of a minimiser for the variational problem given above).

That being said, there are some more abstract (or: less explicit) results concerning the problem of determining the value of the conformal Yamabe constant. The most basic one can be stated as follows: we know that \(Y(M,[g_0])=E(g_0)\) if \(g_0\) is Einstein, namely \({\mathrm {Ric}}_{g_0}=c g_0\) for some \(c\in {\mathbb {R}}\) (given the existence of a minimiser in the conformal class this follows from the rigidity results in Obata (1962, 1971), the latter for the specific case of the standard conformal class of spheres). Luckily, for most purposes one does not need to know the exact value of the conformal Yamabe constant, but rather only determine its sign.

With that goal in mind, it is useful to compare \(Y(M,[g_0])\) with the first eigenvalue of the conformal Laplace operator \(-P_{g_0}\) namely with the value

$$\begin{aligned} \lambda _1(-P_{g_0})=\inf _{u\in H^1(M,g_0)\setminus \left\{ 0\right\} }\frac{\int _M (|\nabla _{g_0}u|^2+c(n)R_{g_0}u^2)\,dV_{g_0}}{\int _{M}u^2\,dV_{g_0}} =\inf _{\begin{array}{c} u\in H^1(M,g_0),\\ \Vert u\Vert _{L^2(M,g_0)}=1 \end{array}}I(u) \end{aligned}$$

where we have set

$$\begin{aligned} I(u)=\int _{M}(|\nabla _{g_0}u|^2+c(n)R_{g_0}u^2)\,dV_{g_0}. \end{aligned}$$

It is a basic result in linear analysis that the value \(\lambda _1(-P_{g_0})\) is attained, and in fact there exists \(u_1\in H^1(M,g_0)\) such that \(u_1>0\) and \(P_{g_0}u_1=-\lambda _1 u_1\). That being said, a helpful fact is contained in the following assertion:

Proposition 2.2

In the setting above, the two numbers \(Y(M,[g_0])\) and \(\lambda _1(-P_{g_0})\) are either both zero or have the same sign.


Proof

We start by obtaining two general comparison estimates. Directly from Hölder’s inequality we get

$$\begin{aligned} \frac{1}{\Vert u^2\Vert _{L^{\frac{n}{n-2}}(M,g_0)}}\le \frac{V(M,g_0)^{2/n}}{\Vert u^2\Vert _{L^1(M,g_0)}}. \end{aligned}$$

Hence, the following two implications are straightforward:

  • if \(I(u)\ge 0\) for all \(u\in H^1(M,g_0)\) then (multiplying the above inequality by \(c(n)^{-1}I(u)\) and taking the infimum in u) one gets

    $$\begin{aligned} 0\le Y(M,[g_0])\le \lambda _1(-P_{g_0})\frac{V(M,g_0)^{2/n}}{c(n)}; \end{aligned}$$
  • if \(I(u)< 0\) for some \(u\in H^1(M,g_0)\) then one similarly gets

    $$\begin{aligned} 0> Y(M,[g_0])\ge \lambda _1(-P_{g_0})\frac{V(M,g_0)^{2/n}}{c(n)}. \end{aligned}$$

Now, we argue as follows:

  1a. if \(Y(M,[g_0])<0\) then there exists \(u\in H^1(M,g_0)\) such that \(I(u)<0\) hence by (2.6) \(\lambda _1<0;\)

  1b. if \(\lambda _1<0\) then there exists \(u\in H^1(M,g_0)\) such that \(I(u)<0\) hence \(Y(M,[g_0])<0;\)

  2a. if \(Y(M,[g_0])>0\) then for all \(u\in H^1(M,g_0)\) one has \(I(u)\ge 0\) hence by (2.5) \(\lambda _1>0;\)

  2b. if \(\lambda _1>0\) then for all \(u\in H^1(M,g_0)\) one has \(I(u)\ge 0\) hence \(Y(M,[g_0])\ge 0\) and we give below an ad hoc argument to rule out the possibility that \(Y(M,[g_0])=0;\)

  3a. if \(Y(M,[g_0])=0\) then for all \(u\in H^1(M,g_0)\) one has \(I(u)\ge 0\) hence by (2.5) \(\lambda _1\ge 0\) and we again appeal to the argument below to rule out the possibility that \(\lambda _1>0;\)

  3b. if \(\lambda _1=0\) then for all \(u\in H^1(M,g_0)\) one has \(I(u)\ge 0\) hence by (2.5) \(Y(M,[g_0])=0.\)

So, in order to fully complete the proof we only need to check that

$$\begin{aligned} Y(M,[g_0])=0 \ \ \Longrightarrow \ \ \lambda _1(-P_{g_0})=0. \end{aligned}$$

For the sake of a contradiction, let us assume instead that \(\lambda _1(-P_{g_0})\ne 0\). Clearly, it cannot be \(\lambda _1<0\): indeed, if it were the case then (keeping in mind (2.1)) we get that the metric \(g=u_1^{4/(n-2)}g_0\), where \(u_1>0\) is the first eigenfunction of \(-P_{g_0}\), is such that \(R_{g}<0\) hence \(Y(M,[g_0])<0\), contradiction. Thus, we are left with the case \(\lambda _1>0\), so that (similarly replacing, without renaming, \(g_0\) by \(u_1^{4/(n-2)}g_0\)) we can assume without loss of generality that \(R_{g_0}>0\). Now, \(Y(M,[g_0])=0\) only happens if one can find a sequence \(\left\{ u_k\right\} \subset H^1(M,g_0)\) such that

$$\begin{aligned} \frac{\int _M |\nabla _{g_0}u_k|^2\,dV_{g_0}}{\left( \int _{M}u^{\frac{2n}{n-2}}_k\,dV_{g_0} \right) ^{\frac{n-2}{n}}}\le \delta '_k, \ \ \frac{\int _M |u_k|^2\,dV_{g_0}}{\left( \int _{M}u^{\frac{2n}{n-2}}_k\,dV_{g_0}\right) ^{\frac{n -2}{n}}}\le \delta ''_k \end{aligned}$$

with \(\delta '_k, \delta ''_k\) two sequences of positive numbers such that \(\delta '_k+\delta ''_k\rightarrow 0\) as one lets \(k\rightarrow \infty \). Yet, for k large enough this would violate the Sobolev inequality (see e.g., Aubin 1998)

$$\begin{aligned} \left( \int _{M}u^{\frac{2n}{n-2}}\,dV_{g_0}\right) ^{\frac{n-2}{n}}\le C_1 \int _M |\nabla _{g_0}u|^2\,dV_{g_0} +C_2 \int _M |u|^2\,dV_{g_0} \end{aligned}$$

and this contradiction completes the proof. \(\square \)

As a byproduct, this fact ensures that the sign of \(\lambda _1(-P_{g})\) is invariant for \(g\in [g_0]\). More remarkably, the sign of this number determines whether M can be endowed with a conformal metric whose scalar curvature has that sign.

Theorem 2.3

(trichotomy theorem) Let \((M^n,g_0)\) be a connected, compact manifold without boundary. Then, there are exactly three mutually exclusive possibilities, depending on the value of \(Y(M,[g_0])\):

  1. \(Y(M,[g_0])>0\), if and only if there exists \(g\in [g_0]\) with \(R_g>0\), if and only if \(\lambda _1(-P_{g_0}) > 0\).

  2. \(Y(M,[g_0])=0\), if and only if there exists \(g\in [g_0]\) with \(R_g=0\), if and only if \(\lambda _1(-P_{g_0}) = 0\).

  3. \(Y(M,[g_0])<0\), if and only if there exists \(g\in [g_0]\) with \(R_g<0\), if and only if \(\lambda _1(-P_{g_0}) < 0\).

Proof


The assertions follow from Proposition 2.2, for indeed if \(\lambda _1>0\) (resp. \(\lambda _1=0\) or \(\lambda _1<0\)) then, just by looking at equation (2.1), the metric \(g{:}{=}u_1^{4/(n-2)}g_0\) satisfies \(R_g>0\) (resp. \(R_g=0\) or \(R_g<0\)), where \(u_1\in H^1(M,g_0)\) denotes a first positive eigenfunction for \(P_{g_0}\) (hence with our sign convention \(P_{g_0}u_1=-\lambda _1 u_1\)); thus the conclusion follows once we prove these two claims ensuring the mutual incompatibility of the options:

  • if \(\lambda _1<0\) there cannot be \({\hat{g}}\in [g_0]\) such that \(R_{{\hat{g}}}\ge 0\);

  • if \(\lambda _1=0\) then there is no metric in \([g_0]\) whose scalar curvature is either everywhere positive or everywhere negative.

However, both statements are straightforward applications of the maximum principle given the equation describing the change of scalar curvature under conformal change of a background metric. \(\square \)

Remark 2.4

A conclusion of this type is definitely not true if we do not work within a conformal class. The reader may wish to compare Theorem 2.3 with our preliminary discussion given in Sect. 1.5 and with the results in Theorem 4.13.
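As a concrete illustration of Proposition 2.2 and of the first case in Theorem 2.3, consider a metric \(g_0\) with constant scalar curvature. In that case constants are first eigenfunctions of \(-P_{g_0}=-\varDelta _{g_0}+c(n)R_{g_0}\), so \(\lambda _1(-P_{g_0})=c(n)R_{g_0}\), and the upper bound \(\lambda _1 V(M,g_0)^{2/n}/c(n)\) from the proof of Proposition 2.2 coincides with \(E(g_0)=R_{g_0}V(M,g_0)^{2/n}\). The sketch below, added here as a worked arithmetic example with hypothetical function names, evaluates these quantities for the round unit 3-sphere, assuming the standard values \(R_{g_0}=6\) and \(V=2\pi ^2\). Since the round metric is Einstein, \(E(g_0)\) equals the conformal Yamabe constant, so the bound is attained.

```python
import math

def c(n):
    """Dimensional constant c(n) = (n - 2) / (4 (n - 1))."""
    return (n - 2) / (4.0 * (n - 1))

def first_eigenvalue_constant_curvature(n, scalar_curvature):
    """lambda_1(-P_{g_0}) when R_{g_0} is constant: constants are first
    eigenfunctions of -Laplacian + c(n) R, so lambda_1 = c(n) R."""
    return c(n) * scalar_curvature

def yamabe_upper_bound(n, scalar_curvature, volume):
    """Upper bound lambda_1 * V^{2/n} / c(n) for the Yamabe constant."""
    lam = first_eigenvalue_constant_curvature(n, scalar_curvature)
    return lam * volume ** (2.0 / n) / c(n)

def einstein_hilbert(n, scalar_curvature, volume):
    """E(g) = (R * V) / V^{(n-2)/n} = R * V^{2/n} for constant R."""
    return scalar_curvature * volume ** (2.0 / n)

if __name__ == "__main__":
    n, R, V = 3, 6.0, 2.0 * math.pi ** 2  # round unit 3-sphere (assumed values)
    print(first_eigenvalue_constant_curvature(n, R))  # 0.75, positive
    print(yamabe_upper_bound(n, R, V))                # about 43.82
    print(einstein_hilbert(n, R, V))                  # same value
```

Both \(\lambda _1\) and the Yamabe constant are positive here, in agreement with the statement that the two numbers share the same sign.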

Setting up the conformal method

We shall be concerned here with the solvability of the vacuum Einstein constraint equations on a connected, compact 3-manifold with empty boundary (henceforth denoted by M). The extension of our analysis to the higher-dimensional case is straightforward, with modifications of purely notational character.

As we anticipated above, the idea behind the conformal method is that system (1.2) is underdetermined, hence one introduces free data that serve as parametrising data for the space of solutions to a well-posed elliptic system. At a formal level (hence with no emphasis on the regularity of the objects that we introduce, which will be addressed later) one proceeds as follows:

  1. given M as above we consider free data \((g,\sigma ,\tau )\) where

    • g is a Riemannian metric on M;

    • \(\sigma \) is a transverse traceless (0, 2) tensor, namely \(div_g(\sigma )=0\) and \(tr_g (\sigma ) =0\);

    • \(\tau \) is a function on M;

  2. we seek solutions to (1.2) of the form

    $$\begin{aligned} \left\{ \begin{aligned}& {\overline{g}}=u^4 g \\& {\overline{k}}=u^{-2}(\sigma +K_g W)+\frac{\tau }{3}u^4 g \end{aligned}\right. \end{aligned}$$

    where it is always tacitly assumed that \(u>0\) and \(K_g\) is the conformal Killing operator associated to the Riemannian manifold \((M,g)\), namely

    $$\begin{aligned} K_gW={\mathscr {L}}_{W}g-\frac{1}{3}tr_g({\mathscr {L}}_{W}g)g \end{aligned}$$

    or, equivalently, \(K_g W={\mathscr {L}}_{W}g -\frac{2}{3}div_g (W)g\) for any given smooth vector field W on M;

  3.

    one checks that \((M,{\overline{g}},{\overline{k}})\) is an initial data set solving (1.2) in the vacuum case if and only if the pair \((u,W)\) solves the elliptic system

    $$\begin{aligned} \left\{ \begin{aligned} &\varDelta _g u -\frac{1}{8}R_g u=\frac{1}{12}\tau ^2u^5 -\frac{1}{8}\Vert \sigma +K_g W\Vert ^2_g u^{-7} \\& div_g(K_g W)=\frac{2}{3}u^{6}d\tau . \end{aligned}\right. \end{aligned}$$

We note that one can easily derive (2.9) from (1.2) given the two basic formulae (see e.g., Besse 2008)

$$\begin{aligned} R_{{\overline{g}}}=-8u^{-5}(\varDelta _gu-\frac{1}{8}R_gu), \ \ div_{{\overline{g}}}(u^{-2}k)=u^{-6}div_g(k). \end{aligned}$$
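
For the reader's convenience, here is a sketch of that derivation. It assumes that, in vacuum and with \(\varLambda =0\), system (1.2) reads \(R_{{\overline{g}}}-\Vert {\overline{k}}\Vert ^2_{{\overline{g}}}+(tr_{{\overline{g}}}{\overline{k}})^2=0\) and \(div_{{\overline{g}}}({\overline{k}})-d(tr_{{\overline{g}}}{\overline{k}})=0\), and it uses that \(\sigma +K_gW\) is trace-free with respect to g and that \(\sigma \) is divergence-free:

$$\begin{aligned} \Vert {\overline{k}}\Vert ^2_{{\overline{g}}}&=u^{-12}\Vert \sigma +K_g W\Vert ^2_g+\frac{\tau ^2}{3}, \qquad tr_{{\overline{g}}}({\overline{k}})=\tau , \\ 0&=R_{{\overline{g}}}-\Vert {\overline{k}}\Vert ^2_{{\overline{g}}}+\tau ^2 =-8u^{-5}\left( \varDelta _g u-\frac{1}{8}R_g u\right) -u^{-12}\Vert \sigma +K_g W\Vert ^2_g+\frac{2}{3}\tau ^2, \\ 0&=div_{{\overline{g}}}({\overline{k}})-d\tau =u^{-6}div_g(K_g W)+\frac{1}{3}d\tau -d\tau . \end{aligned}$$

Multiplying the second line by \(-u^{5}/8\) and the third by \(u^{6}\) gives the two equations of the elliptic system displayed in item 3.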

Thus, to summarise, the conformal method (which combines ideas of Choquet-Bruhat, Lichnerowicz and York) is the prototype of what might be called a seed-to-solution method, where the seeds are just parameters introduced to transform (1.2) into a determined problem and one expects/hopes that, under reasonable assumptions, (2.9) is solvable (hopefully with a well-behaved and well-understood seed-to-solution map). Incidentally, there is also a deeper (and less obvious) reason why this procedure is indeed effective, which concerns the fact that it transforms a quasilinear (coupled) elliptic system into a semilinear one, thereby allowing one to apply the method of sub- and supersolutions, as we are about to see.

Remark 2.5

Observe that \(\tau =tr_{{\overline{g}}}({\overline{k}})\), so that if \((M^3,{\overline{g}},{\overline{k}})\) is an initial data set for the Einstein field equations and \((L,\gamma )\) is the associated maximal spacetime then \(\tau \) is the mean curvature of such a spacelike slice inside the Lorentzian manifold in question.

These preliminaries being presented, we now wish to address the natural question about the mathematical origin of the Ansatz given above. This relies on the description of the York decomposition of symmetric (0, 2) tensors. We remark how this can be regarded as a special case of a suitably general Hodge decomposition theorem, but we stick to the special case in question to avoid unnecessary complications.

We let S denote the space of smooth, i.e., \(C^{\infty }\), (0, 2) symmetric tensors on M; note that using the background metric g, assigned on M, we can naturally define an inner product on S. Hence, we have a pointwise orthogonal decomposition of the tensors in S into their pure-trace part and their trace-free part (the space of trace-free tensors being denoted by T). This decomposition happens, so to say, at a purely algebraic level. Letting X denote the space of smooth, i.e., \(C^{\infty }\), vector fields on M, we note that \(K_g: X\rightarrow S\) actually takes values in the subspace T, and we will henceforth regard \(K_g: X\rightarrow T\).

At that stage, one can further decompose the space T. Indeed, this space admits a canonical \(L^2\)-orthogonal decomposition

$$\begin{aligned} T=TT \oplus _{\perp } K_g X \end{aligned}$$

where \(K_g\) is the conformal Killing operator that has been introduced above. In other words, given \(\rho \in T\) there is a unique decomposition

$$\begin{aligned} \rho =\sigma + K_g W \end{aligned}$$

where \(\sigma \in TT\), the subspace of transverse trace-free tensors in T. Let us see why.

Consider the \(L^2\)-pairing \((\cdot ,\cdot )\) and let \(K_g^{*}\) denote the formal adjoint of \(K_g\), which is characterised by the equation

$$\begin{aligned} (K_g W,\sigma )=(W,K_g^{*}\sigma ) \ \ \ \forall \ W\in X, \ \forall \ \sigma \in T. \end{aligned}$$

One has the following two facts:

  • a direct computation shows that \(K_g^{*}\sigma =(-2 div_g \sigma +\frac{2}{3}d(tr_g \sigma ))^{\#}\) so in particular for \(K_g^{*}: T\rightarrow X\) one has \(K_g^{*}\sigma =(-2 div_g \sigma )^{\#}\);

  • the composite operator \(-K_g^{*}K_g: X\rightarrow X\) is self-adjoint and elliptic.
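
For trace-free \(\sigma \) the first fact can be checked in one line. The following is a sketch in index notation (the index conventions are ours), using that \(\sigma \) is symmetric, that \(\langle g,\sigma \rangle _g=tr_g(\sigma )=0\), and the divergence theorem on the closed manifold M:

$$\begin{aligned} (K_g W,\sigma )=\int _M \left\langle {\mathscr {L}}_{W}g-\frac{2}{3}div_g(W)g,\sigma \right\rangle _g d\mu _g =\int _M 2\sigma ^{ij}\nabla _i W_j \, d\mu _g =-2\int _M (div_g \sigma )(W)\, d\mu _g . \end{aligned}$$

In particular, a tensor \(\sigma \in T\) is transverse if and only if \(K_g^{*}\sigma =0\), which is exactly the statement that \(\sigma \) is \(L^2\)-orthogonal to \(K_g X\).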

We claim that \(\ker (K_g)=\ker (K_g^{*}K_g)\). For indeed, it is obvious that \(\ker (K_g)\subseteq \ker (K_g^{*}K_g)\) while the other implication is checked as follows: if \(W\in \ker (K_g^{*}K_g)\) then for all \({\tilde{W}}\in X\) one has

$$\begin{aligned} 0=(-K_g^*K_g W,{\tilde{W}})=-(K_g W,K_g{\tilde{W}}) \end{aligned}$$

so that in particular we can take \({\tilde{W}}=W\) and conclude \(0=(K_g W,K_g W)\) which is only possible if \(W\in \ker (K_g)\). Using the second statement given above, we then gain the orthogonal decomposition \(X=Y\oplus _{\perp }Y^{\perp }\) where \(Y{:}{=}\ker (K_g)=\ker (K_g^*K_g)\), a finite-dimensional vector space. Furthermore, the restriction

$$\begin{aligned}-K_g^{*}K_g: Y^{\perp }\rightarrow Y^{\perp }\end{aligned}$$

is a linear isomorphism. Hence, given \(\rho \in T\) consider \(-K_g^{*}\rho \in X\) and let \(W\in X\) be such that \(-K_g^{*}\rho =-K_g^*K_g W\) (such an element W may not be uniquely determined, but in any event \(K_gW\) is canonical; note that the existence of W is granted since trivially \(-K_g^{*}\rho \in Y^{\perp }\)). Now, set \(\sigma =\rho -K_gW\): this is a trace-free, divergence-free tensor (see above) and thereby the claim is verified.
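
Once ellipticity has provided the isomorphism, the argument above is pure linear algebra, and it can be mimicked in finite dimensions. In the following Python sketch (an illustration, not part of the original argument) a hypothetical matrix K plays the role of \(K_g\), its transpose plays the role of \(K_g^{*}\), and the normal equations \(K^{T}KW=K^{T}\rho \) are solved by conjugate gradients, a technique swapped in here because it tolerates a non-trivial kernel of K. The function name `york_split` and the sample data are assumptions made for the example.

```python
def york_split(K, rho, max_iter=100, rel_tol=1e-20):
    """Split rho = sigma + K W with K^T sigma = 0 (finite-dimensional analogue)."""
    m, n = len(K), len(K[0])

    def apply_K(w):   # K : R^n -> R^m, the analogue of K_g
        return [sum(K[i][j] * w[j] for j in range(n)) for i in range(m)]

    def apply_Kt(s):  # K^T : R^m -> R^n, the analogue of the formal adjoint
        return [sum(K[i][j] * s[i] for i in range(m)) for j in range(n)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Conjugate gradients for K^T K W = K^T rho, started at W = 0.
    W = [0.0] * n
    r = apply_Kt(rho)          # residual of the normal equations
    p = r[:]
    rr = rr0 = dot(r, r)
    for _ in range(max_iter):
        if rr <= rel_tol * rr0:
            break
        Kp = apply_K(p)
        denom = dot(Kp, Kp)
        if denom == 0.0:       # safeguard; not expected while rr > 0
            break
        step = rr / denom
        W = [w + step * q for w, q in zip(W, p)]
        r = [x - step * y for x, y in zip(r, apply_Kt(Kp))]
        rr_new = dot(r, r)
        p = [x + (rr_new / rr) * q for x, q in zip(r, p)]
        rr = rr_new
    sigma = [x - y for x, y in zip(rho, apply_K(W))]
    return sigma, W


# Hypothetical data: the third column of K is the sum of the first two, so K
# has a non-trivial kernel (the analogue of conformal Killing vector fields).
K = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.0], [1.0, -1.0, 0.0]]
rho = [1.0, 2.0, 3.0, 4.0]
sigma, W = york_split(K, rho)
```

As in the continuous setting, W is determined only up to elements of the kernel of K, while the product K W and the remainder sigma do not depend on that choice.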

As a result, the Ansatz we gave in item 2. above follows by simply considering a York decomposition for the tensor \(u^2{\overline{k}}\), noting that the form of the pure-trace term is uniquely determined by the geometric requirement that \(\tau \) be the mean curvature of \((M,{\overline{g}},{\overline{k}})\), like we had explained in Remark 2.5.

Constructing constant mean curvature data

We will repeatedly invoke the following basic solvability result, known in the literature as the method of sub- and supersolutions. Let \((M,g)\) be a compact Riemannian manifold without boundary: we wish to solve equations of the form

$$\begin{aligned} P u = G(x,u) \end{aligned}$$

where we assume P to be a uniformly elliptic operator with coefficients in \(C^{0,\mu }(M)\) for some fixed \(\mu \in (0,1)\) and the function \(G:M\times {\mathbb {R}}\rightarrow {\mathbb {R}}\) on the right-hand side is subject to the following requirements:

  1.

    \(|G(x_1,z_1)-G(x_2,z_2)|\le \varLambda (d_g(x_1,x_2)^{\mu }+|z_1-z_2|), \ \ \forall \ x_1, x_2\in M, \ \forall \ |z_i|\le K\);

  2.

    \(|G(x,z)|\le \varLambda , \ \ \forall \ x\in M, \ \forall \ |z|\le K\)

both to hold for some positive constants \(K, \varLambda \in {\mathbb {R}}\).

Theorem 2.6

In the setting above, let \(u^{-},u^{+}\in C^{2,\mu }(M)\) such that

$$\begin{aligned} Pu^{-}\ge G(x,u^{-}), \ Pu^{+}\le G(x,u^{+}), \ -K\le u^{-}\le u^{+}\le K. \end{aligned}$$

Then there exists \(u\in C^{2,\mu }(M)\) solving Eq. (2.10).

Remark 2.7

If \(G\in C^1(M\times {\mathbb {R}})\) then for any \(u^{-}\) (resp. \(u^{+}\)) satisfying \(Pu^{-}\ge G(x,u^{-})\) (resp. \(Pu^{+}\le G(x,u^{+})\)) one can always find constants \(K, \varLambda \) so that the two conditions above are satisfied. This is the case in most applications.

Remark 2.8

Note that, possibly by adding to both the left-hand side and the right-hand side of (2.10) a term of the form \(-cu\) for c a large positive constant, we can always assume that the linear operator \(P:C^{2,\mu }(M)\rightarrow C^{0,\mu }(M)\) is an isomorphism; this does not affect the other aspects of the discussion in any way. That being said, in principle the method provides two solutions to Eq. (2.10), which can roughly be characterised as the largest subsolution and the smallest supersolution and are concretely defined as uniform limits of the iteration schemes

$$\begin{aligned} \left\{ \begin{aligned}& u_0=u^{-} \\ &u_{k+1}=P^{-1}(G(x,u_k)) \end{aligned}\right. \end{aligned}$$


$$\begin{aligned} \left\{ \begin{aligned} &u_0=u^{+} \\ &u_{k+1}=P^{-1}(G(x,u_k)) \end{aligned}\right. \end{aligned}$$

respectively. Yet, there is a priori the possibility that the two coincide, and it is often the case in applications that one does not have simple criteria for ensuring this phenomenon does not occur.
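
The first scheme can be tried out on a toy problem. The following Python sketch (not taken from the sources cited here) replaces M by the circle with a uniform periodic grid, takes \(G(x,u)=a(x)u-s(x)u^{-7}\) with hypothetical positive coefficients a and s, uses constants as sub- and supersolution, and inverts the shifted operator \(\varDelta _h-c\) by Gauss-Seidel sweeps. All names and numerical values are illustrative assumptions.

```python
import math


def monotone_iteration(n=16, outer=600, sweeps=30):
    """Monotone iteration for u'' = a(x) u - s(x) u**(-7) on a periodic grid."""
    h = 2.0 * math.pi / n
    a = [1.0 + 0.5 * math.cos(i * h) for i in range(n)]  # stands in for R_g / 8 > 0
    s = [1.0 + 0.5 * math.sin(i * h) for i in range(n)]  # stands in for |sigma|^2 / 8 > 0

    def G(i, z):
        return a[i] * z - s[i] * z ** -7

    lo, hi = 0.85, 1.2  # constant sub- and supersolution: lo**8 <= s/a <= hi**8
    # Shift: c >= max of dG/dz on [lo, hi], so z -> G(x, z) - c*z is non-increasing.
    c = max(a) + 7.0 * max(s) * lo ** -8
    diag = 2.0 / h**2 + c

    u = [lo] * n  # u_0 = subsolution
    smallest_step = float("inf")
    for _step in range(outer):
        rhs = [G(i, u[i]) - c * u[i] for i in range(n)]
        new = u[:]  # warm start for the linear solve
        for _sweep in range(sweeps):  # Gauss-Seidel for (Laplacian_h - c) new = rhs
            for i in range(n):
                new[i] = ((new[i - 1] + new[(i + 1) % n]) / h**2 - rhs[i]) / diag
        smallest_step = min(smallest_step, min(p - q for p, q in zip(new, u)))
        u = new

    # Residual of the original (unshifted) discrete equation at the last iterate.
    residual = max(
        abs((u[i - 1] - 2.0 * u[i] + u[(i + 1) % n]) / h**2 - G(i, u[i]))
        for i in range(n)
    )
    return u, residual, smallest_step, lo, hi


u, residual, smallest_step, lo, hi = monotone_iteration()
```

Under these assumptions the iterates increase from the subsolution and remain below the supersolution, which is the mechanism behind Theorem 2.6; starting from `hi` instead would produce the decreasing scheme.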

Remark 2.9

With very minor changes, one can prove similar theorems in case the background manifold \((M,g)\) has non-empty boundary or is non-compact with some controlled geometry at infinity.

We shall now discuss the solvability of (2.9) in the so-called CMC case, namely when \(\tau =c\in {\mathbb {R}}\). Of course, one can always take \(W=0\) to satisfy the second equation (this being the only solution if \((M,g)\) has no conformal Killing vector fields) and thus the whole system is reduced to the semilinear elliptic equation

$$\begin{aligned} \varDelta _g u =\underbrace{ \frac{1}{8}R_g u +\frac{1}{12}\tau ^2u^5 -\frac{1}{8}\Vert \sigma \Vert ^2_g u^{-7}}_{G(x,u)}. \end{aligned}$$

In general, the solvability of this equation depends on three factors:

  1.

    the sign of the Yamabe invariant \(Y(M,[g])\);

  2.

    the vanishing/non-vanishing of the mean curvature \(\tau \);

  3.

    the vanishing/non-vanishing of the tensor \(\sigma \).

More precisely, we can summarise the situation by means of a well-known scheme, which collects the results in Isenberg (1995).

Theorem 2.10

Let \((M,g)\) be a compact Riemannian manifold, without boundary, of dimension three. Depending on the data \(\sigma \) and \(\tau \) the solvability of the system (2.9), for \(W=0\), is described by Table 1.

Table 1 The Isenberg chart for the CMC case

In particular, in any of the affirmative cases, there exist CMC solutions of the vacuum Einstein constraints (1.2).
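
As a reading aid, the content of the chart can be recorded as follows. This is a sketch recalled from the standard form of the result; the layout and the column labels are ours and should be checked against Isenberg (1995). An entry "Yes" means that the equation admits a positive solution for data of the given type:

$$\begin{array}{l|cccc} & \sigma \equiv 0,\ \tau =0 & \sigma \equiv 0,\ \tau \ne 0 & \sigma \not \equiv 0,\ \tau =0 & \sigma \not \equiv 0,\ \tau \ne 0 \\ \hline Y(M,[g])>0 & \text {No} & \text {No} & \text {Yes} & \text {Yes} \\ Y(M,[g])=0 & \text {Yes} & \text {No} & \text {No} & \text {Yes} \\ Y(M,[g])<0 & \text {No} & \text {Yes} & \text {No} & \text {Yes} \end{array}$$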

Roughly speaking, the table in question can be justified as follows:

  • for negative (i.e., non-existence) results, argue by maximum principle;

  • for positive (i.e., existence) results, employ the method of sub- and supersolutions.

The latter tools have been presented in the first part of this section. For the former, we will recall the following basic statement.

Lemma 2.11

(Baby maximum principle) In the setting above, the equation

$$\begin{aligned} \varDelta _g v= f \end{aligned}$$

is not solvable in \(v\in C^2(M)\) if \(f\in C^0(M)\) has a sign, namely if either \(f>0\) or \(f<0\). Similarly, the equation

$$\begin{aligned} \varDelta _g v-cv =f \end{aligned}$$

does not have positive solutions \(v\in C^2(M)\) if \(c<0\) and \(f\le 0\), but f does not vanish identically on M.

In this specific form, the result above is completely elementary, as it suffices to integrate the equation on M and apply the divergence theorem. With different methods, i.e., essentially by observing that at a maximum point \(\varDelta _g v\le 0\) while at a minimum \(\varDelta _g v \ge 0\) (both conclusions following by tracing the Hessian matrix of v, which is semi-definite at an extremal point) one can actually prove more refined results, in particular the so-called strong maximum principle.
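
Spelled out, the integration argument for Lemma 2.11 reads as follows (a short sketch; \(d\mu _g\) denotes the Riemannian volume measure, a notation introduced only for this display). In the first case, taking for instance \(f>0\), and in the second case, for a positive solution v with \(c<0\), \(f\le 0\) and f not identically zero, one would get respectively

$$\begin{aligned} 0=\int _M \varDelta _g v \, d\mu _g=\int _M f \, d\mu _g>0, \qquad 0=\int _M \varDelta _g v \, d\mu _g=\int _M (cv+f) \, d\mu _g<0, \end{aligned}$$

a contradiction in both cases; the case \(f<0\) is handled in the same way.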

As an instructive sample, we will now discuss the 3rd column of the table above (the other cases being analogous and therefore left to the reader).


The application of the two parts of Lemma 2.11 to the cases when \(Y(M,[g])=0\) and \(Y(M,[g])<0\), respectively (where we can assume to have a background metric with \(R_g=0\) or \(R_g<0\), respectively) is straightforward so the only non-trivial task is to prove solvability for positive Yamabe invariant, namely for \(Y(M,[g])>0\).

We can assume, without loss of generality, that the scalar curvature of the background metric g is positive, namely \(R_g>0\) and we apply the method of sub- and supersolutions as follows:

  • a supersolution is given by a large positive constant, \(u^+=\varLambda >0\) such that

    $$\begin{aligned} \varLambda ^8>\frac{\sup _{M}\Vert \sigma \Vert ^2_g}{\inf _{M}R_g}; \end{aligned}$$
  • a subsolution is given by \(u^-= \varepsilon v\) where v is the unique solution to the linear problem

    $$\begin{aligned} \varDelta _g v-\frac{1}{8}R_g v =-\frac{1}{8}\Vert \sigma \Vert ^2_g \end{aligned}$$

    which is easily checked to be positive (arguing as we described after the statement of Lemma 2.11) and \(\varepsilon >0\) is chosen small enough that

    $$\begin{aligned} \varepsilon ^8<\inf _M v^{-7}. \end{aligned}$$

Of course, possibly making \(\varLambda \) even larger, we can always assume that \(u^+>u^-\) as well so the claim follows at once. \(\square \)
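
For completeness, here is the computation behind the two bullet points (a sketch, with \(G(x,u)=\frac{1}{8}R_g u-\frac{1}{8}\Vert \sigma \Vert ^2_g u^{-7}\), i.e., the right-hand side of the CMC equation with \(\tau =0\)):

$$\begin{aligned} \varDelta _g u^{+}-G(x,u^{+})&=-\frac{1}{8}\varLambda ^{-7}\left( R_g\varLambda ^{8}-\Vert \sigma \Vert ^2_g\right) \le 0, \\ \varDelta _g u^{-}-G(x,u^{-})&=\varepsilon \left( \varDelta _g v-\frac{1}{8}R_g v\right) +\frac{1}{8}\Vert \sigma \Vert ^2_g\,\varepsilon ^{-7}v^{-7} =\frac{1}{8}\Vert \sigma \Vert ^2_g\,\varepsilon \left( \varepsilon ^{-8}v^{-7}-1\right) \ge 0. \end{aligned}$$

The first inequality holds as soon as \(\varLambda ^8\ge \Vert \sigma \Vert ^2_g/R_g\) pointwise, and the second as soon as \(\varepsilon ^8\le v^{-7}\) pointwise, which is what the two smallness and largeness conditions guarantee.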

Bifurcation analysis and parametrisation problems

An important feature of the work by Isenberg, cf. Theorem 2.10, is that (with the exception of the special case \(\sigma =0,\tau =0\) when it is checked at once that all constants are solutions to the Lichnerowicz equation) whenever solutions of (2.30) exist, they are unique. That is to say: if we pick a triple of seed data \((g,\sigma ,\tau )\) that fits one of the five solvable cases in the second, third or fourth column of Table 1, then there exists a unique (positive) function u solving the equation

$$\begin{aligned} \varDelta _g u -\frac{1}{8}R_g u =\frac{1}{12}\tau ^2u^5 -\frac{1}{8}\Vert \sigma \Vert ^2_g u^{-7}. \end{aligned}$$

The (rather elementary) uniqueness argument is provided in Sect. 6 of Isenberg (1995). Therefore, people normally assert that (in this specific setting) the seed data provide a parametrisation of the CMC solutions to the Einstein constraints. To phrase this precisely we need a bit of care. One can summarise the discussion we presented in Sect. 2.3 asserting that there exists a (tautologically surjective) map from the space X of triples corresponding to seed data to the space Y corresponding to pairs solving (1.2) and such that \(\tau =tr_g k=\text {constant}\); then the PDE uniqueness result above implies, in particular, that such a map induces a bijection \(S: X/\sim \ \rightarrow Y\) provided one declares

$$\begin{aligned} (g,\sigma ,\tau )\sim (g',\sigma ',\tau ') \ \ \text {if} \ \ \ g'=u^4g, \ \sigma '=u^{-2}\sigma , \ \ \tau '=\tau , \end{aligned}$$

for some \(u>0\). Besides the (somewhat arbitrary) assumption of looking for CMC solutions, recall that the analysis presented there concerns the case when the background manifold M is closed (compact with no boundary), there are no matter fields and the cosmological constant is postulated to be equal to zero. A very interesting question is whether the same outcome is true in greater generality, for instance in the (cosmologically significant) case when \(\varLambda >0\). This question is the key motivation behind the bifurcation study presented in Chruściel and Gicquaud (2017).

As a preliminary remark, let us note that the CMC analysis presented in Isenberg (1995) carries through, without any modification, to cover the case when \(\varLambda \ne 0\) provided

$$\begin{aligned} \tau ^2\ge \frac{2n}{n-1}\varLambda \end{aligned}$$

for \(n\ge 3\) the dimension of the background manifold M, the conditions \(\tau =0\) or \(\tau >0\) being replaced (in this greater generality, and sticking to \(n=3\) for the sake of consistency) by the conditions \(\tau ^2= 3\varLambda \) or \(\tau ^2> 3\varLambda \), respectively. Hence, based on the uniqueness results given above, it is no loss of generality to confine the investigation to the case when the opposite inequality holds instead, i.e., when \(\tau ^2<3\varLambda \). It turns out, perhaps a bit surprisingly, that in that regime the scenario is a lot wilder than in the cases covered by Isenberg. With the goal of providing an exhaustive description of all solutions of the Einstein constraints (which is a very challenging task in full generality) the authors restrict their study to the CMC analysis for the case \(M=S^1\times S^2\), with seed data that are invariant under the action of the group \(U(1)\times SO(3)\); so it is understood that this symmetry is postulated not only for the background metric g but also for the TT tensor \(\sigma \) (the mean curvature \(\tau \) being constant anyway).

The equation to be studied takes the form

$$\begin{aligned} \varDelta _g u -\frac{1}{8}R_g u =\left( \frac{1}{12}\tau ^2-\frac{\varLambda }{4}\right) u^5 -\frac{1}{8}\Vert \sigma \Vert ^2_g u^{-7}. \end{aligned}$$

The choice of the background manifold M and of the product structure is motivated by earlier works on the Yamabe equation, to which (2.11) reduces when \(\sigma =0\): note that the assumption \(\tau ^2<3\varLambda \) geometrically corresponds to looking for conformal metrics with positive scalar curvature, which is indeed the most delicate case for that problem. The non-uniqueness/multiplicity analysis for the Yamabe problem (normalised, constant scalar curvature metrics on \(S^1\times S^2\)) was first presented in Sect. 2 of Schoen (1989) (see also Carlotto et al. 2015). Roughly speaking, that study consists of two parts: first one employs the moving plane method or variations thereof (see, in particular, Gidas et al. 1979, 1981 and Caffarelli et al. 1989) to show that all geometrically admissible solutions to the Yamabe equation are symmetric (in the sense that they inherit the symmetry of the data), and second one classifies all such solutions by means of a phase-space analysis for a Hamiltonian system in the plane. One could assert that Chruściel and Gicquaud (2017) is an extension of these ideas, and of the same approach, to the harder case given by the Lichnerowicz equation (where an additional term in the form of a negative power is also in play).

For the symmetry results, the authors appeal to the earlier study Jin et al. (2008) (which concerns the ‘method of moving spheres’), whence they then mainly focus on the delicate ODE and bifurcation analysis, which takes most of their work. We find it appropriate to provide the reader with a precise statement of their main theorem. First of all, one considers background product metrics of the form

$$\begin{aligned} \mathring{g}=\left( \frac{\mathring{T}}{2\pi }\right) ^2d\psi ^2 +\frac{2}{\mathring{R}}d\varOmega ^2 \end{aligned}$$

where \(\psi \) is a \(2\pi \)-periodic coordinate on \(S^1\) and \(d\varOmega ^2\) denotes the unit round metric on \(S^2\). Note that the scalar curvature of \(\mathring{g}\) is exactly equal to \(\mathring{R}\).

Similarly, due to the \(U(1)\times SO(3)\) symmetry assumption, one can conveniently write

$$\begin{aligned} \mathring{\sigma }=\frac{2\alpha }{\sqrt{6}} \left( \left( \frac{\mathring{T}}{2\pi }\right) ^2d\psi ^2-\frac{1}{\mathring{R}}d\varOmega ^2\right) \end{aligned}$$

so that in fact \(\alpha =|\mathring{\sigma }|_{\mathring{g}}\). If we then declare

$$\begin{aligned} \beta ^2{:}{=}2\varLambda -\frac{2}{3}\tau ^2 \end{aligned}$$

we have that the Lichnerowicz equation takes the specific form

$$\begin{aligned} \varDelta _{\mathring{g}} u -\frac{1}{8}\mathring{R} u =-\frac{\beta ^2}{8}u^5 -\frac{\alpha ^2}{8}u^{-7}. \end{aligned}$$
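
These normalisations can be verified directly. In the following sketch the factor \(2\alpha /\sqrt{6}\) in the expression of \(\mathring{\sigma }\) is understood to multiply both terms, and we use that \(d\varOmega ^2\) has trace \(\mathring{R}\) and squared norm \(\mathring{R}^2/2\) with respect to \(\mathring{g}\):

$$\begin{aligned} tr_{\mathring{g}}(\mathring{\sigma })&=\frac{2\alpha }{\sqrt{6}}\left( 1-\frac{1}{\mathring{R}}\cdot \mathring{R}\right) =0, \\ |\mathring{\sigma }|^2_{\mathring{g}}&=\frac{4\alpha ^2}{6}\left( 1+\frac{1}{\mathring{R}^2}\cdot \frac{\mathring{R}^2}{2}\right) =\alpha ^2, \\ \frac{1}{12}\tau ^2-\frac{\varLambda }{4}&=-\frac{1}{8}\left( 2\varLambda -\frac{2}{3}\tau ^2\right) =-\frac{\beta ^2}{8}. \end{aligned}$$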

Moving one step further, note that such an equation has a non-empty set of (positive) solutions if and only if it admits (positive) constant solutions: indeed, if \(u>0\) is a solution and we look at a point \(p\in M\) where \(\varDelta _{\mathring{g}}u(p)=0\) (which certainly exists since the function \(\varDelta _{\mathring{g}}u\) integrates to zero by the divergence theorem) then we get, at the point p,

$$\begin{aligned} \frac{1}{8}\mathring{R} u =\frac{\beta ^2}{8}u^5 +\frac{\alpha ^2}{8}u^{-7}, \end{aligned}$$

which ensures that \(c{:}{=}u(p)\) solves the Lichnerowicz equation. As a result, one can exploit the conformal covariance of the problem (as explained above, so redefining the seed data) in a way that, whenever the set of positive solutions is not empty, the constant 1 is itself a solution, which implies the normalisation condition

$$\begin{aligned} \mathring{R}=\alpha ^2+\beta ^2. \end{aligned}$$

It follows that we can finally write the equation in the most convenient form

$$\begin{aligned} 8\varDelta _{\mathring{g}} u -(\alpha ^2+\beta ^2)u =-\alpha ^2u^{-7}-\beta ^2u^5 \end{aligned}$$

and, even more significantly, we have reduced to three free parameters only, namely \(\alpha ,\beta \) and \(\mathring{T}\) (which is ‘hidden’ in \(\mathring{g}\)); these parameters vary in the first open octant of \({\mathbb {R}}^3\). So, here is the aforementioned result, which provides, in this special case, a clear description of the observed landscape as far as the parametrisation problem is concerned.

Theorem 2.12

In the setting above, the following statements hold:

  1. if


    $$\begin{aligned} \alpha ^2\beta ^4>4\left( \frac{\alpha ^2+\beta ^2}{3}\right) ^3 \end{aligned}$$

    then (2.12) has no solutions;

  2. if


    $$\begin{aligned} \alpha ^2\beta ^4=4\left( \frac{\alpha ^2+\beta ^2}{3}\right) ^3 \end{aligned}$$

    then (2.12) has precisely one solution, that is constant;

  3. if


    $$\begin{aligned} \alpha ^2\beta ^4<4\left( \frac{\alpha ^2+\beta ^2}{3}\right) ^3 \end{aligned}$$

    then (2.12) is always solvable, and all solutions are SO(3) invariant (the isometric action being understood on the second factor of the product). Moreover, there exists a period function \(T=T(\alpha ,\beta )\) with \(T\rightarrow \infty \) as \(\alpha ^2\beta ^4\uparrow 4\left( (\alpha ^2+\beta ^2)/3\right) ^3\), such that for \(\mathring{T}\in (nT, (n+1)T]\) there exist exactly two constant solutions, and exactly n non-constant solutions (counted modulo isometry).

Note that, in the statement above, the condition discriminating the three cases is given by a homogeneous degree 6 polynomial in \(\alpha ,\beta \) so that the three cases above correspond to scaling-invariant regions. Also, note that one could equivalently describe the solvability in case 3. in terms of \(\alpha \) instead. Yet, this is most transparently done when considering the four parameters \(\mathring{T}, \mathring{R}, \alpha , \beta \) as independent: letting \(k_{\max }\) denote the largest integer k such that

$$\begin{aligned} \left( \frac{2\pi }{\mathring{T}}\right) ^2k^2<\frac{\mathring{R}}{2} \end{aligned}$$

there exist explicit constants

$$\begin{aligned} \alpha _0=\frac{2}{\beta ^2}\left( \frac{\mathring{R}}{3}\right) ^{3/2}>\alpha _1(\beta ,\mathring{T},\mathring{R})>\cdots>\alpha _{k_{\max }} (\beta ,\mathring{T},\mathring{R})>0 \end{aligned}$$

such that for \(\alpha \in [\alpha _{k+1},\alpha _k)\), respectively \(\alpha \in [0,\alpha _{k_{\max }})\), (2.12) has two constant solutions and k, respectively \(k_{\max }\) non-constant solutions (counted modulo isometry). See Eq. (6.4) in Chruściel and Gicquaud (2017) for the expression of the threshold values \(\alpha _1,\ldots ,\alpha _{k_{\max }}\) in terms of the other three parameters.
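
The role of the threshold \(\alpha _0\) can be made concrete on constant solutions. The following Python sketch (an illustration under the assumption that \(\mathring{R},\alpha ,\beta \) are treated as independent, as in the last paragraph) counts and computes the positive constants c solving \(\mathring{R}c=\beta ^2c^5+\alpha ^2c^{-7}\). Substituting \(t=c^4\) turns this into the cubic \(\beta ^2t^3-\mathring{R}t^2+\alpha ^2=0\), whose minimum on the positive half-line is negative exactly when \(\alpha ^2\beta ^4<4(\mathring{R}/3)^3\). The function name and the use of bisection are our choices; non-constant solutions are not addressed.

```python
def constant_solutions(alpha2, beta2, R, tol=1e-12):
    """Positive constants c with R*c = beta2*c**5 + alpha2*c**(-7).

    alpha2 and beta2 stand for alpha^2 and beta^2 (both assumed positive),
    R for the scalar curvature of the background product metric.
    """
    def f(t):  # the equation multiplied by c**7, written in terms of t = c**4
        return beta2 * t**3 - R * t**2 + alpha2

    t_min = 2.0 * R / (3.0 * beta2)                  # positive critical point of f
    gap = 4.0 * (R / 3.0) ** 3 - alpha2 * beta2**2   # equals -beta2**2 * f(t_min)
    if gap < -tol:
        return []                                    # f > 0 on (0, infinity)
    if gap <= tol:
        return [t_min ** 0.25]                       # double root at t_min

    def bisect(lo, hi):  # f changes sign between lo and hi
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if (f(lo) > 0) == (f(mid) > 0):
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # f(0) = alpha2 > 0, f(t_min) < 0 and f(R / beta2) = alpha2 > 0.
    roots = [bisect(0.0, t_min), bisect(t_min, R / beta2)]
    return [t ** 0.25 for t in roots]


none = constant_solutions(4.0, 2.0, 3.0)       # alpha^2 beta^4 above the threshold
single = constant_solutions(1.0, 2.0, 3.0)     # exactly at the threshold
double = constant_solutions(1.0, 1.0, 2.0)     # below it, with R = alpha^2 + beta^2
```

In the last call the normalisation \(\mathring{R}=\alpha ^2+\beta ^2\) holds, so the constant 1 is one of the two solutions returned.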

It should be said that the analysis by Chruściel and Gicquaud also includes a careful study and classification of the possible bifurcation types (cf. e.g., Crandall and Rabinowitz 1971 and Nirenberg 2001) whenever bifurcation occurs. Furthermore, the authors prove that any initial data set \((M,g,k)\) of the type constructed above is in fact a CMC slice of a spacetime belonging to the Schwarzschild-de Sitter family (possibly degenerating to Nariai or de Sitter), a conclusion which ultimately appeals to the Birkhoff-type theorem given in Schleich and Witt (2010), and then relate the parameters listed above to the natural physical parameters describing such spacetimes. The discussion is extremely accurate and instructive, and enriched by several enlightening plots; we further refer the reader to page 676 in Chruściel and Gicquaud (2017) for additional conclusions related to the question about where the initial data set in question will lie in the (uniquely-determined) associated spacetime, with respect to the cosmological and hole horizons. The reader may also wish to refer to Bizoń et al. (2015) for related contributions.

The net outcome is that, set aside the special scenario covered by the analysis in Isenberg (1995), one cannot really expect the conformal method to provide a global parametrisation of solutions for the constraints (like we explained above in terms of the bijective map S) in any reasonable sense. Nevertheless, it would still be valuable to understand the local structure of the space of solutions away from non-generic triples \((g,\sigma ,\tau )\). A bit more precisely, one could pose the following problem.

Open Problem 2.13

Is it true that for Baire-generic conformal data \((g,\sigma ,\tau )\) the conformal method provides a local k-to-1 parametrisation of the space of solutions to the Einstein constraint equations, for some \(k=k(g,\sigma ,\tau )\) (possibly \(k=0\))?

We have presented, a bit prematurely, this question here as it already makes sense in the specific setting above, so for a closed background manifold and considering CMC vacuum solutions with positive cosmological constant. However, the same problem can be rephrased in greater generality, so in relation to the settings we shall present in the coming sections.

Beyond the CMC case

We have seen in Sect. 2.3 how the solvability of the elliptic system (2.9) (corresponding to the vacuum constraints, and \(\varLambda =0\)) is fully understood when \(\tau \) is constant, namely in the so-called CMC case that is the object of Isenberg (1995). Perturbative results have then been obtained in various later works, such as Isenberg and Moncrief (1996) (for the Yamabe negative case) and Allen et al. (2008) (for the positive or zero case): roughly speaking, one proves existence of solutions for the system above when \(d\tau /\tau \) is very small in a suitable functional norm. It is also interesting to note how in Isenberg and Ó Murchadha (2004) a negative persistence result was proven instead (more precisely: given a background metric g having non-negative scalar curvature, a vanishing TT tensor \(\sigma \), and \(\tau =\tau _0+\eta \) where \(\tau _0\ne 0\) and \(|d\eta |/|\tau _0|\) is sufficiently small, the system (2.9) is still not solvable).

In general, the study of the far from CMC regime requires more refined methods. A first solvability criterion was proposed in Holst et al. (2008, 2009): in particular, this can be applied to obtain existence results when the tensor \(\sigma \) is suitably small, but with no restrictions on the mean-curvature parameter \(\tau \). This approach relies on the notion of global sub- and supersolutions, which we are about to recall.

We make the following general assumption: the background manifold \((M,g)\) has no conformal Killing vector field, namely

$$\begin{aligned} K_g W =0 \ \Longleftrightarrow \ W=0. \end{aligned}$$

Then, it follows from the previous discussion that the operator \(div_g(K_g \cdot ): X\rightarrow X\) is an isomorphism (the notation for X is as above) and therefore, given any \(u\in C^{\infty }(M)\) there exists a unique vector field \(W_{u} \in X\) satisfying

$$\begin{aligned} div_g(K_g W)=\frac{2}{3}u^{6}d\tau . \end{aligned}$$

With an eye towards later applications, also note that for fixed \(\tau \in C^{\infty }(M)\), given \(\mu \in (0,1)\) one has that the equation above is also solvable for \(u\in C^{0,\mu }(M)\), in which case one rather gets \(W_{u}\in C^{2,\mu }(M)\).

Remark 2.14

For the sake of completeness, and for future reference, let us discuss an important aspect concerning condition (2.14). It is a classical result in Riemannian geometry (see e.g., Ebin 1970, or Mounoud 2015 for an extension to pseudo-Riemannian metrics of arbitrary signature \((p,q)\)) that, given a compact background manifold, the set of metrics whose associated isometry group is trivial (namely: consisting of the identity map only) is open and dense. Here it is understood that the space of metrics is endowed with the smooth \(C^{\infty }\) topology, but the case of tensors having a finite degree of regularity (as encoded by spaces such as \(C^k\) or \(C^{k,\alpha }\)) is analogous, in fact strictly simpler from a topological perspective.

In a similar vein, it is proven in Beig et al. (2005) that, considering the collection of Riemannian metrics on a three-dimensional manifold with a \(C^k\)-topology for \(k\ge 5\) (weighted, with arbitrary weights, in the non-compact case), the set of Riemannian metrics which have no globally defined conformal Killing vectors is open and dense (hence generic in the sense of Baire). We note that this result, specifically Proposition 9.3 therein, follows from a local non-existence result, which is indeed the reason why the conclusion can be applied both to the compact and the non-compact cases, and (in the latter case) with significant flexibility on the topology one chooses for the space of metrics. We further remark that an analogous conclusion holds in arbitrary dimension \(n\ge 3\) provided one takes \(k\ge k_0(n)\) large enough. On the other hand, the (more subtle) case when \(k=\infty \) i.e., when one deals with \(C^{\infty }\) metrics (a Fréchet space, rather than a Banach space) is not discussed in Beig et al. (2005). That the conclusion still holds true is certainly plausible, although not yet in the literature.

That being said, we give the following definitions:

Definition 2.15

In the setting above, set

$$\begin{aligned} T(u, W)=\varDelta _g u -\frac{1}{8}R_g u -\frac{1}{12}\tau ^2u^5 +\frac{1}{8}\Vert \sigma +K_g W\Vert ^2_g u^{-7}. \end{aligned}$$

We shall say that a positive function \(u^+\in C^{2,\mu }(M)\) is a global supersolution if

$$\begin{aligned} T(u^+, W_u)\le 0 \ \ \forall \ u\in C^{2,\mu }(M) \ \text {such that} \ 0\le u\le u^+. \end{aligned}$$

We shall say that a positive function \(u^-\in C^{2,\mu }(M)\) is a global subsolution if

$$\begin{aligned} T(u^-, W_{u})\ge 0 \ \ \forall \ u\in C^{2,\mu }(M) \ \text {such that} \ u\ge u^-. \end{aligned}$$

Here is the general result in Holst et al. (2008) which allows one to derive existence away from the CMC regime:

Theorem 2.16

In the setting above, given any triple of parametrising data \((g,\sigma ,\tau )\in C^{2,\mu }\times C^{0,\mu }\times C^{1,\mu }\), assume the existence of a global subsolution \(u^-\) as well as of a global supersolution \(u^+\) such that \(u^-\le u^+\). Then there exists a solution \((u, W)\in C^{2,\mu }(M)\times C^{2,\mu }(M)\) of the system (2.9).

Given the (remarkably clever) Definition 2.15, it turns out that the proof of the previous theorem follows, modulo certain technical complications, the same conceptual scheme behind the well-known argument for Theorem 2.6. Starting with the global subsolution we are provided, namely \(u_0{:}{=}u^-\) we set up an inductive scheme by first solving the second equation (thereby getting the vector field \(W_{u^-}\)) and then solving the first equation for \(u_1\), and so on. Arguing by induction, using the monotonicity assumptions and the maximum principle one easily checks that the sequence \((u_k)\) is monotone non-decreasing and uniformly bounded from above by \(u^+\). Hence it must converge to a pointwise limit \(u_{\infty }\). At that stage, if we write our scalar equation in the form

$$\begin{aligned} Pu=G(x,u), \end{aligned}$$

where we can always arrange that P is a linear elliptic operator such that \(P(1)<0\), it is readily checked that

$$\begin{aligned} \Vert G(x,u_k)\Vert _{2,\mu }\le \varLambda (1+\Vert u_k\Vert _{0,\mu }+\Vert u_k\Vert ^2_{0,\mu }) \end{aligned}$$

for some large constant \(\varLambda >0\) which depends on the parametrising data, as well as on \(u^-, u^+\). Therefore, using a standard interpolation inequality one checks that the sequence \((u_k)\) is uniformly bounded in \(C^{2,\mu }(M)\). As a result, by the Arzelà-Ascoli compactness theorem, we have that \(u_k\rightarrow u_{\infty }\) subsequentially in \(C^2(M)\), in fact sequentially since the pointwise limit is uniquely determined to be \(u_{\infty }\), hence \(W_{u_k}\rightarrow W_{u_{\infty }}\) sequentially in \(C^{2,\mu }(M)\) by Schauder estimates. Thus \(u_{\infty }\) solves, in a classical sense, the equation

$$\begin{aligned} \varDelta _g u_{\infty } -\frac{1}{8}R_g u_{\infty } =\frac{1}{12}\tau ^2 u_{\infty }^5 -\frac{1}{8}\Vert \sigma +K_gW_{u_{\infty }}\Vert ^2_g u_{\infty }^{-7} \end{aligned}$$

where the right-hand side patently belongs to \(C^{0,\mu }\). Hence the conclusion comes at once by invoking Schauder estimates again. Some remarks are in order:

  1. (a)

    in both Holst et al. (2008, 2009), the authors work in Sobolev spaces (rather than in Hölder spaces as above) and make a serious effort to build a theory as robust as possible, so as to deal with rough data;

  2. (b)

    the functional-analytic argument behind Theorem 2.16 is presented, in these sources, as a coupled topological fixed-point argument (hence from a more abstract perspective than we chose above, for the sake of expository convenience);

  3. (c)

    of course, Theorem 2.16 reduces the problem of solving (2.9) to that of constructing global sub- and supersolutions, which is accomplished (by the same authors) in certain special cases and assuming the existence of suitable sources (so that there are additional terms on the right-hand side of the system): for instance an existence result is obtained when [g] is a Yamabe-positive conformal class (namely: \(Y(M,[g])>0\)), (M, g) has no conformal Killing vector fields and the data \(\mu , J\) are suitably small but the energy density \(\mu \) is not identically zero.

The next task is then to attack the problem of constructing global sub- and supersolutions in some greater generality, for instance aiming at a neat existence result in the vacuum case. Pushing this approach further, Maxwell (2009) provides natural geometric conditions that ensure the existence of a global subsolution for the elliptic system (2.9). Here is the relevant statement:

Theorem 2.17

Given parametrising data \((g,\sigma ,\tau )\), assume one of the following three conditions holds:

  1. 1.

    \(Y(M,[g])> 0\) and \(\sigma \ne 0\);

  2. 2.

    \(Y(M,[g])= 0\) and \(\sigma \ne 0,\tau \ne 0\);

  3. 3.

    \(Y(M,[g])<0\) and \(\exists \ {\hat{g}}\in [g]\) such that \(R_{{\hat{g}}}=-\frac{2}{3}\tau ^2\).

If there exists a global supersolution \(u^+\) then system (2.9) admits a solution \((u, W)\) with \(0<u\le u^+\).

As it had been anticipated above, this somewhat abstract result has an interesting application:

Corollary 2.18

In the setting above, if \(Y(M,[g])>0\) and \(\sigma \) is small enough but not identically zero, then the system (2.9) is solvable for any choice of the mean curvature function \(\tau \). More precisely: for any \(\tau \in C^{1,\mu }(M)\) there exists \(\delta =\delta (\tau )>0\) such that (2.9) is solvable whenever \(\Vert \sigma \Vert _{L^\infty (M,g)}<\delta \).

We shall first present the proof of such a corollary given the theorem above and then move back.


The conclusion follows immediately from Theorem 2.17 once we construct a global supersolution for our system. This goes as follows. By our assumption on the positive sign of the Yamabe invariant of (M, [g]) we can assume, without loss of generality, that \(R_g>0\) at all points. That being said, we claim that a global supersolution to our problem is provided by any positive constant \(\varepsilon \) chosen small enough. For indeed

$$\begin{aligned} T(\varepsilon , W)=-\frac{1}{8}R_g \varepsilon -\frac{1}{12}\tau ^2\varepsilon ^5 +\frac{1}{8}\Vert \sigma +K_g W\Vert ^2_g \varepsilon ^{-7} \end{aligned}$$

and so if we let \(W=W_u\) for \(0\le u\le \varepsilon \) we have (by virtue of the second equation of (2.9))

$$\begin{aligned} \Vert K_g W\Vert _{L^\infty (M,g)}\le C \Vert K_g W\Vert _{W^{1,p}(M,g)}\le C \Vert W\Vert _{W^{2,p}(M,g)}\le C\varepsilon ^6\Vert d\tau \Vert _{L^p(M,g)}\le C\varepsilon ^6 \Vert \tau \Vert _{C^{1,\mu }(M,g)} \end{aligned}$$

for any \(p>3\). Here C denotes a constant which depends only on p, on the ambient Riemannian manifold (M, g) and on the operator \(K_g\). We remark that the first inequality relies on the standard Sobolev embedding theorem and the third on elliptic estimates for the operator \(div_g(K_g \ \cdot )\). Since we can write, for any such choice of \(W=W_u\),

$$\begin{aligned} T(\varepsilon , W)\le -\frac{1}{8}R_g \varepsilon +\frac{1}{8}\varepsilon ^5 \Vert \varepsilon ^{-6}\sigma +\varepsilon ^{-6}K_g W\Vert ^2_g \end{aligned}$$

one gets that patently \(T(\varepsilon , W)\le 0\) provided we ensure

$$\begin{aligned} \left\{ \begin{aligned} &\Vert \sigma \Vert _{L^\infty (M,g)}\le \varepsilon ^6 \\ &\varepsilon ^4\le \frac{\inf _M R_g}{2(1+C^2\Vert \tau \Vert ^2_{C^{1,\mu }(M,g)})}. \end{aligned}\right. \end{aligned}$$

Hence, we first choose \(\varepsilon \) (depending on \(\Vert \tau \Vert _{C^{1,\mu }(M,g)}\)) in order to accommodate the second condition, and then make sure to satisfy the first inequality by requiring \(\sigma \) to be small enough. This completes the proof. \(\square \)
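For the reader's convenience, the final estimate of the argument can be spelled out as follows. This is only a sketch, assuming (as above) that \(W=W_u\) with \(0\le u\le \varepsilon \), that both smallness conditions on \(\sigma \) and \(\varepsilon \) hold, and using the elementary inequality \((a+b)^2\le 2a^2+2b^2\):

$$\begin{aligned} \Vert \varepsilon ^{-6}\sigma +\varepsilon ^{-6}K_g W\Vert ^2_g&\le 2\varepsilon ^{-12}\left( \Vert \sigma \Vert ^2_{L^\infty (M,g)}+\Vert K_g W\Vert ^2_{L^\infty (M,g)}\right) \le 2\left( 1+C^2\Vert \tau \Vert ^2_{C^{1,\mu }(M,g)}\right) , \\ T(\varepsilon , W)&\le \frac{\varepsilon }{8}\left( -R_g+2\varepsilon ^4\left( 1+C^2\Vert \tau \Vert ^2_{C^{1,\mu }(M,g)}\right) \right) \le \frac{\varepsilon }{8}\left( -R_g+\inf _M R_g\right) \le 0. \end{aligned}$$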

Remark 2.19

The reader may wish to compare this result with Gicquaud (2018), where an existence result is proven (in the case \(Y(M,[g])=0\)) under the assumption that the \(L^2\) norm, rather than the \(L^{\infty }\) norm, of \(\sigma \) be small enough.

We now proceed and prove Theorem 2.17.


Thanks to Theorem 2.16, we need to construct a global subsolution for the problem in question, that lies below the given supersolution \(u^{+}\).

This is a lot simpler for case 3., so let us start from there. Consistently with the notation employed in the statement, let \({\hat{g}}\) satisfy \(R_{{\hat{g}}}=-\frac{2}{3}\tau ^2\) and let us write, since \({\hat{g}}\in [g]\), \({\hat{g}}=v^4 g\) for some \(v>0\). Hence

$$\begin{aligned} \varDelta _g v-\frac{1}{8}R_g v=\frac{1}{12}\tau ^2 v^5. \end{aligned}$$

It follows that for any W one has

$$\begin{aligned} T(v,W)=\varDelta _g v -\frac{1}{8}R_g v -\frac{1}{12}\tau ^2 v^5 +\frac{1}{8}\Vert \sigma +K_g W\Vert ^2_g v^{-7}\ge 0. \end{aligned}$$

As a result, it is enough to set \(u^-=\varepsilon v\) for \(\varepsilon >0\) chosen small enough that \(u^-\le u^+\) for \(u^+\) the global supersolution we are provided, since indeed

$$\begin{aligned} T(\varepsilon v,W)=\varepsilon (1-\varepsilon ^4)\frac{1}{12}\tau ^2 v^5 +\frac{1}{8}\Vert \sigma +K_g W\Vert ^2_g \varepsilon ^{-7} v^{-7}\ge 0. \end{aligned}$$
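The equation satisfied by \(v\) is nothing but the conformal transformation law of the scalar curvature. As a sketch, assuming \(n=3\) (consistently with the coefficients appearing in (2.9)) and the standard formula for \({\hat{g}}=v^4 g\), one has

$$\begin{aligned} R_{{\hat{g}}}&=v^{-5}\left( -8\varDelta _g v+R_g v\right) \quad \Longrightarrow \quad \varDelta _g v-\frac{1}{8}R_g v=-\frac{1}{8}R_{{\hat{g}}}v^5=\frac{1}{12}\tau ^2 v^5, \\ \varDelta _g (\varepsilon v)-\frac{1}{8}R_g (\varepsilon v)-\frac{1}{12}\tau ^2(\varepsilon v)^5&=\frac{\varepsilon }{12}\tau ^2 v^5-\frac{\varepsilon ^5}{12}\tau ^2 v^5=\varepsilon (1-\varepsilon ^4)\frac{1}{12}\tau ^2 v^5, \end{aligned}$$

so that the sign of \(T(\varepsilon v,W)\) is indeed non-negative as soon as \(0<\varepsilon \le 1\).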

Let us now proceed with the analysis of case 1.; the modifications to similarly handle case 2. are left to the reader. Let \(\xi :{\mathbb {R}}_{\ge 0}\rightarrow {\mathbb {R}}_{\ge 0}\) be any fixed, smooth (\(C^{\infty }\)), non-increasing function such that

$$\begin{aligned} \xi (t)= {\left\{ \begin{array}{ll} 1 &{} \text {if} \ t=0, \\ 0 &{} \text {if} \ t\ge \min u^+, \end{array}\right. } \end{aligned}$$

and set \(\chi _{\varepsilon }(t)=t+\varepsilon \xi (t)\), which we shall regard as a suitably small perturbation of the identity map from \({\mathbb {R}}_{\ge 0}\) to itself. Furthermore, define

$$\begin{aligned} T_\varepsilon (v, W)=\varDelta _g v -\frac{1}{8}R_g v -\frac{1}{12}\tau ^2 v^5 +\frac{1}{8}\Vert \sigma +K_g W\Vert ^2_g \chi _{\varepsilon }(v)^{-7}. \end{aligned}$$

It is clear that:

  • \(u^-=0\) satisfies \(T_{\varepsilon }(0, W)\ge 0\) for any vector field W;

  • \(u^+\) still satisfies \(T_{\varepsilon }(u^+,W)\le 0\) for all \(W_u, \ 0<u\le u^+\) because \(\chi _{\varepsilon }(u^+)=u^+\) by construction.
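Both claims amount to a one-line check. As a sketch, using only \(\xi (0)=1\), \(\xi (t)=0\) for \(t\ge \min u^+\), and the fact that \(u^+\) is a global supersolution:

$$\begin{aligned} \chi _{\varepsilon }(0)&=\varepsilon \quad \Longrightarrow \quad T_{\varepsilon }(0,W)=\frac{1}{8}\Vert \sigma +K_g W\Vert ^2_g\, \varepsilon ^{-7}\ge 0, \\ \chi _{\varepsilon }(u^+)&=u^++\varepsilon \xi (u^+)=u^+ \quad \Longrightarrow \quad T_{\varepsilon }(u^+,W_u)=T(u^+,W_u)\le 0 \ \ \text {for} \ \ 0<u\le u^+. \end{aligned}$$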

Hence, we can adapt (with rather small changes) the proof of Theorem 2.16 to prove existence of a solution \((u_{\varepsilon }, W_{\varepsilon })\) for the approximate problem

$$\begin{aligned} \left\{ \begin{aligned}& \varDelta _g u-\frac{1}{8}R_g u=\frac{1}{12}\tau ^2u^5 -\frac{1}{8} \Vert \sigma +K_g W\Vert ^2_g \chi _{\varepsilon }(u)^{-7} \\ &div_g(K_g W)=\frac{2}{3}u^{6}d\tau . \end{aligned}\right. \end{aligned}$$

However, a priori we only know that \(0\le u_{\varepsilon }\le u^+\) so \(u_{\varepsilon }\) may in fact vanish at some points and we only have uniform \(L^{\infty }\)-bounds from above. Here is the key claim in the proof: if \(\sigma \ne 0\) (namely: if \(\sigma \) does not vanish identically) then there exists \(\delta >0\) such that

$$\begin{aligned} \min u_{\varepsilon }\ge \delta \ \ \text {for all} \ \ 0<\varepsilon \le \varepsilon _0. \end{aligned}$$

Once this is gained, compactness (hence subsequential convergence as one lets \(\varepsilon \rightarrow 0\)) for our sequence follows by rather well-known arguments. So let us justify the claim above. First observe that the function \(u_{\varepsilon }\) satisfies (in a classical sense)

$$\begin{aligned} \varDelta _g u_{\varepsilon }+Q_{\varepsilon }u_{\varepsilon }\le 0, \ \text {where} \ Q_{\varepsilon }=-\frac{1}{8}R_g-\frac{\tau ^2}{12}u^4_{\varepsilon } \end{aligned}$$

where \(Q_{\varepsilon }\) can be bounded (in, say, \(C^0(M)\)) independently of \(\varepsilon \). Given this inequality a standard application of the maximum principle ensures that in fact \(u_\varepsilon >0\), namely there cannot be points where these non-negative solutions to the approximate problems vanish. Yet the issue is the apparent lack of a uniform positive lower bound. To take care of it, one can however just apply the De Giorgi–Nash–Moser theory (see e.g., Chap. 3 in Ambrosio et al. 2018), which in particular provides the mean-value inequality

$$\begin{aligned} \min _M u_{\varepsilon }\ge C^{-1}\int _{M}u_{\varepsilon }\,dV_g \end{aligned}$$

for some \(C>0\) independent of \(\varepsilon \). For the sake of a contradiction, assume then the existence of a sequence \(\varepsilon _i \searrow 0\) such that \(\min u_i\rightarrow 0\) (where we have set \(u_i{:}{=}u_{\varepsilon _i}\) and we will similarly write \(\chi _i\) in lieu of \(\chi _{\varepsilon _i}\) as well as \(W_i\) for \(W_{u_i}\)). Possibly by extracting a subsequence we can assume that one has \(K_g W_i\rightarrow K_g W\) in \(C^0\). It follows that

$$\begin{aligned} \int _M \Vert \sigma +K_g W\Vert ^2_g \, dV_g =\int _M \Vert \sigma \Vert ^2_g \, dV_g +\int _M \Vert K_g W\Vert ^2_g \, dV_g \ge \int _M \Vert \sigma \Vert ^2_g \, dV_g>0 \end{aligned}$$

where the first equality relies on the fact that the forms \(\sigma \) and \(K_g W\) are \(L^2\)-orthogonal, the first being transverse-traceless (in particular divergence-free) and the second lying in the image of the conformal Killing operator (see Sect. 2.2). As a result, we can find an open set \(\varOmega \ne \emptyset \) such that

$$\begin{aligned} \Vert \sigma +K_g W_i\Vert ^2_g\ge \alpha >0 \ \text {in} \ \varOmega \end{aligned}$$

for all i large enough. Now, integrating the equation solved by \(u_i\) provides the bound

$$\begin{aligned} \frac{1}{8}\int _M \Vert \sigma +K_g W_i\Vert ^2_g \chi _i(u_i)^{-7} dV_g=\int _M \left( -\varDelta _g u_i+\frac{1}{8}R_g u_i+\frac{\tau ^2}{12}u_i^5\right) dV_g\le C \end{aligned}$$

where we have exploited the fact that \(\partial M=\emptyset \) as well as \(0\le u_i\le u^+\) (and we allow, as usual, C to denote a constant, independent of \(\varepsilon \), which is allowed to vary from line to line and even within the same line). Therefore,

$$\begin{aligned} C\ge \frac{1}{8}\int _M \Vert \sigma +K_g W_i\Vert ^2_g \chi _i(u_i)^{-7}dV_g\ge \frac{\alpha }{8} \int _{\varOmega } \chi _i(u_i)^{-7}dV_g. \end{aligned}$$

Applying Hölder twice we then obtain the following chain of inequalities

$$\begin{aligned} |\varOmega |^2&\le \int _{\varOmega }\chi _i(u_i)\,dV_g\int _{\varOmega }\chi _i(u_i)^{-1}dV_g \le \left( \int _{\varOmega }\chi _i(u_i)dV_g\right) \left( \int _{\varOmega } \chi _i(u_i)^{-7}dV_g\right) ^{1/7}|\varOmega |^{6/7} \\&\le C |\varOmega |^{6/7}\int _{\varOmega } (u_i+\varepsilon _i)dV_g \end{aligned}$$

and so we can derive a uniform, positive lower bound for \(\int _M u_i dV_g\), which gives a contradiction since the mean-value inequality (2.17) would imply \(\int _M u_i dV_g\rightarrow 0\) as we let \(i\rightarrow \infty \). Thereby, the proof is complete. \(\square \)
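The two applications of Hölder's inequality in the last step of the proof can be made explicit as follows. This is a sketch: the first is the Cauchy-Schwarz inequality applied to \(1=\chi _i(u_i)^{1/2}\chi _i(u_i)^{-1/2}\), the second uses the conjugate exponents 7 and 7/6:

$$\begin{aligned} |\varOmega |&=\int _{\varOmega }\chi _i(u_i)^{1/2}\chi _i(u_i)^{-1/2}dV_g\le \left( \int _{\varOmega }\chi _i(u_i)dV_g\right) ^{1/2}\left( \int _{\varOmega }\chi _i(u_i)^{-1}dV_g\right) ^{1/2}, \\ \int _{\varOmega }\chi _i(u_i)^{-1}dV_g&\le \left( \int _{\varOmega }\chi _i(u_i)^{-7}dV_g\right) ^{1/7}|\varOmega |^{6/7}. \end{aligned}$$

Since \(0\le \xi \le 1\) one has \(\chi _i(u_i)\le u_i+\varepsilon _i\), while the integral of \(\chi _i(u_i)^{-7}\) over \(\varOmega \) is bounded by \(8C/\alpha \). Hence

$$\begin{aligned} |\varOmega |^{8/7}\le C\left( \int _M u_i\,dV_g+\varepsilon _i|\varOmega |\right) , \end{aligned}$$

and, because \(\varepsilon _i\rightarrow 0\), the integral of \(u_i\) over M is bounded from below by a positive constant for all i large enough.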

In spite of these advances, there remains the question whether one can reasonably aim at (unconditional) existence results beyond the perturbative regime, namely when neither

$$\begin{aligned} \frac{\max |d\tau |_g}{\min \tau } \ \text {is sufficiently small} \end{aligned}$$

nor

$$\begin{aligned} \max |\sigma |_g \ \text {is sufficiently small, with smallness depending on} \ \tau \end{aligned}$$

are assumed to hold. In order to investigate this matter, and also to study the aforementioned parametrisation problem in the much harder non-CMC regime, a simplified model for the Einstein constraints (under strong symmetry assumptions and for special parametrising data) was carefully analysed in Maxwell (2011). More precisely, the author restricts to initial data sets (M, h, k) where (M, g) is an \(n\)-dimensional flat torus i.e.,

$$\begin{aligned} M^n=S^1_{r_1}\times \cdots \times S^1_{r_n} \ \ \ (\text {without loss of generality} \ r_n=1) \end{aligned}$$

with product metric, \(h\in [g]\) and, furthermore, both h and k are assumed to depend on one variable only (specifically they are functions of \(x=x^n\in [-\pi ,\pi ]\)). It turns out, through rather elementary arguments that we omit, that the conformal formulation of the vacuum Einstein constraints (with zero cosmological constant) can be written as a coupled system of ODEs which for \(n=3\) reads

$$\begin{aligned} \left\{ \begin{aligned} &12\varphi ''+3\eta ^2\varphi ^{-7}+(\mu +w')^2\varphi ^{-7}-\tau ^2\varphi ^5=0 \\ &w''-\varphi ^6\tau '=0 \end{aligned}\right. \end{aligned}$$

for conformal data described by triples of (scalar) functions \((\mu ,\eta ,\tau )\) and unknowns \((\varphi , w)\). Loosely speaking, one looks for symmetric solutions of a symmetric problem, and wishes to use this model (where the space of conformal data is much restricted) to get some understanding of the general case. We note parenthetically that a similar discussion is also developed for the conformal thin-sandwich method (for indeed Maxwell 2014 would come a few years later, see Sect. 2.7), although we shall not report about it separately.
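As a quick sanity check of the reduced system for \(n=3\), one may look at the simplest situation, which is an illustrative assumption here and not a case singled out in the discussion: constant data \((\mu ,\eta ,\tau )\) with \(\tau \ne 0\) and \((\mu ,\eta )\ne (0,0)\). Then the second equation gives \(w''=0\), so that periodicity forces \(w'=0\), and the first equation admits the constant solution \(\varphi _0=((3\eta ^2+\mu ^2)/\tau ^2)^{1/12}\). The short Python sketch below (function names are hypothetical) evaluates the residual of the first equation at \(\varphi _0\).

```python
# Sanity check for the n = 3 reduced constraint system, under the
# illustrative assumption of constant data (mu, eta, tau) with tau != 0.
# Function names are hypothetical and chosen for this sketch only.


def scalar_residual(phi, mu, eta, tau, w_prime=0.0, phi_second=0.0):
    """Value of 12 phi'' + 3 eta^2 phi^-7 + (mu + w')^2 phi^-7 - tau^2 phi^5."""
    return (12.0 * phi_second
            + 3.0 * eta**2 * phi**-7
            + (mu + w_prime)**2 * phi**-7
            - tau**2 * phi**5)


def constant_cmc_solution(mu, eta, tau):
    """Constant phi solving the first equation when tau is constant (so w' = 0)."""
    source = 3.0 * eta**2 + mu**2
    if tau == 0 or source == 0:
        raise ValueError("need tau != 0 and (mu, eta) != (0, 0)")
    # For constant phi and w' = 0 the equation reads 3 eta^2 + mu^2 = tau^2 phi^12.
    return (source / tau**2) ** (1.0 / 12.0)


phi0 = constant_cmc_solution(mu=1.0, eta=2.0, tau=3.0)
# The residual should vanish up to floating-point rounding.
print(phi0, scalar_residual(phi0, mu=1.0, eta=2.0, tau=3.0))
```

The interesting phenomena discussed in this section arise precisely when \(\tau \) is not constant, in which case the two equations are genuinely coupled and no such closed formula is available.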

Maxwell further postulates \(\mu \) and \(\eta \) to be constant and the mean curvature to take the special form

$$\begin{aligned} \tau _{\lambda }=t+\lambda \end{aligned}$$

where \(t\in {\mathbb {R}}\) and \(\lambda \) is the standard jump function

$$\begin{aligned} \lambda (x)={\left\{ \begin{array}{ll} -1 &{} \text {if} \ -\pi<x<0 \\ 1 &{} \text {if} \ \ 0\le x<\pi . \end{array}\right. } \end{aligned}$$

Thereby one gets a finite-dimensional model on the torus (in fact: with three scalar parameters), which allows a very detailed and rather explicit analysis. It is noted by the author that these data are more singular than the general results in Holst et al. (2008) would allow, although in the special case one can appeal to Choquet-Bruhat (2004) instead.

All that being said, the author proves that there are conformal data fitting Case 2. of Theorem 2.17 namely corresponding to

$$\begin{aligned} Y(M,[g])=0, \ \ \sigma \ne 0, \ \ \tau \ne 0 \end{aligned}$$

for which multiple (genuinely distinct) solutions exist, as well as other conformal data (also in Case 2.) for which no symmetric solutions exist, which may correspond to either non-existence or instead multiplicity for solutions that are not subject to an a priori symmetry assumption. A bit more specifically, and keeping in mind the form of \(\tau _{\lambda }\) given above, it is observed that when the mean curvature is allowed to change sign (which one can think of as a condition on the size of t, since the function \(\lambda \) is fixed once and for all) then indeed one enters a regime where solutions either do not exist or are non-unique, and anyway the coupling of the system comes into play in a rather dramatic fashion.

It is worth remarking how non-uniqueness theorems become considerably easier to prove if one allows for matter sources, as was indeed done in the earlier contributions Baumgarte et al. (2007) and Walsh (2007) by a clever design of non-scaling sources; see also Pfeiffer and York (2005) for non-uniqueness results for the extended conformal thin-sandwich method. We also stress that this study should be regarded as a test for the conformal method in its full generality rather than an attempt to smartly parametrise data on symmetric background manifolds, like \(U(1)\times U(1)\) symmetric initial data sets on the 3-torus, for indeed that case had already been handled, through a different and better adapted methodology, in Chruściel (1990) (among other things, the author gives Ernst and Gowdy-like parameterisations for the full class of \(U(1)\times U(1)\) symmetric spacetimes with compact Cauchy hypersurfaces).

In any event, getting back to Maxwell (2011) one is then led to conclude that, even in the most basic case of closed background manifold, no matter sources and \(\varLambda =0\), the conformal method does not provide a good parametrisation scheme for solutions of the Einstein constraint equations. As stressed by Maxwell, there remains the problem to see whether there is a different approach (degenerating to the conformal method in the CMC case) which allows for a better (i.e., more complete and more transparent) description of the moduli space of solutions in generality. From our perspective, it would be perhaps a bit more reasonable to rather aim at a generic structure theorem in the sense of Open Problem 2.13.

We wish to conclude this section by giving a brief and informal description of the nice approach presented in Dahl et al. (2012) with the aim of providing a general solvability criterion (of new and non-perturbative character) for the Einstein constraint equations in their conformal reformulation.

The starting observation is that the system (2.9) is, loosely speaking, critical, in the sense that the exponent 6 on the right-hand side of the second equation causes the two terms \(\tau ^2u^5\) and \(\Vert K_g W\Vert _g^2u^{-7}\) to be comparable, with opposite signs, thereby making the construction of global supersolutions more complicated than it would be otherwise. Instead, one can take a slight detour and consider for \(1\le q<6\) the subcritical system

$$\begin{aligned} \left\{ \begin{aligned} &\varDelta _g u -\frac{1}{8}R_g u=\frac{1}{12}\tau ^2u^5 -\frac{1}{8}\Vert \sigma +K_g W\Vert ^2_g u^{-7} \\ &div_g(K_g W)=\frac{2}{3}u^{q}d\tau . \end{aligned}\right. \end{aligned}$$

It is straightforward to check that, if \(\tau \) has a sign (say \(\tau >0\)), then any large positive constant provides a global supersolution for the system above. Hence, we can invoke a minor variation of Theorem 2.17 (adapted to the subcritical system) to obtain a global subsolution, for instance when \(Y(M,[g])>0\), thereby producing a smooth, positive solution \(u_q\) to (2.22) whenever \(q<6\). At that stage, one is naturally led to study the convergence of any sequence \(u_{q_i}\) for \(q_i\rightarrow 6\). Incidentally, the strategy of replacing a critical problem with a subcritical one, and then studying the convergence of solutions to the subcritical equation is a recurrent theme in Analysis (a famous case being provided, for instance, by the construction of harmonic maps in Sacks and Uhlenbeck 1981).
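The supersolution claim can be verified along the lines of the proof of Corollary 2.18. The following is only a sketch, under two assumptions that we make explicit: that \(\tau \ge \tau _0>0\) on M for some constant \(\tau _0\) (a notation introduced here), and that elliptic estimates as in that proof give \(\Vert K_g W_u\Vert _{L^\infty (M,g)}\le C A^q\Vert \tau \Vert _{C^{1,\mu }(M,g)}\) whenever \(0<u\le A\), with C independent of the constant \(A\ge 1\). Using \((a+b)^2\le 2a^2+2b^2\) one finds

$$\begin{aligned} T(A,W_u)&=-\frac{1}{8}R_g A-\frac{1}{12}\tau ^2A^5+\frac{1}{8}\Vert \sigma +K_g W_u\Vert ^2_g A^{-7} \\&\le A^5\left( -\frac{\tau _0^2}{12}+\frac{1}{8}\Vert R_g\Vert _{L^\infty (M,g)}A^{-4}+\frac{1}{4}\Vert \sigma \Vert ^2_{L^\infty (M,g)}A^{-12}+\frac{1}{4}C^2\Vert \tau \Vert ^2_{C^{1,\mu }(M,g)}A^{2q-12}\right) . \end{aligned}$$

Since \(q<6\), every power of A inside the bracket is negative, so the right-hand side is non-positive for A large enough. For \(q=6\) the last term does not decay, which is one way to see why the critical system is harder.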

Roughly speaking, the core idea in Dahl et al. (2012) is that if compactness does not hold then the blow-up rate of \(u_{q_i}\) is somehow uniform (thanks to the Harnack inequality, as we are considering positive solutions) and one can in fact find an associated sequence \(\varepsilon _{q_i}\) such that the renormalized functions \(\varepsilon _{q_i}u_{q_i}\) converge (in a suitably strong sense) to a limit function v. At that stage, if we simply write

$$\begin{aligned} u_{q_i}(x)=\varepsilon ^{-1}_{q_i}\left( v(x)+r_i(x)\right) \end{aligned}$$

where \(r_i(x)\) is a higher-order error term, and make (for what concerns the second equation) the change of variable \(W=\varepsilon ^{-q}{\overline{W}}\) we easily find

$$\begin{aligned} div_g(K_g{\overline{W}}_{q_i})=\frac{2}{3}(v+r_i)^{q_i} d\tau . \end{aligned}$$

Notice that if \(r_i\rightarrow 0\) in suitable Sobolev or Schauder norms, then standard elliptic regularity ensures the convergence of \({\overline{W}}_{q_i}\) to \({\overline{W}}\) solving \(div_g(K_g{\overline{W}})=2v^6 d\tau /3\), as one lets \(i\rightarrow \infty \). Furthermore, one obtains for v an equation of the form

$$\begin{aligned} \varepsilon ^4_{q_i}\left( \varDelta _{g}v-\frac{1}{8}R_g v\right) =\frac{\tau ^2}{12}v^5-\frac{1}{8}\Vert \varepsilon ^6_{q_i}\sigma +\varepsilon _{q_i}^{6-q_i}K_g{\overline{W}}_{q_i}\Vert ^2_g v^{-7}+\delta _i \end{aligned}$$

where \(\delta _i\) collects all the error terms containing \(r_i\) and derivatives thereof, up to order two. Hence, we want to let \(i\rightarrow \infty \) in (2.23). At this stage, a delicate blow-up analysis ensures the existence of the limit

$$\begin{aligned} \lim _{i\rightarrow \infty }\varepsilon ^{6-q_i}_{q_i}=\alpha \in (0,1]. \end{aligned}$$

Thereby, we can indeed let \(i\rightarrow \infty \) in (2.23) and get

$$\begin{aligned} 0=\frac{\tau ^2}{12}v^5-\frac{1}{8}\alpha ^2\Vert K_g{\overline{W}}\Vert ^2_g v^{-7}. \end{aligned}$$

If we combine such an equation with the second one in (2.9) we get the condition, not involving the function v, that

$$\begin{aligned} div(K_g{\overline{W}})=\sqrt{\frac{2}{3}}\alpha \Vert K_g{\overline{W}}\Vert _g\frac{d\tau }{\tau }. \end{aligned}$$
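The elimination of v behind this condition is elementary. As a sketch, assuming \(\tau >0\) and \(v>0\), one solves the algebraic limit equation for \(v^6\) and substitutes the result into \(div_g(K_g{\overline{W}})=2v^6 d\tau /3\):

$$\begin{aligned} v^{12}&=\frac{3}{2}\alpha ^2\frac{\Vert K_g{\overline{W}}\Vert ^2_g}{\tau ^2} \quad \Longrightarrow \quad v^6=\sqrt{\frac{3}{2}}\,\alpha \frac{\Vert K_g{\overline{W}}\Vert _g}{\tau }, \\ div_g(K_g{\overline{W}})&=\frac{2}{3}v^6 d\tau =\frac{2}{3}\sqrt{\frac{3}{2}}\,\alpha \Vert K_g{\overline{W}}\Vert _g\frac{d\tau }{\tau }=\sqrt{\frac{2}{3}}\,\alpha \Vert K_g{\overline{W}}\Vert _g\frac{d\tau }{\tau }. \end{aligned}$$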

Developing and formalising these heuristics, Dahl–Gicquaud–Humbert proved the following dichotomy:

Theorem 2.20

In the setting above, either one of the following assertions is true (or both):

  1. (i)

    the set of solutions of (2.9) is non-empty and compact;

  2. (ii)

    there exists a non-zero vector field \({\overline{W}}\) solving Eq. (2.24) for some \(\alpha \in (0,1]\).

Remark 2.21

For a rather different approach to the construction of far-from-CMC solutions, based on (infinite-dimensional) degree-theoretic methods, we refer the reader to Anderson (2018) (where the limit equation criterion is also discussed).

A gallery of variations on the conformal theme

Now, it should be clear that the program we have outlined in the previous sections i.e.,

  1. 1.

    basic Ansatz and conformal reformulation of the constraints;

  2. 2.

    analysis of the CMC case depending on the Yamabe type of the background manifold;

  3. 3.

    perturbative theory for the nearly-CMC case;

  4. 4.

    global theory and analysis of the far-from-CMC case;

could also be developed in settings other than closed manifolds. Sticking, for the sake of simplicity, to the case of vacuum constraints, one can consider the variations on the theme corresponding to the case when the background manifold is a non-compact manifold with empty boundary, a compact manifold with boundary, or a non-compact manifold with boundary.

The reader will certainly recognise that both the study of manifolds with boundary and the study of the Einstein constraints on non-compact manifolds stem from physical motivations that are by no means artificial, as general relativity is indeed supposed to model a large class of diverse phenomena where these different geometric setups naturally arise. From the mathematical perspective, we note that (when just trying to proceed by formal analogy with what has been presented) as a preliminary step we face the (non-trivial!) problem of properly posing and studying the Yamabe problem in a category which is not that of closed manifolds, or at the very least the problem of understanding whether one can assume to work with metrics whose scalar curvature has a definite sign.

We shall now quickly review some of the contributions along those lines, starting with the case of non-compact manifolds. If we do not fix a conformal class, we first note that in Bland and Kalka (1989) it was proven that any non-compact manifold of dimension \(n\ge 3\) does indeed support complete metrics of constant negative scalar curvature (cf. Gromov 1969), which mirrors a well-known theorem by Aubin in the closed case (see Aubin 1976). However, if we now fix a conformal class [g] and look for pointwise conformal metrics of constant scalar curvature, which we further require (following e.g., Yau 1982 or Kazdan 1985) to be complete, things change dramatically. Indeed, it was proven in Jin (1988) that there exists (M, [g]), of any dimension \(n\ge 3\), for which the Yamabe problem, in the form just stated, is not solvable. With this important caveat in mind, we will now specify our setting and examine the issue (hence: the solvability of the Einstein constraints via the conformal method) in three model cases: those of (a) asymptotically flat, (b) asymptotically hyperbolic and (c) asymptotically cylindrical data. So there we go:

  1. (a)

    while the Yamabe problem for asymptotically flat Riemannian manifolds (see Appendix B) has been studied at various places (cf. Cantor 1977 and Brill and Cantor 1981), a fully correct result was only given in Maxwell (2005b). Using compactly supported smooth functions one can define the Yamabe invariant Y(M, [g]) looking at the infimum value of the right-hand side of (2.3) and prove that an asymptotically flat metric can be conformally transformed to another one (again: asymptotically flat) with constant, necessarily zero, scalar curvature if and only if its Yamabe invariant is positive. In this case we then note that one can analyse the existence of CMC solutions of (2.9) in the very same way we presented for closed background manifolds (cf. Remark 2.9). We further remark that this is, for instance, the case of the Euclidean space with its standard conformal structure \(({\mathbb {R}}^n,[\delta ])\) for any \(n\ge 3\): indeed the positivity of the conformal Yamabe constant is (in this case) perfectly equivalent to the Sobolev inequality for \(p=2\).

    That said, it is then natural to wonder what happens when the conformal Yamabe invariant is non-positive (i.e., \(Y=0\) or \(Y<0\)). This corresponds to a non-empty set of asymptotically flat manifolds, see Friedrich (2011). Loosely speaking, the first step towards a full comprehension of the conformal constraints in the CMC case, and beyond, is the task of understanding whether one can conformally deform any given background manifold to one having, say, scalar curvature with a sign. With this natural goal in mind, in Dilts and Maxwell (2018) the authors analyse the scalar curvature prescription problem on asymptotically flat manifolds. In particular, they derived a necessary and sufficient condition for an asymptotically Euclidean manifold to be conformally related to one with pre-assigned nonpositive scalar curvature: the zero set of the desired scalar curvature must have a positive Yamabe invariant (suitably defined for any measurable subset of the ambient manifold). In fact, as a special case of their general analysis, they show that given an asymptotically flat metric g, and a strictly negative function that decays to zero suitably at infinity, the conformal class of g includes a metric with that scalar curvature regardless of the sign of the Yamabe invariant. So every strictly negative scalar curvature is attainable for every conformal class, but zero scalar curvature is attainable only for Yamabe positive metrics.

    We remark that there is a subtle, apparent ambiguity problem lying here: for instance, asymptotically flat metrics with positive Yamabe invariant can be conformally deformed (again: within the realm of asymptotically flat metrics, suitably defined) to scalar positive, scalar null and scalar negative metrics (that is to say: all three cases occur in the same conformal class). This seeming paradox depends on the fact that (2.3) and (2.2) are only equal if integration by parts is legitimate, which may not be the case when the background manifold is not compact. To stress the point: this result indicates how (even in the specific setting we are now considering) one cannot expect mutually exclusive alternatives as in Theorem 2.3 to hold. On the other hand, this remark has a pleasant consequence: since in any row of Table 1 there is at least an affirmative answer, we can conclude that (2.9) is solvable in any of the four cases listed there (depending on the vanishing/non-vanishing of \(\sigma \) and \(\tau \)) if we have a positive Yamabe invariant.

    Furthermore, Dilts and Maxwell showed in the same article that the Yamabe class of an asymptotically Euclidean manifold is the same as the Yamabe class of its conformal compactification (thereby obtaining a neat characterisation of the Yamabe classes of asymptotically flat manifolds, linking it to the closed case which is well-understood, and allowing for a simple topological characterisation when \(n=3\)).

    After this necessary preamble, let us get back to the solvability of the Einstein constraint equations. Partial existence results in the context of asymptotically flat data have been obtained, again via the conformal method (or ancestors thereof), in Cantor 1977, 1979; Choquet-Bruhat 1993 and then in Choquet-Bruhat et al. 2000 with the so-called conformal thin-sandwich method (which we will review later). The same theme was then the object of the PhD thesis Dilts (2015). In particular, we mention here the following contributions, which are indeed part of that dissertation:

    • two classes of solvability results, one for the far-from-CMC case and one for the near-CMC case, in Dilts et al. (2014);

    • the refined analysis in Dilts and Isenberg (2016), where the authors obtain an admissibility criterion which provides a necessary condition on the seed data for the conformal constraint equations to (possibly) admit a solution and examine the blowup properties of solutions as the seed data sets approach sets for which no solutions exist.

  2. (b)

    for what concerns the asymptotically hyperbolic case, which one typically regards as (loosely speaking) the case of conformally compact manifolds whose sectional curvatures tend to \(-1\) on approach to the conformal boundary (see Appendix C), the analysis of the CMC case in connection to the corresponding Yamabe problem was carried through in Andersson and Chruściel (1996), see also earlier work Andersson et al. (1992); the study of the near-CMC case was then done in Isenberg and Park (1997). An accurate analysis of the scalar curvature prescription problem, with different sorts of boundary conditions, was instead performed in Gicquaud (2010), where the author further derived solvability results for the constraints in the CMC case but possibly in presence of apparent horizons, see below.

    A (partly) complementary study, in the spirit of Dilts and Maxwell (2018) (and then aimed at understanding the Lichnerowicz equation in this context), has been presented in the recent work Gicquaud (2019). For the derivation of a limit equation and criteria ensuring the existence of asymptotically hyperbolic initial data sets with non-constant mean curvature see Gicquaud and Sakovich (2012). Following the approach of Dahl et al. (2012) the authors consider the subcritical equations and study the solutions when the exponent tends to its actual value. Like in the compact case, it is proven that if the limit equation admits only the trivial solution, then the set of solutions of the constraint equations of the conformal method is non-empty and compact, and the authors give conditions that ensure that such a scenario does indeed occur.

  3. (c)

    the study of the asymptotically cylindrical case is comparatively recent, and the resulting picture is somewhat less complete than in (a) and (b) above. To fix the ideas, a good motivating example to keep in mind is that of an extremal Kerr black hole, which is a stationary rotating black hole whose angular momentum per unit mass |a| is, with appropriately chosen units, equal to its total mass m. This solution admits a maximal time slicing by manifolds diffeomorphic to \({\mathbb {R}}^3\setminus \left\{ 0\right\} \) which are asymptotically flat as \(r\rightarrow \infty \) and asymptotically conformally cylindrical as \(r\rightarrow 0\).

    In a way, one could assert that the systematic study of this case was initiated in Chruściel and Mazzeo (2015), which is devoted to the solvability of the Lichnerowicz equation (as always, in its connection with the corresponding Yamabe problem). This paper also has the merit of being built starting from a very general setup, which comprises (among others) the asymptotically periodic and asymptotically conical geometries as well. In addition, we further mention that one of the two appendices included in this paper is fully devoted to listing and describing a number of physical situations in which asymptotically cylindrical data naturally arise. This study was complemented by Chruściel et al. (2013), devoted to the vector constraint instead (namely: the second equation in (2.9)), thereby leading to a rather complete analysis of the CMC case under a remarkably general set of background assumptions. Partial results in the far-from-CMC case have then been proven in Leach (2014), while the corresponding limit equation criterion has been derived, in this setting, in Dilts and Leach (2015). Thereby, they obtained novel existence results for the conformal constraints in a non-perturbative setting, yet requiring suitable decay of the Ricci tensor of the background metric, non-existence of square-integrable conformal Killing vector fields plus other technical assumptions (see Corollary 2.3 therein for a precise statement).

    Along a strongly related line (both at a conceptual and strictly technical level) we further mention the work Leach (2016), that is devoted to the construction of trumpet solutions, by which we mean (non-CMC) solutions to the Einstein constraints possessing finitely many ends each being either of asymptotically flat or asymptotically conformally cylindrical (or periodic) type. This poses the problem of simultaneously dealing with different geometries at infinity, and indeed this is (at least in certain cases) feasible in the context of the conformal method. These results should be compared with some earlier works on similar themes, such as Dain and Gabach Clément (2009), Gabach Clément (2010) (both based on a singular limit process with fake boundary data, thereby viewing the asymptotically cylindrical end as a limit of asymptotically flat ones), Waxenegger et al. (2011) and Dain and Gabach Clément (2011).

Let us now turn our attention to the (physically very relevant) case of manifolds with boundary. It is in fact convenient to merge the discussion for compact or non-compact background manifolds, as some of the most significant issues that arise (in addition to ones we already presented above) are common to the two scenarios. We will also partly/mostly follow the chronological evolution of the results in this direction.

First of all, as we rephrase the constraints (1.2) through the conformal Ansatz, we immediately face the well-posedness problem for the system (2.9) i.e., we need to add suitable boundary conditions for the problem to have reasonable chances of solvability (together with other standard mathematically desirable properties). In principle, from the purely mathematical perspective, one could consider either Dirichlet or Neumann boundary conditions or, more generally, oblique boundary conditions (which are sometimes also referred to in the literature as Robin boundary conditions). Oversimplifying things, oblique boundary conditions naturally arise when modelling an apparent horizon (that is to say: a surface for which the outgoing expansion vanishes, \(\theta _{+}=0\)), while Dirichlet conditions, somewhat less naturally, come up when one replaces (e.g., in describing an asymptotically flat data set) a non-compact manifold by means of a large but finite subregion thereof (so in relation to excision/approximation schemes). In fact, as it is well described in the first sections of Holst and Tsogtgerel (2013) a fully general treatment of the solvability problem for the constraints, under the conformal method, does need to consider coupled oblique boundary conditions, which poses a number of interesting technical problems. From a somewhat different perspective, we note that in initial-value problems for the Einstein field equations understanding the solvability of the constraints on manifolds with boundary, and under different classes of boundary conditions, is of considerable interest. 
To ensure the smoothness of the corresponding spacetime, the data on the spacelike slice M and on the timelike boundary T need to satisfy certain compatibility conditions on \(M\cap T\): if the physical situation requires the data on T to satisfy certain specific properties, it may not be obvious at all whether one can actually construct solutions to the constraints that are consistent with such requirements, and it is thus important to gain the most general possible understanding of the corresponding boundary value problem for the constraints.

Before we proceed with the description of such coupling issues, let us open a brief digression on the notion of outgoing expansion that has been introduced above. Let a Lorentzian manifold \((L,\gamma )\) be given and let \(\left( M,g,k\right) \) be an initial data set inside it. Therefore, if V is the future directed timelike unit normal vector field to M we recall that its second fundamental form is given by \(k(X,Y)=\gamma \left( D_{X}V,Y\right) \) for \(X,Y\in \varGamma (TM)\). In this setting, let \(\varSigma \) be a two-sided hypersurface in M: we will denote by \(\nu \) a (choice of) smooth unit normal vector field of \(\varSigma \) in M and, by convention, we will refer to such choice as outward pointing. At this stage, we can define \(\ell _{+}=V+\nu \) (resp. \(\ell _{-}=V-\nu \)) as future-directed outward (resp. future-directed inward) pointing null vector field along \(\varSigma \). Note that \(\varSigma \) is a codimension two submanifold of L and therefore its extrinsic geometry cannot be described in terms of a scalar function. Instead, it is customary to decompose its second fundamental form into two scalar functions, \(\chi _{+}, \chi _{-}\) that are associated to \(\ell _{+}, \ell _{-}\) respectively. More precisely, the function \(\chi _{+}\) is defined by

$$\begin{aligned} \chi _{+}:T_{p}\varSigma \times T_{p}\varSigma \rightarrow {\mathbb {R}}, \ \ \ \ \chi _{+}(X,Y)=\gamma \left( D_{X}\ell _{+}, Y\right) \end{aligned}$$

and similarly for \(\chi _{-}.\) Furthermore, one can consider the associated null mean curvatures, also known as expansions, that are obtained by tracing with respect to the first fundamental form induced on \(\varSigma \) by the metric g:

$$\begin{aligned} \theta _{\pm }=\text {tr}_{g}\chi _{\pm }=\text {div}_{\varSigma }\ell _{\pm }. \end{aligned}$$

A simple, but useful remark is that in fact the null mean curvatures satisfy the equation

$$\begin{aligned} \theta _{\pm }=\text {tr}_{\varSigma }k\pm H \end{aligned}$$

where H denotes the scalar mean curvature of \(\varSigma \) in (M, g). We will limit ourselves to recalling that \(\theta _{\pm }\) measure the divergence of outgoing and ingoing light rays emanating from \(\varSigma \), respectively. In the most trivial example, that of a round sphere in Euclidean slices of the Minkowski spacetime, one obviously has \(\theta _{-}<0\) and \(\theta _{+}>0\), but in presence of a gravitational field it might happen that for a given surface \(\varSigma \) both \(\theta _{-}\) and \(\theta _{+}\) are negative, in which case we say that \(\varSigma \) is a trapped surface. More precisely, in the setting above we say that \(\varSigma \) is outer trapped if \(\theta _{+}<0\) on \(\varSigma \) and, similarly, we say that \(\varSigma \) is marginally outer trapped if instead \(\theta _{+}=0\).
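
As a quick consistency check of the last formula, consider two time-symmetric examples (so \(k=0\) and \(\theta _{\pm }=\pm H\)); the isotropic form of the Schwarzschild slice and the transformation law of the mean curvature under a conformal change in dimension three are standard facts that we assume here rather than derive. For the coordinate sphere \(S_r\) of radius r in a Euclidean slice of the Minkowski spacetime one has \(H=2/r\), while for the same sphere in the slice \(g=u^4\delta \), \(u=1+m/(2r)\), \(m>0\), of the Schwarzschild spacetime one has \(H=u^{-2}(2/r+4u^{-1}\partial _r u)\), so that

$$\begin{aligned} \theta _{\pm }=\pm \frac{2}{r} \qquad \text {and}\qquad \theta _{\pm }=\pm \frac{2}{r}\left( 1+\frac{m}{2r}\right) ^{-3}\left( 1-\frac{m}{2r}\right) \end{aligned}$$

respectively. In the second case both expansions vanish precisely on the minimal sphere \(r=m/2\), which is therefore marginally outer trapped.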

That being said, let us get back to our discussion: if we express the condition \(\theta _{+}=0\) for an initial data set \((M,{\overline{g}},{\overline{k}})\) in terms of the seed data, with respect to (2.8), then we end up finding a coupled condition of the form

$$\begin{aligned} \frac{\partial _{\nu }u}{u}=-\frac{1}{4}H-\frac{1}{6}\tau u^{2} +\frac{1}{4}u^{-4} (\sigma +K_g W)(\nu ,\nu ) \end{aligned}$$

where we have denoted by H the mean curvature of \(\partial M\) in (M, g), adopting the convention that the unit sphere in \({\mathbb {R}}^3\) has mean curvature 2, and \(\nu \) stands for the outward-pointing unit normal with respect to the metric g.
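
For the reader's convenience, here is a sketch of how this condition can be recovered. It rests on the following assumptions, which are ours: \(n=3\); the boundary \(\partial M\) plays the role of \(\varSigma \); H is computed with respect to the same normal \(\nu \) that defines \(\ell _{+}\); and the conformal deformation (2.8) reads \({\overline{g}}=u^4 g\), \({\overline{k}}=u^{-2}(\sigma +K_g W)+\frac{\tau }{3}u^4 g\). Then \({\overline{\nu }}=u^{-2}\nu \) is the unit normal in the metric \({\overline{g}}\), and

$$\begin{aligned} {\overline{H}}&=u^{-2}\left( H+4\,\frac{\partial _{\nu }u}{u}\right) , \\ \text {tr}_{\partial M}{\overline{k}}&=\text {tr}_{{\overline{g}}}{\overline{k}}-{\overline{k}}({\overline{\nu }},{\overline{\nu }})=\frac{2}{3}\tau -u^{-6}(\sigma +K_g W)(\nu ,\nu ), \\ 0={\overline{\theta }}_{+}&=\text {tr}_{\partial M}{\overline{k}}+{\overline{H}} \ \ \Longleftrightarrow \ \ \frac{\partial _{\nu }u}{u}=-\frac{1}{4}H-\frac{1}{6}\tau u^{2}+\frac{1}{4}u^{-4}(\sigma +K_g W)(\nu ,\nu ). \end{aligned}$$

In particular, when \(\tau =0\) and \(W=0\), multiplying by u gives \(\partial _{\nu }u+\frac{1}{4}\left( Hu-\sigma (\nu ,\nu )u^{-3}\right) =0\).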

We can now review some of the contributions on the theme, starting with CMC analysis and proceeding along the conceptual scheme that has been outlined above. Once again, this discussion refers to the vacuum case (and, unless otherwise stated, with zero cosmological constant). Inspired by much earlier work of Thornburg (see in particular Thornburg (1987), which has numerical simulations as its key target) the solvability of the constraints on manifolds with boundary was first studied in Maxwell (2005b) and Dain (2004): in both cases one wishes to construct asymptotically flat solutions to the constraints having a pre-assigned number of apparent horizons (a posteriori each boundary component will be a marginally trapped surface satisfying \(\theta _{+}=0\) and \(\theta _{-}\le 0\)). The motivation for this study has to do with the correct mathematical description of black holes (in fact: an isolated system of \(N\ge 1\) black holes) in the context of the Einstein constraint equations. Assuming a suitable form of weak cosmic censorship, if the apparent horizons are well separated the resulting initial data will evolve into a spacetime containing an arbitrary, pre-assigned number of black holes.

Both Maxwell and Dain study the CMC case. Clearly, the standard decay assumptions imply (cf. Remark 2.5) that \(\tau =0\) so the basic task to address is the construction of seeds \(g,\sigma \) for which the full system

$$\begin{aligned} \left\{ \begin{aligned} &\varDelta _g u -\frac{1}{8}\left( R_g u -\Vert \sigma \Vert ^2_g u^{-7}\right)=0\quad {\text { on }}\, M \\ &\partial _{\nu }u+\frac{1}{4}\left( Hu-\sigma (\nu ,\nu )u^{-3}\right)=0\quad {\text { on }}\, \partial M \end{aligned}\right. \end{aligned}$$

is solvable. This can indeed be done: in Maxwell (2005b) it is proven that if (first) the metric g is chosen so that the conformal Escobar invariant

$$\begin{aligned} \inf _{u>0, u\in C^{\infty }_c} \ \frac{\int _M (c(n)^{-1}|\nabla _{g_0}u|^2+R_{g_0}u^2)\,dV_{g_0}+\int _{\partial M} 2Hu^2\,dA_{g_0}}{\left( \int _{M}u^{\frac{2n}{n-2}}\,dV_{g_0}\right) ^{\frac{n-2}{n}}} \end{aligned}$$

is positive and (second) \(\sigma \) is chosen so that \(H\le \sigma (\nu ,\nu )\le 0\) (conditions that can indeed be easily accommodated, as it is well explained in Sect. 5 therein) then the problem above is solvable. In fact, a posteriori, as we mentioned above it is easily checked that the resulting boundary consists of marginally trapped surfaces. In a very similar setting, so again with the goal of constructing initial data for multiple black hole spacetimes, this study was later brought forward by Holst and Meier (2015), who obtained results in the non-CMC regime, both perturbative in \(\tau \) and not so. The CMC results in Maxwell (2005b) should also be compared with the aforementioned work by Gicquaud (2010), dealing with similar issues (in what is then a conformal boundary) in the asymptotically hyperbolic context.

In the case of compact manifolds with boundary, an attempt to find a complete analogue of Theorem 2.10 (see Table 1) immediately stumbles across the (highly non-trivial) problem of determining canonical metrics within any given conformal class, say metrics with constant scalar curvature and zero mean curvature (this is the famous Escobar problem, see Escobar (1992b) and Brendle and Chen (2014); see also Escobar (1992a) and Marques (2005) for the different, but equally natural, question where one looks for pointwise conformal scalar-flat metrics with constant mean curvature boundary). At the time this review is written neither of these two fundamental problems in conformal geometry has been fully solved. That being said, if we look in retrospect at the closed case we note that the full solution of the Yamabe problem is not actually needed. Correspondingly, what is needed is rather the following weaker statement: in the setting above, inside any given conformal class [g] one can always find a metric that has scalar curvature of constant sign (or identically zero) and vanishing boundary mean curvature, and moreover the sign of this scalar curvature is determined by [g] (so that in particular, two conformally equivalent metrics with vanishing boundary mean curvature cannot have scalar curvatures of distinct signs, and this defines three disjoint sets in the space of (conformal classes of) metrics: they are referred to as the Yamabe classes). In fact, one can prove that in the positive (respectively null, negative) case there is a metric in the conformal class whose scalar curvature is continuous and positive (resp. zero or negative), and whose boundary mean curvature is continuous and has any given sign (resp. is identically zero, has any given sign). With these basic remarks in mind, we refer the reader to the aforementioned work Holst and Tsogtgerel (2013) for a very accurate analysis of the Lichnerowicz equation and the study of the CMC case in this setting.
The construction of near-CMC solutions, and of far-from-CMC solutions (in the positive case) has been obtained in Dilts (2014) and Holst et al. (2018) (the latter approach having the virtue of robustness, as it allows for low-regularity data and solutions). Furthermore, we note how Dilts (2014) also presented partial progress towards a limit equation criterion in the spirit of Theorem 2.20, although certain difficulties arise and the analogous theorem has not yet been established. To conclude this panoramic overview, a special mention is due to the work Gicquaud and Ngô (2014) which presents (a few years after the original sources) a technically simpler and unified treatment of far-from-CMC results (in the spirit of Holst et al. (2008, 2009) and Maxwell 2009) that is easily applicable to the closed, compact with boundary, and asymptotically flat cases.

Remark 2.22

Also, it is worth noting how, in some of the works above, significant efforts were spent to deal with rough metrics i.e., allowing for minimal regularity assumptions. In particular, we point out (in chronological order) the contributions in Choquet-Bruhat (2004), Maxwell (2005a, 2005b, 2006), Holst et al. (2009, 2018), Holst and Tsogtgerel (2013) together with references therein. Of course, these results should be regarded in relation to the corresponding study for the evolution problem (i.e., for the well-posedness of the Einstein field equations under minimal regularity assumptions, see Smith and Tataru 2005; Klainerman and Rodnianski 2005a, b and Klainerman et al. 2015).

So far we have (almost entirely) focused on the vacuum case. However, in recent years we have also witnessed some interesting advances on the study of conformal methods for the Einstein constraints in presence of sources. If we set aside more isolated contributions in other directions (such as e.g., Hebey and Veronelli 2014 for the existence and stability analysis of the CMC case for the Einstein-Maxwell theory in presence of a cosmological constant), significant efforts have been spent to understand the so-called Einstein-scalar field constraint equations, namely the system

$$\begin{aligned} \left\{ \begin{aligned} &R_g -\Vert k\Vert ^2_g+(tr_g k)^2=\pi ^2+|\nabla \psi |^2+2V(\psi ) \\&div_g(k-(tr_g k)g)=-\pi d\psi . \end{aligned}\right. \end{aligned}$$

This system arises, in the same way we explained in Sect. 1.1, from the field equations associated to an action functional that is not quite the usual Einstein-Hilbert one (regarded over the Lorentzian manifold \((L,\gamma )\)), but rather has a Lagrangian whose integrand takes the form

$$\begin{aligned} R_{\gamma }-\frac{1}{2}\gamma (\nabla \varPsi ,\nabla \varPsi )-V(\varPsi ) \end{aligned}$$

where \(\varPsi \) is here a real-valued scalar field and V is a smooth potential (such as e.g., \(V(\varPsi )=m^2|\varPsi |^2/2\) in the familiar Klein–Gordon case). In terms of these physical fields, we have that (in (2.27)) \(\psi :M\rightarrow {\mathbb {R}}\) is the restriction to M of \(\varPsi \), while \(\pi \) stands for the derivative of \(\varPsi \) with respect to a (say: future-pointing) timelike unit normal to M within \((L,\gamma )\).

Especially in the case when the background manifold M is closed, a physical motivation for the study of these sorts of sources is that it has become more and more relevant in cosmology to admit the existence of a scalar field (with a potential to estimate) in order to explain recent observations of far-away stars and galaxies as well as the possible origin of matter (hence, ultimately, to explain the observed acceleration of the expansion of the universe). The reader is referred to Rendall (2004, 2005, 2006) and Sahni (2005) for physical background on this theme. We explicitly note here that the analysis of the above system includes the vacuum constraint equations with an arbitrary cosmological constant as a special case.

Now, it is checked at once that \((M,{\overline{g}},{\overline{k}}, {\overline{\psi }},{\overline{\pi }})\) is an initial data set solving (2.27) if and only if (u, W) solves the elliptic system

$$\begin{aligned} \left\{ \begin{aligned}& \varDelta _g u -\frac{1}{8}\left( R_g -|\nabla \psi |^2_g\right) u=\frac{1}{8}\left( \frac{2}{3}\tau ^2-4V(\psi )\right) u^5 -\frac{1}{8} \left( \Vert \sigma +K_g W\Vert ^2_g +\pi ^2\right) u^{-7} \\ &div_g(K_g W)=\frac{2}{3}u^6d\tau -\pi d\psi \end{aligned}\right. \end{aligned}$$

where \((g,\sigma ,\tau ,\psi ,\pi )\) play the role of seeds, and we have applied the transformation (2.29) below. For the sake of consistency we have decided to stick to the \(n=3\) case although analogous formulae can be derived for any \(n\ge 3\).

Remark 2.23

Observe that in the vacuum case there are no \(\nabla u\) terms nor (if \(\tau \) is constant) u terms in the momentum equation, and there are no \(|\nabla u|^2\) in the Hamiltonian equation. The conformal deformation of data

$$\begin{aligned} \left\{ \begin{aligned} &{\overline{g}}=u^4 g \\& {\overline{k}}=u^{-2}(\sigma +K_g W)+\frac{\tau }{3}u^4 g\\& {\overline{\psi }}=\psi \\ &{\overline{\pi }}=u^{-6}\pi \end{aligned}\right. \end{aligned}$$

is designed so as to guarantee this (fundamental) property.
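
The mechanism can be made explicit by a short computation, sketched here for \(n=3\) under two assumptions of ours: that \(\sigma \) is a TT tensor with respect to g (so that \(div_g \sigma =0\)), and that the momentum constraint in (2.27) is read as \(div_{{\overline{g}}}({\overline{k}}-(tr_{{\overline{g}}}{\overline{k}}){\overline{g}})=-{\overline{\pi }}\,d{\overline{\psi }}\). We use the standard fact that, for any trace-free symmetric tensor S, one has \(div_{{\overline{g}}}(u^{-2}S)=u^{-6}div_g S\) whenever \({\overline{g}}=u^4 g\). Then

$$\begin{aligned} div_{{\overline{g}}}\left( {\overline{k}}-\tau {\overline{g}}\right)&=u^{-6}div_g(K_g W)-\frac{2}{3}d\tau , \\ -{\overline{\pi }}\,d{\overline{\psi }}&=-u^{-6}\pi \,d\psi , \end{aligned}$$

and equating the two lines (after multiplication by \(u^6\)) gives \(div_g(K_g W)=\frac{2}{3}u^6d\tau -\pi d\psi \): no derivative of u appears, and u drops out altogether when \(d\tau =0\). Likewise, \({\overline{\pi }}^2=u^{-12}\pi ^2\) carries the same power of u as \(\Vert u^{-2}(\sigma +K_g W)\Vert ^2_{{\overline{g}}}=u^{-12}\Vert \sigma +K_g W\Vert ^2_g\), which is why the two terms combine into a single coefficient of \(u^{-7}\).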

From a mathematical perspective, the Lichnerowicz equation above (i.e., the first equation of (2.28)) is just a special instance of an interesting class of equations, of semilinear elliptic type, that take the form

$$\begin{aligned} \varDelta _g u-Ru+A u^{-a}-B u^b=0 \end{aligned}$$

for \(a, b\in {\mathbb {R}}\) positive constants, and R, A, B assigned functions in a suitable class. Of course, in the specific case above (and for \(n=3\)) we have \(a=7, b=5\) and

$$\begin{aligned} \begin{aligned} R&=R_{g,\psi }= \frac{1}{8}\left( R_g -|\nabla \psi |^2_g\right) , \\ A&=A_{g,\sigma ,\pi ,W}=\frac{1}{8}\left( \Vert \sigma +K_g W\Vert ^2_g +\pi ^2\right) , \\ B&=B_{\tau ,\psi }=\frac{1}{8}\left( \frac{2}{3}\tau ^2-4V(\psi )\right) . \end{aligned} \end{aligned}$$
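
To get a feeling for the competition between the three terms, one may look at the toy situation in which R, A and B are constants with \(R>0\), \(A>0\), \(B\ge 0\), and one searches for a constant solution u. This simplification is ours and is not taken from the works cited here. The equation then reduces to the scalar equation \(-Ru+Au^{-a}-Bu^{b}=0\), whose left-hand side is strictly decreasing in \(u>0\), so that there is exactly one positive root. The function name and the numerical values below are hypothetical.

```python
def constant_lichnerowicz_root(R, A, B, a=7.0, b=5.0):
    """Positive root of F(u) = -R*u + A*u**(-a) - B*u**b.

    Toy case only: R > 0, A > 0, B >= 0 are constants (an assumption of
    this sketch, not a statement about the general equation).
    """
    if not (R > 0 and A > 0 and B >= 0):
        raise ValueError("this sketch only treats constant R > 0, A > 0, B >= 0")

    def F(u):
        return -R * u + A * u ** (-a) - B * u ** b

    # F is strictly decreasing on (0, inf), tends to +inf as u -> 0+ and to
    # -inf as u -> inf: first bracket the unique root, then bisect.
    lo, hi = 1.0, 1.0
    while F(lo) <= 0:
        lo /= 2.0
    while F(hi) >= 0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if F(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical constants: with B = 0 the root satisfies u**8 = A/R = 256.
    print(constant_lichnerowicz_root(R=2.0, A=512.0, B=0.0))
```

When the coefficients are genuine functions, or when R or B change sign (as happens in the scalar field case), this monotonicity is lost and the toy argument no longer applies.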

These equations have, at least formally, a variational structure, for indeed (2.30) is the Euler–Lagrange equation associated to the functional \(I=I^{(1)}+I^{(2)}\) where

$$\begin{aligned} I^{(1)}(u)=\frac{1}{2}\int _M (|\nabla u|^2_g +R u^2)\,dV_g+\frac{1}{1+b}\int _M B u^{1+b}\,dV_g, \end{aligned}$$

and, for \(a>1\),

$$\begin{aligned} I^{(2)}(u)=\frac{1}{a-1}\int _M \frac{A}{u^{a-1}}\,dV_g \end{aligned}$$

although the functional in question is defined (in many cases of geometric and/or physical interest) not on a vector space but, rather, on a cone (e.g., the cone of positive functions having a certain degree of regularity).
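
To see this, one can compute the first variation at a positive function u in the direction of a test function v, at a purely formal level (we assume, for simplicity, that M is closed or that v is compactly supported, so that no boundary terms arise when integrating by parts):

$$\begin{aligned} \frac{d}{dt}\Big |_{t=0}I(u+tv)&=\int _M \left( \langle \nabla u,\nabla v\rangle _g+Ruv+Bu^{b}v-Au^{-a}v\right) \,dV_g \\&=-\int _M \left( \varDelta _g u-Ru+Au^{-a}-Bu^{b}\right) v\,dV_g. \end{aligned}$$

Hence the critical points of I are exactly the solutions of (2.30). For \(a=7\), \(b=5\) the functional reads

$$\begin{aligned} I(u)=\frac{1}{2}\int _M (|\nabla u|^2_g +R u^2)\,dV_g+\frac{1}{6}\int _M B u^{6}\,dV_g+\frac{1}{6}\int _M \frac{A}{u^{6}}\,dV_g. \end{aligned}$$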

Getting back to the specific case of the Lichnerowicz equation associated to an Einstein-scalar field theory, the key remark is that, when we compare this equation to the corresponding vacuum constraint (or those arising in the Einstein–Maxwell or Einstein–Yang–Mills cases) we have that both the term \(R_{g,\psi }\) and the term \(B_{\tau ,\psi }\) do not typically have a sign. Quite intuitively, this creates problems insofar as the application of the maximum principle is concerned (unless additional assumptions are added), and is also an obstacle to a straightforward application of the method of sub- and supersolutions. On this theme, we note that, in order to handle the possible sign changes of R, one can define (following Choquet-Bruhat et al. 2007b) a modified Yamabe invariant which incorporates the energy term \(|\nabla \psi |^2_{g}\) in the natural fashion (i.e., we replace \(R_g\) by \(R_g-|\nabla \psi |^2_g\) in (2.3)) and it is not too difficult to prove a neat trichotomy which fully mirrors, mutatis mutandis, the statement of Theorem 2.3.

The research path that one may describe is then, to some extent, similar to what we have already presented above, so going through the four steps we listed at the very beginning of this section (and, again, with different assumptions on the background manifold M). Without entering into too many details on this fascinating story, we refer the reader to the following references for what concerns the analysis of (2.28) on closed manifolds:

  • Choquet-Bruhat et al. (2007b, 2007a) for the analysis of the CMC case: by means of the aforementioned Yamabe-scalar field conformal invariant, the free conformal data are divided into 36 classes depending on the signs of this invariant and additional coefficients (namely: A and B) in the equations. For 24 of these classes a definite answer is obtained. For 8 of the remaining classes, partial results in the form of sufficient conditions for solvability or a reduction to a specific prescribed scalar curvature problem are proven (Tables 2, 3);

    Table 2 The chart by Choquet-Bruhat, Isenberg and Pollack for the CMC case, assuming \(A_{g,\sigma ,\pi ,W}\equiv 0\)
    Table 3 The chart by Choquet-Bruhat, Isenberg and Pollack for the CMC case, assuming \(A_{g,\sigma ,\pi ,W}\not \equiv 0\)
  • Hebey et al. (2008), based on a different approach (of variational character) for the Lichnerowicz equation in (2.28). This strategy turns out to be quite effective as it allows one to go beyond the CMC case, and indeed led to novel existence results in the case of initial data with a positive Yamabe-scalar field conformal invariant (thereby allowing to refine the still partial conclusions in the third row of Table 2 at p. 817 of Choquet-Bruhat et al. (2007b), which we copy here for reference). That being said, the authors do not study the existence of critical points of (2.32), but rather consider a sequence of perturbed functionals of the form \(I^{(1)}+I^{(2)}_{\varepsilon }\) where the latter is an elliptic regularization of the form

    $$\begin{aligned} I^{(2)}_{\varepsilon }=\frac{1}{a-1}\int _M \frac{A}{((\varphi ^+)^{2}+\varepsilon )^{(a-1)/2}}\,dV_g, \end{aligned}$$

    prove via a mountain pass lemma the existence of critical points for the perturbed functional and finally send the perturbation parameter to zero. In any event, this approach opened the way to the application of diverse min-max schemes so as to obtain interesting existence and multiplicity results, such as in Ngô and Xu (2012) (for negative Yamabe-scalar field invariant and sign-changing B), in Ngô and Xu (2015) (for the null Yamabe-scalar field invariant and sign-changing B) or in Ma and Wei (2013) (again for positive Yamabe-scalar field invariant but positive A and B); for multiplicity issues we further mention the accurate study in Premoselli (2014) where, in particular, in the CMC positive cosmological constant case, the author shows that each time a solution exists, the equation produces a second solution with the exception of one critical value for which the solution is unique.

  • Premoselli (2014) provides instead a first existence result in the non-CMC regime (yet, in a near-CMC regime) for the full (i.e., fully coupled) system (2.27), under the general assumptions that (M, [g]) be of positive Yamabe type and that g has no conformal Killing vector fields. The proof here relies on a fixed-point argument, hence ultimately on the use of sub- and supersolutions. See instead Gicquaud and Nguyen (2016) for partial results in the far-from-CMC regime, in the spirit of Holst et al. (2008, 2009) and Maxwell (2009).

Still considering this same class of physical sources, we further mention Choquet-Bruhat et al. (2006) for the asymptotically flat case, Sakovich (2010) for the asymptotically hyperbolic case [but see also the discussion in Sect. 6 of Choquet-Bruhat et al. (2007b)], and Albanese and Rigoli (2016, 2017) (among others) as well as Section 6 of Ma and Wei (2013) for further results on the Lichnerowicz (or, more generally, Lichnerowicz-type) equations on complete manifolds with boundary, with no pre-assigned geometry at infinity. In each of these papers, sufficient conditions on the scalar field and the potential are given so that the Lichnerowicz equation in (2.27) is solvable.

The thin-sandwich method

So far we have fully focused our attention on the conformal method of York (1973) (CMC case) and Ó Murchadha and York (1974). However, this scheme represents only one of the possible ways of transforming the underdetermined problem (1.2) into a determined elliptic system, the diversifying factor being the choice of the parametrising data (which in the sections above was a triple \((g,\sigma ,\tau )\)).

In fact, there exist in the literature various competing methods, each of them having a (more or less substantial) network of existence and non-existence results in different regimes. Among them, we wish to mention the (Lagrangian) conformal thin-sandwich approach proposed in York (1999), and the later Hamiltonian formulation of the same approach (see Pfeiffer and York 2003). Just to give the reader an idea about these methods, let us briefly describe the first of them.

The conformal thin-sandwich method partly stems from an attempt to attack the thin-sandwich conjecture, see Bartnik and Fodor (1993) (which, in turn, is an infinitesimal version of the thick sandwich conjecture concerning the construction of Ricci-flat Lorentzian manifolds bounding two pre-assigned spacelike hypersurfaces, which is a boundary value problem for the vacuum Einstein field equations (1.1)). In the infinitesimal conjecture one wishes to construct a Ricci-flat spacetime containing a spacelike slice for which both the metric \({\overline{g}}\) and the tensor \(\dot{{\overline{g}}}\) are prescribed, where \(\dot{{\overline{g}}}\) denotes the Lie derivative of the (to be determined) Lorentzian metric with respect to a timelike vector field T, which one then decomposes into lapse and shift as \(T={\overline{N}}\nu +X\), so that (using the same notation as the previous sections) the imposed condition takes the form

$$\begin{aligned} \dot{{\overline{g}}}=2{\overline{N}} {\overline{k}}+{\mathscr {L}}_X {\overline{g}}. \end{aligned}$$

Roughly speaking, one then makes an Ansatz to attack this problem, working (i) in a given conformal class for the metric, (ii) assigning the physical mean curvature \(\tau \), (iii) prescribing (again modulo a conformal factor) the trace-free part of \(\dot{{\overline{g}}}\), and (iv) also prescribing (again modulo a conformal factor) a lapse function.

As a result, on a background manifold M one considers data of the form \((g,\rho ,\tau , N)\) where g is a Riemannian metric, \(\rho \) is a trace-free symmetric tensor (not necessarily a TT tensor) while \(\tau \) and N are scalar functions (the latter, related to the physical lapse by rescaling, subject to the additional requirement of being positive). The unknowns are a conformal factor u and a vector field W, which are required to solve the system

$$\begin{aligned} \left\{ \begin{aligned} &\varDelta _g u-\frac{1}{8}R_g u=\frac{1}{12}\tau ^2 u^5 -\frac{1}{8}\left\| \frac{1}{2N}\left( \rho -K_g W\right) \right\| ^2_g u^{-7} \\ &div_g\left( \frac{1}{2N}(\rho -K_g W)\right)=\frac{2}{3} u^{6}d\tau . \end{aligned}\right. \end{aligned}$$

One easily checks, through standard computations, that if indeed u and W solve this system then the following two assertions are true:

  • the data \({\overline{g}}=u^4 g\) and \({\overline{k}}=u^{-2}\left( \frac{1}{2N}(\rho -K_g W)\right) +\frac{\tau }{3}{\overline{g}}\) solve the vacuum Einstein constraint equations (1.2);

  • set \({\overline{N}}=u^{6}N\), then the pair \(({\overline{N}},{\overline{W}})\) is the physical lapse-shift pair for a solution of the thin-sandwich conjecture, by which we mean that the condition (2.33) is also fulfilled.
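
A sketch of these computations, for \(n=3\), runs as follows. It relies on three standard facts that we assume here: the conformal Killing operator is \(K_g W={\mathscr {L}}_W g-\frac{2}{3}(div_g W)g\); the scalar curvature of \({\overline{g}}=u^4g\) is \(R_{{\overline{g}}}=u^{-5}(R_g u-8\varDelta _g u)\); and \(div_{{\overline{g}}}(u^{-2}S)=u^{-6}div_g S\) for any trace-free symmetric tensor S. We also read the vacuum constraints (1.2), with zero cosmological constant, as \(R_{{\overline{g}}}-\Vert {\overline{k}}\Vert ^2_{{\overline{g}}}+(tr_{{\overline{g}}}{\overline{k}})^2=0\) and \(div_{{\overline{g}}}({\overline{k}}-(tr_{{\overline{g}}}{\overline{k}}){\overline{g}})=0\). Writing \(A=\frac{1}{2N}(\rho -K_g W)\), so that \({\overline{k}}=u^{-2}A+\frac{\tau }{3}{\overline{g}}\), one finds

$$\begin{aligned} R_{{\overline{g}}}-\Vert {\overline{k}}\Vert ^2_{{\overline{g}}}+(tr_{{\overline{g}}}{\overline{k}})^2&=-8u^{-5}\left( \varDelta _g u-\frac{1}{8}R_g u-\frac{1}{12}\tau ^2u^5+\frac{1}{8}\Vert A\Vert ^2_g u^{-7}\right) , \\ div_{{\overline{g}}}\left( {\overline{k}}-\tau {\overline{g}}\right)&=u^{-6}\left( div_g A-\frac{2}{3}u^6d\tau \right) , \end{aligned}$$

and both right-hand sides vanish precisely when (u, W) solves the system above. As for the second assertion, taking the shift to be W and using \(K_{{\overline{g}}}W=u^4K_gW\), the trace-free part of \(2{\overline{N}}{\overline{k}}+{\mathscr {L}}_W{\overline{g}}\) is

$$\begin{aligned} 2u^6N\cdot u^{-2}\cdot \frac{1}{2N}\left( \rho -K_g W\right) +u^4K_g W=u^4\rho , \end{aligned}$$

that is, the trace-free part of \(\dot{{\overline{g}}}\) equals \(\rho \) up to the conformal factor \(u^4\), in agreement with prescription (iii).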

In particular, this approach can be adopted as an alternative way to conformally rephrase the constraints as a determined elliptic system.

It was shown quite recently in Maxwell (2014) that, at least in the case of closed manifolds, all four common conformal methods are, in a very precise sense, fully equivalent: there is, so to say, a multi-lingual dictionary that allows one to transform variables from one method to any other one, hence to rephrase any given result in any of the languages, and the set of solutions for the constraint equations one obtains from the conformal method is identical to the set parametrised by the other competing parametrisation schemes (such as the conformal thin-sandwich method).

This correspondence is, in our opinion, very valuable both from a purely theoretical perspective and with respect to the physical applicability of the results. Indeed, as noted by Maxwell, for some reason physicists tend to often adopt the conformal thin-sandwich method, while in the mathematical community quite the opposite is true (which is, ultimately, one of the reasons why we have placed a lot more emphasis, in this survey, on the approach of Ó Murchadha and York 1974). Of course, the existence of a conversion table between the parameters has also a direct, beneficial impact in numerical relativity, as for instance the standard reference Baumgarte and Shapiro (2010) presents the conformal method and the conformal thin-sandwich method as two different theories, with a different collection of specifiable parameters.

To be concrete, when comparing the standard conformal method to the conformal thin-sandwich method described above, such a conversion recipe would work as follows. Given data \((g,\rho ,\tau ,N)\) for the (Lagrangian) conformal thin-sandwich method, we first decompose \(\rho \) into its TT part and the complement thereof, i.e., we write

$$\begin{aligned} \rho =2N\sigma +K_g Y \end{aligned}$$

which can be done, as follows from the discussion above in Sect. 2.2, in an essentially unique way (except for the possibility of adding to Y a conformal Killing vector field); hence we solve for \(\psi >0\) the equation

$$\begin{aligned} \psi ^{q}N=1/2 \end{aligned}$$

and finally set

$$\begin{aligned} {\hat{g}}=\psi ^{4}g, \ \text {and} \ {\hat{\sigma }}=\psi ^{-2}\sigma . \end{aligned}$$

Hence, it follows from Maxwell (2014) (see, in particular, Sect. 7 therein) that the conformal thin-sandwich method generates, for data \((g,\rho ,\tau ,N)\), the very same set of solutions as the conformal method, for data \(({\hat{g}},{\hat{\sigma }},\tau )\).
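
The pointwise part of this recipe (that is, everything except the York decomposition of \(\rho \), which requires solving an elliptic equation and is not attempted here) can be sketched in a few lines. The function names are hypothetical, the exponent q is passed as a parameter because its value is fixed elsewhere in the text, and the value \(q=6\) in the usage example is only an illustrative assumption.

```python
def conformal_factor_from_lapse(N, q):
    """Solve psi**q * N = 1/2 for psi > 0 at a single point, given N > 0."""
    if N <= 0:
        raise ValueError("the rescaled lapse N must be positive")
    if q == 0:
        raise ValueError("the exponent q must be non-zero")
    return (1.0 / (2.0 * N)) ** (1.0 / q)


def rescaling_multipliers(psi):
    """Multipliers (m_g, m_sigma) with g_hat = m_g * g and sigma_hat = m_sigma * sigma."""
    return psi ** 4, psi ** (-2)


if __name__ == "__main__":
    # Illustrative values only: q = 6 and N = 1/128 are assumptions.
    psi = conformal_factor_from_lapse(N=1.0 / 128.0, q=6)
    m_g, m_sigma = rescaling_multipliers(psi)
    print(psi, m_g, m_sigma)
```

In an actual computation N would be a positive function on M and the formula would be applied at each point; the TT part \(\sigma \) of \(\rho \) must be obtained separately.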

In spite of the apparent simplicity of this conversion scheme, it is quite a non-trivial fact that indeed such equivalence results hold. The reasons why these methods are the same are much better read off at an abstract (coordinate free) level, and rephrased in terms of alternative identification(s) between the tangent and the cotangent bundle of the space of smooth conformal classes.

One may be tempted to say that these conversion schemes are pointwise recipes, but that is, strictly speaking, only partly correct. For instance, in the discussion above the definition of \(\sigma \) (given \(\rho \)) relies on the York decomposition (which, as we have clearly seen above, is of global character as it depends on solving an elliptic PDE). For this reason, the extension of these results to the case of non-compact manifolds and/or manifolds with boundary, although very reasonable, is not entirely trivial. In any event, the additional complications should ultimately only be related to a wise design of the functional spaces coming into play (possibly encompassing boundary conditions, whenever appropriate). Certainly, the exposition of Maxwell (2014) has the advantage of isolating a crystal-clear correspondence from secondary technical aspects, which may be added and analysed in detail when needed for specific applications.

Density theorems à la Schoen–Yau

In this section we would like to briefly present a partly different approach to constructing solutions of the Einstein constraints, which turns out to be a good conceptual bridge between the conformal methods, that have been presented above, and the gluing methods that will be described next.

The approach in question was first proposed in Schoen and Yau (1979b) as a first step in the proof of the Riemannian positive mass theorem. Roughly speaking, it served the purpose of approximating, in suitably weighted functional spaces, a given asymptotically flat metric with ones having a simple structure (hence: an asymptotic expansion) at infinity. This methodology was then further developed in diverse directions, thereby generating a plethora of density theorems both for asymptotically flat and asymptotically hyperbolic data; further related contributions, of somewhat different character, will be surveyed in Sect. 2.9.

We will now present the most basic result by Schoen and Yau for asymptotically flat time-symmetric data; once again the reader is referred to Appendix B for the necessary background, which we have refrained from inserting here so as not to interrupt the natural course of our presentation. Also, we shall not digress on the role of this construction with respect to the proof of the positive mass theorem [for which the reader may consult either Schoen (1989) or the monograph Lee (2019)]. Our perspective is, instead, that of producing solutions of the constraints.

As a reference question to start from, one may consider the problem of constructing scalar-flat metrics on \({\mathbb {R}}^3\) that resemble the Euclidean geometry at infinity. The following statements show that the two cheapest strategies to produce non-flat solutions are doomed to fail.

Proposition 2.24

  (a) Let \(g=u^4\delta \) be a scalar-flat Riemannian metric on \({\mathbb {R}}^3\) such that \(u\rightarrow 1\) as one lets \(|x|\rightarrow \infty \). Then g is the Euclidean metric.

  (b) Let g be a rotationally symmetric, scalar-flat Riemannian metric on \({\mathbb {R}}^3\) such that \(g_{ij}-\delta _{ij}=o(1)\) as one lets \(|x|\rightarrow \infty \). Then g is the Euclidean metric.

The proof of both statements is straightforward, as the first claim follows at once from the maximum principle (for any such u would be a bounded, entire harmonic function) while the second one follows, by reduction to the former, via a standard change of variable trick. On the other hand, there is a way of bypassing these special obstructions. The result below provides a machinery that takes a solution as input and produces a new (typically simpler) one as output.

Proposition 2.25

Let \((M^n,g)\) be a scalar-flat, asymptotically flat manifold. Given any \(\epsilon >0\) there exists a metric \({\overline{g}}\) on M that is scalar-flat, takes the form \(u^{4/(n-2)}\delta \) outside a compact set (for some harmonic function u such that \(u\rightarrow 1\) as one lets \(|x|\rightarrow \infty \)) and \(|m({\overline{g}})-m(g)|<\epsilon \), where m(g) (respectively \(m({\overline{g}})\)) denotes the ADM mass of g (respectively \({\overline{g}}\)).


Proof

Let \(\chi :[0,+\infty )\rightarrow [0,+\infty )\) be a smooth, monotonically decreasing cutoff function that equals 1 for \(t\le 1\) and vanishes for \(t\ge 2\). For any \(k\ge 1\) set \(\chi _k{:}{=}\chi (|x|/k)\), so that the first (resp. second) derivatives of \(\chi _k\) are bounded by some uniform constant times \(k^{-1}\) (resp. \(k^{-2}\)) as one lets \(k\rightarrow \infty \). Consider the interpolating metric

$$\begin{aligned} {\hat{g}}_k=\chi _k g +(1-\chi _k)\delta . \end{aligned}$$

Set \({\overline{g}}_k=u_k^{4/(n-2)} {\hat{g}}_k\). We now want to reimpose the constraints, which in this (very special) case means that we want to solve for \(u_k\) the equation

$$\begin{aligned} \varDelta _{{\hat{g}}_k}u_k-c(n)R_{{\hat{g}}_k}u_k=0. \end{aligned}$$
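To see where this equation comes from, one may recall the transformation law of the scalar curvature under a conformal change. The following is a short sketch, under the assumption (not restated in this section) that \(c(n)=\frac{n-2}{4(n-1)}\), which is the standard normalisation and is consistent with the constants appearing later in this proof. For a positive function u and \({\overline{g}}=u^{4/(n-2)}{\hat{g}}\) one has

$$\begin{aligned} R_{{\overline{g}}}=-\frac{1}{c(n)}\,u^{-\frac{n+2}{n-2}}\left( \varDelta _{{\hat{g}}}u-c(n)R_{{\hat{g}}}u\right) , \end{aligned}$$

so that \({\overline{g}}\) is scalar-flat precisely when \(\varDelta _{{\hat{g}}}u-c(n)R_{{\hat{g}}}u=0\). For \(n=3\) this reads \(R_{{\overline{g}}}=-8u^{-5}\left( \varDelta _{{\hat{g}}}u-\tfrac{1}{8}R_{{\hat{g}}}u\right) \); in particular \(u^4\delta \) is scalar-flat if and only if u is harmonic, which is the fact used in the proof of Proposition 2.24.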

To that end, we observe that the sole region where the scalar curvature of \({\hat{g}}_k\) is not zero is the annulus of radii k and 2k, so that one easily checks (based on the decay assumptions on the metric g, which are those described in Appendix B in order to have the ADM mass well-defined) that

$$\begin{aligned} \Vert R_{{\hat{g}}_k}\Vert _{L^{n/2}}\rightarrow 0, \ \text {as one lets} \ k\rightarrow \infty . \end{aligned}$$
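A rough way to verify this decay is the following sketch, in which we assume (for illustration only; the precise hypotheses are those of Appendix B) that \(g_{ij}-\delta _{ij}\) and its first and second derivatives are bounded by a constant times \(|x|^{-p}\), \(|x|^{-p-1}\) and \(|x|^{-p-2}\) respectively, for some \(p>0\). Since \(({\hat{g}}_k)_{ij}-\delta _{ij}=\chi _k(g_{ij}-\delta _{ij})\), the bounds on the derivatives of \(\chi _k\) give, on the annulus \(k\le |x|\le 2k\) and for k large,

$$\begin{aligned} |\partial ^2({\hat{g}}_k-\delta )|&\le |\chi _k||\partial ^2 g|+2|\partial \chi _k||\partial g|+|\partial ^2\chi _k||g-\delta |\le Ck^{-p-2}, \\ |R_{{\hat{g}}_k}|&\le C\left( |\partial ^2{\hat{g}}_k|+|\partial {\hat{g}}_k|^2\right) \le Ck^{-p-2}, \\ \int _M|R_{{\hat{g}}_k}|^{n/2}\,dV_{{\hat{g}}_k}&\le C\,k^{-(p+2)n/2}\cdot k^{n}=C\,k^{-pn/2}\rightarrow 0, \end{aligned}$$

where C denotes a constant independent of k (possibly changing from line to line) and the factor \(k^n\) accounts for the volume of the annulus.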

Now, we employ this information to solve (2.35). A partition of unity argument allows one to prove, for any smooth compactly supported function \(u:M\rightarrow {\mathbb {R}}\), a Sobolev inequality of the form

$$\begin{aligned} \left( \int _M |u|^\frac{2n}{n-2}\,dV_{{\hat{g}}_k}\right) ^{\frac{n-2}{n}}\le C \int _M |\nabla u|^2\,dV_{{\hat{g}}_k} \end{aligned}$$

for a constant \(C>0\) independent of k. Hence, a direct variational approach allows one to solve equation (2.35), for any k large enough, by employing (2.37) and (2.36).
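One possible way to organise this variational step, given here only as a sketch, is to write \(u_k=1+v_k\), so that (2.35) becomes \(-\varDelta _{{\hat{g}}_k}v_k+c(n)R_{{\hat{g}}_k}v_k=-c(n)R_{{\hat{g}}_k}\), and to check the coercivity of the associated quadratic form. By Hölder's inequality (with exponents \(n/2\) and \(n/(n-2)\)) and the Sobolev inequality above, for any smooth compactly supported function v one has

$$\begin{aligned} \left| \int _M c(n)R_{{\hat{g}}_k}v^2\,dV_{{\hat{g}}_k}\right|&\le c(n)\Vert R_{{\hat{g}}_k}\Vert _{L^{n/2}}\left( \int _M|v|^{\frac{2n}{n-2}}\,dV_{{\hat{g}}_k}\right) ^{\frac{n-2}{n}} \\&\le c(n)\,C\,\Vert R_{{\hat{g}}_k}\Vert _{L^{n/2}}\int _M|\nabla v|^2\,dV_{{\hat{g}}_k}, \end{aligned}$$

hence, as soon as k is so large that \(c(n)\,C\,\Vert R_{{\hat{g}}_k}\Vert _{L^{n/2}}\le 1/2\),

$$\begin{aligned} \int _M\left( |\nabla v|^2+c(n)R_{{\hat{g}}_k}v^2\right) \,dV_{{\hat{g}}_k}\ge \frac{1}{2}\int _M|\nabla v|^2\,dV_{{\hat{g}}_k}. \end{aligned}$$

The existence of \(v_k\) then follows, for instance, from the Lax–Milgram theorem applied in the completion of the space of smooth compactly supported functions with respect to the Dirichlet norm.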

At this stage, we note that (for any fixed k) the function \(u_k\) serving as conformal factor is, outside a compact set, harmonic with respect to the Euclidean metric. Thus, it is a classical fact that one has an asymptotic expansion of the form

$$\begin{aligned} u_k(x)=1+\frac{A_k}{|x|^{n-2}}+O(|x|^{-n+1}), \ |x|\rightarrow \infty . \end{aligned}$$

Recall that \({\overline{m}}_k=2A_k\) where \({\overline{m}}_k\) denotes the ADM mass of \({\overline{g}}_k\). On the other hand, we also know (e.g., by the divergence theorem applied to (2.35)) that

$$\begin{aligned} A_k=-\frac{1}{(n-2)\omega _{n-1}}\int _M c(n)R_{{\hat{g}}_k}u_k\,dV_{{\hat{g}}_k}. \end{aligned}$$
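The divergence theorem step can be spelled out as follows; this is a sketch in which we assume that \(\omega _{n-1}\) denotes the area of the unit sphere in \({\mathbb {R}}^n\) and that the remainder in the expansion of \(u_k\) can be differentiated termwise (as is the case for harmonic functions). Since \({\hat{g}}_k=\delta \) for \(|x|\ge 2k\), one finds

$$\begin{aligned} \int _M c(n)R_{{\hat{g}}_k}u_k\,dV_{{\hat{g}}_k}&=\int _M\varDelta _{{\hat{g}}_k}u_k\,dV_{{\hat{g}}_k}=\lim _{r\rightarrow \infty }\int _{|x|=r}\frac{\partial u_k}{\partial r}\,dA \\&=\lim _{r\rightarrow \infty }\left( -(n-2)A_kr^{1-n}+O(r^{-n})\right) \omega _{n-1}r^{n-1}=-(n-2)\omega _{n-1}A_k, \end{aligned}$$

which is the identity above after dividing by \(-(n-2)\omega _{n-1}\).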

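As a result, replacing \(u_k\) by its limit 1 and the scalar curvature of \({\hat{g}}_k\) by its linearisation at \(\delta \) (at the cost of an error that is infinitesimal as \(k\rightarrow \infty \)), one obtains

$$\begin{aligned} A_k=-\frac{1}{4(n-1)\omega _{n-1}}\int _{\{k\le |x|\le 2k\}}\left( \partial _i\partial _j({\hat{g}}_k)_{ij}-\partial _j\partial _j({\hat{g}}_k)_{ii}\right) \,dx+o(1), \end{aligned}$$

hence, applying the divergence theorem in the shell, and keeping in mind the definition of \({\hat{g}}_k\) as interpolation of g and \(\delta \) one finally gets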

$$A_k=\frac{1}{4(n-1)\omega_{n-1}}\int_{|x|=k}(g_{ij,i}-g_{ii,j})\frac{x^j}{|x|}\,dA_g +o(1)$$

so the conclusion \({\overline{m}}_k\rightarrow m\) for \(k\rightarrow \infty \) follows directly from the definition of ADM mass. \(\square \)

As we mentioned above, this argument has then led to several interesting developments. First of all, along the proof of the spacetime positive mass theorem a similar approximation result is needed for general asymptotically flat initial data sets, which was indeed obtained in Eichmair et al. (2016). We refer the reader to Sect. 6 therein, see in particular the statement of Lemma 23. We note that a similar result had been obtained, for \(n=3\), in Corvino and Schoen (2006) (see Theorem 1 therein). Now, while the latter reference deals (as we are doing here) with the vacuum constraints, the former also handles the (highly nontrivial) problem of producing new data sets that still satisfy the dominant energy condition whenever the given data set has this property. In fact, for the specific purposes of Eichmair et al. (2016) it was also important to investigate when the dominant energy condition can in fact be promoted to a strict inequality, a theme that was then also studied in Corvino and Huang (2016) more from the perspective of gluing constructions and also in the very recent aforementioned work Huang and Lee (2020a).

The Ansatz behind these density theorems for general data sets is strongly reminiscent of that lying behind the conformal method. Given a data set \((M,g,k)\), one first performs an interpolation with \(g_0=\delta \) (at the level of the metric) and \(\pi _0=0\) (at the level of the momentum tensor, which we recall to be defined as

$$\begin{aligned} \pi ^{ij}=k^{ij}-tr_{g}\left( k\right) g^{ij}, \end{aligned}$$

and contains the very same amount of information as the second fundamental form of the data in question) and then attempts to reimpose the constraints by requiring, say in the case \(n=3\), that the new triple \((M,{\overline{g}},{\overline{k}})\) be of the form

$$\begin{aligned} \left\{ \begin{aligned}& {\overline{g}}=u^4{\hat{g}} \\& {\overline{\pi }}=u^2({\hat{\pi }}+{\mathscr {L}}_{V}{\hat{g}} -div_{{\hat{g}}}(V){\hat{g}}) \end{aligned}\right. \end{aligned}$$

for a smooth function u such that \(u\rightarrow 1\) at infinity, and a vector field V vanishing at infinity. Hence one is led to the study of a nonlinear, coupled system, which is then attacked by linearisation and by invoking a suitable form of the implicit function theorem in Banach spaces (which one can appeal to because of the smallness of certain terms, that is a consequence of the decay assumptions on the data).
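The remark that the momentum tensor carries the same information as the second fundamental form can be checked by a one-line computation, sketched here in general dimension n (using \(g_{ij}g^{ij}=n\)). Taking the trace of the definition of \(\pi \) gives

$$\begin{aligned} tr_{g}(\pi )=tr_{g}(k)-n\,tr_{g}(k)=-(n-1)\,tr_{g}(k), \qquad \text {hence} \qquad k^{ij}=\pi ^{ij}-\frac{1}{n-1}\,tr_{g}(\pi )\,g^{ij}, \end{aligned}$$

so that k can be recovered from \(\pi \) whenever \(n\ge 2\).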

In a similar spirit, density theorems have been proven for asymptotically hyperbolic data as well. In particular, it was shown in Dahl and Sakovich (2015) that a given asymptotically hyperbolic initial data set \((M,g,k)\) satisfying the dominant energy condition can be approximated by an initial data set with conformally hyperbolic asymptotics which obeys the strict dominant energy condition, while changing the value of the mass functional by an arbitrarily small amount. Once again, if we set aside the delicate issue of promoting the dominant energy condition to a strict inequality, a key step in the argument is precisely to replace given data with simpler ones having an asymptotic expansion at infinity (with respect to a model background metric, in this case the hyperbolic one). That being said, this reduction to the case of data sets with conformally hyperbolic asymptotics was then crucially employed in the very recent proof of the hyperbolic positive mass theorem in the form of Sakovich (2020), cf. Appendix C.

(Generalised) connected sums and conformal deformations

As a transition between the second part of the present review, devoted to conformal methods, and the third part, devoted to gluing methods, we wish to present some different incarnations of the method we have just described in Sect. 2.8, which roughly consists in ‘interpolating data and conformally re-imposing the constraints’. A natural problem one can try to tackle that way is that of constructing constant scalar curvature metrics on compact manifolds obtained as connected sum of two (or more) summands.

Recall that, given smooth connected manifolds \(X_1, X_2\) (both having dimension \(n\ge 3\)) their connected sum, denoted by \(X_1\# X_2\), is obtained, roughly speaking, by removing small balls around two points, say \(x_1\in X_1, x_2\in X_2\) and joining them by means of a connecting neck of the form \(S^{n-1}\times I\) for \(I=[0,1]\subset {\mathbb {R}}\). One can prove that this operation, properly formalised, is well-defined, in the sense that the resulting smooth manifold one gets as outcome does not depend on the choice of the basepoints (i.e., on the choice of \(x_1\in X_1\) and \(x_2\in X_2\)) and of the identification maps for the boundary spheres (cf. Kosinski 1993). Now, if \(X_1\) and \(X_2\) come endowed with Riemannian metrics \(g_1\) and \(g_2\), then one can perform an interpolation/rough patch so that \(X_1\# X_2\) comes endowed with a Riemannian metric as well, which coincides with \(g_1\) on \(X_1\) away from a small ball, and similarly coincides with \(g_2\) on \(X_2\) away from a small ball. In particular, if we assume that \(g_1\) and \(g_2\) are constant scalar curvature metrics (which one can always assume by virtue of the positive solution of the Yamabe problem, whenever \(X_1\) and \(X_2\) are compact manifolds without boundary, as we shall assume throughout this subsection) then the interpolating metric will also have constant scalar curvature away from the connecting neck. It would then be ideal to arrange for a localised deformation of such a metric so as to produce a constant scalar curvature metric on \(X_1\# X_2\) which still agrees with the given ones away from a possibly marginally larger connecting region. Unfortunately, this approach (which exemplifies the gluing problems we will describe next) is sometimes doomed to fail and, even in those cases when it is unobstructed, is quite non-trivial at a technical level.
Instead, one can take a somewhat cheaper approach (yet at a cost, corresponding to a more limited control) and, following the scheme we described in Sect. 2.8 above, simply try to re-impose the constraints in the conformal class of the interpolating metric. Any such deformation will not be localised (i.e., it will change the geometry away from the connecting neck as well), yet one can still investigate a number of important and natural questions. While we know, on general grounds (namely by Schoen 1984), that \(X_1\# X_2\) will also carry a constant scalar curvature metric, it is totally unclear whether such a metric will be in any reasonable sense close to the interpolating one, or more generally what the resulting shape will be. These questions were first carefully investigated in part of the PhD thesis by Joyce (which was then later published in Joyce 2003) and in Mazzeo et al. (1995).

Let us describe the contributions in these articles in some more detail. In Joyce (2003), the author starts by considering two different families of interpolating metrics \((g'_t)\) and \((g''_t)\) (both depending on a scale parameter t which is related to the radius of the excised balls) and estimating the corresponding conformal Yamabe constant depending on the given input metrics \(g_1\) and \(g_2\), so that precisely nine cases need to be considered. He then proves (without appealing to the general results) existence and multiplicity theorems for constant scalar curvature metrics on \(X_1\# X_2\) by minimising the Yamabe functional (1.8) in the special case under consideration, with an effective control on the conformal factor. It is then shown that the metrics that are so constructed either develop small necks separating \(X_1\) and \(X_2\) (contrary to the intuition that there should be necessarily long and thin tubes) or one of the two summands is crushed small by the conformal factor. In the most delicate case when both \((X_1, g_1)\) and \((X_2, g_2)\) have positive scalar curvature, the author shows that (when suitably interpolating the given metrics) one can actually find three metrics with constant, positive scalar curvature 1 in the same conformal class and, in fact, such three metrics precisely describe the three possible qualitative types of behaviour described above (balanced sum with small neck, or collapsing of either of the two summands). We further note that the proposed constructions give metrics that have constant positive scalar curvature only in the single case when both summands have positive scalar curvature, and metrics with constant negative scalar curvature in all other eight cases.

The analysis in Mazzeo et al. (1995) is, at least in certain respects, more general as it encompasses the case when either of the two summands (or both) is non-compact, which is of special interest since, as we have already mentioned, the Yamabe problem is not always solvable in that more general setting (cf. Jin 1988). The authors prove that, under suitable technical assumptions, given two complete Riemannian manifolds \((X_1, g_1)\) and \((X_2, g_2)\) of positive constant scalar curvature, there exist on their connected sum \(X_1\# X_2\) complete Riemannian metrics of positive constant scalar curvature. In particular, this result applies to the closed case but (perhaps more significantly) also relates to another work by the same authors (namely: Mazzeo et al. 1996) devoted to the delicate study of the moduli space of all complete metrics with constant positive scalar curvature, on \(S^n\) minus a finite number of points, that are pointwise conformally related to the standard round metric (which, in turn, was motivated by the earlier investigation in Schoen and Yau 1988 and Schoen 1988). More specifically, the results above can be fruitfully applied to the connected sum of finitely many Delaunay-type manifolds: since the metrics the authors produce are proven to be non-degenerate (see Footnote 3), one derives a rather exhaustive local understanding of such moduli space, which turns out to be a real-analytic set whose dimension equals the number of punctures.

Among closely related contributions, we wish to mention the work Mazzieri (2008) where the study developed by Joyce is extended to generalised connected sums, meaning that rather than removing points from the summands one actually removes two copies of the same codimension k submanifold (for \(k\ge 3\)) and then joins the two resulting manifolds through what may be called a ‘generalised neck’. The author is able to perform this generalised connected sum under the assumptions that the two initial Riemannian metrics have the same constant scalar curvature \(\rho \) and the linearised Yamabe operators about the metrics \(g_1, g_2\) have trivial kernels (which is indeed the standard non-degeneracy condition); the output is again a metric of scalar curvature equal to \(\rho \). An important caveat, though, is that when \(\rho =0\) (namely when both summands are in fact scalar-flat) then the procedure only provides a constant scalar curvature metric on the generalised connected sum, but such a constant will not in general be equal to zero, an issue which was then rectified with the improved later results in Mazzieri (2009a).

Of course, getting back to the main purpose of the present review, it is natural to wonder how these sorts of constructions (blending a rough interpolation and a successive conformal deformation) can be generalised to the case of the full constraints, namely to solve (1.2). It is quite clear that the conceptual scheme of the argument can be transposed in a rather straightforward fashion, although the actual analysis of the resulting elliptic system is a lot more subtle. To the best of our knowledge, this task was first taken care of in Isenberg et al. (2002) for closed background manifolds in the case of standard connected sum around points and later in Mazzieri (2009b) for the generalised connected sum. We note that both contributions concern the CMC case, in the sense that both input data \((X_1, g_1, k_1)\) and \((X_2, g_2, k_2)\) are assumed to have constant mean curvature in the sense we explained in Sect. 2.3 (namely: these are constant mean curvature spacelike hypersurfaces in the Lorentzian manifolds obtained as local developments), and the output is also a CMC solution in the same sense. It should be emphasized that this construction concerns the vacuum case, and we refer the reader to Isenberg et al. (2005) instead for the analysis of the analogous problem for Einstein-matter field theories.

We also note that the constructions in Isenberg et al. (2002) cover different cases, as they allow for the background manifold to be compact or asymptotically Euclidean or asymptotically hyperbolic, with suitable corresponding conditions on the extrinsic curvature (but again in the compact setting a mild nondegeneracy condition is required); such extensions to non-compact model geometries are also the object of the first part of Delay and Mazzieri (2014), which then proceeds to explore (in the special CMC case, and under technical assumptions) the problem of employing localised rather than conformal deformations, namely considering deformations of the type we will describe below when presenting the contributions in Corvino (2000). Summarising things to the extreme, the authors prove that generically with respect to the initial data sets that serve as input, the above gluing procedure can be localised, in order to obtain new solutions which actually coincide with the original ones outside of a small neighbourhood of the compact region where the preliminary interpolation takes place (however note that such localised solutions will not, in general, be CMC). In a similar vein, we should also mention the related work Corvino et al. (2013), where it is shown (among other things) how to design a localised gluing theorem for constant scalar curvature metrics (over compact manifolds) in which the total volume is preserved, see in particular Theorem 1.6 therein for a precise statement. We stress that the last two sentences anticipate some possible applications of the localised Corvino-style gluing procedure and, as such, should then be reviewed, in retrospect, after the next few sections.

Solving the constraints via gluing methods

Gluing methods are ubiquitous within geometric analysis, and have been successfully employed, over the last few decades, to tackle a variety of problems. One could mention, among others, their use to construct new classes of solutions to PDEs in most diverse contexts, to produce minimal or CMC surfaces with peculiar properties, novel solitons for geometric flows (e.g., self-shrinkers), and Riemannian manifolds with special holonomy.

While a detailed description of these amazing enterprises is not quite the scope of the present review, we will now present the way gluing techniques naturally come up in solving the Einstein constraints. We will initially focus on the time-symmetric vacuum case, and then proceed by increasing degrees of generality. We will explain how this approach allows one to produce a large class of asymptotically flat solutions that are the simplest possible at infinity (i.e., equal to an element of the Kerr family outside a large compact set) and how, at the opposite end of the spectrum, one can produce highly anisotropic asymptotically flat solutions to the Einstein constraints. Starting from there, this approach has been successfully applied in the context of other model geometries at infinity (e.g., asymptotically hyperbolic), and to different or partly different physical fields.

A cartoon for the method

To outline the localised gluing approach in the simplest possible case, assume we are given a smooth background manifold M, and an open cover consisting of two regular domains \(\varOmega _1, \varOmega _2\). We further assume to have on \(\overline{\varOmega _i}\) a (suitably smooth) Riemannian metric \(g_i\) solving (see Footnote 4) the Riemannian vacuum constraint

$$\begin{aligned} R_{g_i}=2 \varLambda \end{aligned}$$

together with, possibly, additional requirements in case M is non-compact or has a boundary. If we postulate that the intersection \(\varOmega _1\cap \varOmega _2\) is not empty, and has a regular (say \(C^{\infty }\)) boundary consisting of the disjoint union of the relative boundaries of \(\varOmega _1\) and \(\varOmega _2\) (that is to say: \(\partial \varOmega _1\setminus \partial M\) and \(\partial \varOmega _2\setminus \partial M\)) we can simply write

$$\begin{aligned} M=\varOmega _1\cup \varOmega _2= \varOmega '_1 \sqcup {\overline{\varOmega }}\sqcup \varOmega '_2. \end{aligned}$$

where we have set

$$\begin{aligned} \left\{ \begin{aligned}& \varOmega '_1=\varOmega _1\setminus \overline{\varOmega _2} \\& \varOmega '_2=\varOmega _2\setminus \overline{\varOmega _1} \\ &\varOmega= \varOmega _1\cap \varOmega _2. \end{aligned}\right. \end{aligned}$$

If we then consider some sort of (appropriately defined) interpolation

$$\begin{aligned} g=\chi _1 g_1 +\chi _2 g_2 \end{aligned}$$

under the sole assumption, beyond smoothness of \(\chi _1\) and \(\chi _2\), that

$$\begin{aligned} g= {\left\{ \begin{array}{ll} g_1 &{} \text {in} \ \varOmega '_1 \\ g_2 &{} \text {in} \ \varOmega '_2 \end{array}\right. } \end{aligned}$$

we trivially have that g solves (3.1) away from \(\varOmega \). Hence, one would like to deform g in \({\overline{\varOmega }}\), so as to obtain a global solution to our equation, which coincides with \(g_1\) on \(\varOmega '_1\) and with \(g_2\) in \(\varOmega '_2\). However, not only may such a local deformation be obstructed at the linear level (by the presence of static potentials, see Appendix D), but it may in fact be impossible.

A simple example of this phenomenon has already been alluded to in Sect. 2.8 above. Take \(M={\mathbb {R}}^3\) and consider (3.1) for \(\varLambda =0\): we wish to study the space of (entire) asymptotically flat, scalar-flat Riemannian manifolds. Given any pre-assigned such solution, say \(g_1\), we could for instance try to apply the logical scheme above with \(g_2=\delta \) (the Euclidean metric), \(\varOmega _1\) a ball centered at the origin and \(\varOmega _2\) the complement of a (closed) smaller ball also centered at the origin. In this case, the strategy in question, which aims at building a scalar-flat metric by a deformation that is local to the annulus \(\varOmega _1\cap \varOmega _2\) is inevitably doomed to fail unless \(g_1\) has zero ADM mass (i.e., it is itself the Euclidean metric). For indeed, if the strategy worked it would certainly produce an asymptotically flat, scalar-flat metric \({\overline{g}}\) on \({\mathbb {R}}^3\) having zero ADM mass, hence the Euclidean metric, which then forces \(g_1\) to be flat in \(\varOmega _1\). Said otherwise, the only cases when this approach (for such a choice of \(g_2\)) can possibly work are those when the outcome is trivial. However, clever variations on the same theme have interesting, and sometimes rather surprising, implications. We will now proceed, starting from the obstruction we just presented, towards a gallery of significant applications in this spirit.

Constructing solutions that are Schwarzschildean near infinity

We shall first describe the work Corvino (2000), which explains how to glue any asymptotically flat metric of positive mass to one belonging to the Schwarzschild family. We present the general setup of this work, state the main results and outline the proof, that will be the starting point for most of the other results we will discuss later on in this part of the review.

So, let \((M^3,g_0)\) be an asymptotically flat manifold with one end (see Footnote 5) and satisfying the (time-symmetric) vacuum constraint equations, namely the sole equation \(R_{g_0}=0\). As far as the precise asymptotic behaviour is concerned, we shall assume (following the original source) that \(g_0\) is conformally flat near infinity, that is to say: there exists \(\sigma _0>0\) large enough that for \(|x|>\sigma _0\) (to be understood in the chart at infinity) \(g_0=u^4\delta \) with

$$\begin{aligned} \left\{ \begin{aligned}& \varDelta _\delta u=0, \\ &u\rightarrow 1 \ \text { as } |x|\rightarrow \infty , \end{aligned}\right. \end{aligned}$$

where such an equation is equivalent to the geometric requirement of scalar-flatness. We will see below (discussing Corvino and Schoen 2006) that this assumption can actually be significantly relaxed; in any event, recall from Sect. 2.8 that the subclass of metrics that are conformally flat outside a compact set, as defined above, is dense (in suitable weighted Sobolev or Hölder spaces) in the larger class of asymptotically flat metrics having non-negative scalar curvature. Notice that, by standard facts concerning harmonic functions in the Euclidean setting we know that there exists only one solution to the problem above, which is smooth and has an expansion of the form

$$\begin{aligned} u(x)=1+\frac{m_0}{2|x|}+v(x) \ \ \text {where} \ v(x)=O(|x|^{-2}). \end{aligned}$$

With our choice of normalization constants, it turns out that the number \(m_0\ge 0\) precisely equals the value of the ADM mass of \((M^3,g_0)\).
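This identification can be verified directly. The following sketch assumes the normalisation \(m=\frac{1}{16\pi }\lim _{r\rightarrow \infty }\int _{|x|=r}(g_{ij,i}-g_{ii,j})\frac{x^j}{|x|}\,dA\) for the ADM mass when \(n=3\) (the precise definition being the one given in Appendix B; this choice agrees with the relation \({\overline{m}}_k=2A_k\) used in the proof of Proposition 2.25) and that the first derivatives of v are \(O(|x|^{-3})\). Writing g for \(g_0=u^4\delta \), one computes

$$\begin{aligned} g_{ij,i}-g_{ii,j}&=\partial _j(u^4)-3\partial _j(u^4)=-8u^3\partial _ju, \qquad \partial _ju=-\frac{m_0x^j}{2|x|^3}+O(|x|^{-3}), \\ (g_{ij,i}-g_{ii,j})\frac{x^j}{|x|}&=\frac{4m_0}{|x|^2}+O(|x|^{-3}), \end{aligned}$$

and integrating over the sphere \(|x|=r\), whose Euclidean area is \(4\pi r^2\), yields \(16\pi m_0+O(r^{-1})\), that is \(m=m_0\).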

In the same background Euclidean space, with coordinates \(\{x\}\) one can also consider a four-parameter family of Schwarzschild metrics

$$\begin{aligned} g_{m,{\overline{x}}}=\left( 1+\frac{m}{2|x-{\overline{x}}|}\right) ^{4}\delta \end{aligned}$$

where again \(m>0\) is the ADM mass and \({\overline{x}}\) will be called the center of mass. Now, given a large positive parameter \(\sigma >\sigma _0\) we can consider a metric \({\tilde{g}}_{\sigma }\) that is obtained by interpolation of \(g_0\) and \(g_{m,{\overline{x}}}\) in the following sense. Fix a smooth cutoff function \(\phi :{\mathbb {R}}\rightarrow {\mathbb {R}}\) such that

$$\begin{aligned} \phi (t)= {\left\{ \begin{array}{ll} 1 &{} \text {if} \ t\le 1+\delta \\ 0 &{} \text {if} \ t\ge 2-\delta \end{array}\right. } \end{aligned}$$

where \(\delta >0\) is a small positive parameter that is fixed once and for all, and set

$$\begin{aligned} {\tilde{g}}_{\sigma }(x)= {\left\{ \begin{array}{ll} g_0(x) &{\quad} \ |x|\le \sigma \\ \phi \left( \frac{|x|}{\sigma }\right) g_0(x)+\left( 1-\phi \left( \frac{|x|}{\sigma }\right) \right) g_{m,{\overline{x}}}(x) &{\quad} \sigma \le |x|\le 2\sigma \\ g_{m,{\overline{x}}}(x) &{\quad} \ |x|\ge 2\sigma . \end{array}\right. } \end{aligned}$$

The resulting metric is smooth and well-defined on \(M^3\); furthermore, it does solve the constraints (i.e., it is scalar-flat) away from the annular region \(\varOmega _{\sigma }\) described in coordinates by \(\sigma \le |x|\le 2\sigma \). The strategy which we want to follow is, as a special incarnation of the plan described above, to construct a sequence of corrections of such an interpolation eventually converging to a new scalar-flat metric. The deformations we want to perform are localised, that is to say there will be no modification of \({\tilde{g}}_{\sigma }\) outside of the gluing region.

We need to introduce some convenient notation concerning the functional spaces we shall be working with. In what follows, we let \(\varOmega \subset M\) denote an open, bounded domain whose boundary \(\partial \varOmega \) is at least \(C^2\) (in practice, in most applications we present it is in fact \(C^{\infty }\)); g is a background Riemannian metric which is defined on some open domain containing \({\overline{\varOmega }}\). To avoid ambiguities, let us stress that for Corvino’s gluing theorem (see Theorem 3.1 below) \(\varOmega \) is the annular region defined above and g is just the interpolating metric \({\tilde{g}}_{\sigma }\), however we will later discuss other gluing results where \(\varOmega \) and g are differently defined.

For \(\rho \) a smooth positive function on \(\varOmega \), to be specified later, we define

$$\begin{aligned} L^2_{\rho }{:}{=}\left\{ f\in L^2_{\mathrm{loc}}(\varOmega ) \ : \ f\rho ^{1/2}\in L^2(\varOmega )\right\} \end{aligned}$$

which is naturally a Hilbert space with the obvious norm. Similarly, we let \(H^{\ell }_{\rho }\) denote the space of functions (say in \(H^{\ell }_{\mathrm{loc}}\)) whose weak derivatives up to order \(\ell \) belong to \(L^2_{\rho }\), and again this space comes with its Hilbertian norm. Given an integer \(\ell \ge 0\), \(\alpha \in (0,1)\) and \(\rho \) as above, we let \(C^{\ell ,\alpha }_{\rho }=C^{\ell ,\alpha }\cap L^2_{\rho }\), which is a Banach space when endowed with the norm

$$\begin{aligned} \Vert f\Vert _{\ell ,\alpha ,\rho }=\Vert f\Vert _{\ell ,\alpha }+\Vert f\Vert _{L^2_{\rho }}. \end{aligned}$$

Of course, the same definitions can be given for tensors of arbitrary order.

In what follows, for a domain \(\varOmega \) as above we shall only consider two specific choices of the weight function \(\rho \). If we let \(d:{\overline{\varOmega }}\rightarrow {\mathbb {R}}\) denote the Riemannian distance from \(\partial \varOmega \) (suitably truncated at a preassigned threshold \(d_0>0\), and then smoothened), we shall take:

$$\begin{aligned} \rho = {\left\{ \begin{array}{ll}d^{N} &{} \text {for the finite regularity gluing;}\\ e^{-1/d} &{} \text {for the infinite regularity gluing.} \end{array}\right. } \end{aligned}$$

Here \(N \in {\mathbb {N}}\) is a large positive integer to be fixed later. For duality reasons, we will also have to consider functional spaces defined by using the inverse weight \(\rho ^{-1}\) in lieu of \(\rho \).
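To get a feeling for the role of these weights, one can test them on model functions near the boundary. The following heuristic computation (a sketch, in which integrals over a collar neighbourhood of \(\partial \varOmega \) are reduced to one-dimensional integrals in the variable \(t=d\)) uses the elementary fact that \(\int _0^{d_0}t^a\,dt\) is finite if and only if \(a>-1\). For \(\rho =d^N\) and a real exponent s,

$$\begin{aligned} d^{s}\in L^2_{\rho }&\iff \int _0^{d_0}t^{2s+N}\,dt<\infty \iff s>-\frac{N+1}{2}, \\ d^{s}\in L^2_{\rho ^{-1}}&\iff \int _0^{d_0}t^{2s-N}\,dt<\infty \iff s>\frac{N-1}{2}. \end{aligned}$$

Thus elements of \(L^2_{\rho }\) may blow up at \(\partial \varOmega \), while those of \(L^2_{\rho ^{-1}}\) are forced (in an integral sense) to decay there, the more so the larger N is. For the exponential weight \(\rho =e^{-1/d}\), roughly speaking, membership in \(L^2_{\rho ^{-1}}\) requires decay faster than any power of d; for instance \(e^{-1/d}\) itself qualifies, since \(\int _0^{d_0}e^{-2/t}e^{1/t}\,dt=\int _0^{d_0}e^{-1/t}\,dt<\infty \).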

The main gluing theorem in Corvino (2000), in its global version, can be stated as follows:

Theorem 3.1

Given an integer \(\ell \ge 0\) and a real number \(\alpha \in (0,1)\), let \(g_0\) be a scalar-flat metric on \({\mathbb {R}}^3\) of class \(C^{\ell +4,\alpha }\) that is asymptotically flat and conformally flat outside a compact set. If the ADM mass of \(({\mathbb {R}}^3,g_0)\) is strictly positive, then for all sufficiently large \(\sigma >\sigma _0\) one can find:

  • a tensor \(h\in C^{\ell +2,\alpha }_{\rho ^{-1}}\),

  • a positive real number m,

  • a vector \({\overline{x}}\in {\mathbb {R}}^3\),

such that the metric \({\tilde{g}}_{\sigma }+h\) is scalar-flat.

Remark 3.2

This result was later extended to the general Einstein constraint equations (i.e., to the non time-symmetric scenario) in Corvino and Schoen (2006), as we will further explain in the following discussion.

As the reader may have noticed from the statement, we face a loss of regularity phenomenon: the iteration scheme that is followed in the proof only provides \(C^{\ell +2,\alpha }\) regularity starting from \(C^{\ell +4,\alpha }\) data. We will get back to this point after having presented the proof of Theorem 3.1. However, using exponential weights rather than polynomial ones one can directly obtain a smooth counterpart of the theorem above: if \(g_0\) is \(C^{\infty }\) then so will be the metric we design.

The starting idea to prove Theorem 3.1 is to try and solve the equation

$$\begin{aligned} R_{{\tilde{g}}+h}=0, \ \ h\in C^{\ell +2,\alpha }_{\rho ^{-1}} \end{aligned}$$

by means of a suitable iterative scheme, that is to say by constructing a sequence of solutions to linearised problems. Here and below we shall write \({\tilde{g}}\) in lieu of \({\tilde{g}}_{\sigma }\) for the sake of notational convenience; also note that any such metric \({\tilde{g}}_{\sigma }\) does also depend on the parameters m and \({\overline{x}}\) as explained before. At a formal level, one can write

$$\begin{aligned} R_{{\tilde{g}}+h}=R_{{\tilde{g}}}+L_{{\tilde{g}}}h+Q_{{\tilde{g}}}(h) \end{aligned}$$

where \(Q_{{\tilde{g}}}(h)\) stands for all the terms that are at least quadratic in h. Now, in general \(R_{{\tilde{g}}}\) will not be zero identically in the gluing region but will decay at a suitable rate as one lets \(\sigma \rightarrow \infty \) and thus it makes sense to set \(h_0=0\) and then, given \(h_0,h_1,\ldots , h_i\) determine \(h_{i+1}\) by solving the equation

$$\begin{aligned} L_{{\tilde{g}}} h_{i+1}=-R_{{\tilde{g}}}-Q_{{\tilde{g}}}(h_i). \end{aligned}$$

Of course, to do so we need at least to know that the operator \(L_{{\tilde{g}}}\) is (locally) surjective in suitable Banach spaces. Loosely speaking, by standard duality arguments, this is equivalent to the fact that the formal adjoint \(L^{*}_{{\tilde{g}}}\) is injective when acting on the corresponding dual spaces. Now, this is unlikely to be the case as \({\tilde{g}}_{\sigma }\) approaches the Euclidean metric for \(\sigma \rightarrow \infty \) and \(L^{*}_{\delta }\) does have a non-trivial kernel (see Footnote 6). In other words, one needs to deal with an approximate kernel \(K=\left<1, x^1, x^2, x^3\right>\) of \(L^{*}_{{\tilde{g}}}\). The idea is then to proceed as follows. First of all, by a scaling argument, one can always reduce to the case \(\sigma =1\), so that the gluing happens in the fixed annular region \(A=\left\{ x\in {\mathbb {R}}^3 \ : \ 1< |x|< 2\right\} \) (we do not rename the coordinates). In our application, we shall then consider the rescaled interpolating metric

$$\begin{aligned} g_\sigma (x)={\tilde{g}}_{\sigma }(\sigma x). \end{aligned}$$
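For the reader's convenience, here is a quick check of the claim made above that \(L^{*}_{\delta }\) has a non-trivial kernel. We assume the standard expression \(L^{*}_g u=-(\varDelta _g u)g+\mathrm {Hess}_g u-u\,{\mathrm {Ric}}_g\) for the formal adjoint, which is consistent with the identity \(L^{*}_{g}(1)=-{\mathrm {Ric}}_{g}\) used later in this section. At the Euclidean metric the curvature term drops out, so that (on a connected domain of \({\mathbb {R}}^n\), \(n\ge 2\))

$$\begin{aligned} L^{*}_{\delta }u=0 \ \Longrightarrow \ 0=tr_{\delta }(L^{*}_{\delta }u)=-(n-1)\varDelta u \ \Longrightarrow \ \mathrm {Hess}\,u=0 \ \Longrightarrow \ u=a+\sum _{i}b_i x^i \end{aligned}$$

for real constants \(a, b_1,\ldots ,b_n\). Conversely, every affine function is annihilated by \(L^{*}_{\delta }\), and for \(n=3\) one recovers precisely the four-dimensional space spanned by \(1, x^1, x^2, x^3\).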

Having fixed a spherically symmetric bump function \(\zeta \) of compact support in A, that is equal to one for \(1+\delta \le |x|\le 2-\delta \), we set

$$\begin{aligned} K_{*}=\left<\zeta ,\zeta x^1,\zeta x^2,\zeta x^3\right>\end{aligned}$$

understood as the span over \({\mathbb {R}}\) of the four functions above. Clearly \(K_{*}\subset H^2_{\rho ^{-1}}(A)\) and for any Riemannian metric g that is \(C^0\)-sufficiently close to \(\delta \) one also has the splittings

$$\begin{aligned} L^2(A;dV_g)=K_{*}\oplus K^{\perp }, \ \ L^2(A;dV_g)=K\oplus K_{*}^{\perp }. \end{aligned}$$

At this stage, one proceeds in two steps, by first studying the solvability of the projected problem

$$\begin{aligned} \pi _{K^{\perp }_{*}} R_g=0 \end{aligned}$$

namely one wants to perturb \(g_{\sigma }\) to some \({\overline{g}}_{\sigma }= g_{\sigma }+h_{\sigma }\) (for \(h_{\sigma }=h_{\sigma , (m,{\overline{x}})}\)) to solve this projected problem, thereby achieving that \(R_{g_{\sigma }+h_{\sigma }}\in K_{*}\). Here we have denoted by \(\pi _{K_{*}^{\perp }}:L^2(A)\rightarrow K_{*}^{\perp }\) the orthogonal projector in \(L^2(A)\) with respect to the Riemannian volume measure determined by the background metric \(g_{\sigma }\). Then, as a second step, it is shown that one can carefully select parameters \((m,{\overline{x}})\in {\mathbb {R}}_{>0}\times {\mathbb {R}}^3\) so that in fact \(R_{{\overline{g}}_{\sigma }}=0\). At a conceptual level, this approach can be regarded as a suitable Lyapunov–Schmidt reduction (see e.g., Nirenberg 2001) for the problem we are dealing with. At a technical level, for the first step we first solve the linearised problem by means of suitable elliptic estimates for an underdetermined problem and then solve the nonlinear problem by a Picard iteration scheme, while for the second step we use a suitable (finite-dimensional) degree-type argument.

To solve the linearised problem, at least orthogonally to the cokernel, one first needs to prove what follows:

Proposition 3.3

Let \((M^n,g)\) be a complete Riemannian manifold and let \(\varOmega \subset M\) be a bounded subdomain with sufficiently smooth boundary (in fact, we shall require \(\partial \varOmega \) be \(C^2\)). Assume further that the domain \(\varOmega \) is not static, namely: there is no non-trivial \(u\in H^2_{\mathrm{loc}}(\varOmega )\) such that \(L^*_g u=0\). Then there exists a constant \(C=C(n,M,g,\varOmega )\) such that

$$\begin{aligned} \Vert u\Vert _{H^2_{\rho }(\varOmega )}\le C\Vert L^*_g u\Vert _{L^2_{\rho }(\varOmega )} \end{aligned}$$

for all \(u\in H^2_{\rho }(\varOmega )\).


Proof

As a preliminary step, note that

$$\begin{aligned} \mathrm {Hess}_g u=L^*_g u -\frac{1}{n-1}tr(L^*_g u)g+\left( {\mathrm {Ric}}_g-\frac{R_g }{n-1}g\right) u \end{aligned}$$

hence, using the standard interpolation inequality (see Footnote 7)

$$\begin{aligned} \Vert u\Vert _{H^1(\varOmega )}\le \frac{1}{2}\Vert \mathrm {Hess}_g u\Vert _{L^2(\varOmega )}+C\Vert u\Vert _{L^2(\varOmega )}, \end{aligned}$$

one gets that

$$\begin{aligned} \Vert u\Vert _{H^2(\varOmega )}\le C\left( \Vert L^*_g u\Vert _{L^2(\varOmega )}+\Vert u\Vert _{L^2(\varOmega )}\right) \end{aligned}$$

for all \(u\in H^2(\varOmega )\). Note that this same inequality also holds for any proper subdomain of the form \(\varOmega _{\varepsilon }{:}{=}\left\{ x\in \varOmega \ : \ dist_g(x,\partial \varOmega )\ge \varepsilon \right\} \) in lieu of \(\varOmega \). We will now exploit the assumption that the domain \(\varOmega \) not be static in order to prove that there exist \(\varepsilon _0>0\) and a constant (which we do not rename) such that for all \(0\le \varepsilon \le \varepsilon _0\) one has

$$\begin{aligned} \Vert u\Vert _{H^2(\varOmega _{\varepsilon })}\le C\Vert L^*_g u\Vert _{L^2(\varOmega _{\varepsilon })} \end{aligned}$$

for all \(u\in H^2_{\rho }(\varOmega )\); in other words, we can get rid of the zeroth-order term and derive a pure coercivity estimate.

This claim is justified by first observing that there exists \(\varepsilon _0>0\) such that the operator \(L^*_g\) has no kernel in \(H^2(\varOmega _{\varepsilon })\) for any \(0\le \varepsilon \le \varepsilon _0\) (for, if not, we would have a nested sequence of vector spaces, corresponding to an exhaustion of \(\varOmega \), hence those would eventually stabilise and violate the assumption that \(\varOmega \) not be static) and then, given (3.7) for the domain \(\varOmega _{\varepsilon }\) as an input, by a standard compactness argument. At that stage, Proposition 3.3 is just a straightforward consequence of (3.8) together with a suitable coarea-type formula (see e.g., Lemma 4.1 in Carlotto and Schoen 2016). \(\square \)
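The preliminary identity for the Hessian used at the beginning of the proof can be checked in two lines. As a sketch, assume the standard expression \(L^{*}_g u=-(\varDelta _g u)g+\mathrm {Hess}_g u-u\,{\mathrm {Ric}}_g\) for the formal adjoint of the linearised scalar curvature (this convention is consistent with the identity \(L^{*}_{g}(1)=-{\mathrm {Ric}}_{g}\) employed later on). Taking the trace and solving for the Laplacian gives

$$\begin{aligned} tr(L^{*}_g u)=-(n-1)\varDelta _g u-R_g u \ \ \Longrightarrow \ \ \varDelta _g u=-\frac{1}{n-1}\left( tr(L^{*}_g u)+R_g u\right) , \end{aligned}$$

and substituting back one finds

$$\begin{aligned} \mathrm {Hess}_g u=L^{*}_g u+(\varDelta _g u)g+u\,{\mathrm {Ric}}_g=L^{*}_g u-\frac{1}{n-1}tr(L^{*}_g u)g+\left( {\mathrm {Ric}}_g-\frac{R_g}{n-1}g\right) u. \end{aligned}$$

This makes transparent why the full Hessian of u is controlled by \(L^{*}_g u\) up to zeroth-order terms, which is the overdetermined-ellipticity feature the estimate relies on.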

One can then employ the estimate provided by Proposition 3.3 to discuss the solvability of the equation

$$\begin{aligned} L_g h = f \end{aligned}$$

for given \(f\in L^2_{\rho ^{-1}}(\varOmega )\). The general strategy, of variational character, is very simple. One considers the quadratic functional \(I:H^2_{\rho }(\varOmega )\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} I(u)=\frac{1}{2}\int _{\varOmega }|L^*_g u |^2\rho \,dV_g-\int _{\varOmega }fu\,dV_g \end{aligned}$$

which is well-defined because one has, by Cauchy–Schwarz, the trivial bound

$$\begin{aligned} \left| \int _{\varOmega }fu\right| \le \Vert f\Vert _{L^2_{\rho ^{-1}}(\varOmega )}\Vert u\Vert _{L^2_{\rho }(\varOmega )}. \end{aligned}$$

As a result, for any \(f\in L^2_{\rho ^{-1}}(\varOmega )\) it follows that

$$\begin{aligned} m_{f}{:}{=}\inf _{u\in H^2_{\rho }(\varOmega )} I(u)>-\infty \end{aligned}$$

and there exists a unique minimiser \(u_f\in H^2_{\rho }(\varOmega )\) for the corresponding variational problem. Note that the minimiser is unique due to the convexity of the functional in question, which follows from the identity

$$\begin{aligned} I\left( \frac{u_1+u_2}{2}\right) =\frac{1}{2}I(u_1)+\frac{1}{2}I(u_2) -\frac{1}{8}\int _{\varOmega }|L^*_g (u_2-u_1)|^2\rho \,dV_g. \end{aligned}$$
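It may be worth spelling out why the infimum \(m_f\) is finite, since this is where the coercivity estimate enters. The following is a sketch under the assumptions of Proposition 3.3, with C denoting the constant appearing there. Combining the Cauchy–Schwarz bound above with that estimate one gets, for every \(u\in H^2_{\rho }(\varOmega )\),

$$\begin{aligned} I(u)&\ge \frac{1}{2}\Vert L^*_g u\Vert ^2_{L^2_{\rho }(\varOmega )}-\Vert f\Vert _{L^2_{\rho ^{-1}}(\varOmega )}\Vert u\Vert _{L^2_{\rho }(\varOmega )} \\&\ge \frac{1}{2C^2}\Vert u\Vert ^2_{H^2_{\rho }(\varOmega )}-\Vert f\Vert _{L^2_{\rho ^{-1}}(\varOmega )}\Vert u\Vert _{H^2_{\rho }(\varOmega )} \ \ge \ -\frac{C^2}{2}\Vert f\Vert ^2_{L^2_{\rho ^{-1}}(\varOmega )}, \end{aligned}$$

where the last step is the elementary minimisation of \(t\mapsto t^2/(2C^2)-at\) over \(t\ge 0\). The same chain of inequalities shows that minimising sequences are bounded in \(H^2_{\rho }(\varOmega )\), so that existence of a minimiser follows from the direct method.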

The function \(u_f\) must then be a weak solution to the Euler–Lagrange equation associated to the functional I, which means that

$$\begin{aligned} \int _{\varOmega }\left( \rho g(L^*_g u, L^*_g \eta ) -f\eta \right) \,dV_g =0 \end{aligned}$$

for any test function \(\eta \). Therefore, it follows at once (via integration by parts) that the tensor \(h=\rho L^*_g u \in L^2_{\rho ^{-1}}(\varOmega )\) is a weak solution of

$$\begin{aligned} L_g h =f. \end{aligned}$$

Note that, a posteriori, h is actually smooth if so is f thanks to standard elliptic regularity, applied at the level of the equation solved by u.
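The two assertions made about the tensor h can be unpacked as follows (a sketch, where \(\eta \) ranges over smooth functions with compact support in \(\varOmega \)). First, the weights match up exactly:

$$\begin{aligned} \Vert h\Vert ^2_{L^2_{\rho ^{-1}}(\varOmega )}=\int _{\varOmega }|\rho L^*_g u|^2\rho ^{-1}\,dV_g=\int _{\varOmega }|L^*_g u|^2\rho \,dV_g=\Vert L^*_g u\Vert ^2_{L^2_{\rho }(\varOmega )}<\infty . \end{aligned}$$

Second, the Euler–Lagrange equation can be rewritten as

$$\begin{aligned} \int _{\varOmega }g(h, L^*_g \eta )\,dV_g=\int _{\varOmega }f\eta \,dV_g, \end{aligned}$$

which is precisely the weak formulation of \(L_g h=f\), since \(L^*_g\) is the formal \(L^2\)-adjoint of \(L_g\) and no boundary terms arise for compactly supported \(\eta \).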

Let us now discuss the applicability of these results to the gluing problem described above. Recall that we were working with the rescaled interpolating metric \(g_{\sigma }(x)\) defined on the Euclidean annulus \(A=\left\{ x\in {\mathbb {R}}^3 \ : \ 1<|x|<2 \right\} \). We shall denote by \(L^{\perp }_g\) the linearisation of the operator \(\pi _{K^{\perp }_{*}}R_g\) (with implicit dependence on \(\sigma \), for the sake of notational convenience), that is in fact just equal to \(\pi _{K^{\perp }_{*}} L_g\).

Proposition 3.4

In the setting above, given \(f\in L^2_{\rho ^{-1}}(A)\cap K^{\perp }_{*}\) one can find \(h\in H^2_{{\rho }^{-1}}(\varOmega )\) such that

$$\begin{aligned} L^{\perp }_g h = f. \end{aligned}$$

The only issue in the proof, given the above discussion, is the fact that one needs to work orthogonally to the cokernel. We will just provide an outline (the interested reader can find all the details in Corvino 2000, pp. 169–171).

Sketch of proof of Proposition 3.4

One starts by claiming the existence of \(\varepsilon >0\) and a uniform constant \(C>0\) so that for any Riemannian metric g that is (say smoothly) \(\varepsilon \)-close to the Euclidean metric

$$\begin{aligned} \Vert u\Vert _{H^2_{\rho }(A)}\le C\Vert L^*_g u\Vert _{L^2_{\rho }(A)}, \ \ \forall u\in K^{\perp }_{*}. \end{aligned}$$

This statement is justified, along the lines above, first at the Euclidean metric \(\delta \) and then by approximation. Given such an estimate, one can proceed by setting up the variational problem associated to minimising the functional

$$\begin{aligned} I(u)=\int _{A}\left( \frac{1}{2}|(L^{\perp }_g)^{*} u|^2\rho -fu\right) \,dV_g, \end{aligned}$$

defined on the Hilbert (sub)space \(H^2_{\rho }(A) \cap K^{\perp }_{*}\).

It follows that a (unique) minimiser \(u_f\) exists and that it satisfies

$$\begin{aligned} 0=\int _{A}\left( \rho g((L^{\perp }_g)^{*} u, (L^{\perp }_g)^{*} \eta ) -f\eta \right) \,dV_g \end{aligned}$$

for all test functions \(\eta \in C^{\infty }_c(A)\cap K^{\perp }_{*}\). This is not quite enough to conclude, due to the restrictions imposed on the class of test functions. However, at that stage one notices that on the one hand the subspace \(C^{\infty }_c(A)\cap K^{\perp }_{*}\) is \(L^2(A;dV_g)\)-dense in \(K^{\perp }_{*}\) and on the other hand that for \(\eta \in K_{*}\) one trivially has \((L^{\perp }_g)^{*} \eta =0\) (because \(L^{\perp }_g=\pi _{K^{\perp }_{*}}L_g\)), whence we see that the tensor \(h=\rho (L^{\perp }_g)^{*} u\in L^2_{\rho ^{-1}}(A)\) is actually a weak solution of \(L^{\perp }_g h=f\), which is what we wanted. \(\square \)

In fact, it follows from our construction that one can define a linear, continuous operator \(T:L^2_{\rho ^{-1}}\rightarrow H^2_{\rho ^{-1}}\) that associates to any given function a solution (selected as the unique minimiser of the above functional) of the equation in question. The following statement ensures solvability of the full (i.e., nonlinear) scalar curvature deformation problem, and that the deformation in question can be performed in a localised fashion.

Theorem 3.5

Given \(g_0\) a \(C^{\ell +4,\alpha }\)-metric (for given \(\ell \ge 0\) integer, and \(\alpha \in (0,1)\)) on a manifold \(M^n\) and a compact, regular domain \(\varOmega \) that is non-static, there exists \(\varepsilon >0\) such that if \(S-R_{g_0}\in C^{\ell ,\alpha }_{\rho ^{-1}}(\varOmega )\) with \(\Vert S-R_{g_0}\Vert _{\ell ,\alpha ,\rho ^{-1}}<\varepsilon \) then one can find a \(C^{\ell +2,\alpha }\)-metric g on \(M^n\) that equals \(g_0\) outside \(\varOmega \) and whose scalar curvature satisfies \(R_g=S\).

To prove this statement, one could first neglect the quadratic term and solve the equation

$$\begin{aligned} L_{g_0}h=S-R_{g_0} \end{aligned}$$

so if we let \(h_1=T(S-R_{g_0})\) we can then set \(g_1=g_0+h_1\) and proceed. In practice, one might be tempted to set up a Newton iterative scheme, where the linearisation for the second step happens at \(g_1\) and keeps changing step after step. Unfortunately, this poses a serious technical problem as one faces the aforementioned loss of regularity phenomenon: if \(g_0\in C^{\ell +4,\alpha }\) then \(g_1\in C^{\ell +2,\alpha }\), so we lose two orders of differentiability and thus we cannot possibly iterate this procedure as many times as we wish. For this reason, we proceed differently and rather define the general iteration scheme by setting \(f_0=0, h_0=0\) and then

$$\begin{aligned} \left\{ \begin{aligned} &f_i= (S-R_{g_0})-Q_{g_0}(h_{i-1}) \\& L_{g_0}h_i=f_i. \end{aligned}\right. \end{aligned}$$

In order to properly define the iteration one needs to introduce two Banach spaces \((X_1, \Vert \cdot \Vert _1)\) and \((X_2, \Vert \cdot \Vert _2)\), with the property that the linear solution operator \(T: X_1\rightarrow X_2\) is well-defined, linear and continuous; in the specific case under consideration we shall set

$$\begin{aligned} X_1&= C^{\ell ,\alpha }_{\rho ^{-1}}(\varOmega ), \text { with norm } \Vert \cdot \Vert _{1}=\Vert \cdot \Vert _{\ell ,\alpha ,\rho ^{-1}}, \\ X_2&= C^{\ell +2,\alpha }_{\rho ^{-1}}(\varOmega ), \text { with norm } \Vert \cdot \Vert _{2}=\Vert \cdot \Vert _{\ell +2,\alpha ,\rho ^{-1}}. \end{aligned}$$

Hence, once it is proven that the resolvent operator \(T:X_1\rightarrow X_2\) is continuous, the proof of Theorem 3.5 comes by a direct application of the following ‘abstract’ lemma, which concerns the improvement of the quadratic error terms.

Lemma 3.6

Given any \(\lambda >0\), there exists \(r_0>0\) sufficiently small so that if \(\Vert f_1\Vert _1<r_0\) and \(\Vert f_2\Vert _1<r_0\) and we let \(h_1=T f_1\), \(h_2=T f_2\) then we have

$$\begin{aligned} \Vert Q_{g}(h_1)-Q_{g}(h_2)\Vert _1\le \lambda \Vert h_1-h_2\Vert _2. \end{aligned}$$
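To see how the lemma drives the iteration, here is a sketch of the resulting contraction argument. We take \(h_i=Tf_i\) in the scheme above, write \(\Vert T\Vert \) for the operator norm of \(T:X_1\rightarrow X_2\), apply the lemma at \(g=g_0\) with the (hypothetical, but natural) choice \(\lambda =1/(2\Vert T\Vert )\), and assume that the smallness requirement \(\Vert f_i\Vert _1<r_0\) is preserved along the iteration, which is where the smallness of \(\varepsilon \) enters. Since \(f_{i+1}-f_i=-(Q_{g_0}(h_i)-Q_{g_0}(h_{i-1}))\), one has

$$\begin{aligned} \Vert h_{i+1}-h_i\Vert _2\le \Vert T\Vert \,\Vert Q_{g_0}(h_i)-Q_{g_0}(h_{i-1})\Vert _1\le \frac{1}{2}\Vert h_i-h_{i-1}\Vert _2, \end{aligned}$$

so that \((h_i)\) is a Cauchy sequence in \(X_2\) with \(\Vert h_i\Vert _2\le 2\Vert h_1\Vert _2\le 2\Vert T\Vert \,\Vert S-R_{g_0}\Vert _1\). Passing to the limit in the scheme, the limit h satisfies

$$\begin{aligned} L_{g_0}h=(S-R_{g_0})-Q_{g_0}(h) \ \ \Longleftrightarrow \ \ R_{g_0+h}=R_{g_0}+L_{g_0}h+Q_{g_0}(h)=S. \end{aligned}$$

Note that the linearisation is frozen at \(g_0\) throughout, which is exactly what prevents the loss of regularity from accumulating.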

Remark 3.7

(Back to the loss of regularity phenomenon). Suppose (as in the statement above) that \(g\in C^{\ell +4,\alpha }\) and suppose we want to solve the equation \(L_{g}h=f\) where \(f\in C^{\ell +2,\alpha }\) (this issue is purely local, so we do not need to worry about the weight function at the boundary). Now, the solution we had constructed was of the form \(h=\rho L^*_g u\), so that one gets for u the fourth-order equation

$$\begin{aligned} L_g (\rho L^*_g u)=f. \end{aligned}$$

Using the explicit expression of the operators \(L_g\) and \(L^*_g\), as provided above, it is easy to check that under our assumptions there are coefficients of the differential operator on the left-hand side that are only in \(C^{\ell ,\alpha }\) (one such term being, for instance, \(\varDelta _g R_g\)). Hence, it follows by standard Schauder theory that one only gets \(u\in C^{\ell +4,\alpha }\) and hence \(h\in C^{\ell +2,\alpha }\).

Getting back to the gluing problem, following the very same conceptual scheme described above, one can prove the following assertion, which ensures the solvability of the gluing problem orthogonally to the cokernel.

Proposition 3.8

Given any \(C^{\ell +4,\alpha }\) asymptotically flat, scalar-flat metric g on \({\mathbb {R}}^3\) and given any \((m,{\overline{x}})\in {\mathbb {R}}_{>0}\times {\mathbb {R}}^3\) then for \(\sigma >0\) large enough there exists a \(C^{\ell +2,\alpha }\)-metric \({\overline{g}}_{\sigma ,(m,{\overline{x}})}=g_{\sigma ,(m,{\overline{x}})} +h_{\sigma ,(m,{\overline{x}})}\) with \(R_{{\overline{g}}_{\sigma ,(m,{\overline{x}})}}\in K_{*}\). Here \(h\in C^{\ell +2,\alpha }_{\rho ^{-1}}\) and

$$\begin{aligned} \Vert h\Vert _{\ell +2,\alpha ,\rho ^{-1}}\le c\Vert R(g_{\sigma ,(m,{\overline{x}})})\Vert _{\ell ,\alpha ,\rho ^{-1}}=O(\sigma ^{-1}). \end{aligned}$$

We actually get a continuous map

$$\begin{aligned} {\mathscr {I}}: \left\{ m>0\right\} \times {\mathbb {R}}^3_{{\overline{x}}} \ \rightarrow \ {\mathbb {R}}^4 \end{aligned}$$

defined by taking \(L^2\)-projections against the four generators of the linear space K, and the proof of Theorem 3.1 amounts to showing that this map vanishes somewhere, namely that there exist \(m>0\) and \({\overline{x}}\in {\mathbb {R}}^3\) such that \({\mathscr {I}}(m,{\overline{x}})=0\). Degree-theoretic arguments allow one to reach the conclusion, as we now outline.

To begin with, we shall exploit the specific (i.e., conformally flat) structure of the assigned metric, hence of the interpolating metric \(g_{\sigma }\). If we write (notation as above)

$$\begin{aligned} g_0=u^4\delta _{ij}, \ u\rightarrow 1 \ \text {as} \ |x|\rightarrow \infty \end{aligned}$$

for some smooth, positive function u, the scalar-flatness requirement turns into the condition that u be harmonic, which implies that u has an expansion at infinity of the form

$$\begin{aligned} u(x)=1+\frac{A}{|x|}+\frac{Bx^1}{|x|^3}+\frac{Cx^2}{|x|^3} +\frac{Dx^3}{|x|^3}+O(|x|^{-3}). \end{aligned}$$
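The equivalence between scalar-flatness and harmonicity invoked here is an instance of the standard conformal transformation law for the scalar curvature, which in dimension three reads (with \(\varDelta \) the flat Laplacian)

$$\begin{aligned} R_{u^4\delta }=-8u^{-5}\varDelta u, \end{aligned}$$

so that, for positive u, \(R_{g_0}=0\) holds exactly where \(\varDelta u=0\). The first terms of the expansion then correspond to the decaying harmonic functions \(|x|^{-1}\) (monopole) and \(x^i|x|^{-3}\) (dipoles), the next ones being of order \(|x|^{-3}\).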

Now, recall that \(A=m_0/2\) for \(m_0>0\) the ADM mass of \(g_0\), and in addition one can always choose a new chart (essentially by translating the center) so that the three terms of order \(\simeq |x|^{-2}\) disappear. More precisely, we note that (if \(A>0\)) there exists a unique \(c_0\in {\mathbb {R}}^3\) such that

$$\begin{aligned} u(x-c_0)=1+\frac{A}{|x|}+O(|x|^{-3}), \end{aligned}$$

in fact this vector is explicitly characterised, in terms of the coefficients above, by the equation

$$\begin{aligned} \left( c_0^1,c_0^2,c_0^3\right) =-A^{-1}(B,C,D) \end{aligned}$$

and in terms of the metric by

$$\begin{aligned} (B,C,D)=\frac{3}{64\pi }\lim _{\sigma \rightarrow \infty }\int _{|x|=\sigma }x \sum _{i,j}((g_0)_{ij,i}-(g_0)_{ii,j})\nu ^j\,dA_{g_0}. \end{aligned}$$

The latter equation follows by virtue of the pointwise identity

$$\begin{aligned} \sum _{i,j}((g_0)_{ij,i}-(g_0)_{ii,j})\nu ^j=-8 u^3\left( \frac{\partial u}{\partial r}\right) \end{aligned}$$

where we have used the convenient notation (for the radial derivative)

$$\begin{aligned} \frac{\partial u}{\partial r}=\sum _{i=1}^3 \frac{\partial u}{\partial x^i}\nu ^i. \end{aligned}$$
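The pointwise identity is a direct computation, which we record as a quick check. Since \((g_0)_{ij}=u^4\delta _{ij}\) one has \((g_0)_{ij,k}=4u^3 u_{,k}\delta _{ij}\), hence

$$\begin{aligned} \sum _{i}(g_0)_{ij,i}=4u^3u_{,j}, \qquad \sum _{i}(g_0)_{ii,j}=12u^3u_{,j}, \qquad \sum _{i,j}((g_0)_{ij,i}-(g_0)_{ii,j})\nu ^j=-8u^3\sum _j u_{,j}\nu ^j. \end{aligned}$$

As a consistency test, assuming the usual normalisation of the ADM mass as \(1/(16\pi )\) times the limit of the flux integral of this quantity, the leading part \(u=1+A/|x|\) has \(\partial u/\partial r=-A/|x|^2\) and therefore

$$\begin{aligned} \lim _{\sigma \rightarrow \infty }\int _{|x|=\sigma }\left( -8u^3\frac{\partial u}{\partial r}\right) \,dA_{g_0}=8A\cdot 4\pi =32\pi A=16\pi m_0 \quad \text {when } A=\frac{m_0}{2}, \end{aligned}$$

in agreement with the relation between A and the mass recalled above.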

As a result, we can (and we shall) always assume in the sequel of this discussion that the background coordinate system has been chosen once and for all so that \((B,C,D)=(0,0,0)\).
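The formula for \(c_0\) can be verified by a first-order Taylor expansion; in the sketch below we write \(b=(B,C,D)\) and let \(c\in {\mathbb {R}}^3\) be any fixed vector. As \(|x|\rightarrow \infty \),

$$\begin{aligned} \frac{1}{|x-c|}=\frac{1}{|x|}+\frac{c\cdot x}{|x|^3}+O(|x|^{-3}), \qquad \frac{b\cdot (x-c)}{|x-c|^3}=\frac{b\cdot x}{|x|^3}+O(|x|^{-3}), \end{aligned}$$

so that

$$\begin{aligned} u(x-c)=1+\frac{A}{|x|}+\frac{(Ac+b)\cdot x}{|x|^3}+O(|x|^{-3}). \end{aligned}$$

The terms of order \(|x|^{-2}\) therefore vanish if and only if \(c=-A^{-1}b\), which requires \(A\ne 0\); this is where the positivity of the mass is used.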

That being said, we consider the finite-dimensional reduction map

$$\begin{aligned} {\mathscr {I}}_{\sigma }: (m,{\overline{x}}) \mapsto \left( \frac{1}{16\pi }\sigma \int _{A}R_{{\overline{g}}_{\sigma }}\,dV_{g_{\sigma }}, -\frac{3}{16\pi }\sigma ^2\int _{A} x R_{{\overline{g}}_{\sigma }}\,dV_{g_{\sigma }}\right) . \end{aligned}$$

The aforementioned splitting

$$\begin{aligned} L^2(A; dV_{g_{\sigma }})=K\oplus K^{\perp }_{*} \end{aligned}$$

implies that the whole proof of the gluing theorem is complete once we can show that in fact there exist \((m,{\overline{x}})\) such that

$$\begin{aligned} {\mathscr {I}}_{\sigma }(m,{\overline{x}})=0 \ \ {\text{under the constraint}} \ \ |{\overline{x}}|<\sigma . \end{aligned}$$

Sketch of the proof of Theorem 3.1

We claim the following two estimates:

$$\begin{aligned} \int _{A}R_{{\overline{g}}_{\sigma ,(m,{\overline{x}})}} \,dV_{g_{\sigma ,(m,{\overline{x}})}}&=(16\pi )\left( \frac{m-m_0}{\sigma }\right) +o(\sigma ^{-1}) \\ \int _{A}x R_{{\overline{g}}_{\sigma ,(m,{\overline{x}})}}\,dV_{g_{\sigma ,(m,{\overline{x}})}}&=-\left( \frac{16\pi }{3}\right) \left( \frac{m{\overline{x}}}{\sigma ^2} +\varPsi _{\sigma }(m,{\overline{x}})+o(\sigma ^{-2})\right) \end{aligned}$$

where (in both cases) the bound on the remainder term is uniform as long as \(m,{\overline{x}}\) vary in a compact set (this last statement being true for any pre-fixed compact set). Here \(m_0\) denotes the mass of the initial metric \(g_0\), while the term \(\varPsi _{\sigma }(m,{\overline{x}})\) satisfies a bound of the form

$$\begin{aligned} \left| \varPsi _{\sigma }(m,{\overline{x}})\right| \le C\frac{|m-m_0|}{\sigma ^2}. \end{aligned}$$

We will only outline the proof of the first estimate (the \(L^2\)-product with the constant 1), which is slightly simpler and yet gives a rather accurate idea of the way these sorts of estimates are obtained.

By the classical work of Fischer and Marsden (1975a), the scalar curvature map can be regarded as a differentiable map in suitable Hölder spaces i.e.,

$$\begin{aligned} R\in C^1\left( C^{k+2,\alpha }({\overline{A}}),C^{k,\alpha }({\overline{A}})\right) \end{aligned}$$

and we can write the Taylor expansion

$$\begin{aligned} R_{g_{\sigma }+h_{\sigma }}=R_{g_{\sigma }}+L_{g_{\sigma }}h_{\sigma } +O(\Vert h_{\sigma }\Vert ^2_{k+2,\alpha }) \end{aligned}$$

(where the dependence of \(g_{\sigma }, h_{\sigma }\) on \(m,{\overline{x}}\) is implicit, for the sake of notational convenience). Now, we analyse each term separately. First of all, observe that

$$\begin{aligned} O(\Vert h_{\sigma }\Vert ^2_{k+2,\alpha })=O(\sigma ^{-2}) \end{aligned}$$

as one lets \(\sigma \rightarrow \infty \), this being true as a result of the bound

$$\begin{aligned} \Vert h_{\sigma }\Vert _{k+2,\alpha ,\rho ^{-1}}\le C\Vert R_{g_{\sigma }}\Vert _{k,\alpha ,\rho ^{-1}}\le C\sigma ^{-1} \end{aligned}$$

which is part of the conclusion of Proposition 3.8. For the linear part:

$$\begin{aligned} \int _{A}L_{g_{\sigma }}h_{\sigma }\,dV_{g_{\sigma }}=\int _{A}g(h_{\sigma }, L^{*}_{g_{\sigma }}(1))\,dV_{g_{\sigma }} \end{aligned}$$

but now (recalling the explicit expression for the adjoint of the linearised scalar curvature operator)

$$\begin{aligned} L^{*}_{g_{\sigma }}(1)=-{\mathrm {Ric}}_{g_{\sigma }}=O(\sigma ^{-1}) \end{aligned}$$

and so (given the aforementioned bounds on the tensor \(h_{\sigma }\)) one gets

$$\begin{aligned} \int _{A}L_{g_{\sigma }}h_{\sigma }\,dV_{g_{\sigma }}=O(\sigma ^{-2}). \end{aligned}$$

Let us now handle the leading term, that is

$$\begin{aligned} R_{g_{\sigma }}&=\sum _{i,j}((g_{\sigma })_{ij,ij} -(g_{\sigma })_{ii,jj})+O(\sigma ^{-2}) \\ \biggl [\Longleftrightarrow \ R_{{\tilde{g}}_{\sigma }}&=\sum _{i,j}(({\tilde{g}}_{\sigma })_{ij,ij} -({\tilde{g}}_{\sigma })_{ii,jj})+O(\sigma ^{-4}) \biggr ]. \end{aligned}$$
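The equivalence between the two lines above is a matter of scaling, which we sketch for convenience. Since \((g_{\sigma })_{ij}(x)=({\tilde{g}}_{\sigma })_{ij}(\sigma x)\), each coordinate derivative produces a factor \(\sigma \), and the scalar curvature (which involves second derivatives of the metric and products of first derivatives) transforms as

$$\begin{aligned} R_{g_{\sigma }}(x)=\sigma ^2R_{{\tilde{g}}_{\sigma }}(\sigma x), \qquad \partial _i\partial _j(g_{\sigma })_{kl}(x)=\sigma ^2\left( \partial _i\partial _j({\tilde{g}}_{\sigma })_{kl}\right) (\sigma x). \end{aligned}$$

For \(y=\sigma x\) in the annulus of radii \(\sigma \) and \(2\sigma \), the terms neglected in the second line are of the type \((\partial {\tilde{g}}_{\sigma })^2\) and \(({\tilde{g}}_{\sigma }^{-1}-\delta )\partial ^2{\tilde{g}}_{\sigma }\), hence \(O(|y|^{-4})=O(\sigma ^{-4})\), and after rescaling they become \(\sigma ^2\cdot O(\sigma ^{-4})=O(\sigma ^{-2})\). The same change of variables gives \(dV_{g_{\sigma }}(x)=\sigma ^{-3}dV_{{\tilde{g}}_{\sigma }}(y)\), which combined with the factor \(\sigma ^2\) accounts for the prefactor \(\sigma ^{-1}\) in the integral identity that follows.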

Here is the key point of the argument: denoting by \(A_{\sigma }\) the annulus of radii \(\sigma \) and \(2\sigma \), by first applying a change of variable and then the divergence theorem one has

$$\begin{aligned} \int _{A}R_{g_{\sigma }}\,dV_{g_{\sigma }}&=\sigma ^{-1}\int _{A_{\sigma }} \sum _{i,j}(({\tilde{g}}_{\sigma })_{ij,ij}-({\tilde{g}}_{\sigma })_{ii,jj}) \,dV_{{\tilde{g}}_{\sigma }}+O(\sigma ^{-2}) \\&=\sigma ^{-1}\int _{\partial A_{\sigma }}\sum _{i,j}(({\tilde{g}}_{\sigma })_{ij,i} -({\tilde{g}}_{\sigma })_{ii,j})\nu ^j\,dA_{{\tilde{g}}_{\sigma }}+O(\sigma ^{-2}) \end{aligned}$$

whence we just split such boundary integral into the outer and inner component, where \({\tilde{g}}_{\sigma }\) equals \(g_{m,{\overline{x}}}\) or \(g_0\) respectively. Thus, keeping in mind the very definition of ADM mass one has

$$\begin{aligned} \lim _{\sigma \rightarrow \infty }\sigma \int _{A}R_{g_{\sigma }}\,dV_{g_{\sigma }}=16\pi (m-m_0) \end{aligned}$$

which obviously implies

$$\begin{aligned} \int _{A}R_{g_{\sigma }}\,dV_{g_{\sigma }}=16\pi \left( \frac{m-m_0}{\sigma }\right) +o(\sigma ^{-1}), \end{aligned}$$

and the conclusion follows by simply combining the three terms above. We thus have

$$\begin{aligned} {\mathscr {I}}_{\sigma }(m,{\overline{x}})=(m-m_0,m{\overline{x}})+(0,\varXi _{\sigma })+o(1) \end{aligned}$$

where the remainder is uniform when \(m,{\overline{x}}\) vary in a compact set around \((m_0,0)\), and we know \(|\varXi _{\sigma }|=(m-m_0)O(1)\). Let us also consider the truncated map \({\mathscr {I}}^T_{\sigma }{:}{=}(m-m_0,m{\overline{x}})+(0,\varXi _{\sigma })\) where the additional error terms are neglected.
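For clarity, here is how this expansion arises from the two claimed estimates, under the (presumable) identification \(\varXi _{\sigma }=\sigma ^2\varPsi _{\sigma }\). Inserting the estimates in the definition of \({\mathscr {I}}_{\sigma }\) one finds, componentwise,

$$\begin{aligned} \frac{\sigma }{16\pi }\int _{A}R_{{\overline{g}}_{\sigma }}\,dV_{g_{\sigma }}&=\frac{\sigma }{16\pi }\left( 16\pi \,\frac{m-m_0}{\sigma }+o(\sigma ^{-1})\right) =(m-m_0)+o(1), \\ -\frac{3\sigma ^2}{16\pi }\int _{A}xR_{{\overline{g}}_{\sigma }}\,dV_{g_{\sigma }}&=\frac{3\sigma ^2}{16\pi }\cdot \frac{16\pi }{3}\left( \frac{m{\overline{x}}}{\sigma ^2}+\varPsi _{\sigma }+o(\sigma ^{-2})\right) =m{\overline{x}}+\sigma ^2\varPsi _{\sigma }+o(1), \end{aligned}$$

and the bound \(|\varPsi _{\sigma }|\le C|m-m_0|\sigma ^{-2}\) translates into \(|\varXi _{\sigma }|\le C|m-m_0|\), uniformly in \(\sigma \).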

Now, take in \({\mathbb {R}}\times {\mathbb {R}}^3\) a rectangular box centered at \((m_0,0)\) and of half-sides \(\mu _0,\mu _1,\mu _2, \mu _3\) where \(\mu _0<m_0/4\) and

$$\begin{aligned} \mu _1=\mu _2=\mu _3=\theta ^{-1} \mu _0 \end{aligned}$$

where \(\theta \in (0,1)\) is a small parameter that shall be determined depending on \(\varXi _\sigma \) only. To that end, observe that (for any large \(\sigma \)) the map \({\mathscr {I}}^T_{\sigma }\) sends the hyperplanes \(m=m_0\pm \mu _0\) to the hyperplanes \(x^0=\pm \mu _0\), and the hyperplanes \({\overline{x}}^i=\pm \mu _i\) close to the hyperplanes \(x^i=\pm m_0 \mu _i\) for each \(i=1,2,3\). In particular we can always choose the constant \(\theta \) small enough that the image of the boundary of the box in question encircles the origin. At that stage, it follows that one can find solutions to the truncated equation \({\mathscr {I}}^T_{\sigma }(m,{\overline{x}})=0\). In particular, this implies that the origin has non-zero degree for \({\mathscr {I}}^T_{\sigma }\) and at that stage the result for \({\mathscr {I}}_{\sigma }\) follows by homotopical invariance of the local degree (see e.g., Milnor 1965), which ensures that indeed the origin will have the same non-zero degree for both maps. \(\square \)
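The choice of \(\theta \) in the argument above can be made explicit as follows. This is only a sketch: we assume a bound \(|\varXi _{\sigma }|\le C|m-m_0|\) on the box, with a constant C independent of \(\sigma \), and write \({\overline{x}}^i\), \(\varXi ^i_{\sigma }\) for the components of \({\overline{x}}\) and \(\varXi _{\sigma }\). On the two faces \(m=m_0\pm \mu _0\) the first component of \({\mathscr {I}}^T_{\sigma }\) equals \(\pm \mu _0\ne 0\), while on the faces \({\overline{x}}^i=\pm \mu _i\) one has, since \(m\ge m_0-\mu _0>3m_0/4\),

$$\begin{aligned} |m{\overline{x}}^i+\varXi ^i_{\sigma }|\ge \frac{3m_0}{4}\theta ^{-1}\mu _0-C\mu _0>0 \quad \text {provided } \theta <\frac{3m_0}{4C}. \end{aligned}$$

The same bound shows that the homotopy \(t\mapsto (m-m_0,m{\overline{x}}+t\varXi _{\sigma })\), \(t\in [0,1]\), has no zeros on the boundary of the box. At \(t=0\) the only zero is \((m_0,0)\), where the Jacobian determinant of \((m,{\overline{x}})\mapsto (m-m_0,m{\overline{x}})\) equals \(m_0^3>0\), so the degree of \({\mathscr {I}}^T_{\sigma }\) at the origin is one. Since \({\mathscr {I}}^T_{\sigma }\) is bounded away from zero on the boundary uniformly in \(\sigma \), the \(o(1)\) perturbation leading to \({\mathscr {I}}_{\sigma }\) does not change the degree for \(\sigma \) large.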

The striking results by Corvino have generated a significant amount of related work. Proceeding by conceptual affinity (rather than in chronological order) we should first mention the later paper Corvino and Schoen (2006), where the analysis above is extended (along similar lines) to the case of general, i.e., non time-symmetric, relativistic constraints. We note that the obstructions to the local deformations are represented by the Killing initial data sets (henceforth often abbreviated KIDs), namely those triples \((M, g, k)\) such that the operator \(D\varPhi ^{*}_{(g,k)}\) has a non-trivial kernel, in the appropriate functional space (which obviously depends on the specific setup of the problem in question). This notion, which clearly generalises that of static manifolds, and is nevertheless a lot less rigid (cf. Appendix D), will come up multiple times along the course of our discussion. In particular, we will see in Sect. 4.1 how the terminology we have employed is justified by the result that the spacetime arising by evolving any such initial data set naturally comes with a Killing vector field; we refer the reader to the aforementioned work Beig et al. (2005) for mathematical background around this fundamental class of initial data.

A subtle point, which we wish to stress explicitly, is that the analogue of Theorem 3.5 for the full constraints (no time-symmetry assumption) does not in general ensure that the weak DEC can be promoted to a strict inequality by means of a local deformation, even when no linear obstruction occurs. Indeed, the value of \(|J|_g\) obviously depends on the Riemannian metric g that is employed, which will vary (partly out of control) when performing a prescribed deformation of the quantities in the right-hand side of (1.2). This delicate problem has, very recently, come to a rather definitive solution in the work Huang and Lee (2020a), which we already alluded to in the introduction in relation to Bartnik’s stationarity conjecture. The authors introduce the so-called dominant energy scalar as

$$\begin{aligned} \sigma {:}{=}2(\mu -|J|_g) \end{aligned}$$

and carefully define a concept of local improvability, which precisely encodes the property of an initial data set of being locally deformable to one for which the quantity \(\sigma \) can be increased by a small, yet pre-assigned amount. Then, they prove that a non-improvable initial data set without local symmetries must have constant dominant energy scalar in the domain under consideration (to be compared with Proposition D.4), and must sit inside a null perfect fluid spacetime carrying a global Killing vector field. Thereby, this work by Huang and Lee also extends some old, foundational results by Moncrief (concerning vacuum data) to the case when physical sources are actually present.

Moving further, at the global level Corvino and Schoen prove a general gluing theorem asserting that any asymptotically flat initial data set (satisfying suitable asymptotic conditions, see below) can be glued to an element of the Kerr family. Besides proving a suitable coercivity estimate for the adjoint of the linearised constraint map, which brings the coupling of the problem into play, one needs to deal with a ten-dimensional cokernel. This obstruction space is dealt with by carefully studying the reduction map (following a Lyapunov–Schmidt reduction as above) and then showing, through a degree argument, that there is a point in the parameter space (which is a subregion of \({\mathbb {R}}^{10}\) parametrised by the energy, linear momentum, angular momentum and center of mass of the data) where such a map vanishes. Incidentally, when thinking about these four quantities, it is appropriate to keep in mind the enlightening result of Huang et al. (2011) (where gluing methods also play a central role): it is proven that given an (asymptotically flat) initial data set satisfying the Regge–Teitelboim condition it is possible to find another initial data set with the same energy–momentum and freely assigned angular momentum and center of mass. Hence, unless additional restrictions are imposed, there are no constraints on the angular momentum and center of mass in terms of the energy–momentum vector (which is subject to the positive mass theorem, cf. Appendix B) for general vacuum solutions of the Einstein equations.

We further note how Corvino and Schoen (but compare with our later comments on Chruściel and Delay 2003) point out that their method applies, somewhat more generally and more abstractly, to stationary ends near spacelike infinity of asymptotically flat vacuum solutions other than Kerr ones. In that respect, it is of interest to understand whether there exists a complete characterisation of such solutions in terms of (a minimal set of) asymptotic data. This turns out to be, already in the static case, a delicate and fascinating question, see Friedrich (2007) and the later work Aceña (2009) for a generalisation to the stationary case.

On the technical side, a very important remark (which we will crucially exploit in Sect. 3.4 below) is that, differently from the conformally flat asymptotics assumed in Corvino (2000), it is observed in Corvino and Schoen (2006) that all one needs for the scheme to work is that (working at one end, with asymptotic coordinates \(\left\{ x\right\} \))

$$\begin{aligned}&g_{ij}(x)=\delta _{ij} +O(|x|^{-1}), \ \ g_{ij}(x)-g_{ij}(-x)=O(|x|^{-2}), \ \ g_{ij,k}(x)+g_{ij,k}(-x)=O(|x|^{-3})\nonumber \\&k_{ij}(x)= O(|x|^{-2}), \ \ k_{ij}(x)+k_{ij}(-x)=O(|x|^{-3}), \ \ k_{ij,k}(x)-k_{ij,k}(-x)=O(|x|^{-4}) \end{aligned}$$

together with, possibly, corresponding decay requirements on the higher derivatives. We note that these conditions are partly reminiscent of those given in Regge and Teitelboim (1974). Indeed, recognising this allowed one to streamline certain arguments. While the reduction map is obviously more complicated in the general case than in the time-symmetric setting (where only four parameters were needed), singling out condition (3.10) affords some economy in computing this map, where (as we saw) an expansion of the conformal factor is used. We further note that such a gluing theorem, in combination with a suitable density theorem, in weighted Sobolev spaces, of data having harmonic asymptotics (cf. Sect. 2.8) implies, in turn, a suitable density statement for asymptotically flat data that are exactly Kerr near infinity. This is of course an interesting result, as the evolution of these data produces spacetimes whose structure at null infinity is comparatively well-understood. In particular, we explicitly note how the asymptotic smoothness of asymptotically flat solutions to the Einstein field equations, say in the vacuum case, at spacelike and null infinity is controlled to a large extent by the behaviour of the Cauchy data near spacelike infinity on the initial hypersurface (cf. Friedrich 2018), so the aforementioned density results may also be interpreted in terms of (improved) smoothness of the spacetimes arising from the approximating data.

The same global gluing theorem for asymptotically flat data was also obtained, independently and among many other interesting results, in the monograph Chruściel and Delay (2003) that is devoted to a detailed study of the mapping properties, between pairs of Hilbertian weighted Sobolev spaces, of the constraint operators. Rather than sticking to a specific gluing region or model geometry, the authors analyse localised deformation problems in great generality and prove a series of statements (see, for instance, Theorem 3.7 therein) having an ample range of applicability. Their methods work equally well, at least in principle, for gluing problems both in the asymptotically flat and in the asymptotically hyperbolic context. In addition to the aforementioned result concerning Kerrian data, they have been exploited, for instance, to produce Maskit combinations of metrics or to prove density theorems for asymptotically hyperbolic data having a special structure outside a compact set (i.e., counterparts of Corvino 2000 and Corvino and Schoen 2006 for a totally different geometry at infinity, see in particular Chruściel and Delay (2009) for related later work by the same authors); we refer the reader to Sect. 8 of Chruściel and Delay (2003) for an impressive gallery of diverse applications.

Among them, one which connects to a network of other contributions we would like to mention is the construction of (asymptotically flat) N-body initial data sets for the Einstein equations. In the context of Newtonian gravity, a set of initial data for the evolution of N massive bodies can be obtained by solving a single Poisson equation in the complement of a finite number of compact domains (say balls) in \({\mathbb {R}}^{3}\), at least if the interior structure and dynamics of the bodies in question are neglected. The nonlinearity of the Einstein constraint equations makes such a task a lot harder, and gluing methods turn out to be an extremely useful and powerful tool in this respect as well. The setup is roughly as follows. Let us suppose that a finite collection of N asymptotically flat data sets \((M_{1}, g_{1}, k_1), \ldots , (M_{N}, g_{N}, k_N)\) is given and, for each value of the index i, let \(U_{i}\) denote a regular subdomain of \(M_{i}\) (we may think of the complement of a large ball in the asymptotic coordinates of a given end). Moreover, let \(x_{1},\ldots , x_{N}\in {\mathbb {R}}^d\) be N vectors which are supposed to prescribe the location of the regions \(U_{1}, \ldots , U_{N}\) with respect to a flat background. Then, using e.g., the methods in Corvino (2000) (for the time-symmetric case) or Corvino and Schoen (2006)/Chruściel and Delay (2003) in general, one can construct an asymptotically flat data set \((M,g,k)\) solving the vacuum constraints (if the same was true for the given data), containing N regions that are isometric to the given bodies (represented by \(U_i\subset M_i\)), with the same second fundamental form and with the centers of such bodies in a configuration which is a scaled version of the chosen configuration.
Some preliminary results in this direction were indeed already given in Chruściel and Delay (2003) (although restricting to rather symmetric configurations), while refinements were then obtained (crucially exploiting the presence of multiple ends) in Chruściel et al. (2005) and then in Chruściel et al. (2010a) (handling the time-symmetric case) and Chruściel et al. (2011) (handling the general case). We wish to stress that in the last two papers, the data that are constructed satisfy, in addition to the three requirements above, the further one of being identical to a spacelike slice of a Kerr spacetime sufficiently far from the bodies in question. These constructions come with a scale parameter \(\epsilon \) which has to do with the mutual separation of the bodies, and it is possible to prove that for any \(\epsilon >0\) one can choose this separation scale of order \(\epsilon ^{-1}\) so that the energy-momentum of the resulting data set is \(\epsilon \)-close to the sum of those associated to the given bodies.

In fact, as we will explain below, the general gluing theorem proven in Chruściel and Delay (2003) is also invoked (much more recently) in Chruściel and Delay (2018) to construct exotic hyperbolic data (through a gluing scheme happening in a non-compact domain near a point at infinity on the conformal boundary), which was then key to prove an unconditional (Riemannian) hyperbolic positive mass theorem, Chruściel and Delay (2019); see Appendix C for the corresponding discussion and contextualisation.

In yet another direction, we mention the construction (which again builds on these gluing methods) of asymptotically simple spacetimes given in Chruściel and Delay (2002) (see also Anderson and Chruściel 2005 for later contributions on the same theme). More precisely, building on a stability theorem by Friedrich (see both Friedrich 1986 and Friedrich and Schmidt 1987) here the authors prove, for the first time, the existence of nontrivial, vacuum, spacetimes which (i) admit conformal compactifications at null infinity with a high degree of differentiability and (ii) have global future null infinity \({\mathscr {I}}^{+}\). This specific application shows, quite clearly, how these gluing methods (which occur purely at the level of initial data sets) may shed light on patently global/hyperbolic aspects of general relativity. Along this same conceptual line, we would like to point out the amazing result in Li and Yu (2015), where the machinery in Corvino and Schoen (2006) is crucially exploited to construct a complete, asymptotically flat Cauchy initial data set for the vacuum Einstein field equations, without trapped surfaces, but leading to the formation of trapped surfaces along the future evolution.

Lastly, as a partial exception to the expository agreements we stipulated at the end of the introduction, we wish to mention the recent work Daszuta and Frauendiener (2019), which represents a first, significant step in the direction of implementing numerical methods based on gluing techniques (in this case: directly inspired by Corvino 2000), so as to produce (or, more precisely, approximate) important classes of data that would hardly be realised otherwise. The reader may also wish to consult references therein for related contributions and perspectives on numerical gluing methodologies.

Localised solutions of the Einstein constraints

A few years ago, in an effort to better understand (the limits of validity of) certain rigidity phenomena involving minimal surfaces (cf. Carlotto et al. 2016), Schoen and the author of the present review came up with a rather surprising gluing construction, whose outcome is a class of localised solutions to the Einstein constraints. Loosely speaking, we can obtain asymptotically flat initial data sets that have positive ADM mass but are exactly trivial outside a cone of given angle. The scheme that we develop then allows one to produce a new class of N-body solutions for the Einstein equations, which exhibit the phenomenon of gravitational shielding.

There are various expositions of this work in the literature, and we certainly recommend comparing our presentation with the beautiful article Chruściel (2019) based on the Séminaire Bourbaki on that subject. The perspective we shall embrace here is partly different, although certain expository analogies are unavoidable. In Sect. 3.4, we will then discuss how to combine the results in Corvino (2000) with those in Carlotto and Schoen (2016) so as to produce a novel class of solutions, which we believe might be of independent interest.

In order to contextualise the construction we are about to present, let us first make a short digression on the phenomenon of shielding for isolated systems in Newtonian gravity. Given certain compactly supported sources (i.e., massive bodies) in Euclidean \({\mathbb {R}}^3\), an interesting question one may ask is whether there exist (for non-trivial mass distributions) open regions where no gravitational forces are measured at all. It is not hard to see that this cannot be the case for any finite distribution of point masses, but on the other hand one can definitely construct explicit examples of continuous distributions where this shielding does occur. For instance, if we consider the radially symmetric mass density function given by

$$\begin{aligned} \varrho = {\left\{ \begin{array}{ll} \varrho _0 &{} \text {if} \ \ \ a\le |x|\le b \\ 0 &{}\text {if} \ \ \ |x|< a, |x|>b \end{array}\right. } \end{aligned}$$

for positive numbers \(a<b\), and \(\varrho _0>0\) constant, then the gravitational field is radial and (being attractive) inward-pointing, its magnitude being equal to

$$\begin{aligned} {\left\{ \begin{array}{ll} 0 &{\quad} \text {if} \ \ \ |x|\le a \\ \frac{4\pi \varrho _0(|x|^3-a^3)}{3|x|^2} &{\quad} \text {if} \ \ \ a\le |x|\le b \\ \frac{m}{|x|^2} &{\quad} \text {if} \ \ \ |x|\ge b \end{array}\right. } \end{aligned}$$

where we have set \(m=\frac{4\pi \varrho _0}{3}\left( b^3-a^3\right) \), the total mass of this distribution. Correspondingly, the potential \(\varphi \) is constant in the ball centered at the origin and whose radius equals a. However, the potential will not be constant in the unbounded domain \(|x|>b\): this in fact follows from a more general principle, namely the absence of (long-range) gravitational shielding in Newtonian gravity. Indeed, if \(\varphi \) is the potential describing an isolated gravitational system (with the standard normalisation that \(\varphi \rightarrow 0\) as \(|x|\rightarrow \infty \)) then it is standard to prove that the leading order in the asymptotic expansion of the potential is precisely \(-m/|x|\) where \(m>0\) is the total gravitational mass of the sources. In particular, outside a suitably large compact set this term will dominate the others in the expansion, and thus the associated gravitational field will be measured to be non-zero outside such a set. So, in this specific sense, Newtonian gravity does not allow for shielding unless there are no sources at all. We will now present some asymptotically flat manifolds of positive ADM mass that contain large (in fact: scaling-invariant) regions where no gravity is perceived, as encoded by the fact that the metric there is exactly Euclidean.
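
As a consistency check on the Newtonian example above (a sketch, assuming units in which Newton's constant equals 1, as the expression \(m/|x|^2\) suggests, and the convention that the field equals \(-\nabla \varphi \)), integrating the field magnitude inward from infinity with \(\varphi \rightarrow 0\) gives the explicit potential

$$\begin{aligned} \varphi (x)= {\left\{ \begin{array}{ll} -2\pi \varrho _0(b^2-a^2) &{} \text {if} \ \ \ |x|\le a \\ \frac{2\pi \varrho _0}{3}|x|^2+\frac{4\pi \varrho _0a^3}{3|x|}-2\pi \varrho _0b^2 &{} \text {if} \ \ \ a\le |x|\le b \\ -\frac{m}{|x|} &{} \text {if} \ \ \ |x|\ge b \end{array}\right. } \end{aligned}$$

which is continuous across \(|x|=a\) and \(|x|=b\), constant inside the cavity, and equal to \(-m/|x|\) (hence nowhere locally constant) in the exterior region.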

The setup of this construction is as follows. First of all, we are assigned an asymptotically flat initial data set \((M,g,k)\) where the dimension of M is any \(n\ge 3\); in particular M may have non-trivial topology (namely: we shall not require it to be diffeomorphic to \({\mathbb {R}}^n\)) and multiple ends. The gluing scheme is local to any given end so these aspects do not, in any way, complicate the argument. Given a point a far away in the asymptotic region (of an end of M) and two angles \(0<\theta _1<\theta _2<\pi \), we consider the region \(\varOmega \) between the two coaxial cones having vertex a and apertures \(\theta _1\) and \(\theta _2\) (understood in the coordinate chart at infinity). Actually, to avoid technical issues related to the presence of a singularity at the tip of the cones we rather regularise \(\varOmega \) (without renaming) so that it becomes the region bounded by a pair of smooth disjoint hypersurfaces \(\varSigma _1\) and \(\varSigma _2\) that coincide with the given cones outside, say, the coordinate ball centered at a and of radius one. We stress the fact that the specific form of the regularisation does not play any role in the construction, so that the only actual requirement is that \(\varOmega \) be dilation-invariant outside a compact set (and, in fact, we do not really specify the form of the regularisation in Carlotto and Schoen 2016). In any event, it follows that \(M\setminus (\varSigma _1\cup \varSigma _2)\) is a disjoint union of three regions, the inner region \(\varOmega _I (a)\) (corresponding to \(\varOmega '_1\) in the notation of Sect. 3.2), the transition region \(\varOmega \), and the outer region \(\varOmega _O(a)\) (corresponding to \(\varOmega '_2\) in the notation of Sect. 3.2).

These preliminaries being given, here is a slightly informal statement of our main theorem:

Theorem 3.9

Assume that we are given an initial data set \((M,{\check{g}},{\check{k}})\) satisfying the vacuum Einstein constraints, such that \((M, {\check{g}})\) is a complete asymptotically flat manifold of dimension \(n\ge 3\) with \({\check{g}}_{ij}(x)=\delta _{ij}+O(|x|^{-{\check{p}}})\) and \({\check{k}}\) is a symmetric (0, 2) tensor with \({\check{k}}_{ij}(x)=O(|x|^{-{\check{p}}-1})\), for \(\frac{n-2}{2}<{\check{p}}\le n-2\).

Given two distinct angles \(\theta _{1}, \theta _{2}\) less than \(\pi \), as well as \(p\in (\frac{n-2}{2},{\check{p}})\), there exists \(a_{\infty }\) so that for any \(a\in {\mathbb {R}}^{n}\) such that \(\left| a\right| \ge a_{\infty }\) we can find a metric \({\hat{g}}\) and a symmetric (0, 2)-tensor \({\hat{k}}\) so that \((M,{\hat{g}},{\hat{k}})\) satisfies the vacuum Einstein constraint equations,

$$\begin{aligned} {\hat{g}}_{ij}=\delta _{ij}+O(|x|^{-p}), \quad {\hat{k}}_{ij}=O(|x|^{-p-1}) \end{aligned}$$

and

$$\begin{aligned} ({\hat{g}}, {\hat{k}})={\left\{ \begin{array}{ll} ({\check{g}}, {\check{k}}) &{} \text {in } \varOmega _{I}(a) \\ (\delta , 0) &{} \text {in } \varOmega _{O}(a). \end{array}\right. } \end{aligned}$$

More accurately, there are basically two versions of the theorem above, which might be for brevity referred to as finite regularity version and infinite regularity version. In the finite regularity version, we assume \({\check{g}}\in C^{\ell ,\alpha }_{\mathrm{loc}}, {\check{k}}\in C^{\ell -1,\alpha }_{\mathrm{loc}}\) and the decay is also understood in weighted Hölder spaces, namely

$$\begin{aligned} {\check{g}}_{ij}=\delta _{ij}+O^{\ell ,\alpha }(|x|^{-p}), \ \ {\check{k}}_{ij}=O^{\ell ,\alpha }(|x|^{-p-1}) \end{aligned}$$

and we produce data \({\hat{g}} \in C^{\ell -2,\alpha }_{\mathrm{loc}},{\hat{k}} \in C^{\ell -2,\alpha }_{\mathrm{loc}}\) whose decay is also encoded in the same fashion, so that

$$\begin{aligned} {\hat{g}}_{ij}=\delta _{ij}+O^{\ell -2,\alpha }(|x|^{-p}), \ \ {\hat{k}}_{ij}=O^{\ell -2,\alpha }(|x|^{-p-1}). \end{aligned}$$

In the infinite regularity version, we simply take smooth data with smooth decay as one can get, formally, by taking \(\ell =\infty \) and \(\alpha =0\) in the equation above.

The general strategy of the proof goes as follows. Given the region \(\varOmega \) we just described, we let g be a Riemannian metric obtained by interpolating \({\check{g}}\) and \(\delta \) in the region \(\varOmega \) (therefore g agrees with \({\check{g}}\) in the connected component of the complement of \(\varOmega \) which contains the compact core of the manifold M and agrees with \(\delta \) in the other connected component) and we perform the same construction for k, which interpolates between \({\check{k}}\) and the trivial symmetric tensor. Here one employs an angular cutoff function \(\chi \) which decays on approach to \(\varSigma _{2}\) and such that \(1-\chi \) decays on approach to \(\varSigma _{1}\). Observe that \((g,k)\) satisfies the very same decay properties as \(({\check{g}},{\check{k}})\) even though it will not in general be a solution of the Einstein constraint equations. To deform the data in the transition region, so as to obtain a new solution, we perform a localised deformation scheme. In order for such a strategy to work, one needs to define (rather complicated) function spaces so that the formal adjoint of the linearised constraint operator satisfies a suitable coercivity estimate. Said otherwise, rather than handling the cokernel (i.e., first solving the equation orthogonally to the cokernel and then taking care of a finite dimensional subspace) we engineer doubly weighted spaces where the adjoint operator has no kernel at all. Ultimately, this depends on the explicit and complete understanding of the kernel at the trivial data (so for Euclidean background metric and null second fundamental form).

Given data \((M,g,k)\), and recalling the definition of the momentum tensor \(\pi \), let us write the vacuum Einstein constraint equations in the compact form

$$\begin{aligned} \varPhi (g,\pi )=0. \end{aligned}$$

We let \(D\varPhi _{(g,\pi )}(h,\omega )\) denote the differential of the constraint map, and \(D\varPhi ^{*}_{(g,\pi )}(u,Z)\) denote its adjoint, which is defined by means of the equation

$$\begin{aligned} \int _{\varOmega }\left[ D\varPhi _{(g,\pi )}(h,\omega ) \cdot _{g}(u,Z)\right] \,dV_g=\int _{\varOmega }\left[ (h,\omega )\cdot _{g}D \varPhi ^{*}_{(g,\pi )}(u,Z)\right] \,dV_g. \end{aligned}$$

With this notation at hand, the crucial tool to solve the linear problem is the following result.

Proposition 3.10

Let \(n\ge 3\) and for any set of data \((M,{\check{g}},{\check{k}})\) as in the statement of Theorem 3.9 let \((M,g,k)\) be the interpolating triple defined above. Fix a real number \(p\in (\frac{n-2}{2},{\check{p}})\) such that \(p\ne n/2\) if \(n\ge 5\). There exist constants \(a_{\infty ,L}\) and C (depending only on \(g, k, \theta _{1}, \theta _{2}, p\)) such that uniformly for \(|a|>a_{\infty , L}\)

$$\begin{aligned} \left\| \left( u,Z\right) \right\| _{H_{2,-n+p+2,\rho }\times H_{1,-n+p+2,\rho }}\le C \left\| D\varPhi ^{*}_{\left( g,\pi \right) }(u,Z)\right\| _{H_{0,-n+p,\rho }\times H_{0,-n+p+1,\rho }}. \end{aligned}$$

Here (and below) \(H_{k,-q,\rho }\) denotes the Hilbertian Sobolev space of functions, or tensors, that are square-integrable with respect to a smooth radial weight bounded from above and below by the q-th power of the distance from the point a and an angular weight \(\rho \) (similarly for \(H_{k,-q,\rho ^{-1}}\) with \(\rho ^{-1}\) in lieu of \(\rho \)), and correspondingly for all derivatives up to order k; for technical reasons the radial weight is given by powers of a function r that is bounded from below by a positive constant and equals the distance from a outside of the unit ball centered at a. As we did above, for the sake of notational simplicity we do not employ different letters for spaces of tensors of different types.

From the previous proposition one easily gets a general solvability result for the linearised constraint in doubly weighted spaces. In the same setting as in the statement above, one considers the functional

$$\begin{aligned} J(u,Z)&= \int _{\varOmega }\biggl \{\frac{1}{2}\left[ \left| {D \varPhi ^{*}}^{(1)}_{\left( g,\pi \right) }(u,Z)\right| ^{2}r^{n-2p}\rho +\left| {D \varPhi ^{*}}^{(2)}_{\left( g,\pi \right) }(u,Z)\right| ^{2}r^{n-2p-2} \rho \right] \\&\quad -(f,V)\cdot _{g}(u,Z)\biggr \}\,dV_g \end{aligned}$$

where we are taking

$$\begin{aligned} f\in H_{0, -p-2,\rho ^{-1}}, \ \ V\in H_{0,-p-2,\rho ^{-1}}. \end{aligned}$$

If we let

$$\begin{aligned} h=r^{n-2p}\rho \ ({D\varPhi ^{*}}^{(1)}_{(g,\pi )}(u,Z)) \ \ \ \omega =r^{n-2p-2}\rho \ ({D\varPhi ^{*}}^{(2)}_{(g,\pi )}(u,Z)) \end{aligned}$$

then the Euler–Lagrange equation for this functional takes the form

$$\begin{aligned} D\varPhi _{(g,\pi )}(h,\omega )-(f,V)=0. \end{aligned}$$
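
To see why this is the Euler–Lagrange equation, one may compute the first variation of J at \((u,Z)\) in a direction \((\eta ,W)\). What follows is a formal sketch: the pair \((\eta ,W)\) is notation introduced here only for illustration, and it is assumed to be smooth and compactly supported in \(\varOmega \), so that no boundary terms arise. By linearity of the adjoint operator and the definition of h and \(\omega \),

$$\begin{aligned} \frac{d}{dt}\Big |_{t=0}J(u+t\eta ,Z+tW)&=\int _{\varOmega }\left[ (h,\omega )\cdot _{g}D\varPhi ^{*}_{(g,\pi )}(\eta ,W)-(f,V)\cdot _{g}(\eta ,W)\right] \,dV_g\\&=\int _{\varOmega }\left[ D\varPhi _{(g,\pi )}(h,\omega )-(f,V)\right] \cdot _{g}(\eta ,W)\,dV_g, \end{aligned}$$

where the second equality is the defining relation of the adjoint. Requiring this expression to vanish for every such \((\eta ,W)\) yields the equation displayed above.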

Moreover, we see at once that h and \(\omega \) decay according to what we claimed in the statement of Theorem 3.9. Here is the corresponding existence result.

Proposition 3.11

Let \(n\ge 3\), let \(\frac{n-2}{2}<p<{\check{p}}\) with \(p\ne n/2\) if \(n\ge 5\) and assume that the vertex a satisfies the inequality \(|a|>a_{\infty ,L}\) (with \(a_{\infty ,L}\) as in the previous proposition). For any \((f, V)\in H_{0, -p-2,\rho ^{-1}}\times H_{0,-p-2,\rho ^{-1}}\) there exists a unique \((u_{*},Z_{*})\in H_{2,-n+p+2,\rho }\times H_{1, -n+p+2,\rho }\) which minimises the functional J on the Hilbert space \(H_{2,-n+p+2,\rho }\times H_{1, -n+p+2,\rho }\). In particular, for any such pair \((f,V)\) the linearised constraint equation

$$\begin{aligned} D\varPhi _{(g,\pi )}(h,\omega )-(f,V)=0 \end{aligned}$$

is solvable in \(H_{0,-p,\rho ^{-1}}\times H_{0,-p-1,\rho ^{-1}}\).

The proof of the coercivity estimate above, Proposition 3.10, requires a considerable amount of delicate work (mostly in dealing with the Lie derivative operator, which arises in the second component of the adjoint i.e., \({D\varPhi ^{*}}^{(2)}_{(g,\pi )}(u,Z)\)). We find it appropriate to just present it in the time-symmetric case, so as to provide the reader with a clear idea of the reason why such a result should indeed be expected to hold. In that special case, one wants to prove that

$$\begin{aligned} \left\| u\right\| _{H_{2,-n+p+2,\rho }}\le C\Vert L^{*}_{g}u\Vert _{H_{0,-n+p,\rho }} \end{aligned}$$

for all \(u\in H_{2,-n+p+2,\rho }\).

Proof of Proposition 3.10 in the time-symmetric case

We first note that, once it is proven that a linear bounded operator \(T:H_{k_{1},-q_{1}}\rightarrow H_{k_{2},-q_{2}}\) satisfies a functional inequality of the form

$$\begin{aligned} \left\| f\right\| _{H_{k_{1},-q_{1}}}\le C\left\| Tf\right\| _{H_{k_{2},-q_{2}}} \end{aligned}$$

one can employ the aforementioned coarea-type formula (cf. Lemma 4.1 in Carlotto and Schoen 2016) to obtain that in fact

$$\begin{aligned} \left\| f\right\| _{H_{k_{1},-q_{1},\rho }}\le C \left\| Tf\right\| _{H_{k_{2},-q_{2},\rho }}. \end{aligned}$$

As a result, we can work with singly weighted Sobolev spaces. That being said, the key point is to prove the following (Euclidean) Poincaré inequalities: given any \(q\in (0,(n-2)/2)\), such that \(q\ne (n-4)/2\) if \(n\ge 5\), we have

$$\begin{aligned} \left\| u\right\| _{H_{1,-q}}\le C \left\| \nabla u\right\| _{H_{0,-q-1}}, \ \ \ \Vert u\Vert _{H_{2,-q}}\le C\Vert \mathrm {Hess}_{\delta } u\Vert _{H_{0,-q-2}}. \end{aligned}$$

We consider the function \(v=|x|^{2-n+2q}\) and observe that for \(|x|\ne 0\) we have

$$\begin{aligned} \varDelta _{\delta } v=2q(2-n+2q)|x|^{-n+2q} \end{aligned}$$

so when u has bounded support, as we can assume (thanks to a standard density argument), one can integrate by parts to obtain

$$\begin{aligned} \int _{\varOmega \setminus B_1(0)}u^2\varDelta _{\delta } v\ dV_{\delta }=(n-2-2q)\int _{\partial B_1(0)\cap \varOmega }u^2|x|^{1-n+2q}\ dA_{\delta }-2\int _{\varOmega \setminus B_1(0)} u\nabla u \cdot \nabla v\ dV_{\delta } \end{aligned}$$

where we have exploited the fact that \(\varOmega \) is a cone outside of \(B_1(0)\), so that the other boundary terms there vanish. Hence, it follows from the sign of the boundary term and the Cauchy-Schwarz inequality that

$$\begin{aligned} \int _{\varOmega \setminus B_1(0)}u^2r^{-n+2q}\ dV_{\delta }\le C\int _{\varOmega \setminus B_1(0)}|\nabla u|^2 r^{2-n+2q}\ dV_{\delta }. \end{aligned}$$
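
To spell out this step (a sketch; the shorthand A and B below is introduced here for illustration and is not taken from the source), set \(A=\int _{\varOmega \setminus B_1(0)}u^2r^{-n+2q}\ dV_{\delta }\) and \(B=\int _{\varOmega \setminus B_1(0)}|\nabla u|^2 r^{2-n+2q}\ dV_{\delta }\). Since \(n-2-2q>0\), one has \(\varDelta _{\delta }v=-2q(n-2-2q)|x|^{-n+2q}\) and \(|\nabla v|=(n-2-2q)|x|^{1-n+2q}\), and \(r=|x|\) outside the unit ball, so the integration by parts identity can be rearranged as

$$\begin{aligned} 2q(n-2-2q)A+(n-2-2q)\int _{\partial B_1(0)\cap \varOmega }u^2\ dA_{\delta }&=2\int _{\varOmega \setminus B_1(0)}u\nabla u\cdot \nabla v\ dV_{\delta }\\&\le 2(n-2-2q)\int _{\varOmega \setminus B_1(0)}|u||\nabla u|r^{1-n+2q}\ dV_{\delta }\\&\le 2(n-2-2q)A^{1/2}B^{1/2}. \end{aligned}$$

Discarding the non-negative boundary term and dividing by \(2(n-2-2q)A^{1/2}\) (when \(A\ne 0\)) gives \(A\le q^{-2}B\), so that in this step one may take \(C=q^{-2}\).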

Let now \(\zeta \) be a smooth cutoff function with support in \(B_2(0)\) and with \(\zeta =1\) on \(B_1(0)\). A standard Poincaré inequality gives

$$\begin{aligned} \int _{\varOmega }(\zeta u)^2\ dV_{\delta }\le C\int _{\varOmega }|\nabla (\zeta u)|^2\ dV_{\delta }, \end{aligned}$$

which patently implies

$$\begin{aligned} \int _{\varOmega \cap B_1(0)}u^2r^{-n+2q}\ dV_{\delta }\le C\int _{\varOmega \cap B_2(0)}|\nabla u|^2 r^{2-n+2q}\ dV_{\delta }+C\int _{\varOmega \setminus B_1(0)}u^2r^{-n+2q}\ dV_{\delta } \end{aligned}$$

since r is bounded above and below by positive constants on \(B_2(0)\). Combining this inequality with the previous one, we obtain

$$\begin{aligned} \int _{\varOmega }u^2r^{-n+2q}\ dV_{\delta }\le C\int _{\varOmega }|\nabla u|^2 r^{2-n+2q}\ dV_{\delta }, \end{aligned}$$

thus the first estimate in the statement follows at once.

To justify the second one, we now apply the previous argument to each partial derivative of u. We may use essentially the same argument with the function \(v=|x|^{4-n+2q}\), provided \(q\ne (n-4)/2\). The two cases when \(4-n+2q<0\) and \(4-n+2q>0\) work similarly, just with a sign reversal. In particular, in both cases the boundary term one gets integrating by parts can be thrown away and we end up showing

$$\begin{aligned} \int _\varOmega |\nabla u|^2r^{2-n+2q}\ dV_{\delta }\le C\int _\varOmega |\mathrm {Hess}_{\delta }u|^2r^{4-n+2q}\ dV_{\delta } \end{aligned}$$

which allows one to complete the proof of (3.13).

In particular, our assumptions on p imply that

$$\begin{aligned} \Vert u\Vert _{H_{2,-n+p+2}}\le C\Vert \mathrm {Hess}_{\delta }u\Vert _{H_{0,-n+p}}. \end{aligned}$$
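
The bookkeeping behind this implication can be made explicit as follows (a sketch, assuming that the second Poincaré inequality above is applied with the substitution \(q=n-p-2\), which is our reading of the argument):

$$\begin{aligned} \frac{n-2}{2}<p<n-2 \ \Longleftrightarrow \ 0<q<\frac{n-2}{2}, \qquad p\ne \frac{n}{2} \ \Longleftrightarrow \ q\ne \frac{n-4}{2}, \qquad H_{2,-q}=H_{2,-n+p+2}, \quad H_{0,-q-2}=H_{0,-n+p}. \end{aligned}$$

Note that the value \(p=n/2\) lies in the admissible interval \((\frac{n-2}{2},n-2)\) only when \(n\ge 5\), which is consistent with the restriction imposed in Proposition 3.10.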

On the other hand, we have

$$\begin{aligned} \Vert \mathrm {Hess}_{\delta }u\Vert _{H_{0,-n+p}}\le C\Vert L_{\delta }^*(u)\Vert _{H_{0,-n+p}} \end{aligned}$$

since, recalling that we are working at the Euclidean metric (and will then deduce a general coercivity result by perturbation), we can simply take the trace in the definition of the operator \(L_{\delta }^{*}\) thereby obtaining

$$\begin{aligned} tr_{\delta }(L_{\delta }^*u)=(1-n)\varDelta _{\delta } u,\ \text{ or }\ \varDelta _{\delta } u=-\frac{1}{n-1}tr_{\delta }(L_{\delta }^*u), \end{aligned}$$

from which it follows that

$$\begin{aligned} \mathrm {Hess}_{\delta }u=L_{\delta }^*u -\frac{1}{n-1}tr_{\delta }(L_{\delta }^*u)\,\delta \end{aligned}$$

and therefore we get the desired estimate at the Euclidean metric. Note that in the estimates above the constant C can be taken uniform with respect to a.
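
For concreteness, the constant in this last step can be estimated pointwise. The following is a sketch, assuming the standard expression \(L_{\delta }^*u=\mathrm {Hess}_{\delta }u-(\varDelta _{\delta }u)\delta \) for the adjoint of the linearised scalar curvature at the flat metric (which is consistent with the trace identity above). Using \(|tr_{\delta }A|\le \sqrt{n}|A|\) for any symmetric (0, 2)-tensor A, together with \(|\delta |=\sqrt{n}\), one finds

$$\begin{aligned} |\mathrm {Hess}_{\delta }u|\le |L_{\delta }^*u|+\frac{1}{n-1}|tr_{\delta }(L_{\delta }^*u)||\delta |\le \left( 1+\frac{n}{n-1}\right) |L_{\delta }^*u|=\frac{2n-1}{n-1}|L_{\delta }^*u|. \end{aligned}$$

Squaring, multiplying by the radial weight and integrating over \(\varOmega \) then gives the norm inequality with \(C=\frac{2n-1}{n-1}\le \frac{5}{2}\) for \(n\ge 3\), a constant which indeed does not depend on a.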

From these Euclidean Poincaré-type inequalities, the claimed result follows by a simple approximation argument, keeping in mind the decay assumptions on \(({\check{g}},{\check{k}})\) (hence on \((g,k)\)). \(\square \)

In fact, from Proposition 3.11 it follows that one can define a linear, bounded operator

$$\begin{aligned} T: H_{0, -p-2,\rho ^{-1}}\times H_{0,-p-2,\rho ^{-1}}\rightarrow H_{0,-p,\rho ^{-1}}\times H_{0,-p-1,\rho ^{-1}} \end{aligned}$$

that associates to any given pair \((f,V)\) a solution (in fact: the pair \((h,\omega )\) built, as explained above, from the unique minimiser of the functional J) of the equation (3.11). This is the starting point for attacking the nonlinear problem, i.e., for proving Theorem 3.9. More precisely, we can set up an iterative scheme (again of Picard type, namely where the linearisation occurs at a fixed metric) which ultimately leads to a general solvability result for the Einstein constraints (1.2) (seen as a system in the region \(\varOmega \)) that applies to small data, where the smallness is encoded in terms of the norms of the specific Banach spaces we work with.

Now, the specific expression of the norms we employ is provided in equations (5.3) and (5.4) in Carlotto and Schoen (2016). Without opening a technical digression on this, we mention that both spaces in question are defined by means of doubly weighted, mixed Sobolev–Hölder norms that are tailor-made for the iteration to satisfy the necessary contractivity property that allows the scheme to converge. We further note that the necessary Hölder estimates ultimately appeal to the fundamental work Douglis and Nirenberg (1955), which essentially extends earlier work by Schauder (see e.g., Gilbarg and Trudinger 2001 or Ambrosio et al. 2018) to the case of elliptic systems (for a suitable definition of ellipticity).

Remark 3.12

For a variety of reasons, including e.g., the geometric implications we will briefly mention below, Theorem 3.9 was stated for the vacuum constraints, but we note that the whole construction would allow one (without any modification at all) to solve the constraints in the region \(\varOmega \) for data \(\mu , J\) that satisfy the aforementioned smallness condition. For instance, considering two triples \((M,g_1,k_1)\) and \((M,g_2,k_2)\) satisfying the sole decay assumptions in the statement above, namely (say, for \(\frac{n-2}{2}<p_1=p_2\le n-2\))

$$\begin{aligned} g_i(x)=\delta +O(|x|^{-p_i}), \ k_i(x)=O(|x|^{-p_i-1}), \ \ i=1,2 \end{aligned}$$

we can combine them into a triple \((M,g,k)\) that coincides with the former in the interior domain, with the latter in the exterior domain, and solves the constraints for right-hand side given by

$$\begin{aligned} \left\{ \begin{aligned} &\mu=\chi \mu _1 + (1-\chi ) \mu _2 \\& J= \chi J_1 + (1-\chi ) J_2 \end{aligned}\right. \end{aligned}$$

at all points of M. Here \(\chi \) is the same cutoff function we defined above, which smoothly transitions from the value 1 (attained in \(\varOmega _I\)) to the value 0 (attained in \(\varOmega _O\)), and \(\mu _1, J_1\) (respectively \(\mu _2, J_2\)) are now defined by the right-hand side of (1.2) for \((M,g_1,k_1)\) (respectively \((M,g_2,k_2)\)).

Yet, it should be noted that the pair \((\mu , J)\) may not, in general, satisfy any pre-assigned energy condition (such as e.g., the dominant energy condition) with respect to the glued metric g. A notable exception to this negative assertion is the time-symmetric case, where we take \(k_1=k_2=0\) and so we are just constructing a Riemannian metric g on the background manifold M, which interpolates between \(g_1\) and \(g_2\) and has scalar curvature \(\chi R_{g_1}+(1-\chi )R_{g_2}\). Of course, if \(g_1, g_2\) have non-negative scalar curvature then so will such a combination. This point, namely the fact that our approach allows for non-zero data on the right-hand side of the constraints, was also explicitly noted in the exposé Chruściel (2019).

We wish to add a few important comments on this gluing construction:

  1. (a)

    Among possible applications, these methods allow one to construct a new class of N-body initial data sets for the Einstein constraint equations. Recall the setup we described at the very end of Sect. 3.2: we are given a finite collection of N asymptotically flat data sets \((M_{1}, g_{1}, k_1), \ldots , (M_{N}, g_{N}, k_N)\). Here, for each value of the index i, we let \(U_{i}\) denote a non-compact, regular subdomain of \(M_{i}\) that coincides with a cone outside a ball: then, one can use Theorem 3.9 to construct an asymptotically flat data set \((M,g,k)\) which solves the vacuum constraints (if the same was true for the given data) and contains N regions that are isometric to the given bodies (represented by \(U_i\subset M_i\)), with the same second fundamental form and with the centers of such bodies in a configuration which is a scaled version of an assigned configuration, as specified by N given vectors. This (a priori delicate) construction is somewhat trivialised by the fact that the localisation scheme above transplants whatever given data to a Minkowskian background.

  2. (b)

    An important, and somewhat surprising, feature of our construction is that (with the notation as in the statement of Theorem 3.9) the ADM energy–momentum of \((M,{\hat{g}},{\hat{k}})\) converges to the ADM energy–momentum of the pre-assigned triple \((M,{\check{g}}, {\check{k}})\) as one lets \(a\rightarrow \infty \). This conclusion follows from the convergence of \({\hat{g}},{\hat{k}}\) to \({\check{g}}, {\check{k}}\) in suitable weighted Sobolev spaces, where the ADM energy–momentum acts as a continuous functional. In particular, the N-body solutions described above can be engineered so that their total mass is arbitrarily close to the sum of the ADM masses of the glued data \((M_1,g_1,k_1),\ldots , (M_N, g_N, k_N)\).

  3. (c)

    As a special case of the general construction, in the time-symmetric case one can design scalar-flat, asymptotically flat metrics on \({\mathbb {R}}^3\) that have positive ADM mass and are Euclidean on a half-space, say on \({\mathbb {R}}^2\times (0,\infty )\). Therefore, any such manifold contains plenty of stable minimal planes (any plane of the form \({\mathbb {R}}^2\times \left\{ z\right\} \) for \(z>0\)) and outlying, arbitrarily large, constant mean curvature spheres (any sphere of center (0, 0, z) and radius \(0<R<z\), for \(z>0\)). Both these conclusions should be contrasted with the rigidity theorems in Carlotto et al. (2016), concerning the large-scale structure of asymptotically Schwarzschildean manifolds (of positive mass), where such phenomena simply do not occur. We further note that the spheres defined above are not isoperimetric for the volume they enclose. In order to justify (and contextualise) this assertion, we need to open a brief digression. Over the past two decades considerable effort has been spent on recasting the notion of ADM mass (which one may simply regard as a suitable flux integral at infinity) in terms of the geometric properties of \((M,g)\). In particular, a milestone in this path was the proposal, by Huisken, of an isoperimetric notion of mass which does not involve derivatives of the metric at all. It was then proven in Chodosh et al. (2016) that in fact

    $$\begin{aligned} m_{{\text{ISO}}} = \lim _{V \rightarrow \infty } \frac{2}{A(V)} \left( V - \frac{A(V)^{3/2}}{6 \sqrt{\pi }} \right) \end{aligned}$$

    where A(V) denotes, roughly speaking, the least area of a domain whose volume equals V (the map \(V\mapsto A(V)\) is the so-called isoperimetric profile of the manifold in question). Clearly, this equality implies that, in the examples by Schoen and the author that we mentioned above, sufficiently large spheres in the Euclidean half-space, though evidently stable CMC surfaces, are not isoperimetric for the volume they enclose (for, if they were, we would reach the absurd conclusion that \(m=0\)).

  4. (d)

    An interesting aspect, which we have not described before, is the study of the evolution of data arising from this construction. As was remarked in Sect. 1.6 of Chruściel (2019), it follows from Bieri (2010) that one can design a subclass of initial data sets so that the corresponding space-times, obtained by solving the field equations (1.1), will exist globally (i.e., they are geodesically complete) when the data in question are small enough, in a suitable sense. Thus, it would be good to know what the precise asymptotic behaviour of these data is (or: may be), beyond say the rather basic information one can extract from the finite speed of propagation. Equally fascinating is also the question, suggested by one of the reviewers of the present survey, whether one can rigorously interpret the localised solutions above as being generated by gravitational radiation coming from the past. More precisely, we pose the following:

Open Problem 3.13

Design characteristic data, to be assigned on the union of past timelike and null infinity (and likely with a certain roughness), so that the associated spacetimes are geodesically complete and contain spacelike slices (Cauchy data) of the type described in the statement of Theorem 3.9.

In other words, we would like to generate these exotic data in a genuinely dynamical fashion. Among other things, a thorough understanding of these phenomena might in fact provide considerable insight into the non-linear interaction of gravitational radiation.

A natural question the reader may pose is whether the same gluing construction can be implemented in case of non-compact domains, that are not necessarily subject to the additional requirement of being scaling-invariant outside a compact set (i.e., essentially the region between two cones). In fact, while this assumption does play a role in our argument, it may well seem to be a purely technical issue, and that more refined methods should in fact bypass this obstacle. This may indeed be the case, but an important warning to be made here is that there are some definite restrictions to the gluing schemes one may aim at.

To explain this point let us recall, from the statement of the Riemannian positive mass theorem, that there cannot exist on \({\mathbb {R}}^3\) Riemannian metrics of non-negative scalar curvature and positive ADM mass which are exactly flat outside a compact set. In other words, there must be an unbounded region where the metric in question is not Euclidean and one may wonder, roughly speaking, how large such a region should be. Very concretely, one may wonder (for instance) whether it is possible to construct asymptotically flat metrics of positive mass being flat outside a cylinder or a slab of given height. In order to give a quantitative description of this problem, we introduced in Carlotto and Schoen (2016) the convenient definition of content at infinity of an asymptotically flat metric g, which measures (roughly speaking) the fraction of the area of large spheres that intersects the region where the metric in question is not Ricci-flat (see Footnote 8). Combining the rigidity part of the positive mass theorem with the formula expressing the ADM mass in terms of the Ricci curvature (proven e.g., in Miao and Tam 2016) one easily proves, for instance, that an asymptotically flat manifold, of dimension \(n\ge 3\), satisfying

$$\begin{aligned} R_g\ge 0, \ \ \ |{\mathrm {Ric}}_g|\le C|x|^{-n} \end{aligned}$$

is either flat or has positive content at infinity. In particular, this assertion does indeed rule out any chance of constructing non-trivial time-symmetric data having harmonic decay (by which we mean that \(g_{ij}(x)=\delta _{ij}+O_2(|x|^{-(n-2)})\)) that are localised in a cylinder or a slab, and roughly states that if an asymptotically flat metric g of non-negative scalar curvature is not trivial then the region where it is not Ricci-flat must contain a cone of positive aperture. Our gluing scheme provides a sort of partial converse to the previous statement, by ensuring that for any cone we can construct non-trivial data localised inside that cone. That being said, one may still ask the following general question:

Open Problem 3.14

Is the spacetime positive mass theorem with rigidity the only obstruction to the construction of asymptotically flat, localised vacuum initial data sets?

In somewhat more concrete terms, here we are wondering whether, for example, it is possible to construct asymptotically flat, scalar-flat metrics on \({\mathbb {R}}^3\) that are flat in an arbitrarily assigned regular region \(\varOmega \) under the sole assumptions that \({\mathbb {R}}^3\setminus \varOmega \) be unbounded and that the localisation requirement does not force any potential such metric to have vanishing mass. Of course, we further note that the question above can be trivialised if we allow for very slow decay at infinity, in particular if we go out of the range of applicability of the spacetime positive mass theorem with rigidity (which is strictly more restrictive than what is required to settle the inequality \(E\ge |P|\)), see Theorem B.10 and our related comments at the end of Appendix B.

This question is, to the best of our knowledge, completely open but, over the last few years, some interesting related contributions have appeared. In particular, we would like to mention here the construction of localised solutions of the linearised Einstein constraints given in Beig and Chruściel (2017), where indeed no constraints on the gluing regions need to be assumed. The idea behind this work is very beautiful in its simplicity. To illustrate it, let us first recall that the Maxwell constraint equations take, in the absence of sources, the form:

$$\begin{aligned} \left\{ \begin{aligned} &{\mathrm {div}}\,E=0 \\&{\mathrm {div}}\,B=0 \end{aligned}\right. \end{aligned}$$

hence (essentially by a special case of the Poincaré Lemma in \({\mathbb {R}}^3\)) there exist vector potentials \(\omega \) and A such that

$$\begin{aligned} E={\mathrm {curl}}(\omega ), \ B={\mathrm {curl}}(A). \end{aligned}$$

Here all differential operators are understood with respect to the flat background metric. As a result, given any open domain \(\varOmega \subset {\mathbb {R}}^3\) and any larger domain \({\tilde{\varOmega }}\) containing the closure \({\overline{\varOmega }}\), if we pick a cutoff function \(\chi \) which equals one on \(\varOmega \) and is compactly supported in \({\tilde{\varOmega }}\), we get that

$$\begin{aligned} E={\mathrm {curl}}(\chi \omega ), \ B={\mathrm {curl}}(\chi A) \end{aligned}$$

provides a new initial data set for the Maxwell system, which is indeed vanishing outside of the domain \({\tilde{\varOmega }}\). Keeping this toy model in mind, let us consider in a \({\mathbb {R}}^{1+3}\) Minkowskian background, a metric of the form \(g_{\mu \nu }=\eta _{\mu \nu }+h_{\mu \nu }\) under a perturbative assumption, which we may formally write as

$$\begin{aligned} |h_{\mu \nu }|+|\partial _{\sigma }h_{\mu \nu }| +|\partial _{\sigma }\partial _{\rho }h_{\mu \nu }|=O(\varepsilon ). \end{aligned}$$

The vacuum Einstein field equations take, in these (global) coordinates, the form

$$\begin{aligned} R_{\beta \gamma }=\frac{1}{2}[\partial _{\alpha }(\partial _{\beta } h^{\alpha }_{\gamma }+\partial _{\gamma }h^{\alpha }_{\beta } -\partial ^{\alpha }h_{\beta \gamma })-\partial _{\gamma } \partial _{\beta }h^{\alpha }_{\alpha }] +O(\varepsilon ^2). \end{aligned}$$

If we neglect the remainder term (more precisely, if we linearise at the Minkowski metric), in a suitable gauge the linearised field equations take the form of a hyperbolic evolution problem \(\square _{\eta } h_{\beta \gamma }=0\) with initial data sets of the form \((h_{ij},k_{ij}=\partial _0 h_{ij})\) subject to the constraints of both being TT tensors, i.e.,

$$\begin{aligned} h^{i}_i=\partial _i h^{i}_j=0, \ \ k^{i}_i=\partial _i k^i_j=0. \end{aligned}$$
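The gauge reduction just mentioned can be sketched as follows. The text only refers to 'a suitable gauge'; we assume here, for illustration, the de Donder condition \(\partial _{\alpha }h^{\alpha }_{\beta }=\frac{1}{2}\partial _{\beta }h^{\alpha }_{\alpha }\), and we write \(\square _{\eta }=\partial _{\alpha }\partial ^{\alpha }\). Neglecting the \(O(\varepsilon ^2)\) remainder in the expression for \(R_{\beta \gamma }\) above and substituting the gauge condition into the first two terms, one finds

$$\begin{aligned} R_{\beta \gamma }&=\frac{1}{2}\left[ \partial _{\beta }(\partial _{\alpha }h^{\alpha }_{\gamma })+\partial _{\gamma }(\partial _{\alpha }h^{\alpha }_{\beta })-\square _{\eta }h_{\beta \gamma }-\partial _{\gamma }\partial _{\beta }h^{\alpha }_{\alpha }\right] \\&=\frac{1}{2}\left[ \frac{1}{2}\partial _{\beta }\partial _{\gamma }h^{\alpha }_{\alpha }+\frac{1}{2}\partial _{\gamma }\partial _{\beta }h^{\alpha }_{\alpha }-\square _{\eta }h_{\beta \gamma }-\partial _{\gamma }\partial _{\beta }h^{\alpha }_{\alpha }\right] =-\frac{1}{2}\square _{\eta }h_{\beta \gamma }. \end{aligned}$$

Under this assumption the linearised vacuum equations \(R_{\beta \gamma }=0\) reduce to \(\square _{\eta }h_{\beta \gamma }=0\). Note that, at least for perturbations with \(h_{0\mu }=0\) (a simplification made here only for illustration), the TT conditions displayed above are compatible with the de Donder condition, since both the trace and the spatial divergence vanish.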

Here one recalls, from Beig (1997), that a symmetric, TT tensor on \({\mathbb {R}}^3\) always comes from a third-order potential, namely

$$\begin{aligned} h_{ml}=P(u)_{ml} \end{aligned}$$

for a suitable differential operator P. Therefore, it follows that given any open region \(\varOmega \subset {\mathbb {R}}^3\), every vacuum initial data set for the gravitational field \((h_{ij},k_{ij})\) can be deformed to a new vacuum initial data set \(({\tilde{h}}_{ij},{\tilde{k}}_{ij})\) which coincides with \((h_{ij},k_{ij})\) on \(\varOmega \) and vanishes outside a slightly larger set. In physical terms, the linearised gravitational field has been screened away outside of \(\varOmega \), and this by using gravitational fields only: no matter fields, whether with positive or negative density, are needed for this construction to be performed. See also, more recently, Beig and Chruściel (2020) for further developments along these lines.
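The screening mechanism can be made explicit in the Maxwell toy model. The following is a minimal sketch, in which the tilded field \({\tilde{E}}\) and the formula \({\tilde{h}}=P(\chi u)\) are our own shorthand for the construction described above, rather than notation taken from Beig and Chruściel (2017):

$$\begin{aligned} {\tilde{E}}:={\mathrm {curl}}(\chi \omega )=\chi \,{\mathrm {curl}}(\omega )+\nabla \chi \times \omega , \qquad {\mathrm {div}}\,{\tilde{E}}={\mathrm {div}}\,{\mathrm {curl}}(\chi \omega )=0. \end{aligned}$$

Thus \({\tilde{E}}\) coincides with the original field on \(\varOmega \) (where \(\chi =1\) and \(\nabla \chi =0\)), vanishes outside the support of \(\chi \), and satisfies the constraint everywhere because the divergence of a curl vanishes identically; the transition term \(\nabla \chi \times \omega \) is supported in \({\tilde{\varOmega }}\setminus \varOmega \). The gravitational case follows the same pattern with \({\tilde{h}}=P(\chi u)\), provided (as we assume here) that P maps arbitrary potentials to TT tensors, so that no additional condition on \(\chi u\) needs to be verified.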

More generally, we note that similar gluing schemes may in fact be applied to other physical fields satisfying suitable axioms (in particular the so-called higher spin fields, see Joudioux 2017). Any such construction builds, essentially, on two steps: the construction of an elliptic complex (replacing the De Rham complex), see Andersson et al. (2014), and then the explicit construction of general potentials.

To close this section, we would like to mention the construction of exotic hyperbolic gluings in Chruściel and Delay (2018), which is in the spirit of Theorem 3.9 but refers to a different model geometry at infinity (naturally arising, for instance, in presence of a negative cosmological constant). More precisely, the authors produce a new class of solutions to the Einstein constraint equations by gluing data on non-compact domains around a boundary point of certain conformally compact manifolds. A simple example to keep in mind for the gluing region is a half-annulus between two concentric balls in the upper half-space model for the hyperbolic space. We refer the reader to Theorem 3.3 and Theorem 3.7 therein for precise statements of the main results, which are somewhat technical; see also various later applications (most remarkably the Maskit-type combination theorem, Theorem 3.12, whose outcome should be compared to the content of Mazzeo and Pacard 2006 and the later Isenberg et al. 2010) and extensions to non time-symmetric counterparts, namely to initial data sets solving (1.2) with \(\mu =0, J=0\) but \(\varLambda <0\) (Theorem 3.10).

The main novel technical contributions of the paper are, essentially, coercivity estimates in weighted Sobolev spaces (in particular, a Poincaré-type estimate in Sect. 4 and a Korn-type estimate in Sect. 5) that ensure solvability (in these non-compact domains) of the constraint equations at a linear level. Such linear estimates are obtained by dividing the gluing domain into different regions which present different technical problems (related to the fact that one or both of the defining functions that naturally come up in describing these annuli may be close to zero or instead bounded away from zero). The nonlinear solvability then follows, as in the asymptotically flat setting, by a suitable implicit-function-theorem result. In fact, a non-secondary remark to be made is that the authors ultimately reduce to invoking the general solvability results they had proven in the earlier, aforementioned monograph Chruściel and Delay (2003) (i.e., they check that the hypotheses of Theorem 3.7 therein are satisfied).

It is quite unclear whether the results in Chruściel and Delay (2018) are optimal from the perspective of the decay of the data on approach to the conformal boundary, or whether the same technique can be employed to obtain even more general combinations of hyperbolic metrics. That being said, it is quite remarkable that such a gluing construction has recently allowed one to obtain an elegant proof of the (Riemannian) hyperbolic positive mass theorem, without dimensional restrictions and without requiring that the background manifold be spin (see Chruściel and Delay 2019). We do not open a digression on this point here, but rather refer the reader to Appendix C for a better overview of this specific topic.

There is of course a third model geometry, the spherical one, which we associate (for instance) to the presence of a positive cosmological constant and naturally connects to the structure of static solutions (hence to the cosmic no-hair conjecture, see Boucher et al. 1984 together with Ambrozio 2017 and Borghini and Mazzieri 2018 for recent developments on this theme). It may be natural to wonder whether constructions like those we described above can indeed be made to work, with interesting consequences, in that setting. More concretely, one may try to design a methodology to produce non-trivial Riemannian metrics on an n-dimensional hemisphere \(S^n_{+}\) that have scalar curvature equal to (more generally: greater than or equal to) a fixed positive constant and satisfy suitable boundary conditions. To that end, one may simply try to consider the interpolation through a radial cutoff function, in a spherical annulus (see Footnote 9) centered at a boundary point of an n-dimensional hemisphere \(S^n_{+}\), of the spherical metric \({\overline{g}}\) itself and a second metric g (which we postulate to be approaching, in a suitable sense compatible with the specific boundary conditions one considers, \({\overline{g}}\)) and then reimpose the constraints by solving a local deformation problem in the annulus in question.

Loosely speaking, the strategy above is plausible but the nature of the problem (and its degree of difficulty) depends a lot on the specific choice of boundary conditions. If we required, following Min-Oo (1998), that the metric to be produced be equal to the standard spherical one along the boundary, and that the boundary itself be totally geodesic, we stumble across some obstacles, corresponding to known rigidity results in this setting. For instance, we point out the following result from Brendle and Marques (2011).

Theorem 3.15

Suppose that g is a Riemannian metric on the hemisphere \(S^n_{+}\) satisfying the following properties:

  • \(R_g\ge n(n-1)\) at all points of \(S^n_{+}\) ;

  • \(g={\overline{g}}\) in the region \(\left\{ x\in S^n\subset {\mathbb {R}}^{n+1} \ : \ 0\le x^{n+1}\le 2/\sqrt{n+3}\right\} \).

If \(g-{\overline{g}}\) is sufficiently close to zero in \(C^2\) norm, then g equals \({\overline{g}}\) on \(S^n_{+}\).

Now, the reason why such a statement may come into play to obstruct the feasibility of certain gluing procedures (as described above) is easily explained. If we could make the gluing work in the terms explained above (the gluing region being determined by spherical shells around a boundary point p) we could then attach two copies of the resulting manifold with boundary so as to obtain a novel closed Riemannian manifold, which one could then cut along the equatorial sphere farthest from p. The outcome is then a hemisphere with a metric that is exactly spherical near the boundary, in fact everywhere except for a small neighbourhood of p (which now plays the role of north pole). Theorem 3.15 is then directly applicable and allows one to conclude, a posteriori, that the glued metric we started from was nothing but the standard spherical metric (hence the gluing produced a trivial outcome).

Note that the heuristic argument above, invoking Theorem 3.15, implies that to design a non-trivial gluing scheme one needs to either get out of the perturbative regime (i.e., not working in a \(C^2\)-small neighbourhood of the standard round metric) or perform the gluing in a region which is not small (more precisely: is not contained in a small neighbourhood of a boundary point). Thus the problem looks quite delicate. On the other hand, if one succeeded in making such a gluing scheme work, that would indeed provide a novel class of counterexamples to the Min-Oo conjecture, necessarily different from those obtained in Brendle et al. (2011). This is worth an explicit question:

Open Problem 3.16

Is it possible to design a novel class of counterexamples to the Min-Oo conjecture by means of a gluing scheme around a point on the boundary of the hemisphere?

Clearly, there can be also other interesting boundary conditions, so other ways of posing the gluing problem above. For instance, one may simply require (besides the condition on the scalar curvature) that the boundary mean curvature be zero, or non-negative at all points. This problem, which is of course much more flexible than the one above, is somewhat in the spirit of the study we shall describe in Sect. 4.3. In any event, we ultimately expect the answer to this question, i.e., the feasibility of these spherical gluing constructions, to depend a lot on the specific boundary conditions one imposes.

Another class of exotic solutions

A natural question one may ask is: what happens if we try to perform Corvino’s gluing construction (see Sect. 3.2) starting from the flat Euclidean metric? In other words, what happens in the zero mass case? Is it possible to smoothly glue the flat metric to an element of the Schwarzschild family?

We first note that if one tries to go through, line by line, the argument in Corvino (2000) things work fine as long as one solves the equation orthogonally to the cokernel, while evident obstacles arise in the finite-dimensional degree argument. More precisely, one can indeed take a rough patch, in annuli of very large radius, of the Euclidean metric with any \(g_{m,{\overline{x}}}\), and then perform a localised deformation in such annuli so as to get a metric whose scalar curvature has no component in the subspace \(K^{\perp }_{*}\), yet there is no obvious (degree-theoretic) argument showing that the reduction map \({\mathscr {I}}_{\sigma }\) should vanish at some point. Indeed, if we keep working with Schwarzschild metrics of positive mass (as we did above) then the four-dimensional box described at the end of Sect. 3.2 is not well-defined, because its center should be taken at \((0,0)\in {\mathbb {R}}\times {\mathbb {R}}^3\) and its half-sides obviously have positive length. Yet, the relentless reader may simply try to make a different attempt by now performing the whole construction considering Schwarzschild metrics having positive and negative mass, i.e., for \(m\in {\mathbb {R}}\). Even in this case it is readily seen that the estimates above for the components of \({\mathscr {I}}_{\sigma }\) are not good enough to close the argument (for there is not enough control on the image, through this map, of the sides defined by \(x^{i}=\pm \mu _i\)); in any event we also note that the gluing construction for \(m<0\) is a priori inevitably doomed to fail (as soon as one imposes the natural condition that \(\sigma >\max \left\{ -m,0\right\} +|{\overline{x}}|\) aimed at ensuring completeness of the glued metric) because it would otherwise violate the Riemannian positive mass theorem.

We shall present here a way around this problem, which relies on a combined application of the gluing methods we have presented in the previous two sections. The corresponding outcome, which we state below, has not appeared before in the literature (to the best of our knowledge).

Theorem 3.17

There exist smooth solutions of the vacuum Einstein constraints on \({\mathbb {R}}^3\) that are exactly Minkowskian inside a large coordinate ball and exactly equal to an element of the Kerr family outside a larger coordinate ball.

In the context of the present review, we will in fact construct time-symmetric solutions of the type claimed above. That is to say: we produce smooth, scalar-flat Riemannian metrics on \({\mathbb {R}}^3\) that are exactly equal to an element of the Schwarzschild family outside a compact set and yet contain large regions where they are exactly flat. Of course, this suffices to prove the statement but we note that, with very minor changes, one could equally well construct initial data with non-zero linear momentum. The details will be given elsewhere.


Proof

Employing the methods in Carlotto and Schoen (2016), it is possible to construct a smooth, scalar-flat Riemannian metric \({\hat{g}}\) on \({\mathbb {R}}^3\) that satisfies the decay bounds

$$\begin{aligned} {\hat{g}}(x)=\delta +O(|x|^{-p}) \end{aligned}$$

for any pre-assigned \(p\in (1/2,1)\), together with the parity conditions that

$$\begin{aligned} {\hat{g}}(x)={\hat{g}}(-x) \end{aligned}$$

satisfied for any \(x\in {\mathbb {R}}^3\), and the additional requirement of being exactly flat in a pre-assigned coordinate ball (which we can take as large as we wish).

Such metrics are simply obtained by localising any given scalar-flat metric on \({\mathbb {R}}^3\) of positive mass and then symmetrically gluing a second copy of the same localised solution in a Minkowskian background.

At this stage, one would like to apply a Corvino-type gluing argument to the pair \(({\mathbb {R}}^3, {\hat{g}})\) given above. The problem with this strategy is the fact that these input data do not, strictly speaking, satisfy the necessary decay assumptions, not even the more general ones indicated above in the first line of (3.10) (namely: the assumptions denoted by (AC) in Corvino and Schoen 2006). Hence, one needs to carefully go through those arguments to make sure that, in spite of the seemingly insufficient decay rate given in (3.14), the exact even-ness condition (3.15) allows one to compensate and close the proof.

Since we work in the time-symmetric case, we let \({\tilde{g}}_{\sigma }\) be the smooth metric obtained by smoothly interpolating between \({\hat{g}}_{\sigma }\) and an element of the four-dimensional Schwarzschild family; the interpolation happens between radii \(\sigma \) and \(2\sigma \) under the sole assumption that \(\sigma \ge \sigma _0\) where \(\sigma _0\) is chosen once and for all under the constraint that

$$\begin{aligned} \sigma _0/2>m+|{\overline{x}}| \end{aligned}$$

(we will later impose stronger restrictions on \(m>0\) and \({\overline{x}}\)). The interpolating function is obtained by scaling a given radial function that is monotone non-increasing, exactly equal to 1 near the sphere of radius one (and inside of it) and exactly equal to zero near the sphere of radius two (and outside of it). We note that for \(|x|>\sigma \) one has the bounds

$$\begin{aligned} |(g_{m,{\overline{x}}}-\delta )_{ij}|\le \frac{Cm}{|x|}, \ |(g_{m,{\overline{x}}}-\delta )_{ij,k}|\le \frac{Cm}{|x|^2}, \ |(g_{m,{\overline{x}}}-\delta )_{ij,k\ell }|\le \frac{Cm}{|x|^3} \end{aligned}$$

which hold uniformly under the sole hypotheses that

$$\begin{aligned} |x-{\overline{x}}|>|x|/2, \ \ \frac{m}{\sigma }<1 \end{aligned}$$

which are both implied by our earlier assumption (3.16). The metric \({\tilde{g}}_{\sigma }\) is easily seen to satisfy the following decay bounds, which hold uniformly in the parameters under (3.16):

$$\begin{aligned} |({\tilde{g}}_{\sigma }-\delta )_{ij}|=O(|x|^{-p}), \ |({\tilde{g}}_{\sigma }-\delta )_{ij,k}|=O(|x|^{-p-1}), \ |({\tilde{g}}_{\sigma }-\delta )_{ij,k\ell }|=O(|x|^{-p-2}) \end{aligned}$$

thus for the scalar curvature we have

$$\begin{aligned} R_{{\tilde{g}}_{\sigma }}=O(|x|^{-p-2}). \end{aligned}$$
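Returning briefly to the Schwarzschild bounds displayed above, the zeroth-order estimate can be checked by hand. As a sketch, assume (the explicit formula is not restated in this section) that \(g_{m,{\overline{x}}}\) is the Schwarzschild metric in isotropic coordinates, \(g_{m,{\overline{x}}}=\left( 1+\frac{m}{2|x-{\overline{x}}|}\right) ^4\delta \), and set \(t=m/(2|x-{\overline{x}}|)\), a symbol introduced here only for this computation. For \(|x|>\sigma \) the two hypotheses give \(0<t<m/|x|<m/\sigma <1\), hence

$$\begin{aligned} |(g_{m,{\overline{x}}}-\delta )_{ij}|=\left( (1+t)^4-1\right) \delta _{ij}=t\left( 4+6t+4t^2+t^3\right) \delta _{ij}\le 15\,t\le \frac{15\,m}{|x|}. \end{aligned}$$

Both hypotheses do follow from (3.16): since \(\sigma \ge \sigma _0\) and \(|x|>\sigma \), one has \(|{\overline{x}}|<\sigma _0/2<|x|/2\), so that \(|x-{\overline{x}}|\ge |x|-|{\overline{x}}|>|x|/2\), while \(m<\sigma _0/2<\sigma \). The derivative bounds are obtained in the same way, each derivative costing one further power of \(|x|\).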

If we then scale back to unit scale, namely if we now set

$$\begin{aligned} g_{\sigma }(x)={\tilde{g}}_{\sigma }(\sigma x) \end{aligned}$$

we get

$$\begin{aligned} |(g_{\sigma }-\delta )_{ij}|=O(\sigma ^{-p}), \ |(g_{\sigma }-\delta )_{ij,k}|=O(\sigma ^{-p}), \ |(g_{\sigma }-\delta )_{ij,k\ell }|=O(\sigma ^{-p}) \end{aligned}$$


$$\begin{aligned} R_{g_{\sigma }}=O(\sigma ^{-p}). \end{aligned}$$
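The loss of the extra powers of \(|x|\) under rescaling can be traced explicitly. As a sketch, for x in the rescaled gluing region \(1\le |x|\le 2\) (so that \(|\sigma x|\) is comparable to \(\sigma \)) the chain rule gives

$$\begin{aligned} (g_{\sigma })_{ij,k}(x)&=\sigma \,({\tilde{g}}_{\sigma })_{ij,k}(\sigma x)=\sigma \cdot O(\sigma ^{-p-1})=O(\sigma ^{-p}), \\ (g_{\sigma })_{ij,k\ell }(x)&=\sigma ^2\,({\tilde{g}}_{\sigma })_{ij,k\ell }(\sigma x)=\sigma ^2\cdot O(\sigma ^{-p-2})=O(\sigma ^{-p}), \\ R_{g_{\sigma }}(x)&=\sigma ^2\,R_{{\tilde{g}}_{\sigma }}(\sigma x)=\sigma ^2\cdot O(\sigma ^{-p-2})=O(\sigma ^{-p}). \end{aligned}$$

In other words, each derivative gains a factor \(\sigma \) that exactly compensates the additional decay, so that at unit scale the metric is \(O(\sigma ^{-p})\)-close to \(\delta \) in \(C^2\).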

These estimates imply that, possibly choosing (once and for all) a larger threshold \(\sigma _0\), we can indeed apply Theorem 2 (see Footnote 10) in Corvino and Schoen (2006), say for \(k=0\) but employing an exponential weight, and get a tensor \(h_{\sigma }=h_{\sigma }(m,{\overline{x}})\) such that \(R_{g_{\sigma }+h_{\sigma }}\in K_{*}\) and, in addition, one has

$$\begin{aligned} \Vert h_{\sigma }\Vert _{L^2_{\rho ^{-1}}}+\Vert h_{\sigma }\Vert _{C^{2,\alpha }}\le C\sigma ^{-p} \end{aligned}$$

where the constant \(C>0\) can be chosen uniformly so long as

$$\begin{aligned} {\left\{ \begin{array}{ll} \sigma \ge \sigma _0 \\ m+|{\overline{x}}|<\sigma _0. \end{array}\right. } \end{aligned}$$

Note that this solution is proven, a posteriori, to be smooth (i.e., \(C^{\infty }\)), which corresponds to saying that \(h_{\sigma }\) will vanish to infinite order at the interfaces of the gluing region, which in this case are coordinate spheres of radii \(\sigma \) and \(2\sigma \).

We now observe that, as the parameters \(m,{\overline{x}}\) vary in a compact set, the following uniform symmetry estimates hold:

$$\begin{aligned} ({\tilde{g}}_{\sigma })_{ij}(x)-({\tilde{g}}_{\sigma })_{ij}(-x)=O(|x|^{-2}), \ \ ({\tilde{g}}_{\sigma })_{ij,k}(x)+({\tilde{g}}_{\sigma })_{ij,k}(-x)=O(|x|^{-3}) \end{aligned}$$

so, equivalently, after scaling we get

$$\begin{aligned} (g_{\sigma })_{ij}(x)-(g_{\sigma })_{ij}(-x)=O(\sigma ^{-2}), \ \ (g_{\sigma })_{ij,k}(x)+(g_{\sigma })_{ij,k}(-x)=O(\sigma ^{-2}). \end{aligned}$$
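The origin of these symmetry estimates can be sketched as follows, under two assumptions that we make only for illustration: that the interpolation is linear, \({\tilde{g}}_{\sigma }=\chi _{\sigma }{\hat{g}}+(1-\chi _{\sigma })g_{m,{\overline{x}}}\) with \(\chi _{\sigma }\) the rescaled radial cutoff, and that \(g_{m,{\overline{x}}}\) is the Schwarzschild metric in isotropic coordinates. Since \({\hat{g}}\) is exactly even and \(\chi _{\sigma }\) is radial, only the Schwarzschild factor contributes to the odd part, and a Taylor expansion in \({\overline{x}}\) gives

$$\begin{aligned} \frac{1}{|x-{\overline{x}}|}=\frac{1}{|x|}+\frac{x\cdot {\overline{x}}}{|x|^3}+O\left( \frac{|{\overline{x}}|^2}{|x|^3}\right) , \qquad \text {so that} \qquad \frac{1}{|x-{\overline{x}}|}-\frac{1}{|x+{\overline{x}}|}=O\left( \frac{|{\overline{x}}|}{|x|^2}\right) . \end{aligned}$$

The leading term \(1/|x|\) is even and cancels in the difference, which yields the \(O(|x|^{-2})\) bound uniformly for \((m,{\overline{x}})\) in a compact set. After rescaling, the first-derivative estimate picks up a factor \(\sigma \), as \(\sigma \cdot O(\sigma ^{-3})=O(\sigma ^{-2})\), which is why both unit-scale bounds carry the same power of \(\sigma \).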

We will need a similarly improved pointwise estimate on \(h_{\sigma }\). To close our argument, one computes the finite-dimensional reduction map (notation as above)

$$\begin{aligned} {\mathscr {I}}_{\sigma }: (m,{\overline{x}}) \mapsto \left( \frac{1}{16\pi }\sigma \int _{A_1}R_{{\overline{g}}_{\sigma }}\,dV_{g_{\sigma }}, -\frac{3}{16\pi }\sigma ^2\int _{A_1} x R_{{\overline{g}}_{\sigma }}\,dV_{g_{\sigma }}\right) \end{aligned}$$

with the ultimate goal of proving a bound of the form

$$\begin{aligned} {\mathscr {I}}_{\sigma }(m,{\overline{x}})= (m-m_0, m {\overline{x}})+o(1) \ \text {as} \ \sigma \rightarrow \infty , \end{aligned}$$

for this allows one to conclude (as above) via degree arguments. However, note that this estimate is strictly stronger than what we had proven in Sect. 3.2 when reviewing Corvino’s argument (recall the form of the upper bound on the term \(\varPsi _{\sigma }\)); also our assumptions here are a bit weaker than those in Corvino and Schoen (2006), so this is the point where some care is needed.

There are four projections to be studied. It is convenient (because of the form of Definition B.4) to work with respect to the Euclidean measure; however, since clearly \(dV_{g_{\sigma }}=dV_{\delta }(1+O(\sigma ^{-p}))\), this makes no difference with respect to proving (3.23) (also recall (3.3)). The first component is handled easily, and no parity arguments are needed:

$$\begin{aligned} \int _{A_1}R_{{\overline{g}}_{\sigma }}\, dV_{\delta }=\int _{A_1}(({\overline{g}}_{\sigma })_{ij,ij} -({\overline{g}}_{\sigma })_{ii,jj})\, dV_{\delta }+O(\sigma ^{-2p}) \end{aligned}$$

thanks to (3.19) and (3.21), whence by the divergence theorem, the infinite order of vanishing of \(h_{\sigma }\) at the interfaces and scaling back we can continue the chain of equalities

$$\begin{aligned}&=\int _{\partial A_1}(({\overline{g}}_{\sigma })_{ij,i}-({\overline{g}}_{\sigma })_{ii,j})\nu ^{j}\, dA_{\delta }+O(\sigma ^{-2p}) =\int _{\partial A_1}((g_{\sigma })_{ij,i}-(g_{\sigma })_{ii,j})\nu ^{j}\, dA_{\delta }+O(\sigma ^{-2p}) \\&\quad =\sigma ^{-1}\int _{\partial A_\sigma }(({\tilde{g}}_{\sigma })_{ij,i}-({\tilde{g}}_{\sigma })_{ii,j}) \nu ^{j}\,dA_{\delta }+O(\sigma ^{-2p}). \end{aligned}$$

Therefore, by the very definition of ADM mass and since \(2p>1\) we conclude

$$\begin{aligned} \int _{A_1}R_{{\overline{g}}_{\sigma }}\, dV_{\delta }= \frac{1}{\sigma }\left( 16\pi \left( m-m_0\right) +o(1)\right) , \ \text {as one lets} \ \ \sigma \rightarrow \infty . \end{aligned}$$

Concerning the other three components, we first remark that one can follow the argument given in Corvino and Schoen (2006) (see pp. 213–214 therein) to show that the odd part of \(h_{\sigma }\) satisfies improved decay estimates of the form

$$\begin{aligned} (h_{\sigma })_{ij}(x)-(h_{\sigma })_{ij}(-x)=O(\sigma ^{-2p}), \ \ (h_{\sigma })_{ij,k}(x)+(h_{\sigma })_{ij,k}(-x)=O(\sigma ^{-2p}), \end{aligned}$$

which we now employ in computing \(\int _{A_1}x^{k}R_{{\overline{g}}_{\sigma }}\, dV_{\delta }\) for \(k=1,2,3\). Indeed, noticing that only the odd part of \(R_{{\overline{g}}_{\sigma }}\) gives a non-zero contribution (the weight \(x^k\) being odd and the domain \(A_1\) being symmetric under \(x\mapsto -x\)) and then arguing as above (so by applying the divergence theorem, then using the vanishing of \(h_{\sigma }\) at the boundaries, and scaling) we have

$$\begin{aligned}&\int _{A_1}x^{k}R_{{\overline{g}}_{\sigma }}\, dV_{\delta } \\&\quad =\int _{A_1}x^{k}\left( ({\overline{g}}_{\sigma })_{ij,ij} -({\overline{g}}_{\sigma })_{ii,jj}\right) \, dV_{\delta }+O(\sigma ^{-3p}) \\&\quad =\int _{\partial A_1}x^k ((g_{\sigma })_{ij,i}-(g_{\sigma })_{ii,j}) \nu ^{j}\, dA_{\delta }-\int _{\partial A_1}((g_{\sigma })_{ik}\nu ^i - (g_{\sigma })_{ii}\nu ^k)\,dA_{\delta }+O(\sigma ^{-3p})\\&\quad =\sigma ^{-2} \int _{\partial A_\sigma }x^k (({\tilde{g}}_{\sigma })_{ij,i}-({\tilde{g}}_{\sigma })_{ii,j})\nu ^{j}\, dA_{\delta } \\&\quad \quad -\sigma ^{-2}\int _{\partial A_\sigma }(({\tilde{g}}_{\sigma })_{ik}\nu ^i - ({\tilde{g}}_{\sigma })_{ii}\nu ^k)\,dA_{\delta }+O(\sigma ^{-3p}). \end{aligned}$$
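The passage from the bulk integral to the two boundary integrals in the chain above rests on a pointwise identity, which we record as a sketch. For any smooth symmetric tensor \(g_{ij}\) in Euclidean coordinates, since \(\partial _j x^k=\delta ^k_j\), one has

$$\begin{aligned} x^{k}\left( g_{ij,ij}-g_{ii,jj}\right) =\partial _j\left[ x^{k}\left( g_{ij,i}-g_{ii,j}\right) \right] -\partial _i\left[ g_{ik}-\delta _{ik}\,g_{\ell \ell }\right] . \end{aligned}$$

Integrating over \(A_1\) and applying the divergence theorem to each of the two terms on the right-hand side produces the boundary integrals with integrands \(x^k(g_{ij,i}-g_{ii,j})\nu ^j\) and \(g_{ik}\nu ^i-g_{ii}\nu ^k\), respectively, the latter entering with a minus sign.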

Now, we handle the contributions of the two boundary components separately and in a different fashion: for what concerns the outer boundary component, the computation is explicit and well-known (see e.g., Corvino 2000), while for the inner boundary the contribution is exactly equal to zero as one can see by simply re-applying the divergence theorem (so going in the opposite direction), now in the coordinate ball of radius \(\sigma \), and invoking the fact that the metric \({\hat{g}}\) (which, we recall, is the output of the localisation procedure) is exactly even. Therefore, as soon as one chooses p satisfying \(3p>2\), the net outcome is that

$$\begin{aligned} \sigma ^{-2} \int _{\partial A_\sigma }x^k (({\tilde{g}}_{\sigma })_{ij,i}-({\tilde{g}}_{\sigma })_{ii,j})\nu ^{j}\, dA_{\delta } -\sigma ^{-2}\int _{\partial A_\sigma }(({\tilde{g}}_{\sigma })_{ik}\nu ^i - ({\tilde{g}}_{\sigma })_{ii}\nu ^k)\,dA_{\delta } \\ =\frac{1}{\sigma ^2}\left( -\frac{16\pi }{3}m {\overline{x}}+o(1)\right) , \end{aligned}$$


and hence

$$\begin{aligned} \int _{A_1}x^{k}R_{{\overline{g}}_{\sigma }}\, dV_{\delta }=\frac{1}{\sigma ^2} \left( -\frac{16\pi }{3} m{\overline{x}}+o(1)\right) , \ \text {as one lets} \ \ \sigma \rightarrow \infty . \end{aligned}$$

At this stage, the conclusion follows directly from (3.23) via a standard degree argument, as we have explained at the end of Sect. 3.2. \(\square \)

We note that the result above should be compared, among others, with the construction (presented in Bartnik 1993) of non-trivial asymptotically flat metrics on \({\mathbb {R}}^3\) that are scalar-flat everywhere and exactly flat on, say, a ball. While the two proofs are very different in all respects (recall that Bartnik studies a parabolic evolution problem on spheres), and the metrics produced in (3.17) have the additional, desirable property of being ‘canonical’ outside a larger compact set, there is an interesting conceptual link between the two results.

Asymptotically localised solutions of the Einstein constraints

Another aspect of the construction in Carlotto and Schoen (2016), which we have not discussed yet, is the slightly unsatisfactory rate of decay of the data we produce. To clarify the point, let us focus, for the sake of simplicity, on the time-symmetric case: even if we start with a metric having an asymptotic expansion of the form

$$\begin{aligned} {\check{g}}(x)=\delta +O(|x|^{-n+2}) \end{aligned}$$

we can apparently only construct metrics converging to the Euclidean one modulo an error term bounded by \(|x|^{-p}\) for any chosen \(p\in (\frac{n-2}{2},n-2)\) but not for \(p=n-2\). This fact, which is deeply intertwined with the strategy we employed to prove our coercivity estimates (that are, as we have seen, crucial to solve the linearised problem), led Schoen and the author of the present review to pose, on multiple occasions, the following question.

Open Problem 3.18

Are there non-trivial asymptotically flat Riemannian metrics of non-negative scalar curvature that are localised (in the very same sense as in Theorem 3.9) and asymptote to the Euclidean metric modulo a tensor whose length is bounded by (a constant times) \(|x|^{-(n-2)}\), possibly subject to the additional requirement that the mass functional be continuous in the sense that \( m({\hat{g}})\rightarrow m({\check{g}})\) as \(a\rightarrow \infty \)?

Some comments on this question are certainly appropriate. First of all, we stated this question in the time-symmetric case for the sake of expository simplicity, but of course we would ideally like to answer the corresponding question for the system of constraints (1.2), both in the vacuum case and (say) for matter sources subject to the dominant energy condition. Secondly, we do not expect the background manifold M nor its dimension to play any significant role, so that the problem would be equally interesting in the special case \(M={\mathbb {R}}^3\). Lastly, it is still unclear (at least to the author) whether the requirement that the ADM mass of \({\hat{g}}\) converges to that of \({\check{g}}\) be an essential or, instead, inessential point (certainly it is an interesting and useful feature in our construction, so it would be desirable to preserve it in case one could construct localised data with Newtonian decay).
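To make the exponents in the discussion above concrete, it may help to specialise to \(n=3\), where the admissible range reads \(p\in (\frac{1}{2},1)\) and the borderline (Newtonian) rate is \(p=1\). As a purely illustrative sketch, and assuming one represents the Schwarzschild metric in isotropic coordinates (a standard choice, but an assumption made here rather than a convention fixed in the text), the model metric decays exactly at the rate \(|x|^{-(n-2)}=|x|^{-1}\):

$$\begin{aligned} g_m=\left( 1+\frac{m}{2|x|}\right) ^{4}\delta =\left( 1+\frac{2m}{|x|}+\frac{3m^2}{2|x|^2}+\frac{m^3}{2|x|^3}+\frac{m^4}{16|x|^4}\right) \delta =\delta +O(|x|^{-1}). \end{aligned}$$

Any rate \(p<1\) is thus strictly weaker than the fall-off of this model, which is precisely the gap that Open Problem 3.18 asks to close.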

We did not witness any advance on this question till very recently, when an interesting construction was proposed in LeFloch and Nguyen (2019) to build novel solutions to the constraints that have the desired decay properties while being asymptotically localised, as we are about to explain. This result does not, in itself, answer the question above, but certainly provides some interesting insight into the matter. Roughly speaking, the main existence results there are phrased as seed-to-solution theorems, in a sense that is vaguely reminiscent of what we presented in Sect. 2 for the conformal method: one starts with data having a suitable rate of decay (although not solutions) and cleverly builds sequences of corrections which ultimately converge to a solution of the constraints. That being said, the authors are able (in particular) to carefully select seed data that allow one to prove the following statement, where the notation (in particular for the inner region \(\varOmega _{I}\) and the outer region \(\varOmega _O\)) is exactly as we described before the statement of Theorem 3.9.

Theorem 3.19

For any given \(q \in (1,2)\), and angles \(\theta _{1}, \theta _{2}\), there exists a metric \({\hat{g}}\) on \(M={\mathbb {R}}^3\) so that \((M,{\hat{g}})\) satisfies the time-symmetric vacuum Einstein constraint equations,

$$\begin{aligned} {\hat{g}}=\delta +O(|x|^{-1}), \end{aligned}$$

and for \(|x|\ge 1\) one has

$$\begin{aligned} {\hat{g}}={\left\{ \begin{array}{ll} g_{m}+O(|x|^{-q}) &{} \text {in} \ \ \varOmega _{I}(a) \\ \ \ \delta +O(|x|^{-q}) &{} \text {in} \ \ \varOmega _{O}(a). \end{array}\right. } \end{aligned}$$

Here \(g_m=g_{m,0}\) denotes the one-ended Riemannian Schwarzschild manifold of mass \(m>0\) and center of mass at the origin. Hence, in other words, these solutions are only ‘localised modulo a controlled perturbation’. Note that the results in LeFloch and Nguyen (2019) are stated for three-dimensional data, and that there also exists an extension of the previous statement to the non-time-symmetric case of triples (M, g, k), see Theorem 6.2 therein. While the restriction to the time-symmetric case was adopted here for the sake of expository convenience, we should note that in the more general version the tensor k only decays like \(|x|^{-1}\), while it would be more natural (say, from the perspective of data in harmonic asymptotics) to have the second fundamental form fall off like \(|x|^{-2}\) near infinity. Also, from the perspective of regularity, note that these data are constructed in weighted Hölder spaces and both g and k locally belong to \(C^{2,\alpha }\) (where one can select any \(\alpha \in (0,1)\)).

The scope of the project by LeFloch and Nguyen was not really to deal with Open Problem 3.18, but rather to construct a broad class of solutions whose asymptotic decay can be prescribed more or less arbitrarily (which may be, as we were told, a useful ingredient for a new proof of the nonlinear stability of Minkowski spacetime). It could be argued that, from a purely physical (or even: experimental) perspective rapidly decaying fields are in many respects as good as vanishing ones, so that it is not inappropriate to connect Theorem 3.19 to Open Problem 3.18. That being said, it is clear that the data produced in such a theorem do not allow one to draw any of the geometric conclusions which we listed above, in Sect. 3.3, and (as it has been mentioned) really lie behind the construction we had presented.

The proof of Theorem 3.19, which is currently under scrutiny, is rather lengthy and sophisticated. Without aiming at doing it justice here, we just wish to tickle the reader’s curiosity by answering the following natural question: What are the new ingredients in the argument (compared e.g., to the proof of Theorem 3.9)? Except for the very last section of the paper (Section 6), which is devoted to the asymptotic localisation problem, the authors do not specifically consider the problem of solving the constraints in a conical region, hence with two weight functions in play, but rather analyse the existence of asymptotically flat (rather: asymptotically Minkowskian) solutions to (1.2) in singly weighted spaces (over, say, \({\mathbb {R}}^3\)), so with a weight that behaves like a power of a regularised distance from the origin, and for small data. The last assertion has to be understood as follows: we are given a set of seed data \((g_1, k_1, H_{\star }, M_{\star })\) that are almost a solution (for given physical sources \(H_{\star }\) and \(M_{\star }\), suitably decaying or zero in the vacuum case), and need to be corrected to an actual solution (g, k) for \(g=g_1+g_2, k=k_1+k_2\). As we had done in Carlotto and Schoen (2016), this primarily boils down to a careful study of the linearised equations, with the error terms treated as sources, and then to designing a converging iteration scheme (again of Picard type, so with the linearisation always happening at the pre-assigned seed data). The subcritical existence result proven in LeFloch and Nguyen (2019), see Theorem 2.6 therein, does in fact rely on material already present in our earlier work, except for a slightly different choice of the functional spaces involved in the analysis.

Instead, the second part of the paper introduces some novel ideas that lead to substantial refinements. Building on a careful, essentially potential-theoretic, analysis (cf. Choquet-Bruhat and Christodoulou 1981; Lockhart and McOwen 1983, 1985) of the linearised constraint operators, whose key facts are recalled in Section 4 therein, the authors show that, if the seeds are good enough (which is what they call strongly tame) then one can build a very clever coupled bootstrap scheme so as to gradually improve the decay rate of \(g_2\) and \(k_2\), i.e., one shows that the general fall-off estimates can be refined and specified, and that these tensors do have, in fact, under these assumptions, a partial expansion at infinity. There are some effective tricks, ultimately designed by rewriting the equations in the most helpful fashion, that make this bootstrap work. For instance, the basic improvement of decay for \(g_2\) relies on noticing that its trace solves an equation of the form

$$\begin{aligned} -\varDelta _{\delta }(tr_{\delta }g_2)+\frac{2}{r^2}(tr_{\delta }g_2)=\text {higher-order terms} \end{aligned}$$

(whose right-hand side has, for strongly tame data, good decay) and on exploiting the mapping properties of the Schrödinger-type operator \(-\varDelta _{\delta }+2/r^2\), which are extremely well-understood. Hence, given the basic decay rates, one improves them a bit for \(tr_{\delta }g_2\), then for \(u_2\) (notation as in Sect. 3.3) which solves

$$\begin{aligned} -2\varDelta _{\delta }u_2=\frac{1}{r^2}(tr_{\delta }g_2)+\text {higher-order terms} \end{aligned}$$

(where the error terms are different from those above, but equally well controlled) and finally for the whole \(g_2\) by simply recalling that

$$\begin{aligned} g_2=r^{2} ({D\varPhi ^{*}}^{(1)}_{(g,\pi )}(u_2,Z_2)) \end{aligned}$$

A similar trick allows one to deal with the momentum constraint, and more refined tools allow one to reach the critical decay.
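The role of the zeroth-order term \(2/r^2\) can be illustrated by a one-line computation on homogeneous functions (this is merely a sanity check on \({\mathbb {R}}^3\setminus \{0\}\), added for illustration and not an excerpt from the cited paper). Since \(\varDelta _{\delta }r^{a}=a(a+1)r^{a-2}\) for radial powers in dimension three, one finds

$$\begin{aligned} \left( -\varDelta _{\delta }+\frac{2}{r^2}\right) r^{a}=\left( 2-a(a+1)\right) r^{a-2}=-(a-1)(a+2)\,r^{a-2}, \end{aligned}$$

so the only radial homogeneous solutions are \(r\) and \(r^{-2}\). In particular \((-\varDelta _{\delta }+2/r^2)\,r^{-1}=2r^{-3}\ne 0\): unlike for the Laplacian, the critical profile 1/r does not lie in the radial kernel, which is at least consistent with the possibility of improving the decay of \(tr_{\delta }g_2\) beyond that rate.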

That being said, let us now turn back to the asymptotic localisation problem. First of all, there is a crucial, clever Ansatz that lies behind the construction of the data in Theorem 3.19. Specifically, one considers a family of metrics on \({\mathbb {R}}^3\) (it would be on \({\mathbb {R}}^3\setminus B\) but one can extend the data for practical convenience) of the form

$$\begin{aligned} g_1(\varPhi , s,t)=\left( -\frac{1}{2}\varDelta _{g_0}\varPsi +\varPsi \right) (g_m-t\xi _{R_{s,t}}r^2\mathrm {Hess}_{\delta }1/r)\\ +\left( 1+\frac{1}{2}\varDelta _{g_0}\varPsi -\varPsi \right) \delta +s\xi _{R_{s,t}}\left( \frac{1}{2}\varDelta _{g_0}\varPhi -\varPhi \right) r^2 \mathrm {Hess}_{\delta }1/r \end{aligned}$$

where the function \(\varPhi : S^2\rightarrow {\mathbb {R}}\) and the parameters s, t are to be chosen along the course of the argument, while \(\varPsi : S^2\rightarrow {\mathbb {R}}\) is just a cutoff function which transitions from the value 1 to the value 0 in the region between the cones in question; furthermore \(\xi _{R_{s,t}}(x)=\xi (x/R_{s,t})\) for \(\xi \) a radial smooth cutoff function that vanishes in \(B_{1}(0)\) and is equal to one outside of \(B_2(0)\) and \(R_{s,t}>0\) is a suitable scale, that is determined by the values of s, t. Note that all differential operators in \({\mathbb {R}}^3\) are understood with respect to the background Euclidean metric, and \(\varDelta _{g_0}\) is the Laplace operator on the unit round sphere. The argument does need the data to be relatively special, i.e., the choice of the seeds is partly constrained from the very beginning. In any event, one then solves the constraints for the pair \((g_1(\varPhi , s,t), k_1\equiv 0)\), which is certainly possible if one takes the subcritical exponents \((p,q)=(1/2,3/2)\), provided the radial cutoff parameters s, t are chosen so that \(R_{s,t}\) is large enough. Hence, one carefully analyses the asymptotic structure of the output pair \((g_2(\varPhi , s,t), k_2(\varPhi , s, t)\equiv 0)\).

Setting aside subtler technical aspects, the basic idea is to choose the two parameters s and t so that the leading terms as \(r\rightarrow \infty \) of the Hessian summands in the asymptotic expansion of \(g_1\) and \(g_2\) are matched and cancel out, so that (for such a choice) the metric \(g=g_1+g_2\) will not differ from the model metrics (namely \(g_m\) and \(\delta \)) by 1/r terms but only higher-order ones. In this respect, note that the metric \(g_1=g_1(\varPhi , s, t)\) has, inside the smaller cone and outside of the larger one, precisely a structure of the form

$$\begin{aligned} \text {model metric}\ +\ \text {parametric coefficient}\times (\mathrm {Hess}_{\delta }1/r) \end{aligned}$$

while in the transition region there is some interference associated to the interpolation of the model metrics.
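For the reader's convenience, we spell out the tensor appearing in the Ansatz (an elementary computation in Euclidean coordinates, added here only as an illustration):

$$\begin{aligned} \left( \mathrm {Hess}_{\delta }\frac{1}{r}\right) _{ij}=\partial _i\partial _j\left( \frac{1}{r}\right) =\frac{3x_ix_j-r^2\delta _{ij}}{r^5}, \qquad \left( r^2\,\mathrm {Hess}_{\delta }\frac{1}{r}\right) _{ij}=\frac{1}{r}\left( \frac{3x_ix_j}{r^2}-\delta _{ij}\right) . \end{aligned}$$

Hence \(r^2\mathrm {Hess}_{\delta }1/r\) is trace-free with respect to \(\delta \) (as 1/r is harmonic away from the origin) and decays exactly like 1/r, i.e., at the critical rate; this is why the corresponding coefficients must cancel for the metric to differ from the models only by higher-order terms.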

After a non-trivial amount of work, which involves the study of the critical terms and an estimate of the mass corrector \({\tilde{m}}(\varPhi , s, t)\), which is essentially just the coefficient in the leading term of the expansion of \(g_2(\varPhi , s, t)\), one finally obtains that

$$\begin{aligned} g(\varPhi , s,-4m/5)-\varPsi g_m -(1-\varPsi )\delta +\left( \frac{5s}{4}-{\tilde{m}}(\varPhi , s, -4m/5)\right) r^2 \mathrm {Hess}_{\delta }1/r \end{aligned}$$

decays super-critically in the complement of the transition region. At that stage, it is argued that provided \(\int _{S^2}\varPhi \, dA_{g_0} =0\) one can indeed find \(s^*\) solving the equation

$$\begin{aligned} \frac{5s}{4}-{\tilde{m}}(\varPhi , s, -4m/5)=0 \end{aligned}$$

whence the conclusion follows at once.

Connecting data via collars

Gluing problems also arise, although in a rather different form, in attempts to design a good notion of quasi-local mass. This subject has now attracted considerable interest for some decades, and it cannot be accounted for here, but we refer the reader to the beautiful review Szabados (2009) for a broad description of this topic, as well as to Chen and Wang (2015), Miao (2015) and Wang (2019) for various updates with a special focus on recent mathematical developments around it.

Loosely speaking, in Bartnik (1989) a novel definition of quasi-local mass was proposed, which, in the special case of time-symmetric data, can be phrased as follows: one associates to a compact Riemannian manifold (M, g), with, say, non-empty connected boundary \(\partial M\), the infimum of the ADM masses over all admissible asymptotically flat extensions of (M, g) having non-negative scalar curvature. There are different ways of properly specifying the requirement that an extension be admissible but, somewhat imprecisely, they all aim at excluding the presence of minimal surfaces enclosing \(\partial M\). Note, also, that we tacitly assume that the extension have exactly one asymptotically flat end; in addition we consider extensions in the smooth category, namely we consider those ambient manifolds where (M, g) can be smoothly isometrically embedded.

It should be remarked that, from the perspective we have embraced in this review (i.e., that of producing novel solutions to the Einstein constraints) it would perhaps be more natural to assume that (M, g) be scalar-flat and that the extensions one considers be scalar-flat as well. In fact, a rather simple argument (cf. Sect. 4 in Corvino 2000) allows one to prove that if a minimal mass extension exists then it must be static (which in particular ensures that it be scalar-flat, see Appendix D). There is, however, an important caveat here: as we are about to see, in many cases of interest classical/smooth minimisers simply do not exist and, for a subclass of those cases, it may be reasonable to rather look for minimal mass extensions allowing for a non-smooth junction along \(\partial M\) (so long as one still requires non-negative scalar curvature in a distributional sense, as very well explained in Miao 2002 when discussing the positive mass theorems for data admitting edges along a hypersurface). That being said, there are lots of variations on the theme that arise by considering different degrees of regularity of the data in question, different matching conditions along \(\partial M\), and different admissibility conditions. We refer to Anderson and Jauregui (2019) for an interesting discussion of these notions, and a comparative analysis thereof.

Another important modification, whose significance is widely discussed in the references we listed above, is not to start with a domain (M, g) but, rather, with Bartnik data \((\varSigma ,g,H)\) where \(\varSigma \) is a closed surface (playing, roughly speaking, the role of \(\partial M\)), g is a Riemannian metric on \(\varSigma \) and \(H:\varSigma \rightarrow {\mathbb {R}}\) is the assignment of a mean curvature function: one looks for asymptotically flat Riemannian manifolds, with one end, having non-negative scalar curvature and whose boundary is isometric to \((\varSigma ,g)\) and matches the assigned mean curvature function (with the obvious sign conventions). Again, important questions concern the existence of minimisers and their geometric characterisation.

Motivated by these problems, Mantoulidis and Schoen proposed a gluing construction which led to a rather complete (albeit negative) answer to such an existence question in the apparent horizon case, namely when the Bartnik data take the form \((\varSigma \cong S^2, g, H\equiv 0)\) and \(\lambda _1(-\varDelta _g+K_g)>0\), which is in fact strictly stronger than the more standard requirement that the minimal surface in question be stable. More precisely, they construct an almost-cylindrical collar that connects such \((\varSigma ,g)\) to a (centered) Schwarzschild manifold whose mass can be made arbitrarily close to the bound imposed by the Riemannian Penrose inequality (cf. Huisken and Ilmanen 2001; Bray 2001). A precise statement of the main result in Mantoulidis and Schoen (2015) is as follows:

Theorem 3.20

Let g be a smooth Riemannian metric on \(S^2\) such that \(\lambda _1(-\varDelta _g+K_g)>0\). For any m such that \(16\pi m^2> \text {area}(S^2,g)\) there exists an asymptotically flat 3-manifold whose boundary is minimal and isometric to \((S^2,g)\), that is isometric to a mass-m Schwarzschild metric outside a compact set and that has a (global) mean-convex foliation that eventually coincides with centered, Schwarzschildean coordinate spheres.

Note that the leaves of the foliation mentioned in the last clause are all strictly mean-convex with the sole exception of the initial leaf (that is \(\varSigma \) itself) hence the apparent horizon is outermost (and not enclosed by any other minimal surface, thanks to the maximum principle).
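The threshold on m can be checked directly on the model (a sketch assuming, as is standard but not fixed in the text, isotropic coordinates for the Schwarzschild metric, \(g_m=(1+m/(2|x|))^4\delta \)). The horizon is the coordinate sphere \(|x|=m/2\), where the conformal factor equals \(2^4=16\), so that

$$\begin{aligned} \text {area}(\{|x|=m/2\},g_m)=16\cdot 4\pi \left( \frac{m}{2}\right) ^2=16\pi m^2, \qquad \text {i.e.,}\qquad m=\sqrt{\frac{\text {area}}{16\pi }}. \end{aligned}$$

This is the equality case of the Riemannian Penrose inequality \(m\ge \sqrt{\text {area}/16\pi }\), a value which the masses allowed in Theorem 3.20 approach from above without attaining it.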

This theorem has a number of interesting implications:

  (a) it exhibits a large class of Bartnik data for which the Bartnik mass is computable and yet no minimising extension can possibly exist, since of course a minimal mass extension should have ADM mass exactly equal to \(\sqrt{\text {area}(S^2,g)/16\pi }\) and we do know that equality in the Penrose inequality is only attained by the (single ended) Schwarzschild Riemannian manifold;

  (b) (together with the very recent analysis of the borderline, non-generic case when \(\lambda _1(-\varDelta _g+K_g)=0\), see Chau and Martens 2020) it gives a complete description of the intrinsic geometry of apparent horizons: the condition that \(\lambda _1(-\varDelta _g+K_g)\ge 0\) is both necessary and sufficient for the realisation of a Riemannian surface as an apparent horizon;

  (c) it led to a disproof of Gibbons' formulation of Thorne’s hoop conjecture, see Gibbons (2009) and Cvetič et al. (2011), namely the conjecture that the Birkhoff systolic invariant of the apparent horizon in an asymptotically flat Riemannian manifold of non-negative scalar curvature is never larger than \(4\pi m\).

The proof of Theorem 3.20 consists of a few very concrete, ad hoc constructive steps. Denoting by \({\mathscr {M}}_{+}\) the class of smooth metrics on \(\varSigma \cong S^2\) such that \(\lambda _1(-\varDelta _g+K_g)>0\), the authors first prove (strongly using the fact that \(\varSigma \) is a surface, i.e., it has dimension two) that one can connect any given \(g\in {\mathscr {M}}_{+}\) to a (suitably normalised) round metric without changing the area element along the path. Then, they employ such a path of metrics to construct a metric on a topological cylinder \(S^2\times I\) by introducing a warping factor of the form \((1+\varepsilon t^2)dt^2\) for t the coordinate on the interval I. In order to have a mean-convex foliation, that is strictly mean-convex except at the initial surface (which corresponds to \(t=0\)), the collar needs to have positive scalar curvature. Lastly, they join such a collar (that, let us recall, terminates at a round metric) to an outer domain of the one-ended Schwarzschild manifold, and deform the junction to a smooth one.

This is quite a striking, sharp construction. As already noted by the authors, there is precisely a single aspect of it one may wish to improve, which corresponds to the following issue:

Open Problem 3.21

Is it possible to build scalar-flat metrics which fulfill all requirements in the statement of Theorem 3.20?

We wish to remark that in Miao and Xie (2019) the authors construct vacuum data (i.e., scalar-flat) that satisfy all requirements as above except that the extensions are not equal to Schwarzschild data outside of a compact set, and anyway under the stronger assumption that the Gauss curvature of \((\varSigma ,g)\) be positive. That said, it may well be that this is close to the best one can achieve in this direction.

After Mantoulidis and Schoen (2015) first appeared, there have been a few related contributions and advances. First of all, a very natural question (which, again, stems from the problem of determining existence of minimal mass extensions for Bartnik data) is whether a construction of the same type as above may be designed, with analogous conclusions, in the CMC case (by which we mean that one deals with a triple \((\varSigma \cong S^2, g, H)\) where the function H is now identically equal to a positive constant c). There have been partial, yet very interesting contributions in Cabrera Pacheco et al. (2017). The main result there can be stated as follows:

Theorem 3.22

Let \((\varSigma \cong S^2, g, H)\) be Bartnik data such that g has positive Gauss curvature and H is a positive constant. There exist constants \(\alpha \ge 0\) and \(\beta \in (0,1]\) such that if this triple satisfies the inequality

$$\begin{aligned} \frac{1}{16\pi } \int _{\varSigma }H^2\,dA_g< \frac{\beta }{1+\alpha } \end{aligned}$$

then for every \(m>m_{*}\), where

$$\begin{aligned} m_{*}=\left[ 1+\left( \frac{\alpha \left( \frac{1}{16\pi } \int _{\varSigma }H^2\,dA_g\right) }{\beta - (1+\alpha )\left( \frac{1}{16\pi } \int _{\varSigma }H^2\,dA_g\right) }\right) ^{1/2}\right] m_{H}(\varSigma ,g,H) \end{aligned}$$

there exists an asymptotically flat 3-manifold whose boundary is isometric to \((S^2,g)\) and has mean curvature H, that is isometric to a mass-m Schwarzschild metric outside a compact set and that has a (global) strictly mean-convex foliation that eventually coincides with centered, Schwarzschildean coordinate spheres.

In this statement, the notation \(m_{H}(\varSigma ,g,H)\) stands for the (Riemannian) Hawking mass, i.e.,

$$\begin{aligned} m_{H}(\varSigma ,g, H)=\sqrt{\frac{\text {area}(\varSigma ,g)}{16\pi }}\left( 1-\frac{1}{16\pi } \int _{\varSigma }H^2\,dA_g\right) . \end{aligned}$$

Also, let us remark two important features of the result above: first, the multiplicative factor defining the threshold \(m_{*}\) (namely: the ratio \(m_{*}/m_{H}\)) tends to one as \(H\rightarrow 0\), i.e., as we approach a horizon; second, it follows from the explicit definition of \(\alpha \) and \(\beta \), hence ultimately from the proof of Theorem 3.22, that

$$\begin{aligned} \alpha \rightarrow 0, \ \ \text {and} \ \ \beta \rightarrow 1 \end{aligned}$$

as g converges to a round metric.
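To see the first feature at a glance, one can abbreviate \(\eta =\frac{1}{16\pi }\int _{\varSigma }H^2\,dA_g\) (a shorthand introduced here only for this remark) and rewrite the definition of \(m_{*}\) as

$$\begin{aligned} \frac{m_{*}}{m_{H}(\varSigma ,g,H)}=1+\left( \frac{\alpha \eta }{\beta -(1+\alpha )\eta }\right) ^{1/2}, \qquad 0<\eta <\frac{\beta }{1+\alpha }. \end{aligned}$$

Assuming, as the statement above suggests, that \(\alpha \) and \(\beta \) stay fixed as H varies, the fraction on the right-hand side tends to zero as \(\eta \rightarrow 0\), so the ratio tends to one; conversely, it blows up as \(\eta \) approaches \(\beta /(1+\alpha )\) whenever \(\alpha >0\).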

The proof of this result essentially follows a conceptual scheme similar to the one we outlined above, with some important differences: the model for the collar one constructs is not the standard product cylinder but a compact, annular portion of a Schwarzschild manifold and (correspondingly) one does not aim at keeping the area almost constant along the collar foliation, but rather the Hawking mass itself (which is indeed exactly constant on the canonical foliation of the one-ended Riemannian Schwarzschild manifold by centered spheres). The earlier work Miao and Xie (2018) plays an essential role in the collar extension designed there.
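The constancy of the Hawking mass along the canonical foliation can be verified by hand (a sketch in area-radius coordinates, which we assume here for convenience). Writing the Schwarzschild metric as \(g_m=(1-2m/\rho )^{-1}d\rho ^2+\rho ^2g_0\) for \(\rho >2m\), the centered sphere \(S_{\rho }\) has area \(4\pi \rho ^2\) and constant mean curvature \(H=\frac{2}{\rho }\sqrt{1-2m/\rho }\), whence

$$\begin{aligned} \frac{1}{16\pi }\int _{S_{\rho }}H^2\,dA=1-\frac{2m}{\rho }, \qquad m_{H}(S_{\rho })=\sqrt{\frac{4\pi \rho ^2}{16\pi }}\left( 1-\left( 1-\frac{2m}{\rho }\right) \right) =\frac{\rho }{2}\cdot \frac{2m}{\rho }=m. \end{aligned}$$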

In terms of implications, Theorem 3.22 provides a nice, relatively explicit upper bound on the Bartnik mass of data having positive Gauss curvature (which, one should recall, implies the strict stability condition \(\lambda _1(-\varDelta _g+K_g)>0\)) and positive constant mean curvature. Unfortunately, this result does not really shed light on the question whether, under those assumptions, minimisers actually exist (which is, to our understanding, conjectured by many to be the case). Hence, we are led to explicitly pose the following question:

Open Problem 3.23

Is there a smooth minimiser for Bartnik data \((\varSigma \cong S^2, g, H)\) having positive Gauss curvature and positive mean curvature?

Of course, this includes the CMC case as a special instance, but there is no physical (nor mathematical) reason why the answer to such a question should really differ between the constant and the non-constant mean curvature case.

Other contributions along these lines can be found in Cabrera Pacheco and Miao (2018) (partial higher-dimensional analogues), Miao et al. (2020) (comparatively different estimates for domains with CMC boundary rather than Bartnik data), and Cabrera Pacheco et al. (2018) (asymptotically hyperbolic manifolds with scalar curvature greater than or equal to \(-6\)); see also Mantoulidis and Miao (2017a, 2017b).

Another perspective: Sobolev extension theorems and applications

In this section we wish to describe a different perspective on constructing solutions to the Einstein constraints, which relates to the gluing methods presented above but also significantly departs from them. We shall start here from the following basic question: given a solution (g, k) on the unit ball of \({\mathbb {R}}^3\) (henceforth denoted by \(B_1\)) of the vacuum constraints in the maximal case, i.e., given a solution of the system

$$\begin{aligned} \left\{ \begin{aligned}& R_g=\Vert k\Vert ^2_g \\& tr_g(k)=0 \\ &div_g(k)=0 \end{aligned}\right. \end{aligned}$$

does there exist a (suitably regular) asymptotically flat initial data set \((g',k')\) on \({\mathbb {R}}^3\) that isometrically contains (g, k) and continuously depends on it?

This question should be understood in formal analogy with the well-known extension theorems for Sobolev functions (see e.g., Appendix A in Ambrosio et al. 2018) and was first posed by Szeftel to Czimek, a PhD student of his at the time, with the goal of ultimately proving a localised counterpart of the bounded \(L^2\)-conjecture. Recall that the results in Choquet-Bruhat and Geroch (1969) ensure that, given an initial data set (M, g, k) satisfying the vacuum constraint equations, there exists a unique maximal and globally hyperbolic spacetime \((L,\gamma )\) that isometrically contains it. Of course, a fundamental point, of self-evident physical significance, would be to find quantitative criteria to determine ‘how large’ such a maximal development is. The bounded \(L^2\)-conjecture roughly asserts that, given an asymptotically flat initial data set (M, g, k) on \(M={\mathbb {R}}^3\), there exists a time \(T>0\) only depending on \(\Vert {\mathrm {Ric}}_g\Vert _{L^2(M,g)}\), \(\Vert \nabla k\Vert _{L^2(M,g)}\), and the volume radius at scale 1 such that the spacetime can be evolved and controlled up to time T. As we already mentioned, this assertion is now a theorem (Klainerman et al. 2015), with a rather monumental proof building on various technical advances.

For a variety of reasons, it is interesting to consider the local counterpart of the same question: given a compact manifold with boundary, one would like to understand how long given data can be evolved and how the evolution can be effectively controlled in the sense explained above. A natural approach is clearly to invoke a suitable extension theorem, and then to appeal to the pre-existing machinery developed for the original conjecture (as many arguments there have a global character). Of course, through a covering argument and the use of a partition of unity, one can ultimately reduce to proving the extension theorem in the specific case of a ball in \({\mathbb {R}}^3\), which serves as the ‘local model’ to then handle compact manifolds with boundary. One should mention that this reduction is technically delicate due to the low regularity of the data (see below). In any event, a partial solution to the question above was provided in Czimek (2018):

Theorem 3.24

Let \((g,k)\in H^{2}(B_1)\times H^{1}(B_1)\) be a solution to the maximal, vacuum constraint equations on \(B_1\subset {\mathbb {R}}^3\). Then there exists \(\varepsilon >0\) such that if

$$\begin{aligned} \Vert (g-\delta ,k)\Vert _{H^2(B_1)\times H^1(B_1)}<\varepsilon \end{aligned}$$

then there exists a solution \((g',k')\) on \({\mathbb {R}}^3\) to the maximal, vacuum constraint equations such that

  • \((g',k')_{| B_1}=(g,k)\)

  • \((g',k')\) is asymptotically flat

  • \(\Vert (g' - \delta , k')\Vert _{H^2_{-1/2}({\mathbb {R}}^3)\times H^1_{-3/2}({\mathbb {R}}^3)}\le C \Vert (g-\delta ,k)\Vert _{H^2(B_1)\times H^1(B_1)}\).

The proof of this theorem requires a considerable amount of work, which is anyway a pleasure to read due to an excellent expository style. That being said, we wish to give a quick sketch of the argument by focusing on three key ideas.

First, one introduces an iteration method to decouple the problem. So, given data (g, k) as in the statement above, one aims at solving the pair of extension problems given by

$$\begin{aligned} \left\{ \begin{aligned}& {g_{1}}_{| B_1}=g \\& R_{g_{1}}=|k|^2 \end{aligned}\right. \end{aligned}$$


$$\begin{aligned} \left\{ \begin{aligned} &{k_{1}}_{| B_1}=k \\ &div_{g_{1}}(k_{1})=0 \\& tr_{g_{1}}(k_{1})=0 \end{aligned}\right. \end{aligned}$$

for \((g_1,k_1)\) asymptotically flat, as encoded in the weighted Sobolev spaces mentioned in the third item of Theorem 3.24. Hence, one restricts \((g_1, k_1)\) back to the unit ball, and considers the two problems above, the first to be solved for \(g_2\) and the second for \(k_2\). Therefore, we define an iterative scheme which produces a sequence of data \((g_i, k_i)\) that provides (if convergent) a solution to our initial extension question. The convergence issue, which is dealt with through a fixed-point argument, is one of the many steps along the course of the proof where the smallness assumption is exploited.

Second, we need to explain how to proceed to solve the two underdetermined problems above. Roughly speaking, one reduces (by invoking the implicit function theorem) to proving the surjectivity of certain linear operators at the Euclidean metric, which in turn is done by means of a very careful analysis which crucially exploits the decomposition of scalar functions, vector fields and tensors into spherical harmonics (more precisely: in suitable Hodge–Fourier bases). In both cases one first appeals to Sobolev extension theorems (in weighted spaces) and then solves for the errors the resulting problems, suitably transformed into determined ones. For instance, in the case of (3.28) one faces (after the extension has been performed, and working at \(g_1=\delta \)) a system of the form

$$\begin{aligned} \left\{ \begin{aligned} &div_{\delta }(k')=\rho \\& tr_{\delta }(k')=0 \end{aligned}\right. \end{aligned}$$

where \(k_1={\overline{k}}+k'\) and \({\overline{k}}\) is the extended datum (which will obviously not solve the equations in general) and \(k'\) is a correction term. One then considers the 3D Hodge system of equations

$$\begin{aligned} \left\{ \begin{aligned}& div_{\delta }(k')=\rho \\ &curl_{\delta }(k')=\sigma \\& tr_{\delta }(k')=0 \end{aligned}\right. \end{aligned}$$

to be solved on \({\mathbb {R}}^3\setminus \overline{B_1}\), with the datum \(\sigma \) to be designed so as to ensure regularity at the interface. Here we see a different incarnation of the same general principle we have already encountered (in very different forms) both in Sects. 2 and 3, which is the principle of transforming an underdetermined problem into a determined one by introducing extra, tailor-made, conditions.

Lastly, let us explain how the choice of \(\sigma \) is designed so as to ensure regularity at the unit sphere \(\partial B_1\subset {\mathbb {R}}^3\). To that aim, consider the following simpler model problem. Given \(f\in C^{\infty }_c({\mathbb {R}}^3\setminus \overline{B_1})\) the system

$$\begin{aligned} \left\{ \begin{aligned}& \varDelta u= f\quad \text { in }\,{\mathbb {R}}^3\setminus \overline{B_1} \\& u=0\quad \text { in } \partial B_1 \\& \partial _r u=0\quad \text { in }\, \partial B_1 \end{aligned}\right. \end{aligned}$$

is not solvable in general; however, we may wonder which choices of f allow for solutions.

That being said, one observes that the condition \((\partial _r u)_{r=1}=0\) can indeed be imposed via a suitable design of f: after decomposition in spherical harmonics, one has the pointwise identities

$$\begin{aligned} f^{(lm)}=\frac{1}{r^{l+1}}\partial _r(r^{2l+2}\partial _r(r^{-l-1}u^{(lm)})), \ \ l\ge 0, m\in \left\{ -l,\ldots , l\right\} , \end{aligned}$$


$$\begin{aligned} -(\partial _r u^{(lm)})_{| r=1}=\int _{1}^{\infty }\partial _r(r^{2l+2}\partial _r(r^{-l-1}u^{(lm)}))\,dr =\int _{1}^{\infty }r^{l+1}f^{(lm)}\,dr. \end{aligned}$$

Therefore, imposing the integrals on the right-hand side to vanish one can indeed ensure that \(\partial _r u =0\) along the unit sphere. We note that, in fact, if u solves the problem above then it is readily checked that \((\partial ^k_r u)_{r=1}=0,\) for all \(k\ge 0\). This is the trick which is exploited in suitably selecting the tensor \(\sigma \) in the (much harder) problem at hand.
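For the reader's convenience, here is a short sketch of how the second identity follows from the first one. It relies on an assumption that is not spelled out above, namely that \(u^{(lm)}\) decays fast enough at infinity for the boundary term at \(r=\infty \) to vanish. Multiplying the first identity by \(r^{l+1}\) and integrating in r, one finds

$$\begin{aligned} \int _{1}^{\infty }r^{l+1}f^{(lm)}\,dr&=\Big [r^{2l+2}\partial _r(r^{-l-1}u^{(lm)})\Big ]_{r=1}^{r=\infty } \\&=0-\Big (\partial _r u^{(lm)}-(l+1)u^{(lm)}\Big )_{| r=1} =-(\partial _r u^{(lm)})_{| r=1}, \end{aligned}$$

where the last equality uses the Dirichlet condition \(u=0\) along \(\partial B_1\).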

Remark 3.25

We note that, in comparison to some of the gluing results we presented in Sect. 3, this construction has two desirable properties: it requires no buffer region and it can be performed without loss of regularity.

While Theorem 3.24 suffices to prove a significant localised version of the bounded \(L^2\)-conjecture, which was indeed done in Czimek (2019), there remains the question (from the perspective of solving the Einstein constraints) of understanding whether an extension procedure can be designed under less restrictive assumptions. We can then pose the following:

Open Problem 3.26

Develop an effective extension theory for non-maximal data and in a non-perturbative regime.

Spaces of solutions to the Einstein constraint equations

In this fourth part of the review, we wish to describe spaces of solutions to the constraints or, in other words, the structure of (what is somewhat improperly called) the constraint manifold, both from a local/analytic perspective and from a global/topological one. Our discussion connects to the basic notion of linearisation stability, and has fundamental physical motivations since, from a Hamiltonian viewpoint, the Einstein field equations can be recast (at least locally) as a dynamical system on the constraint manifold. On the other hand, the themes we will present in Sects. 4.2 and 4.3 are naturally linked to the important mathematical problem of determining the homotopy type of spaces of positive scalar curvature metrics, see e.g., the surveys Rosenberg and Stolz (2001), Schick (2014) or Carlotto (2021).

Local structure and linearisation stability

For a fixed background manifold M, of dimension \(n\ge 3\), we shall study the space of tensors solving the constraint equations (1.2), which we simply write as

$$\begin{aligned} \varPhi (g,\pi )=(2\kappa \mu +2\varLambda ,\kappa J) \end{aligned}$$

where \(\varPhi \) is defined on an open linear cone in a product \(X_1\times X_2\) and takes values in \(Y_1\times Y_2\) for suitable functional spaces \(X_1, X_2\) and \(Y_1, Y_2\), and the right-hand side is tacitly understood to be fixed once and for all. Since M may be closed or compact with boundary or even non-compact (and, again, possibly with boundary in that case as well) the specification of these spaces very much depends on the choice of the background manifold, and of course on the level of regularity of the data one wishes to work with. For the purposes of this preliminary discussion, the reader may focus on the case when M is a closed manifold, or regard the argument at a purely formal level. Also, our primary concern will be the (simpler) vacuum case, although some of the results below are indeed valid under more general assumptions.

The most basic question one may wish to ask concerns the regularity of the constraint manifold, henceforth denoted by \({\mathscr {C}}\), and defined as the pre-image through \(\varPhi \) of \((2\kappa \mu +2\varLambda ,\kappa J)\). If \(\varPhi \) is continuous then \({\mathscr {C}}\) is a closed subset of \(X_1\times X_2\), but is it a topological manifold? Is it actually a smooth (infinite-dimensional) subvariety in \(X_1\times X_2\)? The first thing to do when approaching this problem is of course to try and apply a suitable submersion criterion, either in a Banach space setting or, even more generally, for products of Fréchet spaces. Loosely speaking, and sticking to the Banach setting for the sake of simplicity, the outcome of that machinery is that a sufficient condition for the constraint manifold to be a \(C^1\) submanifold near a point \((g,\pi )\) is that the differential \(D\varPhi _{(g,\pi )}\) be surjective at the point in question, and that its kernel splits. In turn, if we are in a reasonably tame context (from the perspective of Functional Analysis), the latter condition occurs if and only if the dual of the linearised map \(D\varPhi ^{*}_{(g,\pi )}\) is injective (in other words: if we are not at a Killing initial data set) and has injective symbol. Also, note that in that case we also get that \(\varPhi : X_1\times X_2 \rightarrow Y_1\times Y_2\) is actually a submersion, and thus in particular the vectors in the kernel of \(D\varPhi _{(g,\pi )}\) precisely correspond to the tangent vectors to differentiable curves lying on the constraint manifold.

These arguments were first formalised, at least for the time-symmetric case (and mostly focusing on \(C^{\infty }\) metrics), in Fischer and Marsden (1975a) where the mapping properties of the scalar curvature map are studied in detail. In particular, it is shown there that the set of scalar-flat metrics is regular (i.e., a differentiable manifold) near any metric g that is not static, which is nothing but the incarnation of the submersion criterion above in this specific context. The results we collected in Appendix D therefore provide sufficient criteria that ensure a locally regular structure of the constraint manifold: it is enough that either \(R_g/(n-1)\) not be a constant in the spectrum of the Laplace operator \(\varDelta _g\), or if \(R_g\) is identically zero that the manifold \((M,g)\) not be Ricci flat. Concerning the scalar-flat case, recall that it was proven in Beig et al. (2005) that static metrics are non-generic in the class of scalar-flat Riemannian metrics on a closed 3-manifold M. In particular, it follows from this result that (in this particular setting, so in the time-symmetric case) the constraint manifold is smooth away from a meager set of points; in this specific case we will see later that Fischer and Marsden can actually say a lot more. We further anticipate that an analogous genericness result in the general (i.e., not necessarily time-symmetric) case is still lacking, even when the background manifold is a closed 3-manifold.

Now, the problem with the approach sketched above is twofold, in that on the one hand it only provides a sufficient (but clearly not necessary!) criterion for smoothness, and on the other hand it does not shed any light on the local structure of the constraint manifold near singular points (if any), namely about a canonical form for the singularity, the existence of tangent cones etc. It turns out that these questions are strictly connected, in fact intertwined, with a seemingly different one, namely the question of whether or not a solution of the linearised Einstein field equations (relative to a given background solution) actually approximates to first order a curve of exact solutions to the nonlinear equations. In order to proceed and make this link explicit, we shall start by recalling the basic definition in a general context: given Banach manifolds \(X, Y\), a \(C^1\) mapping \(F:X\rightarrow Y\) shall be called linearisation stable at \(x_0\in X\) if for any \(h\in \ker DF_{x_0}\) there exists a \(C^1\) curve \(x:(-\epsilon ,\epsilon )\rightarrow X\), for some \(\epsilon >0\), such that the following three conditions hold:

$$\begin{aligned} x(0)=x_0, \ \ x'(0)=h, \ \ F(x(t))=F(x_0). \end{aligned}$$

In analogy with all moduli problems (of which there are plenty within geometric analysis), this is just a special instance of the general notion of integrability. Informally speaking, if (say) \(F(x_0)=0\) and we are studying the zero set of the mapping F then we say that \(x_0\) is linearisation stable if every infinitesimal deformation is tangent to a family of actual solutions.

To get a feeling about this, let us consider a finite-dimensional toy model and test the notion above in that case. Let \(p:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) be a homogeneous quadratic polynomial in \(d\ge 2\) variables having mixed signature, to fix the ideas say \(p(x_1,\ldots , x_d)=-x^2_1+x^2_2+x^2_3+\cdots +x^2_d\), and let us study the zero locus \(\{p(x_1,\ldots , x_d)=0\}\subset {\mathbb {R}}^d\) from the perspective of linearisation stability. Clearly, this set is smooth away from the origin (as follows from the implicit function theorem) and is non-smooth at the origin, where a quadratic singularity actually occurs. Correspondingly, it is readily checked that any non-zero solution is linearisation stable, while the origin is not linearisation stable since the kernel of the differential of p equals \({\mathbb {R}}^d\) itself (whereas a direction h can only be tangent to a curve in the zero locus if \(p(h)=0\)). Setting \(q=p^2\), we note that p and q have the same zero set (thus \(C^{\infty }\) at any point except the origin), but a point satisfying \(q(x_1,\ldots , x_d)=0\) is never linearisation stable since of course the differential of q vanishes at any such point. So, these sorts of spurious vanishing phenomena make the link between linearisation stability and smoothness of the level set in question less obvious than it may look at first sight. However, we will soon see that these sorts of pathologies actually do not occur for the Einstein constraint operator.
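The toy model can also be explored numerically. The following is a minimal sketch in plain Python for the special case \(d=3\). All function names, the sample points and the explicit curve are hypothetical choices made here for illustration. It checks that a kernel direction at a regular point of the cone is tangent to an exact curve of zeros, that at the origin the second-order quantity \(D^2p_0(h,h)=2p(h)\) distinguishes integrable from non-integrable directions, and that the differential of \(q=p^2\) vanishes at a point of the zero set.

```python
import math

def p(x):
    # Mixed-signature quadratic form p(x) = -x_1^2 + x_2^2 + ... + x_d^2.
    return -x[0] ** 2 + sum(xi ** 2 for xi in x[1:])

def grad_p(x):
    # Differential Dp_x, written as a gradient vector.
    return [-2 * x[0]] + [2 * xi for xi in x[1:]]

def grad_q(x):
    # Differential of q = p^2 via the chain rule: Dq_x = 2 p(x) Dp_x.
    return [2 * p(x) * gi for gi in grad_p(x)]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def second_order_obstruction(h):
    # D^2 p_0(h, h) = 2 p(h): must vanish for h to be integrable at the origin.
    return 2 * p(h)

def curve_on_cone(t):
    # Hypothetical explicit curve in {p = 0} with x(0) = (1, 1, 0), x'(0) = (0, 0, 1).
    return [math.sqrt(1.0 + t * t), 1.0, t]

x0 = [1, 1, 0]          # regular point of the cone {p = 0}
h_tangent = [0, 0, 1]   # element of ker Dp_{x0}
h_null = [1, 1, 0]      # direction at the origin with p(h) = 0
h_bad = [0, 1, 0]       # direction at the origin with p(h) = 1

# Regular point: the kernel direction is tangent to an exact curve of zeros.
in_kernel = dot(grad_p(x0), h_tangent) == 0
stays_on_cone = all(abs(p(curve_on_cone(t))) < 1e-12 for t in (-0.5, -0.1, 0.1, 0.5))

# Origin: Dp_0 = 0, so every h is in the kernel, but only null directions integrate
# (the straight line t -> t * h_null stays in the zero set).
line_on_cone = all(p([t * hi for hi in h_null]) == 0 for t in (-2, -1, 1, 2))

print(in_kernel, stays_on_cone, line_on_cone)
print(second_order_obstruction(h_null), second_order_obstruction(h_bad))
print(grad_q(x0))  # differential of q at a point of the zero set
```

The direction `h_bad` lies in the kernel of the differential of p at the origin but has non-zero second-order obstruction, which is the finite-dimensional counterpart of a non-integrable infinitesimal deformation.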

The general notion of linearisation stability takes two important incarnations as far as the Einstein equations are concerned. At the level of initial data sets, the specification of that concept is rather straightforward:

Definition 4.1

Given a smooth manifold M, and assuming that the constraint map \(\varPhi : X_1\times X_2\rightarrow Y_1\times Y_2\) is \(C^1\), we say that \(\varPhi \) is linearisation stable at a solution of the vacuum Einstein constraint equations \((g_0,\pi _0)\) if \(D\varPhi _{(g_0,\pi _0)}(h,\omega )=0\) implies that there exists a \(C^1\) curve \((g,\pi ): (-\epsilon ,\epsilon )\rightarrow X_1\times X_2\), defined for some \(\epsilon >0\), such that

$$\begin{aligned} (g,\pi )(0)=(g_0,\pi _0), \ \ (g,\pi )'(0)=(h,\omega ), \ \ \varPhi (g(t),\pi (t))=0. \end{aligned}$$

In fact, in the specific setting we are dealing with, the notion above arose after, and in relation to, its natural Lorentzian counterpart, for the Einstein tensor G seen as a differentiable mapping between spaces of tensors having a suitable degree of regularity. To give this definition, assume that (as it is always the case in practice) given a subset V of L one can associate (to the pair \(Z_1, Z_2\) such that \(G: Z_1\rightarrow Z_2\) is a \(C^1\) mapping) a pair of Banach spaces \(W_1, W_2\) such that for tensors supported in V one has isometric embeddings \(W_1\rightarrow Z_1\), \(W_2\rightarrow Z_2\).

Definition 4.2

Let L be a smooth manifold, and assume that the Einstein map \(G: Z_1\rightarrow Z_2\) is \(C^1\) for suitable Banach manifolds of Lorentzian metrics \(Z_1\) and symmetric (0, 2)-tensors \(Z_2\). Given \(\gamma _0 \in Z_1\) a Ricci flat metric, and a Cauchy hypersurface (cf. Appendix A) M for \((L,\gamma _0)\), we say that G is linearisation stable at \(\gamma _0\) with respect to M if \( D G_{\gamma _0}(\eta )=0\) implies that there exist

  • a tubular neighbourhood V of M in L, and corresponding functional spaces \(W_1, W_2\) obtained by restriction of \(Z_1, Z_2\) respectively, as explained above;

  • a \(C^1\) curve \(\gamma : (-\epsilon ,\epsilon )\rightarrow W_1\), defined for some \(\epsilon >0\), such that

    $$\begin{aligned} \gamma (0)=\gamma _0, \quad \gamma '(0)=\eta , \quad G_{\gamma _t}=0. \end{aligned}$$

This Lorentzian notion of linearisation stability was first proposed in Fischer and Marsden (1973) and in Choquet-Bruhat and Deser (1972), and (at least from this specific perspective) the corresponding definition at the level of initial data sets, namely Definition 4.1, came as a consequence. Of course, the reason why we may have to restrict the ambient manifold L to a possibly smaller open subdomain V concerns the fact that local solvability results for the Einstein field equations (cf. Sect. 1.2) may not ensure a common domain. In practice, Definition 4.2 is designed so that the previous two notions turn out to be equivalent, which ensures that the hyperbolic question about the integrability of solutions to the linearised problem reduces to the somewhat simpler, elliptic problem one poses at the level of initial data sets: in the setting above, there are consistent choices of the functional spaces \(X_1, X_2\), \(Y_1, Y_2\) and \(Z_1, Z_2\) so that G is linearisation stable at \(\gamma _0\) with respect to \((M,g_0,k_0)\) in the sense of Definition 4.2 if and only if \(\varPhi \) is linearisation stable at \((g_0,\pi _0)\) in the sense of Definition 4.1.

We now wish to connect the two big problems presented above, namely the question about smoothness of the constraint manifold on the one hand and the question about linearisation stability for data sets on the other. Once again, it is instructive to first look at the time-symmetric case. The result below was fully established in Arms and Marsden (1979) (see Theorem 1 therein).

Theorem 4.3

In dimension \(n\ge 3\) the scalar curvature map \(R: X_1\rightarrow Y_1\) is linearisation stable at g if and only if \(L_g\) is surjective.

A few comments are appropriate:

  (a) Here the functional spaces can be chosen as follows: \(X_1\) denotes the space of \(W^{s,p}\) Riemannian metrics, while \(Y_1\) denotes the space of \(W^{s-2,p}\) functions on M (for M a closed, orientable manifold of dimension n) and the restrictions on the exponents are given by the sole inequality \(s>n/p+1\);

  (b) That surjectivity be sufficient for linearisation stability follows straight from the implicit function theorem in the form of a standard submersion criterion in the context of Banach spaces;

  (c) The converse is much more subtle and relies, in a crucial way, on earlier contributions in Bourguignon et al. (1976) where it is proven (in particular) that pseudo-differential operators, acting between vector bundles over a given compact manifold, whose principal symbol is surjective but not injective have in their kernel an infinite dimensional linear space of elements supported in any given non-empty open set of the base manifold;

  (d) in particular one exploits the quadratic vanishing conditions for integrable directions h in the form of the identity

    $$\begin{aligned} \int _M f \,D^2R_g (h,h)\,dV_g=0 \end{aligned}$$

    where the tensors h are chosen in the kernel of \(L_g\), divergence-free and supported in an open set where the static potential f (cf. Appendix D) is strictly positive. By explicitly computing the second variation of the scalar curvature map, one easily gets an identity that allows one to derive a bound of the form \(\Vert h\Vert _{W^{1,2}}\le C \Vert h\Vert _{L^2}\), which in turn forces h to lie in a finite dimensional space, a contradiction.
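The quadratic identity in item (d) can be motivated by the following formal computation. It is only a sketch, under the assumptions that \(L_g\) denotes the linearisation \(DR_g\) of the scalar curvature map, that f lies in the kernel of its formal \(L^2\)-adjoint \(L^{*}_g\), and that \(t\mapsto g(t)\) is a curve of metrics, twice differentiable in t, with \(g(0)=g\), \(g'(0)=h\) and \(R_{g(t)}\) independent of t (the symbol \(g''(0)\) is introduced here for illustration only). Differentiating \(R_{g(t)}\) twice at \(t=0\) gives

$$\begin{aligned} D^2R_g(h,h)+L_g(g''(0))=0, \end{aligned}$$

so that, integrating against f and moving \(L_g\) onto f,

$$\begin{aligned} \int _M f\,D^2R_g(h,h)\,dV_g=-\int _M f\,L_g(g''(0))\,dV_g=-\int _M \langle L^{*}_g f, g''(0)\rangle _g\,dV_g=0. \end{aligned}$$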

Theorem 4.3 can be regarded as a helpful criterion to conclude linearisation stability but also ensures that any static manifold does indeed provide an example of linearisation instability: for instance, that is certainly the case both for standard round spheres, and for flat manifolds. Note that such a conclusion can also be achieved, without appealing to Theorem 4.3, by more direct/specific arguments (as those given in Fischer and Marsden 1975a). However, it was also shown in the same article (see, in particular, Theorem B’ therein) that, in the same setting, the class of scalar-flat Riemannian metrics is a closed differentiable manifold (in fact consisting of the disjoint union of the flat metrics and a component of scalar-flat metrics that are not Ricci flat). Hence one may have a regular constraint manifold even in cases when the (sufficient) submersion criterion fails.

Open Problem 4.4

Does the same conclusion as in (second part of) the previous remark hold in the presence of a positive cosmological constant \(\varLambda >0\)? In other words: is it possible to determine whether the class of Riemannian metrics (on a given closed 3-manifold M) having constant scalar curvature equal to 6 is a closed differentiable manifold? If not, how large is the singular set of the constraint manifold?

Remark 4.5

As we will explain in Sect. 4.2, for such a class to be non-empty, it must be (assuming for simplicity the orientability of M) that

$$\begin{aligned} M^3 \cong \left( S^3/\varGamma _1 \# \cdots \# S^3/\varGamma _p\right) \#_{i=1}^q (S^2\times S^1) \end{aligned}$$

where \(\#\) denotes a connected sum operation and, for each \(i=1,\ldots , p\), \(S^3/\varGamma _i\) is a spherical space form (i.e., the \(\varGamma _i\), \(i\le p\), are finite subgroups of SO(4) acting freely on \(S^3\)).

Turning our attention to the general case, it is convenient to first note that not only the linearisation stability but also the presence of symmetries can equivalently be read either at the level of spacetime geometry or at the level of initial data sets: indeed, it was shown in Moncrief (1975) that if a vacuum spacetime (a solution of the vacuum field equations (1.1), so a Ricci flat Lorentzian manifold) admits a Killing vector field then the surjectivity of \(D\varPhi \) fails to be satisfied at any Cauchy surface for the spacetime, and conversely if such a criterion fails to hold on an initial data set, then there is a Cauchy development of these initial data which admits one or more Killing vectors. More precisely, if \((L,\gamma )\) is a Ricci flat Lorentzian manifold of dimension \(1+3\), and M is a closed Cauchy hypersurface, with induced first and second fundamental form given by g and k respectively, then the dimension of \(\ker D \varPhi ^{*}_{(g,\pi )}\) is equal to the number of linearly independent Killing vector fields of \((L,\gamma )\), possibly only defined on an open subset of \((L,\gamma )\) containing M.

With respect to the proof of this statement, the strategy of Moncrief can be summarised as follows. On the one hand, if the spacetime admits a Killing vector field then the pair \((f,X)\) just consists of its normal and tangential projections with respect to M (and indeed it is a simple computation that the Killing equation in the ambient spacetime implies the KIDs condition on that pair). To prove the converse, which is more subtle, one takes \((f,X)\) as an input and integrates (in a suitable gauge) four of Killing’s equations to define a vector field on a Cauchy development of the given initial data set \((M,g,k)\), and then proves that the remaining six Killing equations are also satisfied. In particular, we wish to explicitly stress how Moncrief’s theorem above can easily be phrased in purely local terms instead. For a partial, yet highly non-trivial, analogue of the same result for non-vacuum initial data sets (which, in turn, is a key ingredient for the aforementioned partial resolution of Bartnik’s stationarity conjecture) see Huang and Lee (2020a).

That being said, the fundamental idea in proving the general (i.e., non time-symmetric) version of Theorem 4.3 is of genuinely Lorentzian character: the presence of spacetime Killing vector fields will impose, by virtue of (a suitable version of) Noether’s theorem, some additional nonlinear constraints.

Theorem 4.6

In dimension \(n\ge 3\) the constraint map \(\varPhi : X_1\times X_2\rightarrow Y_1\times Y_2\) is linearisation stable at \((g,\pi )\) if and only if \(D\varPhi _{(g,\pi )}\) is surjective.

More specifically, it was shown in Moncrief (1976) that if \((L,\gamma )\) solves the (vacuum) Einstein field equations and \(\eta \) is an infinitesimal deformation then \(D G_{\gamma }(\eta )\) and \(D^2 G_{\gamma }(\eta ,\eta )\) are both divergence free. Incidentally, on the general link between symmetries (as encoded by Noether’s theorem) and linear instability we also wish to mention the recent work Khavkine (2015), which covers a broad spectrum of physical theories.
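The divergence-free property just mentioned can be understood through the following formal sketch, which is our own rendering of the standard argument and not a transcription of the cited proof. Take the straight line \(\gamma (t)=\gamma +t\eta \) (not assumed to consist of solutions) and write \(\delta _{\gamma (t)}\) for the divergence operator of \(\gamma (t)\), a notation introduced here for illustration. The contracted Bianchi identity reads \(\delta _{\gamma (t)}G_{\gamma (t)}=0\) for all small t. Differentiating once at \(t=0\) and using \(G_{\gamma }=0\), the term involving the derivative of the divergence operator drops out, leaving

$$\begin{aligned} \delta _{\gamma }\big (DG_{\gamma }(\eta )\big )=0. \end{aligned}$$

Differentiating twice at \(t=0\), every term in which the divergence operator is differentiated is applied either to \(G_{\gamma }=0\) or to \(DG_{\gamma }(\eta )\), and the latter vanishes precisely when \(\eta \) is an infinitesimal deformation. Hence

$$\begin{aligned} \delta _{\gamma }\big (D^2G_{\gamma }(\eta ,\eta )\big )=0 \quad \text {whenever } G_{\gamma }=0 \text { and } DG_{\gamma }(\eta )=0. \end{aligned}$$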

Let us now get back to the very problem we started from: when is the constraint manifold smooth near a point \((g,\pi )\) belonging to the zero set of the constraint map? Of course, we have a sufficient condition at hand, but what if \(D\varPhi \) is not surjective? In the time-symmetric case, we have already mentioned the quadratic vanishing condition (4.1) as a necessary condition for the integrability of an infinitesimal deformation h. It turns out that an analogous result holds for the full constraints, and (even more importantly) that the resulting identity provides a complete set of obstructions in the sense that if an infinitesimal deformation satisfies the corresponding equation then it is integrable in the sense of Definition 4.1. So this gives a very explicit criterion for determining the set of actual integrable directions at points where the submersion criterion fails. Hence one is given, at least in principle, all tools needed to analyse the regularity problem for the constraint manifold. In case of a non-smooth point, we can then go on with our discussion by looking at the local structure of the constraint manifold near a non-regular point, and the outcome is that the singularities are defined by the vanishing of a quadratic form and that the constraint manifold does admit a tangent cone at any non-smooth point. Informally speaking, the finite-dimensional toy model described above is in fact a reasonably accurate (in a sense universal) description of the local picture. Here is the relevant statement.

Theorem 4.7

Let \((M,g,k)\) be a CMC, Killing initial data set solving the vacuum constraint equations, with M a closed 3-manifold.

A necessary and sufficient condition for an infinitesimal deformation \((h,\omega )\) to be integrable in the sense of Definition 4.1 is that

$$\begin{aligned} \int _M \langle (f,X), D^2 \varPhi _{(g,\pi )}((h,\omega ),(h,\omega ))\rangle _{g}\,dV_g=0 \end{aligned}$$

for all \((f,X)\in \ker D\varPhi ^{*}_{(g,\pi )}\).

The fact that the obstruction above is the only one (namely that it provides a sufficient criterion for integrability) is in some sense surprising and was proven in Fischer et al. (1980) for the case of one Killing vector field (equivalently: when the kernel of \(D\varPhi ^{*}_{(g,\pi )}\) has dimension one) and then later in Arms et al. (1982) in full generality. Note that in the latter article the case of coupling between gravitational fields (as described by the Einstein tensor) and Yang-Mills fields is also studied.

Remark 4.8

We stress that, in both works we just cited, the authors assume that the initial data set \((M,g,k)\) has constant mean curvature. To the best of our knowledge, the unconditional extension of these sorts of results has not yet been given.

To get an appreciation for the highly non-obvious nature of the conclusion, let us go back one final time to our finite-dimensional toy model. Consider \(F\in C^{\infty }({\mathbb {R}}^d, {\mathbb {R}})\). At a given point \(x\in {\mathbb {R}}^d\) with \(F(x)=0\) either \(\dim \ker DF_x=d-1\) or instead \(\dim \ker DF_x=d\), equivalently \(DF_x=0\). From the perspective of the smoothness properties of the locus \(F(x)=0\), the study at those points where \(DF_x=0\) gives the second-order condition

$$\begin{aligned} D^2 F_x(h,h)=0 \end{aligned}$$

which is a necessary condition for a direction \(h\in {\mathbb {R}}^d\) to be integrable. However, if F vanished to high order, say like in the case \(F=q=p^2\) above (where F would be a fourth-order homogeneous polynomial), then such a second-order condition is by no means sufficient for the integrability of h, since in fact the Hessian of F at the origin is identically zero.

To conclude the discussion about Theorem 4.7 we remark that, especially in the Physics literature, it is more common to see the second-order condition phrased in Lorentzian terms, as the vanishing of the Taub conserved quantities, namely

$$\begin{aligned} \int _M D^2 G_{\gamma }(\eta ,\eta )(X,Z)\,dV_g=0, \ \ \text {for any Killing vector field } X \end{aligned}$$

where Z denotes a (future-pointing) unit normal vector field to \((M,g,k)\) within \((L,\gamma )\). Note that the necessity of this condition follows at once from the previous remark that the tensor \(D^2 G_{\gamma }(\eta ,\eta )\) is divergence free for any infinitesimal deformation \(\eta \).

Remark 4.9

This criterion being given, it would still be desirable to know (either via the submersion criterion or otherwise) that singular points are, in a precise sense, sporadic. In this sense, the Open Problem 4.4 (and, in particular, the questions about the size and structure of the singular set) should also be understood for the case of the full constraints, especially in the cosmological case (so for a compact background manifold without boundary). We note that unconditional genericness for this case is not part of the results in Beig et al. (2005).

Transposing these sorts of arguments, and the whole discussion we have so far presented in this section, to the case of manifolds with boundary and/or to non-compact settings presents some technically non-trivial aspects, which also lead to partly different outcomes. In particular, one needs some care as the conclusions (i.e., the fact that linearisation stability may/may not hold) depend crucially on the choice of the weighted functional spaces, namely on the decay at infinity one imposes. On the analytic side, the Fredholm index of the operators in question is sensitive to the chosen weights, and ‘jumps’ as one goes through an indicial root.
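As a standard illustration of what an indicial root is (a textbook model case, not taken from the references above), consider the flat Laplacian on \({\mathbb {R}}^3\setminus \{0\}\) acting on a homogeneous function \(r^{\lambda }Y^{(lm)}\), where \(Y^{(lm)}\) denotes a spherical harmonic of degree l:

$$\begin{aligned} \varDelta \big (r^{\lambda }Y^{(lm)}\big )=\big (\lambda (\lambda +1)-l(l+1)\big )\,r^{\lambda -2}\,Y^{(lm)}, \end{aligned}$$

which vanishes if and only if \(\lambda =l\) or \(\lambda =-(l+1)\). These exponents play the role of indicial roots in this model: when the decay rate prescribed by the weight crosses one of them, the corresponding harmonic function enters or leaves the weighted space, which is the mechanism behind the jump of the index. How the weight parameter is matched to \(\lambda \) depends on the convention adopted for the weighted spaces.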

In short, one may have asymptotically flat spacetimes with plenty of Killing vector fields and yet well-behaved (even totally geodesic) Cauchy hypersurfaces that are linearisation stable: for instance, the Minkowski spacetime is indeed, in a reasonable sense, linearisation stable (as discussed, independently, in Choquet-Bruhat and Deser 1973 and Fischer and Marsden 1975b; see also Choquet-Bruhat et al. 1977). A very accurate study of the constraint manifold in the asymptotically flat setting, under minimal regularity assumptions on the data, was later developed in Bartnik (2005). More specifically, the author defines a phase space \({\mathscr {F}}\) consisting of pairs \((g,\pi )\) lying in \(H^2_{\text {loc}}\times H^1_{\text {loc}}\) whose decay at infinity is encoded by suitable weighted Sobolev spaces, and rigorously proves that the constraint manifold \({\mathscr {C}}\) is a smooth (i.e., \(C^{\infty }\)) Hilbert submanifold of \({\mathscr {F}}\). We have explicitly noted the regularity assumptions as they are well-connected to some of the most significant developments on the ‘hyperbolic’ side of the theory, namely the positive resolution of the bounded \(L^2\) conjecture (Klainerman et al. 2015).

At the time Bartnik (2005) was written, the best local existence and uniqueness results for the vacuum Einstein field equations required more than the above, in fact that \((g,\pi )\in H^s\times H^{s-1}\) for some \(s>2\) (in chronological order, see Fischer and Marsden 1972, Hughes et al. 1977 and the much later refinements in Bahouri and Chemin 1999 and Klainerman and Rodnianski 2005b), hence the author remarked an apparent discrepancy between the sharp/natural assumptions arising in the definition of the constraint manifold, and those arising when considering the well-posedness of the evolution problem: it is now an important and remarkable fact that this discrepancy seems to have been resolved, and the two theories exhibit some good degree of consistency. In addition, this correspondence allows one to extend to the asymptotically flat context some of the basic results we have seen above, like e.g., the identification of Killing pairs at the level of initial data with the spatial restriction of true (i.e., physical) vacuum spacetime Killing vector fields.

The way \({\mathscr {C}}\) is proven to be a \(C^{\infty }\) Hilbert manifold is by first setting up functional spaces where the constraint map \(\varPhi \) is smooth (in the sense that it has infinitely many Fréchet derivatives), and then by proving that (in such setting) there are no weak solutions to the equation \(D\varPhi ^{*}_{(g,\pi )}(\xi )=0\). This implies, in fact, that all level sets of \(\varPhi \) are smooth, the constraint manifold just being a special case. We further note that in the same article Bartnik shows that the energy–momentum vector is a well-defined, smooth (vector-valued) functional on the constraint manifold \({\mathscr {C}}\), and also discusses the sharp assumptions for cleverly extending such functional to a subset (as large as possible) of the phase space \({\mathscr {F}}\). With such an extended definition, he then proves that the functional in question is well-defined in the sense that its value does not depend on the specific choice of the structure at infinity (thereby extending the much earlier analysis given in Bartnik 1986).

For a partial transposition of this program to the asymptotically hyperbolic setting see Delay and Fougeirol (2016), while for a treatment of the cosmological case (closed manifolds, under a no-KIDs assumption) see the very recent work Delay (2020) instead. A different project, carried through independently and stemming from the very same questions as Bartnik (2005), led to the results in Chruściel and Delay (2004) (which, in turn, can in many ways be regarded as a follow-up to the earlier work by the same authors, Chruściel and Delay (2003), we had mentioned in Sect. 3.2). There it is proven that (both in the compact case and for conformally compactifiable data, so with an eye to asymptotically hyperbolic ones) one can recover a smooth (i.e., \(C^{\infty }\)) Banach space manifold structure for the constraint manifold, and that in fact one can obtain a foliation structure for the phase space so long as there are no Killing initial data (or, more generally, away from those). If we neglect, for simplicity, the weights related to the behaviour at spatial infinity of functions and tensors, the authors consider pairs in \(C^{k,\alpha }_{\text {loc}}\times C^{k,\alpha }_{\text {loc}}\) for \(k\in {\mathbb {N}},\alpha \in (0,1)\), which has the drawbacks that the degree of regularity one imposes is not the most natural from the perspective of the evolution problem (for which one would rather be led to require one derivative less on the momentum, as was in particular the case in Bartnik’s work) and that, even if we neglect this issue, Schauder data are anyway more problematic when directly proving a short-time solvability result for the Einstein field equations (1.1).

Furthermore, partly motivated by Bartnik’s aforementioned network of conjectures describing mass-minimising extensions of given Bartnik data (cf. Sect. 3.6), a lot of effort has been devoted to establishing results in the same spirit for manifolds with boundary, where certain geometric boundary data are fixed. This is an interesting and technically subtle direction of research, which has seen some nice advances in recent years. We mention, in particular, McCormick (2015), Anderson and Jauregui (2019) as well as An (2020) (which appeared at the time this survey approached completion) building on interesting earlier work by the same author.

As we have written above, Bartnik’s work is in many ways optimal and provides an excellent correspondence between the elliptic and the hyperbolic side of the subject. That being said, there remains an important aspect of the theory that still needs to be understood, namely the well-posedness/local behaviour of the Hamiltonian flow as an ODE happening along \({\mathscr {C}}\subset {\mathscr {F}}\). To better understand this matter, let us keep in mind that, rather than embracing a Lagrangian perspective, one can equally well follow Arnowitt et al. (1961) and Arnowitt et al. (1962), and reformulate the dynamics by means of a suitable Hamiltonian functional, as one would do in classical mechanics. In that perspective, and again focusing on the vacuum case for simplicity, to a local foliation by spacelike hypersurfaces one can associate the dynamic variables g and \(\pi \), and describe their motion, in terms of the lapse-shift pair \(\xi =(f,X)\), precisely as a Hamiltonian evolution problem which (due to its conservative character) occurs in \({\mathscr {C}}\).

Getting back to the well-posedness issue, when working with such Sobolev spaces as in Bartnik (2005) (allowing for very irregular functions and tensors) the Hamiltonian flow is only ‘densely defined’ in a classical sense, but does not possess any obvious definition for generic data. To the best of our knowledge this specific aspect has not seen any further advances in the last fifteen years. It seems to the author of the present review that there may indeed be a fascinating connection with the theory of weak flows of irregular vector fields, as developed starting with the seminal work DiPerna and Lions (1989) and then later in Ambrosio (2004) (see the surveys De Lellis 2008 and Ambrosio and Trevisan 2017).

The shape of the set of initial data

Let M be a smooth manifold of dimension \(n\ge 3\), possibly non-compact and/or with non-empty boundary. For a choice of topological vector spaces \(X_1, X_2\) and \(Y_1, Y_2\) subject to the sole condition that the constraint map \(\varPhi : X_1\times X_2\rightarrow Y_1\times Y_2\) be continuous (i.e., \(C^0\)) we wish to describe our current understanding of the set \({\mathscr {C}}\) of initial data sets, as defined in the previous section (once again, here we shall mostly stick to the vacuum case, but possibly in presence of a non-zero cosmological constant \(\varLambda \)).

More specifically, one would like to understand the topology of \({\mathscr {C}}\) as encoded by its homotopy groups. As a special (and, in some sense, preliminary) case of the previous question, one would also like to determine conditions on M implying that \({\mathscr {C}}\ne \emptyset \). In general, we anticipate that these questions turn out to be remarkably subtle, and their answers are typically partial and only known in special cases. We will give an outline of what is the state of our knowledge at the time this survey is written, with a focus on model cases of physical relevance.

Remark 4.10

Unless otherwise stated, in this section we will deal with spaces of smooth (i.e., \(C^{\infty }\)) tensors. This will always be the case when dealing with compact underlying manifolds, while we will have to violate this rule when discussing asymptotically flat spaces (in which case we will explicitly describe the functional setup for our discussion).

We shall start by considering the case of manifolds without boundary, which has a much richer literature and for which a lot more is known. We will first focus on the case of closed manifolds and then (essentially by means of a blow-up procedure in the spirit of Schoen 1984) we will discuss asymptotically flat Riemannian manifolds. (Almost) all we know on the shape of \({\mathscr {C}}\) follows, by suitable reductions, from results on the space of positive scalar curvature (henceforth: PSC) metrics on closed 3-manifolds, so let us proceed from there. First of all, Perelman’s work on the Ricci flow with surgery (see Perelman 2002, 2003a, b, as well as the monographs Kleiner and Lott 2008 and Morgan and Tian 2014), which led to the solution of the Poincaré and geometrisation conjectures, has among its byproducts a neat characterisation of those 3-manifolds supporting positive scalar curvature metrics.

Theorem 4.11

Let \(X^3\) be a connected, orientable, compact manifold without boundary, supporting metrics of positive scalar curvature. Then

$$\begin{aligned} X^3 \cong \left( S^3/\varGamma _1 \# \cdots \# S^3/\varGamma _p\right) \#_{i=1}^q (S^2\times S^1) \end{aligned}$$

where \(\#\) denotes a connected sum operation, and for each \(i=1,\ldots , p\), we have that \(S^3/\varGamma _i\) is a spherical space form (i.e., \(\varGamma _i\), \(i\le p\), are finite subgroups of SO(4) acting freely on \(S^3\)).

Vice versa, any such manifold supports Riemannian metrics of positive scalar curvature.

For the sake of clarity, we note that one of the two implications in Theorem 4.11 follows at once from a classical, yet fundamental fact about positive scalar curvature metrics:

Theorem 4.12

If two oriented manifolds \(X_1\) and \(X_2\) support positive scalar curvature metrics, then so does their connected sum \(X_1\# X_2\).

The proof of this result has been presented, independently, in Gromov and Lawson (1980a) and in Schoen and Yau (1979c). In the former, the argument has a rather constructive character: given \(g_1\) (respectively \(g_2\)) a positive scalar curvature metric on \(X_1\) (respectively \(X_2\)), one can build a metric g on \(X_1\# X_2\) that coincides with \(g_1\) away from a small geodesic ball on \(X_1\), with \(g_2\) away from a small geodesic ball on \(X_2\), resembles a cylindrical (product) metric near the center of \(S^2\times I\) and (most importantly) has positive scalar curvature everywhere. The construction is thus local around two given points on the manifolds that serve as input. Now, as far as the application above is concerned, it is clear that both \(S^3/\varGamma \) and \(S^2\times S^1\) support positive scalar curvature metrics (for instance: the round, and product ones) hence the conclusion comes straight by invoking Theorem 4.12. The other implication is where the Ricci flow with surgery comes into play, and the description of that part of the proof is a monumental piece of mathematics that goes way beyond the scope of this review.

If we think about the vacuum Einstein constraints in the time-symmetric case (but possibly with a positive cosmological constant) it is worth noting how the combination of Theorem 4.11 with the following theorem (see Kazdan and Warner 1975a, c, and the later overview in Kazdan and Warner 1975b) also allows one to (easily!) determine the classes of 3-manifolds supporting a constant positive, or identically zero, scalar curvature metric.

Theorem 4.13

Let \(M^n\) be a closed, connected manifold of dimension \(n\ge 3\). We shall say that \(M^n\) is:

  • of type I if it supports a metric of non-negative scalar curvature, but not identically zero;

  • of type II if it supports a metric of non-negative scalar curvature, but any such metric is actually scalar-flat;

  • of type III if it does not belong to the first two categories, namely if any metric it supports has negative scalar curvature somewhere.

Then, the following assertions hold:

  1. if M is of type I then any smooth function can be realised as the scalar curvature of a smooth Riemannian metric on M;

  2. if M is of type II then a smooth function is the scalar curvature of a Riemannian metric if and only if either it is negative somewhere or is identically zero;

  3. if M is of type III then a smooth function is the scalar curvature of a Riemannian metric if and only if it is negative somewhere.

Remark 4.14

Let \(M^n\) be a type II manifold (in the sense above) and let g be a Riemannian metric of non-negative scalar curvature. Then we claim that g is actually Ricci-flat. This can be seen in multiple ways (for instance, one effective approach is through Hamilton’s Ricci flow, as we shall see below), but it may be instructive to argue through purely elliptic methods. Indeed, if g cannot be deformed to a positive scalar curvature metric nearby, then we know that the applicability of the implicit function theorem must be obstructed, hence g must be a static metric. Denoting by f a (non-trivial) static potential, the traced equation reads (cf. Appendix D)

$$\begin{aligned} \varDelta _g f =-\frac{R_g}{n-1}f \end{aligned}$$

but the right-hand side is zero (because M is of type II) hence f must be constant and so finally (from the static equation) we derive \({\mathrm {Ric}}_g=0\).
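For the reader's convenience, the passage from the static equation to its trace can be spelled out as follows. This is only a sketch: we assume the static equation in the common form \({\mathrm {Hess}}_g f-(\varDelta _g f)g-f\,{\mathrm {Ric}}_g=0\), with \(\varDelta _g={\mathrm {tr}}_g{\mathrm {Hess}}_g\), which is the convention compatible with the traced equation displayed above; the normalisation adopted in Appendix D may differ in inessential ways.

$$\begin{aligned} 0={\mathrm {tr}}_g\left( {\mathrm {Hess}}_g f-(\varDelta _g f)g-f\,{\mathrm {Ric}}_g\right) =\varDelta _g f-n\varDelta _g f-R_g f \quad \Longrightarrow \quad \varDelta _g f=-\frac{R_g}{n-1}f. \end{aligned}$$

When \(R_g\equiv 0\) the potential f is harmonic on a closed manifold, hence a non-zero constant, so that \({\mathrm {Hess}}_g f=0\) and the assumed static equation reduces to \(f\,{\mathrm {Ric}}_g=0\).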

Note, in particular, that when \(n=3\) the whole Riemann curvature tensor is completely determined by the Ricci tensor, so that in particular any Riemannian metric of non-negative scalar curvature on a type II manifold must actually be flat, which implies (by standard facts) that \(M^3\cong S^1\times S^1\times S^1\), a three-dimensional torus.

Remark 4.15

Theorem 4.13 should not be confused with Theorem 2.3 we stated in Sect. 2.1. In fact, the deformations one employs to prove the theorem above do not typically occur within a given conformal class. To fix the ideas, let us briefly illustrate the spirit of these arguments by showing that a manifold of type I (like \(S^n\)) must always support a scalar-flat metric. A well-known result, first proven in Aubin (1970), but see also Carlotto (2021) for a somewhat simpler proof, asserts that any (closed connected) \(n\)-dimensional manifold always supports metrics of negative scalar curvature. Hence, given M of type I, let \(g_{+}\) be a metric of positive scalar curvature, and let \(g_{-}\) be a metric of negative scalar curvature. Set \(g_{s}=(1-s)g_{+}+sg_{-}\); by continuity there exists \(s_{*}\in (0,1)\) such that the first eigenvalue of the conformal Laplacian (for the corresponding metric \(g_{s_{*}}\)) equals zero. As a result, conformally deforming such a metric using the associated first eigenfunction, we prove the claim. Note that the path \((g_s)\) should, at least intuitively, be thought of as transverse to the conformal classes of metrics on M.
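The final conformal step can be made explicit. What follows is a sketch under the standard normalisation of the conformal Laplacian; the symbols \(L_g\), u and \(\tilde{g}\) are our own notation and do not appear in the discussion above. Set

$$\begin{aligned} L_g u=-\frac{4(n-1)}{n-2}\varDelta _g u+R_g u, \qquad \tilde{g}=u^{\frac{4}{n-2}}g, \qquad R_{\tilde{g}}=u^{-\frac{n+2}{n-2}}L_g u \quad (u>0). \end{aligned}$$

The first eigenvalue of \(L_{g_s}\) is positive at \(s=0\), negative at \(s=1\), and depends continuously on s, whence the existence of \(s_{*}\). A first eigenfunction u of \(L_{g_{s_{*}}}\) can be chosen strictly positive and satisfies \(L_{g_{s_{*}}}u=0\), so the formula above gives \(R_{\tilde{g}}\equiv 0\) for \(\tilde{g}=u^{4/(n-2)}g_{s_{*}}\).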

After these important comments, here is the relevant corollary we previously alluded to:

Corollary 4.16

Let \(X^3\) be a connected, orientable, compact manifold without boundary. Then:

  (i) X supports a metric of constant positive scalar curvature if and only if Eq. (4.2) holds;

  (ii) X supports a metric of zero scalar curvature if and only if Eq. (4.2) holds or \(X^3\cong S^1\times S^1\times S^1\).

After having determined those 3-manifolds that do/do not support metrics of positive scalar curvature, it is then natural to proceed and investigate the topology of the corresponding space of metrics. This question has been the object of two major breakthroughs in Geometric Analysis over the last decade. We now know the following:

Theorem 4.17

Let \(X^3\) be a connected, orientable, compact manifold without boundary. Then the space of positive scalar curvature metrics it supports is either empty or contractible.

Some comments are in order:

  (a) the fact that the moduli space of PSC metrics (namely: the quotient of the space of PSC metrics by the action of the diffeomorphism group of the underlying manifold) is path-connected was first proven in Marques (2012) using a remarkably clever combination of the Ricci flow with surgery (as devised by Perelman) and conformal deformation techniques, which come into play crucially in the key backward-in-time inductive argument; in particular, by results in Cerf (1968) it follows that when \(M^3\) is the three-dimensional sphere the space of PSC metrics is itself path-connected, which in turn (as we will see below) allows one to derive the path-connectedness of the space of asymptotically flat, maximal solutions to the vacuum constraints;

  (b) the proof of Theorem 4.17 was much later (and very recently) proposed by Bamler and Kleiner (as part of their program that culminated in the complete proof of the generalised Smale conjecture): the key tool, in their approach, is the result that there is a unique, canonical singular Ricci flow, and that the associated Ricci flow spacetimes depend continuously/smoothly (in a suitable sense) on their initial data. In this approach, the problem of suitably defining and constructing surgeries is somewhat by-passed by means of a more robust approach, where the flow is not necessarily regarded in purely classical, smooth terms but in a suitably weaker form (see Bamler and Kleiner 2019, as well as Kleiner and Lott 2017 and Bamler and Kleiner 2017 for earlier, foundational work on canonical singular Ricci flows and applications).

In the same way we deduced Corollary 4.16 from Theorem 4.11, it would be tempting to derive non-trivial homotopic consequences on the space of solutions to the equation \(R_g=2\varLambda \) (for \(\varLambda \ge 0\)) from Theorem 4.17. One would like to determine, for instance, whether this space is itself path-connected or maybe even contractible. Of course, a tempting way to attack this problem would be to ‘project’ any given positive scalar curvature metric to a constant scalar curvature metric, essentially by appealing to the positive solution of the Yamabe problem: we look at the unique minimiser of the Yamabe functional in the conformal class and deform the given metric using such a conformal factor (and then possibly rescale by a constant so that the scalar curvature is exactly equal to \(2\varLambda \)). The resulting map depends continuously on the background metric, but it is patently not a retraction, the main problem with this approach being that in a given conformal class there are in general multiple constant scalar curvature metrics. In fact, in sufficiently high dimension (\(n\ge 25\)) one even experiences non-compactness of constant scalar curvature metrics within a conformal fiber (see the survey Brendle and Marques 2011 as well as references therein); in low dimension and in particular for \(n=3\) (which is the case we are dealing with) the actual scenario should be much more tame, yet the way constant scalar curvature metrics (with fixed value of the constant) sit inside the space of PSC metrics is not really understood.
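To fix ideas, the two ingredients of this ‘projection’ can be written down explicitly. This is a sketch with standard normalisations, which are our assumption and are not taken from the discussion above; the symbols Q, \(\tilde{g}\), c and \(\lambda \) are ours. On a closed manifold \(M^n\), the Yamabe functional restricted to the conformal class [g] reads

$$\begin{aligned} Q(\tilde{g})=\frac{\int _M R_{\tilde{g}}\,dV_{\tilde{g}}}{{\mathrm {Vol}}(M,\tilde{g})^{\frac{n-2}{n}}}, \qquad \tilde{g}\in [g], \end{aligned}$$

and its critical points within [g] are exactly the constant scalar curvature metrics in that class. As for the final rescaling, if \(R_{\tilde{g}}=c>0\) and \(\varLambda >0\) then

$$\begin{aligned} R_{\lambda ^2\tilde{g}}=\lambda ^{-2}R_{\tilde{g}}=\lambda ^{-2}c, \qquad \text {so that} \qquad \lambda ^2=\frac{c}{2\varLambda } \ \Longrightarrow \ R_{\lambda ^2\tilde{g}}=2\varLambda . \end{aligned}$$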

Getting back to the study of initial data sets, as we anticipated above, Marques was able to derive (from his main result, concerning PSC metrics on closed manifolds) a few remarkable corollaries about certain classes of asymptotically flat initial data sets on \({\mathbb {R}}^3\).

More specifically, he proved that any of the following three spaces is path-connected:

  • Riemannian metrics g on \({\mathbb {R}}^3\) such that \(g_{ij}-\delta _{ij}\in C^{2,\alpha }_{-1}\) and \(R_g=0\);

  • Riemannian metrics g on \({\mathbb {R}}^3\) such that \(g_{ij}-\delta _{ij}\in C^{2,\alpha }_{-1}\) and \(R_g\ge 0\), \(R_g\in L^1\);

  • pairs (g, k) of Riemannian metrics and symmetric tensors on \({\mathbb {R}}^3\) such that \(g_{ij}-\delta _{ij}\in C^{2,\alpha }_{-1}, \ k_{ij}\in C^{1,\alpha }_{-2}\) solving the Einstein constraints in the maximal case, i.e.,

    $$\begin{aligned} \left\{ \begin{aligned}& R_g=\Vert k\Vert ^2_g \\& {\mathrm {tr}}_g(k)=0 \\& {\mathrm {div}}_g(k)=0 \end{aligned}\right. \end{aligned}$$

Here we employ the definitions and the notation of Appendix B as far as weighted Hölder spaces are concerned; note that in the statements above \(\alpha \) denotes any number in the open interval (0, 1). For partial, earlier results on the second class above (that consisting of asymptotically flat metrics of non-negative scalar curvature), obtained through totally different techniques, see also Smith and Weinstein (2004). We further note that, by means of rather straightforward modifications, the case of asymptotically flat 3-manifolds with multiple ends (but, again, with empty boundary) can also be handled.
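It may help to record how the maximal system in the third class arises. The following is a sketch, assuming the vacuum constraints with \(\varLambda =0\) in the common form below, which may differ from (1.2) by inessential conventions:

$$\begin{aligned} \left\{ \begin{aligned}& R_g-\Vert k\Vert ^2_g+({\mathrm {tr}}_g k)^2=0 \\& {\mathrm {div}}_g(k)-d({\mathrm {tr}}_g k)=0 \end{aligned}\right. \qquad \overset{{\mathrm {tr}}_g k=0}{\Longrightarrow }\qquad \left\{ \begin{aligned}& R_g=\Vert k\Vert ^2_g \\& {\mathrm {div}}_g(k)=0. \end{aligned}\right. \end{aligned}$$

In particular \(R_g=\Vert k\Vert ^2_g\ge 0\). Moreover, if the weight is read in the usual way, so that \(k\in C^{1,\alpha }_{-2}\) entails \(|k|=O(|x|^{-2})\) (an assumption on the conventions of Appendix B), then \(R_g=O(|x|^{-4})\) is integrable on \({\mathbb {R}}^3\), and the metric component of any such pair belongs to the second class above.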

In spite of these advances, even for \(n=3\) the problem of fully understanding the topology of asymptotically flat initial data sets remains open, which deserves to be pointed out explicitly.

Open Problem 4.18

Taking \({\mathbb {R}}^3\) as the background manifold, define

$$\begin{aligned} {\mathscr {C}}=\left\{ (g,\pi )\in (\delta _{ij}+C^{2,\alpha }_{-1}) \ \times \ (C^{1,\alpha }_{-2}) \ : \ \varPhi (g,\pi )=0 \right\} . \end{aligned}$$

Is \({\mathscr {C}}\) path-connected? Is it contractible? If not, what are the non-vanishing homotopy groups?

Of course, we note that one may pose several variations on this theme, depending on the choice of the background manifold (its topology, the number of ends ...), and on the other hand one may also try to pose and study these questions in the non-vacuum case, either for a fixed matter model or more abstractly under the sole requirement that the data satisfy the standard dominant energy condition.

Let us now say something about the (partial) answers to the questions we started from in the higher-dimensional scenario, namely for \(n\ge 4\), again sticking to compact manifolds without boundary. We will focus on the time-symmetric case, as essentially no result seems available for the full system of constraints (1.2). The landscape in front of us is, in many respects, more complex and significantly more fragmented compared to the case \(n=3\) which we have so far sketched. Starting with the first pioneering works on spin structures and the Dirac operator (see in particular Lichnerowicz 1963 and then Hitchin 1974) results came, for decades, in the form of obstructions to the possibility for certain classes of manifolds to support metrics of positive scalar curvature. Just a few years later, parallel to the fundamental contributions in Gromov and Lawson (1980a, 1980b, 1983), minimal surface techniques naturally also came into play (see Schoen and Yau 1979a and Schoen and Yau 1979c). The field then flourished in many ways, ultimately leading in 1992 to a rather definitive, yet somewhat abstract, characterisation of those simply connected closed manifolds supporting a PSC metric.

Theorem 4.19

Let \(X^n\) be a closed, simply connected manifold of dimension \(n\ge 5\). Then X supports positive scalar curvature metrics if and only if either it does not admit any spin structure or admits a spin structure such that the (associated) \(\alpha \)-invariant vanishes.

The result above follows by combining the main theorem in Gromov and Lawson (1980a) (showing that manifolds not supporting spin structures do support PSC metrics) with the work in Stolz (1992) concerning the full comprehension of the case when the second Stiefel–Whitney class of the manifold in question vanishes (namely: showing that any such manifold supports a PSC metric if and only if the well-known Lichnerowicz obstruction vanishes). For the basic background on the map \(\alpha \), which is a graded ring homomorphism from the set of spin cobordism classes to the real K-homology of a point, we refer the reader to the beautiful monograph Lawson and Michelsohn (1989).

Remark 4.20

Some further, related comments are appropriate:

  (a) When one considers closed (oriented) manifolds that are not simply connected, the characterisation given by Theorem 4.19 ceases to hold: for instance the three-dimensional torus \(T^3\) supports no positive scalar curvature metric, in spite of being a spin manifold with vanishing \(\alpha \)-invariant.

  (b) For \(n=2,3\) all closed simply connected manifolds do admit positive scalar curvature metrics, while for \(n=4\), there are lots of simply connected manifolds that do not admit positive scalar curvature metrics (in fact, this is the case for any K3 surface, cf. LeBrun 1995, 1999).

  (c) It turns out that the issues raised in the previous two items are not at all isolated, for in fact we still do not have a clear/complete understanding of the situation when the fundamental group is not trivial (cf. Bérard-Bergery 1983; Schick 1998) or when \(n=4\). On the one hand, for non simply connected manifolds (in dimension \(\ge 5\)) we have almost no general existence results for positive scalar curvature metrics, except for very specific fundamental groups. Some information in the case of abelian fundamental groups of odd order may be found in Hanke (2019). On the other hand, the world of four-dimensional manifolds seems to be very peculiar (among other things) with respect to these sorts of questions. Its investigation is naturally connected to Seiberg–Witten theory, see Taubes (1994) and Ruberman (1998), as well as references therein. In particular, we wish to mention the existence of simply connected spin manifolds that have vanishing \(\alpha \)-invariant and yet do not admit any positive scalar curvature metric. This ultimately relies on the striking Theorem 5.8 in Teicher (1999) constructing simply connected general-type complex surfaces which are spin and have signature zero (cf. Moishezon et al. 1996), for indeed employing the Seiberg–Witten invariants one proves that all compact algebraic surfaces of ‘general type’ do not admit metrics of positive scalar curvature (see Moore 1996).

As we already mentioned in the introduction of this survey, the Riemannian positive mass theorem (see Appendix B) can ultimately be deduced from answering the question whether certain closed manifolds, i.e., those that take the form of a connected sum \(T^n\# M^n\) of an n-dimensional torus and an arbitrary (closed, connected and orientable) smooth manifold, support PSC metrics. For the subclass of such manifolds admitting a spin structure (like e.g., n-dimensional tori for any \(n\ge 3\)) the (negative!) answer follows essentially by means of the Weitzenböck identity applied to harmonic spinors; for the class of such manifolds admitting a non-zero degree smooth map to \(T^n\), but possibly assuming the dimensional restriction \(n<8\) (or else appealing to either of the recent developments in Lohkamp 2016 or Schoen and Yau 2017), the conclusion is instead achieved by a downward dimensional inductive argument, based on the Yamabe positive conformal type inherited by a stable minimal hypersurface inside any given closed Riemannian manifold of positive scalar curvature.

At the time this survey approached completion, new interesting contributions came to light: in particular, we wish to mention Chodosh and Li (2019), where the authors prove the conjecture by Gromov asserting that closed aspherical manifolds of dimension 4 or 5 do not support any metric of positive scalar curvature (see also Gromov 2019 for a partly different argument in dimension five). We recall here, for the sake of completeness, that a manifold is called aspherical if its universal cover is contractible (which is for instance the case for tori, thereby linking the question above to the Geroch conjecture described in the previous paragraph).

Moving forward, and raising the bar, let us now say something about what is known concerning the homotopy type of the space of positive scalar curvature metrics on closed manifolds of dimension larger than three. Once again, the results that have been obtained mostly rely on the index theoretic analysis of Dirac operators (or generalisations thereof) or, in the special case of four-manifolds, on gauge-theoretic tools. To make a long story short, and oversimplifying things to the extreme, we can loosely assert that ‘if the background manifold has dimension \(n\ge 4\) the moduli space of PSC metrics has, in many cases of interest, infinitely many connected components.’ Note that, by an elementary topological argument, in any of these cases the same conclusion holds for the set of PSC metrics as well (without quotienting by the action of the diffeomorphism group). A very brief overview of some landmark results concerning the space of positive scalar curvature metrics on the standard n-dimensional sphere \(S^n\) goes as follows:

  • in Hitchin (1974) it is shown that such a space is disconnected when \(n\equiv 0,1 \ (\text {mod} \ 8)\);

  • in Gromov and Lawson (1983) it is shown that such a space has infinitely many connected components when \(n=7\);

  • in Carr (1988) it is shown that such a space has infinitely many connected components when \(n\equiv -1 \ (\text {mod} \ 4), \ n\ge 7\).

Further refining the last of the results mentioned above, it was then proven in Kreck and Stolz (1993) that the corresponding moduli space also has infinitely many connected components for \(n\equiv -1 \ (\text {mod} \ 4), \ n\ge 7\). Later, in Botvinnik and Gilkey (1996) it is discussed how to construct examples of closed, orientable manifolds of any pre-assigned dimension \(n\ge 5\) for which, again, the moduli space in question has infinitely many connected components (see Reiser 2019 for a very recent improvement). Lastly, for what concerns the case \(n=4\), in Ruberman (1998) one can find examples of closed orientable manifolds for which the space of PSC metrics has infinitely many connected components; however note that, to the best of our knowledge, it is still unclear whether for any \(n\ge 4\) it is possible to construct simply connected manifolds such that the space of PSC metrics has infinitely many connected components. The results we have listed above do not, by any means, provide a complete overview of the recent advances in the field. In particular, we have witnessed some partial, yet striking, progress on the problem of determining the higher homotopy groups (the reader is referred to Botvinnik et al. 2010; Hanke et al. 2014; Wiemeler 2020 and references therein).

It would certainly be interesting to derive an obstruction result in the spirit of the ones listed above that may have some rather direct physical implications, for instance in terms of causality or as an a priori violation of a suitable ‘final state conjecture’, be it in the cosmological case or for asymptotically flat spacetimes. In particular, loosely speaking, one could hope to employ techniques of homotopical character to detect dimensional phenomena, i.e., to distinguish the \(1+3\) dimensional scenario from the higher-dimensional ones, at least as far as the description of certain classes of physical phenomena is concerned.

Boundary conditions and the transition to black hole initial data

Proceeding to yet another class of recent advances, we shall now instead focus on compact manifolds with boundary and, later, on those (asymptotically flat, Riemannian) manifolds that can be obtained from those through a ‘blow-up’ procedure at an interior point. Recall that all metrics, and in fact all tensors in play are smooth unless otherwise stated. First of all, we need to understand what the ‘right questions’ to ask are, in this specific context. In that respect, it is helpful to start by recalling the following theorem, which is given in Gromov (1969) (see Theorem 4.5.1 therein).

Theorem 4.21

Let \(X^n\) be a compact manifold with \(n\ge 3\). If \(\partial X\ne \emptyset \), then \(X^n\) always supports positive scalar curvature metrics. In fact, \(X^n\) always supports metrics with positive sectional curvature.

Remark 4.22

Strictly speaking, we note that the result above follows from an assertion about open manifolds. For the analogue of Theorem 4.11 in that category see Bessières et al. (2011); see also Lesourd et al. (2020) as well as the aforementioned Chodosh and Li (2019) for very recent results about the non-existence of PSC metrics for some families of open manifolds of dimension at least four.

So, as it is also plausible from a PDE perspective, we must add boundary conditions for the questions above to be of some interest. Of course, a priori this leaves room for different choices. Motivated by later applications to the study of initial data sets for (certain classes of) asymptotically flat, black hole solutions to the Einstein equations, we will focus here on pointwise conditions on the mean curvature of \(\partial X\). As we explained in Sect. 2.6, if \((L,\gamma )\) is a black hole solution of the Einstein equations, the trace of the event horizon on a spacelike hypersurface inherits an additional condition, on the null mean curvature, which can equivalently be posed as a possibly inhomogeneous condition on the Riemannian mean curvature of \(\partial X\) within (X, g). Then, once again, through suitable compactification arguments, we can derive results for black hole spacetimes from theorems concerning compact 3-manifolds with boundary (endowed with a Riemannian metric). Therefore, given \(X^n\) a compact orientable manifold and g a smooth Riemannian metric on X, and denoting by \(R_g\) its scalar curvature and by \(H_g:\partial X\rightarrow {\mathbb {R}}\) the mean curvature of \(\partial X\), the general theme we wish to develop in this section is the study of sets of Riemannian metrics on X that are ‘cut’ by two pointwise conditions involving these two curvature functions. For example, one can consider

$$\begin{aligned} {\mathscr {M}}{:}{=} \{g \ : \ R_g>0, \ H_g \ge 0 \}, \ \text {or} \ \ {\mathscr {H}}{:}{=} \{g \ : \ R_g\ge 0, \ H_g = 0 \}. \end{aligned}$$

To fix the ideas we will mostly focus, in the present discussion, on the case of \({\mathscr {M}}\), although one may develop a similar program for \({\mathscr {H}}\) (to which we will return later on). We can then ask similar questions as in the closed case, namely one can wonder whether it is possible to characterise those manifolds X such that \({\mathscr {M}}\ne \emptyset \), and (in that case) whether one can determine the homotopy type of \({\mathscr {M}}\) and/or of the moduli space \({\mathscr {M}}'\) obtained by quotienting by the action of (proper) diffeomorphisms of X (without the assumption that they restrict to the identity along the boundary \(\partial X\)).

Very little is known, for the two questions above, when \(n\ge 4\). We shall instead focus on three-dimensional manifolds, and briefly present some of the main results in Carlotto and Li (2019), starting with the following characterisation.

Theorem 4.23

Let \(X^3\) be a connected, orientable, compact manifold with boundary, such that \({\mathscr {M}}\ne \emptyset \). Then there exist three integers \(A,B,C\ge 0\), not all zero, such that X is diffeomorphic to a connected sum of the form

$$\begin{aligned} P_{\gamma _1}\#\cdots \# P_{\gamma _A}\# S^3/\varGamma _1\#\cdots \# S^3/\varGamma _B\# \left( \#_{i=1}^C S^2\times S^1\right) \end{aligned}$$

where for any \(i\le A\) we have that \(P_{\gamma _i}\) denotes a genus \(\gamma _i\) handlebody. Vice versa, any such manifold supports Riemannian metrics of positive scalar curvature and mean-convex boundary.

In the statement, as above, the symbol \(\#\) denotes an interior connected sum; recall that, somewhat informally, a genus \(\gamma \) handlebody is a compact manifold with boundary bounded (say in \({\mathbb {R}}^3\)) by a genus \(\gamma \) closed surface. Also, note that we allow \(\gamma =0\) (i.e., genus zero handlebodies) and that taking the interior connected sum with a copy of \(P_0\) is equivalent to removing an open ball away from the boundary. Before sketching the proof of Theorem 4.23, let us mention some related results. In particular, it is natural to ask how the answer to our questions changes when replacing strict inequalities in the definition of \({\mathscr {M}}\) with weak inequalities and/or vice versa. The answer is given by the following statement.

Theorem 4.24

Let \(X^3\) be a connected, orientable, compact manifold with boundary. Then the following three assertions are equivalent:

  (i) \({\mathscr {M}}_{R>0, H>0}\ne \emptyset \);

  (ii) \({\mathscr {M}}_{R>0, H\ge 0}\ne \emptyset \);

  (iii) \({\mathscr {M}}_{R\ge 0, H>0}\ne \emptyset \).

Furthermore, each of these is equivalent to

  (iv) \({\mathscr {M}}_{R\ge 0, H\ge 0}\ne \emptyset \),

unless \(X^3\cong S^1\times S^1\times I\) (in which case the space \({\mathscr {M}}_{R\ge 0, H\ge 0}\) only contains flat metrics with totally geodesic boundary).
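
To illustrate the exceptional case, here is a quick check (our own illustration, not taken from the cited paper) that \(S^1\times S^1\times I\) does satisfy (iv). Writing \(\theta _1,\theta _2\) for angular coordinates on the two circle factors, \(t\in I=[0,1]\) for the interval coordinate and A for the second fundamental form of the boundary (names chosen here only for convenience), the flat product metric gives

$$\begin{aligned} g=d\theta _1^2+d\theta _2^2+dt^2,\qquad R_g\equiv 0,\qquad A_{\{t=0\}}=A_{\{t=1\}}\equiv 0 \ \Rightarrow \ H\equiv 0, \end{aligned}$$

so that \(g\in {\mathscr {M}}_{R\ge 0, H\ge 0}\), consistently with the parenthetical remark in the statement.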

The proof of this result exhibits various connections with the torus rigidity theorem and the positive mass theorem. We shall not provide further details about it, but rather present an outline of the argument behind Theorem 4.23. In particular we first prove that all manifolds as in the statement support a metric in \({\mathscr {M}}\) and then we prove that all manifolds for which \({\mathscr {M}}\not =\emptyset \) are of that form.


Let us first deal with the second assertion. Thanks to Theorem 4.11, since the statement of Theorem 4.23 only involves connected sums in the interior (thus not affecting the boundary), it suffices to show that for any \(\gamma \ge 0\) the handlebody \(P_\gamma \) can be endowed with a metric of positive scalar curvature and mean-convex boundary. One way to see this is to notice that for any \(\gamma \ge 0\) there exists, in round \(S^3\), a minimal surface of genus \(\gamma \) (thanks to the main result in Lawson 1970): any such surface is two-sided and bounds two domains, both with minimal boundary and scalar curvature equal to 6 at all points.
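
For the reader's convenience, the numerical value quoted above can be checked directly: the round unit sphere \(S^n\) has constant sectional curvature 1, hence

$$\begin{aligned} R=n(n-1)=3\cdot 2=6 \quad \text {for } n=3, \qquad H\equiv 0 \ \text {along a minimal boundary}. \end{aligned}$$

Assuming, as the notation of Theorem 4.24 suggests, that \({\mathscr {M}}\) stands for \({\mathscr {M}}_{R>0, H>0}\), each of the two domains thus carries a metric in \({\mathscr {M}}_{R>0, H\ge 0}\); one way to pass to a strictly mean-convex boundary is then to invoke the equivalence of (i) and (ii) in Theorem 4.24.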

Let us now say something about the other implication, which is much subtler. The key trick to attack it is an old, but perhaps not so well-known remark in Gromov and Lawson (1980b) (Lemma 5.7 therein), asserting that whenever \(X^3\) supports metrics of positive scalar curvature and mean-convex boundary, we can endow its double DX with a metric of positive scalar curvature. This statement, and our related refinements, play a fundamental role in the global economy of Carlotto and Li (2019); here let us take it for granted and see how to proceed. Thanks to such a remark, it is enough to classify the compact orientable three-dimensional manifolds X with boundary such that

$$\begin{aligned} DX \cong \left( S^3/\varGamma _1\# \cdots \# S^3/\varGamma _p \right) \ \#_{i=1}^q (S^2\times S^1), \end{aligned}$$

which is a purely topological matter. To take care of this, we proceed in two steps.

In the first step, we assume that there are only spherical boundary components. In this case we compare the outcome of two topological operations we can perform on X: the first one is the double D (take two copies of X and identify each pair of corresponding boundary components, with opposite orientations) and the second one is the filling F (add a ball to each boundary component, by identification of the boundary spheres). These two operations are always related by the following equation

$$\begin{aligned} DX \cong (FX\# FX) \ \#_{i=1}^{d-1} \ (S^2\times S^1), \end{aligned}$$

where d is the number of boundary components of X. Keeping in mind that the left-hand side of this equation is given (it is provided by Theorem 4.11), we solve this equation for FX (using Milnor’s theorem on the uniqueness of the prime decomposition of 3-manifolds, see Milnor 1962) and get, for suitable integers \(p', q'\)

$$\begin{aligned} FX \cong \left( S^3/\varGamma _1\# \cdots \# S^3/\varGamma _{p'} \right) \ \#_{i=1}^{q'} (S^2\times S^1). \end{aligned}$$

At that stage, we solve for X by simply removing finitely many balls in the interior, namely we obtain

$$\begin{aligned} X \cong (S^3/\varGamma _1\# \cdots \# S^3/\varGamma _{p'})\ \#_{i=1}^{q'} (S^2\times S^1)\setminus \ \cup _{i=1}^r B_i, \end{aligned}$$

which is what we wanted, provided we just keep in mind that any ball removed corresponds to the connected sum with a closed ball, which we denote \(P_0\) (a genus zero handlebody).
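
As a sanity check on the relation between D and F (our own illustration, with hypothetical choices of X), consider the two simplest cases:

$$\begin{aligned} X=B^3 \ (d=1):&\quad FX\cong S^3,\quad DX\cong S^3\cong S^3\# S^3,\\ X=S^2\times I \ (d=2):&\quad FX\cong S^3,\quad DX\cong S^2\times S^1\cong (S^3\# S^3)\# (S^2\times S^1). \end{aligned}$$

More generally, if DX has p spherical summands and q copies of \(S^2\times S^1\), then comparing prime factors on the two sides of the equation (which is where the uniqueness of the prime decomposition enters) suggests the bookkeeping

$$\begin{aligned} p=2p',\qquad q=2q'+(d-1), \end{aligned}$$

where \(p', q'\) are the integers appearing in the decomposition of FX.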

In the general case, namely when one needs to handle boundary components of positive genus, the key remark is that any such boundary component must be compressible in X. This is a standard notion in geometric topology, for which we refer the reader e.g., to the monograph Jaco (1980). In our specific setting, where any surface we deal with is two-sided and orientable (being a connected component of the boundary \(\partial X\)), this condition is equivalent to the non-injectivity of the homomorphism from the fundamental group of the surface to the fundamental group of X induced by the inclusion. The reason why such a claim is true is quite simple: if that were not the case, we could construct in X a stable minimal surface of positive genus, which is impossible by well-known facts about the second variation of the area functional.
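
The 'well-known facts' alluded to here are presumably the classical argument of Schoen–Yau type; we sketch it under that assumption, writing \(\varSigma \) for the (closed, two-sided) stable minimal surface, \(\nu \) for a unit normal, A for its second fundamental form and K for its Gauss curvature (symbols introduced here for illustration). Stability tested against the constant function 1, combined with the Gauss equation for \(H=0\), gives

$$\begin{aligned} 0&\ge \int _{\varSigma }\left( {\mathrm {Ric}}_g(\nu ,\nu )+|A|^2\right) \, d\mu ,\qquad {\mathrm {Ric}}_g(\nu ,\nu )+|A|^2=\frac{1}{2}R_g-K+\frac{1}{2}|A|^2,\\ 2\pi \chi (\varSigma )&=\int _{\varSigma }K\, d\mu \ge \frac{1}{2}\int _{\varSigma }\left( R_g+|A|^2\right) \, d\mu >0, \end{aligned}$$

so that \(\chi (\varSigma )>0\) and \(\varSigma \) (being orientable) must be a sphere, contradicting the positivity of its genus.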

That being said, any such boundary component comes with a compressing disc and we can compress along that disc obtaining a new compact 3-manifold with boundary, say Z, such that \(DX\cong DZ\# (S^2\times S^1)\). At this stage, either Z only has spherical boundary components (in which case we invoke the outcome of the first step) or, if not, it has other non-spherical boundary components. In the latter alternative, we perform yet another compression. Because of the previous equation relating DX and DZ, the process must finish after finitely many steps, hence we eventually reduce to a compact manifold with boundary, say \(Z_0\), to which the first step applies. At that stage, we can determine \(Z_0\) whence one can actually reconstruct X by arguing backwards, i.e., by unwinding the compression operations we have just performed. Roughly speaking, moving backward \(\gamma \) times from \(B^3\) will produce a genus \(\gamma \) handlebody, whereby one can easily conclude the proof. \(\square \)
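
A simple instance of the compression step (again our own illustration) is the solid torus \(X=P_1\): compressing along a meridian disc yields \(Z\cong B^3\), and indeed

$$\begin{aligned} DP_1\cong S^2\times S^1\cong S^3\# (S^2\times S^1)\cong DB^3\# (S^2\times S^1). \end{aligned}$$

Iterating, one finds \(DP_{\gamma }\cong \#_{i=1}^{\gamma }(S^2\times S^1)\). In general, each compression removes one \(S^2\times S^1\) summand from the double, so, if q denotes the number of such summands in the prime decomposition of DX, at most q compressions can occur; this is the finiteness invoked in the argument.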

These preliminary aspects being clarified, one can proceed with the study of \({\mathscr {M}}\). Here is the second main result obtained in Carlotto and Li (2019).

Theorem 4.25

If \({\mathscr {M}}\not =\emptyset \), then the associated moduli space \({\mathscr {M}}'\) is path-connected. In the special case \(X^3\cong D^3\), the space \({\mathscr {M}}\) is itself path-connected.

Remark 4.26

Here are some general comments about this statement.

  • This conclusion is not known for any \(n\ge 4\), not even for \(D^n\), but (relying on the analogy with the closed case) it is reasonably expected to be false in many cases of interest.

  • The same conclusion as in Theorem 4.25 also holds for the three related spaces \({\mathscr {M}}_{R\ge 0, H>0}\), \({\mathscr {M}}_{R \ge 0, H\ge 0}\) and \({\mathscr {M}}_{R>0, H\ge 0}\) thanks to Theorem 4.24 and related, simple deformation arguments, unless \(X^3\cong S^1\times S^1\times I\) (in which case the conclusion is still true, due to the characterisation of the metrics in \({\mathscr {M}}_{R\ge 0, H\ge 0}\)).

  • The proof of Theorem 4.25 is a combination of elliptic and parabolic methods. In addition, although the statement concerns smooth metrics, the methods we employed are partly non-smooth (i.e., in the course of our arguments we need to deal with metrics that exhibit certain types of singularities along codimension one interfaces, partly related to work in Miao 2002 and McFeron and Székelyhidi 2012 on the positive mass theorem for spaces with a compact edge-type interface).

Essentially, this theorem is proven through two main steps. The first one is an ‘isotopic version’ of the Gromov-Lawson doubling construction. That is to say, given \(g\in {\mathscr {M}}\), we build an isotopy \((g_\mu )_{\mu \in [0,1]}\) starting at \(g_0=g\), such that \(g_\mu \in {\mathscr {M}}\) for all \(\mu \in [0,1]\) and that \(g_1\) has positive scalar curvature and totally geodesic boundary (in fact, it can be smoothly doubled). The net outcome of this construction is the reduction of our problem to one concerning the space of positive scalar curvature metrics satisfying a suitable equivariance constraint. More specifically, we define a category of reflexive 3-manifolds, that are (loosely speaking) triples \((M^3,g,f)\) where \((M^3,g)\) is a closed Riemannian manifold and \(f\in C^\infty (M,M)\) is an isometric involution (\(f^2 = id\), \(f\not = id\)) that is conjugate to a ‘standard reflection’. However, note that if M is non-connected, there may be reflexive pairs as well (couples of identical connected components).

Such an isotopy being constructed, the idea for the second step is that we can then flow the metric \(g_1\) on the closed manifold obtained as the double of X, so as to connect it to a subset of model metrics, to be suitably defined in this context. Loosely speaking, model metrics are a preferred path-connected subset of \({\mathscr {M}}\), that is controlled by finitely many parameters related to performing a connected sum operation on certain standard blocks. In more detail, we evolve an initial reflexive triple through a suitable equivariant Ricci flow with surgery, as developed in Dinkelbach and Leeb (2009). Hence, we follow the conceptual path indicated by Marques in dealing with the closed case: one argues through a backward-in-time inductive scheme, from the pre-extinction time to the initial one. All technical tools need to be transplanted to this (special) equivariant setting, and in particular a key point to make the backward induction work is to show that the equivariant connected sum of reflexive triples that can be (separately) isotoped to model triples can be isotoped, within reflexive triples, to a model triple as well.

Here are a couple of final remarks about the contributions in Carlotto and Li (2019). First of all, one can obtain similar results for spaces of metrics of positive (or: non-negative) scalar curvature and minimal boundary such as e.g., the space \({\mathscr {H}}\) defined above (cf. Sect. 6 therein). This requires some extra care, and indeed there are subtle technical points coming into play, but the general argument resembles the approach we have described in this section. From there, fairly standard compactification arguments allow one to deduce path-connectedness results for spaces of asymptotically flat metrics on, say, \({\mathbb {R}}^3\) minus a finite number of balls with non-negative scalar curvature and minimal or mean-convex boundary conditions. This conclusion has quite non-trivial implications. For instance, when the background topology is that of \({\mathbb {R}}^3\setminus B\) this result implies that we can connect any given asymptotically flat solution of the (Riemannian) vacuum Einstein constraint equations, through a continuous path of solutions (namely: through a continuous path of smooth, asymptotically flat and scalar-flat metrics with minimal boundary) to the simplest one we know, the Schwarzschild solution. The fact that this can be done is, in a sense, far from obvious as we know (cf. Sect. 3.3) how large and rich the space of such solutions can be. Differently phrased, such a connectedness result rules out a nightmare scenario of infinitely many islands of solutions, with the most exotic ones separated from the others. As has recently been pointed out in Hirsch and Lesourd (2019) (where a streamlined exposition is given, about how Theorem 4.25 allows one to derive a path-connectedness result for maximal, asymptotically flat vacuum data sets on \({\mathbb {R}}^3\setminus B\) with mean-convex boundary) this positive conclusion is consistent with the landscape predicted by the so-called ‘final state conjecture’.
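
For concreteness, we recall the standard isotropic form of the time-symmetric Schwarzschild data of mass \(m>0\) on \({\mathbb {R}}^3\setminus B_{m/2}\) (the coordinates x, the radius \(r=|x|\) and the conformal factor u are notation introduced here), together with a direct check that it is scalar-flat with minimal boundary:

$$\begin{aligned} g_m&=u^4\delta ,\qquad u=1+\frac{m}{2r},\qquad \varDelta _{\delta }u=0 \ \Rightarrow \ R_{g_m}=-8u^{-5}\varDelta _{\delta }u=0,\\ H_{g_m}&=u^{-2}\left( \frac{2}{r}+\frac{4}{u}\,\partial _r u\right) =\frac{1}{4}\left( \frac{4}{m}-\frac{4}{m}\right) =0 \quad \text {at } r=\frac{m}{2}. \end{aligned}$$

Here we used \(u=2\) and \(\partial _r u=-m/(2r^2)=-2/m\) at \(r=m/2\), and the mean curvature is computed with respect to the unit normal pointing towards infinity.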

Some concluding remarks

As we have seen in this survey, a number of fundamental questions remain open and the field of ‘elliptic general relativity’, with its connections to several basic problems in Geometric Analysis, is more vital than ever. In fact, a key theme that emerged from our discussion is to what extent what we know for spaces of Riemannian metrics defined by a condition, be it an equation or an inequality, on the scalar curvature can be transposed, with analogous outcomes, to the space of solutions of the full constraints (be it vacuum, possibly considering a non-zero cosmological constant, or with specific sources, or even with an unspecified collection of sources solely subject to a suitable energy axiom). The problem of defining a geometric flow, for initial data, that may allow one to derive path-connectedness conclusions similar to those stated above in the maximal case stands out as a paradigmatic example.

Getting to somewhat more specific remarks, if we look back at Sect. 2 we could reasonably assert that conformal methods have now been investigated quite thoroughly, and a number of basic pending questions have found a satisfactory answer. That said, it would still be desirable to obtain non-perturbative existence results beyond what is granted, through a subcritical approximation argument, by the limit equation criterion. It is reasonable to expect that substantial advances in this direction may in fact need genuinely new ideas, as there seems to be no obvious attack strategy in sight. In a different direction, we still lack a general ‘local structure theorem’ describing the set of solutions as one varies the parametrising data, thus extending the portraits given in some special, highly symmetric cases.

By contrast, we believe that gluing methods are still far from having displayed their full strength, in the sense that (e.g., for asymptotically flat data) we observe some striking discrepancy between the restrictions imposed by the positive mass theorem, with rigidity, and those (relatively few) constructions that have been successfully implemented. In vague terms, it may well be that the landscape we see now is just the tip of the iceberg, and that the isolated theorems we have described will soon become part of a more organic theory. Other directions, like for instance the general ‘extension problem’ for data, have just started being explored and yet have already led to fascinating advances in the field.

Another research line, which is perhaps less directly connected to the main themes of the present survey, and is in any case more of a long-term goal, is the development of very weak notions of solutions for the constraints (either in the vacuum case or under the dominant energy condition), in a sense vaguely reminiscent of geometric measure theory. For instance, sticking to the time-symmetric case where this program would already be very ambitious, one would like to define weak notions of scalar-flat or scalar non-negative Riemannian metrics, that are designed so as to allow for a robust structure and compactness theory. It is obvious that any significant advance in this direction would have direct, strong implications on several variational problems that arise at the level of initial data sets for the Einstein equations (stability of the positive mass theorems, existence of mass-minimising extensions, solvability of fill-in problems etc.). Certainly, at a conceptual level, this program would fill a big gap in the current state of Riemannian geometry, although the more physically significant consequences are somewhat harder to predict.

A list of ten open problems

We wish to conclude the present review by collecting, in Table 4, some of the open problems we explicitly stated in the course of our exposition. As has already been said, many more (possibly of comparable significance) are scattered throughout the text. Reviewing this list may, we believe, provide a convenient way for the reader to quickly go through some of the key themes we have tried to present, and put them in a broader perspective. Needless to say, we very much hope that the present survey will inspire further progress, and that (in a few years from now) we will have the pleasure to report on the solutions of some of these questions, and give a more accurate (or, rather, less preliminary) sketch of the landscape in front of us.

Table 4 Selection of ten open problems


  1. 1.

    In this survey, all manifolds are \(C^{\infty }\) unless otherwise stated; similarly all Riemannian metrics are also \(C^{\infty }\) unless otherwise stated.

  2. 2.

    We have decided to stick to this notation, which is quite universally adopted in the literature on this theme, although we note that the function \(\pi \) defined here is not to be confused with the momentum tensor that one can associate to an initial data set.

  3. 3.

    In a precise sense, given in Eq. (2.12) therein, which extends to non-compact manifolds the standard condition that the linearised conformal Laplacian has trivial kernel at the metric in question.

  4. 4.

    If we focus e.g., on the specific applications described in Sect. 2.9, then we rather have that (for each \(i=1,2\)) the metric \(g_i\) solves (3.1) only in the smaller domain \(\varOmega '_i\) and is an approximate solution in the region \(\varOmega \), which in that case corresponds to the connecting neck. The discussion we present in the next sections applies equally well to such a scenario, at least in general terms.

  5. 5.

    As we will see, the construction we are about to perform is patently local to any given end of the manifold, hence can be easily extended to the case where \((M^3,g_0)\) has multiple ends.

  6. 6.

    If we do not impose decay conditions, the kernel of \(L^{*}_{\delta }=\varDelta _{\delta }\) on \({\mathbb {R}}^3\) shall consist of all affine functions, i.e., \(\langle 1, x^1,x^2, x^3\rangle _{{\mathbb {R}}}\).

  7. 7.

    Such an inequality can be proven, say for compactly supported data in Euclidean domains using the characterisation of the Sobolev norms of a function in terms of integrability of its Fourier transform, whence one can deduce the result for Riemannian domains using a partition of unity.

  8. 8.

    We remind the reader that, due to the Bishop–Gromov inequality (see e.g., Gallot et al. 2004), an asymptotically flat metric having non-negative Ricci curvature must be flat.

  9. 9.

    As determined by the background spherical metric.

  10. 10.

    This is the local deformation theorem; in fact in the time-symmetric case we are discussing here the result we actually need reduces to the version of Theorem 3.5 in the \(C^{\infty }\) category.

  11. 11.

    We have rephrased the statement of Main Result 2 in LeFloch and Nguyen (2019) for the sole sake of notational consistency.

  12. 12.

    For the sake of brevity, in this section we will typically employ the word differentiable to refer to a \(C^1\) submanifold of a product of Banach spaces; instead the word smooth will rather refer to a \(C^{\infty }\) structure.

  13. 13.

    In fact, Bourguignon et al. (1976) actually claim a proof of Theorem 4.3, although a slight issue in the argument was pointed out and later fixed in Arms and Marsden (1979).

  14. 14.

    There seems to be some confusion in the literature between two well-distinct conceptual aspects concerning infinite-dimensional manifolds: on the one hand the local model for the manifold in question (say a Hilbert, Banach or Fréchet space) and, on the other hand, the degree of regularity one requires (say \(C^0\), \(C^k\) or \(C^{\infty }\)). Many ‘classical’ works do establish, in the appropriate setting, a \(C^{k}\) Banach manifold structure for the constraint manifold, but in many instances do not typically discuss the \(C^{\infty }\) case.

  15. 15.

    We note that there are different, mutually incompatible variations of this definition. We have followed those in O’Neill (1983).

  16. 16.

    For the sake of clarity, we shall always tacitly assume the word smooth to mean \(C^{\infty }\) and leave the straightforward modifications needed to handle the case of finite degree of regularity, as encoded by functional spaces like \(C^{k,\alpha }\) or \(W^{k,p}\), to the reader.

  17. 17.

    It is customary to regard, in this context, the momentum tensor as a covariant tensor rather than as a contravariant one; it is understood that the indices are lowered with respect to the metric in question.

  18. 18.

    At that stage, one can give self-contained arguments proving that \((M,g,k)\) must isometrically embed in Minkowski spacetime as a spacelike slice (see, for instance, the note Nardmann 2010).

  19. 19.

    We note that when \(n=3\) one needs the technical assumption that \(tr_{g}(k)=O(|x|^{-\beta })\) for some \(\beta >2\).


  1. Abate M, Tovena F (2012) Curves and surfaces. Unitext, vol 55. Springer, Milan.

  2. Aceña AE (2009) Convergent null data expansions at space-like infinity of stationary vacuum solutions. Ann Henri Poincaré 10:275–337.

  3. Albanese G, Rigoli M (2016) Lichnerowicz-type equations on complete manifolds. Adv Nonlinear Anal 5:223–250.

  4. Albanese G, Rigoli M (2017) Lichnerowicz-type equations with sign-changing nonlinearities on complete manifolds with boundary. J Differ Equations 263:7475–7495.

  5. Allen PT, Clausen A, Isenberg J (2008) Near-constant mean curvature solutions of the Einstein constraint equations with non-negative Yamabe metrics. Class Quantum Grav 25:075009.

  6. Ambrosio L (2004) Transport equation and Cauchy problem for \(BV\) vector fields. Invent Math 158:227–260.

  7. Ambrosio L, Trevisan D (2017) Lecture notes on the DiPerna–Lions theory in abstract measure spaces. Ann Fac Sci Toulouse Math 26:729–766.

  8. Ambrosio L, Carlotto A, Massaccesi A (2018) Lectures on elliptic partial differential equations, Appunti. Scuola Normale Superiore di Pisa (Nuova Serie) [Lecture Notes. Scuola Normale Superiore di Pisa (New Series)], vol 18. Edizioni della Normale, Pisa.

  9. Ambrozio L (2015) On perturbations of the Schwarzschild anti-de Sitter spaces of positive mass. Commun Math Phys 337:767–783.

  10. Ambrozio L (2017) On static three-manifolds with positive scalar curvature. J Differ Geom 107:1–45.

  11. An Z (2020) On mass-minimizing extension of Bartnik boundary data. arXiv e-prints arXiv:2007.05452

  12. Anderson M (2018) On the conformal method for the Einstein constraint equations. arXiv e-prints arXiv:1812.06320

  13. Anderson MT, Chruściel PT (2005) Asymptotically simple solutions of the vacuum Einstein equations in even dimensions. Commun Math Phys 260:557–577.

  14. Anderson MT, Jauregui JL (2019) Embeddings, immersions and the Bartnik quasi-local mass conjectures. Ann Henri Poincaré 20:1651–1698.

  15. Andersson L, Chruściel PT (1996) Solutions of the constraint equations in general relativity satisfying “hyperboloidal boundary conditions”. Diss Math (Rozprawy Mat) 355:100

  16. Andersson L, Dahl M (1998) Scalar curvature rigidity for asymptotically locally hyperbolic manifolds. Ann Glob Anal Geom 16:1–27.
