1 Introduction

It is well-known that the total mass of an isolated body in general relativity is given by the ADM mass, and that the very nature of general relativity precludes the possibility of a local energy density; however, the notion of the mass contained in a given region of finite extent is still an open problem. This question is particularly peculiar, as it is not that we lack an answer to it, but rather we have many candidates for what this mass should be (see [30] for a detailed review), many of which are incompatible. Bartnik’s quasi-local mass [6] is considered by many to give the best answer to this question, if only it were possible to compute in general. The Bartnik mass is described as follows: Given a subset \(\varOmega \) of some \((\tilde{M},\tilde{g},\tilde{\pi })\), an initial data set satisfying the Einstein constraints, let \(\mathcal {PM}\) be the set of asymptotically flat initial data sets satisfying the hypotheses of the positive mass theorem, in which \(\varOmega \) isometrically embeds, with no horizons strictly enclosing \(\varOmega \). The Bartnik mass is then taken as the infimum of the ADM mass over \(\mathcal {PM}\). It is conjectured that this infimum is indeed realised; however, while some progress has been made (see [4, 8, 14, 18, 24] and references therein), this is still an open problem in general. There are some counter-examples in special cases. Mantoulidis and Schoen [18] constructed a sequence of extensions to a stable minimal surface whose mass converges to the Bartnik mass. In light of black hole uniqueness theorems, the only possible limit for this sequence is a Schwarzschild solution, so if \(\varOmega \) is not contained in a slice of Schwarzschild then the infimum is not realised. Anderson and Jauregui [3] have also constructed a counterexample in the case where the Bartnik mass is zero, essentially exploiting the rigidity of the positive mass theorem.

There are also interesting results by Corvino [14, 15] and Miao [24], which demonstrate that if a mass-minimising extension exists, then it must be static and satisfy Bartnik’s geometric boundary conditions. These boundary conditions precisely amount to the condition that the dominant energy condition across \(\partial \varOmega \) is satisfied in the distributional sense (see Section 5 of [7] for details). Bartnik’s work on the phase space for the Einstein equations [9] was, in part, motivated by the idea of placing Corvino and Miao’s work into a more variational setting. Here we also work to this end. For more details pertaining to the space \(\mathcal {PM}\) and the Bartnik mass, the reader is referred to [6, 7]. In this paper, we consider a larger set of extensions to such a bounded domain \(\varOmega \), described by asymptotically flat manifolds with boundary conditions imposed on a compact interior boundary, \(\Sigma \). Specifically, we cannot rule out horizons in the extensions and the boundary conditions we consider are far more general than Bartnik’s geometric boundary conditions. The initial data we consider has local regularity \((g,\pi )\in H^2\times H^1\), with g prescribed on \(\Sigma \) in the trace sense. It should be remarked here that the boundary conditions we impose are not Bartnik’s geometric boundary conditions so we cannot directly obtain results on Bartnik’s quasi-local mass. However, we are able to use this framework to establish some results in this direction, as described below.

In Sect. 2, we review the mapping properties of the Laplace–Beltrami operator on M and show that this is an isomorphism between certain weighted spaces over M with boundary conditions imposed. In Sect. 3, we apply Bartnik’s phase space analysis to the case considered here, where M has an interior boundary and g satisfies certain boundary conditions. In particular, Theorem 2 therein establishes that the space of asymptotically flat solutions to the constraints, satisfying our boundary conditions, is a Hilbert manifold. Finally, in Sect. 4, we prove a result motivated by the static metric extension conjecture and Bartnik’s quasi-local mass. Specifically, Theorem 4 in Sect. 4 shows that critical points of the mass over the space of extensions to \(\varOmega \) with g fixed on the boundary, correspond to stationary solutions with vanishing stationary Killing vector on \(\Sigma \). Furthermore, if the data is smooth, this implies \(\Sigma \) is the bifurcation surface of a bifurcate Killing horizon that is non-rotating, and by a staticity result of Sudarsky and Wald [29], one concludes that the extension is therefore static. We conclude with a version of this result closely related to Bartnik’s geometric boundary data (Corollary 2).

2 The Laplace–Beltrami operator on an asymptotically flat manifold with interior boundary

In this section we discuss some preliminary results regarding the Laplace–Beltrami operator on an asymptotically flat manifold with interior boundary. The results in this section are relatively standard (cf. [5, 11, 19, 23]); however, for the sake of completeness we include them here.

It is well-known that while the Laplace operator is not Fredholm on \(\mathbb {R}^n\) when considered as a map \(H^2\rightarrow L^2\), it is in fact an isomorphism between certain weighted Sobolev/Lebesgue spaces (cf. [27]). We discuss some properties of the Laplace–Beltrami operator on an asymptotically flat manifold when boundary conditions are imposed.

Throughout, we let M be a smooth, connected manifold with compact boundary \(\Sigma \). Further assume that there exists a compact set \(K\subset M\) such that the complement \(M\setminus K\) consists of N connected components, each diffeomorphic to \(\mathbb {R}^n\) minus the closed unit ball, B. For concreteness, we denote these connected components by \(N_i\), and the associated diffeomorphisms by \(\phi _i:N_i\rightarrow \mathbb {R}^n\setminus B\). Equip M with a smooth background Riemannian metric \(\mathring{g}\), equal to the pullback of the Euclidean metric to each of these ends. Let r be a smooth function on M such that \(r(x)=|\phi _i(x)|\) on each \(N_i\), and \(\frac{1}{2}<r<2\) on K.

In terms of this background asymptotically flat structure, we define the usual weighted Lebesgue and Sobolev norms, respectively as follows:

$$\begin{aligned} \left\| u\right\| _{p,\delta }&= \left\{ \begin{array}{ll} \left( \int \left| u\right| ^p r^{-\delta p-n}d\mu _0\right) ^{1/p},&{} 1\le p<\infty ,\\ \text {ess sup}(r^{-\delta }|u|), &{} p=\infty , \end{array} \right. \end{aligned}$$
(1)
$$\begin{aligned} \left\| u\right\| _{k,p,\delta }&=\sum _{j=0}^k\Vert \mathring{\nabla }^j u\Vert _{p,\delta -j}, \end{aligned}$$
(2)

where \(\delta \in \mathbb {R}\). Norms of sections of bundles are defined in the usual way. Note that our convention follows [5], where \(\delta \) directly indicates the asymptotic behaviour; that is, \(u\in L^p_\delta \) behaves as \(o(r^\delta )\) near infinity. We denote the completions of the set of smooth functions with compact support in the interior of the manifold, with respect to these norms, by \(L^p_\delta \) and \(\overline{W}^{k,p}_\delta \) respectively. Note that \(\overline{W}^{k,p}_\delta \) is a space of functions that vanish on the boundary in the trace sense, along with their first \(k-1\) derivatives. We use \(W^{k,p}_\delta \) to denote the completion of the smooth functions with compact support on the manifold with boundary, and also use the convention \(\overline{H}^k_\delta =\overline{W}^{k,2}_\delta \) and \(H^k_\delta =W^{k,2}_\delta \).
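
To illustrate the weight convention, consider the following simple example (ours, not taken from [5]): let u equal \(r^{-a}\) on an end \(N_i\), for some \(a\in \mathbb {R}\), and let \(1\le p<\infty \). Using polar coordinates on \(\mathbb {R}^n\setminus B\) and writing \(\omega _{n-1}\) for the area of the unit sphere, the contribution of the end to (1) is

$$\begin{aligned} \int _{N_i}|u|^p r^{-\delta p-n}d\mu _0=\omega _{n-1}\int _1^\infty r^{-(a+\delta )p-1}\,dr= \left\{ \begin{array}{ll} \dfrac{\omega _{n-1}}{(a+\delta )p},&{} a+\delta >0,\\ +\infty , &{} a+\delta \le 0. \end{array} \right. \end{aligned}$$

That is, \(r^{-a}\) has finite \(L^p_\delta \) norm on the end precisely when \(-a<\delta \), in agreement with the \(o(r^\delta )\) heuristic.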

It is well-known that weighted versions of the usual Sobolev-type inequalities hold for these norms. For the reader’s convenience, we quote these directly from [5].

Theorem 1

The following inequalities hold:

  1. (i.)

    If \(1\le p\le q\le \infty \), \(\delta _2 < \delta _1\) and \(u\in L^q_{\delta _2}\), then

    $$\begin{aligned} \left\| u\right\| _{p,\delta _1}\le c\left\| u\right\| _{q,\delta _2} \end{aligned}$$
    (3)

    and thus \(L^q_{\delta _2}\subset L^p_{\delta _1}\).

  2. (ii.)

    (Hölder) If \(u\in L^q_{\delta _1}\), \(v\in L^r_{\delta _2}\) and \(\delta =\delta _1+\delta _2\), \(1\le p,q,r\le \infty \), then

    $$\begin{aligned} \left\| uv\right\| _{p,\delta }\le \left\| u\right\| _{q,\delta _1}\left\| v\right\| _{r,\delta _2}, \end{aligned}$$
    (4)

    where \(1/p=1/q+1/r\).

  3. (iii.)

    (Interpolation) For any \(\epsilon >0\), there is a \(C(\epsilon )\) such that, for all \(u\in W^{2,p}_\delta \)

    $$\begin{aligned} \left\| u\right\| _{1,p,\delta }\le \epsilon \left\| u\right\| _{2,p,\delta }+C(\epsilon )\left\| u\right\| _{p,\delta }, \end{aligned}$$
    (5)

    for \(1\le p\le \infty \).

  4. (iv.)

    (Sobolev) If \(u\in W^{k,p}_\delta \), then

    $$\begin{aligned} \left\| u\right\| _{np/(n-kp),\delta }\le c\left\| u\right\| _{k,q,\delta } \end{aligned}$$
    (6)

    for q satisfying \(p\le q\le np/(n-kp)\).

    If \(kp>n\) then

    $$\begin{aligned} \Vert u\Vert _{\infty ,\delta }\le c\Vert u\Vert _{k,p,\delta } \end{aligned}$$
    (7)
  5. (v.)

    (Morrey’s) If \(u\in W^{k,p}_\delta \) and \(0<\alpha \le k-n/p\le 1\), then

    $$\begin{aligned} \Vert u\Vert _{C^{0,\alpha }_\delta }\le c\Vert u\Vert _{k,p,\delta }, \end{aligned}$$
    (8)

    where the weighted Hölder norm is given by

    $$\begin{aligned} \Vert u\Vert _{C^{0,\alpha }_\delta }:=&\sup _{x\in \mathcal {M}}\Big {(}r^{-\delta +\alpha }(x)\sup _{4|x-y|\le r(x)}\frac{|u(x)-u(y)|}{|x-y|^\alpha }\Big {)}\\&+\sup _{x\in \mathcal {M}}\left( r^{-\delta }(x)|u(x)|\right) \end{aligned}$$
  6. (vi.)

    (Poincaré) If \(\delta <0\) and \(1\le p<\infty \), for any \(u\in W^{1,p}_\delta \) we have

    $$\begin{aligned} \Vert u\Vert _{p,\delta }\le c \Vert \mathring{\nabla }u\Vert _{p,\delta -1}, \end{aligned}$$
    (9)

where n is the dimension of \(\mathcal {M}\).

While these inequalities were considered in [5] on manifolds without boundary, it is clear that the proofs presented remain valid when a boundary is present. One can easily check this, as the proof in [5] relies only on splitting the norms into integrals over annular regions, rescaling the integrals to integrals over an annulus of fixed radius, then applying the usual local inequalities. The reader is referred to [11, 23] for more results pertaining to these weighted spaces.

In terms of these weighted Sobolev spaces, we make precise the notion of asymptotically flat manifolds considered here.

Definition 1

An asymptotically flat manifold with N ends and interior boundary is a manifold M satisfying the properties described above, equipped with a Riemannian metric g satisfying \((g-\mathring{g})\in W^{2,k}_{5/2-n}\), for some \(k>n/2\).

Note that the condition \(k>n/2\) ensures that the metric is Hölder continuous, by the Sobolev–Morrey embedding. We also remark that while the above definition appears to implicitly depend on the choice of diffeomorphisms \(\phi _i\), at least in the case of interest here (\(k=2\), \(n=3\)) Bartnik has shown that these Sobolev spaces of metrics are in fact independent of \(\phi _i\) (see Theorem 4.7 of [9]).

By comparison to the Laplacian on a bounded domain, it is expected that boundary conditions must be enforced if we hope for \(\varDelta _g\), the Laplace–Beltrami operator associated with g, to be an isomorphism.

First note the following elementary estimate, which follows immediately from Proposition 1.6 of [5].

Lemma 1

Let \(\delta \in \mathbb {R}\), then

$$\begin{aligned} \Vert u\Vert _{2,2,\delta }\le C\left( \Vert \varDelta _g u\Vert _{2,\delta -2}+\Vert u\Vert _{2,\delta }\right) , \end{aligned}$$
(10)

for any \(u\in H^2_\delta \cap \overline{H}^1_\delta \).

Note that \(\delta <\epsilon \) is required for the embedding \(W^{k,p}_\delta \hookrightarrow W^{j,p}_{\epsilon }\) to be compact (cf. Lemma 2.1 of [11]), in addition to the usual condition \(k>j\). Since both norms in (10) carry the same weight \(\delta \), the estimate above does not suffice to prove Fredholmness. For this, we require Lemma 2, below.

Lemma 2

Let \((M,g)\) be an asymptotically flat manifold as described above, and fix \(\delta \in (2-n,0)\). Then for \(u\in H^2_\delta \cap \overline{H}^1_\delta \) we have

$$\begin{aligned} \Vert u\Vert _{2,2,\delta }\le C\Vert \varDelta _g u\Vert _{2,\delta -2}. \end{aligned}$$
(11)

Proof

First note that \(\varDelta _g\) is asymptotic to the background Laplacian in the sense of [5] (Definition 1.5). Further note that the proof of Theorem 1.10 of [5] remains valid on an asymptotically flat manifold with boundary, so we have the scale-broken estimate,

$$\begin{aligned} \Vert u\Vert _{2,2,\delta }\le C(\Vert \varDelta _g u\Vert _{2,\delta -2}+\Vert u\Vert _{2,0}), \end{aligned}$$
(12)

which does indeed suffice to prove Fredholmness (see Proposition 1 below).

From this, we prove (11) using a standard argument. Assume, to the contrary, that there exists a sequence \(u_i\) such that \(\Vert u_i\Vert _{2,2,\delta }=1\) and \(\varDelta _g u_i\rightarrow 0\) in \(L^2_{\delta -2}\). Passing to a subsequence, \(u_i\) converges weakly in \(H^2_\delta \) and by the weighted Rellich compactness theorem it converges strongly in \(L^2_0\). Now (12) implies \(u_i\) is Cauchy and therefore converges in \(H^2_\delta \) to some u with \(\Vert u\Vert _{2,2,\delta }=1\). By continuity, we have \(\varDelta _g u=0\), and therefore we have a non-trivial element of \(\ker (\varDelta _g)\). However, it can be seen directly from the maximum principle that \(\varDelta _g\) has trivial kernel in \({H^2_\delta \cap \overline{H}^1_\delta }\), a contradiction. \(\square \)
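
The Cauchy property invoked in the proof above can be made explicit by applying (12) to differences. The display below is only a sketch of that single step, and uses nothing beyond the linearity of \(\varDelta _g\):

$$\begin{aligned} \Vert u_i-u_j\Vert _{2,2,\delta }\le C\left( \Vert \varDelta _g u_i-\varDelta _g u_j\Vert _{2,\delta -2}+\Vert u_i-u_j\Vert _{2,0}\right) . \end{aligned}$$

Both terms on the right tend to zero as \(i,j\rightarrow \infty \): the first because \(\Vert \varDelta _g u_i\Vert _{2,\delta -2}\rightarrow 0\), and the second because the subsequence converges strongly in \(L^2_0\).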

From Lemma 2, we establish the following.

Proposition 1

For any \(\delta \in (2-n,0)\), the map \(\varDelta _{g}:H^2_\delta \cap \overline{H}^1_\delta (M) \rightarrow L^2_{\delta -2}(M)\) is an isomorphism.

Proof

We simply must prove that \(\varDelta _g\) is surjective, which is achieved by proving the range is closed and \(\varDelta _g^*\) has trivial kernel. It is a fairly standard argument to demonstrate that \(\varDelta _g\) has closed range, which is as follows. Take a sequence \(u_i\in H^2_\delta \cap \overline{H}^1_\delta (M)\) such that \(\phi _i=\varDelta _g u_i\) is Cauchy; that is, any Cauchy sequence in the range. By (11), \(u_i\) is convergent to some u, and by continuity, \(\phi _i\rightarrow \varDelta _gu\). It follows that \(\varDelta _g\) has closed range.

It remains to prove that \(\varDelta _g^*\) has trivial kernel. We first remark that, by standard elliptic regularity [16] and the rescaled interior estimates [5], any element of the kernel of \(\varDelta _g^*\) is smooth and in \(H^2_{2-n-\delta }\) (as in the proof of Proposition 6 in [19]).

Now an element v in the kernel of \(\varDelta _g^*\) satisfies

$$\begin{aligned} \int _M \varDelta _g (f) v \,dV=0 \end{aligned}$$

for all \(f\in H^2_\delta \cap \overline{H}^1_\delta (M)\), so for any \(\varOmega \subset \subset M\), we have

$$\begin{aligned} \int _\varOmega \varDelta _g (f)v\, dV=\int _\varOmega f\varDelta _g(v)\, dV=0 \end{aligned}$$

for all \(f\in C^\infty _c(\varOmega )\), and therefore \(\varDelta _g v=0\) on M. It then follows that

$$\begin{aligned} \int _M \varDelta _g (f)v\, dV=0=\int _{\partial M}\nabla (f)v\cdot dS \end{aligned}$$
(13)

for all \(f\in H^2_\delta \cap \overline{H}^1_\delta (M)\), and therefore \(v\equiv 0\) on \(\partial M\). Since v is smooth and decays to zero at infinity, by the maximum principle v is identically zero. Note that in obtaining (13), we have dropped a boundary integral at infinity, which can be seen to vanish due to the asymptotics for f and v. \(\square \)
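
For the reader's convenience, we sketch how (13) arises; this is Green's second identity, written here under the assumption that dS is oriented along the outward normal of M and that the boundary term at infinity has already been dropped, as noted in the proof:

$$\begin{aligned} \int _M \left( v\,\varDelta _g f-f\,\varDelta _g v\right) dV=\int _{\partial M}\left( v\,\nabla f-f\,\nabla v\right) \cdot dS. \end{aligned}$$

Since \(\varDelta _g v=0\) on M and f vanishes on \(\partial M\) in the trace sense, only the term \(\int _{\partial M}v\,\nabla f\cdot dS\) survives on the right, while the left-hand side vanishes because v lies in the kernel of \(\varDelta _g^*\). As the normal derivative of f on \(\partial M\) is otherwise unconstrained, this forces \(v\equiv 0\) on \(\partial M\).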

3 The phase space

In this section we adapt Bartnik’s phase space construction to an asymptotically flat manifold with an interior boundary. In particular, we show that the set of asymptotically flat initial data, with g fixed on the boundary, is a Hilbert submanifold of the phase space. For simplicity, we restrict ourselves to the physically relevant case, \(n=3\). Several of the results in the case considered here follow by entirely identical arguments as used by Bartnik, so we simply refer to the appropriate places in Ref. [9] for proofs in these instances. In addition to this, many proofs given here involve only small modifications to those given by Bartnik.

The constraint map is given by

$$\begin{aligned} \varPhi _0(g,\pi )&=R(g)\sqrt{g}-\left( \pi ^{ij}\pi _{ij}-\frac{1}{2}(\pi ^k_k)^2\right) /\sqrt{g}, \end{aligned}$$
(14)
$$\begin{aligned} \varPhi _i(g,\pi )&=2\nabla _k\pi ^k_i, \end{aligned}$$
(15)

where \(\sqrt{g}=\frac{\sqrt{\det g}}{\sqrt{\det \mathring{g}}}\) is a volume form, and \(\pi \) is the canonical momentum given in terms of the second fundamental form K, by \(\pi ^{ij}=(K^{ij}-g^{ij}{{\,\mathrm{tr}\,}}_g K)\sqrt{g}\). For a given energy-momentum source \((s,S_i)\), the constraint equations are \(\varPhi (g,\pi )=(s,S_i)\)—in particular, the vacuum constraints are simply \(\varPhi (g,\pi )=0\).
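
For orientation, the scalar constraint (14) can be rewritten in terms of K. The following short computation is our own sketch, valid for \(n=3\) and using only the stated relation between \(\pi \) and K, with indices raised and lowered by g:

$$\begin{aligned} \pi ^{ij}\pi _{ij}&=\left( |K|^2-2(\mathrm {tr}_g K)^2+3(\mathrm {tr}_g K)^2\right) (\sqrt{g})^2=\left( |K|^2+(\mathrm {tr}_g K)^2\right) (\sqrt{g})^2,\\ \pi ^k_k&=\left( \mathrm {tr}_g K-3\,\mathrm {tr}_g K\right) \sqrt{g}=-2\,\mathrm {tr}_g K\sqrt{g},\\ \varPhi _0(g,\pi )&=\left( R(g)-|K|^2+(\mathrm {tr}_g K)^2\right) \sqrt{g}. \end{aligned}$$

In particular, \(\varPhi _0(g,\pi )=0\) is the familiar vacuum Hamiltonian constraint \(R(g)=|K|^2-(\mathrm {tr}_g K)^2\).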

Now let \((M,\mathring{g})\) be an asymptotically flat manifold as described in Sect. 2, where \(\mathring{g}\) will serve as a background metric. As we are motivated by considering extensions to a given compact manifold with boundary, \(\varOmega \), one should consider \(\mathring{g}\) near \(\Sigma \) as coming from the metric on \(\varOmega \), which is to be extended. More concretely, one may choose M such that it can be glued to \(\varOmega \) along \(\Sigma \), and \(\mathring{g}\) would then be a smooth extension of the metric on \(\varOmega \). However, we avoid further discussion on \(\varOmega \) by simply considering \(\mathring{g}\) to be some given background metric. We define the domain and codomain of \(\varPhi \) in terms of weighted Sobolev spaces:

$$\begin{aligned}&\mathcal {G}:=\{g\in S_2:g>0, (g-\mathring{g})\in H^2_{-1/2}\cap \overline{H}^1_{-1/2}(M)\},\\&\mathcal {K}:=H^1_{-3/2}(S^2\otimes \varLambda ^3),\qquad \mathcal {N}:=L^2_{-5/2}(\varLambda ^3\times T^*M\otimes \varLambda ^3), \end{aligned}$$

where \(\varLambda ^k\) is the space of k-forms on M, and \(S_2\) and \(S^2\) are symmetric covariant and contravariant 2-tensors on M respectively. The phase space is the set of prospective initial data, \(\mathcal {F}=\mathcal {G}\times \mathcal {K}\). The proofs of Proposition 3.1 and Corollary 3.2 of [9] apply directly in the case considered here, and it therefore follows immediately that \(\varPhi :\mathcal {F}\rightarrow \mathcal {N}\) is a smooth map of Hilbert manifolds.

It is interesting to note that at the time of publication, Bartnik’s phase space concerned initial data that was slightly too rough to apply known results on the well-posedness of the Cauchy problem; however, through the positive resolution of the bounded \(L^2\) curvature conjecture, Klainerman, Rodnianski and Szeftel [17] have improved the local existence and uniqueness results to the case considered by Bartnik, and indeed the case considered here.

The key to proving that the level sets of \(\varPhi \) are Hilbert submanifolds, is a standard implicit function theorem style argument. As such, we study the linearisation of \(\varPhi \), which at a point \((g,\pi )\in \mathcal {F}\), is given by

$$\begin{aligned} D\varPhi _{0\,(g,\pi )}[h,p]=&\,(\pi ^k_k\pi ^{ij}-2\pi ^{ik}\pi ^j_k)h_{ij}+{{\,\mathrm{tr}\,}}(h) \left( \frac{1}{2}\pi \cdot \pi -\frac{1}{4}({{\,\mathrm{tr}\,}}\pi )^2\right) /\sqrt{g}\nonumber \\&+\left( \frac{1}{2}{{\,\mathrm{tr}\,}}(h)R-\varDelta {{\,\mathrm{tr}\,}}(h)+\nabla ^i\nabla ^jh_{ij}-R^{ij}h_{ij}\right) \sqrt{g}\nonumber \\&+({{\,\mathrm{tr}\,}}(p){{\,\mathrm{tr}\,}}(\pi )-2\pi \cdot p)/\sqrt{g} \end{aligned}$$
(16)
$$\begin{aligned} D\varPhi _{i\,(g,\pi )}[h,p]=&\,2\nabla _j(\pi ^{jk}h_{ik})-\pi ^{jk}\nabla _i h_{jk} +2\nabla _j p_i^j, \end{aligned}$$
(17)

for \((h,p)\in T_{(g,\pi )}\mathcal {F}\). The formal \(L^2\) adjoint is then computed as

$$\begin{aligned} D\varPhi ^F_{1\,(g,\pi )}[N,X]=&\,N\left( \pi ^k_k\pi ^{ij}-2\pi ^{ik}\pi ^j_k +\left( \frac{1}{2}\pi ^{kl}\pi _{kl}-\frac{1}{4}(\pi ^k_k)^2\right) g^{ij}\right) /\sqrt{g}\nonumber \\&+\left( N\left( \frac{1}{2}Rg^{ij}-R^{ij}\right) +\nabla ^i\nabla ^jN-g^{ij}\nabla ^k\nabla _kN\right) \sqrt{g}\nonumber \\&+\mathcal {L}_X\pi ^{ij} \end{aligned}$$
(18)
$$\begin{aligned} D\varPhi ^F_{2\,(g,\pi )}[N,X]=&\,N(g_{ij}\pi ^k_k-2\pi _{ij})/\sqrt{g}-\mathcal {L}_Xg_{ij}, \end{aligned}$$
(19)

where \((N,X)\in \mathcal {N}^*=L^2_{-1/2}(\varLambda ^0\times TM)\) and \(\mathcal {L}\) is the Lie derivative on M. Note that we use the superscript ‘F’ for the formal adjoint, obtained by integrating by parts and disregarding boundary terms, rather than ‘\(*\)’, which we reserve for the true adjoint.
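
As a consistency check, the component (19) can be recovered from the p-dependent terms of (16) and (17). The computation below is a sketch that assumes p is symmetric and discards all boundary terms, in line with the definition of the formal adjoint:

$$\begin{aligned} \int _M N\left( {{\,\mathrm{tr}\,}}(p){{\,\mathrm{tr}\,}}(\pi )-2\pi \cdot p\right) /\sqrt{g}&=\int _M p^{ij}\,N\left( g_{ij}\pi ^k_k-2\pi _{ij}\right) /\sqrt{g},\\ \int _M 2X^i\nabla _j p^j_i&=-\int _M p^{ij}\left( \nabla _i X_j+\nabla _j X_i\right) =-\int _M p^{ij}\mathcal {L}_X g_{ij}. \end{aligned}$$

Adding the two lines gives \(\int _M p^{ij}\,D\varPhi ^F_{2\,(g,\pi )}[N,X]_{ij}\), as claimed.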

We first give a coercivity estimate for \(D\varPhi _{(g,\pi )}^F\). It should be noted that this is simply Bartnik’s Proposition 3.3 of [9]; however, particularly since there is a minor modification to the proof at the end, there is no harm in presenting the computation here. Furthermore, there is a minor omission in the argument of Bartnik that relies on a local version of this estimate, which we address in the proof of Proposition 3. Note that for simplicity of presentation, we write \(\xi =(N,X)\), which may be interpreted as a 4-vector in the spacetime. We also briefly remark that the constant C used in all of our estimates below and throughout may change from line to line.

Proposition 2

For all \(\xi \in W^{2,2}_{-1/2}\), \(D\varPhi ^F_{(g,\pi )}\) satisfies,

$$\begin{aligned} \Vert \xi \Vert _{2,2,-1/2}\le C\left( \Vert D\varPhi ^F_{1\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{1,2,-3/2}+\Vert \xi \Vert _{2,0}\right) . \end{aligned}$$
(20)

Proof

We will need to make use of the difference of connections tensor,

$$\begin{aligned} \tilde{\varGamma }^i_{jk}=\varGamma ^i_{jk}-\mathring{\varGamma }^i_{jk}=\frac{1}{2}g^{il}(\mathring{\nabla }_j g_{lk}+\mathring{\nabla }_k g_{jl}-\mathring{\nabla }_l g_{jk}), \end{aligned}$$

which is clearly controlled in \(W^{1,2}_{-3/2}\), for \(g\in \mathcal {G}\).

Rearranging (18) gives

$$\begin{aligned} \nabla ^i\nabla ^j N-g^{ij}\nabla ^k\nabla _k N=S^{ij}, \end{aligned}$$

where S is given by

$$\begin{aligned} \sqrt{g}S^{ij}=&\,D\varPhi ^F_{1\,(g,\pi )}[\xi ]^{ij}-N\left( \pi ^k_k\pi ^{ij}-2\pi ^{ik}\pi ^j_k+\Big (\frac{1}{2}\pi ^{kl}\pi _{kl}-\frac{1}{4}(\pi ^k_k)^2\Big ) g^{ij}\right) /\sqrt{g}\\&-N \left( \frac{1}{2}Rg^{ij}-R^{ij}\right) \sqrt{g} -\mathcal {L}_X\pi ^{ij}. \end{aligned}$$

From this, we can then write

$$\begin{aligned} \nabla ^i\nabla ^j N=S^{ij}-\frac{1}{2}g^{ij}S^k_k, \end{aligned}$$
(21)

which gives an estimate for \(\nabla ^2N\):

$$\begin{aligned} \Vert \nabla ^2 N\Vert _{2,-5/2}\le C\Vert S\Vert _{2,-5/2}. \end{aligned}$$
(22)
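
The passage from the rearranged form of (18) to (21) is a trace computation, which we sketch here for \(n=3\). Contracting \(\nabla ^i\nabla ^j N-g^{ij}\nabla ^k\nabla _k N=S^{ij}\) with \(g_{ij}\) gives

$$\begin{aligned} \nabla ^k\nabla _k N-3\nabla ^k\nabla _k N=S^k_k\quad \Longrightarrow \quad \nabla ^k\nabla _k N=-\frac{1}{2}S^k_k, \end{aligned}$$

and substituting this back yields \(\nabla ^i\nabla ^j N=S^{ij}+g^{ij}\nabla ^k\nabla _k N=S^{ij}-\frac{1}{2}g^{ij}S^k_k\). Measuring pointwise norms with respect to g, we have \(|g^{ij}|=\sqrt{3}\) and \(|S^k_k|\le \sqrt{3}|S|\), so that \(|\nabla ^2 N|\le \frac{5}{2}|S|\) pointwise, from which (22) follows.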

Noting that \((g,\pi )\) is fixed and \(\xi =(N,X)\), the standard weighted Sobolev-type inequalities give

$$\begin{aligned} \Vert \mathring{\nabla }^2N\Vert _{2,-5/2}\le&\,C \Big {(}\Vert D\varPhi ^F_{1\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert \tilde{\varGamma }\mathring{\nabla }N\Vert _{2,-5/2}+\Vert \pi \mathring{\nabla }X\Vert _{2,-5/2}\\&+\Vert X\mathring{\nabla }\pi \Vert _{2,-5/2}+\Vert N\Vert _{\infty ,0}(\Vert \pi ^2\Vert _{2,-5/2}+\Vert Ric(g)\Vert _{2,-5/2})\Big {)}\\ \le&\,C\left( \Vert D\varPhi ^F_{1\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert \xi \Vert _{\infty ,0}\right. \\&\left. +\Vert \mathring{\nabla }\xi \Vert _{3,-1}(\Vert \tilde{\varGamma }\Vert _{6,-3/2}+\Vert \pi \Vert _{6,-3/2})\right) \\ \le&\,C\left( \Vert D\varPhi ^F_{1\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert \xi \Vert _{\infty ,0}\right. \\&\left. +\Vert \mathring{\nabla }\xi \Vert _{3,-1}(\Vert \tilde{\varGamma }\Vert _{1,2,-3/2}+\Vert \pi \Vert _{1,2,-3/2})\right) \\ \le&\,C\left( \Vert D\varPhi ^F_{1\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert \xi \Vert _{\infty ,0}+\Vert \mathring{\nabla }\xi \Vert _{3,-1}\right) . \end{aligned}$$
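
The bookkeeping of exponents and weights in the chain above can be checked against Theorem 1; as a sketch, we spell out one representative term. The Hölder inequality (ii) with \(p=2\), \(q=6\), \(r=3\) requires \(\frac{1}{2}=\frac{1}{6}+\frac{1}{3}\) and \(\delta =-\frac{3}{2}+(-1)=-\frac{5}{2}\), and the Sobolev inequality (iv) with \(n=3\), \(k=1\), \(p=q=2\) gives the exponent \(np/(n-kp)=6\), so that

$$\begin{aligned} \Vert \tilde{\varGamma }\mathring{\nabla }N\Vert _{2,-5/2}\le \Vert \tilde{\varGamma }\Vert _{6,-3/2}\Vert \mathring{\nabla }N\Vert _{3,-1}\le c\,\Vert \tilde{\varGamma }\Vert _{1,2,-3/2}\Vert \mathring{\nabla }N\Vert _{3,-1}. \end{aligned}$$

The terms involving \(\pi \) are treated in the same way, with \(\pi \in H^1_{-3/2}\) in place of \(\tilde{\varGamma }\).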

The Bianchi identity, the identity \(R_{ijkl}X^l=\nabla _i\nabla _j X_k-\nabla _j\nabla _i X_k\), and a bit of algebraic manipulation result in

$$\begin{aligned} \nabla _k\mathcal {L}_X g_{ij}+\nabla _j\mathcal {L}_X g_{ik}-\nabla _i\mathcal {L}_X g_{jk}=2(R_{ikjl}X^l+\nabla _k\nabla _j X_i), \end{aligned}$$

and we can therefore estimate \(\nabla ^2X\) by

$$\begin{aligned} \Vert \nabla ^2X\Vert _{2,-5/2}\le C(\Vert Riem(g)\Vert _{2,-5/2}\Vert X\Vert _{\infty ,0}+\Vert \nabla \mathcal {L}_X g\Vert _{2,-5/2}). \end{aligned}$$
(23)

By writing the Riemann tensor explicitly in terms of \(g,\mathring{\nabla }g\) and \(\mathring{\nabla }^2 g\), it is clear that we can control \(\Vert Riem(g)\Vert _{2,-5/2}\) for \(g\in \mathcal {G}\); the Riemann tensor is quadratic in \(\mathring{\nabla }g\) and linear in \(\mathring{\nabla }^2 g\). This somewhat lengthy, albeit straightforward, computation can be found, for example, in Appendix A of [21].

Making use of (19), the Lie derivative is expressed as

$$\begin{aligned} \mathcal {L}_X g_{ij}=N(g_{ij}\pi ^k_k-2\pi _{ij})g^{-1/2}-D\varPhi ^F_{2\,(g,\pi )}[\xi ]_{ij}, \end{aligned}$$
(24)

and from this, the weighted Sobolev-type inequalities give

$$\begin{aligned} \Vert \nabla \mathcal {L}_X g\Vert _{2,-5/2}&\le C \,(\Vert \nabla (N\pi )\Vert _{2,-5/2}+\Vert \nabla D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{2,-5/2})\\&\le C \, \Big {(}\Vert \mathring{\nabla }D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert \mathring{\nabla }N\Vert _{3,-1}\Vert \pi \Vert _{6,-3/2}\\&\quad +\Vert N\Vert _{\infty ,0}(\Vert \mathring{\nabla }\pi \Vert _{2,-5/2}+\Vert \tilde{\varGamma }\pi \Vert _{2,-5/2})\\&\quad +\Vert \tilde{\varGamma }\Vert _{4,-1}\Vert D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{4,-3/2}\Big {)}\\&\le C \, \Big {(}\Vert \mathring{\nabla }D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert \mathring{\nabla }N\Vert _{3,-1}\Vert \pi \Vert _{1,2,-3/2}\\&\quad +\Vert N\Vert _{\infty ,0}(\Vert \mathring{\nabla }\pi \Vert _{2,-5/2}+\Vert \tilde{\varGamma }\Vert _{1,2,-3/2}\Vert \pi \Vert _{1,2,-3/2})\\&\quad +\Vert \tilde{\varGamma }\Vert _{1,2,-3/2}\Vert D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{1,2,-3/2}\Big {)}\\&\le C \, \big {(}\Vert D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{1,2,-3/2}+\Vert N\Vert _{\infty ,0}+\Vert \mathring{\nabla }N\Vert _{3,-1}\big {)}. \end{aligned}$$

We now obtain an estimate for \(\Vert \mathring{\nabla }^2 X\Vert \) in terms of \(\Vert \nabla ^2 X\Vert \) as follows:

$$\begin{aligned} \Vert \mathring{\nabla }^2X\Vert _{2,-5/2}&\le C \, \big {(}\Vert \nabla ^2 X\Vert _{2,-5/2}+\Vert \mathring{\nabla }(X)\tilde{\varGamma }\Vert _{2,-5/2}+\Vert X\mathring{\nabla }(\tilde{\varGamma })\Vert _{2,-5/2}\\&\quad +\Vert \tilde{\varGamma }^2X\Vert _{2,-5/2}\big {)}\\&\le C \, \big {(}\Vert \nabla ^2 X\Vert _{2,-5/2}+\Vert \mathring{\nabla }X\Vert _{3,-1}\Vert \tilde{\varGamma }\Vert _{6,-3/2}\\&\quad +\Vert X\Vert _{\infty ,0}(\Vert \mathring{\nabla }\tilde{\varGamma }\Vert _{2,-5/2}+\Vert \tilde{\varGamma }^2\Vert _{2,-5/2})\big {)}\\&\le C \, \big {(}\Vert \nabla ^2 X\Vert _{2,-5/2}+\Vert \mathring{\nabla }X\Vert _{3,-1}\Vert \tilde{\varGamma }\Vert _{1,2,-3/2}\\&\quad +\Vert X\Vert _{\infty ,0}(\Vert \mathring{\nabla }\tilde{\varGamma }\Vert _{2,-5/2}+\Vert \tilde{\varGamma }\Vert ^2_{1,2,-3/2})\big {)}\\&\le C \, \big {(}\Vert \nabla ^2 X\Vert _{2,-5/2}+\Vert \mathring{\nabla }X\Vert _{3,-1}+\Vert X\Vert _{\infty ,0}\big {)}. \end{aligned}$$

Combining these we have

$$\begin{aligned} \Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2}\le&\,C\left( \Vert D\varPhi ^F_{1\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{1,2,-3/2}\right. \nonumber \\&\left. +\Vert \xi \Vert _{\infty ,0}+\Vert \mathring{\nabla }\xi \Vert _{3,-1}\right) . \end{aligned}$$
(25)

The last two terms on the right-hand side are estimated using the weighted inequalities, Young’s inequality, and the definition of the \(W^{k,p}_\delta \) norm directly:

$$\begin{aligned} \Vert \xi \Vert _{\infty ,0}&\le c \Vert \xi \Vert _{1,4,0}=\Vert \xi ^{1/4}\xi ^{3/4}\Vert _{1,4,0}\nonumber \\&\le c \Vert \xi ^{1/4}\Vert _{1,8,0}\Vert \xi ^{3/4}\Vert _{1,8,0}\nonumber \\&\le c \Vert \xi \Vert ^{1/4}_{1,2,0}\Vert \xi \Vert ^{3/4}_{1,6,0}\nonumber \\&\le c \Vert \xi \Vert ^{1/4}_{1,2,0}\Vert \xi \Vert ^{3/4}_{2,2,0}\nonumber \\&\le c \Vert \xi \Vert ^{1/4}_{1,2,0}(\Vert \xi \Vert _{1,2,0}+\Vert \mathring{\nabla }^2\xi \Vert _{2,-2})^{3/4}\nonumber \\&\le c \epsilon ^{-3}\Vert \xi \Vert _{1,2,0}+\epsilon (\Vert \xi \Vert _{1,2,0}+\Vert \mathring{\nabla }^2\xi \Vert _{2,-2})\nonumber \\&\le c \epsilon ^{-3}\Vert \xi \Vert _{1,2,0}+\epsilon \Vert \mathring{\nabla }^2\xi \Vert _{2,-2}, \end{aligned}$$
(26)

for any \(\epsilon >0\).
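
The step in (26) that produces the factor \(\epsilon ^{-3}\) is Young's inequality with a parameter. As a brief sketch in our own notation (the symbols \(a,b\ge 0\) and \(\theta \in (0,1)\) are placeholders introduced for this illustration only), for any \(\epsilon >0\) we have

$$\begin{aligned} a^{\theta }b^{1-\theta }=\left( \epsilon ^{-(1-\theta )/\theta }a\right) ^{\theta }\left( \epsilon b\right) ^{1-\theta }\le \theta \,\epsilon ^{-(1-\theta )/\theta }a+(1-\theta )\,\epsilon b. \end{aligned}$$

Taking \(\theta =\frac{1}{4}\), \(a=\Vert \xi \Vert _{1,2,0}\) and \(b=\Vert \xi \Vert _{1,2,0}+\Vert \mathring{\nabla }^2\xi \Vert _{2,-2}\) gives the power \(\epsilon ^{-3}\), while \(\theta =\frac{1}{3}\) gives \(\epsilon ^{-2}\), which is the power that arises when the same argument is applied to \(\Vert \mathring{\nabla }\xi \Vert _{3,-1}\). The remaining term \(\epsilon \Vert \xi \Vert _{1,2,0}\) is absorbed into \(c\,\epsilon ^{-3}\Vert \xi \Vert _{1,2,0}\) for \(\epsilon \le 1\), at the cost of enlarging c.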

An estimate for the final term in (25) is obtained almost identically:

$$\begin{aligned} \Vert \mathring{\nabla }\xi \Vert _{3,-1}&\le \Vert \xi \Vert _{1,3,0}=\Vert \xi ^{1/3}\xi ^{2/3}\Vert _{1,3,0}\nonumber \\&\le c \Vert \xi ^{1/3}\Vert _{1,6,0}\Vert \xi ^{2/3}\Vert _{1,6,0}\nonumber \\&\le c \Vert \xi \Vert ^{1/3}_{1,2,0}\Vert \xi \Vert ^{2/3}_{1,4,0}\nonumber \\&\le c \Vert \xi \Vert ^{1/3}_{1,2,0}\Vert \xi \Vert ^{2/3}_{2,2,0}\nonumber \\&\le c \Vert \xi \Vert ^{1/3}_{1,2,0}(\Vert \xi \Vert _{1,2,0}+\Vert \mathring{\nabla }^2\xi \Vert _{2,-2})^{2/3}\nonumber \\&\le c \epsilon ^{-2}\Vert \xi \Vert _{1,2,0}+\epsilon (\Vert \xi \Vert _{1,2,0}+\Vert \mathring{\nabla }^2\xi \Vert _{2,-2})\nonumber \\&\le c \epsilon ^{-2}\Vert \xi \Vert _{1,2,0}+\epsilon \Vert \mathring{\nabla }^2\xi \Vert _{2,-2}. \end{aligned}$$
(27)

By inserting these estimates back into (25), we obtain

$$\begin{aligned} \Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2}\le&\, C \big {(}\Vert D\varPhi ^F_{1\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{1,2,-3/2}\big {)}\\&+c(\epsilon )\Vert \xi \Vert _{1,2,0}+\epsilon \Vert \mathring{\nabla }^2\xi \Vert _{2,-2}; \end{aligned}$$

choosing \(\epsilon \) small enough and applying the interpolation inequality gives

$$\begin{aligned} \Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2}\le C \left( \Vert D\varPhi ^F_{1\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{1,2,-3/2}+\Vert \xi \Vert _{2,0}\right) . \end{aligned}$$
(28)

Up to this point, we have essentially reproduced Bartnik’s argument with a little additional detail, and if we had a weighted Poincaré inequality we would be done; however, we are unaware of an appropriate Poincaré inequality in the case of a general asymptotically flat manifold with an interior boundary. Instead we consider separately the inequality near infinity, where we do have an appropriate Poincaré inequality, and on a compact domain. For some exterior region \(E_{R_0}\) we have the Poincaré inequality and therefore it follows that we have

$$\begin{aligned} \Vert \xi \Vert _{2,2,-1/2}\le&\, C \left( \Vert D\varPhi ^F_{1\,(g,\pi )}[\xi ]\Vert _{2,-5/2}+\Vert D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{1,2,-3/2}\right. \\&\left. +\Vert \xi \Vert _{2,0}+\Vert \xi \Vert _{1,2:M\setminus E_{R_0}}\right) . \end{aligned}$$

Applying the interpolation inequality again and noting \(\Vert \xi \Vert _{2:M\setminus E_{R_0}}\le C\Vert \xi \Vert _{2,0}\) completes the proof.\(\square \)

Remark 1

While Proposition 2 gives an estimate on M, the weighted Hölder, Sobolev and interpolation inequalities used above are also valid on an annular region \(A_R:=\{x\in M:r(x)\in (R,2R)\}\) (cf. [5]). In particular, using the usual Lebesgue and Sobolev norms on \(A_R\), we have

$$\begin{aligned} \Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2:A_R}\le&\, C\Big {(}\Vert D\varPhi ^F_{1\,(g,\pi )}[\xi ]\Vert _{2,-5/2:A_R}\nonumber \\&+\Vert D\varPhi ^F_{2\,(g,\pi )}[\xi ]\Vert _{1,2,-3/2:A_R}+\Vert \xi \Vert _{2,0:A_R}\Big {)} \end{aligned}$$
(29)

for \(\xi \in W^{2,2}_{\delta }(A_R)\), where C is independent of R. However, we do not have the same control on \(\Vert \xi \Vert _{2,2,-1/2:A_R}\), as the constant in the Poincaré inequality depends on \(A_R\).

Note that the true adjoint of the linearised constraint map, \(D\varPhi ^*_{(g,\pi )}\), is only defined in the weak sense, which is why we make the distinction between \(D\varPhi ^*_{(g,\pi )}\) and \(D\varPhi ^F_{(g,\pi )}\). In order to study the kernel of \(D\varPhi ^*_{(g,\pi )}\) we must first demonstrate that weak solutions to the equation \(D\varPhi ^*_{(g,\pi )}[\xi ]=0\) are sufficiently regular to consider this as a bona fide differential equation.

Proposition 3

Suppose \(\xi \in \mathcal {N}\) is a weak solution of \(D\varPhi _{(g,\pi )}^*[\xi ]=(f_1,f_2)\), where \({(f_1,f_2)\in L^2_{-5/2}\times W^{1,2}_{-3/2}}\) and \((g,\pi )\in \mathcal {F}\), then \(\xi \in H^2_{-1/2}\cap \overline{H}^1_{-1/2}\) and furthermore, \({D\varPhi ^*_{(g,\pi )}[\xi ]=D\varPhi ^F_{(g,\pi )}[\xi ]}\).

Proof

We first note that local regularity follows directly from Bartnik’s proof of Proposition 3.5 in Ref. [9]. The only place in Bartnik’s proof where boundary terms could come into play is in choosing \((h,p)\) supported in some coordinate neighbourhood. Clearly our boundary conditions do not prevent this, so there is no obstruction to applying Bartnik’s proof directly. That is, \(\xi \in H^2_{loc}\).

In the following, let \(B_R\) be an open “ball” of radius R; for \(R>2\), \(B_R:=\{x\in M: r(x)<R\}\), and define \(M_{\epsilon R}:=\{x\in B_R: \text {dist}(\Sigma ,x)>\epsilon \}\), for some small \(\epsilon \). Then

$$\begin{aligned} \int _{M_{\epsilon R}}D\varPhi _{(g,\pi )}[h,p]\cdot \xi =\int _{M_{\epsilon R}}(h,p)\cdot (f_1,f_2) \end{aligned}$$

for all \((h,p)\in C^\infty _c(M_{\epsilon R})\). In particular, since \(\xi \in H^2_{-1/2}(M_{\epsilon R})\), we have

$$\begin{aligned} \int _{M_{\epsilon R}}(h,p)\cdot D\varPhi ^F_{(g,\pi )}[\xi ]=\int _{M_{\epsilon R}}(h,p)\cdot (f_1,f_2); \end{aligned}$$

that is, \(D\varPhi ^F_{(g,\pi )}[\xi ]=(f_1,f_2)\) on any \(M_{\epsilon R}\). Therefore \({D\varPhi ^*_{(g,\pi )}[\xi ]=D\varPhi ^F_{(g,\pi )}[\xi ]}\) on M, and the formal adjoint is indeed the true adjoint when \((f_1,f_2)\in L^2_{-5/2}\times H^1_{-3/2}\), as expected.

It remains to demonstrate that \(\xi \) satisfies the boundary conditions and exhibits the correct asymptotics. To this end, we introduce a new smooth cutoff function \(\chi \in C^\infty _c(M)\) such that \(\chi \equiv 1\) on \(B_{R_0}\), for some \(R_0>2\) and \(\chi =0\) on \(M\setminus B_{2R_0}\). Define \(\chi _R(x)=\chi (xR_0/R)\), so that \(\chi _R\) has support on \(B_{2R}\). Clearly \(\chi _R\xi \in W^{2,2}_{-1/2}\), therefore Proposition 2 gives

$$\begin{aligned} \Vert \chi _R\xi \Vert _{2,2,-1/2}\le&\,C\Big {(}\Vert D\varPhi _1^F[\chi _R\xi ]\Vert _{2,-5/2}+\Vert D\varPhi _2^F[\chi _R\xi ]\Vert _{1,2,-3/2}+\Vert \xi \Vert _{2,0}\Big {)}, \end{aligned}$$
(30)

noting that \(\chi _R\xi \rightarrow \xi \) in \(L^2_{0}\). From this we now show that \(\chi _R\xi \) is uniformly bounded in \(W^{2,2}_{-1/2}\). Obtaining control of \(\Vert \chi _R\xi \Vert _{2,2,-1/2}\) independent of R is the minor omission in Ref. [9] mentioned above, however this is easily resolved as follows.

Note that \(\mathring{\nabla }\chi _R(x)=(R_0/R)\mathring{\nabla }\chi (xR_0/R)\), \(\mathring{\nabla }\chi \) is bounded, and \(\mathring{\nabla }\chi _R\) has support on the annular region \(A_R\). It follows that we have

$$\begin{aligned} \Vert u\mathring{\nabla }\chi _R\Vert _{p,\delta }\le c\Vert u/R\Vert _{p,\delta :A_R}\le c \sup _{x\in A_R}|r(x)/R|\Vert u\Vert _{p,\delta +1:A_R}\le c\Vert u\Vert _{p,\delta +1:A_R}. \end{aligned}$$
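The middle inequality can be made explicit. A minimal sketch, assuming the three-dimensional weighted norm convention \(\Vert u\Vert _{p,\delta }^p=\int _M|u|^p\,r^{-\delta p-3}\) of [5, 9] (this convention is an assumption here, as the definition is not restated in this section), and using \(r/R\in (1,2)\) on \(A_R\):

$$\begin{aligned} \Vert u/R\Vert ^p_{p,\delta :A_R}=\int _{A_R}\left( \frac{r}{R}\right) ^p|u|^p\,r^{-(\delta +1)p-3}\le 2^p\,\Vert u\Vert ^p_{p,\delta +1:A_R}. \end{aligned}$$

The factor \(2^p\) does not depend on R, which is what makes the resulting bound uniform in R.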

From this, the expression for \(D\varPhi ^F\), and the usual weighted Sobolev-type inequalities (see Theorem 1), we have

$$\begin{aligned} \Vert D\varPhi ^F_1[\chi _R\xi ]\Vert _{2,-5/2}\le \,&c\, \Big {(}\Vert \chi _R D\varPhi ^F_1[\xi ]\Vert _{2,-5/2}+\Vert \pi \xi \mathring{\nabla }\chi _R\Vert _{2,-5/2}\\&+\Vert \xi \mathring{\nabla }^2\chi _R\Vert _{2,-5/2}+\Vert \mathring{\nabla }(\xi )\mathring{\nabla }(\chi _R)\Vert _{2,-5/2}\Big {)}\\ \le \,&c \, \Big {(}\Vert f_1\Vert _{2,-5/2}+\Vert \pi \Vert _{4,-3/2}\Vert \xi \Vert _{4,0:A_R}+\Vert \xi \Vert _{2,-1/2}\\&+\Vert \mathring{\nabla }\xi \Vert _{2,-3/2:A_R} \Big {)}\\ \le \,&c \, \Big {(}\Vert f_1\Vert _{2,-5/2}+\Vert \pi \Vert _{1,2,-3/2}\Vert \xi \Vert _{1,2,0:A_R}+\Vert \xi \Vert _{2,-1/2}\\&+\Vert \mathring{\nabla }\xi \Vert _{2,-3/2:A_R} \Big {)}\\ \le \,&C \, (\Vert f_1\Vert _{2,-5/2}+\Vert \xi \Vert _{2,-1/2}+\Vert \mathring{\nabla }\xi \Vert _{2,-3/2:A_R})\\ \le \,&C \, (\Vert f_1\Vert _{2,-5/2}+\Vert \xi \Vert _{2,-1/2})+\epsilon \Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2:A_R}. \end{aligned}$$

Almost identically, we have

$$\begin{aligned} \Vert \mathring{\nabla }D\varPhi ^F_2[\chi _R\xi ]\Vert _{2,-5/2}\le \,&c\, \Big {(}\Vert \chi _R\mathring{\nabla }D\varPhi ^F_2[\xi ]\Vert _{2,-5/2}+\Vert \pi \xi \mathring{\nabla }\chi _R\Vert _{2,-5/2}\\&+\Vert \xi \mathring{\nabla }^2\chi _R\Vert _{2,-5/2}+\Vert \mathring{\nabla }(\xi )\mathring{\nabla }(\chi _R)\Vert _{2,-5/2}\Big {)}\\ \le \,&C \, (\Vert \mathring{\nabla }f_2\Vert _{2,-5/2}+\Vert \xi \Vert _{2,-1/2})+\epsilon \Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2:A_R} \end{aligned}$$

and a similar estimate for \(\Vert D\varPhi ^F_2[\chi _R\xi ]\Vert _{2,-3/2}\) holds, so we have in fact,

$$\begin{aligned} \Vert D\varPhi ^F_2[\chi _R\xi ]\Vert _{1,2,-3/2}\le C (\Vert \mathring{\nabla }f_2\Vert _{2,-5/2}+\Vert \xi \Vert _{2,-1/2})+\epsilon \Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2:A_R}. \end{aligned}$$

Inserting the estimates above into (30) we arrive at

$$\begin{aligned} \Vert \mathring{\nabla }^2(\chi _R\xi )\Vert _{2,-5/2}\le&\,C \big {(}\Vert f_1\Vert _{2,-5/2}+\Vert f_2\Vert _{1,2,-3/2}\nonumber \\&+\Vert \xi \Vert _{2,-1/2}+\epsilon \Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2:A_R}\big {)}. \end{aligned}$$
(31)

Unfortunately we are unable to ensure \(\Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2:A_R}\lesssim \Vert \mathring{\nabla }^2(\chi _R\xi )\Vert _{2,-5/2}\), so we cannot absorb the last term into the left-hand side of (31). Recalling Remark 1, we apply the local version of Proposition 2 to obtain

$$\begin{aligned} \Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2:A_R}\le C\left( \Vert f_1\Vert _{2,-5/2}+\Vert f_2\Vert _{1,2,-3/2}+\Vert \xi \Vert _{2,0}\right) . \end{aligned}$$
(32)

Finally we obtain the desired uniform bound, applying the interpolation inequality from Theorem 1,

$$\begin{aligned} \Vert \chi _R\xi \Vert _{2,2,-1/2}&\le C(\Vert \chi _R\xi \Vert _{2,-1/2}+\Vert \mathring{\nabla }^2(\chi _R\xi )\Vert _{2,-5/2})\nonumber \\&\le \,C \big {(}\Vert f_1\Vert _{2,-5/2}+\Vert f_2\Vert _{1,2,-3/2}+\Vert \xi \Vert _{2,-1/2}\big {)}. \end{aligned}$$
(33)
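For orientation, the way (31) and (32) combine can be sketched as follows. The names \(C_1\), \(C_2\) for the constants in (31) and (32) are introduced only for this sketch, and we assume \(r\ge 1\) on M and \(0\le \chi _R\le 1\), so that \(\Vert \xi \Vert _{2,0}\le \Vert \xi \Vert _{2,-1/2}\) and \(\Vert \chi _R\xi \Vert _{2,-1/2}\le \Vert \xi \Vert _{2,-1/2}\):

$$\begin{aligned} \Vert \mathring{\nabla }^2(\chi _R\xi )\Vert _{2,-5/2}&\le C_1\big (\Vert f_1\Vert _{2,-5/2}+\Vert f_2\Vert _{1,2,-3/2}+\Vert \xi \Vert _{2,-1/2}\big )+C_1\epsilon \Vert \mathring{\nabla }^2\xi \Vert _{2,-5/2:A_R}\\&\le C_1(1+\epsilon C_2)\big (\Vert f_1\Vert _{2,-5/2}+\Vert f_2\Vert _{1,2,-3/2}+\Vert \xi \Vert _{2,-1/2}\big ). \end{aligned}$$

Neither \(C_1\) nor \(C_2\) depends on R, so the right-hand side is a bound uniform in R.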

It follows that \(\chi _R\xi \) converges weakly to \(\xi \) in \(H^2_{-1/2}\). Now, since the formal adjoint agrees with the true adjoint, the boundary terms arising from integration by parts necessarily vanish; explicitly (cf. Eq. (42)),

$$\begin{aligned} \oint _\Sigma \left( \xi ^0(\nabla _i\text {tr}_gh-\nabla ^jh_{ij})\sqrt{g}-2\xi ^jp_{ij}\right) dS^i=0, \end{aligned}$$
(34)

for all \((h,p)\in (H^2_{-1/2}\cap \overline{H}^1_{-1/2})\times H^1_{-3/2}\). It follows that \(\xi \) vanishes on \(\Sigma \) and therefore, \(\xi \in H^2_{-1/2}\cap \overline{H}^1_{-1/2}\). \(\square \)

Theorem 2

For all \((s,S)\in \mathcal {N}\), the level set \(\mathcal {C}(s,S):=\varPhi ^{-1}(s,S)\) is a Hilbert submanifold of \(\mathcal {F}\). We refer to this as the constraint manifold.

Proof

By the implicit function theorem, we simply must demonstrate that \(D\varPhi _{(g,\pi )}\) is surjective and the kernel splits. The kernel trivially splits with respect to the Hilbert structure, so we simply must prove that \(D\varPhi ^*_{(g,\pi )}\) has trivial kernel and \(D\varPhi _{(g,\pi )}\) has closed range. It is clear from the above that elements \(\xi \) in the kernel of \(D\varPhi ^*_{(g,\pi )}\) indeed satisfy \(D\varPhi ^F_{(g,\pi )}[\xi ]=0\). Once we have this, note that Bartnik’s proof of the triviality of \(\ker (D\varPhi ^F_{(g,\pi )})\) relies only on the structure of the equation and the asymptotics assumed (see Footnote 1); it is entirely unaffected by the inclusion of an interior boundary. Therefore this proof applies here and we simply must prove that \(D\varPhi _{(g,\pi )}\) is surjective, which is again adapted from Bartnik’s arguments to deal with the boundary. The key to making this argument work is the estimate given earlier by Lemma 2.

The idea is to consider a restriction of \(D\varPhi _{(g,\pi )}\) to variations of a particular form, so that the operator becomes elliptic. Then we simply must show that this restricted operator has closed range and finite dimensional cokernel. We consider

$$\begin{aligned} h_{ij}(y)=-\frac{1}{2}yg_{ij},\qquad p^{ij}(Y)=\frac{1}{2}(\nabla ^iY^j+\nabla ^jY^i-g^{ij}\nabla _kY^k)\sqrt{g} \end{aligned}$$

for \(y,Y\in H^2_{-1/2}\cap \overline{H}^1_{-1/2}(M)\).

For the operator \(F[y,Y]:=D\varPhi _{(g,\pi )}[h(y),p(Y)]\), we have

$$\begin{aligned} F[y,Y]= \begin{bmatrix} \varDelta y\sqrt{g}-\frac{1}{4}\varPhi _0(g,\pi )y+\frac{1}{2}\pi ^k_k\nabla _j Y^j-2\pi ^{ij}\nabla _i Y_j\\ \varDelta Y_i \sqrt{g}+R_{ij}Y^j\sqrt{g}-\nabla _j(\pi ^j_i)y-\pi _i^j\mathring{\nabla }_jy+\frac{1}{2}\pi ^j_j\mathring{\nabla }_iy \end{bmatrix}. \end{aligned}$$

and the formal adjoint is given by

$$\begin{aligned} F^F[z,Z]= \begin{bmatrix} \varDelta z\sqrt{g}-\frac{1}{4}\varPhi _0(g,\pi )z+\pi ^j_i\mathring{\nabla }_j Z^i-\frac{1}{2}\mathring{\nabla }_i(\pi ^j_j Z^i)\\ \varDelta Z_j \sqrt{g}+2\nabla _i(\pi ^i_j z)-\frac{1}{2}\nabla _j(\pi ^i_i z)+R_{ij}Z^i\sqrt{g} \end{bmatrix}. \end{aligned}$$

It follows from the proof of Proposition 3 that any \((z,Z)\) that weakly satisfies \({F^*[z,Z]=0}\) is in fact in \(H^2_{-1/2}\) and the boundary terms arising from integration by parts vanish; that is, \((z,Z)\in H^2_{-1/2}\cap \overline{H}^1_{-1/2}(M)\). From Lemma 2, it is straightforward to show using the weighted Hölder, Sobolev and interpolation inequalities (cf. eq. (3.42) of [9]), that we have the scale-broken estimate:

$$\begin{aligned} \Vert (y,Y)\Vert _{2,2,-1/2}\le C(\Vert F[y,Y]\Vert _{2,-5/2}+\Vert (y,Y)\Vert _{2,0}). \end{aligned}$$
(35)

It is now a standard argument to demonstrate that F has closed range and finite dimensional cokernel (cf. Ref. [11], Theorem 6.3, and Ref. [5], Theorem 1.10).

Let \((y,Y)_i\) be a sequence in \(\ker (F)\) satisfying \(\Vert (y,Y)_i\Vert _{2,2,-1/2}\le 1\); that is, a sequence in the closed unit ball in \(\ker (F)\). By the weighted Rellich compactness theorem, passing to a subsequence, \((y,Y)_{i_n}\) converges strongly in \(L^2_0\), which in turn implies via (35) that \((y,Y)_{i_n}\) converges strongly in \(H^2_{-1/2}\). That is, the closed unit ball in \(\ker (F)\) is compact, and therefore \(\ker (F)\) is finite dimensional. It follows that the domain of F can be split as \(H^2_{-1/2}\cap \overline{H}^1_{-1/2}=\ker (F)\oplus W\), for some closed orthogonal complementary subspace, W. Now, for \((y,Y)\in W\), we prove

$$\begin{aligned} \Vert (y,Y)\Vert _{2,2,-1/2}\le C\Vert F[y,Y]\Vert _{2,-5/2} \end{aligned}$$
(36)

by contradiction. Assume that there exists a sequence \((y,Y)_i\in W\) satisfying \({\Vert (y,Y)_i\Vert _{2,2,-1/2}=1}\), while \(\Vert F[y,Y]_i\Vert _{2,-5/2}\rightarrow 0\). By the above argument, passing to a subsequence, we have that \((y,Y)_{i_n}\) converges strongly to \((y,Y)\in W\). By continuity, \(F[y,Y]=0\), while \(\Vert (y,Y)\Vert _{2,2,-1/2}=1\), implying the intersection of \(\ker (F)\) and W is nontrivial. That is, by contradiction, (36) holds. An identical argument to that used in the proof of Proposition 1 now shows that F has closed range.
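The closed-range step can be sketched as follows. Proposition 1 is not reproduced in this section, so this is offered only as the standard reasoning that such an argument usually takes. Suppose \((y,Y)_n\in W\) and \(F[y,Y]_n\rightarrow v\) in \(L^2_{-5/2}\). By linearity and (36),

$$\begin{aligned} \Vert (y,Y)_n-(y,Y)_m\Vert _{2,2,-1/2}\le C\Vert F[y,Y]_n-F[y,Y]_m\Vert _{2,-5/2}\rightarrow 0, \end{aligned}$$

so \((y,Y)_n\) is Cauchy and converges in W to some \((y,Y)\), and continuity of F gives \(F[y,Y]=v\). Since F maps \(\ker (F)\oplus W\) onto the same set as it maps W, this shows the range of F is closed.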

Furthermore, since \(F^F\) has the same form as F, an estimate of the form of (35) also holds for \({(z,Z)\in \ker (F^*)}\), which implies that \(\ker (F^*)\) is finite dimensional. Since the range of F is closed with finite codimension and is contained in the range of \(D\varPhi \), the range of \(D\varPhi \) is also closed; combined with the triviality of \(\ker (D\varPhi ^*)\), this gives surjectivity of \(D\varPhi \), which completes the proof.\(\square \)

4 Critical points of the ADM mass

In [9] Bartnik discusses a result of Corvino, which states that if there exists an asymptotically flat extension to a compact manifold with boundary, minimising the ADM energy, then it must be a static metric [14, 15]. Specifically, Bartnik argues that it would be more natural to obtain Corvino’s result from the Hamiltonian considerations he uses to prove a similar result for complete manifolds with no boundary. Here we give such an argument, considering the mass rather than the energy, and obtain that critical points of the mass functional only occur if the exterior is stationary. Furthermore, if these stationary solutions are smooth, they must in fact be static black hole exteriors. This is elaborated on below in Remark 2. It should be noted that our set of extensions is larger than the usual set of admissible extensions in the context of the Bartnik mass. In order to ensure the validity of the positive mass theorem, we would also require conditions on the mean curvature of \(\Sigma \) (see [25]). Since the first version of this article was posted to the arXiv, Anderson and Jauregui have successfully established this in the time-symmetric case using Banach manifolds modelled on weighted Hölder spaces [3]. The framework for this argument has more recently been developed by Z. An [1], and very recently this framework has been used to establish a version of this result outside of time-symmetry [2]. The content of this section has also been discussed in [22] using stronger boundary conditions than considered here, and indeed stronger than the preferred mean curvature boundary conditions mentioned above.

As in the preceding section, we quote Bartnik’s results where the proofs require no modifications to this case. Furthermore, the results established here are again based on adapting Bartnik’s arguments to deal with the boundary. The results of Sect. 3 are precisely what is needed for these arguments to work in the case considered here.

The energy-momentum covector \(\mathbb {P}_\mu =(m_0,p_i)\) is defined by

$$\begin{aligned} 16\pi m_0&:=\oint _{S_\infty }\mathring{g}^{jk}(\mathring{\nabla }_kg_{ij}-\mathring{\nabla }_i g_{jk})dS^i, \end{aligned}$$
(37)
$$\begin{aligned} 16\pi p_i&:=2\oint _{S_\infty }\pi _{ij}dS^j. \end{aligned}$$
(38)

It is useful to consider the pairing of the energy-momentum covector with some asymptotic translation, \(\xi _\infty =(\xi _\infty ^0,\xi _\infty ^i)\in \mathbb {R}^{1,3}\),

$$\begin{aligned} 16\pi \xi _\infty \cdot \mathbb {P}=\oint _\infty \left( \xi _\infty ^0\mathring{g}^{ik}(\mathring{\nabla }_k g_{ij}-\mathring{\nabla }_j g_{ik})+2\xi _\infty ^i\pi _{ij}\right) dS^j. \end{aligned}$$

By writing this as a scalar-valued flux integral at infinity, we can make sense of this as an integral over M through the divergence theorem. To extend \(\xi _\infty \) to a scalar function and vector field over M, we identify \(\xi _\infty ^0\) with a constant function and \(\xi ^i_\infty \) with a \(\mathring{g}\)-parallel vector field in a neighbourhood of infinity; that is, we identify \(\xi _\infty \) with some \(\tilde{\xi }\), defined near infinity and satisfying \(\mathring{\nabla }\tilde{\xi }\equiv 0\). We then choose any smooth bounded \(\xi _{{{\,\mathrm{ref}\,}}}=(\xi _{{{\,\mathrm{ref}\,}}}^0,\xi _{{{\,\mathrm{ref}\,}}}^i)\) supported away from \(\Sigma \) and with \(\xi _{{{\,\mathrm{ref}\,}}}\equiv \tilde{\xi }\) near infinity to represent \(\xi _\infty \). This allows us to write the energy-momentum as

$$\begin{aligned} 16\pi \xi ^0_\infty \mathbb {P}_0(g)=&\int _M\Big {(}\xi _{{{\,\mathrm{ref}\,}}}^0(\mathring{g}^{ki}\mathring{g}^{jl}\mathring{\nabla }_k\mathring{\nabla }_l g_{ij}-\mathring{\varDelta }\text {tr}_{\mathring{g}} g)\nonumber \\&+\mathring{g}^{ki}\mathring{g}^{jl}\mathring{\nabla }_k \xi _{{{\,\mathrm{ref}\,}}}^0(\mathring{\nabla }_l g_{ij}-\mathring{\nabla }_i g_{jl})\Big {)} \sqrt{\mathring{g}}, \end{aligned}$$
(39)
$$\begin{aligned} 16\pi \xi ^i_\infty \mathbb {P}_i(\pi )=&\,2\int _M\left( \xi _{{{\,\mathrm{ref}\,}}}^i\mathring{\nabla }_j\pi _i^j+\pi ^j_i\mathring{\nabla }_j\xi _{{{\,\mathrm{ref}\,}}}^i\right) . \end{aligned}$$
(40)
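The passage from the flux integrals (37), (38) to the volume integrals (39), (40) is the divergence theorem with respect to \(\mathring{g}\). As a sketch for the momentum term, the simpler of the two, we treat \(\pi \) as a tensor density so that no volume factor appears, and we assume the decay conditions are such that raising the index on \(\pi _{ij}\) with g or with \(\mathring{g}\) gives the same limit at infinity:

$$\begin{aligned} 2\oint _{S_\infty }\xi ^i_{{{\,\mathrm{ref}\,}}}\pi ^j_i\,dS_j=2\int _M\mathring{\nabla }_j\left( \xi ^i_{{{\,\mathrm{ref}\,}}}\pi ^j_i\right) =2\int _M\left( \xi ^i_{{{\,\mathrm{ref}\,}}}\mathring{\nabla }_j\pi ^j_i+\pi ^j_i\mathring{\nabla }_j\xi ^i_{{{\,\mathrm{ref}\,}}}\right) . \end{aligned}$$

No boundary term arises on \(\Sigma \) because \(\xi _{{{\,\mathrm{ref}\,}}}\) is supported away from \(\Sigma \), and the left-hand side equals \(16\pi \xi ^i_\infty p_i\) because \(\xi _{{{\,\mathrm{ref}\,}}}\equiv \tilde{\xi }\) near infinity. The energy term is handled in the same way: the product rule produces second-derivative terms multiplied by \(\xi ^0_{{{\,\mathrm{ref}\,}}}\) and first-derivative terms multiplied by \(\mathring{\nabla }\xi ^0_{{{\,\mathrm{ref}\,}}}\).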

Now it should be noted that \(\mathbb {P}\) is not well-defined everywhere on \(\mathcal {F}\); however, it is well-defined on any constraint manifold \(\mathcal {C}(s,S)\) with \((s,S)\in L^1=L^1_{-3}\). In Section 4 of [9], it is shown that this definition is equivalent to the usual definition of the ADM energy-momentum and is in fact a smooth map on each \(\mathcal {C}(s,S)\) with \((s,S)\in L^1\).

It is well known that the mass must be added to the ADM Hamiltonian in order to generate the correct equations of motion [28]. The formal equations of motion arising from the ADM Hamiltonian are indeed the correct evolution equations, however the boundary terms, coming from the integration by parts, correspond to the linearisation of the energy-momentum; that is, for \((g,\pi )\in \mathcal {F}\), the correct Hamiltonian to generate the equations of motion is given by

$$\begin{aligned} \mathcal {H}^{(\xi )}(g,\pi )=16\pi \xi ^\mu _\infty \mathbb {P}_\mu -\int _M\xi ^\mu \varPhi _\mu (g,\pi ), \end{aligned}$$
(41)

where \(\xi \in \varXi :=\{\xi :\xi -\xi _{{{\,\mathrm{ref}\,}}}\in H^2_{-1/2}\cap \overline{H}^1_{-1/2}(M)\}\). While the separate terms in (41) are not well-defined on \(\mathcal {F}\), by combining the terms into a single integrand, the dominant terms in each component cancel exactly (cf. [9]). Henceforth, we consider the Hamiltonian to be this regularised one, with the dominant terms cancelled. Note that \(\xi \) and its first derivatives are required to vanish on \(\Sigma \) in order to ensure that the surface integrals arising there, due to integration by parts in obtaining the equations of motion, do indeed vanish. This can be seen by considering the following:

$$\begin{aligned}&(h,p)\cdot D\varPhi ^F_{(g,\pi )}[\xi ]-\xi \cdot D\varPhi _{(g,\pi )}[h,p]\nonumber \\&\quad =\,\nabla ^i\Big {(}(\xi ^0(\mathring{\nabla }_i\text {tr}_gh-\nabla ^jh_{ij})+\mathring{\nabla }^j(\xi ^0)h_{ij}-\text {tr}_g h\mathring{\nabla }_i(\xi ^0))\sqrt{g}-2\xi ^jp_{ij}\Big {)}\nonumber \\&\qquad -\nabla ^i\Big {(}2\pi ^k_i h_{jk}\xi ^j - \pi ^{jk}h_{jk}\xi _i \Big {)}. \end{aligned}$$
(42)

The surface integrals at infinity are exactly cancelled by the term \(16\pi \xi ^\mu _\infty \mathbb {P}_\mu \) (cf. [9]). In particular, we have for all \((g,\pi )\in \mathcal {F}\), \((h,p)\in T_{(g,\pi )}\mathcal {F}\) and \(\xi \in \varXi \),

$$\begin{aligned} D\mathcal {H}^{(\xi )}_{(g,\pi )}[h,p]=-\int _M(h,p)\cdot D\varPhi ^F_{(g,\pi )}[\xi ]. \end{aligned}$$
(43)

The ability to express the variation of the Hamiltonian in this form is precisely what we mean by the statement that the correct equations of motion are generated. In this form, we can interpret the variation of the Hamiltonian density with respect to each of g and \(\pi \); that is, by (43), \(\frac{\delta \mathcal {H}^{(\xi )}}{\delta g}=-D\varPhi ^F_{1\,(g,\pi )}[\xi ]\) and \(\frac{\delta \mathcal {H}^{(\xi )}}{\delta \pi }=-D\varPhi ^F_{2\,(g,\pi )}[\xi ]\). We then can write Hamilton’s equations as

$$\begin{aligned} \frac{\partial }{\partial t} \begin{bmatrix} g\\ \pi \\ \end{bmatrix} =-\begin{bmatrix}0&{}1\\ -1&{}0\\ \end{bmatrix}\circ D\varPhi _{(g,\pi )}^F[\xi ], \end{aligned}$$
(44)

where t is interpreted as the flow parameter of \((N,X)\) in the full spacetime; these are exactly the Einstein evolution equations. This also motivates a result of Moncrief [26], equating solutions to \(D\varPhi _{(g,\pi )}^F[\xi ]=0\) with Killing vectors in the spacetime. For this reason, we say an initial data set \((g,\pi )\) is stationary if there exists \(\xi \), asymptotic to a constant timelike translation, satisfying \(D\varPhi _{(g,\pi )}^F[\xi ]=0\).
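Written out componentwise, (44) reads as follows. This is only an unpacking of the matrix product in (44), with the two components of \(D\varPhi ^F_{(g,\pi )}[\xi ]\) denoted \(D\varPhi ^F_{1\,(g,\pi )}[\xi ]\) and \(D\varPhi ^F_{2\,(g,\pi )}[\xi ]\) as in Proposition 2:

$$\begin{aligned} \frac{\partial g}{\partial t}=-D\varPhi ^F_{2\,(g,\pi )}[\xi ],\qquad \frac{\partial \pi }{\partial t}=D\varPhi ^F_{1\,(g,\pi )}[\xi ]. \end{aligned}$$

In particular, if \(D\varPhi ^F_{(g,\pi )}[\xi ]=0\) then the data is unchanged along the flow, which is consistent with the definition of stationarity given above.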

It is evident that the Hamiltonian (41) has the form of a Lagrange function, where we seek to find extrema of \(\xi ^\mu _\infty \mathbb {P}_\mu \) subject to the constraints being satisfied. As such, we need to make use of the following Lagrange multipliers theorem for Banach manifolds (cf. Theorem 6.3 of [9]).

Theorem 3

Suppose \(K:B_1\rightarrow B_2\) is a \(C^1\) map between Banach manifolds, such that \(DK_u:T_uB_1\rightarrow T_{K(u)}B_2\) is surjective, with closed kernel and closed complementary subspace for all \(u\in K^{-1}(0)\). Let \(f\in C^1(B_1)\) and fix \(u\in K^{-1}(0)\), then the following statements are equivalent:

  1. (i)

    For all \(v\in \ker DK_u\), we have

    $$\begin{aligned} Df_u(v)=0.\end{aligned}$$
    (45)
  2. (ii)

    There is \(\lambda \in B_2^*\) such that for all \(v\in B_1\),

    $$\begin{aligned} Df_u(v)=\left<\lambda ,DK_u(v)\right>,\end{aligned}$$
    (46)

    where \(< \, ,>\) refers to the natural dual pairing.
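As a finite-dimensional illustration (an example chosen here for orientation, not taken from [9]), let \(B_1=\mathbb {R}^2\), \(B_2=\mathbb {R}\), \(K(x,y)=x^2+y^2-1\) and \(f(x,y)=x\). At \(u=(1,0)\in K^{-1}(0)\) and for \(v=(v_1,v_2)\), we have

$$\begin{aligned} DK_u(v)=2v_1,\qquad \ker DK_u=\{(0,v_2)\},\qquad Df_u(v)=v_1. \end{aligned}$$

So \(Df_u\) vanishes on \(\ker DK_u\), which is statement (i), and (46) holds with \(\lambda =\frac{1}{2}\), which is statement (ii).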

From this, we prove the following.

Theorem 4

Let \(\xi _\infty \in \mathbb {R}^{1,3}\) be some fixed future-pointing timelike vector, \((s,S)\in L^1\), and define \(E^{(\xi _\infty )}(g,\pi )\in C^\infty (\mathcal {C}(s,S))\) by

$$\begin{aligned} E^{(\xi _\infty )}(g,\pi )=\xi ^\mu _\infty \mathbb {P}_\mu (g,\pi ). \end{aligned}$$

For \((g,\pi )\in \mathcal {C}(s,S)\), the following statements are equivalent:

  1. (i)

    For all \((h,p)\in T_{(g,\pi )}\mathcal {C}(s,S)\),

    $$\begin{aligned} DE^{(\xi _\infty )}_{(g,\pi )}[h,p]=0. \end{aligned}$$
  2. (ii)

    There exists \(\xi \in \varXi \) satisfying

    $$\begin{aligned} D\varPhi ^F_{(g,\pi )}[\xi ]=0. \end{aligned}$$

Proof

Assume (i) holds for some fixed \((g,\pi )=(\tilde{g},\tilde{\pi })\in \mathcal {C}(s,S)\). Let \(K(g,\pi )=\varPhi (g,\pi )-(s,S)\) and let \(f(g,\pi )=\mathcal {H}^{(\xi )}(g,\pi )\) for some \(\xi \in \varXi \), then condition (i) of Theorem 3 is satisfied. It follows that there exists \(\lambda \in L^2_{-5/2}\) such that

$$\begin{aligned} D\mathcal {H}^{(\xi )}_{(\tilde{g},\tilde{\pi })}[h,p]=\int _M \lambda \cdot D\varPhi _{(\tilde{g},\tilde{\pi })}[h,p], \end{aligned}$$

for all \((h,p)\in T_{(\tilde{g},\tilde{\pi })}\mathcal {F}\), which combined with (43), gives

$$\begin{aligned} -\int _M(h,p)\cdot D\varPhi ^F_{(\tilde{g},\tilde{\pi })}[\xi ]=\int _M\lambda \cdot D\varPhi _{(\tilde{g},\tilde{\pi })}[h,p]. \end{aligned}$$

Now \(D\varPhi ^F_{(\tilde{g},\tilde{\pi })}[\xi ] \in L^2_{-5/2}\times W^{1,2}_{-3/2}\), so Proposition 3 then implies

$$\begin{aligned} D\varPhi _{(\tilde{g},\tilde{\pi })}^F[\xi +\lambda ]=0, \end{aligned}$$

and \(\lambda \in H^2_{-1/2}\cap \overline{H}^1_{-1/2}(M)\), which in turn implies \((\xi +\lambda )\in \varXi \).
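In more detail, the step above can be read as follows. The identity preceding it, holding for all \((h,p)\), says precisely that \(\lambda \) is a weak solution, in the sense used in Proposition 3, of

$$\begin{aligned} D\varPhi ^*_{(\tilde{g},\tilde{\pi })}[\lambda ]=-D\varPhi ^F_{(\tilde{g},\tilde{\pi })}[\xi ]=:(f_1,f_2). \end{aligned}$$

Proposition 3 then gives \(D\varPhi ^*_{(\tilde{g},\tilde{\pi })}[\lambda ]=D\varPhi ^F_{(\tilde{g},\tilde{\pi })}[\lambda ]\), and linearity of \(D\varPhi ^F\) yields \(D\varPhi ^F_{(\tilde{g},\tilde{\pi })}[\xi +\lambda ]=D\varPhi ^F_{(\tilde{g},\tilde{\pi })}[\xi ]+D\varPhi ^F_{(\tilde{g},\tilde{\pi })}[\lambda ]=0\).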

Conversely, assuming (ii) holds at some \((\tilde{g},\tilde{\pi })\), it follows from (43) that

$$\begin{aligned} D\mathcal {H}^{(\xi )}_{(\tilde{g},\tilde{\pi })}[h,p]=0, \end{aligned}$$

for all \((h,p)\in T_{(\tilde{g},\tilde{\pi })}\mathcal {F}\). Then by the definition of \(\mathcal {H}^{(\xi )}\), we have

$$\begin{aligned} D\mathcal {H}^{(\xi )}_{(\tilde{g},\tilde{\pi })}[h,p]=DE^{(\xi _\infty )}_{(\tilde{g},\tilde{\pi })}[h,p]=0, \end{aligned}$$

for all \((h,p)\in T_{(\tilde{g},\tilde{\pi })}\mathcal {C}(s,S)\); that is, (i) holds. \(\square \)

Physically, \(E^{(\xi _\infty )}\) is interpreted as the total energy viewed by an observer at infinity whose worldline is generated by \(\xi _\infty \). So Theorem 4 may be interpreted as the statement that critical points of the energy measured by \(\xi _\infty \) correspond to solutions with Killing vectors asymptotic to \(\xi _\infty \).

Let \(\eta \) be the Minkowski metric with signature \((-,+,+,+)\), and define \(\mathbb {P}^\mu =\eta ^{\mu \nu }\mathbb {P}_\nu \). Further define the total mass, \(m=\frac{-\mathbb {P}^\mu \mathbb {P}_\mu }{\sqrt{|\mathbb {P}^\mu \mathbb {P}_\mu |}}\). Recall that we have not imposed conditions on the boundary mean curvature; that is, we include initial data for which the positive mass theorem fails. Away from \(m=0\), this is a smooth function on \(\mathcal {C}(s,S)\) when \((s,S)\in L^1\). With this in mind, we have the following corollary of Theorem 4.

Corollary 1

Suppose \((g,\pi )\in \mathcal {C}(s,S)\) with \((s,S)\in L^1\), and \(\mathbb {P}^\mu \) is a past-pointing timelike vector, then the following statements are equivalent:

  1. (i)

    For all \((h,p)\in T_{(g,\pi )}\mathcal {C}(s,S)\), \(Dm_{(g,\pi )}[h,p]=0\).

  2. (ii)

    \((g,\pi )\) is a stationary initial data set, whose stationary Killing vector is proportional to \(\mathbb {P}\) at infinity and vanishes on \(\Sigma \).

It is worth noting that a Killing vector that is asymptotically constant must in fact be proportional to \(\mathbb P\) at infinity [10].

Proof

We first show the implication \((i)\implies (ii)\). Let \(\xi _\infty ^\mu =-\frac{1}{m}\mathbb {P}^\mu \) be a future-pointing unit timelike vector, parallel to \(\mathbb {P}^\mu \). It then follows that \(E^{(\xi _\infty )}(g,\pi )=m\), so (i) implies condition (i) of Theorem 4 and (ii) follows.

Conversely, if (ii) holds, then possibly after rescaling, we have some \(\xi \in \varXi \), where \(\xi _\infty ^\mu =-\frac{1}{m}\mathbb {P}^\mu \), satisfying \(D\varPhi ^F_{(g,\pi )}[\xi ]=0\). Again, \(E^{(\xi _\infty )}(g,\pi )=m\) and Theorem 4 implies (i). \(\square \)
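The identities used in this proof can be checked directly. This is a sketch, using \(\mathbb {P}^\mu \mathbb {P}_\mu =-m^2\) with \(m>0\) for timelike \(\mathbb {P}\), and holding \(\xi _\infty \) fixed at its value at \((g,\pi )\) when differentiating \(E^{(\xi _\infty )}\):

$$\begin{aligned} E^{(\xi _\infty )}(g,\pi )&=\xi ^\mu _\infty \mathbb {P}_\mu =-\frac{1}{m}\mathbb {P}^\mu \mathbb {P}_\mu =m,\qquad \eta _{\mu \nu }\xi ^\mu _\infty \xi ^\nu _\infty =\frac{1}{m^2}\mathbb {P}^\mu \mathbb {P}_\mu =-1,\\ Dm_{(g,\pi )}[h,p]&=-\frac{1}{m}\mathbb {P}^\mu D\mathbb {P}_{\mu \,(g,\pi )}[h,p]=\xi ^\mu _\infty D\mathbb {P}_{\mu \,(g,\pi )}[h,p]=DE^{(\xi _\infty )}_{(g,\pi )}[h,p]. \end{aligned}$$

The second line follows from differentiating \(m^2=-\eta ^{\mu \nu }\mathbb {P}_\mu \mathbb {P}_\nu \). It shows that the critical points of m coincide with those of \(E^{(\xi _\infty )}\) for this particular choice of \(\xi _\infty \), which is what allows Theorem 4 to be applied.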

Remark 2

If the data is smooth (see Footnote 2) vacuum data, the stationarity conclusion can be replaced with staticity by the following argument. It is known that if a Killing vector field vanishes identically on a closed spacelike 2-surface, then that 2-surface is the bifurcation surface of a bifurcate Killing horizon (see, for example, [31]). Furthermore, a result of Chruściel and Wald [13] implies the existence of a maximal spacelike hypersurface in the full spacetime containing the bifurcation surface. Then a staticity theorem of Sudarsky and Wald [29] can be applied (cf. Section 7 of [12]), which states that, under the assumption of the existence of a maximal spacelike hypersurface, if the stationary Killing vector generates the horizon, then the solution is static. That is, for the vacuum case, critical points of the mass occur exactly when the solution is the region exterior to a Schwarzschild black hole. It follows that for generic choices of \(\mathring{g}\) on \(\Sigma \), there are no smooth critical points of the mass functional.

Remark 3

The same analysis can be performed with \(\pi \equiv 0\), considering only the Hamiltonian constraint. In this case, the mass and energy are interchangeable, and we only have the lapse as the Lagrange multiplier. The conclusion from the above analysis is then that critical points of the mass correspond to static solutions, as the Killing vector is necessarily hypersurface orthogonal (cf. Theorem 8 of [14]).

4.1 Geometric boundary data

The asymptotic value of the stationary Killing vector field given by Theorem 4, comes from our choice of \(\xi _{{{\,\mathrm{ref}\,}}}\), which above we chose to be supported away from \(\Sigma \). However, if we allow \(\xi _{{{\,\mathrm{ref}\,}}}\) to be nonzero on \(\Sigma \) then the energy-momentum can no longer be expressed as integrals over M, and expression (43) no longer holds. To deal with this, we leave \(\xi _{{{\,\mathrm{ref}\,}}}\) unchanged in the definition of \(\mathbb {P}\) and we introduce \(\xi _\Sigma =(\xi ^0_\Sigma ,0,0,0)\) with support near \(\Sigma \) and \(\xi ^0_\Sigma \) constant on \(\Sigma \). We then modify the Hamiltonian to allow for \({\xi \in \hat{\varXi }:=\{\xi :\xi -\xi _{{{\,\mathrm{ref}\,}}}-\xi _\Sigma \in H^2_{-1/2}\cap \overline{H}^1_{-1/2}(M)\}}\)

$$\begin{aligned} \hat{\mathcal {H}}^{(\xi )}(g,\pi )=16\pi \xi ^\mu _\infty \mathbb {P}_\mu +2\xi ^0_\Sigma \oint _\Sigma H_g\, dS_g-\int _M\xi ^\mu \varPhi _\mu (g,\pi ), \end{aligned}$$
(47)

where \(H_g\) is the mean curvature of \(\Sigma \) in M, computed with respect to the unit normal pointing towards infinity. If we assume that \(\Sigma \) is a topological two-sphere with positive Gaussian curvature then it is well-known that there exists a unique isometric embedding of \(\Sigma \) into \(\mathbb {R}^3\), and then the Brown–York mass can be defined. Let \(H_0\) be the mean curvature of \(\Sigma \) when isometrically embedded in \(\mathbb {R}^3\) as above, then the Brown–York mass is given by

$$\begin{aligned} m_{BY}(\Sigma )=\frac{1}{8\pi }\oint _\Sigma (H_0-H_g)\,dS_g. \end{aligned}$$

Note that we could replace the term \(2\oint _\Sigma H_g\, dS_g\) in (47) with \(-16\pi m_{BY}\) when the latter is defined, as the addition of a constant does not change the equations of motion. This modification to the Hamiltonian seems somehow more intuitive as it gives a sensible measure of the energy of the system, however it should be emphasised that we require the positive Gaussian curvature condition to ensure that it is well-defined. In what follows, we will assume \(\Sigma \) has positive Gaussian curvature, however this is purely for aesthetic purposes and one may drop this assumption if the reader simply replaces \(m_{BY}\) with \(-\frac{1}{8\pi }\oint _\Sigma H_g\, dS_g\). With this in mind, we write the Hamiltonian as

$$\begin{aligned} \hat{\mathcal {H}}^{(\xi )}(g,\pi )=16\pi \left( \xi ^\mu _\infty \mathbb {P}_\mu -\xi ^0_\Sigma m_{BY}(g;\Sigma )\right) -\int _M\xi ^\mu \varPhi _\mu (g,\pi ). \end{aligned}$$
(48)
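The replacement of the mean curvature term can be checked from the definition of \(m_{BY}\). Since g is prescribed on \(\Sigma \), both \(H_0\) and \(dS_g\) are fixed there, so \(\oint _\Sigma H_0\,dS_g\) is constant on \(\mathcal {F}\), and

$$\begin{aligned} 2\xi ^0_\Sigma \oint _\Sigma H_g\,dS_g=-16\pi \xi ^0_\Sigma m_{BY}(\Sigma )+2\xi ^0_\Sigma \oint _\Sigma H_0\,dS_g. \end{aligned}$$

Hence (47) differs from \(16\pi \left( \xi ^\mu _\infty \mathbb {P}_\mu -\xi ^0_\Sigma m_{BY}\right) -\int _M\xi ^\mu \varPhi _\mu (g,\pi )\) only by the constant \(2\xi ^0_\Sigma \oint _\Sigma H_0\,dS_g\), which does not affect the variation.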

In coordinates adapted to \(\Sigma \), the linearisation of \(m_{BY}(g;\Sigma )\) is given by (cf. [24])

$$\begin{aligned}&16\pi Dm^{BY}_{(g,\pi )}(\Sigma )[h,p]=2D\left( \oint _\Sigma H_g\, dS_g \right) _{(g,\pi )}[h,p] \\&\quad =\oint _\Sigma \left( \nabla _nh_{nn}-H_\Sigma (g)h_{nn}+2h^{AB}K_{AB}-2\nabla ^ih_{in}+\nabla _n{{\,\mathrm{tr}\,}}_gh\right) dS, \end{aligned}$$

where \(A,B=1,2\) are coordinates on \(\Sigma \), n is the normal direction, and \(K_{AB}\) is the second fundamental form of \(\Sigma \) with respect to g and n. Since h vanishes on \(\Sigma \), this reduces to

$$\begin{aligned} 16\pi Dm^{BY}_{(g,\pi )}(\Sigma )[h,p]=\oint _\Sigma \left( \nabla _n{{\,\mathrm{tr}\,}}_gh-\nabla _nh_{nn}\right) dS, \end{aligned}$$

where we have also made use of the divergence theorem. Now it is straightforward to check (cf. (42)) that we have

$$\begin{aligned} (h,p)\cdot D\varPhi ^F_{(g,\pi )}[\xi ]-\xi \cdot D\varPhi _{(g,\pi )}[h,p]=16\pi \xi ^0_\Sigma Dm^{BY}_{(g,\pi )}(\Sigma )[h,p]. \end{aligned}$$

This then gives us (cf. (43))

$$\begin{aligned} D\hat{\mathcal {H}}^{(\xi )}_{(g,\pi )}[h,p]=-\int _M(h,p)\cdot D\varPhi ^F_{(g,\pi )}[\xi ], \end{aligned}$$
(49)

for all \((h,p)\in T_{(g,\pi )}\mathcal {F}\). At this point, it only requires superficial modifications to the proofs of Theorem 4 and Corollary 1, to obtain the following.

Theorem 5

Let \(\xi _\infty \in \mathbb {R}^{1,3}\) be some fixed future-pointing timelike vector, \(\xi _\Sigma \in \mathbb {R}\) be some fixed constant, \((s,S)\in L^1\), and define \(\hat{E}^{(\xi _{{{\,\mathrm{ref}\,}}})}(g,\pi )\in C^\infty (\mathcal {C}(s,S))\) by

$$\begin{aligned} \hat{E}^{(\xi _{{{\,\mathrm{ref}\,}}})}(g,\pi )=\xi ^\mu _\infty \mathbb {P}_\mu (g,\pi )-\xi _\Sigma m^{BY}(g;\Sigma ). \end{aligned}$$

For \((g,\pi )\in \mathcal {C}(s,S)\), the following statements are equivalent:

  1. (i)

    For all \((h,p)\in T_{(g,\pi )}\mathcal {C}(s,S)\),

    $$\begin{aligned} D\hat{E}^{(\xi _{{{\,\mathrm{ref}\,}}})}_{(g,\pi )}[h,p]=0. \end{aligned}$$
  2. (ii)

    There exists \(\xi \in \hat{\varXi }\) satisfying

    $$\begin{aligned} D\varPhi ^F_{(g,\pi )}[\xi ]=0. \end{aligned}$$

Note that this version of the theorem does not force the Killing vector to vanish on the boundary, but rather it is orthogonal to the initial data hypersurface there. By fixing \(\xi _\Sigma =1\) on \(\Sigma \), Corollary 1 becomes:

Corollary 2

Suppose \((g,\pi )\in \mathcal {C}(s,S)\) with \((s,S)\in L^1\), and \(\mathbb {P}^\mu \) is a past-pointing timelike vector, then the following statements are equivalent:

  1. (i)

    For all \((h,p)\in T_{(g,\pi )}\mathcal {C}(s,S)\), \(Dm_{(g,\pi )}[h,p]=Dm^{BY}_{(g,\pi )}(\Sigma )[h,p]\).

  2. (ii)

    \((g,\pi )\) is a stationary initial data set, whose stationary Killing vector is proportional to \(\mathbb {P}^\mu \) at infinity and \((-m_0,0,0,0)\) on \(\Sigma \), with the same constant of proportionality.

Remark 4

Bartnik’s geometric boundary conditions ask that the induced metric on the boundary and mean curvature are prescribed, as well as the trace of K with respect to the boundary metric and a projection, \(K_{Ak}n^k\). One would like to prove that critical points of the mass, subject to these quantities being fixed, correspond to stationary solutions. In the time-symmetric case, one would like to prove that critical points of the mass subject to only the mean curvature being fixed, correspond to static solutions. While Corollary 2 does not prove this, it does illuminate a connection. Since first posting this article to arXiv, the static case has been established by Anderson and Jauregui [3] and in full generality by An [1, 2], in both cases working on a phase space modelled on weighted Hölder spaces.

By choosing different conditions on \(\xi \), both at infinity and on \(\Sigma \), we will obtain different conditions for solutions to be stationary; essentially, these ideas can be used to find the appropriate condition for the existence of a Killing vector with prescribed boundary conditions. In [20], we use similar ideas to prove that the first law of black hole mechanics gives a condition for stationarity, when the boundary conditions on the Killing vector are inspired by bifurcate Killing horizons. Here we can include the quasi-local generalised angular momentum used in [20] to obtain a similar result, without the area/surface gravity term (as the metric is fixed on \(\Sigma \) here). One can also infer that \(\hat{E}^{(\xi _{{{\,\mathrm{ref}\,}}})}\) has no critical points when \(\xi _\infty =0\) from the fact that \(D\varPhi ^F\) has trivial kernel in \(L^2_{-1/2}\). That is, one immediately has the expected, or perhaps even obvious, result that the Brown–York mass (equivalently, the mean curvature of \(\Sigma \)) has no critical points.