1 Introduction

In [15] the authors discuss how causality arguments can be used to prove the positive mass theorem.Footnote 1 Their arguments have the advantage of relying specifically on properties of spacetime which one would expect to be characteristic of positive mass, in particular the focusing and retarding of null geodesics which pass near the source (as well as the corresponding relative time advancement of geodesics passing far from the source). Conversely, null geodesics which pass far from a negative mass source will be delayed relative to those which pass nearby. This is used to argue that a certain class of negative mass spacetimes must contain a null line as defined by Galloway [8]:

Definition 1.1

A null line is an inextendible null geodesic which is achronal. In particular, null lines cannot contain conjugate points.

On the other hand, under certain conditions (see, for example, Theorem 5.1 in Sect. 5) it is possible to prove that such a null line cannot exist. This allows us to prove a positive mass theorem among spacetimes satisfying these conditions (Corollary 5.2). If the Einstein equations are assumed, conditions on the Riemann tensor can equivalently be stated as conditions on the stress-energy tensor and hence can be thought of as requirements on the matter content of spacetime.

The argument given in [15] concerns only \(3+1\)-dimensional spacetimes and relies on the fact that null geodesics become infinitely delayed, relative to those nearer to a negative mass source, as we let their distance of closest approach tend to infinity (see [12] for further discussion). As discussed in [3] for the specific example of the Schwarzschild metric, this effect becomes vanishingly small in higher dimensions (i.e. in \(D+1\) dimensions with \(D\ge 4\)), owing to the faster decay at infinity of the leading order non-Minkowski terms in the metric. Indeed, [15] contains a brief discussion of this point and notes that it prevents the argument given from immediately generalising to higher dimensions.

However, in [5] the result is generalised to include higher dimensions (albeit with a more restrictive class of metrics than those considered in [15]). The assumptions made in this paper are weaker than those of [5]: we will require the metric to be uniformly Schwarzschildean (Definition 2.3) as in [4], rather than making the more restrictive assumption that it is strongly uniformly Schwarzschildean ([5] Sect. 1). We will also drop the assumption in [5, Theorem 1.1] that the spacetime is weakly asymptotically regular.Footnote 2 On the other hand, our proof will apply only to higher dimensions. We will also be unable to obtain a rigidity result in the \(m=0\) case. We hope that this method of proof will be enlightening as it is closer in spirit to the methods employed in [15], namely the construction of a null line as a fastest causal curve (Sect. 4) between two generators of past and future null infinity.

The theorem we will prove is the following (see later sections for definitions of various terms):

Theorem 1.2

Let (M, g) be a uniformly Schwarzschildean spacetime in \(D+1\) dimensions (\(D\ge 4\)) with ADM mass \(m_{ADM}<0\) and suppose \({\mathcal {D}}\cup {\mathcal {I}}\) is a globally hyperbolic subset of \({\tilde{M}}\). Then (M, g) contains a null line.

For \(3+1\)-dimensional spacetimes, the argument in [15], which we now outline, proceeds by constructing a fastest causal curve between a generator, \(\Lambda ^-\), of past null infinity (denoted \({\mathcal {I}}^-\)) and a generator, \(\Lambda ^+\), of future null infinity (denoted \({\mathcal {I}}^+\)). A causal (i.e. nowhere spacelike) curve is said to be faster than some other if it departs \({\mathcal {I}}^-\) no earlier and arrives at \({\mathcal {I}}^+\) no later. A fastest causal curve (if it exists) is defined to be such that no other causal curve is faster.

In order to specify the generator \(\Lambda ^+\), we specify some radial, outgoing, future pointing, Minkowski-null directionFootnote 3\(k^a\). We then define \(\Lambda ^+\) to be the intersection of \({\mathcal {I}}^+\) with the union of all null geodesics whose future pointing tangent vector asymptotes towards this direction at future null infinity. We similarly define \(\Lambda ^-\) as the intersection of \({\mathcal {I}}^-\) with the union of all null geodesics with future pointing tangent vector asymptoting to \(k^a\) at past null infinity. In this paper we will restrict attention to a quasi-Cartesian frame (Definition 2.2) and choose co-ordinates \((x_0,x_1,\ldots ,x_D)\) such that \(k^a\) has components \(k^\mu =(1,0,\ldots ,0,1)\) in this frame.

The key lemma in [15] is Lemma III.2.1. This gives an estimate of the time of flight along a curve consisting of the union of two null geodesics joined at a pointFootnote 4 with \(x_D=0\) and with endpoints on \(\Lambda ^\pm \). The time of flight is defined to be the retarded time of arrival on \(\Lambda ^+\) minus the advanced time of departure from \(\Lambda ^-\) (where these are defined using the Hamilton–Jacobi functions describing null geodesics with endpoints on \({\mathcal {I}}^+\) and \({\mathcal {I}}^-\) respectively). It is argued that this quantity behaves asymptotically like

$$\begin{aligned} 4P\cdot k\log (b/R) \end{aligned}$$
(1)

where \(P^a\) is the ADM 4-momentum of the spacetime [1], b is the value of \(r:=\sqrt{\sum \nolimits _{i=1}^{D}x_i^2}\) (defined using quasi-Cartesian co-ordinates) at the point where the null geodesics join, and R is a positive constant.

Suppose \(P\cdot k>0\), i.e. the 4-momentum of the spacetime is not future causal.Footnote 5 The aim is to show that there must then exist a fastest causal curve from \(\Lambda ^-\) to \(\Lambda ^+\) which enters the interior of the spacetime. If this is the case, then this curve must lie on the boundary of the causal future of some point \(p\in \Lambda ^-\) and hence must be a null geodesic ([20] Corollary after Theorem 8.1.2) without conjugate points ([9] Proposition 4.5.12).

The construction begins by finding points \(p\in \Lambda ^-\), \(q\in \Lambda ^+\) such that there is no causal curve which departs \(\Lambda ^-\) later than p and arrives at \(\Lambda ^+\) earlier than q. These will be the endpoints of the fastest causal curve, \(\gamma \), to be constructed. Consider a sequence of causal curves, \((\gamma _i)_{i=0}^\infty \), with endpoints on \(\Lambda ^\pm \) which tend towards p and q and with \(\gamma _i\) faster than \(\gamma _j\) for \(i>j\). The curve \(\gamma \) is defined to be the limit of this sequence of causal curves, where we use the fact that, in a globally hyperbolic set, the space of causal curves between two compact sets is compact ([19] Theorem 23).

It remains to check that \(\gamma \) does in fact enter the interior of the spacetime and hence defines a null line. The curves \((\gamma _i)_{i=0}^\infty \) can be modified (possibly making them faster) so that they consist of two null geodesics joined at a point with \(x_D=0\), \(r=b_i\). This means that the estimate (1) now applies. If we were to have \(b_i\longrightarrow \infty \) as \(i\longrightarrow \infty \) then this estimate tells us that the time of flight along \(\gamma _i\) would also diverge to \(+\infty \) as \(i\longrightarrow \infty \). In particular, the curves in the sequence \((\gamma _i)_{i=0}^\infty \) would eventually become slower than \(\gamma _0\). This contradicts the definition of the sequence, so we conclude that \(b_i\) cannot diverge along the sequence and hence, possibly restricting to a subsequence, all of the \(\gamma _i\) must enter the compact set \({\mathcal {K}}:=\left( J^+(p_0)\cap J^-(q_0)\right) \setminus {\mathcal {U}}_R\), where \({\mathcal {U}}_R:=\{x\in J^+(p_0)\cap J^-(q_0):r(x)>R\}\) (see Sect. 2 for definitions). Once again, using the compactness result of [19], we conclude that \(\gamma \) must also enter this set, and hence must enter the interior of the spacetime. We therefore conclude that \(\gamma \) is a null line.

This argument does not generalise to higher dimensions because the time of flight along curves restricted to arbitrarily large values of r no longer diverges. Instead, the time of flight estimate (1) is replaced by Lemma 3.4. This lemma says that the time of flight along a curve from \(\Lambda ^-\) to \(\Lambda ^+\) which consists of two null geodesics tends to 0 as we let \(R\longrightarrow \infty \), where R is such that \(r>R\) along the curve.

However, Lemma 3.4 combined with ideas from [3] turns out to be sufficient to construct a null line for negative mass spacetimes in higher dimensions (with some slightly modified assumptions). In particular, we generalise a result of [3] to show that, by a comparison argument involving a Minkowski metric defined on a neighbourhood of conformal infinity, the presence of negative mass allows us to construct a Minkowski-null curve from \(\Lambda ^-\) to \(\Lambda ^+\) which is timelike with respect to the physical metricFootnote 6 g. To do this, we show that for negative mass spacetimes, the Minkowski null cones at sufficiently large r are contained inside the g-null cones. Defining retarded and advanced time co-ordinates as in equation (8), it is a straightforward calculation to show that the time of flight along a Minkowski null geodesic is exactly zero. So, if we choose an \(\eta \)-null geodesic restricted to sufficiently large values of r, then this curve must be timelike with respect to the metric g. Then, since the timelike future of any point is an open set, it must be possible to modify this curve slightly to obtain a g-timelike curve, \(\gamma _0\), between \(\Lambda ^-\) and \(\Lambda ^+\) which has time of flight strictly less than zero. This allows us to use a construction similar to the one used in \(3+1\) dimensions, based on a sequence of faster and faster causal curves \((\gamma _i)_{i=0}^\infty \) from \(\Lambda ^-\) to \(\Lambda ^+\). If \(b_i\longrightarrow \infty \) along this sequence, then by Lemma 3.4 the time of flight will tend to 0 and in particular will eventually become larger than the time of flight of \(\gamma _0\). As in \(3+1\) dimensions, this allows us to conclude that a fastest causal curve from \(\Lambda ^-\) to \(\Lambda ^+\) does exist.

2 Definitions and Assumptions

In order to carry out the comparison with Minkowski spacetime mentioned in the previous section, it will be necessary to impose stronger conditions than those used in [15]. These conditions will be more similar to the ones used in [4] and [5]. In this section we outline the various assumptions made.

Definition 2.1

A spacetime (M, g) is a connected manifold, M, of dimension \(D+1\) (\(D\ge 3\)) equipped with a \(C^{1,1}\) Lorentzian metric g of signature (D, 1).

For the purposes of Theorem 1.2, requiring the metric to be \(C^{1,1}\) will be sufficient. In order to obtain a focusing result in Sect. 5, it may be necessary to make stronger assumptions. For example, in Theorem 5.1 we assume that the quantity \(R_{ab}T^aT^b\) is finite and continuous, where \(T^a\) is tangent to a null geodesic. To ensure this, it would be sufficient to assume that the metric is \(C^2\).

Definition 2.2

A spacetime (M, g) admits quasi-Cartesian co-ordinates if there are co-ordinates, defined on some subset of M diffeomorphic to \(\mathbbm {R}\times \left( \mathbbm {R}^D\setminus B\right) \) (where B denotes a closed ball in \(\mathbbm {R}^D\)), with respect to which the components of the metric, g, take the form

$$\begin{aligned} g_{\mu \nu }=\eta _{\mu \nu }+h_{\mu \nu } \end{aligned}$$
(2)

For our purposes it will be sufficient to assume that

$$\begin{aligned} \begin{aligned} h_{\mu \nu }&=O\left( r^{-\alpha }\right) \\ \partial _\rho h_{\mu \nu }&=O\left( r^{-(1+\alpha )}\right) \end{aligned} \end{aligned}$$
(3)

for some \(\alpha >1\).

Note that Schwarzschild spacetime in \(3+1\) dimensions does not satisfy these conditions since in this case we have \(h_{\mu \nu }=O(r^{-1})\) and \(\partial _\rho h_{\mu \nu }=O(r^{-2})\). As a result, this spacetime is not covered by the results of this paper. However, the conditions stated above are satisfied by Schwarzschild in higher dimensions.
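As a hedged illustration of the last statement, the fall-off of \(h_{\mu \nu }\) for the Schwarzschild line element (4) can be read off directly. The sketch below assumes that the \(x^i\) are the Cartesian co-ordinates built from \((r,\omega )\) in the usual way, so that \(dr=x_idx^i/r\); this identification is an assumption made for illustration and is not spelled out in the text.

$$\begin{aligned} \begin{aligned} h_{tt}&=\frac{2m}{r^{D-2}},\qquad h_{ti}=0,\\ h_{ij}&=\left( \frac{1}{1-\frac{2m}{r^{D-2}}}-1\right) \frac{x_ix_j}{r^2}=\frac{2m}{r^{D-2}}\frac{x_ix_j}{r^2}+O\left( r^{-2(D-2)}\right) \end{aligned} \end{aligned}$$

Since \(|x_ix_j|\le r^2\), this gives \(h_{\mu \nu }=O(r^{-(D-2)})\), and each derivative lowers the power by one more, so (3) holds with \(\alpha =D-2\). The requirement \(\alpha >1\) is then equivalent to \(D\ge 4\), consistent with the exclusion of the \(3+1\)-dimensional case.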

Requiring that a spacetime admits quasi-Cartesian co-ordinates will allow us to prove Lemma 3.4; however, in order to construct a null line we will need to consider a more restrictive class of metrics. Recall that the \(D+1\)-dimensional Schwarzschild metric with ADM mass \(m_{ADM}\in \mathbbm {R}\), which we denote \(g_m\), has line element

$$\begin{aligned} ds_m^2=-\left( 1-\frac{2m}{r^{D-2}}\right) dt^2 +\frac{dr^2}{1-\frac{2m}{r^{D-2}}}+r^2d\omega ^2_{D-1} \end{aligned}$$
(4)

where \(d\omega ^2_{D-1}\) is the round line element on the unit \((D-1)\)-sphere, \(S^{D-1}\), and the mass parameter, m, is related to \(m_{ADM}\) byFootnote 7

$$\begin{aligned} m_{ADM}=\frac{(D-1)Area(S^{D-1})}{8\pi }m \end{aligned}$$
(5)
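
For orientation, (5) can be evaluated in the two lowest dimensions. The values \(Area(S^2)=4\pi \) and \(Area(S^3)=2\pi ^2\) used here are the standard areas of the unit spheres; they are quoted as known facts rather than taken from the text.

$$\begin{aligned} \begin{aligned} D=3:&\quad m_{ADM}=\frac{2\cdot 4\pi }{8\pi }m=m,\\ D=4:&\quad m_{ADM}=\frac{3\cdot 2\pi ^2}{8\pi }m=\frac{3\pi }{4}m \end{aligned} \end{aligned}$$

In particular, the prefactor in (5) is positive, so \(m_{ADM}\) and m always have the same sign.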

Definition 2.3

(Following [4]) For \(m\in \mathbbm {R}\), we say that a metric g on \(\mathbbm {R}\times \left( \mathbbm {R}^D\setminus B\right) \), where B is a ball of radius R with \(R^{D-2}>2m\), is uniformly Schwarzschildean if, in the co-ordinates of (4) (or equivalently in the co-ordinates of equation (1) of [4]):

$$\begin{aligned} \begin{aligned} g-g_m&=o\left( |m|r^{-(D-2)}\right) \\ \partial _i(g-g_m)_{jk}&=o\left( |m|r^{-(D-1)}\right) \end{aligned} \end{aligned}$$
(6)

As in [4] we will abuse notation and allow \(m=0\) in this definition, by which we mean the metric is flat for \(r>R\), for some \(R\in \mathbbm {R}_{\ge 0}\).Footnote 8

Note that the conditions imposed on the spacetime are less restrictive than the ones used in [5], where the spacetime is assumed to be strongly uniformly Schwarzschildean.

Throughout this paper we will refer to the following sets:

$$\begin{aligned} \begin{aligned} J^+(p)&=\{q\in {\tilde{M}}\,|\,\exists \text { a smooth future-directed causal curve from } p \text { to } q\}\\ J^-(p)&=\{q\in {\tilde{M}}\,|\,p\in J^+(q)\}\\ I^+(p)&=\{q\in {\tilde{M}}\,|\,\exists \text { a smooth future-directed timelike curve from } p \text { to } q\}\\ I^-(p)&=\{q\in {\tilde{M}}\,|\,p\in I^+(q)\} \end{aligned} \end{aligned}$$
(7)

For non-compact spacetimes, (M, g), we can define the same sets as subsets of M rather than as subsets of \({\tilde{M}}\). As in [15], we define single points to be curves of zero length, so \(p\in J^\pm (p)\) but \(p\notin I^\pm (p)\).

In Theorem 1.2 we will require that \({\mathcal {D}}\cup {\mathcal {I}}\) be globally hyperbolic as a subset of \({\tilde{M}}\), where \({\mathcal {D}}\) denotes the domain of outer communications \({\mathcal {D}}=I^-({\mathcal {I}}^+)\cap I^+({\mathcal {I}}^-)\) and \({\mathcal {I}}={\mathcal {I}}^+\cup {\mathcal {I}}^-\). By this we mean that \({\mathcal {D}}\cup {\mathcal {I}}\) is strongly causal and contains \(J^+(p)\cap J^-(q)\) as a compact subset for each \(p,q\in {\mathcal {D}}\cup {\mathcal {I}}\) ([9] Sect. 6.6).

Following [9], we make the following definition:

Definition 2.4

A spacetime (M, g) is asymptotically empty and simple if there is a strongly causal spacetime \(({\tilde{M}},{\tilde{g}})\) and an embedding \(\theta :M\longrightarrow {\tilde{M}}\) which embeds M as a manifold with smooth boundary \(\partial M\) in \({\tilde{M}}\), such that

  1. there is a smooth function \(\Omega \) on \({\tilde{M}}\) such that on \(\theta (M)\), \(\Omega \) is positive and \(\Omega ^2g=\theta _*({\tilde{g}})\);

  2. \(\Omega =0\) and \(d\Omega \ne 0\) on \(\partial M\);

  3. \(R_{ab}=0\) in an open neighbourhood of \({\mathcal {I}}\) in \({\tilde{M}}\); and

  4. every null geodesic in M acquires a future and past endpoint on \({\mathcal {I}}\).

If (M, g) is an asymptotically empty and simple spacetime, then \(\partial M \) is a null surface and can be split into two parts [9]: past null infinity, denoted \({\mathcal {I}}^-\), where null geodesics have their past endpoints, and future null infinity, denoted \({\mathcal {I}}^+\), where null geodesics have their future endpoints.

It is common to also label the following points, which lie in the topological boundary of \({\tilde{M}}\):

  • future timelike infinity, \(i^+\): the point consisting of the future endpoints of timelike geodesics;

  • past timelike infinity, \(i^-\): the point consisting of the past endpoints of timelike geodesics;

  • and spatial infinity, \(i^0\): the point consisting of the endpoints of spacelike geodesics.

The final condition in Definition 2.4 is extremely restrictive, since it rules out spacetimes containing black hole regions. We will instead consider spacetimes which are weakly asymptotically empty and simple [9].

Definition 2.5

A spacetime (M, g) is weakly asymptotically empty and simple if there is an asymptotically empty and simple spacetime \((M',g')\) and a neighbourhood \(U'\) of \(\partial M '\) in the corresponding \({\tilde{M}}'\) such that \(U'\cap M'\) is isometric to a subset of M.

For a weakly asymptotically empty and simple spacetime, (M, g), we define the conformal boundary at infinity, denoted \({\mathcal {I}}\), to be the points in \(\partial M \) which are identified with \(\partial M '\) by the isometry in Definition 2.5. This can then be split into two parts, \({\mathcal {I}}^+\) and \({\mathcal {I}}^-\), as for an asymptotically empty and simple spacetime, again using the isometry in Definition 2.5. The points \(i^+\), \(i^-\) and \(i^0\) in the topological boundary of \({\tilde{M}}\) can also be labelled similarly.

As in [14], we assume \({\tilde{M}}\) extends slightly past conformal infinity so that it is indeed a manifold. This is required in order to satisfy the conditions of Theorem 23 in [19] which is used in the proof of Theorem 1.2.

We will require the following results regarding the completeness of \({\mathcal {I}}^\pm \).

Proposition 2.6

[9, Proposition 6.9.4] In a (\(D+1\))-dimensional asymptotically empty and simple spacetime (M, g), \({\mathcal {I}}^+\) and \({\mathcal {I}}^-\) are topologically \(\mathbbm {R}\times S^{D-1}\) and M is \(\mathbbm {R}^{D+1}\).

This tells us that in an asymptotically empty and simple spacetime, \({\mathcal {I}}^\pm \) are topologically the same as in Minkowski spacetime of the same dimension. The following corollary follows immediately from this proposition and from Definition 2.5.

Corollary 2.7

In a (\(D+1\))-dimensional weakly asymptotically empty and simple spacetime (M, g), \({\mathcal {I}}^+\) and \({\mathcal {I}}^-\) are topologically \(\mathbbm {R}\times S^{D-1}\).

3 Time of Flight Estimate in Higher Dimensions

In this section we will derive a higher-dimensional analogue of [15, Lemma III.2.1] which gives an estimate for the time of flight of causal curves near infinity with endpoints on \(\Lambda ^\pm \). In [15] it was found that the time of flight diverged logarithmically as we considered curves restricted to increasingly large values of r. We will show that in higher dimensions, the time of flight instead tends to 0. The absence of a divergence is the reason the 3+1-dimensional argument given in [15] could not be generalised to higher dimensions.

We will show that higher-dimensional spacetimes admitting quasi-Cartesian co-ordinates can be compactified using the same procedure (and same retarded and advanced time co-ordinates) as Minkowski spacetime. As a result, the time of flight along curves with endpoints on \({\mathcal {I}}^\pm \) can be calculated as the difference between the Minkowski retarded and advanced time co-ordinates, \(u=t-r\) and \(v=t+r\), evaluated at future and past null infinity respectively. This method avoids the need to consider Hamilton–Jacobi functions \(S^\pm \), as is done in [5] and [15].

We begin by recalling how Minkowski spacetime can be compactified [3, Sect. 3]. We define retarded and advanced time co-ordinates

$$\begin{aligned} \begin{aligned} u:=t-r,\quad v:=t+r \end{aligned} \end{aligned}$$
(8)

and then compactify the metric by defining

$$\begin{aligned} u=\tan P, \quad v=\tan Q \end{aligned}$$
(9)

Finally, we define new co-ordinates

$$\begin{aligned} T=Q+P\in (-\pi ,\pi ), \quad \chi =Q-P\in [0,\pi ) \end{aligned}$$
(10)

which we can think of as “time” and “radial” co-ordinates respectively in the compactified spacetime.

We then consider the conformally related metric, \({\tilde{g}}\), given by

$$\begin{aligned} {\tilde{g}}=\Omega ^2g=\left( 2\cos P\cos Q\right) ^2g \end{aligned}$$
(11)

with corresponding line element

$$\begin{aligned} \tilde{ds}^2=-dT^2+d\chi ^2+\sin ^2\chi d\omega _{D-1}^2 \end{aligned}$$
(12)
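
The line element (12) can be checked by a short computation. The sketch below assumes the Minkowski line element written as \(ds^2=-du\,dv+r^2d\omega _{D-1}^2\) with \(r=(v-u)/2\), which follows from (8); the intermediate steps are not given in the text and are supplied here only as a plausibility check.

$$\begin{aligned} \begin{aligned} du\,dv&=\frac{dP\,dQ}{\cos ^2P\cos ^2Q},\qquad r=\frac{\tan Q-\tan P}{2}=\frac{\sin (Q-P)}{2\cos P\cos Q},\\ \Omega ^2ds^2&=-4\,dP\,dQ+\sin ^2(Q-P)\,d\omega _{D-1}^2\\&=-dT^2+d\chi ^2+\sin ^2\chi \,d\omega _{D-1}^2 \end{aligned} \end{aligned}$$

The last step uses \(dT^2-d\chi ^2=4\,dP\,dQ\), which follows from (10).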

The conformal boundary at infinity can be split into two parts, as described in Sect. 2, as follows:

$$\begin{aligned} \begin{aligned} {\mathcal {I}}^+&:=\{u\in \mathbbm {R}, v=\infty \}\\ {\mathcal {I}}^-&:=\{u=-\infty , v\in \mathbbm {R}\} \end{aligned} \end{aligned}$$
(13)

The points \(i^+\), \(i^-\) and \(i^0\) are given by

$$\begin{aligned} \begin{aligned} i^+&:=\{u=\infty , v=\infty \}\\ i^-&:=\{u=-\infty , v=-\infty \}\\ i^0&:=\{u=-\infty , v=\infty \} \end{aligned} \end{aligned}$$
(14)

This is illustrated in Fig. 1.
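
In the compactified co-ordinates, the sets in (13) and (14) sit at finite co-ordinate values. The restatement below is a sketch obtained by substituting \(u=\pm \infty \leftrightarrow P=\pm \frac{\pi }{2}\) and \(v=\pm \infty \leftrightarrow Q=\pm \frac{\pi }{2}\) into (10); it is added for illustration, and the listed values lie on the boundary of the ranges quoted in (10).

$$\begin{aligned} \begin{aligned} {\mathcal {I}}^+&:\quad T+\chi =\pi ,\quad 0<\chi<\pi \\ {\mathcal {I}}^-&:\quad T-\chi =-\pi ,\quad 0<\chi <\pi \\ i^\pm&:\quad T=\pm \pi ,\quad \chi =0\\ i^0&:\quad T=0,\quad \chi =\pi \end{aligned} \end{aligned}$$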

If this procedure is carried out for Schwarzschild spacetime of mass m in 3+1 dimensions, the result is a spacetime where null geodesics which do not cross an event horizon have endpoints at \(i^\pm \) if \(m>0\) (they are infinitely delayed), or are located entirely at \(i^0\) if \(m<0\) (they are infinitely advanced). This violates condition 4 of Definition 2.4. Instead, the standard procedure for the compactification of Schwarzschild [3] involves re-scaling by a factor of \(\frac{1}{V(r)}\) and defining retarded and advanced time co-ordinates by

$$\begin{aligned} \begin{aligned} u_s&=t-r_*\\ v_s&=t+r_* \end{aligned} \end{aligned}$$
(15)

where, with \(V(r):=1-\frac{2m}{r^{D-2}}\) as in the line element (4),

$$\begin{aligned} \begin{aligned}&\frac{dr_*}{dr}=\frac{1}{V(r)}\\&\quad \implies r_*={\left\{ \begin{array}{ll}r+2m\log (r/2m-1)&{}\quad \text { if }\quad D=3\\ r+O(r^{3-D})&{}\quad \text { if }\quad D\ge 4 \end{array}\right. } \end{aligned} \end{aligned}$$
(16)
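
The two cases in (16) can be recovered by direct integration. The sketch below takes \(V(r)=1-\frac{2m}{r^{D-2}}\), read off from the line element (4), assumes \(m>0\) and \(r>2m\) in the \(D=3\) case so that the logarithm is real, and fixes the constants of integration for convenience; these choices are assumptions made here for illustration.

$$\begin{aligned} \begin{aligned} D=3:&\quad \frac{1}{V}=1+\frac{2m}{r-2m}\implies r_*=r+2m\log \left( \frac{r}{2m}-1\right) \\ D\ge 4:&\quad \frac{1}{V}=1+\frac{2m}{r^{D-2}}+O\left( r^{-2(D-2)}\right) \implies r_*=r-\frac{2m}{(D-3)r^{D-3}}+O\left( r^{-(2D-5)}\right) \end{aligned} \end{aligned}$$

So \(r_*-r\) tends to 0 as \(r\longrightarrow \infty \) when \(D\ge 4\), whereas for \(D=3\) it grows logarithmically.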

Fig. 1 Penrose diagram for Minkowski spacetime

In 3+1 dimensions, \(r_*\) diverges from r logarithmically. This is the reason we do not obtain a good compactification using Minkowski retarded and advanced time co-ordinates (in the sense that null geodesics escaping to the asymptotic region do not have endpoints on the surfaces \({\mathcal {I}}^\pm \) as defined in (13)). We see that in higher dimensions, \(u_s\) and \(v_s\) agree with the Minkowski u and v at \(r=\infty \). Consequently, we could alternatively have used these Minkowski retarded and advanced time co-ordinates to compactify Schwarzschild in higher dimensions, since the structure of the conformal boundary at infinity would be the same in both cases. The following lemma (similar to [6, Proposition B.1]) shows that this is a general feature of higher-dimensional spacetimes admitting quasi-Cartesian co-ordinates.

Lemma 3.1

Let (M, g) be a spacetime in \(D+1\) dimensions (\(D\ge 4\)) which admits quasi-Cartesian co-ordinates. Let \(\gamma \) be a future endless g-null geodesic segment with co-ordinates \(x^\mu (s)\), where \(s\ge 0\) is an affine parameter (increasing to the future). Suppose \(r\longrightarrow \infty \) as \(s\longrightarrow \infty \). Then \({\dot{x}}^\mu (s)\) tends to a finite limit, denoted \({\dot{x}}^\mu _\infty \), as \(s\longrightarrow \infty \) (where \(\dot{}\) denotes differentiation with respect to s). Moreover, if \(\gamma \) lies entirely in the region \(r>R\) then there exists some constant C such that

$$\begin{aligned} \left| x^\mu (s)-{\dot{x}}^\mu _\infty s-x^\mu (0)\right| \le \frac{C}{R^{\alpha -1}} \end{aligned}$$
(17)

for any \(s\ge 0\) and any index \(\mu \). In particular, Minkowski retarded time \(u:=t-r\) tends to a finite limit as \(s\longrightarrow \infty \).

Similarly, if \(\gamma \) is instead past endless and the affine parameter s is chosen so that \(s\le 0\) along \(\gamma \), then \({\dot{x}}^\mu (s)\) tends to a finite limit, denoted \({\dot{x}}^\mu _{-\infty }\), as \(s\longrightarrow {-\infty }\). Moreover, if \(\gamma \) lies entirely in the region \(r>R\) then there exists some constant \(C'\) such that

$$\begin{aligned} \left| x^\mu (s)-{\dot{x}}^\mu _{-\infty } s-x^\mu (0)\right| \le \frac{C'}{R^{\alpha -1}} \end{aligned}$$
(18)

holds for any \(s\le 0\) and any index \(\mu \). In particular, Minkowski advanced time \(v:=t+r\) tends to a finite limit as \(s\longrightarrow -\infty \).

Proof

Let R be such that \(r\ge R\) for all \(s\ge 0\). Since \(r\longrightarrow \infty \) as \(s\longrightarrow \infty \), by shifting the origin of s we are free to make R arbitrarily large and enforce \(\frac{dr^2}{ds}|_{s=0}\ge 0\). Next, re-scale s so that \(\frac{dx^i}{ds}\frac{dx^i}{ds}\vert _{s=0}=1\). Let \(s_1>0\) be maximal such that \(\frac{3}{4}<\frac{dx^i}{ds}\frac{dx^i}{ds}<\frac{5}{4}\) for all \(0\le s<s_1\). From the geodesic equations, we have

$$\begin{aligned} \frac{d^2r^2}{ds^2}=2\left( \frac{dx^i}{ds}\frac{dx^i}{ds} -x^i\Gamma ^i_{\mu \nu }\frac{dx^\mu }{ds}\frac{dx^\nu }{ds}\right) . \end{aligned}$$
(19)

Since \(\gamma \) is null, for R sufficiently large and \(0\le s<s_1\) we have \(|dt/ds|<2\) and hence \(|dx^\mu /ds|\) is bounded for \(\mu =0,1,\ldots ,D\). Since (M, g) admits quasi-Cartesian co-ordinates, we also have

$$\begin{aligned} \left| \Gamma ^\mu _{\nu \rho }\right| \le C_1r^{-\alpha -1} \end{aligned}$$
(20)

for some constant \(C_1\).
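
The bound (20) follows from the fall-off conditions (3). As a sketch, and assuming the usual Levi-Civita expression for the Christoffel symbols (the text does not write it out), the constancy of \(\eta _{\mu \nu }\) in quasi-Cartesian co-ordinates gives

$$\begin{aligned} \Gamma ^\mu _{\nu \rho }=\frac{1}{2}g^{\mu \sigma }\left( \partial _\nu h_{\sigma \rho }+\partial _\rho h_{\sigma \nu }-\partial _\sigma h_{\nu \rho }\right) \end{aligned}$$

Each derivative term is \(O(r^{-(1+\alpha )})\) by (3), and the inverse metric components \(g^{\mu \sigma }=\eta ^{\mu \sigma }+O(r^{-\alpha })\) are bounded for r large, which yields (20).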

Substituting this into equation (19), we have

$$\begin{aligned} \frac{d^2r^2}{ds^2}\ge 2\times \left( \frac{3}{4}-C_2r^{-\alpha }\right) \end{aligned}$$
(21)

for some constant \(C_2\). Hence for \(0< s<s_1\) and \(r\ge R\) (increasing R if necessary), we have

$$\begin{aligned} \begin{aligned} \frac{d^2r^2}{ds^2}&>1\\ \implies r^2(s)&\ge r^2(0)+s\left. \frac{dr^2}{ds}\right| _{s=0}+\frac{s^2}{2}\\&\ge R^2+\frac{s^2}{2}\\&> C_3(R+s)^2 \end{aligned} \end{aligned}$$
(22)

for some constant \(C_3>0\). To derive the final inequality above, we note that if \(C_3\) is chosen to be sufficiently small, then this inequality holds for \(s=0\) and the equation

$$\begin{aligned} \left( \frac{1}{2}-C_3\right) s^2-2C_3Rs+(1-C_3)R^2=0, \end{aligned}$$
(23)

viewed as a quadratic in s, has no real solutions.
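
One explicit admissible range for \(C_3\) can be found from the discriminant \(\Delta \) of (23). The computation below is a sketch added for illustration; the threshold \(C_3<\frac{1}{3}\) is derived here and is not stated in the text.

$$\begin{aligned} \begin{aligned} \Delta&=(2C_3R)^2-4\left( \frac{1}{2}-C_3\right) (1-C_3)R^2\\&=4R^2\left[ C_3^2-\left( \frac{1}{2}-\frac{3}{2}C_3+C_3^2\right) \right] \\&=2R^2\left( 3C_3-1\right) \end{aligned} \end{aligned}$$

Hence \(\Delta <0\) whenever \(0<C_3<\frac{1}{3}\). For such \(C_3\) the quadratic has no real roots and takes the positive value \((1-C_3)R^2\) at \(s=0\), so it is positive for all s, which is the final inequality in (22).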

From this it follows that, for \(0\le s<s_1\):

$$\begin{aligned} \begin{aligned} \left| \int ^{s}_0\frac{d^2x^\mu }{ds'^2}ds'\right|&\le \int ^{s}_0\left| \frac{d^2x^\mu }{ds'^2}\right| ds'\\ \implies \left| \frac{dx^\mu }{ds'}(s)-\frac{dx^\mu }{ds'}(0)\right|&\le \int _0^{s}\left| \Gamma ^\mu _{\nu \rho }\frac{dx^\nu }{ds'} \frac{dx^\rho }{ds'}\right| ds'\\&\le C_4\int ^{s}_0r^{-\alpha -1}ds'\\&< C_5\int ^{s}_0(R+s')^{-\alpha -1}ds'\\&\le C_6R^{-\alpha }\\ \implies \left| \sum _{i=1}^D\frac{dx^i}{ds} \frac{dx^i}{ds}(s)-1\right|&\le C_7R^{-\alpha } \end{aligned} \end{aligned}$$
(24)

where \(C_4, C_5, C_6, C_7>0\) are constants. To obtain the final inequality above, we have used the fact that if \({\textbf {x}},{\textbf {y}}\in (\mathbbm {R}^D,\delta )\), where \(\delta \) denotes the Euclidean metric, with \(|{\textbf {x}}|\le K\) and \(|{\textbf {y}}|=1\), then we have

$$\begin{aligned} \begin{aligned} \left| {\textbf {x}}^2-{\textbf {y}}^2\right|&=\left| ({\textbf {x}}+{\textbf {y}})\cdot ({\textbf {x}}-{\textbf {y}})\right| \\&\le \left| {\textbf {x}}+{\textbf {y}}\right| \left| {\textbf {x}}-{\textbf {y}}\right| \\&\le (1+K)\left| {\textbf {x}}-{\textbf {y}}\right| . \end{aligned} \end{aligned}$$
(25)

We conclude that by choosing R sufficiently large, we can take \(s_1=\infty \).

If, rather than integrating from 0 to s in (24), we instead integrate between \(s_2\) and \(s_3>s_2\), we find that

$$\begin{aligned} \begin{aligned} \left| \frac{dx^\mu }{ds}(s_3)-\frac{dx^\mu }{ds}(s_2)\right|&\le C_6(R+s_2)^{-\alpha } \text { for all }s_3>s_2 \end{aligned} \end{aligned}$$
(26)

The right hand side of this inequality tends to 0 as \(s_2\longrightarrow \infty \), so we conclude that \({\dot{x}}^\mu (s)\) tends to a finite limit as \(s\longrightarrow \infty \). We denote this limit by \({\dot{x}}^\mu _\infty \).

Then for any \(s\ge 0\) and any index \(\mu \), we have:

$$\begin{aligned} \begin{aligned} \left| \int _0^{s}\left( \frac{dx^\mu }{ds'}-{\dot{x}}^\mu _\infty \right) ds'\right|&\le \int _0^{s}\left| \frac{dx^\mu }{ds'}-{\dot{x}}^\mu _\infty \right| ds'\\ \implies \left| x^\mu (s)-{\dot{x}}^\mu _\infty s-x^\mu (0)\right|&\le \int _0^{s} \frac{C_6}{(R+s')^\alpha }ds'\\&\le \frac{C}{R^{\alpha -1}} \end{aligned} \end{aligned}$$
(27)

where \(C=\frac{C_6}{\alpha -1}>0\) is a constant.
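
The value of C comes from evaluating the last integral in (27) explicitly; the step below is a sketch that uses only \(\alpha >1\) from (3).

$$\begin{aligned} \int _0^{s}\frac{C_6}{(R+s')^\alpha }ds'=\frac{C_6}{\alpha -1}\left( \frac{1}{R^{\alpha -1}}-\frac{1}{(R+s)^{\alpha -1}}\right) \le \frac{C_6}{(\alpha -1)R^{\alpha -1}} \end{aligned}$$

The bound is uniform in s precisely because \(\alpha >1\); for \(\alpha =1\) the same integral would instead equal \(C_6\log \left( (R+s)/R\right) \), which is unbounded in s.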

Asymptotic flatness implies that \(\eta _{\mu \nu }{\dot{x}}_\infty ^\mu {\dot{x}}_\infty ^\nu =0\) and hence that the curve \(x_{Mink}^\mu (s):=x^\mu (0)+{\dot{x}}^\mu _\infty s\) defines a Minkowski null geodesic along which \(u:=t-r\) tends to a finite value as \(s\longrightarrow \infty \). From (27) we have

$$\begin{aligned} \left| x^\mu (s)-x^\mu _{Mink}(s)\right| \le \frac{C}{R^{\alpha -1}} \end{aligned}$$
(28)

so we conclude that u must also tend to a finite limit along \(\gamma \) as \(s\longrightarrow \infty \).

If \(\gamma \) is instead a past endless null geodesic segment, then similar arguments can be used to show that \({\dot{x}}^\mu (s)\) tends to a finite limit as \(s\longrightarrow -\infty \) and to derive (18). One can then deduce that \(v:=t+r\) tends to a finite limit as \(s\longrightarrow -\infty \).

\(\square \)

Lemma 3.1 tells us that if we compactify such a spacetime using the same procedure as used for Minkowski (including the same retarded and advanced time co-ordinates), then null geodesics escaping to the asymptotic region in the infinite future (respectively past) will have endpoints on the null surface \({\mathcal {I}}^+\) (respectively, \({\mathcal {I}}^-\)) as defined in (13). This can be summarised in the following corollary.

Corollary 3.2

Let (M, g) be a spacetime in \(D+1\) dimensions (\(D\ge 4\)) admitting quasi-Cartesian co-ordinates. Then (M, g) is weakly asymptotically empty and simple and, furthermore, the compactification map \(\theta \) in Definition 2.4 can be taken to be the same as for Minkowski spacetime.

The above results allow us to define the time of flight for spacetimes admitting quasi-Cartesian co-ordinates as follows:

Definition 3.3

Let (M, g) be a spacetime in \(D+1\) dimensions (\(D\ge 4\)) admitting quasi-Cartesian co-ordinates. We define the (possibly negative) time of flight along an endless curve as \(u_\infty -v_\infty \), where \(u_\infty \) denotes the value of \(u=t-r\) at the future endpoint of the curve and \(v_\infty \) denotes the value of \(v=t+r\) at its past endpoint.

Lemma 3.1 tells us that the time of flight is finite along null geodesics which escape to the asymptotic region in both the future and the past (so in particular do not enter black or white holes). In [15], the time of flight is defined to be \(S^+-S^-\), where the Hamilton–Jacobi functions \(S^+\) and \(S^-\) are finite on future and past null infinity respectively. According to [5], the proof of existence of the optical functions \(S^\pm \) is the missing step of the argument in [15]. The above definition means that it is not necessary to define such functions here.

We can now prove the following time of flight estimate for spacetimes admitting quasi-Cartesian co-ordinates:

Lemma 3.4

(Time of Flight Estimate) Let (M, g) be a spacetime in \(D+1\) dimensions (\(D\ge 4\)) which admits quasi-Cartesian co-ordinates and let \(\gamma \) be a causal curve which connects \(\Lambda ^-\) to \(\Lambda ^+\) (see Sect. 1 for definitions) and consists of two null geodesic segments. Suppose \(\gamma \) lies entirely in the region \(r>R\). Then, for R sufficiently large, the time of flight along \(\gamma \) satisfies

$$\begin{aligned} |u_\infty -v_\infty |\le \frac{A}{R^{\alpha -1}} \end{aligned}$$
(29)

for some constant A.

Proof

Let \(p_*\) denote the point at which the two null geodesic segments are joined. Let \(\gamma _+\) denote the null geodesic from \(p_*\) to \(\Lambda ^+\) and let \(\gamma _-\) denote the null geodesic from \(\Lambda ^-\) to \(p_*\). Let \(\gamma _{\eta }\) denote the curve through \(p_*\) with co-ordinates

$$\begin{aligned} x_{\eta }^\mu (s)=x_{p_*}^\mu + sk^\mu \end{aligned}$$
(30)

where \(k^\mu =(1,0,\ldots ,0,1)\) is the Minkowski-null direction used to define \(\Lambda ^\pm \) (Fig. 2).

Fig. 2

Lemma 3.4 shows that if \(\gamma _-\) is a g-null geodesic from \(\Lambda ^-\) to \(p_*\) and \(\gamma _+\) is a g-null geodesic from \(p_*\) to \(\Lambda ^+\), with both restricted to the region \(r>R\), then the time of flight along \(\gamma _-\cup \gamma _+\) approaches 0 as \(R\longrightarrow \infty \). The lemma is proved by comparing this time of flight to the time of flight along a Minkowski null geodesic, \(\gamma _\eta \), through \(p_*\), along which the time of flight is 0.

Choose the affine parameter s so that \(\frac{dx^i}{ds}\frac{dx^i}{ds}|_{s=0}=1\) (as in the proof of Lemma 3.1). We will use inequality (28), which tells us that for R sufficiently large, the time of flight along \(\gamma \) is close to the time of flight along \(\gamma _{\eta }\), which we now calculate.

Let \(u_{\gamma _\eta ,\infty }\) denote the u co-ordinate at the future endpoint of \(\gamma _\eta \); b denote the impact parameter of \(\gamma _\eta \) (the smallest value of r attained along \(\gamma _\eta \)); and \(t_b\) denote the value of the t co-ordinate on \(\gamma _\eta \) when \(r=b\). Suppose \(s=s_b\) when \(t=t_b\) and \(r=b\). Along \(\gamma _\eta \) we have \({\dot{t}}=1\) and hence

$$\begin{aligned} \begin{aligned} t(s)&=t_b+s-s_b. \end{aligned} \end{aligned}$$
(31)

Furthermore, \(x^D=0\) at \(r=b\), so we have

$$\begin{aligned} \begin{aligned} r(s)&=\sqrt{b^2+(s-s_b)^2}\\ \implies u(s)&:=t(s)-r(s)\\&=t_b+s-s_b-\sqrt{b^2+(s-s_b)^2}\\ \implies u_{\gamma _\eta ,\infty }&:=\lim _{s\longrightarrow \infty }u(s)\\&=t_b \end{aligned} \end{aligned}$$
(32)

Similarly, the advanced time co-ordinate at the past endpoint of \(\gamma _\eta \) is \(v_{\gamma _\eta ,\infty }=t_b\). Hence the time of flight along \(\gamma _\eta \) is

$$\begin{aligned} \begin{aligned} u_{\gamma _\eta ,\infty }-v_{\gamma _\eta ,\infty }&=0 \end{aligned} \end{aligned}$$
(33)
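The two limits used above can be checked directly by rationalising the square root. The following is a short worked computation, writing \(\sigma :=s-s_b\) (a shorthand introduced only for this check) and using \(u=t-r\), \(v=t+r\) together with (31):

$$\begin{aligned} \begin{aligned} u(s)-t_b&=\sigma -\sqrt{b^2+\sigma ^2}=\frac{-b^2}{\sigma +\sqrt{b^2+\sigma ^2}}\longrightarrow 0\text { as }\sigma \longrightarrow +\infty ,\\ v(s)-t_b&=\sigma +\sqrt{b^2+\sigma ^2}=\frac{b^2}{\sqrt{b^2+\sigma ^2}-\sigma }\longrightarrow 0\text { as }\sigma \longrightarrow -\infty . \end{aligned} \end{aligned}$$

In both cases the denominator grows without bound while the numerator is fixed, so the retarded time at the future endpoint and the advanced time at the past endpoint both equal \(t_b\), independently of the impact parameter b.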

Consequently, it follows from Lemma 3.1 that the time of flight along \(\gamma \) satisfies

$$\begin{aligned} \begin{aligned} |u_\infty -v_\infty |&\le |u_\infty -u_{\gamma _\eta ,\infty }|+|u_{\gamma _\eta ,\infty }-v_{\gamma _\eta ,\infty }| +|v_{\gamma _\eta ,\infty }-v_\infty |\\&\le \frac{A}{R^{\alpha -1}}\\ \end{aligned} \end{aligned}$$
(34)

for some constant \(A>0\). \(\square \)
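The last step of (34) can be unpacked term by term. The sketch below assumes, as the proof indicates, that inequality (28) bounds each of the two outer terms by a constant multiple of \(R^{-(\alpha -1)}\); the constants \(A_+\) and \(A_-\) are labels introduced here for illustration and are not part of the original argument:

$$\begin{aligned} \begin{aligned} |u_\infty -u_{\gamma _\eta ,\infty }|&\le \frac{A_+}{R^{\alpha -1}},\qquad |u_{\gamma _\eta ,\infty }-v_{\gamma _\eta ,\infty }|=0,\qquad |v_{\gamma _\eta ,\infty }-v_\infty |\le \frac{A_-}{R^{\alpha -1}}\\ \implies |u_\infty -v_\infty |&\le \frac{A_++A_-}{R^{\alpha -1}} \end{aligned} \end{aligned}$$

Under this assumption one may take \(A=A_++A_-\); the middle term vanishes by (33).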

Note that the above argument relied on the fact that defining the time of flight along \(\gamma \) using Minkowski retarded and advanced time co-ordinates gives a finite result (Lemma 3.1). For Schwarzschild in \(3+1\) dimensions, this time of flight is \(\pm \infty \) for \(m\gtrless 0\). As a result, although the metric is asymptotically flat, we cannot conclude that the time of flight approaches 0 if \(\gamma \) is restricted to arbitrarily large r.

4 Constructing a Fastest Causal Curve

As discussed in Sect. 1, we begin by constructing a causal curve from \(\Lambda ^-\) to \(\Lambda ^+\) which has negative time of flight. To do this we use a comparison argument based on Minkowski spacetime, similar to the ones used in [3]. The comparison is carried out in the compactified spacetimes, with curves identified via the compactified co-ordinates.

Lemma 4.1

Let \((M,g)\) be a uniformly Schwarzschildean spacetime in \(D+1\) dimensions (\(D\ge 4\)) with ADM mass \(m_{ADM}<0\). Then there exists an endless timelike curve from \(\Lambda ^-\) to \(\Lambda ^+\) which has negative time of flight.

Note that a uniformly Schwarzschildean spacetime in higher dimensions necessarily admits quasi-Cartesian co-ordinates. This means that the results of the previous section apply. In particular, we define the time of flight along curves with endpoints on \({\mathcal {I}}^-\) and \({\mathcal {I}}^+\) using Definition 3.3, with \(u=t-r\) and \(v=t+r\).

Proof

The metric is uniformly Schwarzschildean, so the line element can be written in some neighbourhood of conformal infinity as

$$\begin{aligned} \begin{aligned} ds^2=ds^2_{Mink}+\frac{2m}{r^{D-2}}\left( dt^2+dr^2\right) +o\left( r^{-(D-2)}\right) \end{aligned} \end{aligned}$$
(35)

where by \(o\left( r^{-(D-2)}\right) \) we mean that this term is equal to \(g'_{\mu \nu }dx^\mu dx^\nu \) for some \(g'_{\mu \nu }=o\left( r^{-(D-2)}\right) \).

Since these co-ordinates are defined on some region diffeomorphic to \(\mathbbm {R}\times \left( \mathbbm {R}^D\setminus B\right) \), we are able to identify curves in this region with curves in Minkowski spacetime (we simply identify the quasi-Cartesian co-ordinates with some Cartesian co-ordinate system in Minkowski). Furthermore, since the spacetime \((M,g)\) can be compactified to \(({\tilde{M}},{\tilde{g}})\) using the same compactified co-ordinates as Minkowski, we can also identify curves in a neighbourhood of \({\mathcal {I}}\) in \(({\tilde{M}},{\tilde{g}})\) with curves in compactified Minkowski (we simply identify the compactified co-ordinates).

It is clear from (35) that for sufficiently large r, the term proportional to m will dominate the other non-Minkowski terms. This means that if \(m_{ADM}<0\) (or equivalently \(m<0\)), then there exists some \(R_0\) such that

$$\begin{aligned} \begin{aligned} ds^2<ds^2_{Mink} \end{aligned} \end{aligned}$$
(36)

along any Minkowski causal curve at \(r>R_0\).
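Inequality (36) can be made explicit as follows. This is a sketch under the assumptions that \(g'_{\mu \nu }\) denotes components in the quasi-Cartesian co-ordinates and that the \(o\left( r^{-(D-2)}\right) \) decay is uniform; the function \(\varepsilon (r)\) is introduced here only for illustration. Along a Minkowski causal curve with tangent \({\dot{x}}^\mu \) we have \(\sum _i({\dot{x}}^i)^2\le {\dot{t}}^2\), so \(|{\dot{x}}^\mu |\le |{\dot{t}}|\) for every \(\mu \) and \({\dot{t}}\ne 0\). Hence

$$\begin{aligned} \begin{aligned} \left| g'_{\mu \nu }{\dot{x}}^\mu {\dot{x}}^\nu \right|&\le (D+1)^2\max _{\mu ,\nu }|g'_{\mu \nu }|\,{\dot{t}}^2=:\frac{\varepsilon (r)}{r^{D-2}}{\dot{t}}^2,\qquad \varepsilon (r)\longrightarrow 0\text { as }r\longrightarrow \infty ,\\ \frac{2m}{r^{D-2}}\left( {\dot{t}}^2+{\dot{r}}^2\right)&\le \frac{2m}{r^{D-2}}{\dot{t}}^2\quad \text {for }m<0,\\ \implies (g_{\mu \nu }-\eta _{\mu \nu }){\dot{x}}^\mu {\dot{x}}^\nu&\le \frac{2m+\varepsilon (r)}{r^{D-2}}{\dot{t}}^2<0\quad \text {whenever }\varepsilon (r)<2|m|. \end{aligned} \end{aligned}$$

Choosing \(R_0\) so that \(\varepsilon (r)<2|m|\) for all \(r>R_0\) gives the strict inequality, so a Minkowski null curve in this region is g-timelike.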

Consider a Minkowski null geodesic with past endpoint at \(v=0\) on \(\Lambda ^-\), future endpoint at \(u=0\) on \(\Lambda ^+\) and impact parameter \(b>R_0\). This curve is restricted to the region where \(r> R_0\) and hence inequality (36) guarantees that, after identifying compactified co-ordinates, it defines a g-timelike curve in \((\tilde{M},\tilde{g})\).

Now since \(I^\pm (p)\) are open sets for any point \(p\in {\tilde{M}}\), we must be able to modify this curve slightly so that it remains g-timelike but has past endpoint at \(v>0\) on \(\Lambda ^-\) and future endpoint at \(u<0\) on \(\Lambda ^+\). This ensures that the curve has negative time of flight (see Fig. 3). \(\square \)

Fig. 3

The Minkowski null geodesic between \(\Lambda ^-\) and \(\Lambda ^+\) with endpoints at \(v=0\) and \(u=0\) respectively and impact parameter b is shown as a dotted line. If the metric g is uniformly Schwarzschildean with negative mass and b is sufficiently large, then this path can be modified so that it still connects \(\Lambda ^-\) to \(\Lambda ^+\) but is g-timelike and has strictly negative time of flight. This modification is shown as an unbroken line.

This curve will be the starting point for our sequence of faster and faster causal curves used in the proof of Theorem 1.2 to construct a fastest causal curve from \(\Lambda ^-\) to \(\Lambda ^+\). We will also use the time of flight estimate, Lemma 3.4, in this construction. However in order to do so it is necessary to prove the following lemma.

Lemma 4.2

Let \((M,g)\) be a \(D+1\)-dimensional uniformly Schwarzschildean spacetime (\(D\ge 4\)) and let \(R>0\). Let \(p_*\) be a point with \(t=0\) and let \(q\in {\mathcal {I}}^+\). Then there exists a constant \(R'\) such that if \(p_*\) has r co-ordinate \(r_{p_*}>R'\), then any g-null geodesic from \(p_*\) to q is confined to the region \(r>R\).

Fig. 4

The causal diamond \(J^+(p_*)\cap J^-(q)\) (taken with respect to the Minkowski metric) is contained in the shaded region bounded by curves of constant u or v. These curves are straight lines at \(45^\circ \) in the Penrose diagram above. Given some \(R>0\), if the point \(p_*\) lies sufficiently close to \(i^0\) (i.e. it has sufficiently large r co-ordinate), then causal curves from \(p_*\) to q remain close to \({\mathcal {I}}^+\) on this diagram and hence cannot enter the region \(r\le R\).

The Minkowski metric in \(D+1\) dimensions is

$$\begin{aligned} ds^2=-dt^2+dr^2+r^2d\omega _{D-1}^2 \end{aligned}$$
(37)

It follows that along Minkowski causal curves, we have

$$\begin{aligned} \begin{aligned}&{\dot{t}}^2-{\dot{r}}^2\ge 0\\&\quad \implies {\dot{v}}=0\text { or }\frac{du}{dv} =\frac{{\dot{t}}-{\dot{r}}}{{\dot{t}}+{\dot{r}}}\ge 0 \end{aligned} \end{aligned}$$
(38)

where \(\dot{}\) denotes differentiation with respect to an affine parameter s. Using this, Fig. 4 illustrates why Lemma 4.2 holds in Minkowski spacetime.
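For completeness, the step from (37) to (38) can be written out as follows. Here \(|{\dot{\omega }}|^2\) is shorthand, introduced only for this check, for the round metric \(d\omega _{D-1}^2\) evaluated on the angular velocity of the curve:

$$\begin{aligned} \begin{aligned} 0\ge -{\dot{t}}^2+{\dot{r}}^2+r^2|{\dot{\omega }}|^2&\implies {\dot{t}}^2-{\dot{r}}^2\ge r^2|{\dot{\omega }}|^2\ge 0\\&\implies {\dot{u}}{\dot{v}}=({\dot{t}}-{\dot{r}})({\dot{t}}+{\dot{r}})\ge 0\\&\implies \frac{du}{dv}=\frac{{\dot{u}}{\dot{v}}}{{\dot{v}}^2}\ge 0\quad \text {whenever }{\dot{v}}\ne 0. \end{aligned} \end{aligned}$$

In other words, u and v cannot move in opposite directions along a Minkowski causal curve, which is what confines such curves to the shaded region of Fig. 4.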

The need to prove this lemma is the downside of compactifying using \(u=t-r\) and \(v=t+r\). It is possible that g-causal curves from \(p_*\) have \(\frac{du}{dv}<0\) and hence escape the shaded region shown in Fig. 4. The proof below relies on results obtained in the proof of Lemma 3.1. These are used to show that g-null geodesics and \(\eta \)-null geodesics emanating from the same point remain close in a neighbourhood of \(i^0\) (see Fig. 5).

Proof

Let \(\gamma \) be a g-null geodesic from \(p_*\) to q and let s be an affine parameter along \(\gamma \) (increasing to the future) such that \(\frac{dx^i}{ds}\frac{dx^i}{ds}|_{s=0}=1\) and \(\frac{dr^2}{ds}|_{s=0}\ge 0\). Let \(s_0\) denote the value of s at \(p_*\).

From Fig. 4 we see that, for fixed R, in order for a curve from \(p_*\) to q (with r sufficiently large at \(p_*\)) to enter the region \(r<R\), it must have \(\frac{du}{dv}<0\) at some point.

We first consider the case where \({\dot{u}}<0<{\dot{v}}\) along some portion of \(\gamma \). It is possible, if \(\gamma \) reaches small values of r, that we may have \({\dot{t}}\le 0\) at some point. However since the metric is asymptotically flat, there exists \(\delta >0\) such that \({\dot{t}}>\delta >0\) for sufficiently large r. We therefore have \({\dot{v}}={\dot{t}}+{\dot{r}}>2{\dot{t}}>2\delta \).

It follows that for \(r>R_0\) (with \(R_0\) sufficiently large) we have

$$\begin{aligned} \begin{aligned} 0<-({\dot{t}}-{\dot{r}})({\dot{t}}+{\dot{r}})&\le \eta _{\mu \nu }{\dot{x}}^\mu {\dot{x}}^\nu \\&\le (\eta _{\mu \nu }-g_{\mu \nu }){\dot{x}}^\mu {\dot{x}}^\nu \\&\le Br^{-(D-2)}\\ \implies -B'r^{-(D-2)}&\le \frac{du}{dv}=\frac{{\dot{t}}-{\dot{r}}}{{\dot{t}}+{\dot{r}}}<0 \end{aligned} \end{aligned}$$
(39)

for some constants \(B,B'>0\).
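The passage from the first chain of inequalities in (39) to the bound on \(\frac{du}{dv}\) amounts to dividing by \({\dot{v}}^2\). A minimal sketch, assuming the constant B already accounts for the size of the tangent components and using \({\dot{v}}>2\delta \) from above:

$$\begin{aligned} \begin{aligned} 0<-\frac{du}{dv}=\frac{-{\dot{u}}{\dot{v}}}{{\dot{v}}^2}=\frac{-({\dot{t}}-{\dot{r}})({\dot{t}}+{\dot{r}})}{{\dot{v}}^2}\le \frac{B}{4\delta ^2}r^{-(D-2)} \end{aligned} \end{aligned}$$

so that, under these assumptions, one admissible choice is \(B'=B/(4\delta ^2)\).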

Choose c such that the u co-ordinate of q satisfies \(u_q<c\). Consider a segment of \(\gamma \) in the region \(\{u<c\}\cap \{r>R_0\}\cap \{t\ge 0\}\). We have

$$\begin{aligned} \begin{aligned} 0\le t&<r+c\\ \implies 0\le v&< 2r+c \end{aligned} \end{aligned}$$
(40)

From this we have

$$\begin{aligned} \begin{aligned} \left| \frac{1+v^2}{1+u^2}\frac{du}{dv}\right|&\le \left( 1+(2r+c)^2\right) \frac{B'}{r^{D-2}}\\&\longrightarrow 0 \text { as }r\longrightarrow \infty \end{aligned} \end{aligned}$$
(41)

Let \(\epsilon >0\). From the above we see that for \(R_0\) sufficiently large we have

$$\begin{aligned} 1-\epsilon \le \frac{dT}{d\chi }=\frac{1+\frac{1+v^2}{1+u^2}\frac{du}{dv}}{1-\frac{1+v^2}{1+u^2}\frac{du}{dv}}\le 1 \end{aligned}$$
(42)

for \(r>R_0\), where T and \(\chi \) are defined in (10).
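The dependence of (42) on the size of the quantity bounded in (41) can be made explicit. Write \(X:=\frac{1+v^2}{1+u^2}\frac{du}{dv}\) and suppose that \(-x_0\le X<0\) for some \(0<x_0<1\); the symbols X, \(x_0\) and f are introduced here only for this check. The map \(f(X)=\frac{1+X}{1-X}\) has \(f'(X)=\frac{2}{(1-X)^2}>0\), so it is increasing for \(X<1\) and

$$\begin{aligned} \begin{aligned} \frac{1-x_0}{1+x_0}=f(-x_0)\le f(X)<f(0)=1,\qquad \frac{1-x_0}{1+x_0}\ge 1-\epsilon \iff x_0\le \frac{\epsilon }{2-\epsilon }\quad (0<\epsilon <2). \end{aligned} \end{aligned}$$

Thus (42) holds as soon as \(|X|\le \epsilon /(2-\epsilon )\) throughout the region under consideration.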

Now consider the case where \({\dot{v}}<0<{\dot{u}}\) along a portion of \(\gamma \). We have \({\dot{u}}={\dot{t}}-{\dot{r}}>2{\dot{t}}>2\delta \) for r sufficiently large. Analogously to (39), it follows that for \(r>R_0\) (with \(R_0\) sufficiently large) we have

$$\begin{aligned} \begin{aligned} 0&>\frac{dv}{du}=\frac{{\dot{t}}+{\dot{r}}}{{\dot{t}}-{\dot{r}}}>-B''r^{-(D-2)} \end{aligned} \end{aligned}$$
(43)

for some constant \(B''>0\).

We also have

$$\begin{aligned} \begin{aligned} \left| \frac{1+u^2}{1+v^2}\frac{dv}{du}\right|&\le \frac{B''}{r^{D-2}}\\&\longrightarrow 0 \text { as }r\longrightarrow \infty \end{aligned} \end{aligned}$$
(44)

Hence for any \(\epsilon >0\), if \(R_0\) is sufficiently large we have

$$\begin{aligned} -1\le \frac{dT}{d\chi }=\frac{\frac{1+u^2}{1+v^2}\frac{dv}{du}+1}{\frac{1+u^2}{1+v^2}\frac{dv}{du}-1}\le -1+\epsilon \end{aligned}$$
(45)

in the region \(\{u<c\}\cap \{r>R_0\}\cap \{t\ge 0\}\).

Putting these results together, we find that for any c and any \(\epsilon >0\), there exists some \(R_0>0\) such that causal curves in the region \(\{u<c\}\cap \{r>R_0\}\cap \{t\ge 0\}\) have

$$\begin{aligned} \left| \frac{dT}{d\chi }\right| \ge 1-\epsilon . \end{aligned}$$
(46)

From this we conclude that g-causal curves in this region are confined to a wedge in the Penrose diagram bound by straight lines with gradients which, by choosing \(R_0\) sufficiently large, can be chosen arbitrarily close to \(\pm 1\). It follows that for any \(R>0\), there exists some \(R'\) such that if \(p_*\) has r co-ordinate \(r_{p_*}>R'\) then any g-causal curve from \(p_*\) to q does not enter the region \(r\le R\). This is illustrated in Fig. 5. \(\square \)

Fig. 5

The shaded region shows \(J^+(p_*)\cap J^-(q)\) calculated with respect to the Minkowski metric. This region is bounded by curves with \(\left| \frac{dT}{d\chi }\right| =1\). For any c and \(\epsilon >0\), g-causal curves in \(\{u<c\}\cap \{r>R_0\}\cap \{t\ge 0\}\), with \(R_0\) sufficiently large, have \(\left| \frac{dT}{d\chi }\right| \ge 1-\epsilon \).

Theorem 1.2

Let \((M,g)\) be a uniformly Schwarzschildean spacetime in \(D+1\) dimensions (\(D\ge 4\)) with ADM mass \(m_{ADM}<0\) and suppose \({\mathcal {D}}\cup {\mathcal {I}}\) is a globally hyperbolic subset of \({\tilde{M}}\). Then \((M,g)\) contains a null line.

Proof

Using Lemma 4.1, we construct a causal curve from \(\Lambda ^-\) to \(\Lambda ^+\) which has negative time of flight. Denote this curve by \(\gamma _0\) and label its past and future endpoints \(p_0\) and \(q_0\) respectively.

Following [15], we introduce a partial ordering \(\le \) on \(\left( \left( J^+(p_0)\cap \Lambda ^-\right) \cup i^0\right) \times \left( \left( J^-(q_0)\cap \Lambda ^+\right) \cup i^0\right) \). We say that \((p',q')\le (p'',q'')\) if \(p'\in J^+(p'')\) and \(q'\in J^-(q'')\). Define the set F to contain all pairs of points in \(\left( J^+(p_0)\cap \Lambda ^-\right) \times \left( J^-(q_0)\cap \Lambda ^+\right) \) which can be connected by a causal curve through \({\mathcal {D}}\) and choose \((p,q)\) to be any element of the closure of this set which is minimal with respect to the partial ordering \(\le \) (so a priori one or both of these points could lie at \(i^0\)).

We can then define sequences of points \(\{p_i\}_{i=0}^\infty \) along \(\Lambda ^-\) and \(\{q_i\}_{i=0}^\infty \) along \(\Lambda ^+\) such that

  • \((p_i,q_i)\le (p_j,q_j)\) for \(i>j\)

  • \(p_i\) and \(q_i\) are connected by a causal curve \(\gamma _i\)

  • \(p_i\longrightarrow p\) and \(q_i\longrightarrow q\) as \(i\longrightarrow \infty \)

The first condition here ensures that the time of flight along \(\gamma _i\) is less than or equal to the time of flight along \(\gamma _j\) for \(j<i\).

We first check that the sequence of curves \(\{\gamma _i\}_{i=0}^\infty \) does not escape to infinity. By this we mean that there exists some \(R>0\) such that, possibly restricting to a subsequence, every curve \(\gamma _i\) enters the region \(\{r<R\}\).

The curve \(\gamma _i\) can be replaced by a (possibly faster) curve \(\gamma _i'\) from \(\Lambda ^-\) to \(\Lambda ^+\) which is the union of two null geodesics joined at \(p_{*,i}\), the point at which \(\gamma _i\) intersects the surface \(t=0\).

Let \(R_i\) denote the value of the r co-ordinate at \(p_{*,i}\). Suppose that \(R_i\longrightarrow \infty \) as \(i\longrightarrow \infty \). From Lemma 4.2, for any \(R>0\) (and restricting to a subsequence if necessary) the sequence of curves \(\{\gamma _i\}_{i=0}^\infty \) is eventually contained in the region \(r>R\).

Then, by Lemma 3.4, we have that the time of flight along \(\gamma _i'\) satisfies

$$\begin{aligned} |u_{\gamma _i',\infty }-v_{\gamma _i',\infty }|\le \frac{A}{R^{D-3}} \end{aligned}$$
(47)

for some constant A, where we note that a uniformly Schwarzschildean spacetime admits quasi-Cartesian co-ordinates with \(\alpha =D-2\).

From this we see that, for R sufficiently large, the time of flight along the \(\gamma _i'\), and hence also the time of flight along \(\gamma _i\), will eventually become strictly greater than the time of flight along \(\gamma _0\) (which was chosen to be strictly negative). But this contradicts the definition of the sequence \((\gamma _i)_{i=0}^\infty \) as consisting of faster and faster causal curves. We therefore conclude that, restricting to a subsequence if necessary, each of the causal curves \(\gamma _i\) enters the set \({\mathcal {K}}:=\left( J^+(p_0)\cap J^-(q_0)\right) \setminus {\mathcal {U}}_R\), where \({\mathcal {U}}_R:=\{x\in J^+(p_0)\cap J^-(q_0):r(x)>R\}\) and \(J^+(p_0)\cap J^-(q_0)\) includes \(i^0\) as well as certain points on \(\Lambda ^+\cup \Lambda ^-\).
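The contradiction can be stated quantitatively. Write \(\tau (\cdot )\) for the time of flight along a curve and \(\tau _0:=\tau (\gamma _0)<0\); this notation is introduced here only for illustration. If the curves \(\gamma _i'\) were eventually contained in \(r>R\), then (47) together with the ordering of the sequence would give

$$\begin{aligned} \begin{aligned} \tau (\gamma _i')\le \tau (\gamma _i)\le \tau _0<0\quad \text {and}\quad |\tau (\gamma _i')|\le \frac{A}{R^{D-3}}. \end{aligned} \end{aligned}$$

Since \(D\ge 4\), the exponent \(D-3\) is positive, so choosing \(R>\left( A/|\tau _0|\right) ^{1/(D-3)}\) gives \(|\tau (\gamma _i')|<|\tau _0|\), that is \(\tau (\gamma _i')>\tau _0\), which is incompatible with the first chain of inequalities.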

But \({\mathcal {U}}_R\) is an open set in \(J^+(p_0)\cap J^-(q_0)\) and, since \({\mathcal {D}}\) is globally hyperbolic, \(J^+(p_0)\cap J^-(q_0)\) is compact. It follows that \({\mathcal {K}}\) is a compact set. Hence, defining \(\gamma \) to be the limit of the sequence \(\{\gamma _i\}_{i=0}^\infty \) in the Vietoris topology (Appendix A, [19]), we see that \(\gamma \) must also enter \({\mathcal {K}}\). In particular this means that \(\gamma \) enters the interior of the spacetime. Furthermore, since each of the \(\gamma _i\) is causal, so too is the limit curve \(\gamma \).

So \(\gamma \) is a fastest causal curve from \(\Lambda ^-\) to \(\Lambda ^+\) which enters the interior of the spacetime. Such a curve necessarily lies on the boundary of the future null cone from p and hence must be a null geodesic ([20] Corollary after Theorem 8.1.2) without conjugate points ([9] Proposition 4.5.12). We conclude that \(\gamma \) is a null line.

5 A Focusing Theorem in Higher Dimensions

Theorem 1.2 shows that certain higher-dimensional, negative mass spacetimes contain a null line. This means that such spacetimes would be excluded if we also impose conditions which forbid the existence of such a curve.

We will require conditions which ensure that endless null geodesics encounter sufficient regions of positive focusing to guarantee that conjugate points will occur. This focusing of geodesics is consistent with the sort of behaviour we would expect to be caused by regions of positive mass (assuming the Einstein equations hold). In this sense, Corollary 5.2 below agrees with our prior understanding of positivity of mass.

In [15], the conditions imposed are those stated in Borde’s Focusing Theorem [2] (where we require these to hold for every complete causal geodesic). As mentioned in Sect. 1, we now assume that the quantity \(R_{ab}T^aT^b\) is finite and continuous, where \(T^a\) is tangent to the null geodesic under consideration. To guarantee this, it is sufficient to assume that the metric is \(C^2\).

Theorem 5.1

[2, Focusing Theorem 2] Let \(\gamma \) be a complete, affinely parameterised, causal geodesic with tangent \(T^a\) such that \(T_{[a}R_{b]cd[e}T_{f]}T^cT^d\ne 0\) at some point on \(\gamma \). Suppose that for any \(\epsilon >0\) and any \(t_1<t_2\), there exists \(\delta >0\) and intervals \(I_1\) and \(I_2\) of length \(\ge \delta \) with endpoints \(<t_1\) and \(>t_2\) respectively such that

$$\begin{aligned} \int _{t'}^{t''}R_{ab}T^aT^bdt\ge -\epsilon \quad \quad \forall t'\in I_1,\quad \forall t''\in I_2 \end{aligned}$$
(48)

Then \(\gamma \) contains a pair of conjugate points.

Theorem 5.1 was originally stated in \(3+1\) dimensions, although the proof does not rely on this and hence the theorem also holds in higher dimensions. For this reason, we may use the conditions of this theorem in our higher-dimensional generalisation of the positive mass theorem. As is pointed out in [15], if the Einstein equations are assumed to hold then the conditions of Borde’s theorem can equivalently be expressed in terms of the energy-momentum tensor.

Note that the conditions of Borde’s theorem are entirely global. Other conditions often used in such focusing theorems relate to local positivity of energy (assuming the Einstein equations hold). These conditions can be violated in a quantum theory, so our approach has the advantage that it may be possible to extend it to the semi-classical regime.

It may be the case that we could impose weaker conditions than those used in Borde’s theorem and still rule out the existence of null lines. For example, Theorem 1.2 only required the metric to be \(C^{1,1}\), so a focusing result for metrics which fail to be \(C^2\) would provide a more general result. This is mentioned at the end of Sect. II.2 in [15] and also in [13]. The important thing is that we impose sufficiently strong conditions to ensure that null lines cannot exist. The condition that \(T_{[a}R_{b]cd[e}T_{f]}T^cT^d\ne 0\) at some point on every geodesic is called the generic condition and is satisfied in all but a very special class of spacetimes [10]. As discussed in [15], it does not appear to be a particularly restrictive assumption. This is because if it is not satisfied by our spacetime, then we would expect it to be satisfied by some other spacetime which is “nearby” in some sense and whose 4-momentum differs by an arbitrarily small amount. As a result, imposing this assumption does not appear to weaken the positivity of mass result obtained here.

Using Theorems 1.2 and 5.1, we derive the following corollary, which is our version of the positive mass theorem in higher dimensions.

Corollary 5.2

Let \((M,g)\) be a uniformly Schwarzschildean spacetime in \(D+1\) dimensions (\(D\ge 4\)) which satisfies the conditions of Borde’s theorem for every complete causal geodesic and is such that \({\mathcal {D}}\cup {\mathcal {I}}\) is globally hyperbolic as a subset of \({\tilde{M}}\). Then \(m_{ADM}\ge 0\).

6 Conclusion

We have proved a version of the positive mass theorem for higher-dimensional uniformly Schwarzschildean spacetimes. As mentioned in Sect. 2, the assumptions made about our spacetime are weaker than those of [5]. In particular, we only require the metric to be uniformly Schwarzschildean, whereas in [5] the more restrictive requirement that the metric be strongly uniformly Schwarzschildean is made. We also drop the assumption of weak asymptotic regularity. However, as in [5] we fail to prove anything regarding the \(m=0\) case (a similar problem is encountered in [15] in \(3+1\) dimensions). This is because the method used in Sect. 4 to construct the initial timelike curve \(\gamma _0\) with strictly negative time of flight is no longer guaranteed to work. Its success will be determined by the higher order terms in the metric.

The assumptions made in this paper are also fundamentally different to those used in the proofs by Witten [21] and Schoen and Yau [16,17,18] (extended to all dimensions up to seven by Eichmair et al. [7]). The proof given here relies on properties of the spacetime in a neighbourhood of conformal infinity, whereas these other methods have the advantage of imposing conditions only on some initial spacelike hypersurface. Nonetheless, the “global” proof presented in this paper is interesting because it relies more clearly on properties we expect to hold in positive mass spacetimes. In particular, we show that null geodesic focusing is compatible only with spacetimes of non-negative mass. Consequently this proof also acts as evidence that it is consistent to think of the ADM mass defined in higher dimensions as describing the total mass contained in a spacetime. Furthermore, as mentioned above and in [15], by relying on global focusing results this proof avoids imposing non-negativity of energy locally. This leads to the possibility that the methods described here can be generalised to the semi-classical setting, where such local conditions can be violated by quantum matter.