1 Introduction

The goal of this article is first to make explicit the definitions of wavelike spacetimes with parallel rays in general relativity, and as was done in the development of the theory, to motivate a number of these definitions with reference to the well-established theory of electromagnetism. This is the subject of Sect. 2. By “wavelike spacetimes” we mean those spacetimes which themselves model wavelike behaviour, in contrast to the spacetimes which model objects that produce radiation (Footnote 1). We then examine the coordinate descriptions of the wavelike spacetimes in Sect. 3, where the “adapted” or “Brinkmann” coordinates in which these metrics are typically written are derived. Section 4 discusses the properties of the wavelike spacetimes, in particular the details of their so-called “wavefronts”, that such a spacetime appears as a limit of any spacetime via the “Penrose limit”, and their causal properties. In Sect. 5 we discuss progress in addressing the “Ehlers–Kundt conjecture”, which is a statement about our expectations of the wavelike spacetimes based on physical intuition.

It should be noted that this article deals only with the wavelike spacetimes which possess parallel rays, which is a subclass of all wavelike spacetimes in general relativity. A more general class are those geometries admitting shear-free, twist-free, geodesic null congruences, which splits into the Kundt class (non-expanding congruence) and the Robinson–Trautman class (expanding congruence). The waves with parallel rays discussed in this article form a subclass of the Kundt class. Other noteworthy classes include the colliding plane waves, cylindrical gravitational waves, spacetimes with accelerated sources (C-metrics, more generally spacetimes with boost-rotation symmetries), solitonic gravitational waves, cosmological gravitational waves in de Sitter and anti-de Sitter spacetimes, exact gravitational waves in FLRW cosmologies, Bianchi cosmologies, and Gowdy universes. For reviews of these topics and modern results other than those presented in the remainder of this article, we direct the reader to the following articles: A summary of the historical development of the mathematics of wavelike exact solutions [2], modern references on exact solutions in general relativity [3, 4], works which deal with colliding plane waves and the physical interpretations of certain exact solutions [5, 6], and other related reviews [7,8,9,10].

1.1 Survey of early developments

We begin by providing a brief historical perspective on the development of the theory of waves in general relativity (GR) as in [11] with some relevant additions. In particular, we outline here only the early results in the field in order to supplement the material of Sect. 2, and leave discussion of modern developments not covered elsewhere in this article to the references above.

1915:

Albert Einstein establishes the field equation of general relativity

1916:

Einstein demonstrates that the linearised vacuum field equation admits wavelike solutions which are rather similar to electromagnetic waves

1918:

Einstein derives the quadrupole formula according to which gravitational waves are produced by a time-dependent mass quadrupole moment

1925:

Hans Brinkmann finds a class of exact wavelike solutions to the vacuum field equation, later called pp-waves (“plane-fronted waves with parallel rays”) by Jürgen Ehlers and Wolfgang Kundt. Note that this was a purely mathematical work, and they were not yet understood as modelling massless radiation.

1926:

Baldwin and Jeffery illuminate the interpretations of wavelike spacetimes when amplitudes are not assumed to be small [12]

1936:

Einstein submits, together with Nathan Rosen, a manuscript to Physical Review in which they claim that gravitational waves do not exist

1937:

After receiving a critical referee report from Howard P. Robertson, Einstein withdraws the manuscript with the erroneous claim and publishes, together with Rosen, a strongly revised manuscript on wavelike solutions (Einstein-Rosen waves) in the Journal of the Franklin Institute

1957:

Felix Pirani gives an invariant (i.e. coordinate-independent) characterisation of gravitational radiation, and Bondi independently writes down a metric for the plane wave which is singularity-free and carries energy [13]. This work was later developed by Asher Peres in 1959 [14]

1958:

Andrzej Trautman reformulates Sommerfeld’s radiation boundary conditions for a general field theory, and applies this approach to relativity to find the boundary conditions to be imposed at infinity due to bounded sources in GR

1960:

Ivor Robinson and Trautman discover a class of exact solutions to Einstein’s vacuum field equation that describe outgoing gravitational radiation

1961:

Wolfgang Kundt surveys the wavelike geometries as those admitting a twist-free and non-expanding null congruence, and characterizes their subclasses of different Petrov type by geometrical properties [15, 16]

1962:

Ehlers and Kundt conjecture that the gravitational pp-waves other than the plane wave cannot be complete

1962:

Roger Penrose provides a geometric definition of asymptotic flatness, along with various new studies of the asymptotic properties of spacetimes including definitions and conservation laws for energy and momentum

1965:

Penrose shows that the plane waves (gravitational or otherwise) are not globally hyperbolic

1976:

Penrose demonstrates a limiting procedure by which any spacetime reduces to a plane wave, by “blowing up” a neighbourhood of a null geodesic

In the remainder of this article, we detail a subset of these results followed by a selection of advances of the theory that have taken place in the decades since. Again for details on modern advances not within the scope of this article, see [2,3,4,5,6,7,8,9,10].

Table 1 Nomenclature summary, covering the definitions of this article and some terminology which has been used to refer to the same objects in the literature

1.2 Nomenclature

The names used to refer to different classes of wavelike geometries in this article are not all standard in the literature, due to a degree of degeneracy in the usage of certain terms; e.g. “pp-wave” can implicitly refer to a 4-dimensional geometry with planar wavefront, or to an n-dimensional geometry with curved wavefront. Also sometimes ambiguous is the local or global nature of coordinates used in the description of wavelike geometries. Due to the importance of dimension, global characteristics and wavefront geometry in determining the properties of the wave, the authors see it as necessary to fix one consistent language for the purposes of this article. To summarize these definitions and to facilitate comparison with the literature, we fix nomenclature in Table 1 below. In particular, note that the term “parallel wave” has not previously been used, and instead the term “pp-wave” is often used in the literature to refer to the same object with the understanding that the geometry in question need not have planar wavefront (Footnote 2).

2 Defining waves in general relativity

Let us now set about attempting to define a wave in GR. This is not a simple task because of the inherent nonlinearity of GR, and so we take inspiration from the well-established linear wave theory of electromagnetism. To this end, we will start by looking at linearised/“weak-field” GR, and demonstrate that in this linear regime one finds wavelike behaviour analogous to Maxwell’s electromagnetism (EM), with some fundamental differences. Such differences originate (Footnote 3) in the fact that the relevant field object in EM is a 1-tensor (the vector potential \(A^\mu \)) whereas in GR the relevant object is a 2-tensor (the metric \(g_{\mu \nu }\)).

Upon finding such behaviour in the linear regime, we will discuss how to extend the results to the general case. This will be accomplished by taking inspiration from the covariant properties of the linearised waves (those properties which do not depend on the coordinate system used), and showing that a general metric satisfying such properties exhibits similar wavelike behaviour.

2.1 Linearised gravity

Finding wavelike behaviour in the linear/weak-field regime is a standard calculation, first completed in 1916 by Einstein [18]; for a modern presentation see for example [11, 19, 20]. As a result, in this section we will only restate the results necessary to build intuition for the later definitions of wavelike behaviour. Consider a perturbation \(h_{\mu \nu }\) to the Minkowski background \(\eta _{\mu \nu }\). That is, for the spacetime manifold \(M = {\mathbb {R}}^4\) we have the Lorentzian metric

$$\begin{aligned} g_{\mu \nu }= \eta _{\mu \nu }+ h_{\mu \nu }, \quad |h_{\mu \nu }| \ll 1 \end{aligned}$$
(1)

where we have implicitly chosen local coordinates \(x^\mu \), and in these coordinates the Minkowski metric \(\eta \) takes the usual form \(\text {diag}(-1,+1,+1,+1)\) and the perturbation \(h_{\mu \nu }\) is in some sense “small”. Here, “smallness” is defined loosely by the fact that the terms quadratic in \(h_{\mu \nu }\) contribute insignificantly to the Einstein equations. We then wish to obtain the Einstein tensor for this metric to linear order in \(h_{\mu \nu }\). To this end, we may raise and lower indices of \(h_{\mu \nu }\) with the background metric \(\eta \) since doing so with the full metric g would lead to corrections of order higher than 1 in \(h_{\mu \nu }\). This can also be viewed as treating the perturbation \(h_{\mu \nu }\) as a symmetric tensor propagating (Footnote 4) on a Minkowski background. For the details of this calculation on a curved background, see [11].

To simplify calculations, one chooses to work not with \(h_{\mu \nu }\) but rather with the trace-reversed variable \(\bar{h}_{\mu \nu }\) defined as

$$\begin{aligned} \bar{h}_{\mu \nu }:= h_{\mu \nu }-\frac{1}{2} h \eta _{\mu \nu }\end{aligned}$$

called so because \(\bar{h}^\mu {}_\mu = - h^\mu {}_\mu =: -h\) (note that the Einstein tensor is just the trace-reversed Ricci tensor). In electromagnetism one often works with the Lorenz (Footnote 5) gauge condition \(\partial _\mu A^{\mu }=0\) for the vector potential \(A^\mu \). Since we are interested in describing radiation in general relativity, we will use the analogous condition

$$\begin{aligned} \partial {}^\mu \bar{h}_{\mu \nu }= 0 \end{aligned}$$
(2)

on the trace-reversed perturbation \(\bar{h}_{\mu \nu }\). As a result of these choices, the Einstein tensor is given (to linear order in the perturbation) by

$$\begin{aligned} G_{\mu \nu }=-\frac{1}{2}\square \bar{h}_{\mu \nu }\end{aligned}$$
(3)

where we have defined the D’Alembertian \(\square :=\nabla ^\mu \nabla _\mu \) which here is simply the flat space D’Alembertian \(\square = -\partial _t^2 + \partial _x^2 + \partial _y^2 + \partial _z^2\) (the presence of which is an early sign of wavelike behaviour). Therefore the Einstein equation of linearised gravity reads

$$\begin{aligned} \square \bar{h}_{\mu \nu }= -16\pi T_{\mu \nu }\end{aligned}$$
(4)

in units where \(c=G=1\) and it is understood that the energy-momentum tensor T is also consistent with the “weak field” regime. By this, we mean that the lowest nonvanishing order in \(T_{\mu \nu }\) is of the same order of magnitude as the perturbation \(h_{\mu \nu }\). The vacuum Einstein equation is then simply a homogeneous wave equation for \(\bar{h}_{\mu \nu }\) and so one makes the plane wave ansatz

$$\begin{aligned} \bar{h}_{\mu \nu }= C_{\mu \nu }(k) e^{ik_\sigma x^\sigma } \end{aligned}$$
(5)

for some complex, symmetric coefficients \(C_{\mu \nu }\) (Footnote 6) and \(k = (\omega , \textbf{k})\) a constant vector field on M (constant in the usual sense, since the background metric is flat). As is standard when making a plane wave ansatz written in the complex form, it is understood that at the end of the day, one should take the real part of expressions to obtain physical results.

The Lorenz gauge condition Eq. 2 for such a perturbation yields

$$\begin{aligned} k^\mu C_{\mu \nu }= 0 \end{aligned}$$
(6)

for all \(\nu \), that is, the perturbation is orthogonal to the wave vector. One may interpret this as the fact that a gravitational perturbation of this kind will be transverse in a way analogous to the electric and magnetic fields of electromagnetism. The vacuum Einstein equations for such a plane wave perturbation yield

$$\begin{aligned} 0 = \square \bar{h}_{\mu \nu }= -k_\sigma k^\sigma \bar{h}_{\mu \nu }\end{aligned}$$
(7)

which is obtained by noting that \(\partial _\sigma \bar{h}_{\mu \nu }= ik_\sigma \bar{h}_{\mu \nu }\). Since we are not interested in solutions for which \(\bar{h}_{\mu \nu }\) is identically zero, we instead have

$$\begin{aligned} k_\sigma k^\sigma = 0 \end{aligned}$$
(8)

that is, the “wave vector” k of a plane wave solution to the linearised Einstein equations must be null. This is the statement that in the linear theory, the metric exhibits a wavelike behaviour which propagates at the speed of light c. These facts served as an early hint that gravitational waves exist, and that they travel at c.

One can utilize the remaining coordinate freedom (since the Lorenz gauge is only a partial gauge fixing) to obtain illuminating expressions for the \(C_{\mu \nu }\) in the so-called transverse-traceless gauge (Footnote 7), named so because in such a gauge the perturbation h is traceless and thus \(h=\bar{h}\). Reusing the labels \(x^\mu \) for the coordinate system resulting from the full gauge fixing, one finds [20, p. 150] that for a wave travelling in the \(x^3\) direction (Footnote 8) the coefficients \(C_{\mu \nu }\) take the particularly simple form

$$\begin{aligned} (C_{\mu \nu })= \begin{pmatrix} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} C_{+} &{} C_{\times } &{} 0 \\ 0 &{} C_{\times } &{} -C_{+} &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \end{pmatrix} \end{aligned}$$
(9)

where the subscripts on the components are justified after computing the effect of such a perturbation on a ring of test particles, and noting that for only \(C_+\) nonzero one finds the ring oscillates in a “\(+\)” pattern, and for only \(C_\times \) nonzero the ring oscillates in a “\(\times \)” pattern [19, 21]. The same structure will be observed when we make the transition to the nonlinear theory and attempt to define an analogous “plane wave” (Sect. 2.2.1). Note that our perturbation is now fully described by the two remaining independent constant components \(C_+\) and \(C_\times \) (and k which here only has one independent parameter, the frequency \(\omega \)) suggesting that there exist two linearly independent polarisation states of gravitational radiation.
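As a quick numerical sanity check of Eqs. 6, 8 and 9, the following minimal Python sketch builds a wave vector for propagation along \(x^3\) together with the transverse-traceless coefficient matrix, and evaluates the null, transversality and tracelessness conditions. The frequency and amplitudes are arbitrary illustrative values, and the helper names are our own, not taken from the references.

```python
# Minkowski metric in the signature (-,+,+,+) used in the text.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def lower(vec):
    """Lower an index with eta: v_mu = eta_{mu nu} v^nu."""
    return [sum(ETA[m][n] * vec[n] for n in range(4)) for m in range(4)]


def norm_squared(vec):
    """Return k_sigma k^sigma, which must vanish for a null vector (Eq. 8)."""
    low = lower(vec)
    return sum(low[m] * vec[m] for m in range(4))


def tt_amplitude(c_plus, c_cross):
    """Coefficient matrix C_{mu nu} of Eq. 9 for propagation along x^3."""
    return [[0, 0, 0, 0],
            [0, c_plus, c_cross, 0],
            [0, c_cross, -c_plus, 0],
            [0, 0, 0, 0]]


def contract_with_k(k_up, coeffs):
    """Return k^mu C_{mu nu} for each nu (the Lorenz condition, Eq. 6)."""
    return [sum(k_up[m] * coeffs[m][n] for m in range(4)) for n in range(4)]


def trace(coeffs):
    """Return eta^{mu nu} C_{mu nu}; this diagonal eta is its own inverse."""
    return sum(ETA[m][m] * coeffs[m][m] for m in range(4))


omega = 2.5                        # illustrative frequency (assumption)
k_up = [omega, 0.0, 0.0, omega]    # k^mu = (omega, 0, 0, omega)
C = tt_amplitude(0.3, -0.7)        # illustrative amplitudes (assumption)

print("k.k       =", norm_squared(k_up))
print("k^mu C_mn =", contract_with_k(k_up, C))
print("trace C   =", trace(C))
```

All three printed quantities vanish for any choice of \(\omega \), \(C_+\) and \(C_\times \), which reflects the fact that only these two amplitudes remain free after the gauge is fully fixed.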

To convince oneself of the physicality of these results, one needs to examine the motion of test particles in such a spacetime. One finds that for non-relativistic test particles, the geodesic equations are solved by a particle whose coordinate location remains constant. In fact the coordinate system can be thought of as “moving with” the particle, effectively hiding the dynamics from the perspective of our coordinates [21, Sec. 1.4]. Instead, upon examining the relative motion of test particles via the geodesic deviation equation, one finds a periodic oscillation of the test particles, supporting the physicality of such a wave in the weak-field regime.
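A somewhat simpler route to the same conclusion than the geodesic deviation equation is to compute a proper distance directly. As a sketch, assume \(C_+\) is real, \(C_\times = 0\), and consider two test particles at rest in the transverse-traceless coordinates at \(x^3 = 0\), separated by a coordinate distance \(L_0\) along \(x^1\) (the symbols \(L_0\) and L are introduced here only for illustration). Taking the real part of Eq. 5 with \(k^\mu = (\omega , 0, 0, \omega )\), the proper separation is, to first order in \(C_+\),

$$\begin{aligned} L(t) = \int _0^{L_0} \sqrt{g_{11}}\, dx^1 = L_0 \sqrt{1 + C_+ \cos (\omega t)} \approx L_0 \left( 1 + \frac{1}{2} C_+ \cos (\omega t)\right) . \end{aligned}$$

The same computation along \(x^2\) gives \(L_0 \left( 1 - \frac{1}{2} C_+ \cos (\omega t)\right) \), so the two transverse directions are stretched and squeezed in antiphase even though the coordinate positions of the particles do not change. This is the “\(+\)” pattern described above.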

2.2 Wavelike exact solutions

We now ask ourselves the natural question “does the full nonlinear theory also admit wavelike solutions?”. Furthermore, we wonder if such solutions reduce to those of the linear theory in the weak-field regime. In order to generalize the wave objects of the linearised theory, let us examine which of their properties are covariantly defined (that is, in a coordinate-independent manner). One easily recognizable covariant property is that the “wave vector” k should be null

$$\begin{aligned} g(k,k) = k^\mu k_\mu = 0. \end{aligned}$$
(10)

Further scrutiny of the results of the previous section yields that \(k^\mu \) is also an eigenvector of the Riemann tensor with eigenvalue 0, that is

$$\begin{aligned} R_{\mu \nu \sigma \rho }k^\rho = 0 \end{aligned}$$
(11)

for all \(\mu ,\nu ,\sigma \). One could use these two properties as a starting point for a definition of a wave in general relativity, that is, a Lorentzian manifold (M, g) admitting a null vector field Z (Footnote 9) which is an eigenvector of the Riemann tensor with eigenvalue 0. In fact such a spacetime does exhibit wavelike behaviour [27, Ch. 32.3 & 34.1], but is rather cumbersome to work with, and is missing some characteristics of the waves in the linearised theory.

One such characteristic is as follows: When making the plane wave ansatz Eq. 5, we assumed the vector field k to be constant. As a result, the rays of the corresponding wave were parallel (in the usual Euclidean sense). In order to obtain the same qualitative behaviour, we should not demand that Z be an eigenvector of the Riemann tensor with eigenvalue 0, but rather the stronger condition that Z be covariantly constant (which in some sense generalises the notion of “constant”) which is written \(\nabla Z = 0\) for \(\nabla \) the Levi-Civita connection of the geometry in question. With this, we attempt the following covariant definition:

Definition 1

(Parallel Wave) A parallel wave (wave with parallel rays) is a Lorentzian manifold (M, g) which admits a global, covariantly constant, null vector field Z.

The “rays” of such a wave are the integral curves of the defining vector field Z, which are automatically (null) geodesics since Z is covariantly constant. It is justified that we may call such objects “rays” by the fact that null geodesics correspond to the paths of light rays.

Remark 2.1

If we had demanded that Z was an eigenvector of the Riemann tensor with eigenvalue zero instead of being covariantly constant, we would obtain an example from a general class of solutions called the “Degenerate gravitational fields” which contains the pp-waves as a subset (that is, Z being covariantly constant implies that Z is an eigenvector of the Riemann tensor with eigenvalue 0, but the converse is not true). These degenerate vacuum solutions are defined by the property that they admit (at least) one shear-free, geodesic null congruence. For details of this class and in particular the above mentioned example, see [27, Ch. 32.3 & 34.1]. The family of geometries admitting at least one shear-free, twist-free, geodesic null congruence splits into the Kundt class (for a non-expanding congruence) and the Robinson-Trautman class (for an expanding congruence). For details of these classes see [4], but in this article we focus primarily on the pp-waves.
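Both consequences of \(\nabla Z = 0\) used above follow from one-line computations, sketched here with the caveat that the sign and index ordering of the Riemann tensor depend on the convention adopted. First, the integral curves of Z are geodesics because

$$\begin{aligned} \nabla _Z Z = Z^\mu \nabla _\mu Z = 0. \end{aligned}$$

Second, the Ricci identity expresses the commutator of covariant derivatives acting on Z through the curvature,

$$\begin{aligned} (\nabla _\mu \nabla _\nu - \nabla _\nu \nabla _\mu ) Z^\sigma = R^\sigma {}_{\rho \mu \nu } Z^\rho , \end{aligned}$$

and the left hand side vanishes identically when Z is covariantly constant. Lowering the index and using the pair symmetry \(R_{\sigma \rho \mu \nu } = R_{\mu \nu \sigma \rho }\) then gives \(R_{\mu \nu \sigma \rho } Z^\rho = 0\), which is Eq. 11 with k replaced by Z. The converse fails because the vanishing of this contraction constrains only the curvature, not the first derivatives of Z.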

However, another feature of the plane waves in the linearised theory which we have not yet imposed is the planar character. A plane wave has a planar wavefront (roughly, the spacelike codimension-2 surface orthogonal to the wave vector), but in general these parallel waves can have curved wavefronts. Although to obtain wavelike behaviour it is not necessary to demand the wavefront be flat (and in fact we will reintroduce this curvature in Sect. 3.5), it is standard in the field to make this restriction. This is likely because when considering a curved wavefront, the geometric properties of the wavefront can “obscure” those fundamental properties of the wave, such as the vanishing of the scalar curvature invariants (Sect. 4.1). To demand the wavefront is flat, let us define precisely (Footnote 10) the wavefront of a wave:

Definition 2

(The Wavefront of a Parallel Wave) If a parallel wave (M, g) is defined by a covariantly constant, null vector field Z (analogous to the “wave vector” of a plane wave in the linear theory) then the wavefront of such a wave is defined as

$$\begin{aligned} Z_\perp / Z, \end{aligned}$$

where \(Z_\perp := \{X \in TM ~|~ g(X,Z) = 0\}\) and the quotient is defined by the equivalence relation \(X\sim Y \iff Y = X + fZ\) for some smooth function f.

We must quotient with the wave vector itself since Z is null, thus \(Z \in Z_\perp \) and the natural analogy to electromagnetism suggests that Z itself should not be considered as part of the wavefront. This definition appears in [24] under the name “screen bundle”, where it is treated rigorously in the context of compact pp-waves. As the authors note, the “wave” interpretation becomes less clear in the compact case. As will be discussed in Sect. 2.3, the presence of radiation is characterized by the null asymptotics of the spacetime, but a compact manifold does not admit the same notion of “null infinity” as will be used to define the presence of radiation. With this in mind, we maintain the name “wavefront” for simplicity. For details of the induced metric on the wavefront see Sect. 4.2.

If we wish to demand that the wavefront be flat, then this is most succinctly described (see [25]) by considering the Riemann tensor as a map on bivectors (antisymmetric 2-tensors) in \(Z_\perp \wedge Z_\perp \), in which case the flatness condition for the wavefront becomes

$$\begin{aligned} R|_{Z_\perp \wedge Z_\perp } = 0. \end{aligned}$$
(12)

With this, we arrive at the definition of the plane-fronted waves with parallel rays (pp-waves).

Definition 3

(Plane-fronted Wave with Parallel Rays (pp-Wave)) A pp-wave is a Lorentzian manifold (M, g) which admits a global, covariantly constant, null vector field Z, in which the curvature tensor satisfies \(R|_{Z_\perp \wedge Z_\perp } = 0\).

Note that in the literature (for example [3, Eq. 24.39]) a pp-wave is often defined as a Lorentzian manifold admitting a covariantly constant, null vector field (that is, our definition of a parallel wave), where it is understood that the name refers to no actual planar character. Other works, however, also include the curvature condition Eq. 12 as is done here, e.g. [24,25,26].

2.2.1 Comparison with the linearised theory

We now set about comparing the features of these pp-waves with those of the waves found in the linear regime. Consider the metric of Minkowski space written in the so-called “light-cone” coordinates (Footnote 11)

$$\begin{aligned} \eta = 2dudv + dx^{2}+dy^{2}, \end{aligned}$$
(13)

where the coordinates u and v are defined in terms of the standard (t, x, y, z) coordinates as

$$\begin{aligned} u := \frac{z-ct}{\sqrt{2}} \quad v := \frac{z+ct}{\sqrt{2}} \end{aligned}$$
(14)

and where we briefly reintroduce the speed of light c for transparency. As we will prove in Sect. 3.1, a 4-dimensional pp-wave metric can locally be written as

$$\begin{aligned} g = 2dudv + H(u, x, y) du^{2} + dx^{2}+dy^{2}, \end{aligned}$$
(15)

where the so-called “characteristic function” H is independent of the coordinate v, and where we have suggestively used the same coordinate labels as for the above flat metric. Here H describes the wave (deviation from flat space) in the sense that when \(H = 0\), we simply have the above flat metric Eq. 13. Note that this metric is a solution of the vacuum Einstein equations if and only if H is harmonic in (x, y), that is \((\partial _x^2 + \partial _y^2)H(u,x,y) = 0\). Here we already see a hint of wavelike behaviour. Treating H as a perturbation on the Minkowski background (and thus inheriting the coordinate system of Eq. 13), we see that the perturbation depends on time only through the coordinate u, that is a time-dependence proportional to \(z - ct\), as one would expect for a travelling wave.
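The vacuum condition can be explored numerically. The following minimal Python sketch estimates the transverse Laplacian of a characteristic function by central differences and applies it to two hypothetical profiles chosen only for illustration: one harmonic in (x, y) with an arbitrary u-dependence, and one that is not harmonic. The function names and the step size are assumptions of this sketch.

```python
import math


def laplacian_xy(H, u, x, y, step=1e-3):
    """Central-difference estimate of (d_x^2 + d_y^2) H at the point (u, x, y)."""
    d2x = (H(u, x + step, y) - 2.0 * H(u, x, y) + H(u, x - step, y)) / step**2
    d2y = (H(u, x, y + step) - 2.0 * H(u, x, y) + H(u, x, y - step)) / step**2
    return d2x + d2y


def H_vacuum(u, x, y):
    """Hypothetical profile, harmonic in (x, y) for every u."""
    return math.cos(u) * (x * x - y * y) + 2.0 * math.sin(u) * x * y


def H_nonvacuum(u, x, y):
    """Hypothetical profile with transverse Laplacian equal to 4."""
    return x * x + y * y


point = (0.4, 1.2, -0.7)  # arbitrary sample point (u, x, y)
print("harmonic profile:    ", laplacian_xy(H_vacuum, *point))
print("non-harmonic profile:", laplacian_xy(H_nonvacuum, *point))
```

Up to rounding error the first value is zero, so by the criterion above the first profile gives a vacuum solution, while the second value is close to 4 and the corresponding metric is not a vacuum solution.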

Surprisingly, as in [27, Above Eq. 29.46], one can show that the pp-wave metric Eq. 15 in fact solves the linearised field equations. This is because even in the general theory, no expressions of quadratic order or higher in H nor its derivatives appear in the field equations for such a spacetime. The primary difference with the linear theory is that H need not be “small”. In this way, we see that the “standard pp-waves” do in fact generalise the results of the linearised theory.
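The linearity can be made explicit by recording the Ricci tensor of the metric Eq. 15. As a sketch, with the overall sign depending on the curvature convention, the only component which can be nonzero is

$$\begin{aligned} R_{uu} = -\frac{1}{2}\left( \partial _x^2 + \partial _y^2\right) H(u,x,y). \end{aligned}$$

This expression is linear in H, contains no products of H with its derivatives, and vanishes precisely when H is harmonic in (x, y), in agreement with the vacuum condition stated for Eq. 15.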

We will find in Sect. 3.3.1 that the simplest pp-wave occurs when the characteristic function H(u, x, y) is quadratic in (x, y) (with arbitrary u-dependence). Such a pp-wave is typically referred to as a “plane wave”. These spacetimes exhibit the same polarization states as those which can be derived in the linear regime (shown in Sect. 3.3.1), and this is one reason they are given the name “plane waves”. For a detailed description of the “planeness” of such spacetimes, see [30, Sec. 3]. In order to directly compare these simple pp-waves to the plane wave solutions in the linear regime, as in [27], one “linearises” the exact solution by assuming the amplitude of the wave is small. The reasoning of Stephani [27] is as follows:

  • The vacuum plane wave metric of linearised gravity can be written as

    $$\begin{aligned} g= 2dudv + \left( 1+f(u)\right) \textrm{d} x^{2} +\left( 1-f(u)\right) \textrm{d} y^{2}, \end{aligned}$$

    where \(f(u)=A \cos (\frac{\omega }{c} (u+\varphi ))\) for some frequency \(\omega \), phase \(\varphi \) and constant A. As usual on a Minkowski background, we interpret u as \(z-ct\).

  • The linearised version of the vacuum plane wave metric (pp-wave with H harmonic and quadratic in (x, y)) can be written

    $$\begin{aligned} g=2dudv + \left( 1+\alpha (u)\right) \textrm{d} x^{2} +\left( 1-\alpha (u)\right) \textrm{d} y^{2} \end{aligned}$$

    with the u-dependence of \(\alpha \) arbitrary, and \(\alpha \ll 1\).

  • The frequency \(\omega \) of the linearised theory is fixed by the plane wave ansatz, but the profile functions \(\alpha (u)\) of the second case have no predetermined frequency. Therefore the \(\alpha (u)\) can be chosen for example as

    $$\begin{aligned} \sum _j A_j\cos \left( \frac{\omega _j}{c} (u+\varphi _j)\right) \end{aligned}$$

    for small constants \(A_j\), which corresponds to a superposition (Footnote 12) of waves of varying frequency. In this way, the exact solution plane waves are interpreted as a packet of plane waves of differing frequencies.
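The packet interpretation in the last item can be illustrated with a few lines of Python. This is a minimal sketch which evaluates the sum of cosines above for a hypothetical set of amplitudes, frequencies and phases, and checks that the resulting profile stays small; the numerical values and function names are assumptions made for illustration.

```python
import math


def packet_profile(u, components, c=1.0):
    """Evaluate alpha(u) = sum_j A_j cos(omega_j / c * (u + phi_j)).

    components is an iterable of (A_j, omega_j, phi_j) tuples.
    """
    return sum(A * math.cos(omega / c * (u + phi)) for A, omega, phi in components)


def amplitude_bound(components):
    """Return sum_j |A_j|, an upper bound on |alpha(u)| for all u."""
    return sum(abs(A) for A, _, _ in components)


# Three hypothetical components with small amplitudes (illustrative values).
components = [(1e-3, 2.0, 0.0), (5e-4, 3.5, 0.3), (2e-4, 7.0, 1.1)]
samples = [packet_profile(0.1 * n, components) for n in range(200)]

print("largest |alpha| sampled:", max(abs(s) for s in samples))
print("bound sum |A_j|:        ", amplitude_bound(components))
```

Since the profile is bounded by the sum of the \(|A_j|\), choosing small constants \(A_j\) is enough to keep \(\alpha \ll 1\) for all u, as the linearisation requires.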

There is a more convincing reason why one would call such a pp-wave a “plane wave” based on the algebraic and geometric symmetries of the spacetime, and we will discuss this in the following section.

2.3 Spacetimes containing gravitational radiation

Let us now review two paths by which one can obtain definitions of the presence of wavelike behaviour/radiation in a spacetime, and the ways in which these approaches coincide with our existing definition of an exact solution describing only a wave.

2.3.1 Algebraic classification of the Weyl tensor

Felix Pirani and Hermann Bondi (independently) pioneered an attempt at defining gravitational waves as exact solutions of the Einstein field equations, using geometric and algebraic principles developed first by Petrov. Our presentation will follow closely that of [21, p. 8,9]. The key concept in this endeavour is the Weyl tensor, which is the trace-free part of the Riemann tensor. As such, the Riemann tensor reduces to the Weyl tensor in vacuum regions, where the Ricci tensor (the trace of the Riemann tensor) vanishes:

$$\begin{aligned} R_{\mu \nu }= 0 \iff C_{\mu \nu \sigma \rho } = R_{\mu \nu \sigma \rho } \end{aligned}$$
(16)

for all \(\mu ,\nu \), where it is understood that a C with four indices is the Weyl tensor, not to be confused with the (0,2)-tensor C in the plane wave ansatz Eq. 5 of the linearised theory. When looking in particular for gravitational waves (i.e. in vacuum), it is apparent that the relevant object for describing the wave is the Weyl tensor.

Pirani’s intuition was that for gravitational waves, the Weyl tensor should exhibit special symmetries. The Weyl tensor of a spacetime (M, g) is conformally invariant, that is, it is invariant under conformal transformations of the metric:

$$\begin{aligned} g_{\mu \nu }&\longrightarrow g_{\mu \nu }^{\prime } =\lambda ^{2} g_{\mu \nu } \end{aligned}$$
(17)
$$\begin{aligned} C_{\mu \nu \sigma }{}^{\rho }&\longrightarrow C^{\prime }_{\mu \nu \sigma }{}^{\rho } =C_{\mu \nu \sigma }{}^{\rho } \end{aligned}$$
(18)

for some conformal factor \(\lambda : M \rightarrow {\mathbb {R}}\). Intuitively, the Weyl tensor expresses the tidal forces that a free-falling body feels along a geodesic (see [21]). That the Weyl tensor describes tidal forces (roughly, the relative acceleration felt by two test masses separated by an infinitesimal distance) should sound familiar, as this was how we detected the physical effect of gravitational waves in the linearised theory. It should not be surprising then that the Weyl tensor is the object describing radiation in general relativity. The correspondence between tidal forces and exact gravitational waves has been the subject of much study (often from the perspective of the geodesic deviation equation), details of which can be found in the following articles: [29, 31,32,33,34,35].

In 1954, Petrov devised a classification of the algebraic symmetries of the Weyl tensor at each point in a 4-dimensional spacetime, and Pirani independently derived the same classification in 1957. They noted that the Weyl tensor preserves the antisymmetry of antisymmetric 2-tensors (or “bivectors”), that is for \(X_{\mu \nu }= -X_{\nu \mu }\),

$$\begin{aligned} X_{\mu \nu }C^{\mu \nu }{}_{\sigma \rho } = Y_{\sigma \rho } \end{aligned}$$
(19)

where \(Y_{{\mu \nu }}\) is also a bivector. By finding the eigenbivectors \(X_{\mu \nu }\) of the Weyl tensor, i.e. bivectors satisfying \(X_{\mu \nu }C^{\mu \nu }{}_{\sigma \rho } = 2\lambda X_{\sigma \rho }\), one can classify 6 types of algebraic symmetry. The eigenbivectors for a given point p in a spacetime are related to a set of null vectors in \(T_p M\) called the “principal null directions” (PNDs) at p, but the specifics of this correspondence are rather complicated. For details see for example [3] or [36, Sec. 7.2-7.4].

One may wonder why there are 6 symmetry types, but this is simply because the Weyl tensor can have at most 4 linearly independent eigenbivectors, and so the options are:

$$\begin{aligned}{} & {} \text {Type I:}~~\uparrow \rightarrow \nwarrow \nearrow ~~~~\text {Type II:}~~\uparrow \uparrow \nearrow \searrow ~~~~ \text {Type D:}~~ \uparrow \uparrow \rightarrow \rightarrow \\{} & {} \text {Type III:}~~\uparrow \uparrow \uparrow \rightarrow ~~~~~ \text {Type N:} ~~\uparrow \uparrow \uparrow \uparrow ~~~~~~ \text {Type O:}~~ C_{\mu \nu \sigma \rho } = 0 \end{aligned}$$

where aligned arrows represent linearly dependent PNDs. The Bel criteria are the conditions on the Weyl tensor \(C_{\mu \nu \sigma \rho }\) (in a special coordinate system) such that it is of one of the above types. The Bel criterion for a type N spacetime is that the metric admits a null vector field \(k^\rho \) such that

$$\begin{aligned} C_{\mu \nu \sigma \rho } k^\rho = 0 \end{aligned}$$
(20)

This condition should again look very familiar, as it was one of the two covariantly defined properties of the wave vector k in the linear theory, where the Riemann tensor is replaced by only the Weyl tensor (which it indeed reduces to in a vacuum region). The four coinciding PNDs indeed correspond to the wave vector of the linear theory, but also to the covariantly constant, null vector field Z in the definition of a pp-wave (Definition 3). By this we mean that the pp-wave spacetime is everywhere algebraically special, and is of Petrov type N.

In this way, the Petrov type N represents the presence of wavelike behaviour in a spacetime. Note that the Petrov type can vary from region to region in a spacetime (though not all “transitions” are possible, see [21]), and so the Weyl tensor of what we could reasonably consider a radiative spacetime should be of type N in the far-field (towards null infinity). Such a statement is made precise by the “peeling theorem” [37,38,39], which describes the asymptotic behaviour of the Weyl tensor as one approaches null infinity. For r an affine parameter along a null geodesic \(\gamma \) from a point p to null infinity, as \(r\rightarrow \infty \), the Weyl tensor can be written in a parallelly propagated frame along \(\gamma \) as

$$\begin{aligned} C_{\mu \nu \sigma \rho } = \frac{C_{\mu \nu \sigma \rho }^{\mathrm{(N)}}}{r} + \frac{C_{\mu \nu \sigma \rho }^{\mathrm{(III)}}}{r^2} + \frac{C_{\mu \nu \sigma \rho }^{\mathrm{(II)}}}{r^3} + \frac{C_{\mu \nu \sigma \rho }^{\mathrm{(I)}}}{r^4} + \dots \end{aligned}$$
(21)

where the superscript on each term on the right hand side represents the Petrov type of that tensor. Roughly,Footnote 13 towards null infinity one finds that the dominant behaviour comes from the type N component. This expansion bears a striking resemblance to the multipole expansion of the electromagnetic potentials, wherein again only the \(\sim 1/r\) term contributes to radiation.

Remark 2.2

We pause to mention here the more geometric notion of asymptotic behavior at infinity due to Penrose [40], where infinity is regarded as a three-dimensional boundary corresponding to \(\Omega = 0\) in the definition of the following conformal metric

$$\begin{aligned} g = \Omega ^2 \tilde{g}, \end{aligned}$$

where \(\tilde{g}\) is the original spacetime metric. The key is that one can treat infinity as a 3-dimensional boundary while still studying those physical properties of the original spacetime metric \(\tilde{g}\) that are conformally invariant. For a comprehensive treatment of this notion of conformal infinity, consult [41]; for its more recent use in holography and the AdS-CFT correspondence, consult, e.g., [42].

Remark 2.3

It is worth now stating precisely what one means by gravitational radiation. As in [31], gravitational radiation is the transfer of energy via gravitational waves to null infinity. That is, gravitational radiation is present in the asymptotic regime of an isolated dynamical system in GR, such as that in the Christodoulou-Klainerman spacetimes [43].

In 1957, Pirani attempted to define the presence of gravitational radiation as being modelled by a spacetime which was everywhere algebraically special of a certain type [44], but eventually published new work with Bondi and Robinson [30, Sec. 4] in which they claimed that such a definition was too restrictive and in fact only applies to pure radiation; it would not describe the radiation from a system of charges (gravitational or electromagnetic) at a finite distance. As such, they revised the definition of a spacetime containing gravitational radiation to a spacetime which is asymptotically type N. One reason for this is that a plane wave is everywhereFootnote 14 type N (again in the original classification of Petrov), and in the far-field, gravitational radiation should approximate the plane wave. The everywhere type N spacetimes contain the “pp-waves” defined above as a subclass, see [4, Sec. 18.2].

2.3.2 Groups of motions (symmetry)

In an attempt at a purely geometric definition of gravitational waves, Bondi, Pirani and Robinson began by attempting to define covariantly the plane wave. They do this by demanding that the gravitational plane wave of general relativity should “possess an analogous degree of symmetry to that possessed by plane electromagnetic waves in flat space-time” [30, Sec. 2]. As mentioned in the original paper, this approach ensures that one avoids the so-called “coordinate waves” which are apparent wavelike behaviours which are removed by a diffeomorphism (and thus, simply artifacts of the coordinates chosen).

Consider a plane wave in Minkowski space with wave vector in the positive z direction.Footnote 15 There is one clear symmetry of such a wave, and that is the planar wavefront. More precisely, translations in the x and y directions leave our description invariant. Another symmetry is due to the translation of the wavefronts themselves, i.e. the translation along the null 3-surfaces \(z - t = \text {const}\) in units where \(c = 1\). In fact, there are two additional, less obvious symmetries known as the “null rotations”, which are more difficult to see and visualise as their nature is inherently 4-dimensional. In total, we say there exists a 5-parameter group of motions (isometries) under which the plane wave is invariant. The corresponding Killing vector fields for these isometries are given explicitly in [3, Table 24.5] and [4, Sec. 17.5]. Using this as inspiration, the authors defined a gravitational plane wave as follows, where “equivalent” is in reference to a spacetime with metric Eq. 15 such that H is quadratic in (x, y) as was briefly mentioned in Sect. 2.2.1, and is made more explicit in Sect. 3.3.1.

Definition 4

(Equivalent Definition: Plane Wave) A plane wave is a 4-dimensional non-flat Lorentzian manifold (M, g) which admits a 5-parameter group of isometries.

Note that in the original [30], the definition also involves “Ricci-flat”, but this would only correspond to the purely gravitational plane waves. The other definitions of the plane wave presented here (via quadratic H in Brinkmann coordinates and via the curvature condition of Definition 5) include also electromagnetic plane wave components in general. Also note that we make no assumption about the structure of the symmetry group; in particular, we do not assume it to have the same group structure as that of a plane wave in electromagnetism. Remarkably, such a property appears as a consequence of our existing assumptions. Such symmetries can be viewed as generated by vector fields, and the explicit form of these generators is given in [30, Eq. 2.12], for a wave constructed in such a way that it has a finite wave profile.Footnote 16 Note also that the gravitational plane wave of Definition 4 above is in fact a special case of our pp-wave spacetimes (Definition 3), and corresponds to the “plane wave” mentioned in the comparison to the linear theory. These plane waves are described fully in Sect. 3.3.

We can also define the plane wave in a covariant manner as in [25] as follows, where a “classical pp-wave” is simply a 4-dimensional pp-wave with planar wavefront (see Sect. 3.3):

Definition 5

(Equivalent Definition: Plane Wave) A plane wave is a classical pp-wave defined via a covariantly constant, null vector field Z which additionally satisfies

$$\begin{aligned} \nabla _X R = 0 \quad \forall \quad X \in Z_{\perp }, \end{aligned}$$

where R is the curvature tensor and \(Z_\perp :=\{X \in TM ~|~ g(X,Z) = 0\}\).

We prove the correspondence of such a definition with the other definitions of a plane wave in Sect. 3.3.1. For a full discussion of the properties of such waves, the fact that such a definition actually coincides with the algebraic definition of plane waves and the conceptual difficulties involved (e.g. “to whom is such a gravitational plane wave planar?”), see [30]. For a succinct overview of the connection between the Petrov classification and the definition of the plane wave in terms of its symmetry group, see [2, p. 688].

Note that all our definitions involve at least one lightlike group of motions (symmetry), corresponding to the “propagation” of the wave. There are conditions one may place on a wave such that the wavefront itself is of finite extent (which amount to conditions on the characteristic function H in standard coordinates) and such conditions have relevance to determining the causal character of the wave, as we will see in Sect. 4.4. For a detailed table describing various special cases of gravitational pp-waves and their symmetry properties/Killing vector fields, see [17, p. 79].

The next step in defining the presence of radiation in a spacetime was provided by Trautman, by imposing boundary conditions at infinity in analogy to the Sommerfeld radiation conditions. He showed that in electromagnetism, his conditions restricted one to those solutions of Maxwell’s equations with outgoing radiative fields. Note that as in the case of the Petrov classification, it is the asymptotic behaviour which is used to define the presence of waves. For a review of Trautman’s definition in the context of the development of gravitational wave theory, see [2], and for Penrose’s contribution to the study of asymptotics and their relation to outgoing radiation, see [40].

3 The coordinate description

We have defined a parallel wave as a Lorentzian manifold admitting a covariantly constant, null vector field, and a pp-wave as a parallel wave with flat wavefront. In this section, we first derive the most general form of a Lorentzian metric satisfying these conditions, and then discuss the various simplifications which have been studied in the literature. These simplifications remain exact wavelike solutions to the Einstein equations, but have the benefit of being easier to understand and work with. The simplest and most widely known example we call the “classical pp-wave”, which is discussed in Sect. 3.3.

Notation Our goal is to develop a local coordinate system on a parallel wave of dimension n which we will denote \(\{u,v,\textbf{x}\}\), where \(\textbf{x} = x^1, \ldots ,x^{n-2}\) are the so-called “wavefront coordinates”. This name is justified by examining the definition of a wavefront (Definition 2) in the context of the coordinate description of a parallel wave metric Eq. 23. We will use Greek indices when referring to all coordinates \(\{u,v,\textbf{x}\}\), and Latin indices (other than the letters u and v) when referring to only the wavefront coordinates. For example, the sum \(g_{va}X^{a}\) for some vector field \(X \in \mathfrak {X}(M)\) (the space of vector fields on M) will have \(n-2\) terms (\(a \ne u,v\)), whereas the sum \(g_{v\sigma }X^{\sigma }\) will have n terms. To avoid confusion with the coordinates u and v, we will not use the typical \(\mu \) and \(\nu \) Greek indices in this section, and instead we will favor \(\sigma , \rho , \gamma \). For the Latin indices, we use a, b, c and i, j, k. Additionally, when a coordinate is labelled \(x^i\), we will denote its corresponding coordinate vector field by \(\partial _{x^i} =: \partial _i\).

3.1 General parallel waves and pp-waves

Consider the n-dimensional Lorentzian manifold (M, g). Denote the covariantly constant null vector field on M by Z, that is \(\nabla Z = 0\) and \(g(Z,Z) = 0\) for \(\nabla \) the Levi-Civita connection on (M, g) and Z nontrivial.

Theorem 3.1

(Coordinates adapted to covariantly constant,Footnote 17 null vector field) If a Lorentzian manifold (M, g) admits a covariantly constant, null vector field Z, then in a neighbourhood U of each \(p \in M\) there exists a local coordinate chart \(\varphi = \{u,v,\textbf{x}\}\) on U which is “adapted to Z” such that

$$\begin{aligned} Z|_U = \partial _v = \nabla u. \end{aligned}$$

Proof

The proof can be found in “Appendix A”. \(\square \)

The following proposition outlines the properties of the metric g when it is written in these adapted coordinates.

Proposition 3.2

If a Lorentzian manifold (M, g) admits a covariantly constant, null vector field Z, and \(\{u,v,\textbf{x}\}\) are the local coordinates adapted to Z of Theorem 3.1, then the metric components in this coordinate system have the following properties on the domain of definition of the coordinates:

  1. (i)

    All metric components are independent of v, that is \(\partial _v(g_{\sigma \rho }) = 0\)

  2. (ii)

    \(g_{v\sigma } = \delta _{\sigma }^{u}\)

  3. (iii)

    \((g_{ab})\) forms a positive-definite matrix, and therefore the embedded codimension-2 submanifolds defined by \(u =\text {const}\), \(v = \text {const}\) are Riemannian manifolds.

Proof

  1. (i)

    A covariantly constant vector field Z is in particular a Killing vector field. By definition of a Killing vector field we have \({\mathcal {L}}_Z (g) = 0\), but since \(Z = \partial _v\) we have \(0 = \left[ {\mathcal {L}}_Z (g)\right] _{\sigma \rho } = Z (g_{\sigma \rho }) = \partial _v (g_{\sigma \rho })\).

  2. (ii)

    First note that \(Z^{\sigma } = \delta _v^{\sigma }\) and therefore \(Z_{\sigma } = g_{v\sigma }\). Then since \(Z = \nabla u\), we have \(Z_{\sigma } = du_{\sigma } = \delta _{\sigma }^{u}\). Therefore \(g_{v\sigma } = \delta _{\sigma }^{u}\).

  3. (iii)

    First, the hypersurfaces \(\Sigma _c := u^{-1}(c) = \left\{ q \in U: \varphi (q)=\left( c, v(q), x^1(q),\dots , x^{n-2}(q)\right) \right\} \) are null hypersurfaces since the normal to these surfaces is the null vector field \(\nabla u = Z\). Via the previous point, the normal \(Z = \partial _v\) is orthogonal to \(\partial _i\) for all \(i\in \{1,\dots ,n-2\}\) and to itself, and therefore all these coordinate vectors are tangent to the null hypersurfaces \(\Sigma _c\).

    Via [47, Lemma 28, p. 142] we have that a null hypersurface can contain only one null direction (here, that of \(Z = \partial _v\) itself) and so the remaining coordinate vector fields must be timelike or spacelike. Via point (2) of the same lemma, we have that there are no timelike vectors, and therefore the \(\partial _i\) for all \(i\in \{1,\dots ,n-2\}\) are spacelike. The same reasoning applies to any nonzero linear combination \(X = c^a\partial _a\), which is tangent to \(\Sigma _c\) and not proportional to \(\partial _v\); hence \(g_{ab}c^ac^b > 0\) for all \(c \ne 0\), that is \((g_{ab})\) is positive-definite.

\(\square \)

Using the results of Theorem 3.1 and Proposition 3.2, we can now write the explicit form of the metric g in adapted coordinates for a general parallel wave:

$$\begin{aligned} g = 2 d u d v+g_{u u}\left( u, \textbf{x}\right) d u^{2} + 2g_{a u}\left( u, \textbf{x}\right) d x^{a} d u+g_{a b} \left( u, \textbf{x}\right) d x^{a} d x^{b} \end{aligned}$$
(22)

The functions \(g_{u u}\left( u, \textbf{x}\right) \) and \(g_{a u}\left( u, \textbf{x}\right) \) will be useful for the classification of parallel wave spacetimes, and we will therefore label them \(H\left( u, \textbf{x}\right) \) and \(A_a \left( u, \textbf{x}\right) \) respectively. We then have the metric of a general parallel wave in local adapted coordinates [48,49,50,51],

$$\begin{aligned} \boxed {g = 2 d u d v+H\left( u, \textbf{x}\right) d u^{2} +2A_a\left( u, \textbf{x}\right) d x^{a} d u+g_{a b} \left( u, \textbf{x}\right) d x^{a} d x^{b}.} \end{aligned}$$
(23)

One could also write this metric in matrix notation as

$$\begin{aligned} g=\begin{pmatrix} H &{} 1 &{} A_1 &{} \dots &{} A_{n-2} \\ 1 &{} 0 &{} 0 &{} \dots &{} 0 \\ A_1 &{} 0 &{} &{} &{} \\ \vdots &{} \vdots &{} &{} (g_{ab}) &{} \\ A_{n-2} &{} 0 &{} &{} &{} \\ \end{pmatrix}. \end{aligned}$$
(24)
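
As a quick consistency check on this matrix form, one can write down its determinant and inverse using only the block structure of Eq. 24. The following is a sketch, in which \(g^{ab}\) denotes the inverse of the wavefront block \((g_{ab})\) (a notation we assume here) and the coordinates are ordered as \((u, v, \textbf{x})\):

$$\begin{aligned} \det g = -\det (g_{ab}), \qquad g^{-1}=\begin{pmatrix} 0 & 1 & 0 \\ 1 & -H + g^{ab}A_aA_b & -g^{bc}A_c \\ 0 & -g^{ac}A_c & (g^{ab}) \end{pmatrix}, \end{aligned}$$

where the third row and column stand for blocks of size \(n-2\). Multiplying the two matrices block by block returns the identity. The sign of the determinant is consistent with Lorentzian signature, given that \((g_{ab})\) is positive-definite (Proposition 3.2). In particular \(g^{uu} = 0\) and \(g^{uv} = 1\), so that \(\nabla u = g^{\sigma u}\partial _\sigma = \partial _v\) is null, in agreement with Theorem 3.1.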

In fact, this result can be viewed as a special case of a more general result by [52], which derives this form of a metric admitting a parallel null plane rather than a parallel null vector field. Conceptually the generalisation is simple, as a parallel null r-plane is pointwise a set of r linearly independent vectors, such that the field of planes (replacing the vector field in the above example) is a parallel null r-dimensional section of the tangent bundle TM. In this case, the metric takes a form similar to Eq. 24, though with some individual elements replaced by matrix blocks.

If we then impose the curvature condition Eq. 12 to obtain a pp-wave, as demonstrated in [25, Appendix A] one finds the metric of a general pp-wave in local adapted coordinates

$$\begin{aligned} \boxed {g = 2 d u d v+H\left( u, \textbf{x}\right) d u^{2} +2A_a\left( u, \textbf{x}\right) d x^{a} d u+\delta _{a b} d x^{a} d x^{b}.} \end{aligned}$$
(25)

Note that in the context of pp-waves, these coordinates are sometimes referred to as Brinkmann coordinates due to their original discovery [48] in a primarily mathematical context. The original work does not impose the curvature condition Eq. 12 and thus “Brinkmann coordinates” can refer to either Eq. 23 or implicitly to Eq. 25.

The properties of this general metric and some of the various special cases are discussed in Sect. 4. The remainder of this section focuses on defining these special cases, which are obtained by making additional assumptions on \(H, A_a, g_{ab}\), the topology of the manifold, or the dimension n.

Remark 3.3

Gauge Freedom: The gauge freedoms of the parallel wave and pp-wave metrics have been studied carefully, for example by [3, Sec. 24.5] in the \(n=4\) case, and [50, Sec. 6.1] in the \(n>4\) case. In vacuum regions it is standard to utilize local gauge freedoms to eliminate the cross terms \(dx^adu\), though in certain cases one can “lose” some global information about the nature of the wave source in doing so. Both the process of changing the coordinates to eliminate these terms and extensive detail about which global information is lost in performing such a transformation can be found in [53], and will be discussed again in Sect. 3.4. Upon eliminating these terms, the metric locally takes the form

$$\begin{aligned} g = 2 d u d v+H\left( u, \textbf{x}\right) d u^{2}+g_{a b} \left( u, \textbf{x}\right) d x^{a} d x^{b} \end{aligned}$$
(26)

which one can summarise as

$$\begin{aligned} g = 2 d u d v+H\left( u, \textbf{x}\right) d u^{2}+h(u), \end{aligned}$$
(27)

where h is a u-dependent family of Riemannian metrics on the codimension-2 submanifold \(u =\) const, \(v=\) const. The construction of this form of the metric can be found for the special case of classical pp-waves (Eq. 32 below) in [17, Theorem 4.1.3].

For such a metric, it was shown in [25] that the coordinate changes which leave this form Eq. 26 invariant are

$$\begin{aligned} v&\longrightarrow v^\prime =\frac{1}{a}v + f_1(u, \textbf{x})\nonumber \\ u&\longrightarrow u^\prime = au + b\nonumber \\ \textbf{x}&\longrightarrow \textbf{x}^\prime = \textbf{f}_2(u, \textbf{x}), \end{aligned}$$
(28)

where \(a \ne 0\) and b are constants and \(f_1\), \(\textbf{f}_2\) are smooth functions independent of v on the domain of the coordinate chart. In such coordinates, the metric would retain its form

$$\begin{aligned} g = 2 d u^{\prime } d v^{\prime } + H^{\prime }\left( u^{\prime }, \textbf{x}^{\prime }\right) d u^{\prime }{}^{2}+h^{\prime }(u^\prime ). \end{aligned}$$
(29)

The authors showed that this fact may be used to transform to so-called normal Brinkmann coordinates centred at p, in which it holds that \(\varphi (p) = 0 \in {\mathbb {R}}^n\) where \(\varphi \) is the coordinate chart and

$$\begin{aligned} H(u,\textbf{0}) = 0, \quad \frac{\partial H}{\partial x^i}(u,\textbf{0}) = 0 \end{aligned}$$
(30)

for all u in an interval around 0.
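
To illustrate how the freedom in Eq. 28 is used, note that the first condition in Eq. 30 can already be reached by a shift of v alone. The following is a sketch under the simplifying assumptions \(a = 1\), \(b = 0\), \(\textbf{x}^\prime = \textbf{x}\) and \(f_1\) depending only on u; it is not the full construction of [25]. Writing \(v = v^\prime - f_1(u)\) in Eq. 26 gives

$$\begin{aligned} g&= 2 d u d v^{\prime } + \left( H(u,\textbf{x}) - 2\frac{d f_1}{d u}\right) d u^{2} + g_{ab}\left( u, \textbf{x}\right) d x^{a} d x^{b}, \\ f_1(u)&= \frac{1}{2}\int _0^u H(s,\textbf{0})\, ds \quad \Longrightarrow \quad H^{\prime }(u,\textbf{0}) = 0. \end{aligned}$$

The second condition in Eq. 30 is not reached by this shift alone; for it we refer to [25].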

3.2 Standard pp-wave

The class of pp-wave most commonly studied in the physics literature has been referred to by [24, Eq. 2] as a standard pp-wave. The defining characteristics of a standard pp-wave metric when written in the coordinate chart \(\{u,v,\textbf{x}\}\) of Theorem 3.1 are:

  1. (i)

    The coordinates \(\{u,v,\textbf{x}\}\) exist globally

  2. (ii)

    The metric is written with no cross terms \(dx^a du\), that is \(A_a = 0\) for all a.

and thus our metric takes the form

$$\begin{aligned} g = 2 d u d v+H\left( u, \textbf{x}\right) d u^{2} +\delta _{a b}d x^{a} d x^{b}. \end{aligned}$$
(31)

One can see that in coordinates, the codimension-2 submanifold defined by \(u =\) const, \(v =\) const corresponds precisely to the wavefront of Definition 2. Unsurprisingly, for this n-dimensional standard pp-wave, the wavefront (or “transverse space”) is simply Euclidean \({\mathbb {R}}^{n-2}\).

By assuming the coordinates u and v exist globally, we are making assumptions on the properties of the spacetime manifold M. That M is simply connected is a sufficient condition for the coordinate u being global (since then the construction involving the Poincaré lemma would hold globally), but it is not a necessary condition (for example the (N, h)p-waves of Sect. 3.5 with any non-simply connected N still admit a global u). In the case of the v coordinate, one expects that the integral curves of Z should be complete and non-closed.Footnote 18 Typically physical research involving pp-wave spacetimes begins with the assumption of a Lorentzian manifold \((M = {\mathbb {R}}^n,g)\) with a metric of the form above.

3.3 Classical pp-waves

These are the pp-waves for which the wavefront is two-dimensional Euclidean space, that is they are standard pp-waves on \({\mathbb {R}}^4\) such that the metric takes the form

$$\begin{aligned} g = 2 d u d v+ H(u, x,y) d u^{2}+ dx^2 + dy^2, \end{aligned}$$
(32)

where the usual adapted coordinates on the wavefront \((x^1,x^2)\) have been relabelledFootnote 19 to (x, y). This metric is the most widely-known and well-studied pp-wave metric, due to its relevance to physics, and its simplicity while still exhibiting the key features of a pp-wave. The most important types of classical waves are the plane waves, whose properties will be discussed extensively in Sects. 4.4 and 5.

3.3.1 Plane waves

A plane wave is a classical pp-wave for which the characteristic function H(u, x, y) is quadraticFootnote 20 in (x, y), i.e. the metric of Eq. 32 wherein

$$\begin{aligned} H(u,x,y) = \sum _{i, j=1}^{2} h_{i j}(u) x^{i} x^{j} \end{aligned}$$
(33)

for a symmetric \(2\times 2\) and u-dependent matrix \(h_{ij}(u)\). The vacuum Einstein equations imply [11] that \(h_{ij}\) should be trace-free, which means we can write

$$\begin{aligned} (h_{ij})(u) = \begin{pmatrix} f_+(u) &{} f_\times (u)\\ f_\times (u) &{} -f_+ (u) \end{pmatrix}. \end{aligned}$$
(34)
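
The trace-free condition can be made plausible with a short computation. As a sketch, we use the standard expression for the only nonvanishing Ricci component of the classical pp-wave metric Eq. 32, which is quoted here rather than derived:

$$\begin{aligned} R_{uu} = -\frac{1}{2}\left( H_{xx} + H_{yy}\right) = -\left( h_{11}(u) + h_{22}(u)\right) , \end{aligned}$$

where the second equality uses Eq. 33. The vacuum condition \(R_{uu} = 0\) is then \(h_{11} + h_{22} = 0\), which is the trace-free form of Eq. 34.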

Had we wanted to describe a purely electromagnetic wave rather than a gravitational wave, we would instead have \((h_{ij}) = \text {diag}(f(u),f(u))\) for some arbitrary smooth f. A sandwich wave is obtained when the support of the profile functions is compact; for details see [54, Eq. 2.1] and [4, Sec. 17.4]. Note that the presence of two functions necessary to describe the wave, as in the linear regime, means that the gravitational wave described by such a metric possesses two linearly independent polarization states. We have used the same subscripts as on the coefficients \(C_{{\mu \nu }}\), as the \(f_+\) and \(f_\times \) functions again describe the components of the wave in each polarisation state. If we had not imposed the vacuum condition, the plane wave would instead have described a coupled system of both gravitational and electromagnetic plane waves. Such plane waves were originally studied in [12] and then by [55].

Let us now examine the effect of these polarisation states as in [11, p. 94], where we skip some steps due to the similarity with the analysis of the linear regime. For a plane wave, the geodesic equation for u is simply \(\ddot{u} = 0\), that for v is

$$\begin{aligned} \ddot{v}=\frac{1}{2} \left( f_{+}^{\prime }(u)(x^2 - y^2) +2 f_{\times }^{\prime }(u) xy\right) \dot{u}^{2} +\big (f_{+}(u)(x \dot{x}-y \dot{y})+f_{\times }(u) \left( x \dot{y}+y \dot{x}\right) \big ) \dot{u} \end{aligned}$$
(35)

and for x and y we have

$$\begin{aligned} \begin{pmatrix} \ddot{x} \\ \ddot{y} \end{pmatrix} = \frac{1}{2} \begin{pmatrix} f_+(u) &{} f_\times (u)\\ f_\times (u) &{} -f_+ (u) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \end{aligned}$$
(36)

Since \(\ddot{u} = 0\), we have that \(u(s) = as + b\) for curve parameter s and \(a,b \in {\mathbb {R}}\). Therefore as the affine parameterisation along a geodesic is only unique up to a transformation of the form \(s \mapsto cs + d\), u itself can be used as an affine parameter and we may take \(u(s) = s\).

For the “\(+\)” mode, we have \(f_\times = 0\), and one finds the geodesic equations reduce to

$$\begin{aligned} \begin{pmatrix} \ddot{x}(s) \\ \ddot{y}(s) \end{pmatrix} = \frac{f_+(s)}{2} \begin{pmatrix} x(s) \\ -y(s) \end{pmatrix}. \end{aligned}$$
(37)

That is, the motion decouples and takes place only in the transverse directions (as expected by analogy with the linear theory). With Eq. 37 as written, where \(f_+(s)\) is positive, \(\ddot{x}\) has the same sign as x while \(\ddot{y}\) has the opposite sign to y, so that there is a defocusing in the x direction and a “focusing” in the y direction. Where \(f_+\) is negative, one sees the converse effect.
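
The behaviour of Eq. 37 can be explored numerically. The following is a minimal sketch, not taken from the source: it assumes a constant profile \(f_+ = f_0 > 0\) chosen purely for illustration, test particles initially at rest, and a hypothetical helper `integrate_plus_mode` built on a classical Runge-Kutta step. For this profile Eq. 37 has the closed-form solution \(x(s) = x_0\cosh (\omega s)\), \(y(s) = y_0\cos (\omega s)\) with \(\omega = \sqrt{f_0/2}\), which the script prints next to the numerical values.

```python
import math


def integrate_plus_mode(f_plus, x0, y0, s_end, steps=2000):
    """Integrate Eq. 37 with classical RK4, starting at rest at s = 0.

    f_plus is a callable s -> f_+(s); the name and signature are
    illustrative assumptions, not part of the source.
    """

    def rhs(s, state):
        x, y, vx, vy = state
        k = 0.5 * f_plus(s)
        # Eq. 37: x'' = (f_+/2) x and y'' = -(f_+/2) y
        return (vx, vy, k * x, -k * y)

    def shifted(state, slope, factor):
        return tuple(a + factor * b for a, b in zip(state, slope))

    h = s_end / steps
    s = 0.0
    state = (x0, y0, 0.0, 0.0)
    for _ in range(steps):
        k1 = rhs(s, state)
        k2 = rhs(s + h / 2, shifted(state, k1, h / 2))
        k3 = rhs(s + h / 2, shifted(state, k2, h / 2))
        k4 = rhs(s + h, shifted(state, k3, h))
        state = tuple(
            a + h / 6 * (p + 2 * q + 2 * r + w)
            for a, p, q, r, w in zip(state, k1, k2, k3, k4)
        )
        s += h
    return state[0], state[1]


f0 = 2.0                       # assumed constant profile value
omega = math.sqrt(f0 / 2.0)
x1, y1 = integrate_plus_mode(lambda s: f0, 1.0, 1.0, 1.0)
print(x1, math.cosh(omega))    # numerical vs closed-form x(1)
print(y1, math.cos(omega))     # numerical vs closed-form y(1)
```

Replacing the lambda by a compactly supported function of s gives a sandwich-wave profile; the closed-form comparison then no longer applies.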

By introducing coordinates (w, z) rotated by \(45^\circ \) relative to (x, y), and taking the “\(\times \)” polarisation mode \(f_+ = 0\), one finds precisely the same equation of motion for the rotated variables

$$\begin{aligned} \begin{pmatrix} \ddot{w}(s) \\ \ddot{z}(s) \end{pmatrix} = \frac{f_\times (s)}{2} \begin{pmatrix} w(s) \\ -z(s) \end{pmatrix}, \end{aligned}$$
(38)

where

$$\begin{aligned} \begin{pmatrix} w \\ z \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 &{} 1\\ -1 &{} 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \end{aligned}$$
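
The step to Eq. 38 is a direct substitution, sketched here for completeness. Setting \(f_+ = 0\) in Eq. 36 gives \(\ddot{x} = \frac{f_\times }{2}\, y\) and \(\ddot{y} = \frac{f_\times }{2}\, x\), and therefore

$$\begin{aligned} \ddot{w}&= \frac{\ddot{x} + \ddot{y}}{\sqrt{2}} = \frac{f_\times }{2}\, \frac{x + y}{\sqrt{2}} = \frac{f_\times }{2}\, w, \\ \ddot{z}&= \frac{-\ddot{x} + \ddot{y}}{\sqrt{2}} = \frac{f_\times }{2}\, \frac{x - y}{\sqrt{2}} = -\frac{f_\times }{2}\, z. \end{aligned}$$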

Thus the two polarization modes have precisely the same effect as in the linearised theory, but now there is no requirement that the separations be “small”. This is in line with the interpretation of the characteristic function H as corresponding to the perturbation \(h_{{\mu \nu }}\) of the linear theory, but without the requirement that it be “small” in some sense.

We now demonstrate that the above expression for the metric of a plane wave (Eq. 33) corresponds to our previous definitions of a plane wave. The correspondence between the dimension of the symmetry group and the form of the line element has already been succinctly and fully described by [17, Table, p. 79], and so we will not reproduce the calculation here. This establishes the connection with Definition 4, and we now illustrate the connection with Definition 5.

Lemma 3.4

The plane wave of Definition 5 corresponds to a classical pp-wave (Eq. 32) for which the characteristic function H in Brinkmann coordinates is quadratic in (x, y). That is, the condition

$$\begin{aligned} \nabla _X R = 0 \quad \forall \quad X \in Z_{\perp }, \end{aligned}$$

where \(Z=\partial _v\) in these coordinates, R is the curvature tensor and \(Z_\perp :=\{X \in TM ~|~ g(X,Z) = 0\}\) is equivalent to \(H_{xxx}=H_{yxx}=H_{xyy}=H_{yyy}=0\) for classical pp-waves.

Proof

First note that \(\partial _x\) and \(\partial _y\) are elements of \(Z_\perp \). Let us begin by examining \(\nabla _{\partial _x}R\) which we assume to be 0, and we will see that this implies \(H_{xxx}=H_{yxx}=0\).

(39)

where we have used that the nonzero Christoffel symbols are given by

$$\begin{aligned} \nabla _{\partial _{x}} \partial _{u}= & {} \nabla _{\partial _{u}} \partial _{x} =\frac{H_{x}}{2} \partial _{v}, \end{aligned}$$
(40)
$$\begin{aligned} \nabla _{\partial _{y}} \partial _{u}= & {} \nabla _{\partial _{u}} \partial _{y} =\frac{H_{y}}{2} \partial _{v}, \end{aligned}$$
(41)
$$\begin{aligned} \nabla _{\partial _{u}} \partial _{u}= & {} \frac{H_{u}}{2} \partial _{v} -\frac{H_{x}}{2} \partial _{x}-\frac{H_{y}}{2} \partial _{y}. \end{aligned}$$
(42)

Thus \(H_{xxx}=H_{yxx}=0\), and the remainder of the proof then follows by considering \((\nabla _{\partial _y}R)(\partial _u, \partial _y,\partial _u)\), from which the result is obtained in precisely the same manner as for \(\partial _x\). The reverse direction of the equivalence then follows from the fact that \(Z_\perp \) is pointwise spanned by \(\partial _x,\partial _y\) and \(\partial _v\), and that \(\partial _v\) is a Killing vector field. \(\square \)
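
For orientation, the computation behind the first half of this proof can be sketched as follows. We assume the convention \(R(X,Y)W = \nabla _X\nabla _Y W - \nabla _Y\nabla _X W - \nabla _{[X,Y]}W\) for the curvature tensor and read \(R(\partial _u,\partial _x,\partial _u)\) as \(R(\partial _u,\partial _x)\partial _u\); with the opposite convention the overall sign flips, which does not affect the conclusion. Using Eqs. 40 to 42 and \(\nabla \partial _v = 0\), one finds

$$\begin{aligned} R(\partial _u,\partial _x)\partial _u&= \nabla _{\partial _u}\nabla _{\partial _x}\partial _u - \nabla _{\partial _x}\nabla _{\partial _u}\partial _u = \frac{1}{2}\left( H_{xx}\,\partial _x + H_{xy}\,\partial _y\right) , \\ (\nabla _{\partial _x}R)(\partial _u,\partial _x,\partial _u)&= \frac{1}{2}\left( H_{xxx}\,\partial _x + H_{xxy}\,\partial _y\right) . \end{aligned}$$

In the second line the correction terms of the covariant derivative vanish because \(\nabla _{\partial _x}\partial _u\) is proportional to \(\partial _v\), \(\nabla _{\partial _x}\partial _x = 0\), and the curvature vanishes whenever \(\partial _v\) is inserted in any slot. Setting the result to zero gives \(H_{xxx} = H_{yxx} = 0\).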

3.4 Gyratonic pp-waves

The gyratonic pp-waves are those pp-waves with nonvanishing \(A_a\), that is the general metric can be written as

$$\begin{aligned} g = 2 d u d v+H\left( u, \textbf{x}\right) d u^{2}+2A_a \left( u, \textbf{x}\right) d x^{a} d u+g_{a b} d x^{a} d x^{b}, \end{aligned}$$
(43)

but note that the gyratonic pp-waves may also be studied with flat wavefront (\(g_{ab} = \delta _{ab}\)) as in [53]. Such pp-waves have been studied extensively, for example in [56], and in [53], wherein work by [57] is used to conclude that in the Ricci-flat case, they correspond to the exterior vacuum field of spinning particles moving with the speed of light. In reference to the off-diagonal terms with coefficients \(A_a\), the authors state:

In vacuum regions it is a standard and common procedure to completely remove these functions by a gauge (coordinate) transformation. However, such a freedom is generally only local and completely ignores the global (topological) properties of the spacetimes. \(\dots \) In particular the possible rotational character of the source of the gravitational waves (its internal spin/helicity) is obscured.

What one finds [53, Sec. 4] is that the physical characteristics one can define in a pp-wave spacetime can be obscured via the local gauge transformations which eliminate the \(A_a\), and in general it may be necessary to keep such terms. Most notably, one should pay close attention to such terms when attempting to define the angular momentum density of pp-waves in an analogous manner to the linearised theory [8]. In the end, such a physical property depends manifestly on the \(A_a\) via the contour integral (see [53, Eq. 33])

$$\begin{aligned} \oint _{C} A_{a} dx^{a}, \end{aligned}$$
(44)

where C is a (not completely arbitrary) contour in the transverse space.

3.5 \(\mathbf {(N,h)}\)p-waves

These spacetimes are a subclass of the parallel waves which roughly correspond to a standard pp-wave with a Riemannian manifold replacing the planar wavefront of a pp-wave. That is, they are the parallel waves for which the following conditions hold:

  1. (i)

    In the adapted coordinates of Theorem 3.1, the metric components of the wavefront \(g_{ab}\) are independent of the coordinate u.

  2. (ii)

    The spacetime decomposes as \(M = {\mathbb {R}}^2 \times N\) where (N, h) is a connected Riemannian manifold.Footnote 21 Note that this implies the coordinates u and v are globally defined.

This amounts to a general parallel wave metric Eq. 23 with the additional constraint that the metric on the transverse space h be independent of u. The name we suggest for such spacetimes is in analogy to the “pp-wave” spacetimes (plane-fronted waves with parallel rays) as here we have a wavefront (N, h) and the rays remain parallel, as they are the integral curves of Z and Z remains, as always, covariantly constant. Such spacetimes have also been called “generalised plane waves” [58] and “PFWs” (plane-fronted waves) [54, 59], but the authors find this suggested naming scheme to be the most transparent and accurate. We may write the (N, h)p-wave metric as

$$\begin{aligned} g = 2 d u d v+H\left( u, \textbf{x}\right) d u^{2} +2A_a\left( u, \textbf{x}\right) d x^{a} d u+h. \end{aligned}$$
(45)

We can write this metric without referencing coordinates on N if we instead consider H as a map \(H:{\mathbb {R}}\rightarrow C^{\infty }(N)\). That is for each u, H is a smooth function on N. Similarly for the mixed terms \(dx^{a}du\), we define \(A :{\mathbb {R}}\rightarrow \Gamma (T^*N)\) where \(\Gamma (T^*N)\) is the space of sections of the cotangent bundle of N. With these redefinitions (unique to this Section) we may write g as

$$\begin{aligned} g = 2 d u d v+H(u) d u^{2}+2 A(u) du + h. \end{aligned}$$
(46)

Such spacetimes have been studied extensively in [54, 59], in which the geodesic completeness, geodesic connectedness and causality have been determined.

3.6 Rosen coordinates of plane waves

The coordinates for plane waves which make manifest the symmetries/Killing vector fields are called Rosen coordinates, after [60]. The transformation between Brinkmann and Rosen coordinates is well-documented, for example see [23, Appendix A] and [22, Sec. 2.8]. We simply present the local form of a plane wave metric in Rosen coordinates, where we use capital letters for the coordinates U and V to emphasize that they are not the same coordinate functions as in Brinkmann coordinates.

$$\begin{aligned} g = 2 d U d V+K_{ij}\left( U\right) dy^idy^j, \end{aligned}$$
(47)

where \(K_{ij}\) is positive-definite on the domain of validity of these coordinates. Note that in such coordinates, the Minkowski metric could be represented as

$$\begin{aligned} \eta = 2 d U d V+ \delta _{ij}dy^idy^j \end{aligned}$$
(48)

which is simply the usual metric written in light-cone coordinates. Rosen coordinates can often exhibit (coordinate) singularities, and are therefore often avoided in favour of Brinkmann coordinates [22, Sec. 2.9].

Generically, the plane wave metric has \(2n - 3\) linearly independent Killing vectors, which in a suitable basis generate the Heisenberg algebra [23, Sec. 2.1]. In Rosen coordinates, half (\(+1\)) of the Killing vector fields are manifest (independent of \(K_{ij}\)) and the remaining symmetries can be obtained in terms of \(K^{ij}\), the inverse of \(K_{ij}\). The Killing vector fields are thus (as in [61, Eq. 2.11])

$$\begin{aligned} e_{+}=\frac{\partial }{\partial V}, \quad e_{i}=\frac{\partial }{\partial y^{i}}, \quad e_{i}^{*}=y^{i} \frac{\partial }{\partial V} -\sum _{j} \int K^{i j}(U) d U \frac{\partial }{\partial y^{j}}. \end{aligned}$$
(49)

These correspond to the defining symmetry of the parallel wave Z and the translations and rotations of the \(y^j\). Note that the \(e_{i}^{*}\) are the usual rotations when we have \(K_{ij} = \delta _{ij}\), that is the Minkowski metric Eq. 48.
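The Killing property of the \(e_{i}^{*}\) can be checked directly. The following is a sketch, in which \(C^{ij}(U)\) denotes any antiderivative of \(K^{ij}(U)\) (a symbol introduced here only for this check), the first term of \(e_{i}^{*}\) is read as \(y^i\partial /\partial V\) in the Rosen coordinates of Eq. 47, and products of one-forms are symmetrised. For fixed i, write \(X = y^i\partial _V - C^{ij}\partial _{y^j}\). Since X has no U-component and \(K_{jk}\) depends only on U,

$$\begin{aligned} \mathcal {L}_X\left( 2dUdV\right)&= 2dU\, d(X^V) = 2dUdy^i,\\ \mathcal {L}_X\left( K_{jk}dy^jdy^k\right)&= 2K_{jk}\, d(X^j)\, dy^k = -2K_{jk}K^{ij}dUdy^k = -2dUdy^i, \end{aligned}$$

so that \(\mathcal {L}_Xg = 0\), as required for a Killing vector field.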

4 Properties

4.1 Vanishing scalar invariants

A well-known property of the pp-wave geometries is that all scalar curvature invariants (scalars constructed from the metric, the Riemann tensor and covariant derivatives of the Riemann tensor) are zero [51, 62] (Footnote 22). Here, we will present a proof that all curvature invariants of the plane waves vanish, and for the case of the general pp-wave, we direct the reader to [62]. There are two approaches to proving this fact: the first explicitly calculates the curvature tensor, and the second shows that each point p in a plane wave spacetime is the fixed point of a homothety, and that any curvature invariant must be 0 at such a point. We will present the second approach here, the proof of which is due to Schmidt [63], where we follow closely the presentation in [22].

Theorem 4.1

All curvature invariants of a plane wave vanish.

Proof

We will proceed via the following series of arguments:

  1. An elementary curvature invariant cannot be invariant under constant rescalings of the metric (called a homothety).

  2. If there exists a coordinate transformation which induces a homothety, then due to the previous point, at the fixed points of the transformation (i.e. points which are invariant under the transformation) any elementary curvature invariant must be 0.

  3. Any point in a plane wave is the fixed point of a homothety.

These statements are proved as follows:

  1. A general curvature invariant of a manifold \((M,g)\) is constructed from the metric and elementary curvature invariants. An elementary curvature invariant is obtained by taking covariant derivatives of the Riemann tensor

    $$\begin{aligned} \nabla _{\mu _{1}} \ldots \nabla _{\mu _{p}} R_{\nu \lambda \rho }{}^{\mu } \end{aligned}$$

    and “tracing out” all free indices with the inverse metric \(g^{\mu \nu }\). The Levi-Civita connection \(\nabla \) is invariant under a constant rescaling of the metric (homothety), which is a conformal transformation whose conformal factor is fixed by a nonzero constant \(\lambda \):

    $$\begin{aligned} g_{\mu \nu }\longrightarrow \tilde{g}_{\mu \nu }= e^{2\lambda }g_{\mu \nu }\end{aligned}$$

    That is, we have a second manifold \((M,\tilde{g })\) conformally related to \((M,g)\) (the homothety is in particular not an isometry). Since \(\nabla \) is invariant under such a transformation, so too is the Riemann tensor. Since the (certainly not invariant) inverse metric is required to make a scalar, the elementary invariants cannot be invariant under such a homothety. Rather, a curvature invariant J will change as

    $$\begin{aligned} J(x) \longrightarrow e^{m\lambda }J(x) \end{aligned}$$

    for all \(x \in M\) and some nonzero integer m which depends on the order of J (number of covariant derivatives); with the convention above m is negative, since each inverse metric contributes a factor \(e^{-2\lambda }\).

  2. Assume there exists a coordinate transformation of the Lorentzian manifold \((M,g)\) which induces a homothety with x a fixed point. Since x is a fixed point of the homothety we have

    $$\begin{aligned} J(x) = e^{m\lambda }J(x) \end{aligned}$$

    differing from above in the equals sign alone. Such an equality can only hold (for nonzero m and constant nonzero \(\lambda \)) if \(J(x) = 0\).

  3. We simply need to construct the coordinate change for plane waves which induces a nontrivial homothety. As in Sect. 3.6, any plane wave metric can be written in the so-called “Rosen coordinates” as

    $$\begin{aligned} g = 2dUdV + g_{ij}(U)dy^idy^j. \end{aligned}$$

    Such a form exhibits obvious translational symmetry in the \(y^j\) and V directions. Due to these symmetries, without loss of generality we can take a general point to be written as \(x = (U_0,0,0)\), which is a fixed point of the coordinate transformation

    $$\begin{aligned} (U,V,y^j) \longrightarrow (U,\lambda ^2 V, \lambda y^j) \end{aligned}$$

    for some constant \(\lambda \notin \{0,\pm 1\}\). Such a coordinate transformation is in fact a homothety, and scales the metric as \(g \longrightarrow \lambda ^2 g\) (so that \(\lambda ^2\) here plays the role of \(e^{2\lambda }\) in step 1). Since we have shown that this is true for general \(U_0\), the result holds for any point \((U,V,\textbf{y})\) of a plane wave.

\(\square \)
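As a concrete instance of the scaling used in step 1, consider the following sketch, in which the Ricci scalar R and the Kretschmann scalar are chosen only as familiar examples. Under \(\tilde{g}_{\mu \nu } = e^{2\lambda }g_{\mu \nu }\) with constant \(\lambda \), the components \(R_{\nu \lambda \rho }{}^{\mu }\) are unchanged, while every lowered index contributes a factor \(e^{2\lambda }\) and every raised index a factor \(e^{-2\lambda }\). Hence

$$\begin{aligned} \tilde{R}&= \tilde{g}^{\nu \rho }\tilde{R}_{\nu \mu \rho }{}^{\mu } = e^{-2\lambda }R,\\ \tilde{R}_{\mu \nu \rho \sigma }\tilde{R}^{\mu \nu \rho \sigma }&= e^{2\lambda }e^{-6\lambda }R_{\mu \nu \rho \sigma }R^{\mu \nu \rho \sigma } = e^{-4\lambda }R_{\mu \nu \rho \sigma }R^{\mu \nu \rho \sigma }. \end{aligned}$$

Only the fact that the exponent is nonzero enters the argument of step 2.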

For further details of all classes of spacetimes in which the curvature invariants identically vanish, see [62].
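The homothety of step 3 can also be checked numerically. The following is a minimal sketch in pure Python for \(n = 4\), in which the profile `example_K`, the sample point and the value of `lam` are hypothetical choices made only for illustration. It pulls the Rosen metric back along \((U,V,y^j) \mapsto (U,\lambda ^2V,\lambda y^j)\) using the (diagonal) Jacobian and compares the result with \(\lambda ^2 g\).

```python
import math

def rosen_metric(U, K):
    """Components of g = 2 dU dV + K_ij(U) dy^i dy^j in the basis (U, V, y1, y2)."""
    k = K(U)
    return [
        [0.0, 1.0, 0.0, 0.0],
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, k[0][0], k[0][1]],
        [0.0, 0.0, k[1][0], k[1][1]],
    ]

def pullback_under_scaling(point, lam, K):
    """Pull back g along (U, V, y) -> (U, lam^2 V, lam y), evaluated at `point`."""
    U = point[0]                     # the scaling leaves U unchanged,
    g_image = rosen_metric(U, K)     # and g depends on U only
    jac = [1.0, lam**2, lam, lam]    # diagonal Jacobian of the scaling
    return [[jac[a] * g_image[a][b] * jac[b] for b in range(4)] for a in range(4)]

def example_K(U):
    # Hypothetical symmetric, positive-definite profile (illustration only).
    off = 0.5 * math.sin(U)
    return [[1.0 + U**2, off], [off, 2.0 + math.cos(U)]]

point = (0.7, -1.3, 0.4, 2.0)
lam = 3.0
g = rosen_metric(point[0], example_K)
pulled = pullback_under_scaling(point, lam, example_K)
max_dev = max(abs(pulled[a][b] - lam**2 * g[a][b])
              for a in range(4) for b in range(4))
print("max |pullback - lam^2 g| =", max_dev)
```

The check works at any sample point because the metric components depend on U alone, which the scaling does not change.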

4.2 pp-waves via their wavefronts

As mentioned in Definition 2 above, a distinguishing feature of a null vector field Z is, of course, that it lies in its own orthogonal complement, \(Z_{\perp }\), leading to the wavefront \(Z_{\perp }/Z\), a vector bundle whose elements are equivalence classes “[X]” of vector fields X orthogonal to Z. Because such vector fields, when not proportional to Z, are necessarily spacelike (see, e.g., [47, Lemma 28, p. 142]), \(Z_{\perp }/Z\) will inherit a (positive-definite) inner product from the Lorentzian metric g. It turns out that when Z is also parallel, as it is for a pp-wave, then \(Z_{\perp }/Z\) will also inherit a well defined linear connection, and this can be used to give an alternative, and very geometric, definition of a pp-wave. This alternative formulation of a pp-wave, which we now provide, is well known; see, e.g., [64, 24, Proposition 3]. In the following, \(\Gamma (E)\) represents the space of sections of the vector bundle E.

Theorem 4.2

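Let \((M,g)\) be a Lorentzian manifold and Z a null, parallel vector field defined in an open subset \({\mathcal {U} \subseteq M}\), with orthogonal complement \(Z_{\perp } \subset T\mathcal {U}\). Then the wavefront \(Z_{\perp }/Z\) admits a positive-definite inner product \(\bar{g}\),

$$\begin{aligned} \bar{g}([X],[Y]) :=g(X,Y)\quad \text {for all} \quad [X], [Y] \in \Gamma (Z_{\perp }/Z), \end{aligned}$$

and a corresponding linear connection \(\overline{\nabla }\),

$$\begin{aligned} \overline{\nabla }_{V}[X] :=[\nabla _{V}X]\quad \text {for all} \quad V \in \Gamma (T\mathcal {U}),\ [X] \in \Gamma (Z_{\perp }/Z). \end{aligned}$$

This connection is flat if and only if \((\mathcal {U},g|_{\mathcal {U}})\) is a pp-wave.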

Proof

The metric \(\bar{g}\) will be well defined, and positive definite, whenever Z is null; indeed, every \(X \in \Gamma (Z_{\perp })\) not proportional to Z is necessarily spacelike, so that \(\bar{g}\) is nondegenerate (and positive-definite), and if \([X] = [X']\) and \([Y] =[Y']\), so that \(X' = X +fZ\) and \(Y' = Y+kZ\) for some smooth functions f, k, then

$$\begin{aligned} \bar{g}([X'],[Y']) = g(X',Y') = g(X,Y) = \bar{g}([X],[Y]). \end{aligned}$$

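On the other hand, the connection \(\overline{\nabla }\) requires Z to be parallel or else it is not well defined: \(\nabla _{{V}}{{Y}} \in \Gamma (Z_{\perp })\) if and only if Z is parallel, in which case

$$\begin{aligned} \overline{\nabla }_{V}[X+fZ] = [\nabla _{V}X + V(f)Z + f\nabla _{V}Z] = [\nabla _{V}X] = \overline{\nabla }_{V}[X]. \end{aligned}$$

That \(\overline{\nabla }\) is indeed a linear connection follows easily. Now, if this connection is flat, then by definition its curvature endomorphism, which is the mapping

$$\begin{aligned} \overline{\text {R}} :\Gamma (T\mathcal {U}) \times \Gamma (T\mathcal {U}) \times \Gamma (Z_{\perp }/Z) \longrightarrow \Gamma (Z_{\perp }/Z) \end{aligned}$$

whose action is given by

$$\begin{aligned} \overline{\text {R}}(V,W)[X] :=\overline{\nabla }_{V}\overline{\nabla }_{W}[X] - \overline{\nabla }_{W}\overline{\nabla }_{V}[X] - \overline{\nabla }_{[V,W]}[X], \end{aligned}$$

will vanish, for any section \([X] \in \Gamma (Z_{\perp }/Z)\) and vector fields \(V,W \in \Gamma (T\mathcal {U})\). Using the metric \(\bar{g}\), this flatness condition is equivalent to

$$\begin{aligned} \bar{g}(\overline{\text {R}}(V,W)[X],[Y]) = 0\quad \text {for all} \quad [X],[Y] \in \Gamma (Z_{\perp }/Z),\ V,W \in \Gamma (T\mathcal {U}). \end{aligned}$$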

But if we unpack the definitions of \(\overline{\nabla }\) and \(\bar{g}\), we see that

$$\begin{aligned} \bar{g}(\overline{\text {R}}(V,W)[X],[Y]) = \text {Rm}(V,W,X,Y) = \text {Rm}(X,Y,V,W). \end{aligned}$$
(50)

It follows that \(\overline{\text {R}} = 0\) if and only if \(R(X,Y)V = 0\) for all \(X,Y \in \Gamma (Z_{\perp })\) and \(V \in \Gamma (T\mathcal {U})\); by (12) and Definition 3, this is precisely the condition to be a pp-wave. \(\square \)

4.3 Penrose limits

We now outline the importance and prove the existence of the famous “Penrose limit”, which assigns a plane wave metric Eq. 33 as a limit of any spacetime \((M,g)\) in a neighbourhood of a null geodesic \(\gamma \). This is not a property of the parallel wave metrics, but rather a remarkable feature of all spacetimes. This fact was originally demonstrated by Penrose in 1976 [65], where he described the limiting procedure as a null analogue of the procedure by which one obtains the tangent space (that is, “zooming in” on a small neighbourhood and scaling those neighbourhoods up in a complementary manner). It is worth pointing out that applications of Penrose’s limit in physics continue to the present day, particularly in higher dimensions and in relation to string theory and the AdS/CFT correspondence; see, e.g., [66,67,68] and the references therein.

We adopt a different notation to that of Penrose’s work to be consistent with the majority of modern literature regarding pp-waves, and in particular Theorem 3.1 of this article. We take inspiration from the discussion of [69], who is consistent in explicitly writing the appropriate pullbacks which appear only implicitly in the original work [65].

Theorem 4.3

Consider an n-dimensional Lorentzian manifold \((M,g)\). In a neighborhood of a point on any conjugate-point free portion \(\gamma '\) of a null geodesic \(\gamma \), one can write the metric g in the so-called “null coordinates” as

$$\begin{aligned} g = 2 d u d v+H d u^{2}+2A_a d x^{a} d u+g_{a b} d x^{a} d x^{b}, \end{aligned}$$
(51)

where H, \(A_a\) and \(g_{ab}\) (with \(a,b \in \{1,\dots ,n-2\}\)) are smooth functions of the coordinates and \((g_{ab})\) is a positive-definite matrix, i.e. a family of Riemannian metrics on the \((n-2)\)-dimensional embedded submanifolds defined by \(u=\text {const}, v=\text {const}\). One could represent this metric in matrix notation as

$$\begin{aligned} g= \begin{pmatrix} H &{} 1 &{} A_1 &{} \dots &{} A_{n-2} \\ 1 &{} 0 &{} 0 &{} \dots &{} 0 \\ A_1 &{} 0 &{} &{} &{} \\ \vdots &{} \vdots &{} &{} (g_{ab}) &{} \\ A_{n-2} &{} 0 &{} &{} &{} \end{pmatrix}. \end{aligned}$$
(52)

Note also that in these coordinates, \(\gamma \) is represented by the integral curve of \(\partial /\partial v\) which passes through the origin.

Proof

First, define a vector field Z (suggestively labelled in analogy to Theorem 3.1) such that along \(\gamma \) we have \(Z = \dot{\gamma '}\). Now we construct the coordinate u. The partial differential equation

$$\begin{aligned} g(\text {grad}(u),\text {grad}(u)) = 0 \end{aligned}$$

with boundary condition \(\text {grad}(u) = Z\) on \(\gamma '\) is a Hamilton-Jacobi equation for u which always admits local solutions (see [70, 585-588]).

As is suggested by the similarity of the result, we take inspiration from the proof of Theorem 3.1, noting that we no longer assume that Z be covariantly constant. The necessary adjustment to the proof is as follows: That Z is nonzero in a neighbourhood of \(\gamma '\) holds again by the fact that it is null, but also by the fact that \(\gamma '\) is geodesic, that is \(\nabla _{\dot{\gamma '}} \dot{\gamma '} = 0\). Thus the remainder of step 1 remains valid, and we may construct a coordinate system \(\{\tilde{x}^0,v,\tilde{x}^1,\ldots ,\tilde{x}^{n-2}\}\) with \(\text {grad}(u)=Z=\tilde{\partial }_v\) via the straightening theorem. Step 2 is not necessary in this context, as u has already been introduced by the above argument. Step 3 follows as before, yielding a coordinate system \(\{u,v,\textbf{x}\} :=\{u,v,x^1,\dots ,x^{n-2}\}\) on an open set \(U \subset M \) containing \(\gamma '\). The form of the metric and the positive-definiteness of \((g_{ab})\) then follow from parts (ii) and (iii) of Proposition 3.2. \(\square \)

Note that an alternative and succinct version of this proof was provided by [22, Sec. 4.3], but the reader should note that their “U, V” is our “v, u”.

We now describe the limiting procedure by which one can “zoom in” on a null geodesic (called the Penrose limit) while simultaneously scaling up the metric, in a manner analogous to obtaining the tangent space of a Riemannian manifold. The primary difference however is that in the Riemannian case, the space obtained via this procedure is a flat space, whereas in the Penrose limit we will obtain an intrinsically curved space, which will turn out to be the plane wave Eq. 33 written in the Rosen coordinates of Sect. 3.6.

4.3.1 Limiting procedure: Penrose’s construction

This section follows Penrose’s original construction [65] but is presented in a more modern language, in a self-contained manner using the proofs of Sect. 3, and explicitly generalised to arbitrary dimension. The procedure by which we will define the Penrose limit of a spacetime will be (schematically) as follows:

  1. Take a spacetime \((M,g)\) and write the metric in null coordinates in a neighbourhood of a null geodesic \(\gamma \).

  2. Define a new coordinate system whose coordinate functions are those of the null coordinates divided by powers of a parameter \(\Omega \) (which we will let go to 0 later, causing those coordinates to “blow up”) and write g in these coordinates.

  3. Define another metric h on M, conformal to g with constant factor, \(h = \Omega ^{-2} g\).

  4. Show that in the limit \(\Omega \rightarrow 0\), h (that is, \(\Omega ^{-2} g\)) is simply the metric of a plane wave. This is the “Penrose limit” of \((M,g)\) in a neighbourhood of \(\gamma \), and importantly, the construction was independent of the properties of the spacetime metric g. That is, all spacetimes look like a plane wave when we simultaneously scale up the coordinates and scale up the metric near a null geodesic \(\gamma \), which amounts to “zooming in” on \(\gamma \), or equivalently, blowing up a neighbourhood of \(\gamma \) to cover the whole spacetime.

To understand the complementary scaling of the coordinates and the metric, Penrose interprets this procedure in two steps. First, one scales up the coordinates to “blow up” the points of interest (just as one does when looking at the tangent space of any point). Then, because a general curvature tensor will appear to blow up as the coordinates do, one simultaneously scales up the metric, which scales down the curvature tensor and yields finite results. Physically, Penrose interprets this procedure as boosting an observer closer and closer to the speed of light, together with a complementary re-calibration of their clocks so as to keep the affine parameter along the null geodesic \(\gamma \) (the coordinate v in our notation, which is left unscaled below) invariant under the procedure. For details see the original work [65], and [22, Sec. 4.4] for a more modern description.

We now begin the explicit construction. Consider an n-dimensional Lorentzian manifold \((M,g)\) and an open set \(U \subset M\) (containing a conjugate-point free segment of a null geodesic \(\gamma \)) on which the null coordinates Eq. 52 are defined, and label this null coordinate chart \(\psi \). Then consider the map \(\phi _\Omega :=\varphi _{\Omega } \circ \psi : U \rightarrow {\mathbb {R}}^n\) where

$$\begin{aligned} \begin{array}{lcl} \varphi _\Omega &{} : &{} {\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\\ \\ &{} : &{}\left( u,v, x^{1}, \ldots , x^{n-2}\right) \mapsto \underbrace{\left( \frac{u}{\Omega ^{2}}, v, \frac{x^{1}}{\Omega }, \ldots , \frac{x^{n-2}}{\Omega } \right) }_{=\left( \tilde{u}, \tilde{v}, \tilde{x}^{1}, \ldots , \tilde{x}^{n-2}\right) } \end{array} \end{aligned}$$

for \(\Omega > 0\) a constant. The map \(\phi _\Omega \) is then a diffeomorphism onto its image \(\phi _\Omega (U) \subset {\mathbb {R}}^n\). Define a metric h (Footnote 23) on \(\phi _\Omega (U)\) whose representation in the tilde coordinates is

$$\begin{aligned} h= \begin{pmatrix} \Omega ^{2} \tilde{H} &{} 1 &{} \Omega \tilde{A}_1 &{} \dots &{} \Omega \tilde{A}_{n-2} \\ 1 &{} 0 &{} 0 &{} \dots &{} 0 \\ \Omega \tilde{A}_1 &{} 0 &{} &{} &{} \\ \vdots &{} \vdots &{} &{}(\tilde{g}_{ab}) &{} \\ \Omega \tilde{A}_{n-2} &{} 0 &{} &{} &{} \end{pmatrix}, \end{aligned}$$
(53)

where \(\tilde{H}\), the \(\tilde{A}_a\) and the \(\tilde{g}_{ab}\) are implicitly functions of all the tilde coordinates defined (strategically) in the following manner

$$\begin{aligned} \tilde{H}&: = H(\Omega ^2 \tilde{u}, \tilde{v}, \Omega \tilde{x}^1, \dots ,\Omega \tilde{x}^{n-2}) = H(u,v,x^1,\dots ,x^{n-2}),\nonumber \\ \tilde{A}_a&:= A_a(\Omega ^2 \tilde{u}, \tilde{v}, \Omega \tilde{x}^1, \dots ,\Omega \tilde{x}^{n-2}) = A_a(u,v,x^1,\dots ,x^{n-2}),\nonumber \\ \tilde{g}_{ab}&:= g_{ab}(\Omega ^2 \tilde{u}, \tilde{v}, \Omega \tilde{x}^1, \dots ,\Omega \tilde{x}^{n-2}) = g_{ab} (u,v,x^1,\dots ,x^{n-2}). \end{aligned}$$
(54)

The metric h is conformal to \((\phi _\Omega ^{-1})^*g\), which can be seen as follows: First, by definition of the tilde coordinate system and Eq. 54, we relate the components of g and h as:

$$\begin{aligned}{} & {} g_{uv} ~ du \otimes dv = du \otimes dv = \Omega ^2 d\tilde{u} \otimes d\tilde{v} = \Omega ^2 h_{\tilde{u}\tilde{v}}~d\tilde{u} \otimes d\tilde{v},\\{} & {} g_{uu} ~ du \otimes du = H du \otimes du = \Omega ^{4} H d\tilde{u} \otimes d\tilde{u} = \Omega ^{2} h_{\tilde{u}\tilde{u}} ~ d\tilde{u} \otimes d\tilde{u}, \end{aligned}$$

and one obtains a similar relationship for the remaining components:

$$\begin{aligned} g_{\rho \sigma } ~ dx^\rho \otimes dx^\sigma = \Omega ^2 h_{\rho \sigma } ~d\tilde{x}^\rho \otimes d\tilde{x}^\sigma . \end{aligned}$$
(55)

Second, since \(\phi _\Omega \) is a change of coordinates, it holds that

$$\begin{aligned} g_{\rho \sigma }~dx^\rho \otimes dx^\sigma =((\phi _\Omega ^{-1})^*g)_{\rho \sigma }~d\tilde{x}^\rho \otimes d\tilde{x}^\sigma \end{aligned}$$
(56)

and thus

$$\begin{aligned} ((\phi _\Omega ^{-1})^*g)_{\rho \sigma }~d\tilde{x}^\rho \otimes d\tilde{x}^\sigma = \Omega ^2 h_{\rho \sigma } ~d\tilde{x}^\rho \otimes d\tilde{x}^\sigma , \end{aligned}$$
(57)

that is, h and \((\phi _\Omega ^{-1})^*g\) are homothetic (conformal with constant conformal factor) as

$$\begin{aligned} h = \frac{1}{\Omega ^2}(\phi _\Omega ^{-1})^*g. \end{aligned}$$
(58)
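As a check of the remaining components summarised in Eq. 55, the mixed and transverse terms can be written out in the same way (a short worked step, using only the definitions in Eqs. 53 and 54):

$$\begin{aligned} g_{ua} ~ du \otimes dx^a&= A_a ~ du \otimes dx^a = \Omega ^{3} \tilde{A}_a ~ d\tilde{u} \otimes d\tilde{x}^a = \Omega ^{2} h_{\tilde{u}a} ~ d\tilde{u} \otimes d\tilde{x}^a,\\ g_{ab} ~ dx^a \otimes dx^b&= \Omega ^{2} \tilde{g}_{ab} ~ d\tilde{x}^a \otimes d\tilde{x}^b = \Omega ^{2} h_{ab} ~ d\tilde{x}^a \otimes d\tilde{x}^b. \end{aligned}$$

Every component thus carries the same overall factor \(\Omega ^2\), which is why the residual powers of \(\Omega \) in Eq. 53 sit only on \(\tilde{H}\) and \(\tilde{A}_a\).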

We now actually take the Penrose limit of \((M,g,\gamma )\), which is a neighbourhood of \(\gamma \) in the spacetime formed by M equipped with the metric

$$\begin{aligned} \lim _{\Omega \rightarrow 0} \frac{1}{\Omega ^2}(\phi _\Omega ^{-1})^*g =\lim _{\Omega \rightarrow 0}h. \end{aligned}$$
(59)

In this limit in the tilde coordinates, h reduces to

$$\begin{aligned} \lim _{\Omega \rightarrow 0}h= \begin{pmatrix} 0 &{} 1 &{} 0 &{} \dots &{} 0 \\ 1 &{} 0 &{} 0 &{} \dots &{} 0 \\ 0 &{} 0 &{} &{} &{} \\ \vdots &{} \vdots &{} &{} (\tilde{g}_{ab}) &{} \\ 0 &{} 0 &{} &{} &{} \end{pmatrix}, \end{aligned}$$
(60)

where \((\tilde{g}_{ab})\) is now a function of \(\tilde{v} = v\) only as \(\tilde{g}_{ab} = g_{ab}(0,\tilde{v}, 0,\dots ,0)\). This is precisely the Rosen coordinate representation of the plane wave metric Eq. 47 (under an appropriate relabelling/reordering of the coordinates).
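The convergence of h to this limit can be illustrated numerically. The following is a minimal sketch in pure Python for \(n = 4\); the functions `H`, `A` and `g_ab`, as well as the sample point, are hypothetical choices made only for illustration (any smooth functions with \(g_{ab}\) positive-definite near the geodesic would do). It builds the matrix of Eq. 53 from Eq. 54 at a fixed point in the tilde coordinates and compares it with the matrix of Eq. 60 for decreasing \(\Omega \).

```python
import math

# Hypothetical metric functions in null coordinates (u, v, x1, x2).
def H(u, v, x1, x2):
    return math.sin(v) + u * x1 + x2**2

def A(u, v, x1, x2):
    return [v * x2 + u, math.cos(v) * x1 - 1.0]

def g_ab(u, v, x1, x2):
    # Symmetric, and positive-definite near the geodesic u = x1 = x2 = 0.
    off = 0.3 * x1 * x2 + 0.5 * math.sin(v)
    return [[2.0 + v**2 + u, off], [off, 1.0 + math.exp(v) + x1**2]]

def h_matrix(omega, tilde_point):
    """Matrix of h in the tilde coordinates (Eq. 53), basis (u~, v~, x~1, x~2)."""
    ut, vt, x1t, x2t = tilde_point
    args = (omega**2 * ut, vt, omega * x1t, omega * x2t)  # Eq. 54
    Hh, Aa, gg = H(*args), A(*args), g_ab(*args)
    return [
        [omega**2 * Hh, 1.0, omega * Aa[0], omega * Aa[1]],
        [1.0, 0.0, 0.0, 0.0],
        [omega * Aa[0], 0.0, gg[0][0], gg[0][1]],
        [omega * Aa[1], 0.0, gg[1][0], gg[1][1]],
    ]

def limit_matrix(tilde_point):
    """Expected limit (Eq. 60): only g_ab(0, v~, 0, 0) survives."""
    gg = g_ab(0.0, tilde_point[1], 0.0, 0.0)
    return [
        [0.0, 1.0, 0.0, 0.0],
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, gg[0][0], gg[0][1]],
        [0.0, 0.0, gg[1][0], gg[1][1]],
    ]

def max_deviation(omega, tilde_point):
    h, lim = h_matrix(omega, tilde_point), limit_matrix(tilde_point)
    return max(abs(h[a][b] - lim[a][b]) for a in range(4) for b in range(4))

tilde_point = (1.0, 0.5, -2.0, 1.5)
deviations = [max_deviation(om, tilde_point) for om in (1e-1, 1e-2, 1e-3)]
print(deviations)
```

For this choice the largest deviation comes from the \(\Omega \tilde{A}_a\) entries, so it shrinks roughly linearly in \(\Omega \).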

What we have demonstrated is that in an appropriate limit around a null geodesic \(\gamma \), any spacetime approaches a plane wave in a manner analogous to how a Riemannian manifold locally approaches Euclidean space in an appropriate limit. A collection of the Penrose limits of common spacetimes and a comprehensive overview of the properties of Penrose limits has already been established by [22], such as the hereditary properties (those properties of the limit which are inherited from the original spacetime). A covariant description of the limiting procedure is also provided, making significantly clearer the connection between the original metric g and the properties of the resulting plane wave limit, which are encoded in the wave profile H when written in the “Brinkmann coordinates” as in Eq. 33.

We close our discussion of Penrose’s limit with a family of examples. These examples are taken from [68, Eqn. (3.1)], wherein full derivations can be found; here we write down only the resulting plane wave limit itself, restricting our attention to dimension 4. Indeed, for both the Schwarzschild metric and the Friedmann–Lemaître–Robertson–Walker (FLRW) cosmological models, their Penrose plane wave limits take the following form in Brinkmann coordinates

$$\begin{aligned} ds^2 = 2dudv + \sum _{a,b=1}^2 \frac{A_{ab}x^ax^b}{u^2}du^2 + dx^2+dy^2, \end{aligned}$$

where each \(A_{ab}\) is a constant depending on the original metric, and where \(x^1=x\) and \(x^2=y\).

4.4 Causality in parallel waves

We now review some basic results in the causal properties of parallel waves, starting with the well-known “remarkable property of plane waves” proven by Penrose [71] which spurred on much of this research.

4.4.1 A remarkable property of plane waves

Roughly, Penrose showed that a (not necessarily purely gravitational) plane wave exhibits a “focusing property” on the null cones (see Fig. 1), and as a consequence, there exists no Cauchy hypersurface sufficient for the specification of Cauchy data [71]. This is because the past null cone of any event is focused to a single point (anastigmatism) or line (astigmatism), and since a Cauchy hypersurface has the property that it intersects any inextendible causal curve exactly once, it is concluded that this focusing property forces many causal curves to intersect any potential Cauchy hypersurface at least twice. In the following, we maintain consistency with the notation of the original work wherever possible.

To begin, let us first define the relevant objects. As in Sect. 3.3.1 (with a small relabelling), a plane wave is defined as a 4-dimensional standard pp-wave in adapted coordinates \(\{u,v,x^1,x^2\}\) for which the characteristic function \(H(u,x^1,x^2)\) is quadratic in \((x^1,x^2)\), that is the spacetime \((M ={\mathbb {R}}^4,g)\) where

$$\begin{aligned}&g = 2\mathrm{~d} u \mathrm{~d} v + H(u,x^1,x^2)~\textrm{d}u^2 + (\textrm{d} x^1)^{2}+(\textrm{d} x^2)^{2}\\&H(u,x^1,x^2) = \sum _{i, j=1}^{2} h_{i j}(u) x^{i} x^{j} \end{aligned}$$

for some symmetric matrix formed by the \(h_{ij}\). We also define the null cone:

Definition 6

Null cone. The null cone (denoted \(\kappa _3\)) at a point \(Q \in M\) is defined as the set of points lying on all null geodesics through Q.

In [71], Penrose utilises the so-called “sandwich waves”, defined by the condition that the amplitudes \(h_{ij}(u)\) vanish unless \(u \in (a,b)\subset {\mathbb {R}}\). One can visualise such a plane wave as in Fig. 1, in which it becomes clear that a sandwich wave is a plane wave for which the infinite extent in the u direction is removed.

Fig. 1

(Left) The wave profile of a sandwich plane wave, in which the u coordinate range of the “curved” region is (a, b). (Right) The focusing effect of such a wave on the past null cone of a point R in the electromagnetic case. Figures from [71, Figs. 1 and 2]

We now outline the primary result of [71], where some details are omitted and only the main steps of the proof are reproduced.

Theorem 4.4

The past null cone of any point Q in a plane wave \((M,g)\) with compactly supported profile (a “sandwich wave”) is focused to a single point for an electromagnetic sandwich wave, or to a line for a gravitational sandwich wave.

Proof

To begin, choose a point Q in the flat region of M, such that the coordinates of Q are

$$\begin{aligned} u = u_0 < a, \quad v = v_0, \quad x^i = 0, \end{aligned}$$

where a is the lower bound of the interval on which the amplitudes \(h_{ij}(u)\) of the sandwich wave are nonzero. Close to Q, the equation of the null cone \(\kappa _3\) is \((u - u_0)(v - v_0) - x^ix^i = 0\) which can be written

$$\begin{aligned} v = f_{ij}(u) x^ix^j + v_0, \end{aligned}$$
(61)

where \(f_{ij}(u) = (u - u_0)^{-1}\delta _{ij}\) near Q. We now wish to obtain a description of \(\kappa _3\) valid away from Q, that is to find an appropriate \(f_{ij}(u)\). If the surface is to remain null even in the curved regions of M, then one can show that \(f_{ij}\) should be both symmetric and satisfy (Footnote 24)

$$\begin{aligned} \frac{d}{du}f_{ij} + f_{ik}f_{kj} + h_{ij} = 0. \end{aligned}$$
(62)

With “initial condition” Eq. 61 one obtains an \(f_{ij}\) which describes the null cone \(\kappa _3\) even in the curved region of \((M,g)\). This extension is only valid while \(f_{ij}\) is finite, and so we now examine if and when \(f_{ij} \rightarrow \infty \). To do so, consider the trace of the above differential equation, noting that \(h_{ij}\) is trace-free for a vacuum solution and in general \(h_{ii} \ge 0\):

$$\begin{aligned} \frac{d}{du}f_{ii} + \frac{1}{2}f_{ii}f_{jj} = -\frac{1}{2} \left( f_{i k} f_{i k} \delta _{j l} \delta _{j l}-f_{i k} \delta _{i k} f_{j l} \delta _{j l}\right) -h_{i i} \le 0 \end{aligned}$$

via Schwarz’ inequality. Defining \(\rho (u) :=\exp \left( \frac{1}{2} \int _{u_0}^u f_{ii}(\bar{u})d\bar{u}\right) \), so that \(\rho > 0\) and \(\rho '' = \frac{1}{2}\rho \left( f_{ii}' + \frac{1}{2}f_{ii}f_{jj}\right) \), one finds the differential inequality

$$\begin{aligned} \frac{d^2}{du^2} \rho (u) \le 0, \end{aligned}$$
(63)

where the inequality is strict for at least some values of u. Since our choice of \(u_0\) in Q was arbitrary, consider the limit \(u_0\longrightarrow -\infty \). Then from the definition of \(f_{ij}\) near Q, we see that \(f_{ij} = 0 ~\forall ~ u < a\). Then via Eq. 61, we see that \(\kappa _3\) is described by the equation \(v = v_0\), that is the null cone is a null hyperplane in the flat region. When \(f_{ij} = 0\) then in particular \(\rho ' = 0\) in the flat region (prime meaning u-derivative), and therefore by Eq. 63 we have that a \(\rho \) which is positive in the flat region near Q will become 0 for finite u. If \(\rho =0 \) then some component of \(f_{ij}\) must become singular (Footnote 25). Denote the u at which \(f_{ij}\) becomes singular by \(u_1 > a\) (since for \(u \le a\) we have \(f_{ij} \equiv 0\)).

If this singularity occurs outside the curved region, i.e. \(u_1 > b\) then the null cone \(\kappa _3\) encounters a singularity on the “past” side of the sandwich wave. In fact, one needs to consider large and negative \(u_0\) as opposed to the \(-\infty \) limit, but this does not affect the relevant equations here.

Now consider the flat region containing this singularity. In this region Eq. 62 may be written as \(p_{ij}' = \delta _{ij}\) where \(p_{ij}\) is the inverse matrix (Footnote 26) to \(f_{ij}\), i.e. \(p_{ij}f_{jk} = \delta _{ik}\). The solution of this differential equation for \(p_{ij}\) is

$$\begin{aligned} p_{ij}(u) = u\delta _{ij} - q_{ij} \end{aligned}$$

for constant and symmetric \(q_{ij}\) (since f is symmetric). Therefore \(f_{ij}\) has a singularity whenever u is an eigenvalue of \(q_{ij}\). Either these eigenvalues are distinct or they are degenerate, in which case \(q_{ij} = u_1\delta _{ij}\). In this degenerate case, \(p_{ij}\) has the form \((u - u_1)\delta _{ij}\), and \(\kappa _3\) has two vertices, namely Q and the point \(R : = (u_1,v_0,\textbf{0})\). This is because the equation of \(\kappa _3\) reduces to a single point at both Q and R, as in Fig. 1. In fact, that \(\kappa _3\) is focused to a single point (anastigmatic) is specific to the purely electromagnetic case in which \(h_{ij}\) is purely diagonal. For the gravitational case, one finds that \(\kappa _3\) is focused onto a line. Since the arguments used are very similar, we omit this proof here. See [71] for details. \(\square \)
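The focusing can be made concrete in the simplest isotropic case. The following is a minimal sketch in pure Python, assuming \(f_{ij} = f\delta _{ij}\) and a hypothetical profile \(h_{ij} = h_0\delta _{ij}\) with constant \(h_0 > 0\) on (a, b) and zero elsewhere; the values of `a`, `b` and `h0` are illustrative only. Under this assumption Eq. 62 reduces to \(f' + f^2 + h = 0\), and writing \(f = \psi '/\psi \) (an auxiliary function introduced here) turns it into the linear equation \(\psi '' + h\psi = 0\) with \(\psi = 1\), \(\psi ' = 0\) before the wave. The singularity of f is the first zero of \(\psi \), which for this profile is \(u_1 = b + \cot (\sqrt{h_0}(b-a))/\sqrt{h_0}\) whenever \(\sqrt{h_0}(b-a) < \pi /2\).

```python
import math

def first_zero_of_psi(h, u_start, u_end, steps=20000):
    """Integrate psi'' = -h(u) psi by RK4 from psi = 1, psi' = 0 at u_start.

    Returns the first u at which psi reaches zero (where f = psi'/psi
    becomes singular), or None if psi stays positive up to u_end.
    """
    du = (u_end - u_start) / steps
    psi, dpsi = 1.0, 0.0

    def rhs(u, psi, dpsi):
        return dpsi, -h(u) * psi

    for i in range(steps):
        u = u_start + i * du
        k1 = rhs(u, psi, dpsi)
        k2 = rhs(u + du / 2, psi + du / 2 * k1[0], dpsi + du / 2 * k1[1])
        k3 = rhs(u + du / 2, psi + du / 2 * k2[0], dpsi + du / 2 * k2[1])
        k4 = rhs(u + du, psi + du * k3[0], dpsi + du * k3[1])
        new_psi = psi + du / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        new_dpsi = dpsi + du / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if new_psi <= 0.0:
            # Linear interpolation between the last two grid points.
            return u + du * psi / (psi - new_psi)
        psi, dpsi = new_psi, new_dpsi
    return None

# Hypothetical sandwich wave: curved region (a, b) with constant amplitude h0.
a, b, h0 = 0.0, 1.0, 1.0

def sandwich_profile(u):
    return h0 if a < u < b else 0.0

u1_numeric = first_zero_of_psi(sandwich_profile, u_start=-1.0, u_end=3.0)
u1_exact = b + 1.0 / (math.sqrt(h0) * math.tan(math.sqrt(h0) * (b - a)))
print(u1_numeric, u1_exact)
```

In the flat region beyond b one has \(p = 1/f = \psi /\psi ' = u - u_1\), in agreement with \(p_{ij}' = \delta _{ij}\) in the degenerate case \(q_{ij} = u_1\delta _{ij}\).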

To explain why this result shows that plane waves are not globally hyperbolic, consider a candidate for a Cauchy hypersurface. Such a hypersurface would have to intersect the v-line through R. But then some of the other past-oriented lightlike geodesics from R to Q have to be intersected twice. Looking to Fig. 1, a connected spacelike hypersurface such as the proposed Cauchy hypersurface containing Q must initially lie entirely in the past of (drawn as “below” on the diagram) the future null cone of Q. A Cauchy hypersurface can never meet the null line \({\mathcal {R}_1}\), as if it were to do so then it would intersect the null geodesics through Q twice (since they are all focused onto \({\mathcal {R}_1}\)). As a result, the proposed Cauchy hypersurface must “bend downwards” to avoid \({\mathcal {R}_1}\), and can never extend through it while remaining everywhere spacelike, and as in [71]: “Cauchy data on such a hypersurface could thus give no information for specifying amplitudes for a parallel wave which might lie beyond \({\mathcal {R}_1}\)” (Footnote 27).

4.4.2 Generic position on the causal ladder

After Penrose showed that the plane waves are not globally hyperbolic, interest was spurred in discovering the exact position of both the plane waves and pp-waves on the causal ladder. This question has been categorically answered for the plane waves by [72], and then for the \((N,h)\)p-waves by [73]. Note that the causality properties of the more general class of parallel waves do not appear to have been studied. Let us first recall the causal ladder for Lorentzian manifolds:

$$\begin{aligned} \begin{array}{c} \text {Globally hyperbolic}\ (\exists \ \text {a Cauchy surface})\\ \Downarrow \\ \text {Causally simple (pasts and futures are closed + causality)} \\ \Downarrow \\ \text {Causally continuous (``continuity'' of pasts and futures + distinguishing)}\\ \Downarrow \\ \text {Stably causal}\ (\exists \ \text {a global time function})\\ \Downarrow \\ \text {Strongly causal}\ (\not \exists \ \text {closed or ``almost closed'' causal curves})\\ \Downarrow \\ \text {Distinguishing}\ (\not \exists \ \text {points with same pasts and futures})\\ \Downarrow \\ \text {Causal} (\not \exists \ \text {closed causal curves})\\ \Downarrow \\ \text {Chronological}\ (\not \exists \ \text {closed timelike curves})\\ \Downarrow \\ \text {Non--totally vicious}\ (\exists \ \text {points} \ p\in M\ \text {with} \ p\not \ll p) \end{array} \end{aligned}$$

from [74, Sec. 3] and [75]. Note that “stably causal” was first understood as the causality being a stable property under perturbations, but Hawking showed [76] that this is equivalent to the existence of a global time function. Also note that \(x \ll y\) means that x chronologically precedes y, that is there exists a future-directed chronological (timelike) curve from x to y.

To make explicit our conventions, and to align with the conventions of [74], we choose the signature of our spacetimes \((M,g)\) to be \((-,+,\dots ,+)\), i.e., a non-zero vector \(X \in TM\) is

  • timelike \(\iff \) \(g(X,X) < 0\),

  • lightlike \(\iff \) \(g(X,X) = 0\),

  • spacelike \(\iff \) \(g(X,X) > 0\),

and we take the zero vector to be spacelike. We also use “causal” to mean lightlike or timelike when referring to a vector field. Also to remain consistent with [73], when dealing with parallel waves we will fix our time-orientation such that \(\partial _v\) is past-directed. We now examine the causal classification of the parallel waves, starting with the relatively simple result:

Proposition 4.5

All \((N,h)\)p-waves are chronological.

Proof

For a parallel wave defined by a covariantly constant, null vector field Z, in the adapted coordinates of Theorem 3.1, we have \(Z = \nabla u = \partial _v\). For any future-directed causal curve \(\gamma (s) = (u(s),v(s),\textbf{x}(s))\) it holds that

$$\begin{aligned} \dot{u}(s) = g(\dot{\gamma }(s),\partial _v) \ge 0 \end{aligned}$$

where the inequality is strict for \(\gamma (s)\) timelike. Such an inequality prevents the existence of closed timelike curves, and thus the spacetime is chronological. \(\square \)
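The strictness for timelike curves can be made explicit. As a sketch, assume the metric takes the adapted form of Eq. 46 with h Riemannian, and write \(\dot{\gamma } = (\dot{u},\dot{v},\dot{\textbf{x}})\). Then \(g(\dot{\gamma },\partial _v) = \dot{u}\) and

$$\begin{aligned} g(\dot{\gamma },\dot{\gamma }) = 2\dot{u}\dot{v} + H\dot{u}^2 + 2A(\dot{\textbf{x}})\dot{u} + h(\dot{\textbf{x}},\dot{\textbf{x}}). \end{aligned}$$

If \(\dot{u}(s) = 0\) at some s, then \(g(\dot{\gamma },\dot{\gamma }) = h(\dot{\textbf{x}},\dot{\textbf{x}}) \ge 0\) there, so \(\dot{\gamma }(s)\) cannot be timelike. Hence \(\dot{u} > 0\) along every future-directed timelike curve, the coordinate u is strictly increasing along it, and the curve cannot close.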

Being chronological is one of the “lower rungs” of the causal ladder and is therefore a relatively weak restriction. We can however show that a generic \((N,h)\)p-wave lies one step higher on the ladder:

Theorem 4.6

All \((N,h)\)p-waves are causal.

We will prove this theorem below using Proposition 4.7. The proof of this result follows from [72, Scholium 4.11], which we will reproduce here. To do so, we first introduce the concept of a quasi-time function.

Definition 7

Quasi-time function. On a Lorentzian manifold \((M,g)\) a smooth function \(f: M \rightarrow {\mathbb {R}}\) is called a quasi-time function for \((M,g)\) if

  (i) \(\nabla f\) is everywhere nonzero, causal and past-directed, and if

  (ii) every null geodesic segment \(\gamma \) such that \(f \circ \gamma \) is constant, is injective.

Now we may reproduce the aforementioned [72, Scholium 4.11] for completeness, which is stated as:

Proposition 4.7

Any spacetime admitting a quasi-time function is causal.

Proof

Assume f is a quasi-time function as in Definition 7, then due to (i) we have that f is strictly increasing along all future-directed timelike curves in M, and hence (M, g) is chronological. We now prove causality by contradiction.

Assume (M, g) is not causal, then M would contain [72, Scholium 4.10] a non-trivial, smooth, future-directed null geodesic segment \(\tilde{\gamma }:[0,1] \rightarrow M\) with \(\tilde{\gamma }(0)=\tilde{\gamma }(1)\) and \(\tilde{\gamma }^{\prime }(0)=\tilde{\gamma }^{\prime }(1)\).

Furthermore \(\tilde{\gamma }\) may be extended to an inextendible geodesic \(\gamma : {\mathbb {R}}\rightarrow M\) by letting \(\gamma (s)=\tilde{\gamma }(s \bmod 1) .\) Again because of (i) and by continuity of all the relevant properties, f is non-decreasing along \(\gamma \). Since \(f \circ \gamma \) is also periodic with period 1, it must be constant; hence \(f \circ \gamma (s)=\lambda _{0}\) for all \(s \in {\mathbb {R}}\) and some constant \(\lambda _0 \in {\mathbb {R}}\), which would contradict (ii), since \(\gamma (0)=\gamma (1)\) shows that \(\gamma \) is not injective. Thus, (M, g) must be causal. \(\square \)

We now return to the proof of Theorem 4.6, armed with the knowledge of the above proposition.

Proof of Theorem 4.6

All that we require is that any (N, h)p-wave admits a quasi-time function. This is proven in [72, Lemma 4.1] and again is reproduced here. The claim is as follows:

Claim: When an (N, h)p-wave is written in the adapted coordinates of Theorem 3.1, the coordinate function u is a quasi-time function as in Definition 7.

To prove this, note that by definition we have a covariantly constant, null vector field Z such that \(Z = \nabla u =\partial _v\). Thus \(\nabla u\) is causal by definition. Since \(Z=\nabla u\) is nontrivial and covariantly constant, we have that \(\nabla u\) is everywhere nonzero. Furthermore \(\nabla u\) is past-directed since \(\nabla u = \partial _v\) and the time-orientation on (M, g) can be determined by the condition that \(\partial _v\) be past-directed. Therefore point (i) in the definition of a quasi-time function is satisfied.

Next, note that since the restriction of g (Eq. 45) to the null hypersurface \(\Pi _{u_0} : = u^{-1}(u_0)\) for some \(u_0 \in {\mathbb {R}}\) is independent of the characteristic function H and the wavefront is spacelike, the null geodesic segments will be of the form

$$\begin{aligned} \gamma : v\in {\mathbb {R}}\mapsto (u_0,v,\textbf{x}_0) \in \Pi _{u_0}. \end{aligned}$$

Such a map is injective, and thus point (ii) in the definition of a quasi-time function also holds. \(\square \)

4.4.3 Conditions for stronger causal character

We now shift our focus to finding the conditions under which an (N, h)p-wave exhibits stronger causality properties. This was the subject of [74], in which it was shown that the criterion for determining causal character is the spatial asymptotic behaviour of the characteristic function H (when the parallel wave is written in adapted coordinates), and in some cases the completeness of the Riemannian manifold corresponding to the wavefront. A summary of the results of this work [73, Sec. 7] is given in Table 2, where one uses \(-H\) to classify asymptotic behaviour as opposed to H to be consistent with work which will be presented in Sect. 5. A precise definition of the asymptotic behaviour of H follows from:

Definition 8

Subquadratic Growth. We say that \(-H(u,\textbf{x})\) behaves subquadratically at spatial infinity if there exists some \(\textbf{x}_0\in N\) (where N is the wavefront) and continuous functions \(R_1(u), R_2(u)(\ge 0), p(u) < 2\) such that:

$$\begin{aligned} -H(u, \textbf{x}) \le R_{1}(u) d^{p(u)} (\textbf{x}, \textbf{x}_0)+R_{2}(u) \quad \forall ~(u,\textbf{x}) \in {\mathbb {R}}\times N, \end{aligned}$$

where d is the distance canonically associated to the Riemannian metric on N. When \(p(u) \equiv 2\), then we say \(-H(u,\textbf{x})\) behaves (at most) quadratically at spatial infinity.Footnote 28

Table 2 Causal properties of an (N, h)p-wave under certain conditions on the characteristic function H

In light of Table 2 we can identify H being quadratic as critical for the causal behaviour, in the sense that small perturbations either in the superquadratic or in the subquadratic direction may introduce significant qualitative differences in the causal character.

5 The Ehlers–Kundt conjecture

The Ehlers–Kundt conjecture is a statement about the role of gravitational plane waves (Eq. 33) in the mathematical description of gravitational waves. Roughly, it claims that the plane waves act as a mathematical idealisation of gravitational waves, and was originally stated as follows:

“Prove the plane waves to be the only complete pp-waves.”Footnote 29

The conjecture can be stated in a more modern language as follows, where the terms “plane wave” and “classical pp-wave” are defined consistently with the nomenclature of this article (see Table 1):

“Prove the plane waves to be the only geodesically complete, Ricci-flat classical pp-waves.”

The conjecture stems from the idea that gravitational radiation should not arise in a spacetime in which there is no source to create it. If a spacetime is complete and Ricci-flatFootnote 30 but the metric describes a propagating wave, then that wave would be produced independent of any source. Since complete spacetimes are inextendible, that is they are not part of some larger spacetime, we can be sure that we are not just “missing” the part of the spacetime containing a source. If a vacuum spacetime contains a wave but is not complete, it is certainly possible that we are missing the source in our description.

An analogy would be a room with light coming from behind a curtain. In this analogy light is the pp-wave, “vacuum” means we cannot see any lightbulbs (sources), and completeness equates to removing the curtain, so we can see everywhere in the room. If the curtain is present and we see light in the room, it is reasonable to say there must be a source behind the curtain. However it seems impossible that there is light in the room, we can see everywhere, and there is no lightbulb. To translate back to our terminology, it seems it should be impossible that our spacetime contains a wave, is complete, and is also Ricci-flat (Fig. 2).

Fig. 2 An analogy for the Ehlers–Kundt conjecture. Art courtesy of Christopher Martin

Ehlers and Kundt [17] showed that the plane waves are always complete, even in the vacuum case. That is, they correspond to the apparently unphysical case of a lit room with no curtain and no lightbulb. The Ehlers–Kundt conjecture assigns the plane waves the role of mathematical idealisations, and claims that any other pp-wave (Eq. 25) must be incomplete, so that the source which “must have” created the waves is simply not part of our description. This is strongly related to the fact proven by [71], wherein Penrose shows that the plane waves are not globally hyperbolic, as discussed in Sect. 4.4.1.

Spacetimes which are bothFootnote 31 complete and not globally hyperbolic are generally considered unphysical, since the development of the spacetime from arbitrary initial data in the initial value formulation of the Einstein equations is not unique in this case. This construction is outlined in Sect. 4.4.1. The EK-conjecture for gravitational pp-waves can be summarised as “spacetime is complete” \(\iff \) it is a plane wave. However since the \(\impliedby \) direction was already proven by [17], the conjecture in fact only refers to the \(\implies \) direction.

Although there is no known counterexample (i.e. a complete classical pp-wave other than the plane wave), the conjecture remains an open question. Significant progress has been made in addressing it however, and the remainder of this section will outline that progress. To begin, let us formulate the conjecture in more precise mathematical terms, and focus our attention on the classical pp-waves on \(M = {\mathbb {R}}^4\) so that our metric takes the form

$$\begin{aligned} g = 2 d u d v - V(u, x,y) d u^{2}+ dx^2 + dy^2, \end{aligned}$$
(64)

where to be Ricci-flat/vacuum we must have that \(V : = -H\) is harmonic in (x, y). That is, \(V_{xx} + V_{yy}=0\). The Ehlers–Kundt conjecture in this case states: if (M, g) is geodesically complete, then V(u, x, y) must be quadratic in (x, y). We may replace the “complete” in the original statement with “geodesically complete” and study the geodesic equations of (M, g). Upon calculating the geodesic equations, one finds

$$\begin{aligned} \ddot{u}&=0 \end{aligned}$$
(65)
$$\begin{aligned} \ddot{v}&=\frac{\dot{u}}{2} \left( \dot{u} V_{u}(u, x, y)+2 \dot{x} V_{x}(u,x, y) +2 \dot{y} V_{y}(u, x, y)\right) \end{aligned}$$
(66)
$$\begin{aligned} \ddot{x}&=-\frac{\dot{u}^{2}}{2} V_{x}(u, x, y) \end{aligned}$$
(67)
$$\begin{aligned} \ddot{y}&=-\frac{\dot{u}^{2}}{2} V_{y}(u, x, y), \end{aligned}$$
(68)

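where a dot represents the derivative with respect to an affine parameterFootnote 32 t. By Eq. 65, u is an affine function of t determined entirely by the boundary conditions, so for \(\dot{u} \ne 0\) we may use u itself as the parameter; the constant factor \(\tfrac{1}{2}\) in Eqs. 67 and 68 can then be absorbed by a constant rescaling, which does not affect completeness. Since the completeness of v(t) evidently depends only on the completeness of x(t) and y(t), in studying the completeness the geodesic equations reduce to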

$$\begin{aligned} \ddot{x}(u)&= - V_x(u,x,y),\nonumber \\ \ddot{y}(u)&= - V_y(u,x,y). \end{aligned}$$
(69)
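As a sanity check on Eqs. 65–68, one may verify numerically that \(g(\dot{\gamma },\dot{\gamma })\) is conserved along their solutions, as it must be for geodesics of the metric in Eq. 64. The following is a minimal sketch under stated assumptions: the potential \(V = \cos (u)(x^2-y^2)\), the initial data, and the fixed-step Runge–Kutta integrator are all illustrative choices of ours and do not come from the references.

```python
import math

# Hypothetical test potential (our choice), harmonic in (x, y) for every u.
def V(u, x, y):
    return math.cos(u) * (x * x - y * y)

def grad_V(u, x, y):
    """Return the partial derivatives (V_u, V_x, V_y)."""
    c, s = math.cos(u), math.sin(u)
    return (-s * (x * x - y * y), 2.0 * c * x, -2.0 * c * y)

def rhs(state):
    """First-order form of Eqs. 65-68; state = (u, v, x, y, du, dv, dx, dy)."""
    u, v, x, y, du, dv, dx, dy = state
    Vu, Vx, Vy = grad_V(u, x, y)
    ddu = 0.0
    ddv = 0.5 * du * (du * Vu + 2.0 * dx * Vx + 2.0 * dy * Vy)
    ddx = -0.5 * du * du * Vx
    ddy = -0.5 * du * du * Vy
    return (du, dv, dx, dy, ddu, ddv, ddx, ddy)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(s, k, a):
        return tuple(si + a * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, 0.5 * h))
    k3 = rhs(shifted(state, k2, 0.5 * h))
    k4 = rhs(shifted(state, k3, h))
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def norm_sq(state):
    """g(gamma', gamma') for the metric of Eq. 64."""
    u, v, x, y, du, dv, dx, dy = state
    return 2.0 * du * dv - V(u, x, y) * du * du + dx * dx + dy * dy

def max_drift(state, h=1e-3, steps=1000):
    """Largest deviation of g(gamma', gamma') from its initial value."""
    n0 = norm_sq(state)
    drift = 0.0
    for _ in range(steps):
        state = rk4_step(state, h)
        drift = max(drift, abs(norm_sq(state) - n0))
    return drift

# Arbitrary (illustrative) initial data.
initial = (0.0, 0.0, 0.5, -0.3, 1.0, 0.2, 0.1, 0.4)
drift = max_drift(initial)
```

The drift should be at the level of the integrator error only; a systematic drift would signal a sign or factor error in the transcription of Eqs. 65–68.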

These equations can be recast as a Hamiltonian system by defining \(q(u) = (x(u),y(u))\), \(p = \dot{q}\), and \(\nabla \) the Euclidean gradient on \({\mathbb {R}}^2\), such that we have

$$\begin{aligned} \dot{p} = -\nabla V(u,q). \end{aligned}$$
(70)
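To make the dynamical-systems viewpoint concrete, the system in Eq. 70 can be integrated numerically. The sketch below is an illustration only: it assumes the autonomous plane-wave potential \(V = (x^2-y^2)/2\) (our choice, harmonic and quadratic) and uses a velocity-Verlet scheme, comparing against the closed-form solution \(x = \cos u\), \(y = \cosh u\) for the chosen initial data.

```python
import math

def grad_V(x, y):
    # Hypothetical plane-wave potential V = (x^2 - y^2) / 2 (our choice).
    return (x, -y)

def leapfrog(q, p, h, steps):
    """Velocity-Verlet integration of dq/du = p, dp/du = -grad V(q)."""
    x, y = q
    px, py = p
    gx, gy = grad_V(x, y)
    for _ in range(steps):
        # Half kick, full drift, recompute the force, half kick.
        px -= 0.5 * h * gx
        py -= 0.5 * h * gy
        x += h * px
        y += h * py
        gx, gy = grad_V(x, y)
        px -= 0.5 * h * gx
        py -= 0.5 * h * gy
    return (x, y), (px, py)

T = 2.0
h = 1e-3
(q_x, q_y), _ = leapfrog((1.0, 1.0), (0.0, 0.0), h, int(round(T / h)))
# Exact solution of x'' = -x, y'' = y with q(0) = (1, 1), p(0) = (0, 0).
exact = (math.cos(T), math.cosh(T))
```

For this quadratic potential the trajectory grows at most exponentially and exists for all u, in line with the completeness of plane waves.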

In this section we will use only V as opposed to H, in order to maintain the interpretation as the potential of a dynamical system in classical mechanics. The Ehlers–Kundt conjecture can be restated in this language as: Prove that for V(u, x, y) harmonic in (x, y), if the Hamiltonian system \(\dot{p} = -\nabla V(u,q)\) admits global solutions for all initial data, then the u-constant function \(V(u,\cdot )\) is an at most quadratic polynomial in (x, y). As mentioned above, this statement has not been proven in general. Before moving on to examine the special cases in which the conjecture has been proven, beginning with the so-called polynomial EK-conjecture, we pause to mention a beautiful connection this conjecture has with complex dynamics, an observation due to G. Cox (private communication).

5.1 Relation to complex dynamics

In what follows, assume that V is independent of u (“autonomous”), and consider the complex-valued function \(f:{\mathbb {C}} \rightarrow {\mathbb {C}}\) constructed from the partial derivatives \(V_x,V_y\) of V:

$$\begin{aligned} z = x+iy ,\quad f(z) = -V_x(x,y) + i V_y(x,y). \end{aligned}$$
(71)

The Cauchy–Riemann equations are

$$\begin{aligned} -V_{xx} = V_{yy},\quad -V_{xy} = -V_{yx}, \end{aligned}$$

and observe that, while the second equation holds trivially, the first equation is satisfied precisely when V(x, y) is harmonic (this is also the case for \(f(z) = V_y + i V_x\)). It was shown in [78, Corollary 7.4] that, given any entire function f(z) (i.e., a function holomorphic on the entire complex plane \({\mathbb {C}}\)), the complex-valued ODE

$$\begin{aligned} \ddot{z} = f(z) \end{aligned}$$

admits global solutions for all initial data if and only if f(z) is affine linear. If we apply this result to Eq. 71, one finds

$$\begin{aligned} \ddot{x} + i\ddot{y} = \ddot{z} = f(z) = -V_x + iV_y, \end{aligned}$$
(72)

then [78, Corollary 7.4] yields that this system is complete if and only if \(V_{xxx} = V_{yyy} = 0\); i.e., if and only if V is quadratic in (x, y). This is not quite a proof of the EK-conjecture, however, since the pair of real ODEs to which Eq. 72 gives rise is not the usual Hamiltonian system Eq. 70, but rather the following variation of it:

$$\begin{aligned} \ddot{x} = -V_x ,\quad \ddot{y} = V_y. \end{aligned}$$

Indeed, to obtain the usual Hamiltonian ODEs we should have chosen instead the function

$$\begin{aligned} f(z) = -V_x - iV_y. \end{aligned}$$

(See also Eq. 75 in Remark 5.1 below.) Unfortunately, this function is holomorphic if and only if the harmonic function V is linear; indeed, owing to Eq. 71, this choice of f(z) is precisely anti-holomorphic (i.e., its complex-conjugate is holomorphic). We therefore come to the beautiful realization that the EK conjecture is the anti-holomorphic analogue of [78, Corollary 7.4] and, as such, forms a bridge connecting general relativity to complex dynamics. The main ingredient in the proof of [78, Corollary 7.4] is a classification of the complete complex orbits of \(\ddot{z} = f(z)\) which shows that they must be isomorphic to certain Riemann surfaces [78, Proposition 3.2]; it is an intriguing question to see if the complete orbits of Eq. 70, in the case when V is harmonic, can be similarly classified.
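The distinction between the two choices of f(z) can be checked numerically. The following sketch is an illustration under our own assumptions: it takes the harmonic polynomial \(V = x^3 - 3xy^2 = \textrm{Re}(z^3)\) and estimates the Wirtinger derivative \(\partial f/\partial \bar{z} = \tfrac{1}{2}(f_x + i f_y)\) by central differences. This quantity vanishes exactly when the Cauchy–Riemann equations hold.

```python
def V_grad(z):
    """Gradient (V_x, V_y) of the harmonic polynomial V = x^3 - 3 x y^2."""
    x, y = z.real, z.imag
    return (3 * x * x - 3 * y * y, -6 * x * y)

def f_holo(z):
    Vx, Vy = V_grad(z)
    return complex(-Vx, Vy)     # f = -V_x + i V_y, as in Eq. 71

def f_hamiltonian(z):
    Vx, Vy = V_grad(z)
    return complex(-Vx, -Vy)    # f = -V_x - i V_y, as in Eq. 75

def cr_defect(f, z, eps=1e-4):
    """Modulus of the central-difference estimate of df/d(z-bar)."""
    f_x = (f(z + eps) - f(z - eps)) / (2 * eps)
    f_y = (f(z + 1j * eps) - f(z - 1j * eps)) / (2 * eps)
    return abs(f_x + 1j * f_y) / 2

z0 = complex(1.0, 2.0)                       # arbitrary sample point
defect_holo = cr_defect(f_holo, z0)          # close to zero
defect_ham = cr_defect(f_hamiltonian, z0)    # close to 6 |z0|
```

For this V one has \(-V_x + iV_y = -3z^2\), which is holomorphic, while \(-V_x - iV_y = -3\bar{z}^2\) is anti-holomorphic with \(\partial f/\partial \bar{z} = -6\bar{z}\).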

5.2 Polynomial EK-conjecture

In this section we will outline some of the work done by Flores and Sánchez in [77], who studied the EK-conjecture in the case that the potential V is polynomially bounded. We refer to the case when V does not depend on u as the “autonomous case”, that is \(V = V(x,y)\). The u-dependence of V is not restricted by any of the previous discussion, and so it is natural to first consider the autonomous case. To make statements about the completeness of trajectories, the authors make use of confinement properties of the relevant ODEs, and so we begin by developing some intuition for this:

5.2.1 Motivation for proof

As a point of entry into thinking about the Ehlers–Kundt conjecture, consider for a moment the case when V is an autonomous harmonic polynomial that is even in y, namely, \(V(x,-y) = V(x,y)\); e.g.,

$$\begin{aligned} V(x,y) = -x^3+3xy^2\quad \text {and} \quad V(x,y) = -x^4+6x^2y^2-y^4 \end{aligned}$$
(73)

are two such examples. The virtue of this class of harmonic polynomials is that, since the partial derivative \(V_y\) is necessarily odd in y, we must have \(V_y(x,0) = 0\). As a consequence, the ODE

$$\begin{aligned} \ddot{y} = -V_y(x(t),y(t)) \end{aligned}$$

admits the trivial solution \(y(t) = 0\), for which choice the remaining ODE in x takes the form

$$\begin{aligned} \ddot{x} = -V_x(x(t),0). \end{aligned}$$
(74)

Any solution x(t) to Eq. 74 then yields a solution (x(t), 0) of our original two-dimensional ODE, and the advantage to this approach is that Eq. 74 permits a much easier blow-up analysis. Indeed, consider any autonomous harmonic polynomial that is even in y and, like the examples in Eq. 73, has negative leading term in x:Footnote 33

$$\begin{aligned} V(x,0) = - (a_dx^d + a_{d-1}x^{d-1} + \cdots + a_1x + a_0), \quad a_d > 0,\ d \ge 3. \end{aligned}$$

Then, since \(a_d > 0\), we can, by a translation \(x \mapsto x + a\) if necessary (which is an isometry of the standard pp-wave metric), assume that each \(a_i \ge 0\) as well. But now with “every term negative”, it follows easily that the solution x(t) to Eq. 74 satisfying \(x(0) = 1\) and \(\dot{x}(0) = \sqrt{2a_d}\) must be bounded below by the corresponding solution to

$$\begin{aligned} \bar{V}(x) = -a_dx^d , \quad \ddot{x} = -\bar{V}'(x(t)) = da_dx(t)^{d-1}, \end{aligned}$$

since \(V(x,0) \le \bar{V}(x)\) for \(x > 0\). This latter, bounding solution is

$$\begin{aligned} \bar{x}(t) = \frac{b}{(c-t)^{\frac{2}{d-2}}} , \quad b :=\Big [\underbrace{\frac{2}{a_d(d-2)^2}}_{>\,0} \Big ]^{\frac{1}{d-2}} , \quad c :=b^{\frac{d-2}{2}}, \end{aligned}$$

which blows up in finite time. Thus, since |x(t)| is bounded below by a function that blows up in finite time, it follows that the solution (x(t), 0) also blows up in finite time.
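The explicit comparison solution can be checked directly. The sketch below is an illustration of ours: it takes \(a_d = 1\), \(d = 3\) (the leading term of the first example in Eq. 73 restricted to \(y = 0\)), builds \(\bar{x}(t)\) from the formulas for b and c above, and verifies by finite differences that it satisfies \(\ddot{x} = d a_d x^{d-1}\) with \(\bar{x}(0) = 1\) and \(\dot{\bar{x}}(0) = \sqrt{2a_d}\).

```python
import math

def blow_up_solution(a_d, d):
    """Return (x_bar, c): the comparison solution and its blow-up time."""
    b = (2.0 / (a_d * (d - 2) ** 2)) ** (1.0 / (d - 2))
    c = b ** ((d - 2) / 2.0)
    def x_bar(t):
        return b / (c - t) ** (2.0 / (d - 2))
    return x_bar, c

def residual(x_bar, a_d, d, t, h=1e-4):
    """Finite-difference residual of x'' = d * a_d * x^(d-1) at time t."""
    second = (x_bar(t + h) - 2.0 * x_bar(t) + x_bar(t - h)) / (h * h)
    return second - d * a_d * x_bar(t) ** (d - 1)

a_d, d = 1.0, 3                    # illustrative choice: V(x, 0) = -x^3
x_bar, c = blow_up_solution(a_d, d)

# Initial velocity estimated by a central difference around t = 0.
h0 = 1e-5
v0 = (x_bar(h0) - x_bar(-h0)) / (2.0 * h0)
res = residual(x_bar, a_d, d, 0.5)
```

In this case \(b = 2\) and the blow-up time is \(c = \sqrt{2}\), so \(\bar{x}(t) = 2/(\sqrt{2}-t)^2\).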

What made this approach work? It was the property of being even in y that allowed us to find geodesics that stay in a confined region of the xy-plane (namely, the x-axis), which confinement simplified the resulting ODEs to the point where their behavior was dominated by the leading term of just one polynomial. This is an effective means of simplifying the analysis, but, of course, not every harmonic polynomial is even in y. The question remains, therefore, as to whether this technique of “concentrating in a particular region of the plane” can work in general. Indeed it was demonstrated in [77] that this technique does work in full generality, thereby resolving the polynomial case of the Ehlers–Kundt conjecture.

Remark 5.1

In their work on the polynomial case of the EK-conjecture [77] the authors use a complex variable approach, wherein \(z : = x + iy\) takes the place of the vector q and similarly \(\dot{z} = p\). There is a good reason that we should consider the polynomial case in the complex numbers \({\mathbb {C}}\) as opposed to the real numbers. As explained in [77, p. 5], in the autonomous case \(V:{\mathbb {R}}^2 \rightarrow {\mathbb {R}}\) we may identify \({\mathbb {C}}\) with \({\mathbb {R}}^2\). The completeness of the trajectories of a potential V is equivalent to the completeness of a corresponding vector field X on the tangent bundle, and there exists a well-established theory about completeness of holomorphic vector fields X on \({\mathbb {C}}^2\) in the case that they are polynomial. The more general case where V is not polynomially bounded does not admit an obvious advantage in the complex language. In this notation, the geodesic equations take the form

$$\begin{aligned} \dot{p} = -\nabla V(q) \implies \ddot{z} = -V_x(x,y) - iV_y(x,y) \end{aligned}$$
(75)

For the purposes of this review, we will continue to explicitly write x and y in place of z.

We now ask ourselves if the above ODE Eq. 75 admits global solutions for V harmonic in (x, y), that is we wonder if the corresponding spacetime manifold in the original statement of the EK conjecture is geodesically complete. In fact, this is an open question in general. The following partial result by [58] became an important motivation for the so-called polynomial EK-conjecture:

Theorem 5.2

(Candela, Romero & Sánchez ’13) For \(V: {\mathbb {R}}^2 \rightarrow {\mathbb {R}}\) harmonic in \(q : = (x,y) \in {\mathbb {R}}^2\), if there is a constant \(b\in {\mathbb {R}}\) such that \(V(q) \ge - b|q|^2\) for all \(q \in {\mathbb {R}}^2\), then the ODE \(\ddot{q} = -\nabla V(q)\) admits global solutions for all initial data.

In other words, this is the statement that the Ehlers–Kundt conjecture holds in the case that \(H = -V\) is subquadratic. We reproduce now a short version of the proof which is originally due to G. Cox (private communication):

Proof

It is sufficient to assume \(b > 0\). Since we have translated the original conjecture to the realm of Newtonian dynamics, we may apply simple energy conservation

$$\begin{aligned} \frac{1}{2}|p|^2 + V(q) = E \Rightarrow |p|^2 \le 2(E + b|q|^2). \end{aligned}$$
(76)

We then bound |p| by |q| in the cases of negative and non-negative energy:

$$\begin{aligned} E<0&\Rightarrow |p|^{2} \le 2 b|q|^{2}, \nonumber \\ E \ge 0&\Rightarrow 2\left( E+b|q|^{2}\right) =\underbrace{2(\sqrt{E}+\sqrt{b}|q|)^{2} -4 \sqrt{E b}|q|}_{\le 2(\sqrt{E}+\sqrt{b}|q|)^{2}}. \end{aligned}$$
(77)

In both cases we therefore have the bound

$$\begin{aligned} |p| \le a+c|q| \quad , \quad a \ge 0, c>0. \end{aligned}$$
(78)

We can then bound |q(t)| using |q(0)| as follows:

$$\begin{aligned} \underbrace{\left| \int _{0}^{t} p(s) d s\right| }_{|q(t)|-|q(0)| \le }&\le \int _{0}^{t}|p(s)| d s \le \underbrace{\int _{0}^{t}(a+c|q(s)|) d s}_{a t+c \int _{0}^{t}|q(s)| d s} \\&\Rightarrow |q(t)| \le (|q(0)|+a t)+c \int _{0}^{t}|q(s)| d s \\&\Rightarrow |q(t)| \le \underbrace{(|q(0)|+a t) e^{c t}}_{\textrm{bounded}\ \textrm{on}\ \textrm{compact}\ \textrm{int}.} \end{aligned}$$

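where in the final step we have used the integral form of Grönwall’s inequality. The result then follows by Picard–Lindelöf: by Eq. 78 both q and p remain bounded on every compact interval, so no solution can blow up in finite time. \(\square \)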
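The Grönwall bound can be illustrated on an explicit example. The sketch below rests on our own assumptions: it uses the harmonic potential \(V = x^2 - y^2\), which satisfies \(V(q) \ge -b|q|^2\) with \(b = 1\), for which the equations \(\ddot{x} = -2x\), \(\ddot{y} = 2y\) have closed-form solutions. Following the proof, we take \(a = \sqrt{2E}\) for \(E \ge 0\) (and \(a = 0\) otherwise) and \(c = \sqrt{2b}\); the initial data are arbitrary.

```python
import math

B = 1.0                       # V = x^2 - y^2 >= -B |q|^2 (our example)
OMEGA = math.sqrt(2.0)

def exact_q(q0, p0, t):
    """Closed-form solution of x'' = -2x, y'' = 2y."""
    x = q0[0] * math.cos(OMEGA * t) + p0[0] / OMEGA * math.sin(OMEGA * t)
    y = q0[1] * math.cosh(OMEGA * t) + p0[1] / OMEGA * math.sinh(OMEGA * t)
    return (x, y)

def gronwall_bound(q0, p0, t):
    """Right-hand side (|q(0)| + a t) exp(c t) of the final estimate."""
    energy = 0.5 * (p0[0] ** 2 + p0[1] ** 2) + q0[0] ** 2 - q0[1] ** 2
    a = math.sqrt(2.0 * energy) if energy >= 0 else 0.0
    c = math.sqrt(2.0 * B)
    return (math.hypot(*q0) + a * t) * math.exp(c * t)

q0, p0 = (1.0, 0.5), (0.3, 1.0)   # arbitrary initial data
margins = [gronwall_bound(q0, p0, t) - math.hypot(*exact_q(q0, p0, t))
           for t in (0.0, 0.5, 1.0, 2.0, 5.0)]
```

Each entry of `margins` should be non-negative: the trajectory grows exponentially in y but never faster than the bound allows.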

This result was proven in [58] even in the case that V is non-autonomous and where \(|\cdot |\) is replaced by a general distance function \(d_g(\cdot ~,\cdot )\) associated to a Riemannian metric g. Therefore the previous result also holds true for a gravitational (N, h)-fronted wave (Eq. 45). That the EK-conjecture is true for a harmonic and subquadratic \(H = -V\) motivates one to ask if the same is true for harmonic and polynomially bounded H. This question was answered by [77], but before stating the theorem let us first make precise the idea of a polynomially bounded H.

Remark 5.3

Following the terminology of [77], a function \(H:{\mathbb {R}}\times {\mathbb {R}}^2 \rightarrow {\mathbb {R}}\) is called “polynomially u-bounded” (meaning polynomially upper bounded along finite u-times) when for each \(u_{0} \in {\mathbb {R}}\), there exists \(\epsilon _{0}>0\) and a polynomial \(P_{0}:{\mathbb {R}}^{2} \rightarrow {\mathbb {R}}\) such that \(H(u,q) \le P_{0}(q)\) for all \((u,q) \in \left( u_{0}-\epsilon _{0}, u_{0}+\epsilon _{0}\right) \times {\mathbb {R}}^{2}\).

Note that we say H is quadratically polynomially u-bounded when \(P_0\) can be chosen of degree 2 for all \(u_0 \in {\mathbb {R}}\).

5.2.2 Outline of proof

The Polynomial EK-conjecture is stated as follows:

Theorem 5.4

(Flores & Sánchez ’19) Let \(V:{\mathbb {R}}\times {\mathbb {R}}^2 \rightarrow {\mathbb {R}}\) be a polynomially u-bounded \(C^1\)-potential which is also \(C^2\) and harmonic in the pair of variables q = (x, y). Then: all the solutions to the dynamical system Eq. 70 are complete if and only if the function \(V (u,\cdot )\) is an at most quadratic polynomial for each \(u \in {\mathbb {R}}\).

We will present here only a rough outline of the arguments behind the proof, following loosely [77, Sec. 2.3]. The proof of Theorem 5.4 goes as follows:

  1. (i)

    It is first shown that if a harmonic function V is upper bounded by a polynomial of degree n, that is if \(V (x, y) \le A(x^2+y^2)^{n/2}\) for some \(n \in {\mathbb {N}}, A > 0\) at large (x, y), then V must itself be a harmonic polynomial of degree \(\le n\).

  2. (ii)

    The homogeneous, harmonic polynomials of degree \(m>0\) on \({\mathbb {R}}^2\) form a two-dimensional vector space. In the standard polar coordinates of \({\mathbb {R}}^2\), such polynomials take the form

    $$\begin{aligned} p_m(\rho ,\theta ) = \lambda _m \rho ^m \cos (m(\theta + \alpha _m)) \end{aligned}$$
    (79)

    for \(\lambda _m > 0\) and \(\alpha _m \in (-\pi ,\pi ]\). Therefore any harmonic polynomial P on \({\mathbb {R}}^2\) of degree \(n \in {\mathbb {N}}\) can be written as

    $$\begin{aligned} P(\rho ,\theta ) = \sum _{m=0}^np_m(\rho ,\theta ) \end{aligned}$$
    (80)

    for some \(p_0 \in {\mathbb {R}}\). In particular, the autonomous potential V(q) of Eq. 75 can be written as such a sum.Footnote 34 For simplicity in this summary, let us take the simple case of a homogeneous degree \(n>2\) polynomial \(V_n\) with \(\lambda _n=-1\) and \(\alpha _n=0\), that is \(V_n(\rho ,\theta ) = -\rho ^n\cos (n\theta )\). In the homogeneous case one can always obtain this via rotations, scaling or adding a real number to V, none of which affect the completeness or harmonic characters necessary for our discussion.

  3. (iii)

    Consider the radial curves in polar coordinates \(\gamma _k(t) = (\rho (t),\hat{\theta }_k)\), \(k \in \{0,\dots ,n-1\}\) where \(\hat{\theta }_k := 2\pi k/n\) (n is the degree of the potential V being considered). Such curves are solutions of \(\ddot{q} = -\nabla V_n(q)\) if and only if the radial component \(\rho (t)\) satisfies \(\ddot{\rho }(t) = n\rho ^{n-1}(t)\).

  4. (iv)

    It is then proved that for any real number \(n>2\) and \(C^1\) function \(\lambda :[0,\infty )\rightarrow {\mathbb {R}}\), the solutions of the differential inequality

    $$\begin{aligned} \ddot{\rho }(t) \ge n\lambda \rho ^{n-1}(t) \end{aligned}$$
    (81)

    with initial conditions \(\rho (0)>0\) and \(\dot{\rho }(0) > 0\) are incomplete under the following conditions:

    1. (a)

      The solutions are incomplete if there exists some \(\lambda _0 >0\) such that \(\lambda \ge \lambda _0\).

    2. (b)

      If \(\lambda (0)>0\) then there exists some \(k>0\) such that all solutions with initial conditions \(\rho (0)>k\) or \(\dot{\rho }(0)>k\) are incomplete.

    The first of these points tells us immediately that the solutions \(\gamma _k\) satisfying \(\ddot{\rho }(t) = n\rho ^{n-1}(t)\) are incomplete, as in this case \(\lambda \) is the constant function equal to one, such that any \(0<\lambda _0<1\) provides the necessary bound. In fact a confinement property is shown, whereby there exist regions “around” the \(\gamma _k\) labelled \(D_k[\rho _0,\pi /(2n)]\) such that trajectories starting in \(D_k[\rho _0,\pi /(2n)]\) (with suitable initial conditions) stay in \(D_k[\rho _0,\pi /(2n)]\), and these confined solutions satisfy the differential inequality Eq. 81, allowing us to prove that they too are incomplete.

  5. (v)

    The existence of the confining regions \(D_k[\rho _0,\pi /(2n)]\) for a homogeneous potential \(V_n\) can be understood as follows: Along each \(\gamma _k = (\rho (t),\hat{\theta }_k = 2\pi k/n)\), \(V_n(\rho ,\theta ) = -\rho ^n\cos (n\theta )\) is decreasing and concave. Furthermore, the harmonicityFootnote 35 of \(V_n\) implies that \(\frac{\partial V_n}{\partial \theta }(\gamma _k(t)) = 0 \) and that this is in fact a minimum. That is, the \(\hat{\theta }_k\) are stable equilibria of trajectories close to the \(\gamma _k\). This can be visualised by looking at the potential \(V_n\) for some choice of n. In Fig. 3 the case \(n = 5\) is demonstrated,Footnote 36 in which one can see \(n = 5\) different “channels” with centers corresponding to the \(\gamma _k, k\in \{0,\dots ,4\}\).

  6. (vi)

    To prove the case in which V is not homogeneous, it is first written as a linear combination of polynomials like \(V_n\). Then the \(\gamma _k\) are no longer solutions of the full dynamical system \(\ddot{q} = -\nabla V(q)\), but it is shown that there still exist regions “around” the \(\gamma _k\) labelled \(D[\rho _0,\theta _+]\) which have qualitatively the same behaviour as the \(D_k[\rho _0,\pi /(2n)]\). This is achieved by showing that the radial component of a trajectory \(\gamma \) grows sufficiently fast compared to the angular oscillation that \(\gamma \) never escapes the \(D[\rho _0,\theta _+]\).

  7. (vii)

    To prove the case when V is non-autonomous a similar procedure is followed to that of the autonomous case, with some technical complications. The first notable difference is that the polar expressions Eq. 79 and Eq. 80 of a harmonic potential V(u, q) become valid only on an interval in u, that is

    $$\begin{aligned} p_m(u,\rho ,\theta ) = \lambda _m(u) \rho ^m \cos (m(\theta + \alpha _m(u))),~~~u\in (u_0 - c, u_0 + c) \subset {\mathbb {R}}\end{aligned}$$
    (82)

    for some \(0 < c \in {\mathbb {R}}\). Here we can only choose \(\alpha (u_0) = 0\), and in general \(\alpha (u) \ne 0\). As a result, in the non-autonomous case we have that the \(\hat{\theta }_k\) are no longer constant:

    $$\begin{aligned} \hat{\theta }_k(u) = \frac{2\pi k - \alpha (u)}{n},~~~k = 0,\dots ,n-1. \end{aligned}$$
    (83)

    The remaining differences follow a similar pattern, whereby objects become u-dependent and are defined on intervals. However since the rough details are the same as the autonomous case, these details will be omitted here.
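Step (iii) of the outline above can be checked numerically. The following sketch is our own illustration: writing \(V_n = -\rho ^n\cos (n\theta ) = -\textrm{Re}(z^n)\) with \(z = x + iy\), the force \(-\nabla V_n\) corresponds to the complex number \(n\bar{z}^{\,n-1}\). On each ray \(\theta = \hat{\theta }_k = 2\pi k/n\) it should therefore be purely radial with magnitude \(n\rho ^{n-1}\), which is the statement \(\ddot{\rho } = n\rho ^{n-1}\). The values \(n = 5\) and \(\rho = 1.5\) are arbitrary.

```python
import cmath
import math

def force(z, n):
    """-grad V_n as a complex number, for V_n = -Re(z^n)."""
    return n * z.conjugate() ** (n - 1)

def radial_check(n, rho):
    """For each ray theta_k = 2 pi k / n, return (radial, tangential) force."""
    out = []
    for k in range(n):
        direction = cmath.exp(1j * 2.0 * math.pi * k / n)
        f = force(rho * direction, n)
        # Project onto the unit radial vector and its counterclockwise normal.
        projected = f * direction.conjugate()
        out.append((projected.real, projected.imag))
    return out

n, rho = 5, 1.5                      # illustrative values
components = radial_check(n, rho)
expected_radial = n * rho ** (n - 1)
```

A vanishing tangential component on each of the n rays reflects the fact that the radial curves \(\gamma _k\) are the centers of the “channels” of the potential.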

Fig. 3 Homogeneous degree-5 potential \(V_5\) as a surface (left) and contour plot (right). These images make clear the n stable trajectories \(\gamma _k\) for \(k\in \{0,\dots ,n-1\}\). The same features can be seen for any natural number \(n>2\)

Summary – Polynomial EK-conjecture

We first saw the Ehlers–Kundt conjecture, stated as:

“Prove the plane waves to be the only complete (gravitational) pp-waves.”

This was a statement about the completeness of the solutions of the geodesic equation for a metric \(g = 2 d u d v - V(u, x,y) d u^{2}+ dx^2 + dy^2\) where V is harmonic in (x, y). The geodesic equations were reduced to a Hamiltonian system \(\dot{p} = -\nabla V(q)\) with \(q := (x,y)\) and \(p = \dot{q}\). In mathematical terms, the conjecture states:

$$\begin{aligned} \begin{array}{c} \text {The solutions of}\ \dot{p} = -\nabla V(q) \\ \text {exist for all times} \\ \end{array} \iff V(u,x,y)\ \text {is quadratic in} (x,y). \end{aligned}$$

The \(\impliedby \) direction is already known to hold (see Sect. 3.3.1), and the \(\implies \) direction is an open question. The fact that a quadratically-bounded and harmonic V was proven to have complete trajectories motivated us to ask what happens if the harmonic V is polynomially bounded. This question was answered by [77] where it was proven that for such a V, all the solutions to the dynamical system Eq. 70 are complete if and only if the function \(V (u,\cdot )\) is an at most quadratic polynomial for each \(u \in {\mathbb {R}}\). That is, the Ehlers–Kundt conjecture is proved to hold in the case that V is polynomially bounded.

We may then ask ourselves if it is reasonable to expect that V be polynomially bounded. In fact in the causal study, it was discovered that in the autonomous case unless V were quadratically polynomially bounded, the pp-wave would not be strongly causal. For further evidence supporting such a bound see [77, Sec. 13. (b)]. Therefore this is arguably the strongest known result addressing the EK conjecture. It is not, however, the only one; indeed, in the case of an autonomous potential, the EK conjecture has also been settled in the case when the spacetime is strongly causal, in [80].

It should also be mentioned that the behavior of geodesics in exactly the geometries studied in this section (those for which V is a harmonic polynomial that is even in y) has been studied extensively, and it was demonstrated via a fractal method that the geodesic flow is chaotic in nature. The geodesics escape to infinity along one of the channels which appear in Fig. 3 in this article (and in Fig. 1 of [79]). For details see also [81] and [82]. This phenomenon was further studied in the context of the sandwich waves in [83], wherein it was demonstrated that as the support of the curved region approaches zero (the so-called “impulsive waves”) the geodesic motion becomes integrable.

5.3 The compact case

One may also wonder if the Ehlers–Kundt conjecture could be answered in the case that a pp-wave (M, g) is a compact Lorentzian manifold, since such manifolds are known to be complete under a wealth of circumstances.Footnote 37 Some examples include when they are flat, have constant curvature, are homogeneous (and even locally homogeneous in the 3 dimensional case), or admit a timelike conformal Killing vector field [24, p. 2]. Unfortunately, general pp-waves do not satisfy any of these properties, and so some additional results are required to address the EK conjecture in this case. The question of completeness for compact pp-waves has indeed been answered by [24], and that work is the subject of this section.

Example

Compact pp-wave. Consider the flat metric h on the n-torus \({\mathbb {T}}^n\), then the product manifold \(M = {\mathbb {T}}^2 \times {\mathbb {T}}^n\) with the metric

$$\begin{aligned} g = 2d\theta d\phi + 2H d\theta ^2 + h \end{aligned}$$

with \(H \in C^{\infty }({\mathbb {T}}^n)\) is compact and is in fact a standard pp-wave with defining covariantly constant vector field represented as \(\partial _{\phi }\). Note however that a “wave” is not a very accurate name in the compact case, since as mentioned in Sect. 2.3 it is the (null) asymptotics which signal the physical presence of radiation, and the compact case does not admit the same notion of “null infinity” as was used to define the presence of radiation.

The principal results of [24] can be summarised as follows:

  (A) The universal cover of a compact pp-wave is globally isometric to a standard pp-wave (Eq. 31).

  (B) Every compact pp-wave (M, g) is geodesically complete.

  (C) Every compact Ricci-flat pp-wave is a plane wave.

Point A is instrumental in proving point B. Point B appears to be in contradiction to the EK conjecture, but such an apparent problem is resolved by point C. That is, there are no non-plane compact vacuum pp-waves, so we need not wonder about their completeness on physical grounds. Thus these results solve the Ehlers–Kundt conjecture in the compact case. Or rather, the authors have proven that one need not conjecture about the incompleteness of non-plane vacuum compact pp-waves, as there are no such pp-waves. The remainder of this section will outline the methods by which these results are obtained. Let us begin with result (A) in more detail:

Theorem 5.5

The universal cover of an n-dimensionalFootnote 38 compact pp-wave defined by a covariantly constant null vector field Z is globally isometric to a standard pp-wave (Eq. 31) which can be written as

$$\begin{aligned} ({\mathbb {R}}^{n}, g^H = 2 d u d v+H\left( u, \textbf{x}\right) d u^{2} +\delta _{a b}d x^{a} d x^{b} ) \end{aligned}$$

and under this isometry, the lift of Z is mapped to the coordinate vector field \(\frac{\partial }{\partial v}\).

Although we do not present the proof of this theorem here, we remark that it makes significant use of the “screen bundle”, which is closely related to the “wavefront” of our Definition 2. However, as remarked in [24, footnote 2], in the compact case this nomenclature is perhaps inappropriate. Using Theorem 5.5, it is then proven that:

Theorem 5.6

Every compact pp-wave (M, g) is geodesically complete.

To prove this statement, let us first examine the completeness of a standard pp-wave (Eq. 31). Then via Theorem 5.5 we can make statements about the completeness of compact pp-waves. Recall that a standard pp-wave may be written in the global coordinate chart \(\{u,v,x^1,\dots ,x^{n-2}\}\) as

$$\begin{aligned} g = 2 d u d v+H\left( u, \textbf{x}\right) d u^{2} +\delta _{a b}d x^{a} d x^{b}. \end{aligned}$$
(84)

Proposition 5.7

[24, lemma 8] The standard pp-wave metric is geodesically complete if there is a constant \(c > 0\) such that

$$\begin{aligned} \left| \frac{\partial ^{2} H}{\partial x^{i} \partial x^{j}}\right| \le c \end{aligned}$$

for all \(i,j \in \{1,\dots ,n-2\}\).

Proof

Let us examine the geodesic equations of the standard pp-wave metric: For a curve \(\gamma \) with components \((u(s), v(s), x^1(s), \dots , x^{n-2}(s))\), the geodesic equation for the u-component is given by:

$$\begin{aligned} \ddot{u}(s) =0 \Longrightarrow u(s)=a s+b ~\text {for some }a,b \in {\mathbb {R}}\end{aligned}$$

that is, the u component is defined on all of \({\mathbb {R}}\). The remaining components of the geodesic equations are given by

$$\begin{aligned} \ddot{v}(s)&=-2 a \dot{x}^{k}(s) \frac{\partial H}{\partial x^{k}}-a^{2} \frac{\partial H}{\partial u},\end{aligned}$$
(85)
$$\begin{aligned} \ddot{x}^{k}(s)&=a^{2} \frac{\partial H}{\partial x^{k}}. \end{aligned}$$
(86)

Since the v equation depends only on the \(x^k\) and not on v, its solution is defined on all of \({\mathbb {R}}\) provided that the \(x^{k}\) are defined on all of \({\mathbb {R}}\). Unfortunately the \(x^k\) equation does not in general admit solutions on all of \({\mathbb {R}}\). An example (as in [85]) is found when \(H = \frac{1}{2}(x^j)^4\) for some \(j\in \{1,\dots ,n-2\}\). In this case, the only nontrivial equation (when \(a\ne 0\)) for the \(x^k(s)\) is

$$\begin{aligned} \ddot{x}^j(s) = 2a^2(x^j)^3 \end{aligned}$$

which has solution

$$\begin{aligned} x^j(s) = \frac{1}{1-as}, ~~s\in (-\infty ,1/a). \end{aligned}$$

Since this solution develops a singularity at \(s = 1/a\), so too does the solution for v, and we conclude that the standard pp-wave is geodesically incomplete in this case. So when are the solutions of the \(\ddot{x}^k\) equations defined on all of \({\mathbb {R}}\) (thus making the pp-wave geodesically complete)? This is guaranteed when the second derivatives of H are bounded: by the mean value theorem the first derivatives \(\partial H/\partial x^{k}\) are then globally Lipschitz continuous in \(\textbf{x}\), which suffices for global existence of solutions in view of the Picard–Lindelöf theorem. \(\square \)
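The contrast drawn in this proof can be illustrated numerically. The following is a minimal sketch, not part of [24] or [85]: it integrates the transverse equation \(\ddot{x} = a^{2}\, \partial H/\partial x\) of Eq. (86) for a single transverse coordinate with a classical Runge–Kutta scheme, using the initial data \(x(0)=1\), \(\dot{x}(0)=a\). For the quartic profile of the text the numerical solution tracks \(1/(1-as)\) and grows without bound as \(s \rightarrow 1/a\). For the quadratic profile \(H = \frac{1}{2}x^2\), a hypothetical comparison case with bounded second derivative, it tracks \(e^{as}\) and exists for all s. The function names and step counts are illustrative assumptions.

```python
import math

def rk4_step(f, y, h):
    """One classical Runge-Kutta step for the autonomous system y' = f(y)."""
    k1 = f(y)
    k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h * (p + 2.0 * q + 2.0 * r + w) / 6.0
            for yi, p, q, r, w in zip(y, k1, k2, k3, k4)]

def transverse_position(dH_dx, a, s_end, steps=20000):
    """Integrate x'' = a^2 dH/dx from s = 0 to s_end with x(0) = 1, x'(0) = a."""
    rhs = lambda y: [y[1], a * a * dH_dx(y[0])]
    y, h = [1.0, a], s_end / steps
    for _ in range(steps):
        y = rk4_step(rhs, y, h)
    return y[0]

a = 1.0
quartic_force = lambda x: 2.0 * x ** 3  # dH/dx for H = x^4 / 2 (example in the text)
quadratic_force = lambda x: x           # dH/dx for H = x^2 / 2 (hypothetical comparison)

# Quartic profile: the solution approaches the singularity at s = 1/a.
for s in (0.5, 0.9, 0.99):
    x_num = transverse_position(quartic_force, a, s)
    print(f"quartic   s={s}: numeric={x_num:.6f}, exact={1.0 / (1.0 - a * s):.6f}")

# Quadratic profile: the solution exists beyond s = 1/a.
for s in (1.0, 5.0):
    x_num = transverse_position(quadratic_force, a, s)
    print(f"quadratic s={s}: numeric={x_num:.6f}, exact={math.exp(a * s):.6f}")
```

A fixed-step integrator cannot itself prove blow-up. The comparison with the closed-form solutions is what ties the numerics to the statement of the proposition.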

One might think that Proposition 5.7 yields many examples of complete pp-waves which are non-plane (requiring only that the second derivatives of H be bounded), but in fact we have not yet imposed that the pp-wave is gravitational. For a gravitational pp-wave H is harmonic, and a harmonic function can only have bounded second derivatives (corresponding to a complete pp-wave by the previous proposition) if it is quadratic, in which case the pp-wave is a plane wave.Footnote 39
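A simple instance of this dichotomy can be seen numerically. The sketch below takes two hypothetical harmonic profiles in two transverse coordinates at fixed u (they are illustrative choices, not taken from the cited works): the cubic \(x^3 - 3xy^2\), which is the real part of \((x+iy)^3\), and the quadratic \(x^2 - y^2\). Second derivatives are estimated by central differences. Both profiles have vanishing Laplacian, but the second derivative of the cubic grows linearly with distance from the origin, while that of the quadratic stays constant.

```python
def second_derivative(H, x, y, axis, eps=1e-2):
    """Central-difference estimate of d^2H/dx^2 (axis=0) or d^2H/dy^2 (axis=1)."""
    dx, dy = (eps, 0.0) if axis == 0 else (0.0, eps)
    return (H(x + dx, y + dy) - 2.0 * H(x, y) + H(x - dx, y - dy)) / eps ** 2

def laplacian(H, x, y):
    """Transverse Laplacian as the sum of the two pure second derivatives."""
    return second_derivative(H, x, y, 0) + second_derivative(H, x, y, 1)

cubic = lambda x, y: x ** 3 - 3.0 * x * y ** 2  # harmonic, not quadratic
plane = lambda x, y: x ** 2 - y ** 2            # harmonic and quadratic

for r in (1.0, 10.0, 100.0):
    x, y = r, 0.5 * r
    print(
        f"r={r}: lap(cubic)={laplacian(cubic, x, y):.2e}, "
        f"H_xx(cubic)={second_derivative(cubic, x, y, 0):.3f}, "
        f"H_xx(plane)={second_derivative(plane, x, y, 0):.3f}"
    )
```

Only the quadratic profile satisfies the bounded-Hessian hypothesis of Proposition 5.7, in line with the statement that a gravitational pp-wave covered by that proposition must be a plane wave.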

In order to apply this result to our case, that is to prove that a compact pp-wave is geodesically complete (Theorem 5.6), we must prove that the second derivatives of H are bounded in the compact case. The following proposition resolves this question:

Proposition 5.8

Consider a compact pp-wave. By Theorem 5.5, its universal cover is a standard pp-wave \(({\mathbb {R}}^{n}, g = 2 d u d v+H\left( u,\textbf{x}\right) d u^{2}+\delta _{a b}d x^{a} d x^{b})\). Then the second derivatives of H are bounded: there is a constant \(c > 0\) such that

$$\begin{aligned} \left| \frac{\partial ^{2} H}{\partial x^{i} \partial x^{j}}\right| \le c\quad \forall ~i,j = 1,\dots ,n-2. \end{aligned}$$

Proof

We again omit the proof in favour of brevity. See [24, lemma 9]. \(\square \)

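Thus one arrives at a proof of Theorem 5.6 (point B):

Proof

Let (M, g) be a compact pp-wave. By Theorem 5.5 the universal cover is isometric to a standard pp-wave, and by Propositions 5.7 and 5.8 this standard pp-wave is complete. Since the covering map is a local isometry, every geodesic of (M, g) lifts to a geodesic of the universal cover, which is defined on all of \({\mathbb {R}}\). Therefore (M, g) itself is complete. \(\square \)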

We finally arrive at the statement which resolves the EK conjecture in the case of compact pp-waves.

Theorem 5.9

[24, Corollary 1] Every compact Ricci-flat pp-wave is a plane wave.Footnote 40

Proof

Let (M, g) be a compact pp-wave and let \(({\mathbb {R}}^{n}, g^H)\) be the standard pp-wave that is globally isometric to the universal cover of (M, g). By Proposition 5.8, the second derivatives of H are bounded. If g is Ricci-flat, so too is \(g^H\), and thus H is harmonic with respect to the \(x^i\) directions:

$$\begin{aligned} \sum _{i=1}^{n-2} \partial _i^2 H = 0. \end{aligned}$$

But this implies that each \(\partial _i \partial _j H\) is also harmonic in the same sense. Being bounded as well, it is, by the maximum principle for harmonic functions [86, page 7] (in the form of Liouville's theorem), independent of the \(x^i\) components. Hence,

$$\begin{aligned} H(u,\textbf{x})=\sum _{i, j=1}^{n-2} a_{i j}(u) x^{i} x^{j} +b_{i}(u) x^{i}+c(u), \end{aligned}$$

where \(a_{ij}\), \(b_i\) and c depend only on u and not on the \(x^i\). Thus, since H is quadratic in the \(x^i\), (M, g) is a plane wave. \(\square \)

Therefore as stated, one need not conjecture about the incompleteness of non-plane vacuum compact pp-waves, as there are no such pp-waves. As a result, the Ehlers–Kundt conjecture has been resolved in the compact case.

5.4 Case of failure

Let us outline very briefly the following case in which the Ehlers–Kundt conjecture is known not to hold:

Impulsive case

Though usually left implicit for brevity in this article, the continuity of the characteristic function H of a pp-wave in the adapted coordinate u is in fact vital. To quote from [77, Sec. 1.3 (d)]:

Impulsive waves have a non-continuous profile type \(H(u,z=(x,y)) = f(z)\delta (u)\) for some (generalized) delta-function \(\delta \) and smooth f. Thus, the function H can be regarded as z-harmonic when \(\Delta f = 0\). The mentioned results of completeness yield counterexamples to the EK conjecture in the impulsive setting, showing the necessity of continuity in u as well as the appropriate smoothness of H.

This necessary smoothness and continuity in the non-autonomous case (H not independent of u) amounts to the following:

  • H should be \(C^1\) in u (to construct the Levi-Civita connection)

  • H should be \(C^2\) in z (to impose harmonicity, i.e. the vacuum condition)

Note that, in the second condition, a harmonic function that is \(C^2\) in z is automatically analytic in z, a well-known property of harmonic functions (see, e.g., [87, Theorem 1.28]). For the relevant references in the study of such impulsive waves, consult [77, Sec. 1.3 (d)].