Abstract
The theory of gravitational lensing is reviewed from a spacetime perspective, without quasi-Newtonian approximations. More precisely, the review covers all aspects of gravitational lensing where light propagation is described in terms of lightlike geodesics of a metric of Lorentzian signature. It includes the basic equations and the relevant techniques for calculating the position, the shape, and the brightness of images in an arbitrary general-relativistic spacetime. It also includes general theorems on the classification of caustics, on criteria for multiple imaging, and on the possible number of images. The general results are illustrated with examples of spacetimes where the lensing features can be explicitly calculated, including the Schwarzschild spacetime, the Kerr spacetime, the spacetime of a straight string, plane gravitational waves, and others.
1 Introduction
In its most general sense, gravitational lensing is a collective term for all effects of a gravitational field on the propagation of electromagnetic radiation, with the latter usually described in terms of rays. According to general relativity, the gravitational field is coded in a metric of Lorentzian signature on the 4-dimensional spacetime manifold, and the light rays are the lightlike geodesics of this spacetime metric. From a mathematical point of view, the theory of gravitational lensing is thus the theory of lightlike geodesics in a 4-dimensional manifold with a Lorentzian metric.
The first observation of a ‘gravitational lensing’ effect was made when the deflection of star light by our Sun was verified during a Solar eclipse in 1919. Today, the list of observed phenomena includes the following:

Multiple quasars.
The gravitational field of a galaxy (or a cluster of galaxies) bends the light from a distant quasar in such a way that the observer on Earth sees two or more images of the quasar.

Rings.
An extended light source, like a galaxy or a lobe of a galaxy, is distorted into a closed or almost closed ring by the gravitational field of an intervening galaxy. This phenomenon occurs in situations where the gravitational field is almost rotationally symmetric, with observer and light source close to the axis of symmetry. It is observed primarily, but not exclusively, in the radio range.

Arcs.
Distant galaxies are distorted into arcs by the gravitational field of an intervening cluster of galaxies. Here the situation is less symmetric than in the case of rings. The effect is observed in the optical range and may produce “giant luminous arcs”, typically of a characteristic blue color.

Microlensing.
When a light source passes behind a compact mass, the focusing effect on the light leads to a temporal change in brightness (energy flux). This microlensing effect has been routinely observed since the early 1990s by monitoring a large number of stars in the bulge of our Galaxy, in the Magellanic Clouds, and in the Andromeda galaxy. Microlensing has also been observed on quasars.

Image distortion by weak lensing.
In cases where the distortion effect on galaxies is too weak for producing rings or arcs, it can be verified with statistical methods. By evaluating the shape of a large number of background galaxies in the field of a galaxy cluster, one can determine the surface mass density of the cluster. By evaluating fields without a foreground cluster one gets information about the large-scale mass distribution.
Observational aspects of gravitational lensing and methods of how to use lensing as a tool in astrophysics are the subject of the Living Review by Wambsganss [343]. There the reader may also find some notes on the history of lensing.
The present review is meant as complementary to the review by Wambsganss. While all the theoretical methods reviewed in [343] rely on quasi-Newtonian approximations, the present review is devoted to the theory of gravitational lensing from a spacetime perspective, without such approximations. Here the terminology is as follows: “Lensing from a spacetime perspective” means that light propagation is described in terms of lightlike geodesics of a general-relativistic spacetime metric, without further approximations. (The term “non-perturbative lensing” is sometimes used in the same sense.) “Quasi-Newtonian approximation” means that the general-relativistic spacetime formalism is reduced by approximative assumptions to essentially Newtonian terms (Newtonian space, Newtonian time, Newtonian gravitational field). The quasi-Newtonian approximation formalism of lensing comes in several variants, and the relation to the exact formalism is not always evident because sometimes plausibility and ad-hoc assumptions are implicitly made. A common feature of all variants is that they are “weak-field approximations” in the sense that the spacetime metric is decomposed into a background (“spacetime without the lens”) and a small perturbation of this background (“gravitational field of the lens”). For the background one usually chooses either Minkowski spacetime (isolated lens) or a spatially flat Robertson-Walker spacetime (lens embedded in a cosmological model). The background then defines a Euclidean 3-space, similar to Newtonian space, and the gravitational field of the lens is similar to a Newtonian gravitational field on this Euclidean 3-space. Treating the lens as a small perturbation of the background means that the gravitational field of the lens is weak and causes only a small deviation of the light rays from the straight lines in Euclidean 3-space.
In its most traditional version, the formalism assumes in addition that the lens is “thin”, and that the lens and the light sources are at rest in Euclidean 3-space, but there are also variants for “thick” and moving lenses. Also, modifications for a spatially curved Robertson-Walker background exist, but in all variants a non-trivial topological or causal structure of spacetime is (explicitly or implicitly) excluded. At the center of the quasi-Newtonian formalism is a “lens equation” or “lens map”, which relates the position of a “lensed image” to the position of the corresponding “unlensed image”. In the most traditional version one considers a thin lens at rest, modeled by a Newtonian gravitational potential given on a plane in Euclidean 3-space (“lens plane”). The light rays are taken to be straight lines in Euclidean 3-space except for a sharp bend at the lens plane. For a fixed observer and light sources distributed on a plane parallel to the lens plane (“source plane”), the lens map is then a map from the lens plane to the source plane. In this way, the geometric spacetime setting of general relativity is completely covered behind a curtain of approximations, and one is left simply with a map from a plane to a plane. Details of the quasi-Newtonian approximation formalism can be found not only in the above-mentioned Living Review [343], but also in the monographs of Schneider, Ehlers, and Falco [299] and Petters, Levine, and Wambsganss [275].
The quasi-Newtonian approximation formalism has proven very successful for using gravitational lensing as a tool in astrophysics. This is impressively demonstrated by the work reviewed in [343]. On the other hand, studying lensing from a spacetime perspective is of relevance under three aspects:

Didactical.
The theoretical foundations of lensing can be properly formulated only in terms of the full formalism of general relativity. Working out examples with strong curvature and with non-trivial causal or topological structure demonstrates that, in principle, lensing situations can be much more complicated than suggested by the quasi-Newtonian formalism.

Methodological.
General theorems on lensing (e.g., criteria for multiple imaging, characterizations of caustics, etc.) should be formulated within the exact spacetime setting of general relativity, if possible, to make sure that they are not just an artifact of approximative assumptions. For those results which do not hold in arbitrary spacetimes, one should try to find the precise conditions on the spacetime under which they are true.

Practical.
There are some situations of astrophysical interest to which the quasi-Newtonian formalism does not apply. For instance, near a black hole light rays are so strongly bent that, in principle, they can make arbitrarily many turns around the hole. Clearly, in this situation it is impossible to use the quasi-Newtonian formalism, which would treat these light rays as small perturbations of straight lines.
The present review tries to elucidate all three aspects. More precisely, the following subjects will be covered:

The basic equations and all relevant techniques that are needed for calculating the position, the shape, and the brightness of images in an arbitrary general-relativistic spacetime are reviewed. Part of this material has been well established for decades, like the Sachs equations for the optical scalars (Section 2.3), which are of crucial relevance for calculating distance measures (Section 2.4), image distortion (Section 2.5), and the brightness of images (Section 2.6). It is included here to keep the review self-contained. Other parts refer to more recent developments which are far from being fully explored, like the exact lens map (Section 2.1) and variational techniques (Section 2.9). Specifications and simplifications are possible for spacetimes with symmetries. The case of spherically symmetric and static spacetimes is treated in greater detail (Section 4.3).

General theorems on lensing in arbitrary spacetimes, or in certain classes of spacetimes, are reviewed. Some of these results are of a local character, like the classification of locally stable caustics (Section 2.2). Others are related to global aspects, like the criteria for multiple imaging in terms of conjugate points and cut points (Sections 2.7 and 2.8). The global theorems can be considerably strengthened if one restricts to globally hyperbolic spacetimes (Section 3.1) or, more specifically, to asymptotically simple and empty spacetimes (Section 3.4). The latter may be viewed as spacetime models for isolated transparent lenses. Also, in globally hyperbolic spacetimes Morse theory can be used for investigating whether the total number of images is finite or infinite, even or odd (Section 3.3). In a spherically symmetric and static spacetime, the occurrence of an infinite sequence of images is related to the occurrence of a “light sphere” (circular lightlike geodesics), like in the Schwarzschild spacetime at r = 3m (Section 4.3).

Several examples of spacetimes are considered, where the lightlike geodesics and, thus, the lensing features can be calculated explicitly. The examples are chosen such that they illustrate the general results. Therefore, in many parts of the review the reader will find suggestions to look at pictures in the example section. The best known and astrophysically most relevant examples are the Schwarzschild spacetime (Section 5.1), the Kerr spacetime (Section 5.8), and the spacetime of a straight string (Section 5.10). Schwarzschild black hole lensing and Kerr black hole lensing were intensively investigated already in the 1960s, 1970s, and 1980s, with astrophysical applications concentrating on observable features of accretion disks. More recently, the increasing evidence that there is a black hole at the center of our Galaxy (and probably at the center of most galaxies) has led to renewed and intensified interest in black hole lensing (see Sections 5.1 and 5.8). This is a major reason for the increasing number of articles on lensing beyond the quasi-Newtonian approximation. (It is, of course, true that this number is still small in comparison to the huge number of all articles on lensing; see [298, 244] for extensive lensing bibliographies.)
This introduction ends with some notes on subjects not covered in this review:

Wave optics.
In the electromagnetic theory, light is described by wave-like solutions to Maxwell’s equations. The ray-optical treatment used throughout this review is the standard high-frequency approximation (geometric optics approximation) of the electromagnetic theory for light propagation in vacuum on a general-relativistic spacetime (see, e.g., [225], § 22.5 or [299], Section 3.2). (Other notions of vacuum light rays, based on a different approximation procedure, have been occasionally suggested [217], but will not be considered here. Also, results specific to spacetime dimensions other than four or to gravitational theories other than Einstein’s are not covered.) For most applications to lensing the ray-optical treatment is valid and appropriate. An exception, where wave-optical corrections are necessary, is the calculation of the brightness of images if a light source comes very close to the caustic of the observer’s light cone (see Section 2.6).

Light propagation in matter.
If light is directly influenced by a medium, the light rays are no longer the lightlike geodesics of the spacetime metric. For an isotropic nondispersive medium, they are the lightlike geodesics of another metric which is again of Lorentzian signature. (This “optical metric” was introduced by Gordon [143]. For a rigorous derivation, starting from Maxwell’s equations in an isotropic nondispersive medium, see Ehlers [88].) Hence, the formalism used throughout this review still applies to this situation after an appropriate re-interpretation of the metric. In anisotropic or dispersive media, however, the light rays are not the lightlike geodesics of a Lorentzian metric. There are some lensing situations where the influence of matter has to be taken into account. For instance, for the deflection of radio signals by our Sun the influence of the plasma in the Solar corona (to be treated as a dispersive medium) is very well measurable. However, such situations will not be considered in this review. For light propagation in media on a general-relativistic spacetime, see [269] and references cited therein.

Kinetic theory.
As an alternative to the (geometric optics approximation of) electromagnetic theory, light can be treated as a photon gas, using the formalism of kinetic theory. This has relevance, e.g., for the cosmic background radiation. For basic notions of general-relativistic kinetic theory see, e.g., [89]. Apart from some occasional remarks, kinetic theory will not be considered in this review.

Derivation of the quasi-Newtonian formalism.
It is not satisfactory if the quasi-Newtonian formalism of lensing is set up with the help of ad-hoc assumptions, even if the latter look plausible. From a methodological point of view, it is more desirable to start from the exact spacetime setting of general relativity and to derive the quasi-Newtonian lens equation by a well-defined approximation procedure. In comparison to earlier such derivations [299, 294, 303], more recent effort has led to considerable improvements. For lenses embedded in a cosmological model, see Pyne and Birkinshaw [283], who consider lenses that need not be thin and may be moving on a Robertson-Walker background (with positive, negative, or zero spatial curvature). For the non-cosmological situation, a Lorentz covariant approximation formalism was derived by Kopeikin and Schäfer [185]. Here Minkowski spacetime is taken as the background, and again the lenses need not be thin and may be moving.
2 Lensing in Arbitrary Spacetimes
By a spacetime we mean a 4-dimensional manifold \({\mathcal M}\) with a (C^{∞}, if not otherwise stated) metric tensor field g of signature (+, +, +, −) that is time-oriented. The latter means that the non-spacelike vectors make up two connected components in the entire tangent bundle, one of which is called “future-pointing” and the other one “past-pointing”. Throughout this review we restrict to the case that the light rays are freely propagating in vacuum, i.e., are not influenced by mirrors, refractive media, or any other impediments. The light rays are then the lightlike geodesics of the spacetime metric. We first summarize results on the lightlike geodesics that hold in arbitrary spacetimes. In Section 3 these results will be specified for spacetimes with conditions on the causal structure and in Section 4 for spacetimes with symmetries.
2.1 Light cone and exact lens map
In an arbitrary spacetime (\({\mathcal M}\), g), what an observer at an event p_{O} can see is determined by the lightlike geodesics that issue from p_{O} into the past. Their union gives the past light cone of p_{O}. This is the central geometric object for lensing from the spacetime perspective. For a point source with worldline γ_{S}, each past-oriented lightlike geodesic λ from p_{O} to γ_{S} gives rise to an image of γ_{S} on the observer’s sky. One should view any such λ as the central ray of a thin bundle that is focused by the observer’s eye lens onto the observer’s retina (or by a telescope onto a photographic plate). Hence, the intersection of the past light cone with the worldline of a point source (or with the worldtube of an extended source) determines the visual appearance of the latter on the observer’s sky.
In mathematical terms, the observer’s sky or celestial sphere \({{\mathcal S}_{\rm{O}}}\) can be viewed as the set of all lightlike directions at p_{O}. Every such direction defines a unique (up to parametrization) lightlike geodesic through p_{O}, so \({{\mathcal S}_{\rm{O}}}\) may also be viewed as a subset of the space of all lightlike geodesics in (\({\mathcal M},g\)) (cf. [209]). One may choose at p_{O} a future-pointing vector U_{O} with g(U_{O}, U_{O}) = −1, to be interpreted as the 4-velocity of the observer. This allows identifying the observer’s sky \({{\mathcal S}_{\rm{O}}}\) with a subset of the tangent space \({T_{{p_{\rm{O}}}}}{\mathcal M}\),
\[{{\mathcal S}_{\rm{O}}} \simeq \{ w \in {T_{{p_{\rm{O}}}}}{\mathcal M}\;|\;g(w,w) = 0\;{\rm{and}}\;g(w,{U_{\rm{O}}}) = 1\}. \tag{1}\]
If U_{O} is changed, this representation changes according to the standard aberration formula of special relativity. By definition of the exponential map exp, every affinely parametrized geodesic s ↦ λ(s) satisfies \(\lambda (s) = \exp (s\dot \lambda (0))\). Thus, the past light cone of p_{O} is the image of the map
\[(s,w) \mapsto \exp (sw), \tag{2}\]
which is defined on a subset of \(]0,\infty [ \times {{\mathcal S}_{\rm{O}}}\). If we restrict to values of s sufficiently close to 0, the map (2) is an embedding, i.e., this truncated light cone is an embedded submanifold; this follows from the well-known fact that exp maps a neighborhood of the origin, in each tangent space, diffeomorphically into the manifold. However, if we extend the map (2) to larger values of s, it is in general neither injective nor an immersion; it may form folds, cusps, and other forms of caustics, or transverse self-intersections. This observation is of crucial importance in view of lensing. There are some lensing phenomena, such as multiple imaging and image distortion of (point) sources into (1-dimensional) rings, which can occur only if the light cone fails to be an embedded submanifold (see Section 2.8). Such lensing phenomena are summarized under the name strong lensing effects. As long as the light cone is an embedded submanifold, the effects exerted by the gravitational field on the apparent shape and on the apparent brightness of light sources are called weak lensing effects. For examples of light cones with caustics and/or transverse self-intersections, see Figures 12, 24, and 25. These pictures show light cones in spacetimes with symmetries, so their structure is rather regular. A realistic model of our own light cone, in the real world, would have to take into account numerous irregularly distributed inhomogeneities (“clumps”) that bend light rays in their neighborhood. Ellis, Bassett, and Dunsby [99] estimate that such a light cone would have at least 10^{22} caustics which are hierarchically structured in a way that is reminiscent of fractals.
For calculations it is advisable to introduce coordinates on the observer’s past light cone. This can be done by choosing an orthonormal tetrad (e_{0}, e_{1}, e_{2}, e_{3}) with e_{0} = −U_{O} at the observation event p_{O}. This parametrizes the points of the observer’s celestial sphere by spherical coordinates (Ψ, Θ),
\[w = \sin \Theta \cos \Psi \,{e_1} + \sin \Theta \sin \Psi \,{e_2} + \cos \Theta \,{e_3} + {e_0}. \tag{3}\]
In this representation, map (2) maps each (s, Ψ, Θ) to a spacetime point. Letting the observation event float along the observer’s worldline, parametrized by proper time τ, gives a map that assigns to each (s, Ψ, Θ, τ) a spacetime point. In terms of coordinates x = (x^{0}, x^{1}, x^{2}, x^{3}) on the spacetime manifold, this map is of the form
\[{x^i} = {F^i}(s,\Psi, \Theta, \tau),\quad i = 0,1,2,3. \tag{4}\]
It can be viewed as a map from the world as it appears to the observer (via optical observations) to the world as it is. The observational coordinates (s, Ψ, Θ, τ) were introduced by Ellis [98] (see [100] for a detailed discussion). They are particularly useful in cosmology but can be introduced for any observer in any spacetime. It is useful to consider observables, such as distance measures (see Section 2.4) or the ellipticity that describes image distortion (see Section 2.5), as functions of the observational coordinates. Some observables, e.g., the redshift and the luminosity distance, are not determined by the spacetime geometry and the observer alone, but also depend on the 4-velocities of the light sources. If a vector field U with g(U, U) = −1 has been fixed, one may restrict to an observer and to light sources which are integral curves of U. The above-mentioned observables, like redshift and luminosity distance, are then uniquely determined as functions of the observational coordinates. In applications to cosmology one chooses U as tracing the mean flow of luminous matter (“Hubble flow”) or as the rest system of the cosmic background radiation; present observations are compatible with the assumption that these two distinguished observer fields coincide [32].
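The parametrization of the observer's sky by spherical coordinates can be checked numerically. The sketch below assumes the usual choice w = sin Θ cos Ψ e_1 + sin Θ sin Ψ e_2 + cos Θ e_3 + e_0 with e_0 = −U_O; the function names and the component ordering are illustrative and not taken from the review. Because the tetrad is orthonormal, the metric components at p_O are those of the Minkowski metric, whatever the spacetime.

```python
import numpy as np

# Metric components in an orthonormal tetrad, signature (+, +, +, -);
# components are ordered (e_1, e_2, e_3, e_0). This ordering is an assumption
# made for this sketch.
ETA = np.diag([1.0, 1.0, 1.0, -1.0])


def g(a, b):
    """Inner product of two tangent vectors at p_O, given by tetrad components."""
    return a @ ETA @ b


def sky_vector(psi, theta):
    """Past-pointing lightlike vector w for the sky position (Psi, Theta)."""
    return np.array([
        np.sin(theta) * np.cos(psi),  # coefficient of e_1
        np.sin(theta) * np.sin(psi),  # coefficient of e_2
        np.cos(theta),                # coefficient of e_3
        1.0,                          # coefficient of e_0 = -U_O
    ])


# 4-velocity of the observer: U_O = -e_0, so its e_0-component is -1.
U_O = np.array([0.0, 0.0, 0.0, -1.0])

if __name__ == "__main__":
    w = sky_vector(psi=0.7, theta=1.2)
    print("g(w, w)     =", g(w, w))      # should vanish up to rounding
    print("g(w, U_O)   =", g(w, U_O))    # normalization of w relative to U_O
    print("g(U_O, U_O) =", g(U_O, U_O))
```

Every w produced this way is lightlike and has the same normalization with respect to U_O, so the pair (Ψ, Θ) labels exactly one lightlike direction at p_O.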
Writing map (4) explicitly requires solving the lightlike geodesic equation. This is usually done, using standard index notation, in the Lagrangian formalism, with the Lagrangian \({\mathcal L} = {1 \over 2}{g_{ij}}(x){{\dot x}^i}{{\dot x}^j}\), or in the Hamiltonian formalism, with the Hamiltonian \({\mathcal H} = {1 \over 2}{g^{ij}}(x){p_i}{p_j}\). A non-trivial example where the solutions can be explicitly written in terms of elementary functions is the string spacetime of Section 5.10. Somewhat more general, although still very special, is the situation that the lightlike geodesic equation admits three independent constants of motion in addition to the obvious one g^{ij}(x)p_{i}p_{j} = 0. If, for any pair of the four constants of motion, the Poisson bracket vanishes (“complete integrability”), the lightlike geodesic equation can be reduced to first-order form, i.e., the light cone can be written in terms of integrals over the metric coefficients. This is true, e.g., in spherically symmetric and static spacetimes (see Section 4.3).
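As a minimal sketch of the Hamiltonian route, the following code integrates Hamilton's equations for \({\mathcal H} = {1 \over 2}{g^{ij}}(x){p_i}{p_j}\) numerically. Everything specific here is an assumption made for illustration: the Schwarzschild metric in standard coordinates restricted to the equatorial plane, the mass value M = 1, the fixed-step Runge-Kutta scheme, and all function names. The constants of motion p_t and p_ϕ are kept in the state vector so that the constraint \({\mathcal H} = 0\) can be monitored along the ray.

```python
import numpy as np

M = 1.0  # mass parameter m in geometric units (assumed value)


def hamiltonian(state):
    """H = (1/2) g^{ij} p_i p_j for equatorial Schwarzschild, signature (+,+,+,-)."""
    t, r, phi, p_t, p_r, p_phi = state
    f = 1.0 - 2.0 * M / r
    return 0.5 * (-p_t**2 / f + f * p_r**2 + p_phi**2 / r**2)


def rhs(state):
    """Hamilton's equations: dx^i/ds = dH/dp_i, dp_i/ds = -dH/dx^i."""
    t, r, phi, p_t, p_r, p_phi = state
    f = 1.0 - 2.0 * M / r
    dt = -p_t / f
    dr = f * p_r
    dphi = p_phi / r**2
    # -dH/dr; p_t and p_phi are conserved because H does not depend on t, phi.
    dp_r = -(M / r**2) * (p_t**2 / f**2 + p_r**2) + p_phi**2 / r**3
    return np.array([dt, dr, dphi, 0.0, dp_r, 0.0])


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * h * k1)
    k3 = rhs(state + 0.5 * h * k2)
    k4 = rhs(state + h * k3)
    return state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate(state, h, n_steps):
    """Follow the geodesic for n_steps steps of the affine parameter."""
    for _ in range(n_steps):
        state = rk4_step(state, h)
    return state


def lightlike_initial_state(r0, p_r, p_phi, t0=0.0, phi0=0.0):
    """Initial data with H = 0 and p_t > 0, i.e. a past-oriented ray (r0 > 2M)."""
    f = 1.0 - 2.0 * M / r0
    p_t = np.sqrt(f * (f * p_r**2 + p_phi**2 / r0**2))
    return np.array([t0, r0, phi0, p_t, p_r, p_phi])


if __name__ == "__main__":
    start = lightlike_initial_state(r0=10.0, p_r=0.0, p_phi=1.0)
    end = integrate(start, h=0.01, n_steps=2000)
    print("H at start:", hamiltonian(start), " H at end:", hamiltonian(end))
```

Starting the same integration at r0 = 3M with p_r = 0 keeps the ray on a circle, which is the circular lightlike geodesic of the Schwarzschild spacetime at r = 3m mentioned in the introduction. That orbit is unstable, so a numerical solution stays on it only for a limited range of the affine parameter.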
Having parametrized the past light cone of the observation event p_{O} in terms of (s, w), or more specifically in terms of (s, Ψ, Θ), one may set up an exact lens map. This exact lens map is analogous to the lens map of the quasi-Newtonian approximation formalism, as far as possible, but it is valid in an arbitrary spacetime without approximation. In the quasi-Newtonian formalism for thin lenses at rest, the lens map assigns to each point in the lens plane a point in the source plane (see, e.g., [299, 275, 343]). When working in an arbitrary spacetime without approximations, the observer’s sky \({{\mathcal S}_{\rm{O}}}\) is an obvious substitute for the lens plane. As a substitute for the source plane we choose a 3-dimensional submanifold \({\mathcal T}\) with a prescribed ruling by timelike curves. We assume that \({\mathcal T}\) is globally of the form \({\mathcal Q} \times {\mathbb R}\), where the points of the 2-manifold \({\mathcal Q}\) label the timelike curves by which \({\mathcal T}\) is ruled. These timelike curves are to be interpreted as the worldlines of light sources. We call any such \({\mathcal T}\) a source surface. In a nutshell, choosing a source surface means choosing a two-parameter family of light sources.
The exact lens map is a map from \({{\mathcal S}_{\rm{O}}}\) to \({\mathcal Q}\). It is defined by following, for each \(w \in {{\mathcal S}_{\rm{O}}}\), the past-pointing geodesic with initial vector w until it meets \({\mathcal T}\) and then projecting to \({\mathcal Q}\) (see Figure 1). In other words, the exact lens map says, for each point on the observer’s celestial sphere, which of the chosen light sources is seen at this point. Clearly, non-invertibility of the lens map indicates multiple imaging. What one chooses for \({\mathcal T}\) depends on the situation. In applications to cosmology, one may choose galaxies at a fixed redshift z = z_{S} around the observer. In a spherically symmetric and static spacetime one may choose static light sources at a fixed radius value r = r_{S}. Also, the surface of an extended light source is a possible choice for \({\mathcal T}\).
The exact lens map was introduced by Frittelli and Newman [123] and further discussed in [91, 90]. The following global aspects of the exact lens map were investigated in [270]. First, in general the lens map is not defined on all of \({{\mathcal S}_{\rm{O}}}\) because not all past-oriented lightlike geodesics that start at p_{O} necessarily meet \({\mathcal T}\). Second, in general the lens map is multi-valued because a lightlike geodesic might meet \({\mathcal T}\) several times. Third, the lens map need not be differentiable and not even continuous because a lightlike geodesic might meet \({\mathcal T}\) tangentially. In [270], the notion of a simple lensing neighborhood is introduced which translates the statement that a deflector is transparent into precise mathematical language. It is shown that the lens map is globally well-defined and differentiable if the source surface is the boundary of such a simple lensing neighborhood, and that for each light source that does not meet the caustic of the observer’s past light cone the number of images is finite and odd. This result applies, as a special case, to asymptotically simple and empty spacetimes (see Section 3.4).
For expressing the exact lens map in coordinate language, it is advisable to choose coordinates (x^{0}, x^{1}, x^{2}, x^{3}) such that the source surface \({\mathcal T}\) is given by the equation \({x^3} = x_{\rm{S}}^3\), with a constant \(x_{\rm{S}}^3\), and that the worldlines of the light sources are x^{0}-lines. In this situation the remaining coordinates x^{1} and x^{2} label the light sources and the exact lens map takes the form
\[(\Psi, \Theta) \mapsto ({x^1},{x^2}). \tag{5}\]
It is given by eliminating the two variables s and x^{0} from the four equations (4) with \({F^3}(s,\Psi, \Theta, \tau) = x_{\rm{S}}^3\) and fixed τ. This is the way in which the lens map was written in the original paper by Frittelli and Newman; see Equation (6) in [123]. (They used complex coordinates (\(\eta, \bar \eta\)) for the observer’s celestial sphere that are related to our spherical coordinates (Ψ, Θ) by stereographic projection.) In this explicit coordinate version, the exact lens map can be successfully applied, in particular, to spherically symmetric and static spacetimes, with x^{0} = t, x^{1} = ϕ, x^{2} = ϑ, and x^{3} = r (see Section 4.3 and the Schwarzschild example in Section 5.1). The exact lens map can also be used for testing the reliability of approximation techniques. In [184] the authors find that the standard quasi-Newtonian approximation formalism may lead to significant errors for lensing configurations with two lenses.
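The elimination procedure can be made concrete in the simplest possible case. The sketch below is a toy illustration and not an example from the review: it assumes Minkowski spacetime, a static observer at spatial position x_obs whose tetrad is aligned with the coordinate axes, and static light sources on the sphere r = r_S. Lightlike geodesics are then straight lines, so the affine parameter s can be solved for in closed form and the time coordinate simply drops out. All function and parameter names are hypothetical.

```python
import numpy as np


def exact_lens_map_minkowski(psi, theta, x_obs, r_source):
    """Toy exact lens map (Psi, Theta) -> (phi, vartheta) in Minkowski spacetime.

    The past-oriented ray is x(s) = x_obs + s n, t(s) = t_O - s. The source
    surface is r = r_source with static sources labelled by (phi, vartheta).
    """
    n = np.array([
        np.sin(theta) * np.cos(psi),
        np.sin(theta) * np.sin(psi),
        np.cos(theta),
    ])
    x_obs = np.asarray(x_obs, dtype=float)

    # Solve |x_obs + s n|^2 = r_source^2, i.e. s^2 + 2 b s + c = 0.
    b = n @ x_obs
    c = x_obs @ x_obs - r_source**2
    if c >= 0.0:
        raise ValueError("observer must lie inside the source sphere")
    s = -b + np.sqrt(b * b - c)  # the unique positive root because c < 0

    # Project the intersection point to the labels of the source worldline;
    # the time coordinate t_O - s is eliminated at this step.
    x = x_obs + s * n
    vartheta = np.arccos(np.clip(x[2] / r_source, -1.0, 1.0))
    phi = np.arctan2(x[1], x[0]) % (2.0 * np.pi)
    return phi, vartheta


if __name__ == "__main__":
    print(exact_lens_map_minkowski(0.7, 1.2, x_obs=[0.0, 0.0, 0.0], r_source=5.0))
    print(exact_lens_map_minkowski(0.7, 1.2, x_obs=[0.0, 0.0, 1.0], r_source=5.0))
```

For an observer at the center the map reduces to the identity, and for any observer inside the sphere it is one-to-one, as expected in a spacetime without lensing. In a curved spacetime the closed-form root for s would be replaced by a numerical integration of the lightlike geodesic equation up to the source surface.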
2.2 Wave fronts
Wave fronts are related to light rays as solutions of the Hamilton-Jacobi equation are related to solutions of Hamilton’s equations in classical mechanics. For the case at hand (i.e., vacuum light propagation in an arbitrary spacetime, corresponding to the Hamiltonian \({\mathcal H} = {1 \over 2}{g^{ij}}(x){p_i}{p_j}\)), a wave front is a subset of the spacetime that can be constructed in the following way:

1.
Choose a spacelike 2-surface \({\mathcal S}\) that is orientable.

2.
At each point of \({\mathcal S}\), choose a lightlike direction orthogonal to \({\mathcal S}\) that depends smoothly on the footpoint. (You have to choose between two possibilities.)

3.
Take all lightlike geodesics that are tangent to the chosen directions. These lightlike geodesics are called the generators of the wave front, and the wave front is the union of all generators.
Clearly, a light cone is a special case of a wave front. One gets this special case by choosing for \({\mathcal S}\) an appropriate (small) sphere. Any wave front is the envelope of all light cones with vertices on the wave front. In this sense, general-relativistic wave fronts can be constructed according to the Huygens principle.
In the context of general relativity the notion of wave fronts was introduced by Kermack, McCrea, and Whittaker [180]. For a modern review article see, e.g., Ehlers and Newman [93].
A coordinate representation for a wave front can be given with the help of (local) coordinates (u^{1}, u^{2}) on \({\mathcal S}\). One chooses a parameter value s_{0} and parametrizes each generator λ affinely such that \(\lambda ({s_0}) \in {\mathcal S}\) and \(\dot \lambda ({s_0})\) depends smoothly on the footpoint in \({\mathcal S}\). This gives the wave front as the image of a map
\[(s,{u^1},{u^2}) \mapsto \left( {{F^0}(s,{u^1},{u^2}), \ldots, {F^3}(s,{u^1},{u^2})} \right). \tag{6}\]
For light cones we may choose spherical coordinates, (u^{1} = Ψ, u^{2} = Θ), (cf. Equation (4) with fixed τ). Near s = s_{0}, map (6) is an embedding, i.e., the wave front is a submanifold. Orthogonality to \({\mathcal S}\) of the initial vectors \(\dot \lambda ({s_0})\) assures that this submanifold is lightlike. Farther away from \({\mathcal S}\), however, the wave front need not be a submanifold. The caustic of the wave front is the set of all points where the map (6) is not an immersion, i.e., where its differential has rank < 3. As the derivative with respect to s is always non-zero, the rank can be 3 − 1 (caustic point of multiplicity one, astigmatic focusing) or 3 − 2 (caustic point of multiplicity two, anastigmatic focusing). In the first case, the cross-section of an “infinitesimally thin” bundle of generators collapses to a line, in the second case to a point (see Section 2.3). For the case that the wave front is a light cone with vertex p_{O}, caustic points are said to be conjugate to p_{O} along the respective generator. For an arbitrary wave front, one says that a caustic point is conjugate to any spacelike 2-surface in the wave front. In this sense, the terms “conjugate point” and “caustic point” are synonymous. Along each generator, caustic points are isolated (see Section 2.3) and thus denumerable. Hence, one may speak of the first caustic, the second caustic, and so on. At all points where the caustic is a manifold, it is either spacelike or lightlike. For instance, the caustic of the Schwarzschild light cone in Figure 12 is a spacelike curve; in the spacetime of a transparent string, the caustic of the light cone consists of two lightlike 2-manifolds that meet in a spacelike curve (see Figure 25).
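The rank criterion for caustic points can be tried out on two simple wave fronts in Minkowski spacetime. The sketch below is an illustration under stated assumptions, not an example from the review: the initial 2-surfaces (a sphere and a cylinder of radius R at t = 0), the choice of ingoing lightlike directions, the finite-difference Jacobian, and all names are hypothetical. Spacetime coordinates are ordered (t, x, y, z).

```python
import numpy as np

R = 2.0  # radius of the initial 2-surface (assumed value)


def sphere_front(s, u1, u2):
    """Ingoing wave front from the sphere of radius R at t = 0; (u1, u2) = (Psi, Theta)."""
    n = np.array([np.sin(u2) * np.cos(u1), np.sin(u2) * np.sin(u1), np.cos(u2)])
    return np.concatenate(([s], (R - s) * n))


def cylinder_front(s, u1, u2):
    """Ingoing wave front from the cylinder x^2 + y^2 = R^2 at t = 0; (u1, u2) = (angle, z)."""
    return np.array([s, (R - s) * np.cos(u1), (R - s) * np.sin(u1), u2])


def differential_rank(front, s, u1, u2, h=1e-6, tol=1e-6):
    """Rank of the differential of (s, u1, u2) -> front(s, u1, u2).

    The 4 x 3 Jacobian is approximated by central differences.
    """
    point = np.array([s, u1, u2])
    columns = []
    for k in range(3):
        step = np.zeros(3)
        step[k] = h
        columns.append((front(*(point + step)) - front(*(point - step))) / (2.0 * h))
    jacobian = np.column_stack(columns)
    return np.linalg.matrix_rank(jacobian, tol=tol)


if __name__ == "__main__":
    for name, front in [("sphere", sphere_front), ("cylinder", cylinder_front)]:
        print(name,
              "rank before focusing:", differential_rank(front, 1.0, 0.3, 1.1),
              "rank at s = R:", differential_rank(front, R, 0.3, 1.1))
```

In this setup the generators of the spherical front all meet in one point at s = R, so the rank drops by two (multiplicity two), while the generators of the cylindrical front meet along the axis and the rank drops by one (multiplicity one).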
Near a non-caustic point, a wave front is a hypersurface S = constant where S satisfies the Hamilton-Jacobi equation
\[{g^{ij}}(x)\,{\partial _i}S(x)\,{\partial _j}S(x) = 0. \tag{7}\]
In the terminology of optics, Equation (7) is called the eikonal equation.
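As a quick plausibility check, one can verify numerically that a given function S has a lightlike gradient, which is the content of the eikonal equation. The sketch below assumes Minkowski spacetime with coordinates (t, x, y, z) and the candidate S = t + |x|, whose level set S = 0 is the past light cone of the origin; the function names and the finite-difference gradient are illustrative choices.

```python
import numpy as np

# Inverse Minkowski metric for coordinates (t, x, y, z), signature (+, +, +, -).
ETA_INV = np.diag([-1.0, 1.0, 1.0, 1.0])


def S_light_cone(x):
    """Candidate eikonal S = t + |x|; S = 0 is the past light cone of the origin."""
    return x[0] + np.linalg.norm(x[1:])


def eikonal_residual(S, x, h=1e-6):
    """g^{ij} d_i S d_j S at the point x, with a central-difference gradient."""
    x = np.asarray(x, dtype=float)
    grad = np.array([(S(x + h * e) - S(x - h * e)) / (2.0 * h) for e in np.eye(4)])
    return grad @ ETA_INV @ grad


if __name__ == "__main__":
    point = [-1.0, 0.5, 0.3, 0.2]  # any point away from the spatial origin
    print("residual:", eikonal_residual(S_light_cone, point))
```

The residual vanishes up to discretization error away from the vertex, where S is not differentiable. A function such as t + 2|x| would fail the test, since its gradient is spacelike.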
At caustic points, a wave front typically forms cuspidal edges or vertices whose geometry might be arbitrarily complicated, even locally. If one restricts to caustics which are stable against perturbations in a certain sense, then a local classification of caustics is possible with the help of Arnold’s singularity theory of Lagrangian or Legendrian maps. Full details of this theory can be found in [11]. For a readable review of Arnold’s results and their applications to wave fronts in general relativity, we refer again to [93]. In order to apply Arnold’s theory to wave fronts, one associates each wave front with a Legendrian submanifold in the projective cotangent bundle over \({\mathcal M}\) (or with a Lagrangian submanifold in an appropriately reduced bundle). A caustic point of the wave front corresponds to a point where the differential of the projection from the Legendrian submanifold to \({\mathcal M}\) has non-maximal rank. For the case \(\dim ({\mathcal M}) = 4\), which is of interest here, Arnold has shown that there are only five types of caustic points that are stable with respect to perturbations within the class of all Legendrian submanifolds. They are known as fold, cusp, swallow-tail, pyramid, and purse (see Figure 2). Any other type of caustic is unstable in the sense that it changes non-diffeomorphically if it is perturbed within the class of Legendrian submanifolds.
Fold singularities of a wave front form a lightlike 2manifold in spacetime, on a sufficiently small neighborhood of any fold caustic point. The second picture in Figure 2 shows such a “fold surface”, projected to 3space along the integral curves of a timelike vector field. This projected fold surface separates a region covered twice by the wave front from a region not covered at all. If the wave front is the past light cone of an observation event, and if one restricts to light sources with worldlines in a sufficiently small neighborhood of a fold caustic point, there are two images for light sources on one side and no images for light sources on the other side of the fold surface. Cusp singularities of a wave front form a spacelike curve in spacetime, again locally near any cusp caustic point. Such a curve is often called a “cusp ridge”. Along a cusp ridge, two fold surfaces meet tangentially. The third picture in Figure 2 shows the situation projected to 3space. Near a cusp singularity of a past light cone, there is local tripleimaging for light sources in the wedge between the two fold surfaces and local singleimaging for light sources outside this wedge. Swallowtail, pyramid, and purse singularities are points where two or more cusp ridges meet with a common tangent, as illustrated by the last three pictures in Figure 2.
Friedrich and Stewart [118] have demonstrated that all caustic types that are stable in the sense of Arnold can be realized by wave fronts in Minkowski spacetime. Moreover, they stated without proof that, quite generally, one gets the same stable caustic types if one allows for perturbations only within the class of wave fronts (rather than within the larger class of Legendrian submanifolds). A proof of this statement was claimed to be given in [150] where the Lagrangian rather than the Legendrian formalism was used. However, the main result of this paper (Theorem 4.4 of [150]) is actually too weak to justify this claim. A different version of the desired stability result was indeed proven by another approach. In this approach one concentrates on an instantaneous wave front, i.e., on the intersection of a wave front with a spacelike hypersurface \({\mathcal C}\). As an alternative terminology, one calls the intersection of a (“big”) wave front with a hypersurface \({\mathcal C}\) that is transverse to all generators a “small wave front”. Instantaneous wave fronts are special cases of small wave fronts. The caustic of a small wave front is the set of all points where the small wave front fails to be an immersed 2dimensional submanifold of \({\mathcal C}\). If the spacetime is foliated by spacelike hypersurfaces, the caustic of a wave front is the union of the caustics of its small (= instantaneous) wave fronts. Such a foliation can always be achieved locally, and in several spacetimes of interest even globally. If one identifies different slices with the help of a timelike vector field, one can visualize a wave front, and in particular a light cone, as a motion of small (= instantaneous) wave fronts in 3space. Examples are shown in Figures 13, 18, 19, 27, and 28. Mathematically, the same can be done for nonspacelike slices as long as they are transverse to the generators of the considered wave front (see Figure 30 for an example). 
Turning from (big) wave fronts to small wave fronts reduces the dimension by one. The only caustic points of a small wave front that are stable in the sense of Arnold are cusps and swallowtails. What one wants to prove is that all other caustic points are unstable with respect to perturbations of the wave front within the class of wave fronts, keeping the metric and the slicing fixed. For spacelike slicings (i.e., for instantaneous wave fronts), this was indeed demonstrated by Low [210]. In this article, the author views wave fronts as subsets of the space \({\mathcal N}\) of all lightlike geodesics in (\({\mathcal M},g\)). General properties of this space \({\mathcal N}\) are derived in earlier articles by Low [208, 209] (also see Penrose and Rindler [262], volume II, where the space \({\mathcal N}\) is treated in twistor language). Low considers, in particular, the case of a globally hyperbolic spacetime [210]; he demonstrates the desired stability result for the intersections of a (big) wave front with Cauchy hypersurfaces (see Section 3.2). As every point in an arbitrary spacetime admits a globally hyperbolic neighborhood, this local stability result is universal. Figure 28 shows an instantaneous wave front with cusps and a swallowtail point. Figure 13 shows instantaneous wave fronts with caustic points that are neither cusps nor swallowtails; hence, they must be unstable with respect to perturbations of the wave front within the class of wave fronts.
It is to be emphasized that Low’s work allows one to classify the stable caustics of small wave fronts, but not directly of (big) wave fronts. Clearly, a (big) wave front is a one-parameter family of small wave fronts. A qualitative change of a small wave front, in dependence on a parameter, is called a “metamorphosis” in the English literature and a “perestroika” in the Russian literature. Combining Low’s results with the theory of metamorphoses, or perestroikas, could lead to a classification of the stable caustics of (big) wave fronts. However, this has not been worked out until now.
Wave fronts in general relativity have been studied in a long series of articles by Newman, Frittelli, and collaborators. For some aspects of their work see Sections 2.9 and 3.4. In the quasi-Newtonian approximation formalism of lensing, the classification of caustics is treated in great detail in the book by Petters, Levine, and Wambsganss [275]. Interesting related material can also be found in Blandford and Narayan [33]. For a nice exposition of caustics in ordinary optics see Berry and Upstill [28].
A light source that comes close to the caustic of the observer’s past light cone is seen strongly magnified. For a point source whose worldline passes exactly through the caustic, the rayoptical treatment even gives an infinite brightness (see Section 2.6). If a light source passes behind a compact deflecting mass, its brightness increases and decreases in the course of time, with a maximum at the moment of closest approach to the caustic. Such microlensing events are routinely observed by monitoring a large number of stars in the bulge of our Galaxy, in the Magellanic Clouds, and in the Andromeda Galaxy (see, e.g., [226] for an overview). In his millennium essay on future perspectives of gravitational lensing, Blandford [34] mentioned the possibility of observing a chosen light source strongly magnified over a period of time with the help of a spaceborn telescope. The idea is to guide the spacecraft such that the worldline of the light source remains in (or close to) the oneparameter family of caustics of past light cones of the spacecraft over a period of time. This futuristic idea of “caustic surfing” was mathematically further discussed by Frittelli and Petters [128].
2.3 Optical scalars and Sachs equations
For the calculation of distance measures, of image distortion, and of the brightness of images one has to study the Jacobi equation (= equation of geodesic deviation) along lightlike geodesics. This is usually done in terms of the optical scalars which were introduced by Sachs et al. [172, 292]. Related background material on lightlike geodesic congruences can be found in many textbooks (see, e.g., Wald [341], Section 9.2). In view of applications to lensing, a particularly useful exposition was given by Seitz, Schneider and Ehlers [303]. In the following the basic notions and results will be summarized.
2.3.1 Infinitesimally thin bundles
Let s ↦ λ(s) be an affinely parametrized lightlike geodesic with tangent vector field \(K = \dot \lambda\). We assume that λ is past-oriented, because in applications to lensing one usually considers rays from the observer to the source. We use the summation convention for capital indices A, B, … taking the values 1 and 2. An infinitesimally thin bundle (with elliptical cross-section) along λ is a set
\[{\mathcal B} = \left\{ c^A Y_A \;\middle|\; c^1, c^2 \in {\mathbb R},\; \delta_{AB}\, c^A c^B \le 1 \right\}. \tag{8}\]
Here δ_{AB} denotes the Kronecker delta, and Y_{1} and Y_{2} are two vector fields along λ with
\[\nabla_K \nabla_K Y_A = R(K, Y_A, K), \tag{9}\]
\[g(K, Y_A) = 0, \tag{10}\]
such that Y_{1}(s), Y_{2}(s), and K(s) are linearly independent for almost all s. As usual, R denotes the curvature tensor, defined by
\[R(X, Y, Z) = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X, Y]} Z. \tag{11}\]
Equation (9) is the Jacobi equation. It is a precise mathematical formulation of the statement that “the arrowhead of Y_{A} traces an infinitesimally neighboring geodesic”. Equation (10) guarantees that this neighboring geodesic is, again, lightlike and spatially related to λ.
2.3.2 Sachs basis
For discussing the geometry of infinitesimally thin bundles it is usual to introduce a Sachs basis, i.e., two vector fields E_{1} and E_{2} along λ that are orthonormal, orthogonal to \(K = \dot \lambda\), and parallelly transported,
\[g(E_A, E_B) = \delta_{AB}, \qquad g(K, E_A) = 0, \qquad \nabla_K E_A = 0. \tag{12}\]
Apart from the possibility to interchange them, E_{1} and E_{2} are unique up to transformations
where α, a_{1}, and a_{2} are constant along λ. A Sachs basis determines a unique vector field U with g(U, U) = −1 and g(U, K) = 1 along λ that is perpendicular to E_{1} and E_{2}. As K is assumed past-oriented, U is future-oriented. In the rest system of the observer field U, the Sachs basis spans the 2-space perpendicular to the ray. It is helpful to interpret this 2-space as a “screen”; correspondingly, linear combinations of E_{1} and E_{2} are often referred to as “screen vectors”.
2.3.3 Jacobi matrix
With respect to a Sachs basis, the basis vector fields Y_{1} and Y_{2} of an infinitesimally thin bundle can be represented as
\[Y_A = D_A^B E_B + y_A K. \tag{15}\]
The Jacobi matrix \(D = (D_A^B)\) relates the shape of the cross-section of the infinitesimally thin bundle to the Sachs basis (see Figure 3). Equation (9) implies that D satisfies the matrix Jacobi equation
\[\ddot D = D\,R, \tag{16}\]
where an overdot means derivative with respect to the affine parameter s, and
is the optical tidal matrix, with
Here Ric denotes the Ricci tensor, defined by Ric(X, Y) = tr(R(·, X, Y)), and C denotes the conformal curvature tensor (= Weyl tensor). The notation in Equation (18) is chosen in agreement with the NewmanPenrose formalism (cf., e.g., [54]). As Y_{1}, Y_{2}, and K are not everywhere linearly dependent, det(D) does not vanish identically. Linearity of the matrix Jacobi equation implies that det(D) has only isolated zeros. These are the “caustic points” of the bundle (see below).
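To make the role of det(D) concrete, the following is a minimal numerical sketch, not a method taken from the cited literature. It assumes the matrix Jacobi equation in the form \(\ddot D = D R\) with a symmetric optical tidal matrix R(s), and a bundle starting at a vertex (D = 0, \(\dot D\) = identity). The function names `integrate_jacobi` and `caustic_parameters` and the constant toy tidal matrix are hypothetical choices made only for illustration.

```python
import numpy as np

def integrate_jacobi(tidal, s_max, n_steps=2000):
    """Integrate D'' = D @ R(s) from a vertex at s = 0 with classical RK4.

    tidal(s) must return the symmetric 2x2 optical tidal matrix R(s).
    Returns the affine-parameter grid and the stacked Jacobi matrices D(s).
    """
    h = s_max / n_steps
    s_grid = np.linspace(0.0, s_max, n_steps + 1)
    D = np.zeros((2, 2))   # vertex condition D(0) = 0
    V = np.eye(2)          # vertex condition dD/ds(0) = identity
    out = [D.copy()]
    for s in s_grid[:-1]:
        # First-order form: D' = V, V' = D R(s)
        k1D, k1V = V, D @ tidal(s)
        k2D, k2V = V + h / 2 * k1V, (D + h / 2 * k1D) @ tidal(s + h / 2)
        k3D, k3V = V + h / 2 * k2V, (D + h / 2 * k2D) @ tidal(s + h / 2)
        k4D, k4V = V + h * k3V, (D + h * k3D) @ tidal(s + h)
        D = D + h / 6 * (k1D + 2 * k2D + 2 * k3D + k4D)
        V = V + h / 6 * (k1V + 2 * k2V + 2 * k3V + k4V)
        out.append(D.copy())
    return s_grid, np.array(out)

def caustic_parameters(s_grid, D_all):
    """Affine parameters where det(D) changes sign (linear interpolation).

    Only sign changes are detected, so a zero of det(D) at which both
    eigen-directions collapse simultaneously (no sign change) is missed.
    """
    det = np.linalg.det(D_all)
    hits = []
    for i in range(1, len(s_grid) - 1):   # skip the vertex at index 0
        if det[i] * det[i + 1] < 0.0:
            t = det[i] / (det[i] - det[i + 1])
            hits.append(s_grid[i] + t * (s_grid[i + 1] - s_grid[i]))
    return hits

# Toy model (an assumption): constant, focusing, anisotropic tidal matrix.
s, D = integrate_jacobi(lambda s: np.diag([-1.0, -4.0]), 2.5)
first_caustics = caustic_parameters(s, D)
```

For this toy tidal matrix the exact solution is D(s) = diag(sin s, sin(2s)/2), so the only zero of det(D) in the integration range lies at s = π/2, where the cross-section collapses to a line.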
2.3.4 Shape parameters
The Jacobi matrix D can be parametrized according to
Here we made use of the fact that any matrix can be written as the product of an orthogonal and a symmetric matrix, and that any symmetric matrix can be diagonalized. Note that, by our definition of infinitesimally thin bundles, D_{+} and D_{−} are nonzero almost everywhere. Equation (19) determines D_{+} and D_{−} up to sign. The most interesting case for us is that of an infinitesimally thin bundle that issues from a vertex at an observation event p_{O} into the past. For such bundles we require D_{+} and D_{−} to be positive near the vertex and differentiable everywhere; this uniquely determines D_{+} and D_{−} everywhere. With D_{+} and D_{−} fixed, the angles χ and ψ are unique at all points where the bundle is noncircular; in other words, requiring them to be continuous determines these angles uniquely along every infinitesimally thin bundle that is noncircular almost everywhere. In the representation of Equation (19), the extremal points of the bundle’s elliptical crosssection are given by the position vectors
where ≃ means equality up to multiples of K. Hence, D_{+} and D_{−} give the semi-axes of the elliptical cross-section and χ gives the angle by which the ellipse is rotated with respect to the Sachs basis (see Figure 3). We call D_{+}, D_{−}, and χ the shape parameters of the bundle, following Frittelli, Kling, and Newman [121, 120]. Instead of D_{+} and D_{−} one may also use D_{+}D_{−} and D_{+}/D_{−}. For the case that the infinitesimally thin bundle can be embedded in a wave front, the shape parameters D_{+} and D_{−} have the following interesting property (see Kantowski et al. [173, 84]). \({\dot D}_+/D_+\) and \({\dot D}_-/D_-\) give the principal curvatures of the wave front in the rest system of the observer field U which is perpendicular to the Sachs basis. The notation D_{+} and D_{−}, which is taken from [84], is convenient because it often allows one to write two equations in the form of one equation with a ± sign (see, e.g., Equation (27) or Equation (93) below). The angle χ can be directly linked with observations if a light source emits linearly polarized light (see Section 2.5). If the Sachs basis is transformed according to Equations (13, 14) and Y_{1} and Y_{2} are kept fixed, the Jacobi matrix changes according to \({\tilde D}_\pm = D_\pm\), \(\tilde \chi = \chi + \alpha\), \(\tilde \psi = \psi\). This demonstrates the important fact that the shape and the size of the cross-section of an infinitesimally thin bundle have an invariant meaning [292].
2.3.5 Optical scalars
Along each infinitesimally thin bundle one defines the deformation matrix S by
\[\dot D = D\,S. \tag{22}\]
This reduces the second-order linear differential equation (16) for D to a first-order nonlinear differential equation for S,
\[\dot S + S\,S = R. \tag{23}\]
It is usual to decompose S into antisymmetric, symmetrictracefree, and trace parts,
This defines the optical scalars ω (twist), θ (expansion), and (σ_{1}, σ_{2}) (shear). One usually combines them into two complex scalars ϱ = θ + iω and σ = σ_{1} + iσ_{2}. A change (13, 14) of the Sachs basis affects the optical scalars according to \(\tilde \varrho = \varrho\) and \(\tilde \sigma = {e^{-2i\alpha}}\sigma\). Thus, ϱ and |σ| are invariant. If rewritten in terms of the optical scalars, Equation (23) gives the Sachs equations
One sees that the Ricci curvature term Φ_{00} directly produces expansion (focusing) and that the conformal curvature term ψ_{0} directly produces shear. However, as the shear appears in Equation (25), conformal curvature indirectly influences focusing (cf. Penrose [260]). With D written in terms of the shape parameters and S written in terms of the optical scalars, Equation (22) results in
Along λ, Equations (25, 26) give a system of 4 real first-order differential equations for the 4 real variables contained in ϱ and σ; if ϱ and σ are known, Equation (27) gives a system of 4 real first-order differential equations for the 4 real variables D_{±}, χ, and ψ. The twist-free solutions (ϱ real) to Equations (25, 26) constitute a 3-dimensional linear subspace of the 4-dimensional space of all solutions. This subspace carries a natural metric of Lorentzian signature, unique up to a conformal factor, and was nicknamed Minikowski space in [20].
2.3.6 Conservation law
As the optical tidal matrix R is symmetric, for any two solutions D_{1} and D_{2} of the matrix Jacobi equation (16) we have
\[{\dot D}_1 D_2^T - D_1 {\dot D}_2^T = \text{constant}, \tag{28}\]
where ()^{T} means transposition. Evaluating the case D_{1} = D_{2} shows that for every infinitesimally thin bundle
\[\omega D_+ D_- = \text{constant}. \tag{29}\]
Thus, there are two types of infinitesimally thin bundles: those for which this constant is nonzero and those for which it is zero. In the first case the bundle is twisting (ω ≠ 0 everywhere) and its cross-section nowhere collapses to a line or to a point (D_{+} ≠ 0 and D_{−} ≠ 0 everywhere). In the second case the bundle must be nontwisting (ω = 0 everywhere), because our definition of infinitesimally thin bundles implies that D_{+} ≠ 0 and D_{−} ≠ 0 almost everywhere. A quick calculation shows that ω = 0 is exactly the integrability condition that makes sure that the infinitesimally thin bundle can be embedded in a wave front. (For the definition of wave fronts see Section 2.2.) In other words, for an infinitesimally thin bundle we can find a wave front such that λ is one of the generators, and Y_{1} and Y_{2} connect λ with infinitesimally neighboring generators, if and only if the bundle is twist-free. For a (necessarily twist-free) infinitesimally thin bundle, points where one of the two shape parameters D_{+} and D_{−} vanishes are called caustic points of multiplicity one, and points where both shape parameters D_{+} and D_{−} vanish are called caustic points of multiplicity two. This notion coincides exactly with the notion of caustic points, or conjugate points, of wave fronts as introduced in Section 2.2. The behavior of the optical scalars near caustic points can be deduced from Equation (27) with Equations (25, 26). For a caustic point of multiplicity one at s = s_{0} one finds
By contrast, for a caustic point of multiplicity two at s = s_{0} the equations read (cf. [303])
2.3.7 Infinitesimally thin bundles with vertex
We say that an infinitesimally thin bundle has a vertex at s = s_{0} if the Jacobi matrix satisfies
\[D(s_0) = 0, \qquad \dot D(s_0) = {\mathbf 1}. \tag{34}\]
A vertex is, in particular, a caustic point of multiplicity two. An infinitesimally thin bundle with a vertex must be nontwisting. While any nontwisting infinitesimally thin bundle can be embedded in a wave front, an infinitesimally thin bundle with a vertex can be embedded in a light cone. Near the vertex, it has a circular cross-section. If D_{1} has a vertex at s_{1} and D_{2} has a vertex at s_{2}, the conservation law (28) implies
\[D_2^T(s_1) = -D_1(s_2). \tag{35}\]
This is Etherington’s [104] reciprocity law. The method by which this law was proven here follows Ellis [97] (cf. Schneider, Ehlers, and Falco [299]). Etherington’s reciprocity law is of relevance, in particular in view of cosmology, because it relates the luminosity distance to the area distance (see Equation (47)). It was independently rediscovered in the 1960s by Sachs and Penrose (see [260, 190]).
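The reciprocity law can be checked numerically. The sketch below assumes the matrix Jacobi equation in the form \(\ddot D = D R\) with symmetric R(s) and the vertex conditions D = 0, \(\dot D\) = identity; under these assumptions the conserved quantity \(\dot D_1 D_2^T - D_1 \dot D_2^T\), evaluated at the two vertices, gives \(D_2^T(s_1) = -D_1(s_2)\). The tidal matrix `tidal` is a hypothetical smooth toy model, and `jacobi_from_vertex` is an illustrative helper name.

```python
import numpy as np

def tidal(s):
    # Hypothetical smooth, symmetric optical tidal matrix (toy model only).
    off = 0.4 * np.cos(2.0 * s)
    return np.array([[-1.0 - 0.3 * np.sin(s), off],
                     [off, -0.5 + 0.2 * s]])

def jacobi_from_vertex(s_vertex, s_end, n_steps=4000):
    """RK4 solution of D'' = D R(s) with D(s_vertex) = 0, D'(s_vertex) = 1.

    Returns D(s_end). The step h is negative when s_end < s_vertex, which
    integrates the bundle "backwards" along the same geodesic.
    """
    h = (s_end - s_vertex) / n_steps
    s, D, V = s_vertex, np.zeros((2, 2)), np.eye(2)
    for _ in range(n_steps):
        k1D, k1V = V, D @ tidal(s)
        k2D, k2V = V + h / 2 * k1V, (D + h / 2 * k1D) @ tidal(s + h / 2)
        k3D, k3V = V + h / 2 * k2V, (D + h / 2 * k2D) @ tidal(s + h / 2)
        k4D, k4V = V + h * k3V, (D + h * k3D) @ tidal(s + h)
        D = D + h / 6 * (k1D + 2 * k2D + 2 * k3D + k4D)
        V = V + h / 6 * (k1V + 2 * k2V + 2 * k3V + k4V)
        s += h
    return D

s1, s2 = 0.0, 1.7
D1_at_s2 = jacobi_from_vertex(s1, s2)   # bundle with vertex at s1, evaluated at s2
D2_at_s1 = jacobi_from_vertex(s2, s1)   # bundle with vertex at s2, evaluated at s1

# Reciprocity predicts D2(s1)^T + D1(s2) = 0 up to integration error.
reciprocity_defect = np.max(np.abs(D2_at_s1.T + D1_at_s2))
```

Taking determinants of both sides shows that the two bundles have the same value of |D_{+}D_{−}| at each other's vertex, which is the form in which the law relates area distance and corrected luminosity distance.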
The results of this section are the basis for Sections 2.4, 2.5, and 2.6.
2.4 Distance measures
In this section we summarize various distance measures that are defined in an arbitrary spacetime. Some of them are directly related to observable quantities with relevance for lensing. The material of this section makes use of the results on infinitesimally thin bundles which are summarized in Section 2.3. All of the distance measures to be discussed refer to a pastoriented lightlike geodesic λ from an observation event p_{O} to an emission event p_{S} (see Figure 4). Some of them depend on the 4velocity U_{O} of the observer at p_{O} and/or on the 4velocity U_{S} of the light source at p_{S}. If a vector field U with g(U, U) = −1 is distinguished on \({\mathcal M}\), we can choose for the observer an integral curve of U and for the light sources all other integral curves of U. Then each of the distance measures becomes a function of the observational coordinates (s, Ψ, Θ, τ) (recall Section 2.1).
2.4.1 Affine distance
There is a unique affine parametrization s ↦ λ(s) for each lightlike geodesic through the observation event p_{O} such that λ(0) = p_{O} and \(g\left({\dot \lambda (0),{U_O}} \right) = 1\). Then the affine parameter s itself can be viewed as a distance measure. This affine distance has the desirable features that it increases monotonically along each ray and that it coincides in an infinitesimal neighborhood of p_{O} with Euclidean distance in the rest system of U_{O}. The affine distance depends on the 4-velocity U_{O} of the observer but not on the 4-velocity U_{S} of the light source. It is a mathematically very convenient notion, but it is not an observable. (It can be operationally realized in terms of an observer field whose 4-velocities are parallel along the ray. Then the affine distance results by integration if each observer measures the length of an infinitesimally short part of the ray in his rest system. However, in view of astronomical situations this is a purely theoretical construction.) The notion of affine distance was introduced by Kermack, McCrea, and Whittaker [180].
2.4.2 Travel time
As an alternative distance measure one can use the travel time. This requires the choice of a time function, i.e., of a function t that slices the spacetime into spacelike hypersurfaces t = constant. (Such a time function globally exists if and only if the spacetime is stably causal; see, e.g., [154], p. 198.) The travel time is equal to t(p_{O}) − t(p_{S}), for each p_{S} on the past light cone of p_{O}. In other words, the intersection of the light cone with a hypersurface t = constant determines events of equal travel time; we call these intersections “instantaneous wave fronts” (recall Section 2.2). Examples of instantaneous wave fronts are shown in Figures 13, 18, 19, 27, and 28. The travel time increases monotonically along each ray. Clearly, it depends neither on the 4-velocity U_{O} of the observer nor on the 4-velocity U_{S} of the light source. Note that the travel time has a unique value at each point of p_{O}’s past light cone, even at events that can be reached by two different rays from p_{O}. Near p_{O} the travel time coincides with Euclidean distance in the observer’s rest system only if U_{O} is perpendicular to the hypersurface t = constant with dt(U_{O}) = 1. (The latter equation is true if along the observer’s worldline the time function t coincides with proper time.) The travel time is not directly observable. However, travel time differences are observable in multiple-imaging situations if the intrinsic luminosity of the light source is time-dependent. To illustrate this, think of a light source that flashes at a particular instant. If the flash reaches the observer’s worldline along two different rays, the proper time difference Δτ_{O} of the two arrival events is directly measurable. For a time function t that along the observer’s worldline coincides with proper time, this observed time delay Δτ_{O} gives the difference in travel time for the two rays.
In view of applications, the measurement of time delays is of great relevance for quasar lensing. For the double quasar 0957+561 the observed time delay Δτ_{O} is about 417 days (see, e.g., [275], p. 149).
2.4.3 Redshift
In cosmology it is common to use the redshift as a distance measure. For assigning a redshift to a lightlike geodesic λ that connects the observation event p_{O} on the worldline γ_{O} of the observer with the emission event p_{S} on the worldline γ_{S} of the light source, one considers a neighboring lightlike geodesic that meets γ_{O} at a proper time interval Δτ_{O} from p_{O} and γ_{S} at a proper time interval Δτ_{S} from p_{S}. The redshift z is defined as
If λ is affinely parametrized with λ(0) = p_{O} and λ(s) = p_{S}, one finds that z is given by
This general redshift formula is due to Kermack, McCrea, and Whittaker [180]. Their proof is based on the fact that \(g(\dot \lambda, Y)\) is a constant for all Jacobi fields Y that connect λ with an infinitesimally neighboring lightlike geodesic. The same proof can be found, in a more elegant form, in [41] and in [312], p. 109. An alternative proof, based on variational methods, was given by Schrödinger [300]. Equation (37) is in agreement with the Hamilton formalism for photons. Clearly, the redshift depends on the 4-velocity U_{O} of the observer and on the 4-velocity U_{S} of the light source. If a vector field U with g(U, U) = −1 has been distinguished on \({\mathcal M}\), we may choose one integral curve of U as the observer and all other integral curves of U as the light sources. Then the redshift becomes a function of the observational coordinates (s, Ψ, Θ, τ). For s → 0, the redshift goes to 0,
\[z(s, \Psi, \Theta, \tau) = h(\Psi, \Theta, \tau)\, s + {\mathcal O}(s^2), \tag{38}\]
with a (generalized) Hubble parameter h(Ψ, Θ, τ) that depends on spatial direction and on time. For criteria under which h and the higher-order coefficients are independent of Ψ and Θ, see [152]. If the redshift is known for one observer field U, it can be calculated for any other U, according to Equation (37), just by adding the usual special-relativistic Doppler factors. Note that if U_{O} is given, the redshift can be made zero along any one ray λ from p_{O} by choosing the 4-velocities U_{λ(s)} appropriately. This shows that z is a reasonable distance measure only for special situations, e.g., in cosmological models with U denoting the mean flow of luminous matter (“Hubble flow”). In any case, the redshift is directly observable if the light source emits identifiable spectral lines. For the calculation of Sagnac-like effects, the redshift formula (37) can be evaluated piecewise along broken lightlike geodesics [23].
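As an illustration of how the redshift depends on the two 4-velocities, the following sketch evaluates the ratio of \(g(\dot \lambda, U)\) at source and observer in Minkowski spacetime, where the tangent of a geodesic has constant components. It assumes the usual convention 1 + z = ν_{S}/ν_{O}, which for a past-oriented tangent reads \(1 + z = g(\dot \lambda (s), U_S)/g(\dot \lambda (0), U_O)\); the helper names and the chosen velocity are hypothetical.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric, signature (-,+,+,+)

def g(X, Y):
    """Metric product of two 4-vectors given in inertial coordinates."""
    return float(X @ ETA @ Y)

def four_velocity(v):
    """Future-oriented unit timelike vector for 3-velocity v (units with c = 1)."""
    v = np.asarray(v, dtype=float)
    gamma = 1.0 / np.sqrt(1.0 - v @ v)
    return gamma * np.concatenate(([1.0], v))

def redshift(K_obs, U_obs, K_src, U_src):
    """z from 1 + z = g(K_src, U_src) / g(K_obs, U_obs), K past-oriented."""
    return g(K_src, U_src) / g(K_obs, U_obs) - 1.0

# Past-oriented lightlike tangent pointing from the observer towards +x.
# In Minkowski spacetime it is parallel along the ray, so the same
# components apply at the observer and at the source.
K = np.array([-1.0, 1.0, 0.0, 0.0])

U_O = four_velocity([0.0, 0.0, 0.0])    # observer at rest, g(K, U_O) = 1
U_S = four_velocity([0.6, 0.0, 0.0])    # source receding along +x

z = redshift(K, U_O, K, U_S)
```

With these values the result agrees with the special-relativistic Doppler factor, 1 + z = sqrt((1 + v)/(1 − v)) = 2 for v = 0.6.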
2.4.4 Angular diameter distances
The notion of angular diameter distance is based on the intuitive idea that the farther an object is away the smaller it looks, according to the rule
\[\frac{\text{object diameter}}{\text{angle}} = \text{distance}. \tag{39}\]
The formal definition needs the results of Section 2.3 on infinitesimally thin bundles. One considers a past-oriented lightlike geodesic s ↦ λ(s) parametrized by affine distance, i.e., λ(0) = p_{O} and \(g\left({\dot \lambda (0),{U_O}} \right) = 1\), and along λ an infinitesimally thin bundle with vertex at the observer, i.e., at s = 0. Then the shape parameters D_{+}(s) and D_{−}(s) (recall Figure 3) satisfy the initial conditions D_{±}(0) = 0 and \({{\dot D}_ \pm}(0) = 1\). They have the following physical meaning. If the observer sees a circular image of (small) angular diameter α on his or her sky, the (small but extended) light source at affine distance s actually has an elliptical cross-section with extremal diameters αD_{±}(s). It is therefore reasonable to call D_{+} and D_{−} the extremal angular diameter distances. Near the vertex, D_{+} and D_{−} are monotonically increasing functions of the affine distance, \({D_ \pm}(s) = s + {\mathcal O}({s^2})\). Farther away from the vertex, however, they may become decreasing, so the functions s ↦ D_{+}(s) and s ↦ D_{−}(s) need not be invertible. At a caustic point of multiplicity one, one of the two functions D_{+} and D_{−} changes sign; at a caustic point of multiplicity two, both change sign (recall Section 2.3). The image of a light source at affine distance s is said to have even parity if D_{+}(s)D_{−}(s) > 0 and odd parity if D_{+}(s)D_{−}(s) < 0. Images with odd parity show the neighborhood of the light source side-inverted in comparison to images with even parity. Clearly, D_{+} and D_{−} are reasonable distance measures only in a neighborhood of the vertex where they are monotonically increasing. However, the physical relevance of D_{+} and D_{−} lies in the fact that they relate cross-sectional diameters at the source to angular diameters at the observer, and this is always true, even beyond caustic points.
D_{+} and D_{−} depend on the 4velocity U_{O} of the observer but not on the 4velocity U_{S} of the source. This reflects the fact that the angular diameter of an image on the observer’s sky is subject to aberration whereas the crosssectional diameter of an infinitesimally thin bundle has an invariant meaning (recall Section 2.3). Hence, if the observer’s worldline γ_{O} has been specified, D_{+} and D_{−} are welldefined functions of the observational coordinates (s, Ψ, Θ, τ).
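The physical meaning of the extremal angular diameter distances can be summarized in a few lines of code. This is only an illustrative sketch: the function name `source_cross_section` and the numerical values are hypothetical, and the diameters are returned as absolute values so that they remain meaningful beyond caustic points, where D_{+} or D_{−} is negative.

```python
def source_cross_section(alpha, D_plus, D_minus):
    """Relate a small circular image to the source cross-section.

    alpha   : angular diameter of the circular image on the observer's sky
    D_plus  : extremal angular diameter distance D_+(s) at the source
    D_minus : extremal angular diameter distance D_-(s) at the source
    """
    product = D_plus * D_minus
    if product > 0:
        parity = "even"
    elif product < 0:
        parity = "odd"
    else:
        parity = "caustic"   # cross-section collapsed to a line or a point
    # Extremal diameters of the elliptical cross-section at the source.
    diameters = (abs(alpha * D_plus), abs(alpha * D_minus))
    return {"diameters": diameters, "parity": parity}

# Hypothetical example: one shape parameter has changed sign, so exactly
# one caustic point of multiplicity one lies between observer and source.
result = source_cross_section(alpha=1e-6, D_plus=2.0, D_minus=-0.5)
```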
2.4.5 Area distance
The area distance D_{area} is defined according to the idea
\[\frac{\text{object area}}{\text{solid angle}} = \text{distance}^2. \tag{40}\]
As a formal definition for D_{area}, in terms of the extremal angular diameter distances D_{+} and D_{−} as functions of affine distance s, we use the equation
\[D_{\rm area}(s) = \sqrt{\left| D_+(s)\, D_-(s) \right|}. \tag{41}\]
D_{area}(s)^{2} indeed relates, for a bundle with vertex at the observer, the cross-sectional area at the source to the opening solid angle at the observer. Such a bundle has a caustic point exactly at those points where D_{area}(s) = 0. The area distance is often called “angular diameter distance” although, as indicated by Equation (41), the name “averaged angular diameter distance” would be more appropriate. Just as D_{+} and D_{−}, the area distance depends on the 4-velocity U_{O} of the observer but not on the 4-velocity U_{S} of the light source. The area distance is observable for a light source whose true size is known (or can be reasonably estimated). It is sometimes convenient to introduce the magnification or amplification factor
\[\mu(s) = \frac{s^2}{D_+(s)\, D_-(s)}. \tag{42}\]
The absolute value of μ determines the area distance, and the sign of μ determines the parity. In Minkowski spacetime, D_{±}(s) = s and, thus, μ(s) = 1. Hence, μ(s) > 1 means that a (small but extended) light source at affine distance s subtends a larger solid angle on the observer’s sky than a light source of the same size at the same affine distance in Minkowski spacetime. Note that in a multiple-imaging situation the individual images may have different affine distances. Thus, the relative magnification factor of two images is not directly observable. This is an important difference to the magnification factor that is used in the quasi-Newtonian approximation formalism of lensing. The latter is defined by comparison with an “unlensed image” (see, e.g., [299]), a notion that makes sense only if the metric is viewed as a perturbation of some “background” metric. One can derive a differential equation for the area distance (or, equivalently, for the magnification factor) as a function of affine distance in the following way. On every parameter interval where D_{+}D_{−} has no zeros, the real part of Equation (27) shows that the area distance is related to the expansion by
\[{\dot D}_{\rm area} = \theta\, D_{\rm area}. \tag{43}\]
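Insertion into the Sachs equation (25) for θ = ϱ gives the focusing equation
\[{\ddot D}_{\rm area} = -\left( |\sigma|^2 + \tfrac{1}{2}\,{\rm Ric}(\dot \lambda, \dot \lambda) \right) D_{\rm area}. \tag{44}\]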
Between the vertex at s = 0 and the first conjugate point (caustic point), D_{area} is determined by Equation (44) and the initial conditions
\[D_{\rm area}(0) = 0, \qquad {\dot D}_{\rm area}(0) = 1. \tag{45}\]
The Ricci term in Equation (44) is nonnegative if Einstein’s field equation holds and if the energy density is nonnegative for all observers (“weak energy condition”). Then Equations (44, 45) imply that
\[D_{\rm area}(s) \le s, \tag{46}\]
i.e., 1 ≤ μ(s), for all s between the vertex at s = 0 and the first conjugate point. In Minkowski spacetime, Equation (46) holds with equality. Hence, Equation (46) says that the gravitational field has a focusing, as opposed to a defocusing, effect. This is sometimes called the focusing theorem.
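The focusing theorem can be illustrated numerically. The sketch below assumes the focusing equation in the form \(\ddot D_{\rm area} = -q(s)\, D_{\rm area}\), where q(s) stands for the sum of the shear term |σ|^{2} and the Ricci term and is nonnegative under the weak energy condition, together with the vertex conditions D_{area}(0) = 0 and \(\dot D_{\rm area}(0) = 1\). The function name `area_distance` and the source term `q_toy` are hypothetical, and the check is only meaningful up to the first conjugate point.

```python
import numpy as np

def area_distance(q, s_max, n_steps=2000):
    """RK4 integration of D'' = -q(s) D with D(0) = 0, D'(0) = 1.

    q(s) is the (assumed nonnegative) focusing source term.
    Returns the affine-distance grid and D_area on that grid.
    """
    h = s_max / n_steps
    s = np.linspace(0.0, s_max, n_steps + 1)
    D = np.empty(n_steps + 1)
    d, v = 0.0, 1.0
    D[0] = d
    for i in range(n_steps):
        si = s[i]
        k1d, k1v = v, -q(si) * d
        k2d, k2v = v + h / 2 * k1v, -q(si + h / 2) * (d + h / 2 * k1d)
        k3d, k3v = v + h / 2 * k2v, -q(si + h / 2) * (d + h / 2 * k2d)
        k4d, k4v = v + h * k3v, -q(si + h) * (d + h * k3d)
        d += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        D[i + 1] = d
    return s, D

def q_toy(s):
    # Hypothetical nonnegative source term (shear plus Ricci contribution).
    return 0.5 + 0.3 * np.sin(s) ** 2

s, D_area = area_distance(q_toy, s_max=2.0)   # stays before the first zero of D_area

# Before the first conjugate point D_+ D_- > 0, so mu = s^2 / D_area^2.
mu = s[1:] ** 2 / D_area[1:] ** 2
```

For this toy source term the computed area distance never exceeds the affine distance, so μ ≥ 1 on the whole grid; with q = 0 the integration reproduces the Minkowski result D_{area}(s) = s.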
2.4.6 Corrected luminosity distance
The idea of defining distance measures in terms of bundle cross-sections dates back to Tolman [323] and Whittaker [351]. Originally, this idea was applied not to bundles with vertex at the observer but rather to bundles with vertex at the light source. The resulting analogue of the area distance is the so-called corrected luminosity distance D′_{lum}. It relates, for a bundle with vertex at the light source, the cross-sectional area at the observer to the opening solid angle at the light source. Owing to Etherington’s reciprocity law (35), area distance and corrected luminosity distance are related by
\[D'_{\rm lum} = (1 + z)\, D_{\rm area}. \tag{47}\]
The redshift factor has its origin in the fact that the definition of D′_{lum} refers to an affine parametrization adapted to U_{S}, and the definition of D_{area} refers to an affine parametrization adapted to U_{O}. While D_{area} depends on U_{O} but not on U_{S}, D′_{lum} depends on U_{S} but not on U_{O}.
2.4.7 Luminosity distance
The physical meaning of the corrected luminosity distance is most easily understood in the photon picture. For photons isotropically emitted from a light source, the percentage that hit a prescribed area at the observer is proportional to 1/(D′_{lum})^{2}. As the energy of each photon undergoes a redshift, the energy flux at the observer is proportional to 1/(D_{lum})^{2}, where
\[D_{\rm lum} = (1 + z)\, D'_{\rm lum} = (1 + z)^2 D_{\rm area}. \tag{48}\]
Thus, D_{lum} is the relevant quantity for calculating the luminosity (apparent brightness) of pointlike light sources (see Equation (52)). For this reason D_{lum} is called the (uncorrected) luminosity distance. The observation that the purely geometric quantity D′_{lum} must be modified by an additional redshift factor to give the energy flux is due to Walker [342]. D_{lum} depends on the 4-velocity U_{O} of the observer and on the 4-velocity U_{S} of the light source. D_{lum} and D′_{lum} can be viewed as functions of the observational coordinates (s, Ψ, Θ, τ) if a vector field U with g(U, U) = −1 has been distinguished, one integral curve of U is chosen as the observer, and the other integral curves of U are chosen as the light sources. In that case Equation (38) implies that not only D_{area}(s) but also D_{lum}(s) and D′_{lum}(s) are of the form \(s + {\mathcal O}({s^2})\). Thus, near the observer all three distance measures coincide with Euclidean distance in the observer’s rest space.
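The relations between the three distance measures are simple enough to state as code. The sketch below assumes the reciprocity relation D′_{lum} = (1 + z) D_{area} and the additional redshift factor D_{lum} = (1 + z) D′_{lum}. The flux normalization L/(4π D_{lum}^{2}) for an isotropic source of intrinsic luminosity L is the conventional one and is an assumption here, since the text only states proportionality; all function names are hypothetical.

```python
import math

def corrected_luminosity_distance(D_area, z):
    """D'_lum from the area distance via Etherington's reciprocity law."""
    return (1.0 + z) * D_area

def luminosity_distance(D_area, z):
    """(Uncorrected) luminosity distance D_lum = (1 + z) D'_lum."""
    return (1.0 + z) * corrected_luminosity_distance(D_area, z)

def energy_flux(L, D_area, z):
    """Energy flux of an isotropic point source, assuming F = L / (4 pi D_lum^2)."""
    return L / (4.0 * math.pi * luminosity_distance(D_area, z) ** 2)

# Hypothetical example in arbitrary units.
D_area, z = 100.0, 1.0
D_lum_corr = corrected_luminosity_distance(D_area, z)
D_lum = luminosity_distance(D_area, z)
```

For z = 1 the corrected luminosity distance is twice and the luminosity distance four times the area distance, so the flux is a factor 16 smaller than a naive estimate based on D_{area} alone.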
2.4.8 Parallax distance
In an arbitrary spacetime, we fix an observation event p_{O} and the observer’s 4-velocity U_{O}. We consider a past-oriented lightlike geodesic λ parametrized by affine distance, λ(0) = p_{O} and \(g\left({\dot \lambda (0),{U_{\rm{O}}}} \right) = 1\). To a light source passing through the event λ(s) we assign the (averaged) parallax distance D_{par}(s) = −θ(0)^{−1}, where θ is the expansion of an infinitesimally thin bundle with vertex at λ(s). This definition follows [172]. Its relevance in view of cosmology was discussed in detail by Rosquist [289]. D_{par} can be measured by performing the standard trigonometric parallax method of elementary Euclidean geometry, with the observer at p_{O} and an assistant observer at the perimeter of the bundle, and then averaging over all possible positions of the assistant. Note that the method refers to a bundle with vertex at the light source, i.e., to light rays that leave the light source simultaneously. (Averaging is not necessary if this bundle is circular.) D_{par} depends on the 4-velocity of the observer but not on the 4-velocity of the light source. To within first-order approximation near the observer it coincides with affine distance (recall Equation (32)). For the potential observational relevance of D_{par} see [289], and [299], p. 509.
In view of lensing, D_{+}, D_{−}, and D_{lum} are the most important distance measures because they are related to image distortion (see Section 2.5) and to the brightness of images (see Section 2.6). In spacetimes with many symmetries, these quantities can be explicitly calculated (see Section 4.1 for conformally flat spacetimes, and Section 4.3 for spherically symmetric static spacetimes). This is impossible in a spacetime without symmetries, in particular in a realistic cosmological model with inhomogeneities (“clumpy universe”). Following Kristian and Sachs [190], one often uses series expansions with respect to s. For statistical considerations one may work with the focusing equation in a Friedmann-Robertson-Walker spacetime with average density (see Section 4.1), or with a heuristically modified focusing equation taking clumps into account. The latter leads to the so-called Dyer-Roeder distance [86, 87] which is discussed in several textbooks (see, e.g., [299]). (For pre-Dyer-Roeder papers on optics in cosmological models with inhomogeneities, see the historical notes in [174].) As overdensities have a focusing and underdensities have a defocusing effect, it is widely believed (following [344]) that after averaging over sufficiently large angular scales the Friedmann-Robertson-Walker calculation gives the correct distance-redshift relation. However, it was argued by Ellis, Bassett, and Dunsby [99] that caustics produced by the lensing effect of overdensities lead to a systematic bias towards smaller angular sizes (“shrinking”). For a spherically symmetric inhomogeneity, the effect on the distance-redshift relation can be calculated analytically [230]. For thorough discussions of light propagation in a clumpy universe also see Pyne and Birkinshaw [283], and Holz and Wald [161].
2.5 Image distortion
In special relativity, a spherical object always shows a circular outline on the observer’s sky, independent of its state of motion [257, 321]. In general relativity, this is no longer true; a small sphere usually shows an elliptic outline on the observer’s sky. This distortion is caused by the shearing effect of the spacetime geometry on light bundles. For the calculation of image distortion we need the material of Sections 2.3 and 2.4. For an observer with 4-velocity U_{O} at an event p_{O}, there is a unique affine parametrization s ↦ λ(s) for each lightlike geodesic through p_{O} such that λ(0) = p_{O} and \(g\left({\dot \lambda (0),{U_{\rm{O}}}} \right) = 1\). Around each of these λ we can consider an infinitesimally thin bundle with vertex at s = 0. The elliptical cross-section of this bundle can be characterized by the shape parameters D_{+}(s), D_{−}(s) and χ(s) (recall Figure 3). In the terminology of Section 2.4, s is the affine distance, and D_{+}(s) and D_{−}(s) are the extremal angular diameter distances. The complex quantity
\(\epsilon = \left({\frac{{D_ +}}{{D_ -}} - \frac{{D_ -}}{{D_ +}}} \right){e^{2i\chi}} \qquad (49)\)
is called the ellipticity of the bundle. The phase of ϵ determines the position angle of the elliptical cross-section of the bundle with respect to the Sachs basis. The absolute value of ϵ(s) determines the eccentricity of this cross-section; ϵ(s) = 0 indicates a circular cross-section and ϵ(s) = ∞ indicates a caustic point of multiplicity one. (It is also common to use other measures for the eccentricity, e.g., (D_{+} − D_{−})/(D_{+} + D_{−}).) From Equation (27) with ϱ = θ we get the derivative of ϵ with respect to the affine distance s,
\(\dot \epsilon = 2\sigma \sqrt {|\epsilon {|^2} + 4}. \qquad (50)\)
The initial conditions \({D_ \pm}(0) = 0,\,{\dot D_ \pm}(0) = 1\) imply
\(\epsilon (0) = 0. \qquad (51)\)
Equation (50) and Equation (51) determine ϵ if the shear σ is known. The shear, in turn, is determined by the Sachs equations (25, 26) and the initial conditions (32, 33) with s_{0} = 0 for θ(= ϱ) and σ.
It is advisable to change from the ϵ determined this way to \(\varepsilon = - \bar \epsilon\). This transformation corresponds to replacing the Jacobi matrix D by its inverse. The original quantity ϵ(s) gives the true shape of objects at affine distance s that show a circular image on the observer’s sky. The new quantity ε(s) gives the observed shape for objects at affine distance s that actually have a circular cross-section. In other words, if a (small) spherical body at affine distance s is observed, the ellipticity of its image on the observer’s sky is given by ε(s).
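The passage from the shape parameters to the two ellipticities can be sketched in a few lines of Python. The sketch assumes that the bundle ellipticity has the form ϵ = (D_{+}/D_{−} − D_{−}/D_{+}) e^{2iχ} and that the observed ellipticity is obtained by complex conjugation with a sign change, ε = −ϵ̄; both the formulas as coded and the function names are assumptions made for illustration.

```python
import cmath


def ellipticity(d_plus: float, d_minus: float, chi: float) -> complex:
    """Bundle ellipticity, assuming (D+/D- - D-/D+) * exp(2 i chi)."""
    return (d_plus / d_minus - d_minus / d_plus) * cmath.exp(2j * chi)


def observed_ellipticity(d_plus: float, d_minus: float, chi: float) -> complex:
    """Image ellipticity of a small spherical source, assuming -conj(epsilon)."""
    return -ellipticity(d_plus, d_minus, chi).conjugate()


if __name__ == "__main__":
    # Hypothetical shape parameters: a cross-section twice as long as wide.
    print(ellipticity(2.0, 1.0, 0.0))
    print(observed_ellipticity(2.0, 1.0, 0.0))
```

A circular cross-section (D_{+} = D_{−}) gives zero in both cases, and the two quantities always have the same absolute value, so they encode the same eccentricity with different orientation conventions.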
By Equations (50, 51), ϵ vanishes along the entire ray if and only if the shear σ vanishes along the entire ray. By Equations (26, 33), the shear vanishes along the entire ray if and only if the conformal curvature term ψ_{0} vanishes along the entire ray. The latter condition means that \(K = \dot \lambda\) is tangent to a principal null direction of the conformal curvature tensor (see, e.g., Chandrasekhar [54]). At a point where the conformal curvature tensor is not zero, there are at most four different principal null directions. Hence, the distortion effect vanishes along all light rays if and only if the conformal curvature vanishes everywhere, i.e., if and only if the spacetime is conformally flat. This result is due to Sachs [296]. An alternative proof, based on expressions for image distortions in terms of the exponential map, was given by Hasse [149].
For any observer, the distortion measure \(\varepsilon = - \bar \epsilon\) is defined along every light ray from every point of the observer’s worldline. This gives ε as a function of the observational coordinates (s, Ψ, Θ, τ) (recall Section 2.1, in particular Equation (4)). If we fix τ and s, ε is a function on the observer’s sky. (Instead of s, one may choose any of the distance measures discussed in Section 2.4, provided it is a unique function of s.) In spacetimes with sufficiently many symmetries, this function can be explicitly determined in terms of integrals over the metric function. This will be worked out for spherically symmetric static spacetimes in Section 4.3. A general consideration of image distortion and example calculations can also be found in papers by Frittelli, Kling and Newman [121, 120]. Frittelli and Oberst [127] calculate image distortion by a “thick gravitational lens” model within a spacetime setting.
In cases where it is not possible to determine ε by explicitly integrating the relevant differential equations, one may consider series expansions with respect to the affine parameter s. This technique, which is of particular relevance in view of cosmology, dates back to Kristian and Sachs [190] who introduced image distortion as an observable in cosmology. In lowest nonvanishing order, ε(s, Ψ, Θ, τ_{O}) is quadratic with respect to s and completely determined by the conformal curvature tensor at the observation event p_{O} = γ(τ_{O}), as can be read from Equations (50, 51, 33). One can classify all possible distortion patterns on the observer’s sky in terms of the Petrov type of the Weyl tensor [56]. As outlined in [56], these patterns are closely related to what Penrose and Rindler [262] call the fingerprint of the Weyl tensor. At all observation events where the Weyl tensor is nonzero, the following is true. There are at most four points on the observer’s sky where the distortion vanishes, corresponding to the four (not necessarily distinct) principal null directions of the Weyl tensor. For type N, where all four principal null directions coincide, the distortion pattern is shown in Figure 5.
The distortion effect has been routinely observed since the mid-1980s in the form of arcs and (radio) rings (see [299, 275, 343] for an overview). In these cases a distant galaxy appears strongly elongated in one direction. Such strong elongations occur near a caustic point of multiplicity one where ε → ∞. In the case of rings and (long) arcs, the entire bundle cannot be treated as infinitesimally thin, i.e., a theoretical description of the effect requires an integration. For the idealized case of a point source, images in the form of (1-dimensional) rings on the observer’s sky occur in cases of rotational symmetry and are usually called “Einstein rings” (see Section 4.3). The rings that are actually observed show extended sources in situations close to rotational symmetry.
For the majority of galaxies that are not distorted into arcs or rings, there is a “weak lensing” effect on the apparent shape that can be investigated statistically. The method is based on the assumption that there is no preferred direction in the universe, i.e., that the axes of (approximately spheroidal) galaxies are randomly distributed. So, without a distortion effect, the axes of galaxy images should make a randomly distributed angle with the (Ψ, Θ) grid on the observer’s sky. Any deviation from a random distribution is to be attributed to a distortion effect, produced by the gravitational field of intervening masses. With the help of the quasi-Newtonian approximation, this method has been elaborated into a sophisticated formalism for determining mass distributions, projected onto the plane perpendicular to the line of sight, from observed image distortions. This is one of the most important astrophysical tools for detecting (dark) matter. It has been used to determine the mass distribution in galaxies and galaxy clusters, and more recently observations of image distortions produced by large-scale structure have begun (see [22] for a detailed review). From a methodological point of view, it would be desirable to analyse this important line of astronomical research within a spacetime setting. This should give prominence to the role of the conformal curvature tensor.
Another interesting way of observing weak image distortions is possible for sources that emit linearly polarized radiation. (This is true for many radio galaxies. Polarization measurements are also relevant for strong-lensing situations; see Schneider, Ehlers, and Falco [299], p. 82 for an example.) The method is based on the geometric optics approximation of Maxwell’s theory. In this approximation, the polarization vector is parallel along each ray between source and observer [88] (cf., e.g., [225], p. 577). We may, thus, use the polarization vector as a realization of the Sachs basis vector E_{1}. If the light source is a spheroidal celestial body (e.g., an elliptical galaxy), it is reasonable to assume that at the light source the polarization direction is aligned with one of the axes, i.e., 2χ(s)/π ∈ ℤ. A distortion effect is verified if the observed polarization direction is not aligned with an axis of the image, 2χ(0)/π ∉ ℤ. It is to be emphasized that the deviation of the polarization direction from the elongation axis is not the result of a rotation (the bundles under consideration have a vertex and are, thus, twist-free) but rather of successive shearing processes along the ray. Also, the effect has nothing to do with the rotation of an observer field. It is a pure conformal curvature effect. Related misunderstandings have been clarified by Panov and Sbytov [254, 255]. The distortion effect on the polarization plane has, so far, not been observed. (Panov and Sbytov [254] have clearly shown that an effect observed by Birch [31], even if real, cannot be attributed to distortion.) Its future detectability is estimated, for distant radio sources, in [318].
2.6 Brightness of images
For calculating the brightness of images we need the definitions and results of Section 2.4. In particular we need the luminosity distance D_{lum} and its relation to other distance measures. We begin by considering a point source (worldline) that emits isotropically with (bolometric, i.e., integrated over all frequencies) luminosity L. By definition of D_{lum}, in this case the energy flux at the observer is
\(F = \frac{L}{{4\pi {D_{{\rm{lum}}}}^2}}. \qquad (52)\)
F is a measure for the brightness of the image on the observer’s sky. The magnitude m used by astronomers is essentially the negative logarithm of F,
\(m = 2.5\,{\log _{10}}\left({{D_{{\rm{lum}}}}^2} \right) - 2.5\,{\log _{10}}(L) + {m_0}, \qquad (53)\)
with m_{0} being a universal constant. In Equation (52), D_{lum} can be expressed in terms of the area distance D_{area} and the redshift z with the help of the general relation (48). This demonstrates that the magnification factor μ, which is defined by Equation (42), admits the following reinterpretation. μ(s) relates the flux from a point source at affine distance s to the flux from a point source with the same luminosity at the same affine distance and at the same redshift in Minkowski spacetime.
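The relation between luminosity distance, flux and magnitude can be illustrated numerically. The following Python sketch assumes the inverse-square law F = L/(4π D_{lum}^{2}) for an isotropic point source and the standard astronomical convention that magnitudes differ by −2.5 log_{10} of the flux ratio, so that the constant m_{0} cancels; the function names are hypothetical.

```python
import math


def energy_flux(luminosity: float, d_lum: float) -> float:
    """Bolometric flux of an isotropic point source, F = L / (4 pi D_lum^2)."""
    return luminosity / (4.0 * math.pi * d_lum ** 2)


def magnitude_difference(flux_a: float, flux_b: float) -> float:
    """m_a - m_b = -2.5 log10(F_a / F_b); the constant m_0 drops out."""
    return -2.5 * math.log10(flux_a / flux_b)


if __name__ == "__main__":
    # Hypothetical example: the same source at twice the luminosity distance.
    near = energy_flux(1.0, 1.0)
    far = energy_flux(1.0, 2.0)
    print(magnitude_difference(far, near))  # positive: the farther image is fainter
```

Working with magnitude differences avoids having to fix the constant m_{0}; a flux ratio of 100 corresponds to a difference of 5 magnitudes.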
D_{lum} can be explicitly calculated in spacetimes where the Jacobi fields along lightlike geodesics can be explicitly determined. This is true, e.g., in spherically symmetric and static spacetimes where the extremal angular diameter distances D_{+} and D_{−} can be calculated in terms of integrals over the metric coefficients. The resulting formulas are given in Section 4.3 below. Knowledge of D_{+} and D_{−} immediately gives the area distance D_{area} via Equation (41). D_{area} together with the redshift determines D_{lum} via Equation (48). Such an explicit calculation is, of course, possible only for spacetimes with many symmetries.
By Equation (48), the zeros of D_{lum} coincide with the zeros of D_{area}, i.e., with the caustic points. Hence, in the ray-optical treatment a point source is infinitely bright (magnitude m = −∞) if it passes through the caustic of the observer’s past light cone. A wave-optical treatment shows that the energy flux at the observer is actually bounded by diffraction. In the quasi-Newtonian approximation formalism, this was demonstrated by an explicit calculation for light rays deflected by a spheroidal mass by Ohanian [245] (cf. [299], p. 220). Quite generally, the ray-optical calculation of the energy flux gives incorrect results if, for two different light paths from the source worldline to the observation event, the time delay is smaller than or approximately equal to the coherence time. Then interference effects give rise to frequency-dependent corrections to the energy flux that have to be calculated with the help of wave optics. In multiple-imaging situations, the time delay decreases with decreasing mass of the deflector. If the deflector is a cluster of galaxies, a galaxy, or a star, interference effects can be ignored. Gould [145] suggested that they could be observable if a deflector of about 10^{−15} Solar masses happens to be close to the line of sight to a gamma-ray burster. In this case, the angle-separation between the (unresolvable) images would be of the order 10^{−15} arcseconds (“femtolensing”). Interference effects could make a frequency-dependent imprint on the total intensity. Ulmer and Goodman [328] discussed related effects for deflectors of up to 10^{−11} Solar masses. Femtolensing has not been observed so far. However, it is an interesting future perspective for lensing effects where wave optics has to be taken into account. This would give practical relevance to the theoretical work of Herlt and Stephani [156, 157] who calculated gravitational lensing on the basis of wave optics in the Schwarzschild spacetime.
We now turn to the case of an extended source, whose surface makes up a 3-dimensional timelike submanifold \({\mathcal T}\) of the spacetime. In this case the radiation is characterized by the surface brightness B (= luminosity L per area) at the source and by the intensity I (= energy flux F per solid angle) at the observer. For each past-oriented light ray from an observation event p_{O} to an event p_{S} on \({\mathcal T}\), we can relate B and I in the following way. By definition, the area distance D_{area} relates the area at the source to the solid angle at the observer, so we get from Equation (52) \(I = B{D_{{\rm{area}}}}^2/{D_{{\rm{lum}}}}^2\). As area distance and luminosity distance are related by a redshift factor, according to the general law (48), this gives the relation
\(I = \frac{B}{{{{(1 + z)}^4}}}. \qquad (54)\)
This result is, of course, valid only if the radiation from different parts of the emitting surface is incoherent; otherwise interference effects have to be taken into account. The most remarkable feature of Equation (54) is that all distance measures have dropped out. Save for a redshift factor, the (observed) intensity of a radiating surface is the same for all observers.
The law for point sources (52) and the law for extended sources (54) refer to bolometric quantities, i.e., to integration over all frequencies. As every astronomical observation is restricted to a certain frequency range, it is actually necessary to consider frequency-specific quantities. For a point source, one writes \(L = \int\nolimits_0^\infty \ell ({\omega _{\rm{S}}})d{\omega _{\rm{S}}}\) and \(F = \int\nolimits_0^\infty f({\omega _{\rm{O}}})d{\omega _{\rm{O}}}\), where the specific luminosity ℓ is a function of the emitted frequency ω_{S} and the specific flux f is a function of the received frequency ω_{O}. As ω_{S} and ω_{O} are related by a redshift factor, ω_{S} = (1 + z) ω_{O}, the frequency-specific version of Equation (52) reads
\(f({\omega _{\rm{O}}}) = \frac{{\ell \left({(1 + z)\,{\omega _{\rm{O}}}} \right)}}{{4\pi {D_{{\rm{lum}}}}^2}}\,(1 + z). \qquad (55)\)
Similarly, for an extended source one introduces a specific surface brightness b and a specific intensity i such that \(B = \int\nolimits_0^\infty b({\omega _{\rm{S}}})d{\omega _{\rm{S}}}\) and \(I = \int\nolimits_0^\infty i({\omega _{\rm{O}}})d{\omega _{\rm{O}}}\). Then one gets the following frequency-specific version of Equation (54).
\(i({\omega _{\rm{O}}}) = \frac{{b\left({(1 + z)\,{\omega _{\rm{O}}}} \right)}}{{{{(1 + z)}^3}}}. \qquad (56)\)
The results summarized in this section can also be derived from the kinetic theory of photons (see, e.g., [89]). In the photon picture, the three redshift factors in Equation (56) are easily understood: The first reflects the fact that each photon undergoes a redshift; the second relates the rate of emission (with respect to proper time at the source) to the rate of reception (with respect to proper time at the observer); the third reflects the aberration effect on the angular size of the source in dependence of the motion of the observer.
As an example for the calculation of the brightness of images we consider the Schwarzschild spacetime (see Figure 17).
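The consistency between the bolometric and the frequency-specific laws can be checked numerically. The Python sketch below assumes the forms f(ω_{O}) = (1 + z) ℓ((1 + z) ω_{O})/(4π D_{lum}^{2}) and i(ω_{O}) = b((1 + z) ω_{O})/(1 + z)^{3}, which follow from substituting ω_{S} = (1 + z) ω_{O} in the integrals for L and B. The exponential spectrum, the function names and the simple trapezoidal integrator are hypothetical choices made only for this check.

```python
import math


def specific_flux(ell, omega_obs, z, d_lum):
    """f(omega_O) = (1 + z) * ell((1 + z) * omega_O) / (4 pi D_lum^2)."""
    return (1.0 + z) * ell((1.0 + z) * omega_obs) / (4.0 * math.pi * d_lum ** 2)


def specific_intensity(b, omega_obs, z):
    """i(omega_O) = b((1 + z) * omega_O) / (1 + z)^3."""
    return b((1.0 + z) * omega_obs) / (1.0 + z) ** 3


def integrate(func, upper, steps=100_000):
    """Composite trapezoidal rule on the interval [0, upper]."""
    h = upper / steps
    total = 0.5 * (func(0.0) + func(upper))
    for k in range(1, steps):
        total += func(k * h)
    return total * h


def spectrum(omega):
    """Hypothetical emitted spectrum, normalized so that its integral is 1."""
    return math.exp(-omega)


if __name__ == "__main__":
    z, d_lum = 1.0, 2.0
    flux = integrate(lambda w: specific_flux(spectrum, w, z, d_lum), 40.0)
    intensity = integrate(lambda w: specific_intensity(spectrum, w, z), 40.0)
    print(flux, 1.0 / (4.0 * math.pi * d_lum ** 2))  # should agree closely
    print(intensity, 1.0 / (1.0 + z) ** 4)            # should agree closely
```

Integrating the specific quantities over the received frequency reproduces the bolometric flux L/(4π D_{lum}^{2}) and the bolometric intensity B/(1 + z)^{4}, up to discretization error.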
2.7 Conjugate points and cut points
In general, the past light cone of an event forms caustics and transverse self-intersections, i.e., it is neither an embedded nor an immersed submanifold. The relevance of this fact in view of lensing was emphasized already in Section 2.1. In the following we demonstrate that caustics and transverse self-intersections of the light cone are related to extremizing properties of lightlike geodesics. A light cone with a caustic and a transverse self-intersection is shown in Figure 25.
In this section and in Section 2.8 we use mathematical techniques which are related to the Penrose-Hawking singularity theorems. For background material, see Penrose [261], Hawking and Ellis [154], O’Neill [247], and Wald [341].
Recall from Section 2.2 that the caustic of the past light cone of p_{O} is the set of all points where this light cone is not an immersed submanifold. A point p_{S} is in the caustic if a generator λ of the light cone intersects at p_{S} an infinitesimally neighboring generator. In this situation p_{S} is said to be conjugate to p_{O} along λ. The caustic of the past light cone of p_{O} is also called the “past lightlike conjugate locus” of p_{O}.
The notion of conjugate points is related to the extremizing properties of lightlike geodesics in the following way. Let λ be a past-oriented lightlike geodesic with λ(0) = p_{O}. Assume that p_{S} = λ(s_{0}) is the first conjugate point along this geodesic. This means that p_{S} is in the caustic of the past light cone of p_{O} and that λ does not meet the caustic at parameter values between 0 and s_{0}. Then a well-known theorem says that all points λ(s) with 0 < s < s_{0} cannot be reached from p_{O} along a timelike curve arbitrarily close to λ, and all points λ(s) with s > s_{0} can. For a proof we refer to Hawking and Ellis [154], Proposition 4.5.11 and Proposition 4.5.12. It might be helpful to consult O’Neill [247], Chapter 10, Proposition 48, in addition.
Here we have considered a past-oriented lightlike geodesic because this is the situation with relevance to lensing. Actually, Hawking and Ellis consider the time-reversed situation, i.e., with λ future-oriented. Then the result can be phrased in the following way. A material particle may catch up with a light ray λ after the latter has passed through a conjugate point and, for particles staying close to λ, this is impossible otherwise. The restriction to particles staying close to λ is essential. Particles “taking a short cut” may very well catch up with a lightlike geodesic even if the latter is free of conjugate points.
For a discussion of the extremizing property in the global sense, not restricted to timelike curves close to λ, we need the notion of cut points. The precise definition of cut points reads as follows.
As usual, let I^{−}(p_{O}) denote the chronological past of p_{O}, i.e., the set of all \(q \in {\mathcal M}\) that can be reached from p_{O} along a past-pointing timelike curve. In Minkowski spacetime, the boundary ∂I^{−}(p_{O}) of I^{−}(p_{O}) is just the past light cone of p_{O} united with {p_{O}}. In an arbitrary spacetime, this is not true. A lightlike geodesic λ that issues from p_{O} into the past is always confined to the closure of I^{−}(p_{O}), but it need not stay on the boundary. The last point on λ that is on the boundary is by definition [46] the cut point of λ. In other words, it is exactly the part of λ beyond the cut point that can be reached from p_{O} along a timelike curve. The union of all cut points, along any past-pointing lightlike geodesic λ from p_{O}, is called the cut locus of the past light cone (or the past lightlike cut locus of p_{O}). For the light cone in Figure 24 this is the curve (actually 2-dimensional) where the two sheets of the light cone intersect. For the light cone in Figure 25 the cut locus is the same set plus the swallow-tail point (actually 1-dimensional). For a detailed discussion of cut points in manifolds with metrics of Lorentzian signature, see [25]. For positive definite metrics, the notion of cut points dates back to Poincaré [280] and Whitehead [350].
For a generator λ of the past light cone of p_{O}, the cut point of λ does not exist in either of the two following cases:

1.
λ always stays on the boundary ∂I^{−}(p_{O}), i.e., it never loses its extremizing property.

2.
λ is always in I^{−}(p_{O}), i.e., it fails to be extremizing from the very beginning.
Case 2 occurs, e.g., if there is a closed timelike curve through p_{O}. More precisely, Case 2 is excluded if the past-distinguishing condition is satisfied at p_{O}, i.e., if for \(q \in {\mathcal M}\) the implication
\({I^ -}(q) = {I^ -}({p_{\rm{O}}})\; \Longrightarrow \;q = {p_{\rm{O}}} \qquad (57)\)
holds. If Equation (57) is true, the following can be shown:

(P1)
If, along λ, the point λ(s) is conjugate to λ(0), the cut point of λ exists and it comes on or before λ(s).

(P2)
Assume that a point q can be reached from p_{O} along two different lightlike geodesics λ_{1} and λ_{2} from p_{O}. Then the cut point of λ_{1} and of λ_{2} exists and it comes on or before q.

(P3)
If the cut locus of a past light cone is empty, this past light cone is an embedded submanifold of \({\mathcal M}\).
For proofs see [268]; the proofs can also be found in or easily deduced from [25]. Statement (P1) says that conjugate points and cut points are related by the easily remembered rule “the cut point comes first”. Statement (P2) says that a “cut” between two geodesics is indicated by the occurrence of a cut point. However, it does not say that exactly at the cut point a second geodesic is met. Such a stronger statement, which truly justifies the name “cut point”, holds in globally hyperbolic spacetimes (see Section 3.1). Statement (P3) implies that the occurrence of transverse self-intersections of a light cone is always indicated by cut points. Note, however, that transverse self-intersections of the past light cone of p_{O} may occur inside I^{−}(p_{O}) and, thus, far away from the cut locus.
Statement (P1) implies that ∂I^{−}(p_{O}) is an immersed submanifold everywhere except at the cut locus and, of course, at the vertex p_{O}. It is known (see [154], Proposition 6.3.1) that ∂I^{−}(p_{O}) is achronal (i.e., it is impossible to connect any two of its points by a timelike curve) and thus a 3-dimensional Lipschitz topological submanifold. By a general theorem of Rademacher (see [113], Theorem 3.6.1), this implies that ∂I^{−}(p_{O}) is differentiable almost everywhere, i.e., that the cut locus has measure zero in ∂I^{−}(p_{O}). Note that this argument does not necessarily imply that the cut locus is a “small” subset of ∂I^{−}(p_{O}). Chruściel and Galloway [57] have demonstrated, by way of example, that an achronal subset \({\mathcal A}\) of a spacetime may fail to be differentiable on a set that is dense in \({\mathcal A}\). So our reasoning so far does not even exclude the possibility that the cut locus is dense in an open subset of ∂I^{−}(p_{O}). This possibility can be excluded in globally hyperbolic spacetimes where the cut locus is always a closed subset of \({\mathcal M}\) (see Section 3.1). In general, the cut locus need not be closed as is exemplified by Figure 24.
In Section 2.8 we investigate the relevance of cut points (and conjugate points) for multiple imaging.
2.8 Criteria for multiple imaging
To investigate whether multiple imaging occurs in a spacetime (\({\mathcal M},g\)), we choose any point p_{O} (observation event) and any timelike curve γ_{S} (worldline of light source) in \({\mathcal M}\). The following cases are possible:

1.
There is no past-pointing lightlike geodesic from p_{O} to γ_{S}. Then the observer at p_{O} does not see any image of the light source γ_{S}. For instance, this occurs in Minkowski spacetime for an inextendible worldline γ_{S} that asymptotically approaches the past light cone of p_{O}.

2.
There is exactly one past-pointing lightlike geodesic from p_{O} to γ_{S}. Then the observer at p_{O} sees exactly one image of the light source γ_{S}. This is the situation naively taken for granted in pre-relativistic astronomy.

3.
There are at least two but not more than denumerably many past-pointing lightlike geodesics from p_{O} to γ_{S}. Then the observer at p_{O} sees finitely or infinitely many distinct images of γ_{S} at his or her celestial sphere.

4.
There are more than denumerably many past-pointing lightlike geodesics from p_{O} to γ_{S}. This happens, e.g., in rotationally symmetric situations where it gives rise to the so-called “Einstein rings” (see Section 4.3). It also happens, e.g., in plane-wave spacetimes (see Section 5.11).
If Case 3 or 4 occurs, astronomers speak of multiple imaging. We first demonstrate that Case 4 is exceptional. It is easy to prove (see, e.g., [268], Proposition 12) that no finite segment of the timelike curve γ_{S} can be contained in the past light cone of p_{O}. Thus, if there is a continuous one-parameter family of lightlike geodesics that connect p_{O} and γ_{S}, then all family members meet γ_{S} at the same point, say p_{S}. This point must be in the caustic of the light cone because through all non-caustic points there is only a discrete number of generators. One can always find a point p′_{O} arbitrarily close to p_{O} such that γ_{S} does not meet the caustic of the past light cone of p′_{O} (see, e.g., [268], Proposition 10). Hence, by an arbitrarily small perturbation of p_{O} one can always destroy a Case 4 situation. One may interpret this result as saying that Case 4 situations have zero probability. This is, indeed, true as long as we consider point sources (worldlines). The observed rings and arcs refer to extended sources (worldtubes) which are close to the caustic (recall Section 2.5). Such situations occur with non-zero probability.
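The simplest instance of this classification is Minkowski spacetime, where an inertial source worldline that does not pass through the observation event is met by exactly one past-pointing lightlike geodesic (Case 2). The following Python sketch finds the emission event by intersecting the worldline with the past light cone. It is an illustration under stated assumptions: units with c = 1, an inertial source x(t) = x_0 + v t with |v| < 1, and a hypothetical function name.

```python
import math


def emission_times(obs_time, obs_pos, src_pos0, src_vel):
    """Times at which the inertial source x(t) = src_pos0 + src_vel * t meets
    the past light cone of the event (obs_time, obs_pos) in Minkowski
    spacetime, in units with c = 1."""
    d = [s - o for s, o in zip(src_pos0, obs_pos)]
    v2 = sum(v * v for v in src_vel)
    if v2 >= 1.0:
        raise ValueError("source worldline must be timelike (|v| < 1)")
    dv = sum(di * vi for di, vi in zip(d, src_vel))
    d2 = sum(di * di for di in d)
    # Light cone condition |d + v t|^2 = (obs_time - t)^2,
    # written as a * t^2 + b * t + c = 0.
    a = v2 - 1.0
    b = 2.0 * (dv + obs_time)
    c = d2 - obs_time ** 2
    disc = max(b * b - 4.0 * a * c, 0.0)  # non-negative up to rounding
    roots = [(-b + sign * math.sqrt(disc)) / (2.0 * a) for sign in (1.0, -1.0)]
    # Keep only emission events strictly in the past of the observation event.
    return sorted(t for t in roots if t < obs_time)


if __name__ == "__main__":
    # Hypothetical source at rest at distance 5 from an observer at the origin.
    print(emission_times(0.0, (0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 0.0)))
```

The quadratic always has one root on the past and one on the future light cone, so the returned list has exactly one entry. Cases 3 and 4 require curvature and cannot be reproduced with this flat-space calculation.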
We will now show how multiple imaging is related to the notion of cut points (recall Section 2.7). For any point p_{O} in an arbitrary spacetime, the following criteria for multiple imaging hold:

(C1)
Let λ be a past-pointing lightlike geodesic from p_{O} and let p_{S} be a point on λ beyond the cut point or beyond the first conjugate point. Then there is a timelike curve γ_{S} through p_{S} that can be reached from p_{O} along a second past-pointing lightlike geodesic.

(C2)
Assume that at p_{O} the past-distinguishing condition (57) is satisfied. If a timelike curve γ_{S} can be reached from p_{O} along two different past-pointing lightlike geodesics, at least one of them passes through the cut locus of the past light cone of p_{O} on or before arriving at γ_{S}.
For proofs see [267] or [268]. (In [267] Criterion (C2) is formulated with the strong causality condition, although the past-distinguishing condition is sufficient.) Criteria (C1) and (C2) say that the occurrence of cut points is sufficient and, in past-distinguishing spacetimes, also necessary for multiple imaging. The occurrence of conjugate points is sufficient but, in general, not necessary for multiple imaging (see Figure 24 for an example without conjugate points where multiple imaging occurs). In Section 3.1 we will see that in globally hyperbolic spacetimes conjugate points are necessary for multiple imaging. So we have the following diagram:
It is well known (see [154], in particular Proposition 4.4.5) that, under conditions which are to be considered as fairly general from a physical point of view, a lightlike geodesic must either be incomplete or contain a pair of conjugate points. These “fairly general conditions” are, e.g., the weak energy condition and the so-called generic condition (see [154] for details). This result implies the occurrence of conjugate points and, thus, of multiple imaging, for a large class of spacetimes.
The occurrence of conjugate points has an important consequence in view of the focusing equation for the area distance D_{area} (recall Section 2.4 and, in particular, Equation (44)). As D_{area} vanishes at the vertex s = 0 and at each conjugate point, there must be a parameter value s_{m} with Ḋ_{area}(s_{m}) = 0 between the vertex and the first conjugate point. An elementary evaluation of the focusing equation (44) then implies
As the Ricci term is related to the energy density via Einstein’s field equation, (58) gives an estimate of energy-density-plus-shear along the ray. If we observe a multiple imaging situation, and if we know (or assume) that we are in a situation where conjugate points are necessary for multiple imaging, we have thus an estimate on energy-density-plus-shear along the ray. This line of thought was worked out, under additional assumptions on the spacetime, in [250].
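The step from the focusing equation to such an estimate can be made explicit as follows. This is a sketch which assumes that the focusing equation (44) has the standard form written in the first line below and which uses Ḋ_{area}(0) = 1 (recall that D_{area}(s) = s + O(s^{2})); the form in which the estimate is stated in the original Equation (58) may differ.

```latex
\begin{align}
  \ddot D_{\rm area}(s)
    &= -\Big( |\sigma(s)|^2
        + \tfrac{1}{2}\,{\rm Ric}\big(\dot\lambda(s),\dot\lambda(s)\big) \Big)\,
        D_{\rm area}(s), \\
  \int_0^{s_m} \Big( |\sigma(s)|^2
        + \tfrac{1}{2}\,{\rm Ric}\big(\dot\lambda(s),\dot\lambda(s)\big) \Big)\,
        D_{\rm area}(s)\, ds
    &= \dot D_{\rm area}(0) - \dot D_{\rm area}(s_m) = 1 .
\end{align}
```

Since D_{area} is positive between the vertex and the first conjugate point, the bracket cannot be non-positive everywhere on the interval from 0 to s_{m}. In this sense the occurrence of a conjugate point bounds the combination of shear and Ricci term along the ray from below.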
2.9 Fermat’s principle for light rays
It is often advantageous to characterize light rays by a variational principle, rather than by a differential equation. This is particularly true in view of applications to lensing. If we have chosen a point p_{O} (observation event) and a timelike curve γ_{S} (worldline of light source) in spacetime \({\mathcal M}\), we want to determine all past-pointing lightlike geodesics from p_{O} to γ_{S}. When working with a differential equation for light rays, we have to calculate all light rays issuing from p_{O} into the past, and to see which of them meet γ_{S}. If we work with a variational principle, we can restrict to curves from p_{O} to γ_{S} at the outset.
To set up a variational principle, we have to choose the trial curves among which the solution curves are to be determined and the functional that has to be extremized. Let \({\mathcal L}_{p_{\rm O},\gamma_{\rm S}}\) denote the set of all past-pointing lightlike curves from p_{O} to γ_{S}. This is the set of trial curves from which the lightlike geodesics are to be singled out by the variational principle. Choose a past-oriented but otherwise arbitrary parametrization for the timelike curve γ_{S} and assign to each trial curve the parameter at which it arrives. This gives the arrival time functional \(T: {\mathcal L}_{p_{\rm O},\gamma_{\rm S}} \rightarrow {\mathbb R}\) that is to be extremized. With respect to an appropriate differentiability notion for T, it turns out that the critical points (i.e., the points where the differential of T vanishes) are exactly the geodesics in \({\mathcal L}_{p_{\rm O},\gamma_{\rm S}}\). This result (or its time-reversed version) can be viewed as a general-relativistic Fermat principle:
Among all ways to move from p_{O} to γ_{S} in the past-pointing (or future-pointing) direction at the speed of light, the actual light rays choose those paths that make the arrival time stationary.
This formulation of Fermat’s principle was suggested by Kovner [187]. The crucial idea is to refer to the arrival time which is given only along the curve γ_{S}, and not to some kind of global time which in an arbitrary spacetime does not even exist. The proof that the solution curves of Kovner’s variational principle are, indeed, exactly the lightlike geodesics was given in [264]. The proof can also be found, with a slight restriction on the spacetime that simplifies matters considerably, in [299]. An alternative version, based on making \({\mathcal L}_{p_{\rm O},\gamma_{\rm S}}\) into a Hilbert manifold, is given in [266].
As in ordinary optics, the light rays make the arrival time stationary but not necessarily minimal. A more detailed investigation shows that for a geodesic \(\lambda \in {\mathcal L}_{p_{\rm O},\gamma_{\rm S}}\) the following holds. (For the notion of conjugate points see Sections 2.2 and 2.7.)

(A1)
If along λ there is no point conjugate to p_{O}, λ is a strict local minimum of T.

(A2)
If λ passes through a point conjugate to p_{O} before arriving at γ_{S}, it is a saddle of T.

(A3)
If λ reaches the first point conjugate to p_{O} exactly on its arrival at γ_{S}, it may be a local minimum or a saddle but not a local maximum.
For a proof see [264] or [266]. The fact that local maxima cannot occur is easily understood from the geometry of the situation: For every trial curve we can find a neighboring trial curve with a larger T by putting “wiggles” into it, preserving the lightlike character of the curve. Also for Fermat’s principle in ordinary optics, the extremum is never a local maximum, as is mentioned, e.g., in Born and Wolf [35], p. 137.
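The statement that stationary arrival time can be a minimum but never a maximum may be illustrated by a toy computation. The following sketch is restricted to Minkowski spacetime with c = 1 and a light source at rest, so it is not the general Kovner functional. The trial curves are broken lightlike lines from the observer through a point (x_screen, y) on an auxiliary line to the source; the positions, the auxiliary line, and all function names are assumptions chosen for illustration.

```python
import math

def arrival_time(y, x_obs=(0.0, 0.0), x_src=(4.0, 2.0), x_screen=1.0):
    """Arrival parameter of a broken lightlike curve through (x_screen, y).

    In Minkowski spacetime with c = 1 and a source at rest, the elapsed
    (past-oriented) time equals the Euclidean length of the spatial path.
    """
    leg1 = math.hypot(x_screen - x_obs[0], y - x_obs[1])
    leg2 = math.hypot(x_src[0] - x_screen, x_src[1] - y)
    return leg1 + leg2

def golden_min(f, lo, hi, tol=1e-9):
    """Golden-section search for the minimum of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

def second_difference(f, y, h=1e-3):
    """Central finite-difference estimate of the second derivative."""
    return (f(y + h) - 2.0 * f(y) + f(y - h)) / (h * h)

# The stationary point lies on the straight line from observer to source.
y_star = golden_min(arrival_time, -10.0, 10.0)
curvature = second_difference(arrival_time, y_star)

print("stationary y:", y_star)
print("arrival time:", arrival_time(y_star))
print("second derivative (positive means local minimum):", curvature)
# Moving the break point far away ("wiggling") only increases T.
print("T for a strongly bent curve:", arrival_time(50.0))
```

In this flat toy model there are no conjugate points, so the single stationary curve is a strict local minimum, in line with case (A1); bending the trial curve always increases T, so no maximum exists.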
The advantage of Kovner’s version of Fermat’s principle is that it works in an arbitrary spacetime. In particular, the spacetime need not be stationary and the light source may arbitrarily move around (at subluminal velocity, of course). This allows applications to dynamical situations, e.g., to lensing by gravitational waves (see Section 5.11). If the spacetime is stationary or conformally stationary, and if the light source is at rest, a purely spatial reformulation of Fermat’s principle is possible. This more specific version of Fermat’s principle has been known for decades and has found various applications to lensing (see Section 4.2). A more sophisticated application of Fermat’s principle to lensing theory is to set up a Morse theory in order to prove theorems on the possible number of images. In its strongest version, this approach has to presuppose a globally hyperbolic spacetime and will be reviewed in Section 3.3.
For a generalization of Kovner’s version of Fermat’s principle to the case that observer and light source have a spatial extension, see [272].
An alternative variational principle was introduced by Frittelli and Newman [123] and evaluated in [124, 12]. While Kovner’s principle, like the classical Fermat principle, is a variational principle for rays, the Frittelli-Newman principle is a variational principle for wave fronts. (For the definition of wave fronts see Section 2.2.) Although Frittelli and Newman call their variational principle a version of Fermat’s principle, it is actually closer to the classical Huygens principle than to the classical Fermat principle. Again, one fixes p_{O} and γ_{S} as above. To define the trial maps, one chooses a set \({\mathcal W}({p_{\rm{O}}})\) of wave fronts, such that for each lightlike geodesic through p_{O} there is exactly one wave front in \({\mathcal W}({p_{\rm{O}}})\) that contains this geodesic. Hence, \({\mathcal W}({p_{\rm{O}}})\) is in one-to-one correspondence to the lightlike directions at p_{O} and thus to the 2-sphere. Now let \({\mathcal W}({p_{\rm{O}}},{\gamma _{\rm{S}}})\) denote the set of all wave fronts in \({\mathcal W}({p_{\rm{O}}})\) that meet γ_{S}. We can then define the arrival time functional \(T:{\mathcal W}({p_{\rm{O}}},{\gamma _{\rm{S}}}) \rightarrow {\mathbb R}\) by assigning to each wave front the parameter value at which it intersects γ_{S}. There are some cases to be excluded to make sure that T is defined on an open subset of \({\mathcal W}({p_{\rm{O}}}) \simeq {S^2}\), single-valued and differentiable. If this is the case, one finds that T is stationary at \(W \in {\mathcal W}({p_{\rm{O}}})\) if and only if W contains a lightlike geodesic from p_{O} to γ_{S}. Thus, to each image of γ_{S} on the sky of p_{O} there corresponds a critical point of T. The great technical advantage of the Frittelli-Newman principle over the Kovner principle is that T is defined on a finite-dimensional manifold, directly to be identified with (part of) the observer’s celestial sphere.
The arrival time T in the Frittelli-Newman approach is directly analogous to the “Fermat potential” in the quasi-Newtonian formalism which is discussed, e.g., in [299]. In view of applications, a crucial point is that the space \({\mathcal W}({p_{\rm{O}}})\) is a matter of choice; there are many wave fronts which have one light ray in common. There is a natural choice, e.g., in asymptotically simple spacetimes (see Section 3.4).
Frittelli, Newman, and collaborators have used their variational principle in combination with the exact lens map (recall Section 2.1) to discuss thick and thin lens models from a spacetime perspective [24, 12]. Methods from differential topology or global analysis, e.g., Morse theory, have not yet been applied to the Frittelli-Newman principle.
3 Lensing in Globally Hyperbolic Spacetimes
In a globally hyperbolic spacetime, considerably stronger statements on qualitative lensing features can be made than in an arbitrary spacetime. This includes, e.g., multiple imaging criteria in terms of cut points or conjugate points, and also applications of Morse theory. The value of these results lies in the fact that they hold in globally hyperbolic spacetimes without symmetries, where lensing cannot be studied by explicitly integrating the lightlike geodesic equation.
The most convenient formal definition of global hyperbolicity is the following. In a spacetime (\({\mathcal M},g\)), a subset \({\mathcal C}\) of \({\mathcal M}\) is called a Cauchy surface if every inextendible causal (i.e., timelike or lightlike) curve intersects \({\mathcal C}\) exactly once. A spacetime is globally hyperbolic if and only if it admits a Cauchy surface. The name globally hyperbolic refers to the fact that for hyperbolic differential equations, like the wave equation, existence and uniqueness of a global solution is guaranteed for initial data given on a Cauchy surface. For details on globally hyperbolic spacetimes see, e.g., [154, 25]. It was demonstrated by Geroch [133] that every globally hyperbolic spacetime admits a continuous function \(t:{\mathcal M} \rightarrow {\mathbb R}\) such that t^{−1}(t_{0}) is a Cauchy surface for every t_{0} ∈ ℝ. A complete proof of the fact that such a Cauchy time function can be chosen differentiable was given much later by Bernal and Sánchez [27, 26]. The topology of a globally hyperbolic spacetime is determined by the topology of any of its Cauchy surfaces, \({\mathcal M} \simeq {\mathcal C} \times {\mathbb R}\). Note, however, that the converse is not true because \({{\mathcal C}_1} \times {\mathbb R}\) may be homeomorphic (and even diffeomorphic) to \({{\mathcal C}_2} \times {\mathbb R}\) without \({{\mathcal C}_1}\) being homeomorphic to \({{\mathcal C}_2}\). For instance, one can construct a globally hyperbolic spacetime with topology ℝ^{4} that admits a Cauchy surface which is not homeomorphic to ℝ^{3} [238].
In view of applications to lensing the following observation is crucial. If one removes a point, a worldline (timelike curve), or a world tube (region with timelike boundary) from an arbitrary spacetime, the resulting spacetime cannot be globally hyperbolic. Thus, restricting to globally hyperbolic spacetimes excludes all cases where a deflector is treated as nontransparent by cutting its world tube from spacetime (see Figure 24 for an example). Note, however, that this does not mean that globally hyperbolic spacetimes can serve as models only for transparent deflectors. First, a globally hyperbolic spacetime may contain “nontransparent” regions in the sense that a light ray may be trapped in a spatially compact set. Second, the region outside the horizon of a (Schwarzschild, Kerr, …) black hole is globally hyperbolic.
3.1 Criteria for multiple imaging in globally hyperbolic spacetimes
In Section 2.7 we have considered the past light cone of an event p_{O} in an arbitrary spacetime. We have seen that conjugate points (= caustic points) indicate that the past light cone fails to be an immersed submanifold and that cut points indicate that it fails to be an embedded submanifold. In a globally hyperbolic spacetime (\({\mathcal M},g\)), the following additional statements are true.

(H1)
The past light cone of any event p_{O}, together with the vertex {p_{O}}, is closed in \({\mathcal M}\).

(H2)
The cut locus of the past light cone of p_{O} is closed in \({\mathcal M}\).

(H3)
Let p_{S} be in the cut locus of the past light cone of p_{O} but not in the conjugate locus (= caustic). Then p_{S} can be reached from p_{O} along two different lightlike geodesics. The past light cone of p_{O} has a transverse self-intersection at p_{S}.

(H4)
The past light cone of p_{O} is an embedded submanifold if and only if its cut locus is empty.
Analogous results hold, of course, for the future light cone, but the past version is the one that has relevance for lensing. For proofs of these statements see [25], Propositions 9.35 and 9.29 and Theorem 9.15, and [268], Propositions 13, 14, and 15. According to Statement (H3), a “cut point” indicates a “cut” of two lightlike geodesics. For geodesics in Riemannian manifolds (i.e., in the positive definite case), an analogous statement holds if the Riemannian metric is complete and is known as the Poincaré theorem [280, 350]. It was this theorem that motivated the name “cut point”. Note that Statement (H3) is not true without the assumption that p_{S} is not in the caustic. This is exemplified by the swallowtail point in Figure 25. However, as points in the caustic of the past light cone of p_{O} can be reached from p_{O} along two “infinitesimally close” lightlike geodesics, the name “cut point” may be considered as justified also in this case.
In addition to Statements (H1) and (H2) one would like to know whether in globally hyperbolic spacetimes the caustic of the past light cone of p_{O} (also known as the past lightlike conjugate locus of p_{O}) is closed. This question is closely related to the question of whether in a complete Riemannian manifold the conjugate locus of a point is closed. For both questions, the answer was widely believed to be ‘yes’ although actually it is ‘no’. To the surprise of many, Margerin [215] constructed Riemannian metrics on the 2-sphere such that the conjugate locus of a point is not closed. Taking the product of such a Riemannian manifold with 2-dimensional Minkowski space gives a globally hyperbolic spacetime in which the caustic of the past light cone of an event is not closed.
In Section 2.8 we gave criteria for the number of past-oriented lightlike geodesics from a point p_{O} (observation event) to a timelike curve γ_{S} (worldline of a light source) in an arbitrary spacetime. With Statements (H1), (H2), (H3), and (H4) at hand, the following stronger criteria can be given.
Let (\({\mathcal M},g\)) be globally hyperbolic, fix a point p_{O} and an inextendible timelike curve γ_{S} in \({\mathcal M}\). Then the following is true:

(H5)
Assume that γ_{S} enters into the chronological past I^{−}(p_{O}) of p_{O}. Then there is a past-oriented lightlike geodesic λ from p_{O} to γ_{S} that is completely contained in the boundary of I^{−}(p_{O}). This geodesic does not pass through a cut point or through a conjugate point before arriving at γ_{S}.

(H6)
Assume that γ_{S} can be reached from p_{O} along a past-oriented lightlike geodesic that passes through a conjugate point or through a cut point before arriving at γ_{S}. Then γ_{S} can be reached from p_{O} along a second past-oriented lightlike geodesic.
Statement (H5) was proven in [327] with the help of Morse theory. For a more elementary proof see [268], Proposition 16. Statement (H5) gives a characterization of the primary image in globally hyperbolic spacetimes. (The primary image is the one that shows the light source at an older age than all other images.) The condition of γ_{S} entering into the chronological past of p_{O} is necessary to exclude the case that p_{O} sees no image of γ_{S}. Statement (H5) implies that there is a unique primary image unless γ_{S} passes through the cut locus of the past light cone of p_{O}. The primary image has even parity. If the weak energy condition is satisfied, the focusing theorem implies that the primary image has magnification factor ≥ 1, i.e., that it appears brighter than a source of the same luminosity at the same affine distance and at the same redshift in Minkowski spacetime (recall Sections 2.4 and 2.6, in particular Equation (46)).
For a proof of Statement (H6) see [268], Proposition 17. Statement (H6) says that in a globally hyperbolic spacetime the occurrence of cut points is sufficient for multiple imaging, and so is the occurrence of conjugate points.
3.2 Wave fronts in globally hyperbolic spacetimes
In Section 2.2 the notion of wave fronts was discussed in an arbitrary spacetime (\({\mathcal M},g\)). It was mentioned that a wave front can be viewed as a subset of the space \({\mathcal N}\) of all lightlike geodesics in (\({\mathcal M},g\)). This approach is particularly useful in globally hyperbolic spacetimes, as was demonstrated by Low [209, 210]. The construction is based on the observations that, if (\({\mathcal M},g\)) is globally hyperbolic and \({\mathcal C}\) is a smooth Cauchy surface, the following is true:

(N1)
\({\mathcal N}\) can be identified with a sphere bundle over \({\mathcal C}\). The identification is made by assigning to each lightlike geodesic its tangent line at the point where it intersects \({\mathcal C}\). As every sphere bundle over an orientable 3-manifold is trivializable, \({\mathcal N}\) is diffeomorphic to \({\mathcal C} \times {S^2}\).

(N2)
\({\mathcal N}\) carries a natural contact structure. (This contact structure is also discussed, in twistor language, in [262], volume II.)

(N3)
The wave fronts are exactly the Legendre submanifolds of \({\mathcal N}\).
Using Statement (N1), the projection from \({\mathcal N}\) to \({\mathcal C}\) assigns to each wave front its intersection with \({\mathcal C}\), i.e., an “instantaneous wave front” or “small wave front” (cf. Section 2.2 for terminology). The points where this projection has nonmaximal rank give the caustic of the small wave front. According to the general stability results of Arnold (see [11]), the only caustic points that are stable with respect to local perturbations within the class of Legendre submanifolds are cusps and swallowtails. By Statement (N3), perturbing within the class of Legendre submanifolds is the same as perturbing within the class of wave fronts. For this local stability result the assumption of global hyperbolicity is irrelevant because every spacelike hypersurface is a Cauchy surface for an appropriately chosen neighborhood of any of its points. So we get the result that was already mentioned in Section 2.2: In an arbitrary spacetime, a caustic point of an instantaneous wave front is stable if and only if it is a cusp or a swallowtail. Here stability refers to perturbations that keep the metric and the hypersurface fixed and perturb the wave front within the class of wave fronts. For a picture of an instantaneous wave front with cusps and a swallowtail point, see Figure 28. In Figure 13, the caustic points are neither cusps nor swallowtails, so the caustic is unstable.
3.3 Fermat’s principle and Morse theory in globally hyperbolic spacetimes
In an arbitrary spacetime, the past-oriented lightlike geodesics from a point p_{O} (observation event) to a timelike curve γ_{S} (worldline of light source) are the solutions of a variational principle (Kovner’s version of Fermat’s principle; see Section 2.9). Every solution of this variational principle corresponds to an image on p_{O}’s sky of γ_{S}. Determining the number of images is the same as determining the number of solutions to the variational problem. If the variational functional satisfies some technical conditions, the number of solutions to the variational principle can be related to the topology of the space of trial paths. This is the content of Morse theory. In the case at hand, the “technical conditions” turn out to be satisfied in globally hyperbolic spacetimes.
To briefly review Morse theory, we consider a differentiable function \(F:{\mathcal X} \rightarrow {\mathbb R}\) on a real manifold \({\mathcal X}\). Points where the differential of F vanishes are called critical points of F. A critical point is called nondegenerate if the Hessian of F is nondegenerate at this point. F is called a Morse function if all its critical points are nondegenerate. In applications to variational problems, \({\mathcal X}\) is the space of trial maps, F is the functional to be varied, and the critical points of F are the solutions to the variational problem. The nondegeneracy condition guarantees that the character of each critical point — local minimum, local maximum, or saddle — is determined by the Hessian of F at this point. The index of the Hessian is called the Morse index of the critical point. It is defined as the maximal dimension of a subspace on which the Hessian is negative definite. At a local minimum the Morse index is zero, at a local maximum it is equal to the dimension of \({\mathcal X}\).
Morse theory was first worked out by Morse [229] for the case that \({\mathcal X}\) is finite-dimensional and compact (see Milnor [224] for a detailed exposition). The main result is the following. On a compact manifold \({\mathcal X}\), for every Morse function the Morse inequalities
$$N_k \ge B_k, \qquad k \in {\mathbb N}_0, \qquad (59)$$
and the Morse relation
$$\sum_{k = 0}^{\infty} (-1)^k N_k = \sum_{k = 0}^{\infty} (-1)^k B_k \qquad (60)$$
hold true. Here N_{k} denotes the number of critical points with Morse index k and B_{k} denotes the kth Betti number of \({\mathcal X}\). Formally, B_{k} is defined for each topological space \({\mathcal X}\) in terms of the kth singular homology space \({H_k}({\mathcal X})\) with coefficients in a field \({\mathbb F}\) (see, e.g., [78], p. 32). (The results of Morse theory hold for any choice of \({\mathbb F}\).) Geometrically, B_{0} counts the connected components of \({\mathcal X}\) and, for k ≥ 1, B_{k} counts the “holes” in \({\mathcal X}\) that prevent a k-cycle with coefficients in \({\mathbb F}\) from being a boundary. In particular, if \({\mathcal X}\) is contractible to a point, then B_{k} = 0 for k ≥ 1. The right-hand side of Equation (60) is, by definition, the Euler characteristic of \({\mathcal X}\). By compactness of \({\mathcal X}\), all N_{k} and B_{k} are finite and in both sums of Equation (60) only finitely many summands are different from zero.
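As a finite-dimensional illustration, the Morse inequalities (N_k ≥ B_k for all k) and the Morse relation (equality of the alternating sums of the N_k and the B_k) can be checked for a height function on a 2-torus. The following sketch is an example chosen here, not taken from the cited works. It assumes the standard embedding of the torus with radii R_MAJOR > R_MINOR and uses the height h = (R_MAJOR + R_MINOR cos θ) cos φ, whose four critical points lie at θ, φ ∈ {0, π}, together with the Betti numbers (1, 2, 1) of the torus.

```python
import math
from collections import Counter

# Assumed radii of the embedded torus (any values with R_MAJOR > R_MINOR work).
R_MAJOR, R_MINOR = 2.0, 1.0

def hessian_diagonal(theta, phi):
    """Second derivatives of h = (R + r cos(theta)) cos(phi).

    At the critical points the mixed derivative r sin(theta) sin(phi)
    vanishes, so the Hessian is diagonal in these coordinates.
    """
    h_tt = -R_MINOR * math.cos(theta) * math.cos(phi)
    h_pp = -(R_MAJOR + R_MINOR * math.cos(theta)) * math.cos(phi)
    return h_tt, h_pp

def morse_index(theta, phi):
    """Number of negative eigenvalues of the (diagonal) Hessian."""
    return sum(1 for entry in hessian_diagonal(theta, phi) if entry < 0)

# Critical points: sin(phi) = 0 and sin(theta) = 0.
critical_points = [(t, p) for t in (0.0, math.pi) for p in (0.0, math.pi)]

# N[k] = number of critical points with Morse index k.
N = Counter(morse_index(t, p) for t, p in critical_points)

# Betti numbers of the 2-torus.
B = {0: 1, 1: 2, 2: 1}

morse_inequalities = all(N[k] >= B[k] for k in B)
alternating_N = sum((-1) ** k * N[k] for k in B)
alternating_B = sum((-1) ** k * B[k] for k in B)  # Euler characteristic

print("N_k:", dict(N))
print("Morse inequalities hold:", morse_inequalities)
print("alternating sums:", alternating_N, alternating_B)
```

In this example the inequalities are saturated: there is one minimum, two saddles and one maximum, and both alternating sums equal the Euler characteristic of the torus, which is zero.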
Palais and Smale [251, 252] realized that the Morse inequalities and the Morse relation are also true for a Morse function F on a noncompact and possibly infinite-dimensional Hilbert manifold, provided that F is bounded below and satisfies a technical condition known as Condition C or Palais-Smale condition. In that case, the N_{k} and B_{k} need not be finite.
The standard application of Morse theory is the geodesic problem for Riemannian (i.e., positive definite) metrics: given two points in a Riemannian manifold, to find the geodesics that join them. In this case F is the “energy functional” (squared-length functional). Varying the energy functional is related to varying the length functional like Hamilton’s principle is related to Maupertuis’ principle in classical mechanics. For the space \({\mathcal X}\) one chooses, in the Palais-Smale approach [251], the H^{1}-curves between the given two points. (An H^{n}-curve is a curve with locally square-integrable nth derivative.) This is an infinite-dimensional Hilbert manifold. It has the same homotopy type (and thus the same Betti numbers) as the loop space of the Riemannian manifold. (The loop space of a connected topological space is the space of all continuous curves joining any two fixed points.) On this Hilbert manifold, the energy functional is always bounded from below, and its critical points are exactly the geodesics between the given endpoints. A critical point (geodesic) is nondegenerate if the two endpoints are not conjugate to each other, and its Morse index is the number of conjugate points in the interior, counted with multiplicity (“Morse index theorem”). The Palais-Smale condition is satisfied if the Riemannian manifold is complete. So one has the following result: Fix any two points in a complete Riemannian manifold that are not conjugate to each other along any geodesic. Then the Morse inequalities (59) and the Morse relation (60) are true, with N_{k} denoting the number of geodesics with Morse index k between the two points and B_{k} denoting the kth Betti number of the loop space of the Riemannian manifold. The same result is achieved in the original version of Morse theory [229] (cf. [224]) by choosing for \({\mathcal X}\) the space of broken geodesics between the two given points, with N break points, and sending N → ∞ at the end.
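A classical instance of this result is the round unit 2-sphere, sketched below as an illustration chosen here. Two points at angular distance d with 0 < d < π are joined by great-circle arcs of lengths d + 2πm and 2π(m + 1) − d for m = 0, 1, 2, …. Along each arc, conjugate points occur at arc lengths π, 2π, …, each with multiplicity one, so the Morse index of an arc of length L is the number of multiples of π in the open interval (0, L). The function names and the value d = 1 are assumptions made for this example.

```python
import math
from collections import Counter

def geodesic_lengths(d, windings):
    """Lengths of the great-circle arcs joining two points at distance d.

    For each number m of additional full turns there is one arc going the
    short way round and one arc going the long way round.
    """
    if not 0.0 < d < math.pi:
        raise ValueError("endpoints must be neither identical nor antipodal")
    lengths = []
    for m in range(windings):
        lengths.append(2.0 * math.pi * m + d)
        lengths.append(2.0 * math.pi * (m + 1) - d)
    return lengths

def morse_index(length):
    """Number of conjugate points (multiples of pi) strictly inside the arc."""
    return int(length // math.pi)

d = 1.0  # assumed angular distance between the two endpoints
indices = sorted(morse_index(L) for L in geodesic_lengths(d, windings=3))

# N[k] = number of geodesics with Morse index k among those enumerated.
N = Counter(indices)

print("Morse indices:", indices)
print("N_k:", dict(N))
```

Every Morse index k occurs exactly once, so N_k = 1 for all k. This agrees with the Betti numbers of the loop space of the 2-sphere, which are B_k = 1 for all k, so that the Morse inequalities hold with equality and there are infinitely many geodesics between the two points.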
Using this standard example of Morse theory as a pattern, one can prove an analogous result for Kovner’s version of Fermat’s principle. The following hypotheses have to be satisfied:

(M1)
p_{O} is a point and γ_{S} is a timelike curve in a globally hyperbolic spacetime (\({\mathcal M},g\)).

(M2)
γ_{S} does not meet the caustic of the past light cone of p_{O}.

(M3)
Every continuous curve from p_{O} to γ_{S} can be continuously deformed into a past-oriented lightlike curve, with all intermediary curves starting at p_{O} and terminating on γ_{S}.
The global hyperbolicity assumption in Statement (M1) is analogous to the completeness assumption in the Riemannian case. Statement (M2) is the direct analogue of the nonconjugacy condition in the Riemannian case. Statement (M3) is necessary for relating the space of trial paths (i.e., of past-oriented lightlike curves from p_{O} to γ_{S}) to the loop space of the spacetime manifold or, equivalently, to the loop space of a Cauchy surface. If Statements (M1), (M2), and (M3) are valid, the Morse inequalities (59) and the Morse relation (60) are true, with N_{k} denoting the number of past-oriented lightlike geodesics from p_{O} to γ_{S} that have k conjugate points in their interior, counted with multiplicity, and B_{k} denoting the kth Betti number of the loop space of \({\mathcal M}\) or, equivalently, of a Cauchy surface. This result was proven by Uhlenbeck [327] à la Morse and Milnor, and by Giannoni and Masiello [136] in an infinite-dimensional Hilbert manifold setting à la Palais and Smale. A more general version, applying to spacetime regions with boundaries, was worked out by Giannoni, Masiello, and Piccione [137, 138]. In the work of Giannoni et al., the proofs are given in greater detail than in the work of Uhlenbeck.
If Statements (M1), (M2), and (M3) are satisfied, Morse theory gives us the following results about the number of images of γ_{S} on the sky of p_{O} (cf. [220]):

(R1)
If \({\mathcal M}\) is not contractible to a point, there are infinitely many images. This follows from Equation (59) because for the loop space of a noncontractible space either B_{0} is infinite or almost all B_{k} are different from zero [304].

(R2)
If \({\mathcal M}\) is contractible to a point, the total number of images is infinite or odd. This follows from Equation (60) because in this case the loop space of \({\mathcal M}\) is contractible to a point, so all Betti numbers B_{k} vanish with the exception of B_{0} = 1. As a consequence, Equation (60) can be written as N_{+} − N_{−} = 1, where N_{+} is the number of images with even parity (geodesics with even Morse index) and N_{−} is the number of images with odd parity (geodesics with odd Morse index), hence N_{+} + N_{−} = 2N_{−} + 1.
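The counting argument behind Statement (R2) can be spelled out in a few lines. The following sketch takes a hypothetical list of Morse indices, one per image, and checks the Morse inequalities and the Morse relation for the contractible case, i.e., for B_0 = 1 and B_k = 0 for k ≥ 1. The function name and the sample index lists are assumptions made for illustration and do not describe any particular spacetime.

```python
from collections import Counter

def check_contractible_case(morse_indices):
    """Check the Morse inequalities and the Morse relation for B_0 = 1 and
    B_k = 0 (k >= 1).

    Returns (N_plus, N_minus, consistent), where N_plus and N_minus count
    the images with even and odd Morse index, respectively.
    """
    counts = Counter(morse_indices)
    n_plus = sum(n for k, n in counts.items() if k % 2 == 0)
    n_minus = sum(n for k, n in counts.items() if k % 2 == 1)
    # Morse inequalities: only N_0 >= B_0 = 1 is a nontrivial condition.
    inequalities_ok = counts[0] >= 1
    # Morse relation: the alternating sum of the N_k equals 1.
    relation_ok = (n_plus - n_minus) == 1
    return n_plus, n_minus, inequalities_ok and relation_ok

# One image without conjugate points: consistent, total number 1.
single = check_contractible_case([0])
# Two minima and one saddle: consistent, total number 3.
triple = check_contractible_case([0, 1, 0])
# One minimum and one saddle: violates the Morse relation.
double = check_contractible_case([0, 1])

print(single, triple, double)
```

Whenever the check succeeds for a finite list, the total number of images is N_+ + N_− = 2N_− + 1 and hence odd.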
These results apply, in particular, to the following situations of physical interest:
3.3.1 Black hole spacetimes
Let (\({\mathcal M},g\)) be the domain of outer communication of the Kerr spacetime, i.e., the region between the (outer) horizon and infinity (see Section 5.8). Then the assumption of global hyperbolicity is satisfied and \({\mathcal M}\) is not contractible to a point. Statement (M3) is satisfied if γ_{S} is inextendible and approaches neither the horizon nor (past lightlike) infinity for t → −∞. (This can be checked with the help of an analytical criterion that is called the “metric growth condition” in [327].) If, in addition, Statement (M2) is satisfied, the reasoning of Statement (R1) applies. Hence, a Kerr black hole produces infinitely many images. The same argument can be applied to black holes with (electric, magnetic, Yang-Mills, …) charge.
3.3.2 Asymptotically simple and empty spacetimes
As discussed in Section 3.4, asymptotically simple and empty spacetimes are globally hyperbolic and contractible to a point. They can be viewed as models of isolated transparent gravitational lenses. Statement (M3) is satisfied if γ_{S} is inextendible and bounded away from past lightlike infinity ℐ^{−}. If, in addition, Statement (M2) is satisfied, Statement (R2) guarantees that the number of images is infinite or odd. If it were infinite, we would have as the limit curve a past-inextendible lightlike geodesic that would not go out to ℐ^{−}, in contradiction to the definition of asymptotic simplicity. So the number of images must be finite and odd. The same odd-number theorem can also be proven with other methods (see Section 3.4).
In this way Morse theory provides us with precise mathematical versions of the statements “A black hole produces infinitely many images” and “An isolated transparent gravitational lens produces an odd number of images”. When comparing this theoretical result with observations one has to be aware of the fact that some images might be hidden behind the deflecting mass, some might be too faint to be detected, and some might be too close together to be resolved.
In conformally stationary spacetimes, with γ_{S} being an integral curve of the conformal Killing vector field, a simpler version of Fermat’s principle and Morse theory can be used (see Section 4.2).
3.4 Lensing in asymptotically simple and empty spacetimes
In elementary optics one often considers “light sources at infinity” which are characterized by the fact that all light rays emitted from such a source are parallel to each other. In general relativity, “light sources at infinity” can be defined if one restricts to a special class of spacetimes. These spacetimes, known as “asymptotically simple and empty”, are, in particular, globally hyperbolic. Their formal definition, which is due to Penrose [258], reads as follows (cf. [154], p. 222, and [117], Section 2.3). (Recall that a spacetime is called “strongly causal” if each neighborhood of an event p admits a smaller neighborhood that is intersected by any nonspacelike curve at most once.)
A spacetime (\({\mathcal M},g\)) is called asymptotically simple and empty if there is a strongly causal spacetime (\(\tilde {\mathcal M},\tilde g\)) with the following properties:

(S1)
\({\mathcal M}\) is an open submanifold of \({\tilde {\mathcal M}}\) with a nonempty boundary \(\partial {\mathcal M}\).

(S2)
There is a smooth function \(\Omega:\tilde {\mathcal M} \rightarrow {\mathbb R}\) such that \({\mathcal M} = \{p \in \tilde {\mathcal M}\vert \Omega (p) > 0\}, \partial {\mathcal M} = \{p \in \tilde {\mathcal M}\vert \Omega (p) = 0\}\), dΩ ≠ 0 everywhere on \(\partial {\mathcal M}\) and \(\tilde g = {\Omega ^2}g\) on \({\mathcal M}\).

(S3)
Every inextendible lightlike geodesic in \({\mathcal M}\) has past and future endpoints on \(\partial {\mathcal M}\).

(S4)
There is a neighborhood \({\mathcal V}\) of \(\partial {\mathcal M}\) such that the Ricci tensor of g vanishes on \({\mathcal V} \cap {\mathcal M}\).
Asymptotically simple and empty spacetimes are mathematical models of transparent uncharged gravitating bodies that are isolated from all other gravitational sources. In view of lensing, the transparency condition (S3) is particularly important.
We now summarize some wellknown facts about asymptotically simple and empty spacetimes (cf. again [154], p. 222, and [117], Section 2.3). Every asymptotically simple and empty spacetime is globally hyperbolic. \(\partial {\mathcal M}\) is a \({\tilde g}\)lightlike hypersurface of \({\tilde {\mathcal M}}\). It has two connected components, denoted ℐ^{+} and ℐ^{}. Each lightlike geodesic in (\({\mathcal M},g\)) has past endpoint on ℐ^{} and future endpoint on ℐ_{+}roch [134] gave a proof that every Cauchy surface \({\mathcal C}\) of an asymptotically simple and empty spacetime has topology ℝ^{3} and that ℐ_{±} has topology S^{2} × ℝ. The original proof, which is repeated in [154], is incomplete. A complete proof that \({\mathcal C}\) must be contractible and that ℐ_{±} has topology S^{2} × ℝ was given by Newman and Clarke [238] (cf. [237]); the stronger statement that C must have topology ℝ^{3} needs the assumption that the Poincaré conjecture is true (i.e., that every compact and simply connected 3manifold is a 3sphere). In [238] the authors believed that the Poincaré conjecture was proven, but the proof they are refering to was actually based on an error. If the most recent proof of the Poincaré conjecture by Perelman [263] (cf. [346]) turns out to be correct, this settles the matter.
As ℐ_{±} is a lightlike hypersurface in \({\tilde {\mathcal M}}\), it is in particular a wave front in the sense of Section 2.2. The generators of ℐ_{±} are the integral curves of the gradient of Ω. The generators of ℐ_{} can be interpreted as the “worldlines” of light sources at infinity that send light into \({\mathcal M}\). The generators of ℐ_{+} can be interpreted as the “worldlines” of observers at infinity that receive light from \({\mathcal M}\). This interpretation is justified by the observation that each generator of ℐ_{±} is the limit curve for a sequence of timelike curves in \({\mathcal M}\).
For an observation event p_{O} inside \({\mathcal M}\) and light sources at infinity, lensing can be investigated in terms of the exact lens map (recall Section 2.1), with the role of the source surface \({\mathcal T}\) played by ℐ_{}. (For the mathematical properties of the lens map it is rather irrelevant whether the source surface is timelike, lightlike or even spacelike. What matters is that the arriving light rays meet the source surface transversely.) In this case the lens map is a map S^{2} → S^{2}, namely from the celestial sphere of the observer to the set of all generators of ℐ_{}. One can construct it in two steps: First determine the intersection of the past light cone of p_{O} with ℐ_{}, then project along the generators. The intersections of light cones with ℐ_{±} (“light cone cuts of null infinity”) have been studied in [189, 188].
One can assign a mapping degree (= Brouwer degree = winding number) to the lens map S^{2} → S^{2} and prove that it must be ±1 [270]. (The proof is based on ideas of [238, 237]. Earlier proofs of similar statements, namely [188], Lemma 1, and [268], Theorem 6, are incorrect, as outlined in [270].) Based on this result, the following odd-number theorem can be proven for observer and light source inside \({\mathcal M}\) [270]: Fix a point p_{O} and a timelike curve γ_{S} in an asymptotically simple and empty spacetime (\({\mathcal M},g\)). Assume that the image of γ_{S} is a closed subset of \({\tilde {\mathcal M}}\) ∖ ℐ^{+} and that γ_{S} meets neither the point p_{O} nor the caustic of the past light cone of p_{O}. Then the number of past-pointing lightlike geodesics from p_{O} to γ_{S} in \({\mathcal M}\) is finite and odd. The same result can be proven with the help of Morse theory (see Section 3.3).
We will now give an argument to the effect that in an asymptotically simple and empty spacetime the non-occurrence of multiple imaging is rather exceptional. The argument starts from a standard result that is used in the Penrose-Hawking singularity theorems. This standard result, given as Proposition 4.4.5 in [154], says that along a lightlike geodesic that starts at a point p_{O} there must be a point conjugate to p_{O}, provided that

1. the so-called generic condition is satisfied at p_{O},
2. the weak energy condition is satisfied along the geodesic, and
3. the geodesic can be extended sufficiently far.

The last assumption is certainly true in an asymptotically simple and empty spacetime because there all lightlike geodesics are complete. Hence, the generic condition and the weak energy condition guarantee that every past light cone must have a caustic point. We know from Section 3.1 that this implies multiple imaging for every observer. In other words, the only asymptotically simple and empty spacetimes in which multiple imaging does not occur are nongeneric cases (like Minkowski spacetime) and cases where the gravitating bodies have negative energy.
The result that, under the aforementioned conditions, light cones in an asymptotically simple and empty spacetime must have caustic points is due to [165]. This paper investigates the past light cones of points on ℐ^{+} and their caustics. These light cones are the generalizations, to an arbitrary asymptotically simple and empty spacetime, of the lightlike hyperplanes in Minkowski spacetime. With their help, the eikonal equation (Hamilton-Jacobi equation) g^{ij}∂_{i}S∂_{j}S = 0 in an asymptotically simple and empty spacetime can be studied in analogy to Minkowski spacetime [126, 125]. In Minkowski spacetime the lightlike hyperplanes are associated with a two-parameter family of solutions to the eikonal equation. In the terminology of classical mechanics such a family is called a complete integral. Knowing a complete integral allows constructing all solutions to the Hamilton-Jacobi equation. In an asymptotically simple and empty spacetime the past light cones of points on ℐ^{+} give us, again, a complete integral for the eikonal equation, but now in a generalized sense, allowing for caustics. These past light cones are wave fronts, in the sense of Section 2.2, and cannot be represented as surfaces S = constant near caustic points. The way in which all other wave fronts can be determined from knowledge of this distinguished family of wave fronts is detailed in [125]. The distinguished family of wave fronts gives a natural choice for the space of trial maps in the Frittelli-Newman variational principle which was discussed in Section 2.9.
4 Lensing in Spacetimes with Symmetry
4.1 Lensing in conformally flat spacetimes
By definition, a spacetime is conformally flat if the conformal curvature tensor (= Weyl tensor) vanishes. An equivalent condition is that every point admits a neighborhood that is conformal to an open subset of Minkowski spacetime. As a consequence, conformally flat spacetimes have the same local conformal symmetry as Minkowski spacetime, that is, they admit 15 independent conformal Killing vector fields. The global topology, however, may be different from the topology of Minkowski spacetime. The class of conformally flat spacetimes includes all (kinematic) Robertson-Walker spacetimes. Other physically interesting examples are some (generalized) interior Schwarzschild solutions and some pure radiation spacetimes. All conformally flat solutions to Einstein’s field equation with a perfect fluid or an electromagnetic field are known (see [311], Section 37.5.3).
If a spacetime is globally conformal to an open subset of Minkowski spacetime, the past light cone of every event is an embedded submanifold. Hence, multiple imaging cannot occur (recall Section 2.8). For instance, multiple imaging occurs in spatially closed but not in spatially open Robertson-Walker spacetimes. In any conformally flat spacetime, there is no image distortion, i.e., a sufficiently small sphere always shows a circular outline on the observer’s sky (recall Section 2.5). Correspondingly, every infinitesimally thin bundle of light rays with a vertex is circular, i.e., the extremal angular diameter distances D_{+} and D_{−} coincide (recall Section 2.4). In addition, D_{+} = D_{−} also coincides with the area distance D_{area}, at least up to sign. D_{+} = D_{−} changes sign at every caustic point. As D_{+} has a zero if and only if D_{−} has a zero, all caustic points of an infinitesimally thin bundle with vertex are of multiplicity two (anastigmatic focusing), so all images have even parity.
The geometry of light bundles can be studied directly in terms of the Jacobi equation (= equation of geodesic deviation) along lightlike geodesics. For a detailed investigation of the latter in conformally flat spacetimes, see [273]. The more special case of Friedmann-Lemaître-Robertson-Walker spacetimes (with dust, radiation, and cosmological constant) is treated in [101]. For bundles with vertex, one is left with one scalar equation for D_{+} = D_{−} = ±D_{area}, that is, the focusing equation (44) with σ = 0. This equation can be explicitly integrated for Friedmann-Robertson-Walker spacetimes (dust without cosmological constant). In this way one gets, for the standard observer field in such a spacetime, relations between redshift and (area or luminosity) distance in closed form [219]. There are generalizations for a Robertson-Walker universe with dust plus cosmological constant [178] and dust plus radiation plus cosmological constant [71]. Similar formulas can be written for the relation between age and redshift [322].
4.2 Lensing in conformally stationary spacetimes
Conformally stationary spacetimes are models for gravitational fields that are time-independent up to an overall conformal factor. (The time-dependence of the conformal factor is important, e.g., if cosmic expansion is to be taken into account.) This is a reasonable model assumption for many, though not all, lensing situations of interest. It allows describing light rays in a 3-dimensional (spatial) formalism that will be outlined in this section. The class of conformally stationary spacetimes includes spherically symmetric and static spacetimes (see Section 4.3) and axisymmetric stationary spacetimes (see Section 4.4). Also, conformally flat spacetimes (see Section 4.1) are conformally stationary, at least locally. A physically relevant example where the conformal-stationarity assumption is not satisfied is lensing by a gravitational wave (see Section 5.11).
By definition, a spacetime is conformally stationary if it admits a timelike conformal Killing vector field W. If W is complete and if there are no closed timelike curves, the spacetime must be a product, \({\mathcal M} \simeq {\mathbb R} \times \hat {\mathcal M}\) with a (Hausdorff and paracompact) 3-manifold \(\hat{\mathcal M}\) and W parallel to the ℝ-lines [148]. If we denote the projection from \({\mathcal M}\) to ℝ by t and choose local coordinates x = (x^{1}, x^{2}, x^{3}) on \(\hat{\mathcal M}\), the metric takes the form
\[g = {e^{2f(t,x)}}\left[ - {\left( dt + {{\hat \phi}_\mu}(x)\,d{x^\mu} \right)^2} + {{\hat g}_{\mu \nu}}(x)\,d{x^\mu}d{x^\nu} \right] \tag{61}\]
with μ, ν, … = 1, 2, 3. The conformal factor e^{2f} does not affect the lightlike geodesics apart from their parametrization. So the paths of light rays are completely determined by the metric \(\hat g = {{\hat g}_{\mu \nu}}(x)d{x^\mu}d{x^\nu}\) and the one-form \(\hat \phi = {{\hat \phi}_\mu}(x)d{x^\mu}\) which live on \(\hat{\mathcal M}\). The metric \({\hat g}\) must be positive definite to give a spacetime metric of Lorentzian signature. We call f the redshift potential, \({\hat g}\) the Fermat metric and \({\hat \phi}\) the Fermat one-form. The motivation for these names will become clear from the discussion below.
If \({{\hat \phi}_\mu} = {\partial _\mu}h\), where h is a function of x = (x^{1}, x^{2}, x^{3}), we can change the time coordinate according to t ↦ t + h(x), thereby transforming \({{\hat \phi}_\mu}d{x^\mu}\) to zero, i.e., making the surfaces t = constant orthogonal to the t-lines. This is the conformally static case. Also, Equation (61) includes the stationary case (f independent of t) and the static case (\({{\hat \phi}_\mu} = {\partial _\mu}h\) and f independent of t).
In Section 2.9 we have discussed Kovner’s version of Fermat’s principle which characterizes the lightlike geodesics between a point (observation event) p_{O} and a timelike curve (worldline of light source) γ_{S}. In a conformally stationary spacetime we may specialize to the case that γ_{S} is an integral curve of the conformal Killing vector field, parametrized by the “conformal time” coordinate t (in the past-pointing sense, to be in agreement with Section 2.9). Without loss of generality, we may assume that the observation event p_{O} takes place at t = 0. Then for each trial path (past-oriented lightlike curve) λ from p_{O} to γ_{S} the arrival time is equal to the travel time in terms of the time function t. By Equation (61) this puts the arrival time functional into the following coordinate form
\[T(\lambda) = \int_{\ell_1}^{\ell_2} \left( \sqrt{{{\hat g}_{\mu \nu}}(x)\frac{d{x^\mu}}{d\ell}\frac{d{x^\nu}}{d\ell}} + {{\hat \phi}_\mu}(x)\frac{d{x^\mu}}{d\ell} \right) d\ell \tag{62}\]
where ℓ is any parameter along the trial path, ranging over an interval [ℓ_{1}, ℓ_{2}] that depends on the individual curve. The right-hand side of Equation (62) is a functional for curves in \(\hat{\mathcal M}\) with fixed endpoints. The projections to \(\hat{\mathcal M}\) of light rays are the stationary points of this functional. In general, the right-hand side of Equation (62) is the length functional of a Finsler metric. In the conformally static case \({{\hat \phi}_\mu} = {\partial _\mu}h\), the integral over \({{\hat \phi}_\mu}(x)d{x^\mu}/d\ell\) is the same for all trial paths, so we are left with the length functional of the Fermat metric \({\hat g}\). In this case the light rays, if projected to \(\hat{\mathcal M}\), are the geodesics of \({\hat g}\). Note that the travel time functional (62) is invariant under reparametrization; in the terminology of classical mechanics, it is a special case of Maupertuis’ principle. It is often convenient to switch to a parametrization-dependent variational principle which, in the terminology of classical mechanics, is called Hamilton’s principle. The Maupertuis principle with action functional (62) corresponds to Hamilton’s principle with a Lagrangian
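\[{\mathcal L}\left( x,\frac{dx}{d\ell} \right) = \frac{1}{2}{{\hat g}_{\mu \nu}}(x)\frac{d{x^\mu}}{d\ell}\frac{d{x^\nu}}{d\ell} + {{\hat \phi}_\mu}(x)\frac{d{x^\mu}}{d\ell} \tag{63}\]
(see, e.g., Carathéodory [52], Sections 304–307). The pertaining Euler-Lagrange equations read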
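\[{{\hat g}_{\mu \nu}}\left( \frac{{d^2}{x^\nu}}{d{\ell^2}} + \hat \Gamma _{\sigma \tau}^\nu \frac{d{x^\sigma}}{d\ell}\frac{d{x^\tau}}{d\ell} \right) = \left( {\partial _\mu}{{\hat \phi}_\nu} - {\partial _\nu}{{\hat \phi}_\mu} \right)\frac{d{x^\nu}}{d\ell} \tag{64}\]
where \(\hat \Gamma _{\sigma \tau}^\nu\) are the Christoffel symbols of the Fermat metric \({\hat g}\). The solutions admit the constant of motion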
\[{{\hat g}_{\mu \nu}}(x)\frac{d{x^\mu}}{d\ell}\frac{d{x^\nu}}{d\ell} = {\rm{constant}} \tag{65}\]
which can be chosen equal to 1 for each ray, such that ℓ gives the \({\hat g}\)-arclength. By Equation (62), the latter gives the travel time if \(\hat \phi = 0\). According to Equation (64), the Fermat two-form
\[\hat \omega = d\hat \phi = \frac{1}{2}\left( {\partial _\mu}{{\hat \phi}_\nu} - {\partial _\nu}{{\hat \phi}_\mu} \right)d{x^\mu} \wedge d{x^\nu} \tag{66}\]
exerts a kind of Coriolis force on the light rays. This force has the same mathematical structure as the Lorentz force in a magnetostatic field. In this analogy, \(\hat \phi\) corresponds to the magnetic (vector) potential. In other words, light rays in a conformally stationary spacetime behave like charged particles, with fixed charge-to-mass ratio, in a magnetostatic field \({\hat \omega}\) on a Riemannian manifold (\(\hat{\mathcal M},\hat g\)).
Fermat’s principle in static spacetimes dates back to Weyl [347] (cf. [206, 319]). The stationary case was treated by Pham Mau Quan [284], who even took an isotropic medium into account, and later, in a more elegant presentation, by Brill [42]. These versions of Fermat’s principle are discussed in several textbooks on general relativity (see, e.g., [225, 116, 312] for the static and [200] for the stationary case). A detailed discussion of the conformally stationary case can be found in [265]. Fermat’s principle in conformally stationary spacetimes was used as the starting point for deriving the lens equation of the quasi-Newtonian approximation formalism by Schneider [297] (cf. [299]). As an alternative to the name “Fermat metric” (used, e.g., in [116, 312, 265]), the names “optical metric” (see, e.g., [141, 79]) and “optical reference geometry” (see, e.g., [4]) are also used.
In the conformally static case, one can apply the standard Morse theory for Riemannian geodesics to the Fermat metric \({\hat g}\) to get results on the number of \({\hat g}\)-geodesics joining two points in space. This immediately gives results on the number of lightlike geodesics joining a point in spacetime to an integral curve of W = ∂_{t}. Completeness of the Fermat metric corresponds to global hyperbolicity of the spacetime metric. The relevant techniques, and their generalization to (conformally) stationary spacetimes, are detailed in a book by Masiello [218]. (Note that, in contrast to standard terminology, Masiello’s definition of a stationary spacetime includes the assumption that the hypersurfaces t = constant are spacelike.) The resulting Morse theory is a special case of the Morse theory for Fermat’s principle in globally hyperbolic spacetimes (see Section 3.3). In addition to Morse theory, other standard methods from Riemannian geometry have been applied to the Fermat metric, e.g., convexity techniques [139, 140].
If the metric (61) is conformally static, \({{\hat \phi}_\mu}(x) = {\partial _\mu}h(x)\), and if the Fermat metric is conformal to the Euclidean metric, \({{\hat g}_{\mu \nu}}(x) = n{(x)^2}{\delta _{\mu \nu}}\), the arrival time functional (62) can be written as
\[T(\lambda) = \int_{\ell_1}^{\ell_2} n(x)\,d\ell + h\big(x(\ell_2)\big) - h\big(x(\ell_1)\big) \tag{67}\]
where ℓ is Euclidean arclength. Hence, Fermat’s principle reduces to its standard optics form for an isotropic medium with index of refraction n on Euclidean space. As a consequence, light propagation in a spacetime with the assumed properties can be mimicked by a medium with an appropriately chosen index of refraction. This remark applies, e.g., to spherically symmetric and static spacetimes (see Section 4.3) and, in particular, to the Schwarzschild spacetime (see Section 5.1). The analogy with ordinary optics in media has been used for constructing, in the laboratory, analogue models for light propagation in general-relativistic spacetimes (see [242]).
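The optical form of Fermat’s principle can be illustrated with a short numerical sketch. The Python fragment below approximates the travel time, i.e., the integral of n along a path, for a polygonal trial path in Euclidean 3-space. The function names, the midpoint discretization, and the sample path are illustrative assumptions. The index of refraction used as an example is the standard isotropic-coordinate expression for the Schwarzschild spacetime, n = (1 + m/(2r))^{3}/(1 − m/(2r)), which is assumed here and not derived in this section.

```python
import math

def travel_time(path, n):
    """Approximate the integral of n along a polygonal path.

    Each straight segment contributes n(midpoint) * Euclidean length.
    """
    total = 0.0
    for p, q in zip(path[:-1], path[1:]):
        mid = tuple(0.5 * (a + b) for a, b in zip(p, q))
        total += n(mid) * math.dist(p, q)
    return total

def n_schwarzschild_isotropic(m):
    """Assumed index of refraction for mass m in isotropic coordinates."""
    def n(x):
        r = math.sqrt(sum(c * c for c in x))
        return (1.0 + m / (2.0 * r)) ** 3 / (1.0 - m / (2.0 * r))
    return n

def straight_path(a, b, steps):
    """Equally spaced points on the straight line from a to b."""
    return [tuple(ai + (bi - ai) * k / steps for ai, bi in zip(a, b))
            for k in range(steps + 1)]

# Trial path passing the center at coordinate distance 10 (units with m = 1).
path = straight_path((-100.0, 10.0, 0.0), (100.0, 10.0, 0.0), 2000)
t_flat = travel_time(path, lambda x: 1.0)                     # vacuum, n = 1
t_curved = travel_time(path, n_schwarzschild_isotropic(1.0))  # with the medium
```

For n = 1 the functional reduces to the Euclidean length of the path; with the assumed index of refraction the same trial path takes longer, because n > 1 everywhere outside the horizon. Actual light rays would be obtained by varying the path until the travel time is stationary.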
Extremizing the functional (67) is formally analogous to Maupertuis’ principle for a particle in a scalar potential on flat space, which is discussed in any book on classical mechanics. Dropping the assumption that the Fermat one-form is a differential, but still requiring the Fermat metric to be conformal to the Euclidean metric, corresponds to introducing an additional vector potential. This form of the optical-mechanical analogy, for light rays in stationary spacetimes whose Fermat metric is conformal to the Euclidean metric, is discussed, e.g., in [7].
The conformal factor e^{2f} in Equation (61) does not affect the paths of light rays. However, it does affect redshifts and distance measures (recall Section 2.4). If g is of the form (61), for every lightlike geodesic λ the quantity \(g(\dot \lambda, {\partial _t})\) is a constant of motion. This leads to a particularly simple form of the general redshift formula (36). We consider an arbitrary lightlike geodesic s ↦ λ(s) in terms of its coordinate representation s ↦ (t(s), x^{1}(s), x^{2}(s), x^{3}(s)). If both observer and emitter are at rest in the sense that their 4-velocities U_{O} and U_{S} are parallel to W = ∂_{t}, Equation (36) can be rewritten as
This justifies calling f the redshift potential. It is shown in [151] that there is a redshift potential for a congruence of timelike curves in a spacetime if and only if the timelike curves are the integral curves of a conformal Killing vector field. The notion of a redshift potential or redshift function is also discussed in [74]. Note that Equation (68) immediately determines the redshift in conformally stationary spacetimes for any pair of observer and emitter. If the velocity of the observer or of the emitter is not parallel to W = ∂_{t}, one just has to add the usual special-relativistic Doppler factor.
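The role of f as a redshift potential can be made concrete with a small sketch. It assumes that e^{2f} multiplies the whole metric, as described above, so that observers at rest have 4-velocity e^{−f}∂_{t} and, because \(g(\dot \lambda, {\partial _t})\) is conserved, the measured frequency scales as e^{−f}. This gives log(1 + z) = f(observer) − f(emitter). The function names are hypothetical, and the Schwarzschild potential f(r) = ½ log(1 − 2m/r) is the standard expression, assumed here for illustration.

```python
import math

def redshift(f_observer, f_emitter):
    """Redshift z between observer and emitter at rest.

    Assumes log(1 + z) = f(observer) - f(emitter), which follows from
    the frequency scaling as exp(-f) for observers parallel to W.
    """
    return math.exp(f_observer - f_emitter) - 1.0

def f_schwarzschild(r, m):
    """Assumed redshift potential of the Schwarzschild metric, r > 2m."""
    return 0.5 * math.log(1.0 - 2.0 * m / r)

# Emitter at r = 4m, observer far away (units with m = 1).
z_out = redshift(f_schwarzschild(1.0e9, 1.0), f_schwarzschild(4.0, 1.0))
# Roles exchanged: light falling inwards is blueshifted (z < 0).
z_in = redshift(f_schwarzschild(4.0, 1.0), f_schwarzschild(1.0e9, 1.0))
```

With these assumptions, light climbing out of the potential well is redshifted (1 + z close to √2 for an emitter at r = 4m and a distant observer), and exchanging the roles yields a blueshift.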
Conformally stationary spacetimes can be characterized by another interesting property. Let W be a timelike vector field in a spacetime and fix three observers whose worldlines are integral curves of W. Then the angle under which two of them are seen by the third one remains constant in the course of time, for any choice of the observers, if and only if W is proportional to a conformal Killing vector field. For a proof see [151].
4.3 Lensing in spherically symmetric and static spacetimes
The class of spherically symmetric and static spacetimes is of particular relevance in view of lensing, because it includes models for nonrotating stars and black holes (see Sections 5.1, 5.2, 5.3), but also for more exotic objects such as wormholes (see Section 5.4), monopoles (see Section 5.5), naked singularities (see Section 5.6), and Boson or Fermion stars (see Section 5.7). Here we collect the relevant formulas for an unspecified spherically symmetric and static metric. We find it convenient to write the metric in the form
\[g = {e^{2f(r)}}\left[ - d{t^2} + S{(r)^2}d{r^2} + R{(r)^2}\left( d{\vartheta ^2} + {\sin ^2}\vartheta \,d{\phi ^2} \right) \right]. \tag{69}\]
As Equation (69) is a special case of Equation (61), all results of Section 4.2 for conformally stationary metrics apply. However, much stronger results are possible because for metrics of the form (69) the geodesic equation is completely integrable. Hence, all relevant quantities can be determined explicitly in terms of integrals over the metric coefficients.
4.3.1 Redshift and Fermat geometry
Comparison of Equation (69) with the general form (61) of a conformally stationary spacetime shows that here the redshift potential f is a function of r only, the Fermat one-form \({\hat \phi}\) vanishes, and the Fermat metric \({\hat g}\) is of the special form
\[\hat g = S{(r)^2}d{r^2} + R{(r)^2}\left( d{\vartheta ^2} + {\sin ^2}\vartheta \,d{\phi ^2} \right). \tag{70}\]
By Fermat’s principle, the geodesics of \({\hat g}\) coincide with the projection to 3-space of light rays. The travel time (in terms of the time coordinate t) of a lightlike curve coincides with the \({\hat g}\)-arclength of its projection. By symmetry, every \({\hat g}\)-geodesic stays in a plane through the origin. From Equation (70) we read that the sphere of radius r has area 4πR(r)^{2} with respect to the Fermat metric. Also, Equation (70) implies that the second fundamental form of this sphere is a multiple of its first fundamental form, with a factor −R′(r) (R(r) S(r))^{−1}. If
\[R'({r_p}) = 0, \tag{71}\]
the sphere r = r_{p} is totally geodesic, i.e., a \({\hat g}\)-geodesic that starts tangent to this sphere remains in it. The best known example for such a light sphere or photon sphere is the sphere r = 3m in the Schwarzschild spacetime (see Section 5.1). Light spheres also occur in the spacetimes of wormholes (see Section 5.4). If R″(r_{p}) < 0, the circular light rays in a light sphere are stable with respect to radial perturbations, and if R″(r_{p}) > 0, they are unstable as in the Schwarzschild case. The condition under which a spherically symmetric static spacetime admits a light sphere was first given by Atkinson [13]. Abramowicz [1] has shown that for an observer traveling along a circular light orbit (with subluminal velocity) there is no centrifugal force and no gyroscopic precession. Claudel, Virbhadra, and Ellis [59] investigated, with the help of Einstein’s field equation and energy conditions, the amount of matter surrounded by a light sphere. Among other things, they found an energy condition under which a spherically symmetric static black hole must be surrounded by a light sphere. A purely kinematical argument shows that any spherically symmetric and static spacetime that has a horizon at r = r_{H} and is asymptotically flat for r → ∞ must contain a light sphere at some radius between r_{H} and ∞ (see Hasse and Perlick [153]). In the same article, it is shown that in any spherically symmetric static spacetime with a light sphere there is gravitational lensing with infinitely many images. Bozza [37] investigated a strong-field limit of lensing in spherically symmetric static spacetimes which applies to light rays that come close to an unstable light sphere, as opposed to the well-known weak-field limit. (Actually, the term “strong-bending limit” would be more appropriate because the gravitational field, measured in terms of tidal forces, need not be particularly strong near an unstable light sphere.) This limit applies, in particular, to light rays that approach the sphere r = 3m in the Schwarzschild spacetime (see [39] and, for illustrations, Figures 15, 16, and 17).
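The light-sphere criterion lends itself to a quick numerical check. The following sketch locates a zero of R′ by bisection and classifies it by the sign of R″, following the stability statement above (R″ < 0 stable, R″ > 0 unstable). The Schwarzschild expression R(r) = r (1 − 2m/r)^{−1/2}, the finite-difference step sizes, the bracketing interval, and all function names are assumptions made for illustration.

```python
import math

def R_schwarzschild(r, m=1.0):
    """Assumed Fermat-metric area radius of Schwarzschild, r > 2m."""
    return r / math.sqrt(1.0 - 2.0 * m / r)

def derivative(func, r, h=1e-5):
    """Central finite-difference approximation of func'(r)."""
    return (func(r + h) - func(r - h)) / (2.0 * h)

def second_derivative(func, r, h=1e-4):
    """Central finite-difference approximation of func''(r)."""
    return (func(r + h) - 2.0 * func(r) + func(r - h)) / (h * h)

def find_light_sphere(R, r_lo, r_hi, tol=1e-10):
    """Bisection on R'(r) = 0; assumes R' changes sign on [r_lo, r_hi]."""
    d_lo = derivative(R, r_lo)
    while r_hi - r_lo > tol:
        r_mid = 0.5 * (r_lo + r_hi)
        d_mid = derivative(R, r_mid)
        if (d_mid > 0.0) == (d_lo > 0.0):
            r_lo, d_lo = r_mid, d_mid   # root lies in the upper half
        else:
            r_hi = r_mid                # root lies in the lower half
    return 0.5 * (r_lo + r_hi)

r_p = find_light_sphere(R_schwarzschild, 2.5, 10.0)
is_stable = second_derivative(R_schwarzschild, r_p) < 0.0
```

Under these assumptions the bisection converges to r_p close to 3m, and R″(r_p) is positive, so the circular light rays are classified as unstable, in agreement with the Schwarzschild case mentioned in the text.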
4.3.2 Index of refraction and embedding diagrams
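Transformation to an isotropic radius coordinate \({\tilde r}\) via
\[\frac{S(r)\,dr}{R(r)} = \frac{d\tilde r}{\tilde r} \tag{72}\]
takes the Fermat metric (70) to the form
\[\hat g = n{(\tilde r)^2}\left( d{{\tilde r}^2} + {{\tilde r}^2}\left( d{\vartheta ^2} + {\sin ^2}\vartheta \,d{\phi ^2} \right) \right) \tag{73}\]
where
\[n(\tilde r) = \frac{R(r)}{\tilde r}. \tag{74}\]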
On the right-hand side r has to be expressed by \({\tilde r}\) with the help of Equation (72). The results of Section 4.2 imply that the lightlike geodesics in a spherically symmetric static spacetime are equivalent to the light rays in a medium with index of refraction (74) on Euclidean 3-space. For arbitrary metrics of the form (69), this result is due to Atkinson [13]. It reduces the lightlike geodesic problem in a spherically symmetric static spacetime to a standard problem in ordinary optics, as treated, e.g., in [212], §27, and [199], Section 4. One can combine this result with our earlier observation that the integral in Equation (67) has the same form as the functional in Maupertuis’ principle in classical mechanics. This demonstrates that light rays in spherically symmetric and static spacetimes behave like particles in a spherically symmetric potential on Euclidean 3-space (cf., e.g., [105]). If the embeddability condition
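\[S{(r)^2} \ge R'{(r)^2} \tag{75}\]
is satisfied, we define a function Z(r) by
\[Z(r) = \int^r \sqrt{S{(r)^2} - R'{(r)^2}}\;dr; \tag{76}\]
then the Fermat metric (70) reads
\[\hat g = {\left( dR(r) \right)^2} + {\left( dZ(r) \right)^2} + R{(r)^2}\left( d{\vartheta ^2} + {\sin ^2}\vartheta \,d{\phi ^2} \right). \tag{77}\]
If restricted to the equatorial plane ϑ = π/2, the metric (77) describes a surface of revolution, embedded into Euclidean 3-space as
\[(r,\phi ) \mapsto \left( R(r)\cos \phi ,\;R(r)\sin \phi ,\;Z(r) \right). \tag{78}\]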
Such embeddings of the Fermat geometry have been visualized for several spacetimes of interest (see Figure 11 for the Schwarzschild case and [160] for other examples). This is quite instructive because from a picture of a surface of revolution one can read the qualitative features of its geodesics without calculating them. Note that Equation (72) defines the isotropic radius coordinate uniquely up to a multiplicative constant. Hence, the straight lines in this coordinate representation give us an unambiguously defined reference grid for every spherically symmetric and static spacetime. These straight lines have been called triangulation lines in [62, 63], where their use for calculating bending angles, exactly or approximately, is outlined.
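The embedding construction can be sketched numerically. The fragment below assumes that the embeddability condition takes the form S(r)^{2} ≥ R′(r)^{2} and that the height function satisfies Z′(r) = (S(r)^{2} − R′(r)^{2})^{1/2}, as required for a surface of revolution whose induced metric is S^{2}dr^{2} + R^{2}dϕ^{2}. The Schwarzschild expressions S = (1 − 2m/r)^{−1} and R′ = (1 − 3m/r)(1 − 2m/r)^{−3/2}, the Simpson discretization, and all function names are illustrative assumptions.

```python
import math

def S(r, m):
    """Assumed radial coefficient of the Schwarzschild Fermat metric."""
    return 1.0 / (1.0 - 2.0 * m / r)

def R_prime(r, m):
    """Derivative of the assumed R(r) = r / sqrt(1 - 2m/r)."""
    return (1.0 - 3.0 * m / r) * (1.0 - 2.0 * m / r) ** -1.5

def embeddable(r, m):
    """Embeddability condition, assumed to read S(r)^2 >= R'(r)^2."""
    return S(r, m) ** 2 >= R_prime(r, m) ** 2

def embedding_height(r_start, r_end, m, n=2000):
    """Z(r_end) - Z(r_start) by the composite Simpson rule (n even).

    Assumes the embeddability condition holds on [r_start, r_end].
    """
    def z_prime(r):
        return math.sqrt(S(r, m) ** 2 - R_prime(r, m) ** 2)
    h = (r_end - r_start) / n
    total = z_prime(r_start) + z_prime(r_end)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * z_prime(r_start + k * h)
    return total * h / 3.0

# Height gained by the embedded surface between r = 3m and r = 6m (m = 1).
dz = embedding_height(3.0, 6.0, 1.0)
```

With the assumed Schwarzschild coefficients the condition reduces algebraically to r ≥ 9m/4, so the sketch reports the geometry as embeddable at r = 2.3m but not at r = 2.2m. The pairs (R(r), Z(r)) obtained in this way trace the profile curve of the surface of revolution.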
4.3.3 Light cone
In a spherically symmetric static spacetime, the (past) light cone of an event p_{O} can be written in terms of integrals over the metric coefficients. We first restrict to the equatorial plane ϑ = π/2. The \({\hat g}\)-geodesics are then determined by the Lagrangian
For fixed radius value r_{O}, initial conditions
determine a unique solution r = r(ℓ, Θ), ϕ = φ(ℓ, Θ) of the Euler-Lagrange equations. Θ measures the initial direction with respect to the symmetry axis (see Figure 6). We get all light rays issuing from the event r = r_{O}, ϕ = 0, ϑ = π/2, t = t_{O} into the past by letting Θ range from 0 to π and applying rotations around the symmetry axis. This gives us the past light cone of this event in the form
Ψ and Θ are spherical coordinates on the observer’s sky. If we let t_{O} float over ℝ, we get the observational coordinates (4) for an observer on a tline, up to two modifications. First, t_{O} is not the same as proper time τ; however, they are related just by a constant,
Second, ℓ is not the same as the affine parameter s; along a ray with initial direction Θ, they are related by
The constants of motion
give us the functions r(ℓ, Θ), φ(ℓ, Θ) in terms of integrals,
Here the notation with the dots is a shorthand; it means that the integral is to be decomposed into sections where r(ℓ, Θ) is a monotonic function of ℓ and that the absolute values of the integrals over all sections have to be added up. Turning points occur at radius values where R(r) = R(r_{O}) sin Θ and R′(r) ≠ 0 (see Figure 9). If the metric coefficients S and R have been specified, these integrals can be calculated and give us the light cone (see Figure 12 for an example). Having parametrized the rays with \({\hat g}\)-arclength (= travel time), we immediately get the intersections of the light cone with hypersurfaces t = constant (“instantaneous wave fronts”); see Figures 13, 18, and 19.
4.3.4 Exact lens map
Recall from Section 2.1 that the exact lens map [123] refers to a chosen observation event p_{O} and a chosen “source surface” \({\mathcal T}\). In general, for \({\mathcal T}\) we may choose any 3-dimensional submanifold that is ruled by timelike curves. The latter are to be interpreted as worldlines of light sources. In a spherically symmetric and static spacetime, we may take advantage of the symmetry by choosing for \({\mathcal T}\) a sphere r = r_{S} with its ruling by the t-lines. This restricts the consideration to lensing for static light sources. Note that all static light sources at radius r_{S} undergo the same redshift, log(1 + z) = f(r_{O}) − f(r_{S}). Without loss of generality, we place the observation event p_{O} on the 3-axis at radius r_{O}. This gives us the past light cone in the representation (81). To each ray from the observer, with initial direction characterized by Θ, we can assign the total angle Φ(Θ) the ray sweeps out on its way from r_{O} to r_{S} (see Figure 6). Φ(Θ) is given by Equation (86),
where the same shorthand notation is used as in Equation (86). Φ(Θ) is not necessarily defined for all Θ because not all light rays that start at r_{O} may reach r_{S}. Also, Φ(Θ) may be multivalued because a light ray may intersect the sphere r = r_{S} several times. Equation (81) gives us the (possibly multivalued) lens map
It assigns to each point on the observer’s sky the position of the light source which is seen at that point. Φ(Θ) may take all values between 0 and infinity. For each image we can define the order
which counts how often the ray has met the axis. The standard example where images of arbitrarily high order occur is the Schwarzschild spacetime (see Section 5.1). For many, though not all, applications one may restrict to the case that the spacetime is asymptotically flat and that r_{O} and r_{S} are so large that the spacetime is almost flat at these radius values. For a light ray with turning point at r_{m}(Θ), Equation (87) can then be approximated by
If the entire ray remains in the region where the spacetime is almost flat, Equation (90) gives the usual weak-field approximation of light bending with Φ(Θ) close to π. However, Equation (90) does not require that the ray stays in the region that is almost flat. The integral in Equation (90) becomes arbitrarily large if r_{m}(Θ) comes close to an unstable light sphere, R′(r_{p}) = 0 and R″(r_{p}) > 0. This situation is well known to occur in the Schwarzschild spacetime with r_{p} = 3m (see Section 5.1, in particular Figures 9, 14, and 15). The divergence of Φ(Θ) is always logarithmic [37]. Virbhadra and Ellis [337] (cf. [339] for an earlier version) approximately evaluate Equation (90) for the case that source and observer are almost aligned, i.e., that Φ(Θ) is close to an odd multiple of π. This corresponds to replacing the sphere at r_{S} with its tangent plane. The resulting “almost exact lens map” takes an intermediary position between the exact treatment and the quasi-Newtonian approximation. It was originally introduced for the Schwarzschild metric [337] where it approximates the exact treatment remarkably well within a wide range of validity [119]. On the other hand, neither analytical nor numerical evaluation of the “almost exact lens map” is significantly easier than that of the exact lens map. For situations where the assumption of almost perfect alignment cannot be maintained, the Virbhadra-Ellis lens equation must be modified (see [70]; related material can also be found in [38]).
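The behaviour of Φ described above can be explored with a rough numerical sketch for the Schwarzschild case. It assumes that, for observer and source far away, the total angle is Φ = 2∫ R(r_m) S(r) dr / (R(r) (R(r)^{2} − R(r_m)^{2})^{1/2}), integrated from r_m to infinity, which follows from the constants of motion of the Fermat geodesics. With the assumed Schwarzschild coefficients and u = 1/r this becomes Φ = 2∫ du (u_m^{2} − 2m u_m^{3} − u^{2} + 2m u^{3})^{−1/2} from 0 to u_m. The substitution u = u_m(1 − s^{2}), chosen here to remove the integrable endpoint singularity, the Simpson rule, and the function name are illustrative choices.

```python
import math

def total_angle(r_m, m, n=2000):
    """Angle swept out by a ray from infinity to infinity with
    turning point r_m > 3m in Schwarzschild (assumed integral form).

    With u = u_m (1 - s^2) the integrand becomes 2 sqrt(u_m / G),
    where G = (u + u_m) - 2m (u^2 + u u_m + u_m^2) is positive
    on the whole range as long as r_m > 3m.
    """
    u_m = 1.0 / r_m

    def integrand(s):
        u = u_m * (1.0 - s * s)
        G = (u + u_m) - 2.0 * m * (u * u + u * u_m + u_m * u_m)
        return 2.0 * math.sqrt(u_m / G)

    # Composite Simpson rule on [0, 1]; n must be even.
    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return 2.0 * total * h / 3.0

phi_flat = total_angle(10.0, 0.0)     # no deflector: straight line
phi_weak = total_angle(1000.0, 1.0)   # weak-field regime
phi_near = total_angle(3.1, 1.0)      # turning point near the light sphere
```

For m = 0 the sketch returns π, a straight line. For a large turning-point radius the excess over π is close to the familiar weak-field value 4m/b with b = R(r_m), and the angle grows as r_m approaches 3m, in line with the logarithmic divergence mentioned in the text.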
4.3.4.1 Distance measures, image distortion and brightness of images
For calculating image distortion (see Section 2.5) and the brightness of images (see Section 2.6) we have to consider infinitesimally thin bundles with vertex at the observer. In a spherically symmetric and static spacetime, we can apply the orthonormal derivative operators ∂_{Θ} and sin Θ∂_{Ψ} to the representation (81) of the past light cone. Along each ray, this gives us two Jacobi fields Y_{1} and Y_{2} which span an infinitesimally thin bundle with vertex at the observer. Y_{1} points in the radial direction and Y_{2} points in the tangential direction (see Figure 7). The radial and the tangential direction are orthogonal to each other and, by symmetry, parallel-transported along each ray. Thus, in contrast to the general situation of Figure 3, Y_{1} and Y_{2} are related to a Sachs basis (E_{1}, E_{2}) simply by Y_{1} = D_{+}E_{1} and Y_{2} = D_{−}E_{2}. The coefficients D_{+} and D_{−} are the extremal angular diameter distances of Section 2.4 with respect to a static observer (because the (Ψ, Θ)-grid refers to a static observer). In the case at hand, they are called the radial and tangential angular diameter distances. They can be calculated by normalizing Y_{1} and Y_{2},
These formulas were first derived for the special case of the Schwarzschild metric by Dwivedi and Kantowski [84] and then for arbitrary spherically symmetric static spacetimes by Dyer [85]. (In [85], Equation (92) is erroneously given only for the case that, in our notation, e^{f(r)}R(r) = r.) From these formulas we immediately get the area distance \({D_{{\rm{area}}}} = \sqrt {\left\vert {{D_ +}{D_ -}} \right\vert}\) for a static observer and, with the help of the redshift z, the luminosity distance D_{lum} = (1 + z)^{2}D_{area} (recall Section 2.4). In this way, Equation (91) and Equation (92) allow calculating the brightness of images according to the formulas of Section 2.6. Similarly, Equation (91) and Equation (92) allow calculating image distortion in terms of the ellipticity ε (recall Section 2.5). In general, ε is a complex quantity, defined by Equation (49). In the case at hand, it reduces to the real quantity ε = D_{−}/D_{+} − D_{+}/D_{−}. The expansion θ and the shear σ of the bundles under consideration can be calculated from Kantowski’s formula [173, 84],
to which Equation (27) reduces in the case at hand. The dot (= derivative with respect to the affine parameter s) is related to the derivative with respect to ℓ by Equation (83). Evaluating Equations (91, 92) in connection with the exact lens map leads to quite convenient formulas, for static light sources at r = r_{S}. Setting r(ℓ, Θ) = r_{S} and φ(ℓ, Θ) = Φ(Θ) and comparing with Equation (87) yields (cf. [271])
These formulas immediately give image distortion and the brightness of images if the map Θ ↦ Φ(Θ) is known.
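Once D_{+} and D_{−} are known, the remaining quantities follow by simple arithmetic. The sketch below collects the relations quoted in the text, D_{area} = (|D_{+}D_{−}|)^{1/2}, D_{lum} = (1 + z)^{2}D_{area}, and ε = D_{−}/D_{+} − D_{+}/D_{−}. The function names and the sample numbers are hypothetical.

```python
import math

def area_distance(d_plus, d_minus):
    """Area distance from the radial and tangential angular diameter distances."""
    return math.sqrt(abs(d_plus * d_minus))

def luminosity_distance(d_plus, d_minus, z):
    """Luminosity distance D_lum = (1 + z)^2 D_area."""
    return (1.0 + z) ** 2 * area_distance(d_plus, d_minus)

def ellipticity(d_plus, d_minus):
    """Real ellipticity D_-/D_+ - D_+/D_-; undefined at caustic points."""
    return d_minus / d_plus - d_plus / d_minus

# Hypothetical sample values (arbitrary length units).
d_area = area_distance(2.0, 8.0)
d_lum = luminosity_distance(2.0, 8.0, 1.0)
eps_round = ellipticity(2.0, 2.0)      # undistorted image
eps_stretched = ellipticity(2.0, 8.0)  # tangentially stretched image
```

The ellipticity vanishes when the two distances coincide, and it is left undefined where either distance is zero, i.e., at caustic points, where the image of a point source degenerates.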
4.3.5 Caustics of light cones
Quite generally, the past light cone has a caustic point exactly where at least one of the extremal angular diameter distances D_{+}, D_{−} vanishes (see Sections 2.2, 2.3, and 2.4). In the case at hand, zeros of D_{+} are called radial caustic points and zeros of D_{−} are called tangential caustic points (see Figure 8). By Equation (92), tangential caustic points occur if φ(ℓ, Θ) is a multiple of π, i.e., whenever a light ray crosses the axis of symmetry through the observer (see Figure 8). Symmetry implies that a point source is seen as a ring (“Einstein ring”) if its worldline crosses a tangential caustic point. By contrast, a point source whose worldline crosses a radial caustic point is seen infinitesimally extended in the radial direction. The set of all tangential caustic points of the past light cone is called the tangential caustic for short. In general, it has several connected components (“first, second, etc. tangential caustic”). Each connected component is a spacelike curve in spacetime which projects to (part of) the axis of symmetry through the observer. The radial caustic is a lightlike surface in spacetime except at points where it meets the axis; its projection to space is rotationally symmetric around the axis. The best known example for a tangential caustic, with infinitely many connected components, occurs in the Schwarzschild spacetime (see Figure 12). It is also instructive to visualize radial and tangential caustics in terms of instantaneous wave fronts, i.e., intersections of the light cone with hypersurfaces t = constant. Examples are shown in Figures 13, 18, and 19. By symmetry, a tangential caustic point of an instantaneous wave front can be neither a cusp nor a swallowtail. Hence, the general result of Section 2.2 implies that the tangential caustic is always unstable. The radial caustic in Figure 19 consists of cusps and is, thus, stable.
4.4 Lensing in axisymmetric stationary spacetimes
Axisymmetric stationary spacetimes are of interest in view of lensing as general-relativistic models for rotating deflectors. The best known and most important example is the Kerr metric which describes a rotating black hole (see Section 5.8). For noncollapsed rotating objects, exact solutions of Einstein’s field equation are known only for the idealized cases of infinitely long cylinders (including string models; see Section 5.10) and disks (see Section 5.9). Here we collect, as a preparation for these examples, some formulas for an unspecified axisymmetric stationary metric. The latter can be written in coordinates (y^{1}, y^{2}, ϕ, t), with capital indices A, B, … taking the values 1 and 2, as
where all metric coefficients depend on y = (y^{1}, y^{2}) only. We assume that the integral curves of ∂_{ϕ} are closed, with the usual 2π-periodicity, and that the 2-dimensional orbits spanned by ∂_{ϕ} and ∂_{t} are timelike. Then the Lorentzian signature of g implies that g_{AB}(y) is positive definite. In general, the vector field ∂_{t} need not be timelike and the hypersurfaces t = constant need not be spacelike. Our assumptions allow for transformations (ϕ, t) ↦ (ϕ + Ωt, t) with a constant Ω. If, by such a transformation, we can achieve that g_{tt} < 0 everywhere, we can use the purely spatial formalism for light rays in terms of the Fermat geometry (recall Section 4.2). Comparison of Equation (96) with Equation (61) shows that the redshift potential f, the Fermat metric \({\hat g}\), and the Fermat one-form \({\hat \phi}\) are
respectively. If it is not possible to make g_{tt} negative on the entire spacetime domain under consideration, the Fermat geometry is defined only locally and is, therefore, of limited usefulness. This is the case, e.g., for the Kerr metric where, in Boyer-Lindquist coordinates, g_{tt} is positive in the ergosphere (see Section 5.8).
Variational techniques related to Fermat’s principle in stationary spacetimes are detailed in a book by Masiello [218]. Note that, in contrast to standard terminology, Masiello’s definition of stationarity includes the assumption that the surfaces t = constant are spacelike.
For a rotating body with an equatorial plane (i.e., with reflectional symmetry), the Fermat metric of the equatorial plane can be represented by an embedding diagram, in analogy to the spherically symmetric static case (recall Figure 11). However, one should keep in mind that in the nonstatic case the lightlike geodesics do not correspond to the geodesics of \({\hat g}\) but are affected, in addition, by a sort of Coriolis force produced by \({\hat \phi}\). For a review on embedding diagrams, including several examples, see [160].
5 Examples
5.1 Schwarzschild spacetime
The (exterior) Schwarzschild metric
has the form (69) with
It is the unique spherically symmetric vacuum solution of Einstein’s field equation. At the same time, it is the most important and best understood spacetime in which lensing can be explicitly studied without approximations. Schwarzschild lensing beyond the weak-field approximation has astrophysical relevance in view of black holes and neutron stars. The increasing evidence that there is a supermassive black hole at the center of our Galaxy (see [107] for background material) is a major motivation for a detailed study of Schwarzschild lensing (and of Kerr lensing; see Section 5.8). In the following we consider the Schwarzschild metric with a constant m > 0 and we ignore the region r < 0 for which the singularity at r = 0 is naked. The Schwarzschild metric is static on the region 2m < r < ∞. (The region r < 0 for m > 0 is equivalent to the region r > 0 for m < 0. It is usually considered as unphysical but has found some recent interest in connection with lensing by wormholes; see Section 5.4.)
5.1.1 Historical notes
Shortly after the discovery of the Schwarzschild metric by Schwarzschild [302] and independently by Droste [80], basic features of its lightlike geodesics were found by Flamm [114], Hilbert [158], and Weyl [347]. Detailed studies of its timelike and lightlike geodesics were made by Hagihara [146] and Darwin [72, 73]. For a fairly complete list of the pre-1979 literature on Schwarzschild geodesics see Sharp [306]. All modern textbooks on general relativity include a section on Schwarzschild geodesics, but not all of them go beyond the weak-field approximation. For a particularly detailed exposition see Chandrasekhar [54].
5.1.2 Redshift and Fermat geometry
The redshift potential f for the Schwarzschild metric is given in Equation (101). With the help of f we can directly calculate the redshift via Equation (68) if observer and light source are static (i.e., t-lines). If the light source or the observer does not follow a t-line, a Doppler factor has to be added. Independent of the velocity of observer and light source, the redshift becomes arbitrarily large if the light source is sufficiently close to the horizon. For light source and observer freely falling, the redshift formula was discussed by Bazanski and Jaranowski [24]. If projected to 3-space, the light rays in the Schwarzschild spacetime are the geodesics of the Fermat metric which can be read from Equation (70) (cf. Frankel [116]),
The metric coefficient R(r), as given by Equation (101), has a strict minimum at r = 3m and no other extrema (see Figure 9). Hence, there is an unstable light sphere at this radius (recall Equation (71)). The existence of circular light rays at r = 3m was noted already by Hilbert [158]. The relevance of these circular light rays in view of lensing was clearly seen by Darwin [72, 73] and Atkinson [13]. They realized, in particular, that a Schwarzschild black hole produces infinitely many images of each light source, corresponding to an infinite sequence of light rays that asymptotically spiral towards a circular light ray. The circular light rays at r = 3m are also associated with other physical effects such as centrifugal force reversal and “locking” of gyroscopes. These effects have been discussed with the help of the Fermat geometry (= optical reference geometry) in various articles by Abramowicz and collaborators (see, e.g., [5, 4, 6, 2]).
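The location of the light sphere can be checked numerically. The following is a minimal sketch, assuming that the Fermat metric coefficient has the standard Schwarzschild form R(r) = r / sqrt(1 − 2m/r) (Equation (101) is not reproduced in this excerpt, so this form is an assumption). The helper names `fermat_R` and `argmin_golden` are purely illustrative.

```python
import math

def fermat_R(r, m=1.0):
    """Assumed Fermat metric coefficient R(r) = r / sqrt(1 - 2m/r), for r > 2m."""
    return r / math.sqrt(1.0 - 2.0 * m / r)

def argmin_golden(func, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)   # left interior point
        d = a + inv_phi * (b - a)   # right interior point
        if func(c) < func(d):
            b = d                   # minimum lies in [a, d]
        else:
            a = c                   # minimum lies in [c, b]
    return 0.5 * (a + b)

m = 1.0
r_light_sphere = argmin_golden(lambda r: fermat_R(r, m), 2.0 * m + 1e-6, 50.0 * m)
print(r_light_sphere / m)         # close to 3
print(fermat_R(3.0 * m, m) / m)   # close to 3 * sqrt(3)
```

The minimum value R(3m) = 3√3 m is the critical impact parameter that separates rays escaping to infinity from rays crossing the light sphere.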
5.1.3 Index of refraction and embedding diagrams
We know from Section 4.3 that light rays in any spherically symmetric and static spacetime can be characterized by an index of refraction. This requires introducing an isotropic radius coordinate \({\tilde r}\) via Equation (72). In the Schwarzschild case, \({\tilde r}\) is related to the Schwarzschild radius coordinate r by
\({\tilde r}\) ranges from m/2 to infinity if r ranges from 2m to infinity. In terms of the isotropic coordinate, the Fermat metric (102) takes the form
with
Hence, light propagation in the Schwarzschild metric can be mimicked by the index of refraction (105); see Figure 10. The index of refraction (105) has been known since Weyl [349]. It was employed for calculating lightlike Schwarzschild geodesics, exactly or approximately, e.g., in [13, 231, 106, 204]. This index of refraction can be modeled by a fluid flow [288]. The embeddability condition (75) is satisfied for r > 2.25m (which coincides with the so-called Buchdahl limit). On this domain the Fermat geometry, if restricted to the equatorial plane ϑ = π/2, can be represented as a surface of revolution in Euclidean 3-space (see Figure 11). The entire region r > 2m can be embedded into a space of constant negative curvature [3].
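The relations of this subsection can be illustrated with a short sketch. It assumes the standard isotropic-coordinate formulas r = r̃ (1 + m/(2r̃))^2 and n(r̃) = (1 + m/(2r̃))^3 / (1 − m/(2r̃)), and it assumes that the embeddability condition (75) is the usual surface-of-revolution requirement (dR/dℓ)^2 ≤ 1 with Fermat arclength ℓ. None of these displayed equations is reproduced in this excerpt, and the function names are illustrative only.

```python
import math

def r_from_isotropic(r_iso, m=1.0):
    """Schwarzschild radius coordinate r as a function of the isotropic radius."""
    return r_iso * (1.0 + m / (2.0 * r_iso)) ** 2

def refraction_index(r_iso, m=1.0):
    """Assumed index of refraction n = (1 + x)^3 / (1 - x), x = m / (2 r_iso)."""
    x = m / (2.0 * r_iso)
    return (1.0 + x) ** 3 / (1.0 - x)

def embeddable(r, m=1.0):
    """Assumed embeddability test (dR/dl)^2 <= 1 for R(r) = r / sqrt(1 - 2m/r).

    With dl = dr / (1 - 2m/r) one finds (dR/dl)^2 = (r - 3m)^2 / (r (r - 2m)),
    so the condition reduces to r >= 9m/4.
    """
    return (r - 3.0 * m) ** 2 <= r * (r - 2.0 * m)

m = 1.0
print(r_from_isotropic(0.5 * m, m))              # horizon: r = 2m at r_iso = m/2
print(refraction_index(1.0e9 * m, m))            # tends to 1 far from the center
print(embeddable(2.3 * m), embeddable(2.2 * m))  # either side of r = 2.25m
```

As a consistency check, n(r̃) r̃ reproduces the Fermat coefficient R(r) in Schwarzschild coordinates, which is why the two descriptions give the same light rays.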
5.1.4 Lensing by a Schwarzschild black hole
To get a Schwarzschild black hole, one joins at r = 2m the static Schwarzschild region 2m < r < ∞ to the nonstatic Schwarzschild region 0 < r < 2m in such a way that ingoing light rays can cross this surface but outgoing ones cannot. If the observation event p_{O} is at r_{O} > 2m, only the region r > 2m is of relevance for lensing, because the past light cone of such an event does not intersect the black-hole horizon at r = 2m. (For a Schwarzschild white hole see below.) Such a light cone is depicted in Figure 12 (cf. [183]). The picture was produced with the help of the representation (81) which requires integrating Equation (85) and Equation (86). For the Schwarzschild case, these are elliptic integrals. Their numerical evaluation is an exercise for students (see [45] for a MATHEMATICA program). Note that the evaluation of Equation (85) and Equation (86) requires knowledge of the turning points. In the Schwarzschild case, there is at most one turning point r_{m}(Θ) along each ray (see Figure 9), and it is given by the cubic equation
The representation (81) in terms of Fermat arclength ℓ (= travel time) gives us the intersections of the light cone with hypersurfaces t = constant. These “instantaneous wave fronts” are depicted in Figure 13 (cf. [147]). With the light cone explicitly known, one can analytically verify that every inextendible timelike curve in the region r > 2m intersects the light cone infinitely many times, provided it is bounded away from the horizon and from (past lightlike) infinity. This shows that the observer sees infinitely many images of a light source with this worldline. The same result can be proven with the help of Morse theory (see Section 3.3), where one has to exclude the case that the worldline meets the caustic of the light cone. In the latter case the light source is seen as an Einstein ring. For static light sources (i.e., t-lines), either all images are Einstein rings or none. For such light sources we can study lensing in the exact-lens-map formulation of Section 4.3 (see in particular Figure 6). Also, Section 4.3 provides us with formulas for distance measures, brightness, and image distortion which we just have to specialize to the Schwarzschild case. For another treatment of Schwarzschild lensing with the help of the exact lens map, see [119]. We place our static light sources at radius r_{S}. If r_{O} < r_{S} and 3m < r_{S}, only light rays with Θ < δ,
can reach the radius value r_{S} (see Figure 9). Rays with Θ = δ asymptotically spiral towards the light sphere at r = 3m. δ lies between 0 and π/2 for r_{O} < 3m and between π/2 and π for r_{O} > 3m. The escape cone defined by Equation (107) is depicted, for different values of r_{O}, in Figure 14. It gives the domain of definition for the lens map. The lens map is graphically discussed in Figure 15. The pictures are valid for r_{O} = 5m and r_{S} = 10m. Qualitatively, however, they look the same for all cases with r_{S} > r_{O} and r_{S} > 3m. From the diagram one can read the position of the infinitely many images for each light source which, for the two light sources on the axis, degenerate into infinitely many Einstein rings. For each fixed source, the images are ordered by the number i (= 1, 2, 3, …) which counts how often the ray has met the axis. This coincides with ordering according to travel time. With increasing order i, the images come closer and closer to the rim at Θ = δ (see Figure 15). They are alternately upright and side-inverted (see Figure 16), and their brightness rapidly decreases (see Figure 17). These basic features of Schwarzschild lensing have been known since pioneering papers by Darwin [72] and Atkinson [13] (cf. [211, 246, 201]). A detailed study of Schwarzschild lensing was carried through by Virbhadra and Ellis [337] with the help of an “almost exact lens map” (see Section 4.3). The latter assumes that observer and light source are in the asymptotic region and almost aligned, but the light rays are allowed to make arbitrarily many turns around the black hole. Various methods of how multiple imaging by a black hole could be discovered, directly or indirectly, have been discussed [211, 201, 15, 14, 274, 76]. Related work has also been done for Kerr black holes (see Section 5.8). An interesting suggestion was made in [162]. 
A Schwarzschild black hole, somewhere in the universe, would send photons originating from our Sun back to the vicinity of our Sun (“boomerang photons” [316]). If the black hole is sufficiently close to our Solar system, this would produce images of our own Sun on the sky that could be detectable. The lensing effect of a Schwarzschild black hole has been visualized in two ways:

1. by showing the visual appearance of some background pattern as distorted by the black hole [66, 296, 233] (only primary images, i = 1, are considered), and
2. by showing the visual appearance of an accretion disk around the black hole [211, 130, 15, 14] (higher-order images are taken into account).
Numerous ray tracing programs have been developed for the Schwarzschild metric and, more generally, for the Kerr metric (see Section 5.8).
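The escape cone and the turning points that enter the light-cone construction of this subsection can be sketched in a few lines. The sketch assumes the standard Schwarzschild expressions sin^2 δ = 27 m^2 (r_O − 2m) / r_O^3 for the escape cone and r_m^3 (r_O − 2m) = r_O^3 (r_m − 2m) sin^2 Θ for the turning point. Equations (106) and (107) are not reproduced in this excerpt, so both forms are assumptions. Θ is taken to be measured from the outward radial direction, as suggested by the statement that rays with Θ < δ reach larger radii. Function names are illustrative.

```python
import math

def escape_cone_angle(r_obs, m=1.0):
    """Assumed escape-cone angle delta for a static observer at r_obs > 2m."""
    s = math.sqrt(27.0 * m * m * (r_obs - 2.0 * m) / r_obs ** 3)
    s = min(s, 1.0)                  # guard against rounding at r_obs = 3m
    delta = math.asin(s)
    # Branch choice: delta < pi/2 for r_obs < 3m, delta > pi/2 for r_obs > 3m.
    return delta if r_obs <= 3.0 * m else math.pi - delta

def turning_point(theta, r_obs, m=1.0):
    """Turning-point radius of a ray with initial angle theta, or None.

    A turning point exists only for inward-pointing rays inside the escape
    cone, pi/2 < theta < delta, which requires r_obs > 3m.
    """
    delta = escape_cone_angle(r_obs, m)
    if not (math.pi / 2.0 < theta < delta):
        return None
    # Conserved along the ray: R(r)^2 sin^2(angle), with R(r)^2 = r^3 / (r - 2m).
    b2 = r_obs ** 3 / (r_obs - 2.0 * m) * math.sin(theta) ** 2
    lo, hi = 3.0 * m, r_obs          # cubic is negative at lo, positive at hi
    for _ in range(200):             # plain bisection on the cubic
        mid = 0.5 * (lo + hi)
        if mid ** 3 - b2 * (mid - 2.0 * m) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(escape_cone_angle(5.0))        # observer position used in Figure 15
print(turning_point(2.0, 5.0))       # lies between 3m and r_obs
```

Bisection is used instead of a closed-form cubic solution because the relevant root is bracketed by the light sphere and the observer, which makes the choice of root unambiguous.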
5.1.5 Lensing by a nontransparent Schwarzschild star
To model a nontransparent star of radius r_{*} one has to restrict the exterior Schwarzschild metric to the region r > r_{*}. Lightlike geodesics terminate when they arrive at r = r_{*}. The star’s radius cannot be smaller than 2m unless it is allowed to be time-dependent. The qualitative features of lensing depend on whether r_{*} is bigger than 3m. Stars with 2m < r_{*} ≤ 3m are called ultracompact [166]. Their existence is speculative. The lensing properties of an ultracompact star are the same as those of a Schwarzschild black hole of the same mass, for observer and light source in the region r > r_{*}. In particular, the apparent angular radius δ on the observer’s sky of an ultracompact star is given by the escape cone of Figure 14. Also, an ultracompact star produces the same infinite sequence of images of each light source as a black hole. For r_{*} > 3m, only finitely many of the images survive because the other lightlike geodesics are blocked. A nontransparent star has a finite focal length r_{f} > 2m in the sense that parallel light from infinity is focused along a line that extends from radius value r_{f} to infinity. r_{f} depends on m and on r_{*}. For the values of our Sun one finds r_{f} = 550 au (1 au = 1 astronomical unit = average distance from the Earth to the Sun). An observer at r ≥ r_{f} can observe strong lensing effects of the Sun on distant light sources. The idea of sending a spacecraft to r ≥ r_{f} was occasionally discussed in the literature [103, 234, 326]. The lensing properties of a nontransparent Schwarzschild star have been illustrated by showing the appearance of the star’s surface to a distant observer. For r_{*} bigger than but of the same order of magnitude as 3m, this has relevance for neutron stars (see [352, 256, 129, 287, 221]). r_{*} may be chosen time-dependent, e.g., to model a nontransparent collapsing star. 
A star starting with r_{*} > 2m cannot reach r = 2m in finite Schwarzschild coordinate time t (though in finite proper time of an observer at the star’s surface), i.e., for a collapsing star one has r_{*}(t) → 2m for t → ∞. To a distant observer, the total luminosity of a freely (geodesically) collapsing star is attenuated exponentially, \(L(t) \propto \exp \left(-t\,(3\sqrt{3}\,m)^{-1}\right)\). This formula was first derived by Podurets [279] with an incorrect factor 2 under the exponent and corrected by Ames and Thorne [8]. Both papers are based on kinetic photon theory (Liouville’s equation). An alternative derivation of the luminosity formula, based on the optical scalars, was given by Dwivedi and Kantowski [84]. Ames and Thorne also calculated the spectral distribution of the radiation as a function of time and position on the apparent disk of the star. All these analyses considered radiation emitted at an angle < π/2 against the normal of the star as measured by a static (Killing) observer. Actually, one has to refer not to a static observer but to an observer comoving with the star’s surface. This modification was worked out by Lake and Roeder [198].
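To get a feeling for the time scale of this attenuation, the following sketch converts the e-folding time 3√3 m into seconds. It assumes that the luminosity law reads L(t) ∝ exp(−t / (3√3 m)) in geometrized units, so that m stands for GM/c^3 when t is measured in seconds. The constants and the function name are illustrative choices, not taken from the text.

```python
import math

GM_SUN = 1.32712440018e20   # solar gravitational parameter in m^3 / s^2
C = 299792458.0             # speed of light in m / s

def efolding_time_seconds(mass_in_solar_masses):
    """E-folding time 3 sqrt(3) G M / c^3 of the assumed luminosity decay."""
    m_seconds = mass_in_solar_masses * GM_SUN / C ** 3
    return 3.0 * math.sqrt(3.0) * m_seconds

print(efolding_time_seconds(1.0))    # roughly 2.6e-5 s for one solar mass
print(efolding_time_seconds(10.0))   # scales linearly with the mass
```

The e-folding time is proportional to the mass, so even a star of many solar masses fades from view within a small fraction of a second once the exponential regime is reached.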
5.1.6 Lensing by a transparent Schwarzschild star
To model a transparent star of radius r_{*} one has to join the exterior Schwarzschild metric at r = r_{*} to an interior (e.g., perfect fluid) metric. Lightlike geodesics of the exterior Schwarzschild metric are to be joined to lightlike geodesics of the interior metric when they arrive at r = r_{*}. The radius r_{*} of the star can be time-independent only if r_{*} > 2m. For 2m < r_{*} ≤ 3m (ultracompact star), the lensing properties for observer and light source in the region r > r_{*} differ from the black hole case only by the possible occurrence of additional images, corresponding to light rays that pass through the star. Inside such a transparent ultracompact star, there is at least one stable photon sphere, in addition to the unstable one at r = 3m outside the star (cf. [153]). In principle, there may be arbitrarily many photon spheres [177]. For r_{*} > 3m, the lensing properties depend on whether there are light rays trapped inside the star. For a perfect fluid with constant density, this is not the case; the resulting spacetime is then asymptotically simple, i.e., all inextendible light rays come from infinity and go to infinity. General results (see Section 3.4) imply that then the number of images must be finite and odd. The light cone in this exterior-plus-interior Schwarzschild spacetime is discussed in detail by Kling and Newman [183]. (In this paper the authors constantly refer to their interior metric as a “dust” where obviously a perfect fluid with constant density is meant.) Effects on light rays issuing from the star’s interior were discussed earlier by Lawrence [203]. The “escape cones”, which are shown in Figure 14 for the exterior Schwarzschild metric, have been calculated by Jaffe [167] for points inside the star. The focal length of a transparent star with constant density is smaller than that of a nontransparent star of the same mass and radius. 
For the mass and the radius of our Sun, one finds 30 au for the transparent case, in contrast to the above-mentioned 550 au for the nontransparent case [234]. Radiation from a spherically symmetric homogeneous dust star that collapses to a black hole is calculated in [305], using kinetic theory. An inhomogeneous spherically symmetric dust configuration may form a naked singularity. The redshift of light rays that travel from such a naked singularity to a distant observer is discussed in [83].
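The order of magnitude of the 550 au quoted for the nontransparent Sun can be recovered from a rough estimate. The sketch below assumes the weak-field bending angle 4m/b for a ray with impact parameter b, so that rays grazing the surface (b ≈ r_{*}) cross the axis at a distance of about r_{*}^2 / (4m). This is a quasi-Newtonian shortcut that is not part of the formalism of this review, and it is used here only as a plausibility check. The numerical constants are nominal values chosen for illustration, and the 30 au of the transparent case is not reproduced because it depends on the interior model.

```python
GM_SUN = 1.32712440018e20   # solar gravitational parameter in m^3 / s^2
C = 299792458.0             # speed of light in m / s
R_SUN = 6.957e8             # nominal solar radius in m
AU = 1.495978707e11         # astronomical unit in m

def grazing_focal_distance_au(gm=GM_SUN, r_star=R_SUN):
    """Axis-crossing distance r_star^2 / (4 m) of grazing rays, in au.

    Weak-field estimate with m = G M / c^2; an assumption of this sketch.
    """
    m_length = gm / C ** 2
    return r_star ** 2 / (4.0 * m_length) / AU

print(grazing_focal_distance_au())   # a little below 550 au
```

Rays with larger impact parameter are bent less and meet the axis farther out, which is why the focus is a half-line starting at r_{f} rather than a point.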
5.1.7 Lensing by a Schwarzschild white hole
To get a Schwarzschild white hole one joins at r = 2m the static Schwarzschild region 2m < r < ∞ to the nonstatic Schwarzschild region 0 < r < 2m in such a way that outgoing light rays can cross this surface but ingoing ones cannot. The appearance of light sources in the region r < 2m to an observer in the region r > 2m is discussed in [112, 232, 81, 196, 197].
5.2 Kottler spacetime
The Kottler metric
is the unique spherically symmetric solution of Einstein’s vacuum field equation with a cosmological constant Λ. It has the form (69) with
It is also known as the Schwarzschild-de Sitter metric for Λ > 0 and as the Schwarzschild-anti-de Sitter metric for Λ < 0. The Kottler metric was found independently by Kottler [186] and by Weyl [348].
In the following we consider the Kottler metric with a constant m > 0 and we ignore the region r < 0 for which the singularity at r = 0 is naked, for any value of Λ. For Λ < 0, there is one horizon at a radius r_{H} with 0 < r_{H} < 2m; the staticity condition e^{f(r)} > 0 is satisfied on the region r_{H} < r < ∞. For 0 < Λ < (3m)^{−2}, there are two horizons at radii r_{H1} and r_{H2} with 2m < r_{H1} < 3m < r_{H2}; the staticity condition e^{f(r)} > 0 is satisfied on the region r_{H1} < r < r_{H2}. For Λ > (3m)^{−2} there is no horizon and no static region. At the horizon(s), the Kottler metric can be analytically extended into nonstatic regions. For Λ < 0, the resulting global structure is similar to the Schwarzschild case. For 0 < Λ < (3m)^{−2}, the resulting global structure is more complex (see [195]). The extreme case Λ = (3m)^{−2} is discussed in [278].
For any value of Λ, the Kottler metric has a light sphere at r = 3m. Escape cones and embedding diagrams for the Fermat geometry (optical geometry) can be found in [314, 160] (cf. Figures 14 and 11 for the Schwarzschild case). Similarly to the Schwarzschild spacetime, the Kottler spacetime can be joined to an interior perfect-fluid metric with constant density. Embedding diagrams for the Fermat geometry (optical geometry) of the exterior-plus-interior spacetime can be found in [315]. The dependence on Λ of the light bending is discussed in [194]. For the optical appearance of a Kottler white hole see [196]. The shape of infinitesimally thin light bundles in the Kottler spacetime is determined in [85].
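The horizon structure and the Λ-independence of the light sphere can be verified numerically. The sketch below assumes the standard Kottler metric function 1 − 2m/r − Λr^2/3, whose zeros are the horizons, and the corresponding Fermat coefficient R(r)^2 = r^2 / (1 − 2m/r − Λr^2/3). The displayed equations of this section are not reproduced in this excerpt, so both forms are assumptions, and all names are illustrative.

```python
import math

def kottler_function(r, m, lam):
    """Assumed metric function 1 - 2m/r - Lambda r^2 / 3."""
    return 1.0 - 2.0 * m / r - lam * r * r / 3.0

def bisect(func, lo, hi, steps=200):
    """Bisection for a root of func, assuming a sign change on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def kottler_horizons(m, lam):
    """Return (r_H1, r_H2) for 0 < Lambda < (3m)^-2, otherwise None."""
    if not (0.0 < lam < 1.0 / (9.0 * m * m)):
        return None
    f = lambda r: kottler_function(r, m, lam)
    # f < 0 at r = 2m, f > 0 at r = 3m, f < 0 at r = sqrt(3 / Lambda).
    return bisect(f, 2.0 * m, 3.0 * m), bisect(f, 3.0 * m, math.sqrt(3.0 / lam))

def fermat_R_squared(r, m, lam):
    """Assumed Fermat coefficient R^2 = r^2 / (1 - 2m/r - Lambda r^2 / 3)."""
    return r * r / kottler_function(r, m, lam)

print(kottler_horizons(1.0, 0.05))
```

Since 1/R^2 = 1/r^2 − 2m/r^3 − Λ/3, the cosmological constant enters only as an additive constant, so the extremum of R stays at r = 3m for every Λ that leaves a static region.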
5.3 Reissner-Nordström spacetime
The Reissner-Nordström metric
is the unique spherically symmetric and asymptotically flat solution of the Einstein-Maxwell equations. It has the form (69) with
It describes the field around an isolated spherical object with mass m and charge e. The Reissner-Nordström metric was found independently by Reissner [286], Weyl [347], and Nordström [241]. A fairly complete list of the pre-1979 literature on Reissner-Nordström geodesics can be found in [306]. A detailed account of Reissner-Nordström geodesics is given in [54]. (The Reissner-Nordström spacetime can be modified by introducing a cosmological constant. This generalized Reissner-Nordström spacetime, whose global structure is investigated in [202], will not be considered here.)
We assume m > 0 and ignore the region r < 0 for which the singularity at r = 0 is naked, for any value of e. Two cases are to be distinguished:

1. 0 ≤ e^{2} ≤ m^{2}; in this case the staticity condition e^{f(r)} > 0 is satisfied on the regions \(0 < r < m - \sqrt{m^2 - e^2}\) and \(m + \sqrt{m^2 - e^2} < r < \infty\), i.e., there are two horizons.
2. m^{2} < e^{2}; then the staticity condition e^{f(r)} > 0 is satisfied on the entire region 0 < r < ∞, i.e., there is no horizon and the singularity at r = 0 is naked.
As the net charge of all known celestial bodies is close to zero, the naked-singularity case 2 is usually thought to be of little astrophysical relevance.
By switching to isotropic coordinates, one can describe light propagation in the Reissner-Nordström metric by an index of refraction (see, e.g., [105]). The resulting Fermat geometry (optical geometry) is discussed, in terms of embedding diagrams for the black-hole case and for the naked-singularity case, in [191, 3] (cf. [160]). The visual appearance of a background, as distorted by a Reissner-Nordström black hole, is calculated in [222]. Lensing by a charged neutron star, whose exterior is modeled by the Reissner-Nordström metric, is the subject of [68, 69]. The lensing properties of a Reissner-Nordström black hole are qualitatively (though not quantitatively) the same as those of a Schwarzschild black hole. The reason is the following. For a Reissner-Nordström black hole, the metric coefficient R(r) has one local minimum and no other extremum between horizon and infinity, just as in the Schwarzschild case (recall Figure 9). The minimum of R(r) indicates an unstable light sphere towards which light rays can spiral asymptotically. The existence of this minimum, and of no other extremum, was responsible for all qualitative features of Schwarzschild lensing. Correspondingly, Figures 15, 16, and 17 also qualitatively illustrate lensing by a Reissner-Nordström black hole. In particular, there is an infinite sequence of images for each light source, corresponding to an infinite sequence of light rays whose limit curve asymptotically spirals towards the light sphere. One can consider the “strong-field limit” [39, 37] of lensing for a Reissner-Nordström black hole, in analogy to the Schwarzschild case which is indicated by the asymptotic straight line in the middle graph of Figure 15. Bozza [37] investigates whether quantitative features of the “strong-field limit”, e.g., the slope of the asymptotic straight line, can be used to distinguish between different black holes. 
For the Reissner-Nordström black hole, image positions and magnifications have been calculated in [96], and travel times have been calculated in [290]. In both cases, the authors use the “almost exact lens map” of Virbhadra and Ellis [337] (recall Section 4.3) and analytical methods of Bozza et al. [39, 37, 40].
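The horizons of case 1 and the position of the unstable light sphere can be written down in closed form. The sketch below assumes the standard Reissner-Nordström metric function 1 − 2m/r + e^2/r^2 in geometrized units, so that the extrema of the Fermat coefficient R(r)^2 = r^2 / (1 − 2m/r + e^2/r^2) solve r^2 − 3mr + 2e^2 = 0. The displayed equations of this section are not reproduced in this excerpt, so these forms are assumptions, and the function names are illustrative.

```python
import math

def rn_horizons(m, e):
    """Horizon radii m -/+ sqrt(m^2 - e^2), or None in the naked-singularity case."""
    disc = m * m - e * e
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return m - root, m + root

def rn_light_sphere(m, e):
    """Outer root of r^2 - 3 m r + 2 e^2 = 0 for the black-hole case e^2 <= m^2.

    This is the minimum of R(r) between the outer horizon and infinity.
    """
    if e * e > m * m:
        return None
    return 0.5 * (3.0 * m + math.sqrt(9.0 * m * m - 8.0 * e * e))

for charge in (0.0, 0.5, 1.0):
    print(charge, rn_horizons(1.0, charge), rn_light_sphere(1.0, charge))
```

For e = 0 the Schwarzschild values r = 2m and r = 3m are recovered, and in the extreme case e = m the light sphere sits at r = 2m outside the degenerate horizon at r = m.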
5.4 Morris-Thorne wormholes
We consider a spacetime whose metric is of the form (69) with e^{f(r)}S(r) = 1, i.e.,
where r ranges from −∞ to ∞. We call such a spacetime a Morris-Thorne wormhole (see [227]) if
such that the metric (112) is asymptotically flat for r → −∞ and for r → ∞.
A particular example of a Morris-Thorne wormhole is the Ellis wormhole [102] where
with a constant a. The Ellis wormhole has an unstable light sphere at r = 0, i.e., at the “neck” of the wormhole. It is easy to see that every Morris-Thorne wormhole must have an unstable light sphere at some radius between r = −∞ and r = ∞. This has the consequence [153] that every Morris-Thorne wormhole produces an infinite sequence of images of an appropriately placed light source. This infinite sequence corresponds to infinitely many light rays whose limit curve asymptotically spirals towards the unstable light sphere.
Lensing by the Ellis wormhole was discussed in [55]; in this paper the authors identified the region r > 0 with the region r < 0 and they developed a scattering formalism, assuming that observer and light source are in the asymptotic region. Lensing by the Ellis wormhole was also discussed in [271] in terms of the exact lens map. The resulting features are qualitatively very similar to the Schwarzschild case, with the radius values r = −∞, r = 0, r = ∞ in the wormhole case corresponding to the radius values r = 2m, r = 3m, r = ∞ in the Schwarzschild case. With this correspondence, Figures 15, 16, and 17 qualitatively illustrate lensing by the Ellis wormhole. More generally, the same qualitative features occur whenever the metric function R(r) has one minimum and no other extrema, as in Figure 9.
If observer and light source are on the same side of the wormhole’s neck, and if only light rays in the asymptotic region are considered, lensing by a wormhole can be studied in terms of the quasi-Newtonian approximation formalism [182]. However, as wormholes are typically associated with negative energy densities [227, 228], the usual assumption of the quasi-Newtonian approximation formalism that the mass density is positive cannot be maintained. This observation has raised some interest in lensing by negative masses, in particular in the question of whether negative masses can be detected by their (“microlensing”) effect on the energy flux from sources passing behind them. So far, related calculations [64, 293] have been done only in the quasi-Newtonian approximation formalism.
5.5 Barriola-Vilenkin monopole
The Barriola-Vilenkin monopole [21] is given by the metric
with a constant k < 1. There is a deficit solid angle and a singularity at r = 0; the plane t = constant, ϑ = π/2 has the geometry of a cone. (Similarly, for k > 1 one gets a surplus solid angle.) The Einstein tensor has nonvanishing components G_{tt} = − G_{rr} = (1 − k^{2})/r^{2}.
The metric (115) was briefly mentioned as an example for a conical singularity by Sokolov and Starobinsky [308]. Barriola and Vilenkin [21] realized that this metric can be used as a model for monopoles that might exist in the universe, resulting from breaking a global \({\mathcal O}(3)\) symmetry. They also discussed the question of whether such monopoles could be detected by their lensing properties which were characterized on the basis of some approximative assumptions (cf. [82]). However, such approximative assumptions are actually not necessary. The metric (115) has the nice property that the geodesics can be written explicitly in terms of elementary functions. This allows one to write down explicit expressions for image positions and observables such as angular diameter distances, luminosity distances, image distortion, etc. (see [271]). Note that because of the deficit angle the metric (115) is not asymptotically flat in the usual sense. (It is “quasi-asymptotically flat” in the sense of [243].) For this reason, the “almost exact lens map” of Virbhadra and Ellis [337] (see Section 4.3) is not applicable to this case, at least not without modification.
The metric (115) is closely related to the metric of a static string (see metric (133) with a = 0). Restricting metric (115) to the hyperplane ϑ = π/2 and restricting metric (133) with a = 0 to the hyperplane z = constant gives the same (2 + 1)-dimensional metric. Thus, studying light rays in the equatorial plane of a Barriola-Vilenkin monopole is the same as studying light rays in a plane perpendicular to a static string. Hence, the multiple imaging properties of a Barriola-Vilenkin monopole can be deduced from the detailed discussion of the string example in Section 5.10. In particular, Figures 24 and 25 show the light cone of a nontransparent and of a transparent monopole if we interpret the missing spatial dimension as circular rather than linear. This makes an important difference. While in the string case the cone of Figure 24 has a 2-dimensional set of transverse self-intersection points, the corresponding cone for the monopole has a 1-dimensional radial caustic. The difference is difficult to visualize in spacetime pictures; it is therefore advisable to use a purely spatial visualization in terms of instantaneous wave fronts (intersections of the light cone with hypersurfaces t = constant) (compare Figures 18 and 19 with Figures 2 and 28).
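The statement that the geodesics are elementary functions can be made concrete for equatorial light rays. The sketch below assumes that the spatial part of metric (115) in the plane ϑ = π/2 is the cone metric dr^2 + k^2 r^2 dϕ^2, which matches the description of this plane as a cone but is not reproduced as a displayed equation in this excerpt. Under this assumption the rescaled angle kϕ unrolls the cone into a flat sector, where light rays are straight lines. The function names are illustrative.

```python
import math

def ray_radius(phi, r_min, k, phi0=0.0):
    """Radius along an equatorial ray, r(phi) = r_min / cos(k (phi - phi0)).

    Valid for |phi - phi0| < pi / (2 k); r_min is the radius of closest approach.
    """
    return r_min / math.cos(k * (phi - phi0))

def total_sweep(k):
    """Azimuthal angle swept by a ray travelling from infinity to infinity."""
    return math.pi / k

def bending_angle(k):
    """Deflection relative to a straight line in flat space (zero for k = 1)."""
    return total_sweep(k) - math.pi

print(bending_angle(0.99))        # small deficit angle, small deflection
print(ray_radius(0.0, 2.0, 0.9))  # closest approach at phi = phi0
```

In this sketch the bending angle π(1 − k)/k does not depend on the impact parameter, in contrast to the Schwarzschild case, which reflects the fact that the cone is flat away from its apex.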
5.6 Janis-Newman-Winicour spacetime
The Janis-Newman-Winicour metric [168] can be brought into the form [336]
where m and γ are constants. It is the most general spherically symmetric static and asymptotically flat solution of Einstein’s field equation coupled to a massless scalar field. For γ = 1 it reduces to the Schwarzschild solution; in this case the scalar field vanishes. For m > 0 and γ ≠ 1, there is a naked curvature singularity at r = 2m/γ. Lensing in this spacetime was studied in [339, 338]. The main motivation was to find out whether the lensing characteristics of such a naked singularity can be distinguished from lensing by a Schwarzschild black hole. The result is that the qualitative features of lensing remain similar to the Schwarzschild case as long as 1/2 < γ < 1. However, if γ drops below 1/2, they become quite different. The reason is easily understood if we write Equation (116) in the form (69). The metric coefficient
has a minimum between the singularity and infinity as long as \({1 \over 2} < \gamma < 1\) (see Figure 20). This minimum indicates an unstable light sphere (recall Equation (71)), as in the Schwarzschild case at r = 3m. All qualitative features of lensing carry over from the Schwarzschild case, i.e., Figures 15, 16, and 17 remain qualitatively unchanged. Clearly, the precise shape of the graph of Φ in Figure 15 changes if γ is changed. The question of how the logarithmic asymptote (“strong-field limit”) depends on γ is discussed in [37]. If γ drops below 1/2, R(r) no longer has an extremum, i.e., there is no light sphere. This implies that only finitely many images are possible [153]. In [338] naked singularities of spherically symmetric spacetimes are called weakly naked if they are surrounded by a light sphere (cf. [59]). In a nutshell, weakly naked singularities show the same qualitative lensing features as black holes. A generalization of this result to spacetimes without spherical symmetry has not been worked out.
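The role of the threshold γ = 1/2 can be made explicit. The sketch below assumes that the Fermat coefficient has the form R(r)^2 = r^2 (1 − 2m/(γr))^{1−2γ}, which follows from the commonly used form of the Janis-Newman-Winicour metric; Equations (116) and (117) are not reproduced in this excerpt, so this form is an assumption. Setting the derivative of R^2 to zero then gives r = m(1 + 2γ)/γ, which lies outside the singularity at r = 2m/γ exactly if γ > 1/2. The function names are illustrative.

```python
def jnw_singularity_radius(m, gamma):
    """Radius 2m / gamma of the naked curvature singularity (gamma != 1)."""
    return 2.0 * m / gamma

def jnw_R_squared(r, m, gamma):
    """Assumed Fermat coefficient R^2 = r^2 (1 - 2m / (gamma r))^(1 - 2 gamma)."""
    return r * r * (1.0 - 2.0 * m / (gamma * r)) ** (1.0 - 2.0 * gamma)

def jnw_light_sphere(m, gamma):
    """Light-sphere radius m (1 + 2 gamma) / gamma, or None for gamma <= 1/2."""
    if not (0.5 < gamma <= 1.0):
        return None
    return m * (1.0 + 2.0 * gamma) / gamma

for gamma in (1.0, 0.75, 0.4):
    print(gamma, jnw_singularity_radius(1.0, gamma), jnw_light_sphere(1.0, gamma))
```

For γ = 1 the Schwarzschild value r = 3m is recovered, and as γ approaches 1/2 from above the light sphere moves onto the singularity and disappears.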
5.7 Boson and fermion stars
Spherically symmetric static solutions of Einstein’s field equation coupled to a scalar field may be interpreted as (uncharged, nonrotating) boson stars if they are free of singularities. Because of the latter condition, the Wyman-Newman-Janis metric (see Section 5.6) does not describe a boson star. The theoretical concept of boson stars goes back to [179, 291]. The analogous idea of a fermion star, with the scalar field replaced by a spin 1/2 (neutrino) field, is even older [216]. To date there is no observational evidence for the existence of either a boson or a fermion star. However, they are considered, e.g., as hypothetical candidates for supermassive objects at the center of galaxies (see [301, 324] for the boson and [335, 325] for the fermion case). For the supermassive object at the center of our own galaxy, evidence points towards a black hole, but the possibility that it is a boson or fermion star cannot be completely excluded so far.
Exact solutions that describe boson or fermion stars have been found only numerically (in 3 + 1 dimensions). For this reason there is no boson star model for which the lightlike geodesics could be studied analytically. Numerical studies of lensing have been carried out by Dąbrowski and Schunck [70] for a transparent spherically symmetric static maximal boson star, and by Bilić, Nikolić, and Viollier [30] for a transparent spherically symmetric static maximal fermion star. For the case of a fermion-fermion star (two components) see [171]. In all three articles the authors use the “almost exact lens map” of Virbhadra and Ellis (see Section 4.3) which is valid for observer and light source in the asymptotic region and almost aligned. Dąbrowski and Schunck [70] also discuss how the alignment assumption can be dropped. The lensing features found in [70] for the boson star and in [30] for the fermion star have several similarities. In both cases, there is a tangential caustic and a radial caustic (recall Figure 8 for terminology). A (point) source on the tangential caustic (i.e., on the axis of symmetry through the observer) is seen as a (1-dimensional) Einstein ring plus a (point) image in the center. If the (point) source is moved away from the axis, the Einstein ring breaks into two (point) images, so there are three images altogether. Two of them merge and vanish if the radial caustic is crossed. So the qualitative lensing features are quite different from those of a Schwarzschild black hole with (theoretically) infinitely many images (see Section 5.1). The essential difference is that in the case of a boson or fermion star there are no circular lightlike geodesics towards which light rays could asymptotically spiral.
5.8 Kerr spacetime
Next to the Schwarzschild spacetime, the Kerr spacetime is the physically most relevant example of a spacetime in which lensing can be studied explicitly in terms of the lightlike geodesics. The Kerr metric is given in Boyer-Lindquist coordinates (r, ϑ, ϕ, t) as
\[g = \varrho^2\left(\frac{dr^2}{\Delta} + d\vartheta^2\right) + (r^2 + a^2)\sin^2\vartheta\, d\varphi^2 - dt^2 + \frac{2mr}{\varrho^2}\left(a\sin^2\vartheta\, d\varphi - dt\right)^2, \qquad (118)\]
where ϱ and Δ are defined by
\[\varrho^2 = r^2 + a^2\cos^2\vartheta, \qquad \Delta = r^2 - 2mr + a^2, \qquad (119)\]
and m and a are two real constants. We assume 0 < a < m, with the Schwarzschild case a = 0 and the extreme Kerr case a = m as limits. Then the Kerr metric describes a rotating uncharged black hole of mass m and specific angular momentum a. (The case a > m, which describes a naked singularity, will be briefly considered at the end of this section.) The domain of outer communication is the region between the (outer) horizon at
\[r_+ = m + \sqrt{m^2 - a^2} \qquad (120)\]
and r = ∞. It is joined to the region r < r_{+} in such a way that past-oriented ingoing lightlike geodesics cannot cross the horizon. Thus, for lensing by a Kerr black hole only the domain of outer communication is of interest unless one wants to study the case of an observer who has fallen into the black hole.
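As a small numerical illustration of these quantities, the following Python sketch evaluates ϱ², Δ, the outer horizon radius, and the outer boundary of the region ϱ² ≤ 2mr that Section 5.8.2 calls the ergosphere. It assumes the standard Boyer-Lindquist expressions ϱ² = r² + a² cos²ϑ and Δ = r² − 2mr + a² in units with G = c = 1; the function names are illustrative and not taken from the review.

```python
import math


def rho2(r, theta, a):
    """rho^2 = r^2 + a^2 cos^2(theta) (assumed standard definition)."""
    return r * r + (a * math.cos(theta)) ** 2


def delta(r, m, a):
    """Delta = r^2 - 2 m r + a^2 (assumed standard definition)."""
    return r * r - 2.0 * m * r + a * a


def outer_horizon(m, a):
    """Larger root of Delta(r) = 0; only defined for |a| <= m."""
    return m + math.sqrt(m * m - a * a)


def ergosurface(theta, m, a):
    """Larger root of rho^2 = 2 m r at polar angle theta."""
    return m + math.sqrt(m * m - (a * math.cos(theta)) ** 2)


if __name__ == "__main__":
    m, a = 1.0, 0.9
    print("outer horizon r_+      :", outer_horizon(m, a))
    print("ergosurface at equator :", ergosurface(math.pi / 2, m, a))
    print("ergosurface on the axis:", ergosurface(0.0, m, a))
```

On the axis the two surfaces touch, while in the equatorial plane the ergosurface sits at r = 2m for every a, which is why the Fermat geometry of the next subsection is only available outside this region.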
5.8.1 Historical notes
The Kerr metric was found by Kerr [181]. The coordinate representation (118) is due to Boyer and Lindquist [36]. The literature on lightlike (and timelike) geodesics of the Kerr metric is abundant (for an overview of the pre-1979 literature, see Sharp [306]). Detailed accounts of Kerr geodesics can be found in the books by Chandrasekhar [54] and O’Neill [248].
5.8.2 Fermat geometry
The Killing vector field ∂_{t} is not timelike on that part of the domain of outer communication where ϱ(r, ϑ)^{2} ≤ 2mr. This region is known as the ergosphere. Thus, the general results of Section 4.2 on conformally stationary spacetimes apply only to the region outside the ergosphere. On this region, the Kerr metric is of the form (61), with redshift potential
\[e^{2f} = 1 - \frac{2mr}{\varrho^2}, \qquad (121)\]
Fermat metric
\[\hat g = \frac{\varrho^4}{\varrho^2 - 2mr}\left(\frac{dr^2}{\Delta} + d\vartheta^2 + \frac{\Delta\sin^2\vartheta\, d\varphi^2}{\varrho^2 - 2mr}\right), \qquad (122)\]
and Fermat one-form
\[\hat \phi = \frac{2mra\sin^2\vartheta\, d\varphi}{\varrho^2 - 2mr}. \qquad (123)\]
(Equation (122) corrects a misprint in [265], Equation (66), where a square is missing.) With the Fermat geometry at hand, the optical-mechanical analogy (Fermat’s principle versus Maupertuis’ principle) allows one to write the equation for light rays in the form of Newtonian mechanics (cf. [7]). Certain embedding diagrams for the Fermat geometry (optical reference geometry) of the equatorial plane have been constructed [313, 160]. However, they are less instructive than in the static case (recall Figure 11) because they do not represent the light rays as geodesics of a Riemannian manifold.
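The Fermat data can be read off numerically from the Boyer-Lindquist components. The sketch below assumes that the stationary form referred to as (61) is g = e^{2f}[−(dt + φ̂)² + ĝ], so that e^{2f} = −g_tt, φ̂_ϕ = −g_tϕ/e^{2f}, and ĝ_ϕϕ = g_ϕϕ/e^{2f} + φ̂_ϕ². Both this convention and the function names are assumptions made for illustration.

```python
import math


def kerr_components(r, theta, m, a):
    """Boyer-Lindquist components (g_tt, g_tphi, g_phiphi, g_rr, g_thth)."""
    s2 = math.sin(theta) ** 2
    rho2 = r * r + (a * math.cos(theta)) ** 2
    delta = r * r - 2.0 * m * r + a * a
    g_tt = -(1.0 - 2.0 * m * r / rho2)
    g_tphi = -2.0 * m * r * a * s2 / rho2
    g_phiphi = (r * r + a * a) * s2 + 2.0 * m * r * a * a * s2 * s2 / rho2
    return g_tt, g_tphi, g_phiphi, rho2 / delta, rho2


def fermat_geometry(r, theta, m, a):
    """Return (e^{2f}, phi_hat_phi, (ghat_rr, ghat_thth, ghat_phiphi)).

    Assumes g = e^{2f} [-(dt + phi_hat)^2 + ghat]; this is only meaningful
    where the Killing field d/dt is timelike, i.e. outside the ergosphere.
    """
    g_tt, g_tphi, g_phiphi, g_rr, g_thth = kerr_components(r, theta, m, a)
    e2f = -g_tt
    if e2f <= 0.0:
        raise ValueError("point lies inside the ergosphere")
    phi_hat = -g_tphi / e2f
    ghat = (g_rr / e2f, g_thth / e2f, g_phiphi / e2f + phi_hat ** 2)
    return e2f, phi_hat, ghat


if __name__ == "__main__":
    print(fermat_geometry(5.0, 1.0, 1.0, 0.7))
```

Comparing the ϕϕ-component returned here with Δ sin²ϑ ϱ⁴/(ϱ² − 2mr)² is a quick way to check that the power of the prefactor is right.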
5.8.3 First integrals for lightlike geodesics
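Carter [53] discovered that the geodesic equation in the Kerr metric admits another independent constant of motion K, in addition to the constants of motion L and E associated with the Killing vector fields ∂_{ϕ} and ∂_{t}. This reduces the lightlike geodesic equation to the following first-order form:
\[\varrho^2\,\dot t = a\left(L - Ea\sin^2\vartheta\right) + \frac{(r^2 + a^2)\left((r^2 + a^2)E - aL\right)}{\Delta}, \qquad (124)\]
\[\varrho^2\,\dot \varphi = \frac{L - Ea\sin^2\vartheta}{\sin^2\vartheta} + \frac{a\left((r^2 + a^2)E - aL\right)}{\Delta}, \qquad (125)\]
\[\varrho^4\,{\dot \vartheta}^2 = K - \frac{\left(L - Ea\sin^2\vartheta\right)^2}{\sin^2\vartheta}, \qquad (126)\]
\[\varrho^4\,{\dot r}^2 = -K\Delta + \left((r^2 + a^2)E - aL\right)^2. \qquad (127)\]
Here an overdot denotes differentiation with respect to an affine parameter s. This set of equations allows writing the lightlike geodesics in terms of elliptic integrals [16]. Clearly, \({\dot \vartheta}\) and ṙ may change sign along a ray; thus, the integration of Equation (126) and Equation (127) must be done piecewise. The determination of the turning points where \({\dot \vartheta}\) and ṙ change sign is crucial for numerical evaluation of these integrals and, thus, for ray tracing in the Kerr spacetime (see, e.g., [330, 285, 109]). With the help of Equations (126, 127) one easily verifies the following important fact. Through each point of the region
\[{\mathcal K}:\quad \left(2r\Delta - (r - m)\varrho^2\right)^2 \le 4a^2r^2\Delta\sin^2\vartheta \qquad (128)\]
there is a spherical light ray, i.e., a light ray along which r is constant (see Figure 21). These spherical light rays are unstable with respect to radial perturbations. For the spherical light ray at radius r_{p} the constants of motion E, L, and K satisfy
\[\frac{aL}{E} = r_{\rm{p}}^2 + a^2 - \frac{2r_{\rm{p}}\Delta(r_{\rm{p}})}{r_{\rm{p}} - m}, \qquad (129)\]
\[\frac{K}{E^2} = \frac{4r_{\rm{p}}^2\Delta(r_{\rm{p}})}{(r_{\rm{p}} - m)^2}. \qquad (130)\]
The region \({\mathcal K}\) is the Kerr analogue of the “light sphere” r = 3m in the Schwarzschild spacetime.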
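The condition for a spherical light ray can be checked numerically. The sketch below assumes the standard Carter form of the radial equation, ϱ⁴ṙ² = ((r² + a²)E − aL)² − KΔ, and imposes that the right-hand side and its r-derivative vanish at r = r_p. It also uses the well-known closed formula for the equatorial circular photon orbits of the Kerr metric to get the two extremal radii. Function names are illustrative, and a ≠ 0 is required.

```python
import math


def spherical_ray_constants(r_p, m, a):
    """Return (L/E, K/E^2) for the spherical light ray at radius r_p."""
    delta = r_p * r_p - 2.0 * m * r_p + a * a
    l_over_e = (r_p * r_p + a * a - 2.0 * r_p * delta / (r_p - m)) / a
    k_over_e2 = 4.0 * r_p * r_p * delta / (r_p - m) ** 2
    return l_over_e, k_over_e2


def radial_potential(r, m, a, l_over_e, k_over_e2):
    """Right-hand side of the assumed radial equation, divided by E^2."""
    delta = r * r - 2.0 * m * r + a * a
    return (r * r + a * a - a * l_over_e) ** 2 - k_over_e2 * delta


def equatorial_photon_radii(m, a):
    """Radii of the co-rotating and counter-rotating equatorial circular rays."""
    co = 2.0 * m * (1.0 + math.cos(2.0 / 3.0 * math.acos(-a / m)))
    counter = 2.0 * m * (1.0 + math.cos(2.0 / 3.0 * math.acos(a / m)))
    return co, counter


if __name__ == "__main__":
    m, a = 1.0, 0.5
    print("extremal radii:", equatorial_photon_radii(m, a))
    lam, kap = spherical_ray_constants(3.0, m, a)
    print("L/E, K/E^2 at r_p = 3m:", lam, kap)
    print("radial potential there:", radial_potential(3.0, m, a, lam, kap))
```

At the two extremal radii the ray stays in the equatorial plane, which in this parametrization shows up as K/E² = (L/E − a)².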
5.8.4 Light cone
With the help of Equations (124, 125, 126, 127), the past light cone of any observation event p_{O} can be explicitly written in terms of elliptic integrals. In this representation the light rays are labeled by the constants of motion L/E and K/E^{2}. In accordance with the general idea of observational coordinates (4), it is desirable to relabel them by spherical coordinates (Ψ, Θ) on the observer’s celestial sphere. This requires choosing an orthonormal tetrad (e_{0}, e_{1}, e_{2}, e_{3}) at p_{O}. It is convenient to choose e_{1} ∝ ∂_{ϑ}, e_{2} ∝ ∂_{ϕ}, e_{3} ∝ ∂_{r} and, thus, e_{0} perpendicular to the hypersurface t = constant (“zero-angular-momentum observer”). For an observation event in the equatorial plane, ϑ_{O} = π/2, at radius r_{O}, one finds
As in the Schwarzschild case, some light rays from p_{O} go out to infinity and some go to the horizon. In the Schwarzschild case, the borderline between the two classes corresponds to light rays that asymptotically approach the light sphere at r = 3m. In the Kerr case, it corresponds to light rays that asymptotically approach a spherical light ray in the region \({\mathcal K}\) of Figure 21. The constants of motion for such light rays are given by Equations (129, 130), with r_{p} varying between its extremal values \(r_ + ^{{\rm{ph}}}\) and \(r_ - ^{{\rm{ph}}}\) (see again Figure 21). Thereupon, Equation (131) and Equation (132) determine the celestial coordinates Ψ and Θ of those light rays that approach a spherical light ray in \({\mathcal K}\). The resulting curve on the observer’s celestial sphere gives the apparent shape of the Kerr black hole (see Figure 22). For an observation event on the axis of rotation, sin ϑ_{O} = 0, the Kerr light cone is rotationally symmetric. The caustic consists of infinitely many spacelike curves, as in the Schwarzschild case. A light source passing through such a caustic point is seen as an Einstein ring. For observation events not on the axis, there is no rotational symmetry and the caustic structure is quite different from the Schwarzschild case. This is somewhat disguised if one restricts to light rays in the equatorial plane ϑ = π/2 (which is possible, of course, only if the observation event is in the equatorial plane). Then the resulting 2-dimensional light cone looks indeed qualitatively similar to the Schwarzschild cone of Figure 12 (cf. [147]), where intersections of the light cone with hypersurfaces t = constant are depicted. However, in the Kerr case the transverse self-intersection of this 2-dimensional light cone does not occur on an axis of symmetry. Therefore, the caustic of the full (3-dimensional) light cone is more involved than in the Schwarzschild case. 
It turns out to be not a spatially straight line, as in the Schwarzschild case, but rather a tube, with astroid cross-section, that winds a certain number of times around the black hole; for a → m it approaches the horizon in an infinite spiral motion. The caustic of the Kerr light cone with vertex in the equatorial plane was numerically calculated and depicted, for a = m, by Rauch and Blandford [285]. From the study of light cones one may switch to the more general study of wave fronts. (For the definition of wave fronts see Section 2.2.) Pretorius and Israel [282] determined all axisymmetric wave fronts in the Kerr geometry. In this class, they investigated in particular those members that are asymptotic to Minkowski light cones at infinity (“quasi-spherical light cones”) and they found, rather surprisingly, that they are free of caustics.
5.8.5 Lensing by a Kerr black hole
For an observation event p_{O} and a light source with worldline γ_{S}, both in the domain of outer communication of a Kerr black hole, several qualitative features of lensing are unchanged in comparison to the Schwarzschild case. If γ_{S} is past-inextendible, bounded away from the horizon and from (past lightlike) infinity, and does not meet the caustic of the past light cone of p_{O}, the observer sees an infinite sequence of images; for this sequence, the travel time (e.g., in terms of the time coordinate t) goes to infinity. These statements can be proven with the help of Morse theory (see Section 3.3). On the observer’s sky the sequence of images approaches the apparent boundary of the black hole which is shown in Figure 22. This follows from the fact that

the infinite sequence of images must have an accumulation point on the observer’s sky, by compactness, and

the lightlike geodesic with this initial direction cannot go to infinity or to the horizon, by assumption on γ_{S}.
If γ_{S} meets the caustic of p_{O}’s past light cone, the image is not an Einstein ring, unless p_{O} is on the axis of rotation. It has only an “infinitesimal” angular extension on the observer’s sky. As always when a point source meets the caustic, the ray-optical calculation gives an infinitely bright image. Numerical studies show that in the Kerr spacetime, where the caustic is a tube with astroid cross-section, the image is very bright whenever the light source is inside the tube [285]. In principle, with the lightlike geodesics given in terms of elliptic integrals, image positions on the observer’s sky can be calculated explicitly. This has been worked out for several special worldlines γ_{S}. The case that γ_{S} is a circular timelike geodesic in the equatorial plane of the extreme Kerr metric, a = m, was treated by Cunningham and Bardeen [67, 17]. This example is of relevance in view of accretion disks. Viergutz [330] developed a formalism for the case that γ_{S} has constant r and ϑ coordinates, i.e., for a light source that stays on a ring around the axis. One aim of this approach, which could easily be rewritten in terms of the exact lens map (recall Section 2.1), was to provide a basis for numerical studies. The case that γ_{S} is an integral curve of ∂_{t} and that γ_{S} and p_{O} are at large radii is treated by Bozza [38] under the additional assumption that source and observer are close to the equatorial plane and by Vazquez and Esteban [329] without this restriction. All these articles also calculate the brightness of images. This requires determining the cross-section of infinitesimally thin bundles with vertex, e.g., in terms of the shape parameters D_{+} and D_{−} (recall Figure 3). For a bundle around an arbitrary light ray in the Kerr metric, all relevant equations were worked out analytically by Pineault and Roeder [276]. However, the equations are much more involved than for the Schwarzschild case and will not be given here. 
Lensing by a Kerr black hole has been visualized (i) by showing the apparent distortion of a background pattern [277, 307] and (ii) by showing the visual appearance of an accretion disk [277, 281, 307]. The main difference, in comparison to the Schwarzschild case, is in the loss of the left-right symmetry. In view of observations, Kerr black holes have been considered as candidates for active galactic nuclei (AGN) for many years. In particular, the X-ray variability of AGN is interpreted as coming from a “hot spot” in an accretion disk that circles around a Kerr black hole. Starting with the pioneering work in [67, 17], many articles have been written on calculating the light curves and the spectrum of such “hot spots”, as seen by a distant observer (see, e.g., [75, 12, 175, 169, 109]). The spectrum can be calculated in terms of a transfer function that was tabulated, for some values of a, in [65] (cf. [330, 331]). A Kerr black hole is also considered as the most likely candidate for the supermassive object at the center of our own galaxy. (For background material see [107].) In this case, the predicted angular diameter of the black hole on our sky, in the sense of Figure 22, is about 30 microarcseconds; this is not too far from the reach of current VLBI technologies [108]. Also, the fact that the radio emission from our galactic center is linearly polarized gives a good motivation for calculating polarimetric images as produced by a Kerr black hole. The calculation is based on the geometric-optics approximation according to which the polarization vector is parallel along the light ray. In the Kerr spacetime, this parallel-transport law can be explicitly written with the help of constants of motion [61, 276, 317] (cf. [54], p. 358). As to the large number of numerical codes that have been written for calculating imaging properties of a Kerr black hole, the reader may consult [176, 330, 285, 109].
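The order of magnitude of the quoted angular diameter can be reproduced with a rough estimate. The sketch below neglects the spin and uses the Schwarzschild value 2√27 Gm/(c²D) for the angular diameter of the shadow seen by a distant observer. The mass of 2.6 × 10⁶ solar masses and the distance of 8 kpc are illustrative assumptions, not values taken from this review, and the physical constants are rounded.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg
PARSEC = 3.0857e16   # parsec, m
RAD_TO_MICROARCSEC = 180.0 / math.pi * 3600.0 * 1.0e6


def shadow_diameter_microarcsec(mass_solar, distance_pc):
    """Angular diameter 2*sqrt(27)*G*M/(c^2*D) of a Schwarzschild shadow.

    Valid for an observer far from the black hole; the spin is neglected.
    """
    m_geom = G * mass_solar * M_SUN / C ** 2          # mass in metres
    theta = 2.0 * math.sqrt(27.0) * m_geom / (distance_pc * PARSEC)
    return theta * RAD_TO_MICROARCSEC


if __name__ == "__main__":
    # Assumed, illustrative values for the galactic center.
    print(shadow_diameter_microarcsec(2.6e6, 8.0e3), "microarcseconds")
```

With these assumed inputs the estimate comes out at roughly 30 microarcseconds; a larger assumed mass scales the result linearly.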
5.8.5.1 Notes on Kerr naked singularities and on the Kerr-Newman spacetime
The Kerr metric with a > m describes a naked singularity and is considered as unphysical by most authors. Its lightlike geodesics have been studied in [47, 49] (cf. [54], p. 375). The Kerr-Newman spacetime (charged Kerr spacetime) is usually thought to be of little astrophysical relevance because the net charge of celestial bodies is small. For the lightlike geodesics in this spacetime the reader may consult [48, 50].
5.9 Rotating disk of dust
The stationary axisymmetric spacetime around a rigidly rotating disk of dust was first studied in terms of a numerical solution to Einstein’s field equation by Bardeen and Wagoner [18, 19]. The exact solution was found much later by Neugebauer and Meinel [236]. It is discussed, e.g., in [235]. The metric cannot be written in terms of elementary functions because it involves the solution to an ultraelliptic integral equation. It depends on a parameter μ which varies between zero and μ_{c} = 4.62966 …. For small μ one gets the Newtonian approximation, for μ → μ_{c} the extreme Kerr metric (a = m) is approached. The lightlike geodesics in this spacetime have been studied numerically and the appearance of the disk to a distant observer has been visualized [345]. It would be desirable to support these numerical results with exact statements. From the known properties of the metric, only a few qualitative lensing features of the disk can be deduced. As Minkowski spacetime is approached for μ → 0, the spacetime must be asymptotically simple and empty as long as μ is sufficiently small. (This is true, of course, only if the disk is treated as transparent.) The general results of Section 3.4 imply that in this case the gravitational field of the disk produces finitely many images of each light source, and that the number of images is odd, provided that the worldline of the light source is past-inextendible and does not go out to past lightlike infinity. For larger values of μ, this is no longer true. For μ > 0.5 there are two counter-rotating circular lightlike geodesics in the equatorial plane, a stable one at a radius \({{\tilde \rho}_1}\) inside the disk and an unstable one at a radius \({{\tilde \rho}_2}\) outside the disk. (This follows from [10] where it is shown that for μ > 0.5 timelike counter-rotating circular geodesics do not exist in a radius interval [\({{\tilde \rho}_1},{{\tilde \rho}_2}\)]. 
The boundary values of this interval give the radii of lightlike circular geodesics.) The existence of circular light rays has the consequence that the number of images must be infinite; this is obviously true if light source and observer are exactly on the spatial track of such a circular light ray and, by continuity, also in a neighborhood. For a better understanding of lensing by the disk of dust it is desirable to investigate, for each value of μ and each event p_{O}: Which past-oriented lightlike geodesics that issue from p_{O} go out to infinity and which are trapped? Also, it is desirable to study the light cones and their caustics.
5.10 Straight spinning string
Cosmic strings (and other topological defects) are expected to exist in the universe, resulting from a phase transition in the early universe (see, e.g., [334] for a detailed account). So far, there is no direct observational evidence for the existence of strings. In principle, they could be detected by their lensing effect (see [295] for observations of a recent candidate and [164] for a discussion of the general perspective). Basic lensing features for various string configurations are briefly summarized in [9]. Here we consider the simple case of a straight string that is isolated from all other masses. This is one of the most attractive examples for investigating lensing from the spacetime perspective without approximations. In particular, studying the light cones in this metric is an instructive exercise. The geodesic equation is completely integrable, and the geodesics can even be written explicitly in terms of elementary functions.
We consider the spacetime metric
\[g = -(dt - a\,d\varphi)^2 + dz^2 + d\rho^2 + k^2\rho^2\,d\varphi^2 \qquad (133)\]
with constants a and k > 0. As usual, the azimuthal coordinate ϕ is defined modulo 2π. For a = 0 and k = 1, metric (133) is the Minkowski metric in cylindrical coordinates. For any other values of a and k, the metric is still (locally) flat but not globally isometric to Minkowski spacetime; there is a singularity along the z-axis. For a = 0 and 0 < k < 1, the plane t = constant, z = constant has the geometry of a cone with a deficit angle
\[\delta = (1 - k)\,2\pi \qquad (134)\]
(see Figure 23); for k > 1 there is a surplus angle. Note that restricting the metric (133) with a = 0 to the hyperplane z = constant gives the same result as restricting the metric (115) of the Barriola-Vilenkin monopole to the hyperplane ϑ = π/2.
The metric (133) describes the spacetime around a straight spinning string. The constant k is related to the string’s mass-per-length μ, in Planck units, via
\[k = 1 - 4\mu, \qquad (135)\]
whereas the constant a is a measure for the string’s spin. Equation (135) shows that we have to restrict to the deficit-angle case k < 1 to have μ non-negative. One may treat the string as a line singularity, i.e., consider the metric (133) for all ρ > 0. (This “wire approximation”, where the energy-momentum tensor of the string is concentrated on a 2-dimensional submanifold, is mathematically delicate; see [135].) For a string of finite radius ρ_{*} one has to match the metric (133) at ρ = ρ_{*} to an interior solution, thereby getting a metric that is regular on all of ℝ^{4}.
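To get a feeling for the numbers, the following sketch converts a mass-per-length μ into the cone parameter k and the deficit angle. It assumes the standard relations k = 1 − 4μ (Planck units) and δ = 2π(1 − k), which give δ = 8πμ; the function names and the sample value μ = 10⁻⁶ are illustrative choices.

```python
import math


def k_from_mu(mu):
    """Cone parameter from the mass-per-length, assuming k = 1 - 4*mu."""
    return 1.0 - 4.0 * mu


def deficit_angle(k):
    """Deficit angle in radians, assuming delta = 2*pi*(1 - k)."""
    return 2.0 * math.pi * (1.0 - k)


if __name__ == "__main__":
    mu = 1.0e-6                                    # hypothetical sample value
    delta = deficit_angle(k_from_mu(mu))
    print("deficit angle in radians:", delta)
    print("deficit angle in arcsec :", math.degrees(delta) * 3600.0)
```

A negative result of `deficit_angle` corresponds to the surplus-angle case k > 1, which the non-negativity of μ excludes.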
5.10.1 Historical notes
With a = 0, the metric (133) and its geodesics were first studied by Marder [213, 214]. He also discussed the matching to an interior solution, without, however, associating it with strings (which were not yet a topic at that time). The same metric was investigated by Sokolov and Starobinsky [308] as an example for a conic singularity. Later Vilenkin [332, 333] showed that within the linearized Einstein theory the metric (133) with a = 0 describes the spacetime outside a straight non-spinning string. Hiscock [159], Gott [144], and Linet [207] realized that the same is true in the full (non-linear) Einstein theory. Basic features of lensing by a non-spinning string were found by Vilenkin [333] and Gott [144]. The matching to an interior solution for a spinning string, a ≠ 0, was worked out by Jensen and Soleng [170]. Already earlier, the restriction of the metric (133) with a ≠ 0 to the hyperplane z = 0 was studied as the spacetime of a spinning particle in 2 + 1 dimensions by Deser, Jackiw, and ’t Hooft [77]. The geodesics in this (2 + 1)-dimensional metric were first investigated by Clément [60] (cf. Krori, Goswami, and Das [192] for the (3 + 1)-dimensional case). For geodesics in string metrics one may also consult Galtsov and Masar [131]. The metric (133) can be generalized to the case of several parallel strings (see Letelier [205] for the non-spinning case, and Krori, Goswami, and Das [192] for the spinning case). Clarke, Ellis and Vickers [58] found obstructions against embedding a string model close to metric (133) into an almost-Robertson-Walker spacetime. This is a caveat, indicating that the lensing properties of “real” cosmic strings might be significantly different from the lensing properties of the metric (133).
5.10.2 Redshift and Fermat geometry
The string metric (133) is stationary, so the results of Section 4.2 apply. Comparison of metric (133) with metric (61) shows that the redshift potential vanishes, f = 0. Hence, observers on t-lines see each other without redshift. The Fermat metric \({\hat g}\) and Fermat one-form \({\hat \phi}\) read
\[\hat g = dz^2 + d\rho^2 + k^2\rho^2\,d\varphi^2, \qquad (136)\]
\[\hat \phi = -a\,d\varphi. \qquad (137)\]
As the Fermat one-form is closed, \(d\hat \phi = 0\), the spatial paths of light rays are the geodesics of the Fermat metric \({\hat g}\) (cf. Equation (64)), i.e., they are not affected by the spin of the string. \({\hat \phi}\) can be transformed to zero by changing from t to the new time function t − aϕ. Then the influence of the string’s spin on the travel time (62) vanishes as well. However, the new time function is not globally well-behaved (if a ≠ 0), because ϕ is either discontinuous or multi-valued on any region that contains a full circle around the z-axis. As a consequence, \({\hat \phi}\) can be transformed to zero on every region that does not contain a full circle around the z-axis, but not globally. This may be viewed as a gravitational analogue of the Aharonov-Bohm effect (cf. [309]). The Fermat metric (136) describes the product of a cone with the z-line. Its geodesics (spatial paths of light rays) are straight lines if we cut the cone open and flatten it out into a plane (see Figure 23). The metric of a cone is (locally) flat but not (globally) Euclidean. This gives rise to another analogue of the Aharonov-Bohm effect, to be distinguished from the one mentioned above, which was discussed, e.g., in [115, 29, 155].
5.10.3 Light cone
For the metric (133), the lightlike geodesics can be explicitly written in terms of elementary functions. One just has to apply the coordinate transformation (t, ϕ) ↦ (t − aϕ, kϕ) to the lightlike geodesics in Minkowski spacetime. As indicated above, the new coordinates are not globally well-behaved on the entire spacetime. However, they can be chosen as continuous and single-valued functions of the affine parameter s along all lightlike geodesics through some chosen event, with ϕ taking values in ℝ. In this way we get the following representation of the lightlike geodesics that issue from the observation event (ρ = ρ_{0}, ϕ = 0, z = 0, t = 0) into the past:
The affine parameter s coincides with \({\hat g}\)-arclength ℓ, and (Ψ, Θ) parametrize the observer’s celestial sphere,
Equations (138, 139, 140, 141) give us the light cone parametrized by (s, Θ, Ψ). The same equations determine the intersection of the light cone with any timelike hypersurface (source surface) and thereby the exact lens map in the sense of Frittelli and Newman [123] (recall Section 2.1). For k = 0.8 and a = 0, the light cone is depicted in Figure 24; intersections of the light cone with hypersurfaces t = constant (“instantaneous wave fronts”) are shown in Figure 27. In both pictures we consider a non-transparent string of finite radius ρ_{*}, i.e., the light rays terminate if they meet the boundary of the string. Figures 25 and 28 show how the light cone is modified if the string is transparent. This requires matching the metric (133) to an interior solution which is everywhere regular and letting light rays pass through the interior. For the non-transparent string, the light cone cannot form a caustic, because the metric is flat. For the transparent string, light rays that pass through the interior of the string do form a caustic. The special form of the interior metric is not relevant. The caustic has the same features for all interior metrics that monotonically interpolate between a regular axis and the boundary of the string. Also, there is no qualitative change of the light cone for a spinning string as long as the spin a is small. Large values of a, however, change the picture drastically. For \({a^2} > {k^2}\rho _*^2\), where ρ_{*} is the radius of the string, the ϕ-lines become timelike on a neighborhood of the string. As the ϕ-lines are closed, this indicates causality violation. In this causality-violating region the hypersurfaces t = constant are not everywhere spacelike and, in particular, not transverse to all lightlike geodesics. Thus, our notion of instantaneous wave fronts becomes pathological.
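The unrolling construction described in this subsection can be sketched in a few lines. The code takes a straight past-directed ray in Minkowski spacetime and pulls it back with the transformation (t, ϕ) ↦ (t − aϕ, kϕ). It assumes a line string (ρ_{*} → 0) and an angle convention in which Ψ = 0 points from the observer towards the string and Θ is measured from the z-direction; that convention, like the function name, is an assumption of this sketch and may differ from the one used in Equations (138, 139, 140, 141).

```python
import math


def string_light_ray(s, psi, theta, rho_0, k, a):
    """Point (rho, phi, z, t) at affine distance s on the past light cone.

    The observation event is (rho = rho_0, phi = 0, z = 0, t = 0).
    Assumed convention: psi = 0 points towards the string axis.
    """
    # Straight ray in the flattened (unrolled) cone, Cartesian coordinates.
    x = rho_0 - s * math.sin(theta) * math.cos(psi)
    y = s * math.sin(theta) * math.sin(psi)
    rho = math.hypot(x, y)
    # The unrolled polar angle is k*phi; atan2 keeps it in (-pi, pi].
    phi = math.atan2(y, x) / k
    z = s * math.cos(theta)
    # The Minkowski time t - a*phi equals -s on a past-directed ray.
    t = -s + a * phi
    return rho, phi, z, t


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.0, 4.0):
        print(s, string_light_ray(s, 0.4, 1.2, 3.0, 0.8, 0.0))
```

Because a straight line from the observer can only reach the unrolled angle ±π by passing through the axis, `atan2` is continuous along every ray that misses the string, so no explicit unwrapping of ϕ is needed here.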
5.10.4 Lensing by a nontransparent string
With the lightlike geodesics known in terms of elementary functions, positions and properties of images can be explicitly determined without approximation. We place the observation event at ρ = ρ_{0}, ϕ = 0, z = 0, t = 0, and we consider a light source whose worldline is a t-line at ρ = ρ_{S}, ϕ = ϕ_{S}, z = z_{S} with 0 ≤ ϕ_{S} ≤ π. From Equations (138, 139, 140) we find that the images are in one-to-one correspondence with integers n such that
They can be numbered by the winding number n in the order n = 0, −1, 1, −2, 2, … The total number of images depends on k. Let N_{1}(k) be the largest integer and N_{2}(k) be the smallest integer such that N_{1}(k) ≤ 1/k < N_{2}(k). Of the two integers N_{1}(k) and N_{2}(k), denote the odd one by N_{odd}(k) and the even one by N_{even}(k). Then we find from Equation (143)
Thus, the number of images is even in a wedge-shaped region behind the string and odd everywhere else. If the light source approaches the boundary between the two regions, one image vanishes behind the string (see Figure 23 for the case 1 < 1/k < 2). (If the non-transparent string has finite thickness, there is also a region with no image at all, in the “shadow” of the string.) The coordinates (Ψ_{n}, Θ_{n}) on the observer’s sky of an image with winding number n and the affine parameter s_{n} at which the light source is met can be determined from Equations (138, 139, 140). We just have to insert \(\rho (s) = {\rho _{\rm{S}}},\varphi (s) = {\varphi _{\rm{S}}} + 2n\pi, z(s) = {z_{\rm{S}}}\) and to solve for tan Ψ = tan Ψ_{n}, tan Θ = tan Θ_{n}, and s = s_{n}:
The travel time follows from Equation (141):
It is the only relevant quantity that depends on the string’s spin a. With the observer on a t-line, the affine parameter s coincides with the area distance, D_{area}(s) = s, because in the (locally) flat string spacetime the focusing equation (44) reduces to \({\ddot D_{{\rm{area}}}} = 0\). For observer and light source on t-lines, the redshift vanishes, so s also coincides with the luminosity distance, D_{lum}(s) = s, owing to the general law (48). Hence, Equation (148) gives us the brightness of images (see Section 2.6 for the relevant formulas). The string metric produces no image distortion because the curvature tensor (and thus, the Weyl tensor) vanishes (recall Section 2.5). Realistic string models yield a mass density μ that is smaller than 10^{−4}. So, by Equation (135), only the case N_{odd}(k) = 1 and N_{even}(k) = 2 is thought to be of astrophysical relevance. In that case we have a single-imaging region, 0 ≤ ϕ_{S} < 2π − π/k, and a double-imaging region, 2π − π/k < ϕ_{S} ≤ π (see Figure 23). The occurrence of double-imaging and of single-imaging can also be read from Figure 24. In the double-imaging region we have a (“primary”) image with n = 0 and a (“secondary”) image with n = −1. From Equations (147, 148) we read that the two images have different latitudes and different brightnesses. However, for k close to 1 the difference is small. If we express k by Equation (134) and linearize Equations (146, 147, 148, 149) with respect to the deficit angle (134), we find
Hence, in this approximation the two images have the same Θ-coordinate; their angular distance Δ on the sky is given by Vilenkin’s [333] formula
and is thus independent of ϕ_{S}; they have equal brightness and their time delay is given by the string’s spin a via Equation (154). All these results apply to the case that the worldlines of the observer and of the light source are t-lines. Otherwise redshift factors must be added.
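The image count and the image positions can be explored with a short script. It rests on assumptions that go beyond what is displayed here: the string is a non-transparent line (ρ_{*} → 0); an image with winding number n exists exactly when the unrolled angle satisfies |k(ϕ_S + 2nπ)| < π, a condition chosen so that it reproduces the single-imaging and double-imaging regions stated above; Ψ = 0 points from the observer towards the string; and the travel time is s_n − a(ϕ_S + 2nπ). All names are illustrative.

```python
import math


def string_images(phi_s, rho_s, z_s, rho_0, k, a=0.0):
    """Images (n, Psi_n, Theta_n, s_n, T_n) for a non-transparent line string.

    Assumes 0 <= phi_s <= pi and the image condition |k*(phi_s + 2*pi*n)| < pi.
    """
    images = []
    n_max = int(math.ceil(1.0 / (2.0 * k))) + 1   # no image beyond this |n|
    for n in range(-n_max, n_max + 1):
        phi_n = phi_s + 2.0 * math.pi * n
        alpha = k * phi_n                          # angle in the unrolled cone
        if abs(alpha) >= math.pi:
            continue                               # ray would cross the string
        x = rho_0 - rho_s * math.cos(alpha)
        y = rho_s * math.sin(alpha)
        s_n = math.sqrt(z_s ** 2 + x ** 2 + y ** 2)
        psi_n = math.atan2(y, x)
        theta_n = math.acos(z_s / s_n)
        travel_time = s_n - a * phi_n
        images.append((n, psi_n, theta_n, s_n, travel_time))
    return images


if __name__ == "__main__":
    k = 0.8
    for phi_s in (0.5 * math.pi, 0.9 * math.pi):
        imgs = string_images(phi_s, 2.0, 0.5, 3.0, k, a=0.1)
        print("phi_S =", phi_s, "->", [img[0] for img in imgs])
```

For k = 0.8 the boundary between the two regions lies at ϕ_S = 0.75π, so the first sample source has one image (n = 0) and the second has two (n = −1 and n = 0). In this sketch the spin a enters only through the last entry of each tuple.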
5.10.5 Lensing by a transparent string
In comparison to a non-transparent string, a transparent string produces additional images. These additional images correspond to light rays that pass through the string. We consider the case a = 0 and 1 < 1/k < 2, which is illustrated by Figures 24 and 25. The general features do not depend on the form of the interior metric, as long as it monotonically interpolates between a regular axis and the boundary of the string. In the non-transparent case, there is a single-imaging region and a double-imaging region. In the transparent case, the double-imaging region becomes a triple-imaging region. The additional image corresponds to a light ray that passes through the interior of the string and then smoothly slips over one of the cusp ridges. The point where this light ray meets the worldline of the light source is on the sheet of the light cone between the two cusp ridges in Figure 25, i.e., on the sheet that does not exist in the non-transparent case of Figure 24. From the picture it is obvious that the additional image shows the light source at a younger age than the other two images (so it is a “tertiary image”). A light source whose worldline meets the caustic of the observer’s past light cone is on the borderline between single-imaging and triple-imaging. In this case the tertiary image coincides with the secondary image and it is particularly bright (even infinitely bright according to the ray-optical treatment; recall Section 2.6). Under a small perturbation of the worldline the bright image either splits into two or vanishes, so one is left either with three images or with one image.
5.11 Plane gravitational waves
A plane gravitational wave is a spacetime with metric
\[g = -2\,du\,dv - \left(f(u)(x^2 - y^2) + 2g(u)\,xy\right)du^2 + dx^2 + dy^2, \qquad (156)\]
where f(u)^{2} + g(u)^{2} is not identically zero. For any choice of f(u) and g(u), the metric (156) has vanishing Ricci tensor, i.e., Einstein’s vacuum field equation is satisfied. The lightlike vector field ∂_{v} is covariantly constant. Non-flat spacetimes with a covariantly constant lightlike vector field are called plane-fronted waves with parallel rays or pp-waves for short. They made their first appearance in a purely mathematical study by Brinkmann [43].
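The vacuum property can be made plausible in one line. Assuming that the metric (156) has the form g = −2 du dv − H(u, x, y) du² + dx² + dy² with the quadratic profile H written below, the only component of the Ricci tensor that can be non-zero for such a pp-wave is R_{uu}, which is proportional to the transverse Laplacian of H (the numerical factor depends on conventions). The symbol H is introduced here only for this illustration.

```latex
H(u,x,y) = f(u)\,(x^2 - y^2) + 2\,g(u)\,x\,y ,
\qquad
\frac{\partial^2 H}{\partial x^2} + \frac{\partial^2 H}{\partial y^2}
  = 2 f(u) - 2 f(u) + 0 = 0 .
```

So H is harmonic in (x, y) for every choice of the profile functions, which is why no condition on f and g is needed for the vacuum field equation to hold.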
In spite of their high idealization, plane gravitational waves are interesting mathematical models for studying the lensing effect of gravitational waves. In particular, the focusing effect of plane gravitational waves on light rays can be studied quite explicitly, without any weak-field or small-angle approximations. This focusing effect is reflected by an interesting light cone structure.
The basic features with relevance to lensing can be summarized in the following way. If the profile functions f and g are differentiable, and the coordinates (x, y, u, v) range over ℝ^{4}, the spacetime with the metric (156) is geodesically complete [92]. With the exception of the integral curves of ∂_{v}, all inextendible lightlike geodesics contain a pair of conjugate points. Let q be the first conjugate point along a past-oriented lightlike geodesic from an observation event p_{O}. Then the first caustic of the past light cone of p_{O} is a parabola through q. (It depends on the profile functions f and g whether or not there are more caustics, i.e., second, third, etc. conjugate points.) This parabola is completely contained in a hyperplane u = constant. All light rays through p_{O}, with the exception of the integral curve of ∂_{v}, pass through this parabola. In other words, the entire sky of p_{O}, with the exception of one point, is focused into a curve (see Figure 29). This astigmatic focusing effect of plane gravitational waves was discovered by Penrose [259] who worked out the details for “sufficiently weak sandwich waves”. (The name “sandwich wave” refers to the case that f(u) and g(u) are different from zero only in a finite interval u_{1} < u < u_{2}.) Full proofs of the above statements, for arbitrary profile functions f and g, were given by Ehrlich and Emch [94, 95] (cf. [25], Chapter 13). The latter authors also demonstrate that plane gravitational wave spacetimes are causally continuous but not causally simple. This strengthens Penrose’s observation [259] that they are not globally hyperbolic. (For the hierarchy of causality notions see [25].) The generators of the light cone leave the boundary of the chronological past I^{−}(p_{O}) when they reach the caustic. Thus, the above-mentioned parabola is also the cut locus of the past light cone.
By the general results of Section 2.8, the occurrence of a cut locus implies that there is multiple imaging in the plane-wave spacetime. The number of images depends on the profile functions. We may choose the profile functions such that there is no second caustic. (The “sufficiently weak sandwich waves” considered by Penrose [259] are of this kind.) Then Figure 29 demonstrates that an appropriately placed worldline (close to the caustic) intersects the past light cone exactly twice, so there is double-imaging. Thus, the plane waves demonstrate that the number of images need not be odd, even in the case of a geodesically complete spacetime with trivial topology.
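The astigmatic focusing can be illustrated numerically. The following is only a minimal sketch, and it rests on an assumption: that the metric (156) has the usual Brinkmann form g = −2 du dv − (f(u)(x² − y²) + 2 g(u) x y) du² + dx² + dy². Under that assumption, u is an affine parameter along every lightlike geodesic that is not an integral curve of ∂_{v}, and the transverse coordinates obey the linear system x″ = −f x − g y, y″ = −g x + f y. With the opposite sign convention for the profile functions, the focusing and defocusing directions are merely interchanged. The function name `first_conjugate_distance` and its parameters are hypothetical choices made for this illustration. It integrates the 2×2 transverse Jacobi matrix D with D = 0 and D′ = identity at the observation event, and reports the first sign change of det D, which marks the first conjugate point.

```python
import math


def first_conjugate_distance(f, g, u0, direction=-1.0, h=1e-3, s_max=50.0):
    """Distance s (measured in u) from u0 to the first zero of det D.

    Assumed transverse equation (Brinkmann form, see lead-in):
        D'' = -A(u) D,  A = [[f, g], [g, -f]],  D(u0) = 0,  D'(u0) = 1.
    direction = -1 follows past-oriented rays (decreasing u).
    Returns None if det D does not change sign for s < s_max.
    A zero of det D without sign change (multiplicity two) is not detected.
    """
    def accel(s, D):
        u = u0 + direction * s
        fu, gu = f(u), g(u)
        # D is stored row-major as [D00, D01, D10, D11].
        return [
            -(fu * D[0] + gu * D[2]), -(fu * D[1] + gu * D[3]),
            -(gu * D[0] - fu * D[2]), -(gu * D[1] - fu * D[3]),
        ]

    def shifted(a, b, c):
        # Componentwise a + c * b.
        return [x + c * y for x, y in zip(a, b)]

    D = [0.0, 0.0, 0.0, 0.0]
    V = [1.0, 0.0, 0.0, 1.0]  # dD/ds at the observation event
    s, prev_det = 0.0, 0.0
    while s < s_max:
        # Classical fourth-order Runge-Kutta step for (D, V).
        k1D, k1V = V, accel(s, D)
        k2D = shifted(V, k1V, h / 2)
        k2V = accel(s + h / 2, shifted(D, k1D, h / 2))
        k3D = shifted(V, k2V, h / 2)
        k3V = accel(s + h / 2, shifted(D, k2D, h / 2))
        k4D = shifted(V, k3V, h)
        k4V = accel(s + h, shifted(D, k3D, h))
        D = [d + h / 6 * (a + 2 * b + 2 * c + e)
             for d, a, b, c, e in zip(D, k1D, k2D, k3D, k4D)]
        V = [v + h / 6 * (a + 2 * b + 2 * c + e)
             for v, a, b, c, e in zip(V, k1V, k2V, k3V, k4V)]
        s += h
        det = D[0] * D[3] - D[1] * D[2]
        if prev_det > 0.0 and det <= 0.0:
            # Linear interpolation of the zero inside the last step.
            return s - h + h * prev_det / (prev_det - det)
        prev_det = det
    return None
```

For a constant profile f = a > 0, g = 0, the system decouples into x″ = −a x and y″ = a y, so det D = sin(√a s) sinh(√a s)/a and the call `first_conjugate_distance(lambda u: a, lambda u: 0.0, u0=0.0)` should return a value close to π/√a. Only the x-direction refocuses while the y-direction diverges, which is why the conjugate point has multiplicity one and the caustic is a curve rather than a point.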
The geodesic and causal structure of plane gravitational waves and, more generally, of pp-waves is also studied in [163, 51].
One often considers profile functions f and g with Dirac-delta-like singularities (“impulsive gravitational waves”). Then a mathematically rigorous treatment of the geodesic equation, and of the geodesic deviation equation, is delicate because it involves operations on distributions which are not obviously well-defined. For a detailed mathematical study of this situation see [310, 193].
Garfinkle [132] discovered an interesting example of a pp-wave which is singular on a 2-dimensional worldsheet. This exact solution of Einstein’s vacuum field equation can be interpreted as a wave that travels along a cosmic string. Lensing in this spacetime was studied numerically by Vollick and Unruh [340].
The vast majority of work on lensing by gravitational waves is done in the weak-field approximation. Both for the exact treatment and in the weak-field approximation, one may use Kovner’s version of Fermat’s principle (see Section 2.9), which has the advantage that it allows for time-dependent situations. Applications of this principle to gravitational waves have been worked out in the original article by Kovner [187] and by Faraoni [110, 111].
References
Abramowicz, M.A., “Centrifugal force: a few surprises”, Mon. Not. R. Astron. Soc., 245, 733–746, (1990). 4.3
Abramowicz, M.A., “Relativity of inwards and outwards: an example”, Mon. Not. R. Astron. Soc., 256, 710–718, (1992). 5.1
Abramowicz, M.A., Bengtsson, I., Karas, V., and Rosquist, K., “Poincaré ball embeddings of the optical geometry”, Class. Quantum Grav., 19, 3963–3976, (2002). For a related online version see: M.A. Abramowicz, et al., (June, 2002), [Online Los Alamos Archive Preprint]: cited on 30 October 2003, http://arXiv.org/abs/gr-qc/0206027. 5.1, 5.3
Abramowicz, M.A., Carter, B., and Lasota, J.P., “Optical reference geometry for stationary and static dynamics”, Gen. Relativ. Gravit., 20, 1172–1183, (1988). 4.2, 5.1, 11
Abramowicz, M.A., and Lasota, J.P., “A note on a paradoxical property of the Schwarzschild solution”, Acta Phys. Pol., B5, 327–329, (1974). 5.1
Abramowicz, M.A., and Prasanna, A.R., “Centrifugal force reversal near a Schwarzschild black hole”, Mon. Not. R. Astron. Soc., 245, 720–728, (1990). 5.1
Alsing, P.M., “The optical-mechanical analogy for stationary metrics in general relativity”, Am. J. Phys., 66, 779–790, (1998). 4.2, 5.8
Ames, W.L., and Thorne, K.S., “The optical appearance of a star that is collapsing through its gravitational radius”, Astrophys. J., 151, 659–670, (1968). 5.1
Anderson, M.R., “Gravitational lensing by curved cosmic strings”, in Kochanek, C.S., and Hewitt, J.N., eds., Astrophysical Applications of Gravitational Lensing: Proceedings of the 173rd Symposium of the International Astronomical Union, held in Melbourne, Australia, 9–14 July 1995, volume 173 of IAU Symposia, 377–378, (Kluwer, Dordrecht, Netherlands, 1996). 5.10
Ansorg, M., “Timelike geodesic motions within the general relativistic gravitational field of the rigidly rotating disk of dust”, J. Math. Phys., 39, 5984–6000, (1998). 5.9
Arnold, V.I., Gusein-Zade, S.M., and Varchenko, A.N., Singularities of Differentiable Maps. Vol. 1: The Classification of Critical Points, Caustics and Wave Fronts, volume 82 of Monographs in Mathematics, (Birkhäuser, Boston, U.S.A., 1985). 2.2, 2, 3.2
Asaoka, I., “X-ray spectra at infinity from a relativistic accretion disk around a Kerr black hole”, Publ. Astron. Soc. Japan, 41, 763–778, (1989). 5.8
Atkinson, R.d’E., “On light tracks near a very massive star”, Astron. J., 70, 517–523, (1965). 4.3, 4.3, 5.1, 5.1, 5.1
Bao, G., Hadrava, P., and Ostgaard, E., “Emission-line profiles from a relativistic accretion disk and the role of its multiple images”, Astrophys. J., 435, 55–65, (1994). 5.1, 2
Bao, G., Hadrava, P., and Ostgaard, E., “Multiple images and light curves of an emitting source on a relativistic eccentric orbit around a black hole”, Astrophys. J., 425, 63–71, (1994). 5.1, 2
Bardeen, J.M., “Timelike and null geodesics in the Kerr metric”, in DeWitt, C., and DeWitt, B.S., eds., Black Holes. Les Astres Occlus. École d’été de Physique Théorique, Les Houches 1972, 215–239, (Gordon and Breach, New York, U.S.A., 1973). 5.8, 21, 22
Bardeen, J.M., and Cunningham, C.T., “The optical appearance of a star orbiting an extreme Kerr black hole”, Astrophys. J., 183, 237–264, (1973). 5.8
Bardeen, J.M., and Wagoner, R.V., “Uniformly rotating disks in general relativity”, Astrophys. J. Lett., 158, L65–L69, (1969). 5.9
Bardeen, J.M., and Wagoner, R.V., “Relativistic disks. I. Uniform rotation”, Astrophys. J., 167, 359–423, (1971). 5.9
Barraco, D., Kozameh, C.N., Newman, E.T., and Tod, P., “Geodesic Deviation and Minkowski Space”, Gen. Relativ. Gravit., 22, 1009–1019, (1990). 2.3
Barriola, M., and Vilenkin, A., “Gravitational field of a global monopole”, Phys. Rev. Lett., 63, 341–343, (1989). 5.5, 5.5
Bartelmann, M., and Schneider, P., “Weak Gravitational Lensing”, Phys. Rep., 340, 291–472, (2001). For a related online version see: M. Bartelmann, et al., (December, 1999), [Online Los Alamos Archive Preprint]: cited on 30 October 2003, http://arXiv.org/abs/astro-ph/9912508. 2.5
Bazanski, S.L., “Some properties of light propagation in relativity”, in Rembieliński, J., ed., Particles, Fields, and Gravitation. Proceedings of a conference held in Lodz, Poland, 15–19 April 1998, volume 453 of AIP Conference Proceedings, 421–430, (American Institute of Physics, Woodbury, U.S.A., 1998). 2.4
Bazanski, S.L., and Jaranowski, P., “Geodesic deviation in the Schwarzschild spacetime”, J. Math. Phys., 30, 1794–1803, (1989). 5.1
Beem, J., Ehrlich, P., and Easley, K., Global Lorentzian Geometry, volume 202 of Monographs and Textbooks in Pure and Applied Mathematics, (Dekker, New York, U.S.A., 1996), 2nd edition. 2.7, 2.7, 3, 3.1, 5.11
Bernal, A.N., and Sánchez, M., “Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes”, (January, 2004), [Online Los Alamos Archive Preprint]: cited on 30 May 2004, http://arXiv.org/abs/gr-qc/0401112. 3
Bernal, A.N., and Sánchez, M., “On smooth Cauchy hypersurfaces and Geroch’s splitting theorem”, Commun. Math. Phys., 243, 461–470, (2003). For a related online version see: A.N. Bernal, et al., (June, 2003), [Online Los Alamos Archive Preprint]: cited on 30 May 2004, http://arXiv.org/abs/gr-qc/0306108. 3
Berry, M.V., and Upstill, C., “Catastrophe optics: Morphologies of caustics and their diffraction patterns”, volume 18 of Progress in Optics, 257–346, (North-Holland, Amsterdam, Netherlands, 1980). 2.2
Bezerra, V.B., “Gravitational analogue of the Aharonov-Bohm effect in four and three dimensions”, Phys. Rev. D, 35, 2031–2033, (1987). 5.10
Bilić, N., Nikolić, H., and Viollier, R.D., “Fermion stars as gravitational lenses”, Astrophys. J., 537, 909–915, (2000). For a related online version see: N. Bilić, et al., (December, 1999), [Online Los Alamos Archive Preprint]: cited on 30 October 2003, http://arXiv.org/abs/astro-ph/9912381. 5.7
Birch, P., “Is the universe rotating?”, Nature, 298, 451–454, (1982). 2.5
Blake, C., and Wall, J., “A velocity dipole in the distribution of radio galaxies”, Nature, 416, 150–152, (2002). For a related online version see: C. Blake, et al., “Detection of the velocity dipole in the radio galaxies of the NRAO VLA Sky Survey”, (March, 2002), [Online Los Alamos Archive Preprint]: cited on 30 October 2003, http://arXiv.org/abs/astro-ph/0203385. 2.1
Blandford, R., and Narayan, R., “Fermat’s principle, caustics, and the classification of gravitational lens images”, Astrophys. J., 310, 568–582, (1986). 2.2
Blandford, R.D., “The future of gravitational optics”, Publ. Astron. Soc. Pac., 113, 1309–1311, (2001). For a related online version see: R.D. Blandford, (October, 2001), [Online Los Alamos Archive Preprint]: cited on 30 October 2003, http://arXiv.org/abs/astro-ph/0110392. 2.2
Born, M., and Wolf, E., Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light, (Cambridge University Press, Cambridge, U.K., 2002). 2.9
Boyer, R.H., and Lindquist, R.W., “Maximal analytic extension of the Kerr metric”, J. Math. Phys., 8, 265–281, (1967). 5.8
Bozza, V., “Gravitational lensing in the strong field limit”, Phys. Rev. D, 66, 103001, (2002). For a related online version see: V. Bozza, (August, 2002), [Online Los Alamos Archive Preprint]: cited on 30 October 2003, http://arXiv.org/abs/gr-qc/0208075. 4.3, 4.3, 15, 5.3, 5.6
Bozza, V., “Quasi-equatorial gravitational lensing by spinning black holes in the strong field limit”, Phys. Rev. D, 67, 103006, (2003). For a related online version see: V. Bozza, (October, 2002), [Online Los Alamos Archive Preprint]: cited on 30 October 2003, http://arXiv.org/abs/gr-qc/0210109. 4.3, 5.8
Bozza, V., Capozziello, S., Iovane, G., and Scarpetta, G., “Strong field limit of black hole gravitational lensing”, Gen. Relativ. Gravit., 33, 1535–1548, (2001). For a related online version see: V. Bozza, et al., (February, 2001), [Online Los Alamos Archive Preprint]: cited on 30 October 2003, http://arXiv.org/abs/gr-qc/0102068. 4.3, 15, 5.3
Bozza, V., and Mancini, L., “Time delay in black hole gravitational lensing as a distance estimator”, Gen. Relativ. Gravit., 36, 435–450, (2004). For a related online version see: V. Bozza, et al., (May, 2003), [Online Los Alamos Archive Preprint]: cited on 30 May 2004, http://arXiv.org/abs/gr-qc/0305007. 16, 5.3
Brill, D., “A simple derivation of the general redshift formula”, in Farnsworth, D., Fink, J., Porter, J., and Thompson, A., eds., Methods of local and global differential geometry in general relativity: Proceedings of the Regional Conference on Relativity held at the University of Pittsburgh, Pittsburgh, Pennsylvania, July 13–17, 1970, volume 14 of Lecture Notes in Physics, 45–47, (Springer, Berlin, Germany; New York, U.S.A., 1972). 2.4
Brill, D., “Observational contacts of general relativity”, in Israel, W., ed., Relativity, Astrophysics, and Cosmology: Proceedings of the Summer School held 14–26 August 1972 at the Banff Centre, Banff, Alberta, volume 38 of Astrophysics and space science library, 127–152, (Reidel, Dordrecht, Netherlands; Boston, U.S.A., 1973). 4.2
Brinkmann, H.W., “Einstein spaces which are mapped conformally on each other”, Math. Ann., 94, 119–145, (1925). 5.11
Bromley, B.C., Melia, F., and Liu, S., “Polarimetric Imaging of the Massive Black Hole at the Galactic Center”, Astrophys. J. Lett., 555, L83–L86, (2001). For a related online version see: B.C. Bromley, et al., (June, 2001), [Online Los Alamos Archive Preprint]: cited on 30 October 2003, http://arXiv.org/abs/astro-ph/0106180. 5.8
Bruckman, W., and Esteban, E.P., “An alternative calculation of light bending and time delay by a gravitational field”, Am. J. Phys., 61, 750–754, (1993). 5.1
Budic, R., and Sachs, R.K., “Scalar time functions: differentiability”, in Cahen, M., and Flato, M., eds., Differential Geometry and Relativity: A volume in honour of André Lichnerowicz on his 60th birthday, 215–224, (Reidel, Dordrecht, Netherlands; Boston, U.S.A., 1976). 2.7
Calvani, M., and de Felice, F., “Vortical null orbits, repulsive barriers, energy confinement in Kerr metric”, Gen. Relativ. Gravit., 9, 889–902, (1978). 5.8.0.2
Calvani, M., de Felice, F., and Nobili, L., “Photon trajectories in the Kerr-Newman metric”, J. Phys. A, 13, 3213–3219, (1980). 5.8.0.2
Calvani, M., Nobili, L., and de Felice, F., “Are naked singularities really visible?”, Lett. Nuovo Cimento, 23, 539–542, (1978). 5.8.0.2
Calvani, M., and Turolla, R., “Complete description of photon traj