1 The major steps

The dynamics of standard cosmology has two parts: firstly, the dynamics of the background spatially homogeneous and isotropic Friedmann–Lemaître–Robertson–Walker (‘FLRW’) model; and secondly, the dynamics of inhomogeneous perturbations about that model that lead to structure formation in the expanding universe. Understanding each has taken place in two phases: before inflation theory, and after inflation theory. The 1946 Lifshitz paper reprinted here “On the gravitational stability of the expanding universe” [1, 2] was the pioneer paper investigating the development of generic linearised inhomogeneities, and provided the basis for all further developments in this regard.

Cosmological models must of course be related to astronomical observations. One can observationally directly test the geometry of the background model, and can observe the structures that form in this model. However additionally, observational studies of large scale structures on the one hand, and of Cosmic Microwave Background Radiation (‘CMB’) anisotropies on the other, provide strong observational constraints on the background model [3, 4].

Table 1 Major steps in cosmological theory: background dynamics and observations; perturbed model dynamics and observations

The major innovative steps before inflation theory was developed are indicated in Table 1, with a key paper listed in each case.

  • Einstein in 1917 [8] led the way on the application of general relativity theory to cosmology, but his model was static. Friedmann in 1922 [5] and 1924 [9] developed the first dynamical cosmological models with \(k=+1\) and \(k=-1\) respectively, and so opened the way to studying dynamic universe models. Lemaître independently in 1927 [10] developed the cosmology of an expanding universe, with an idealised smoothed model related to observations of galactic redshifts and the physical history of the universe. Robertson’s masterly paper in 1933 [11] summarised the dynamics of these models; they included matter and radiation, as discussed by Tolman [12] and Landau and Lifshitz in 1941 [13] (both referred to by Lifshitz).

  • The relation of these models to the theory of observations was initiated by Lemaître, but he dealt with the theory of redshift only. Following earlier work by Slipher, Hubble in 1929 [14] observationally established the redshift-magnitude relation, but did not relate it to theory. Tolman and Whittaker related the theory to luminosity distances (see [11], p. 69). The theory of observations was developed by Heckmann, McVittie, Etherington, McCrea, Mattig, Robertson, Hoyle, and others, and culminated in Sandage’s great paper in 1961 [6] on direct observational tests to determine the parameters of the background FLRW geometry by predicting magnitudes, number counts, and redshifts of standard sources in these models as a function of those parameters.

  • The general theory of perturbations in the expanding universe was initiated by Lifshitz’ 1946 paper [1, 2], which is the topic of this note. It underlay all further studies of growth of inhomogeneities in cosmology, both Newtonian and relativistic, because its main results were then derived also in Newtonian form [15, 16], which had not been achieved before.

  • The relation to CMB anisotropy observations, the lynchpin of today’s observational cosmology [4], was pioneered by Sachs and Wolfe in 1967 [7] soon after the discovery of the CMB in 1965, directly building on the work of Lifshitz. This was then developed by many others, e.g. [17, 18].

Each step was a major step forward. The development of inflationary dynamics followed as another major step. After initial work by Gliner in 1965 [19], Englert-Brout-Gunzig in 1978 [20], and Starobinsky in 1979 [21], the theory of the background inflationary model was developed: the old inflation of Guth in 1981 [22] followed by the new inflationary models of Linde [23] and Albrecht and Steinhardt [24] in 1982, see e.g. Kolb and Turner ([25, pp. 261–320]). Then came the theory of quantum perturbations in an inflationary universe initiated by Mukhanov and Chibisov [26] and Starobinsky [27] in 1982, and developed by the participants at the Cambridge 1982 Nuffield workshop [28, 29], all building on the paper by Lifshitz.

The many observational papers on inflationary universe models that followed then built on the papers by Lifshitz and by Sachs and Wolfe, but extended to use a kinetic theory approach, and to consider effects of gravitational radiation. This is now the standard model of cosmology [30,31,32,33,34,35].

2 Early attempts

It was believed at the time the Lifshitz paper was written that inhomogeneities in cosmology led to structure growth. There were two approaches used to examine this: Newtonian theory calculations, and general relativistic models based on exact spherically symmetric models.

Jeans in 1902 [36] and 1928 [37] [cited by Lifshitz] developed the Newtonian theory of gravitational instability in a non-expanding medium with pressure. He defined the Jeans’ mass \(M_J\) given by

$$\begin{aligned} M_J = \left( \frac{\pi }{6}\right) \frac{c_s^3}{G^{3/2} \rho ^{1/2}} \end{aligned}$$
(1)

where \(c_s\) is the speed of sound, \(\rho \) the density, and G the Newtonian gravitational constant. He showed that aggregations of greater mass would collapse to give an exponential growth of the accreting mass because gravitational attraction would win, while aggregations of lesser mass would oscillate rather than grow because restoring pressure would win. On this basis he proposed that galaxies (‘nebulae’) would arise from gravitational instability in a uniform gas. However the calculation is not consistent, as commented by Lifshitz, because all regions of greater mass in an infinite static medium will also be collapsing, so the growth rate of the accreting mass relative to the density of the collapsing background is slower than predicted by Jeans’ analysis.
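The criterion in (1) can be evaluated numerically. The following is a minimal sketch, assuming SI units; the function names and the sample values of sound speed and density are illustrative choices, not taken from Jeans or Lifshitz.

```python
import math

G_SI = 6.674e-11  # Newtonian gravitational constant in m^3 kg^-1 s^-2


def jeans_mass(sound_speed, density, G=G_SI):
    """Jeans mass from Eq. (1): M_J = (pi/6) c_s^3 / (G^(3/2) rho^(1/2))."""
    return (math.pi / 6.0) * sound_speed**3 / (G**1.5 * math.sqrt(density))


def is_jeans_unstable(mass, sound_speed, density, G=G_SI):
    """Jeans criterion: aggregations with mass above M_J collapse,
    lighter ones oscillate because pressure wins."""
    return mass > jeans_mass(sound_speed, density, G)


if __name__ == "__main__":
    # Illustrative values only (assumptions, not from the source).
    c_s = 200.0   # sound speed in m/s
    rho = 1.0e-18  # density in kg/m^3
    m_j = jeans_mass(c_s, rho)
    print("Jeans mass [kg]:", m_j)
    print("2 M_J unstable?", is_jeans_unstable(2.0 * m_j, c_s, rho))
```

The scaling is the useful part: doubling the sound speed raises \(M_J\) by a factor of 8, while quadrupling the density halves it.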

Non-linear spherically symmetric exact inhomogeneous solutions of general relativity for a dust cloud were investigated by Lemaître in 1933 [38], Tolman in 1934 [12, 39], and Sen in 1934 [40]. This is a rather sophisticated exact solution of the Einstein equations, and it was a remarkable feat to derive it at that time and to analyse it from the point of view of observational cosmology, as Lemaître, Tolman and Sen did.

Robertson in 1933 commented on work done on condensations by Eddington, McCrea, McVittie and Lemaître up to that time ([11, pp. 81–82]). Lemaître developed implications for structure formation in 1934 [41], stating

For the perturbed motion, i.e., for a distribution of mass and initial velocities somewhat different from the idealized model, the motion at some places may be of a completely different type from the motion of the idealized model. The relation between the energy-constant h and the mass m may be such that the motion is of the collapsing type: the expansion velocity vanishes when the gravitation is not yet completely balanced by the cosmical repulsion and the expansion is followed by a contraction. The result of the perturbations is that, after the time \(t_I\), the system includes collapsing regions, distributed in the generally expanding space. That means that we obtain collapsing regions flying away one from another with velocities roughly proportional to the distance.

Indeed these authors proved that expanding homogeneous universes are unstable to condensations and rarefactions.

Gamow and Teller in 1939 [42] [cited by Lifshitz] use the Jeans criterion \(M > M_J\), expressed in terms of random velocities, in the context of the expanding universe. Using the Hubble law where \(v = H R\), they state

In an expanding space the formation of gravitational condensations can take place only when the average density is above a certain critical density \(\rho _0\)

$$\begin{aligned} \rho \ge 3 H^2/8 \pi G = \rho _0 \end{aligned}$$
(2)

They conclude

Under present conditions the formation of condensations in the universe on a great scale is impossible whatever the masses or velocities of particles may be.

That is, in effect they conclude that a bottom up scenario must apply to galaxy formation. They continue to discuss whether use of the Friedmann equation for the expanding universe implies one needs an open or closed universe in order that stars can form. Using a lookback-time argument, they conclude that either open or closed models are compatible with star formation.
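For orientation, the critical density in (2) is easy to evaluate. This is a rough sketch in SI units; the Hubble rate of 70 km/s/Mpc used in the example is an illustrative present-day value and an assumption here, not the value available to Gamow and Teller.

```python
import math

G_SI = 6.674e-11            # Newtonian gravitational constant, m^3 kg^-1 s^-2
METRES_PER_MPC = 3.0857e22  # one megaparsec in metres


def hubble_rate_si(h_km_s_mpc):
    """Convert a Hubble rate from km/s/Mpc to 1/s."""
    return h_km_s_mpc * 1.0e3 / METRES_PER_MPC


def critical_density(hubble_rate, G=G_SI):
    """Critical density of Eq. (2): rho_0 = 3 H^2 / (8 pi G), in kg/m^3."""
    return 3.0 * hubble_rate**2 / (8.0 * math.pi * G)


if __name__ == "__main__":
    H = hubble_rate_si(70.0)  # assumed illustrative value
    print("critical density [kg/m^3]:", critical_density(H))
```

Because \(\rho _0 \propto H^2\), the threshold in (2) is very sensitive to the adopted expansion rate.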

Gamow in 1948 [43] stated

The epoch when the radiation density fell below the density of matter has an important cosmogonical significance since it is only at that time that the Jeans principle of ‘gravitational instability’ could begin to work.

This reflects the point that at that time the issue of matter domination vs radiation domination had not yet been separated from the issue of tight-coupling vs free streaming. Lemaître in 1958 [44] summarised the problem, stating

The problem which cosmology has to face is how gas would finally arise from the primeval radiation and then organise itself into nebulae and secondly to understand what would arise from the part of the primeval radiation which would have escaped condensation into gases.

But the paper was before the CMB was discovered and in effect he regarded cosmic rays as the relic radiation from the big bang.

3 The Lifshitz paper

The Lifshitz paper [1, 2] is based on a general relativity analysis of an inhomogeneous spacetime that can be regarded as linearised round a FLRW homogeneous and isotropic model.

Linearisation of the General Relativity equations was of course not new. It had occurred in three contexts.

  • The Newtonian limit of General Relativity was obtained by Einstein in 1916 by linearising the equations of motion and the field equations around flat spacetime ([45, §21]; see [46, pp. 435–447]).

  • The prediction of gravitational waves was derived by Einstein in 1916 by approximate integration of the gravitational field equations ([47, 48], see [46, pp. 442 and 451–457]).

  • Particle motion in the linearised theory was studied by Einstein, Infeld, and Hoffmann in 1938 [49]; this approximation is used for studying planetary orbits in the solar system, see ([46, pp. 1091–1095]).

Lifshitz’ 1946 paper [1, 2] was the first general relativity study of general perturbations of a FLRW cosmology. The paper was extended in the second part of a paper by Lifshitz and Khalatnikov in 1963 [50], with corrections of a couple of errors in the original paper.

The paper starts with comments that previous work (referring to Jeans [37] and to Gamow and Teller [42]) is based in Newtonian theory, and general relativity might give different results. Also, strictly speaking, the Newtonian derivation is inconsistent, because one has to discard infinite forces, which is not a consistent mathematical operation.

This introductory section also gives the main conclusion:

In the expanding universe of the general relativity theory, the perturbations of most types decrease with time, thus showing no tendency to spontaneous increase. There also exist such perturbations which increase with time, but so slowly that they cannot produce large concentrations. Thus we can apparently conclude that gravitational instability is not the source of condensation of matter into separate nebulae.

The rest of the paper sets out the calculations leading to this conclusion.

3.1 Background model

Section 1, based on Tolman [12] and Landau and Lifshitz [13], summarizes the geometry and dynamics of FLRW universe models. Both “open” (negatively curved space sections: \(k=-1\)) and “closed” (positively curved space sections: \(k=+1\)) models are considered.

Both matter and radiation are considered. Introducing the conformal time \(\eta \), which simplifies the equations, the dynamic equations are derived, and solutions given for the cases of pressure free matter [(1,11) for \(k=+1\) and (1,15) for \(k=-1\)] and pure radiation [(1,12) for \(k=+1\) and (1,16) for \(k=-1\)] (see Footnote 1).

3.2 Linearised equations

Section 2 gives the perturbed metric:

$$\begin{aligned} g_{ik} \rightarrow g_{ik} + h_{ik} \end{aligned}$$
(3)

and derives the varied Christoffel symbols (2,3) and curvature tensor components (2,4). The matter tensor is similarly perturbed in (2,9).

Using a synchronous gauge choice (2,10):

$$\begin{aligned} h_{0\alpha } = 0, \,\,h_{00}=0, \end{aligned}$$
(4)

(see Sect. 5.1.1 below for a discussion of this choice), the perturbed field equations (2,11) are derived. Note that these coordinates are not comoving with the matter (\(\delta u_\alpha \ne 0\)). Combining this with the matter perturbations gives the final form of the field equations (2,15), (2,16). The fractional change in the density is given by (2,17) and the fractional change in the velocity by (2,18).

The coordinates are not uniquely determined by the synchronous gauge choice (4), hence some solutions to these equations are gauge modes (they can be eliminated by coordinate transformations). Equation (2,19) gives a general expression for such gauge modes.

3.3 Spherical harmonics

The key idea now is to Fourier analyze the perturbations into scalar, vector, and tensor parts of comoving wave number n. The relevant functions are defined in Sect. 3, starting from 4-dimensional spherical harmonics.

Because FLRW models are imbeddable in a pseudo-Euclidean 5-dimensional space (see Robertson’s discussion in [11, pp. 86–87]), one can find 4-dimensional spherical harmonics from the polynomial (3,1). Scalar 3-dimensional harmonics Q with wave number n obeying (3,4) follow from this. Vectorial harmonics \(S_\alpha \) satisfying (3,6) and tensorial ones \(G_{\alpha \beta }\) satisfying (3,9) can also be defined. Hence

  • one can define isotropic tensor functions \(Q^\alpha _\beta \) and trace-free tensor functions \(P^\alpha _\beta \) from the scalars Q, see (3,10);

  • one can define vector perturbations \(P_\alpha \) from the scalars Q, see (3,11);

  • one can define tensor perturbations \(S_\alpha ^\beta \) from the vectors \(S_\alpha \), see (3,12), but cannot define scalars from this vector;

  • one cannot define scalar or vector functions from the tensor functions \(G^\alpha _\beta \).

Using these functions, one can separate out linear perturbations into scalar, vector, and tensor parts, and do a spherical harmonic analysis in terms of comoving wavelength for arbitrary inhomogeneities. The number n determines the spatial periodicity of the relevant function. The physical wavelength is of order \(a/n\), so large n corresponds to small scales.

3.4 Perturbations of the density of the matter

Section 4 deals with scalar perturbations of open FLRW models (\(k=-1\)), accompanied by condensations or rarefactions of matter. The perturbation has the form

$$\begin{aligned} h^\beta _\alpha = \lambda (\eta )P^\beta _\alpha + \mu (\eta )Q^\beta _\alpha , \,\, h = \mu Q. \end{aligned}$$
(5)

where the wave number n is implied. The perturbed field equations give two coupled second order growth equations (4,2) for \(\lambda \) and \(\mu \). The resulting density and velocity perturbations are given by (4,3) and (4,4).

Two integrals are defined (4,5) that correspond to gauge modes. The order of the perturbation equations can be reduced by defining new variables \(\xi \), \(\zeta \) (4,6) to give two coupled first order perturbation equations (4,7) for \(\xi \) and \(\zeta \).

3.4.1 Radiation dominated era

At early times (\(\eta \ll 1\)), radiation dominates the universe, so the equation of state can be taken to be

$$\begin{aligned} p = \rho /3 \end{aligned}$$
(6)

and the equations for \(\xi \) and \(\zeta \) become (4,8). Two different cases occur:

Large scale perturbations: When n is small enough that \(n \eta \ll 1\), \(\delta \rho /\rho \) has a power law growth given by (4,9):

$$\begin{aligned} \delta \rho /\rho = \frac{n^2+4}{9}\left( C_1 \eta + C_2 \eta ^2\right) Q\,\,\,(n \eta \ll 1). \end{aligned}$$
(7)

While this increases with time, \(C_1\ll \eta _0\) and these perturbations remain small.

Small scale perturbations: When n is large enough that \(n \eta \gg 1\), so we have oscillations, \(\delta \rho /\rho \) is given by (4,10):

$$\begin{aligned} \delta \rho /\rho = C \left( \frac{i\,n}{\sqrt{3}}\right) \exp \left( \frac{i \,n \eta }{\sqrt{3}}\right) \,\,\,\left( \frac{1}{n}\ll \eta \ll 1\right) \end{aligned}$$
(8)

(C complex). These are sound waves propagating with speed

$$\begin{aligned} u = \sqrt{\frac{dp}{d(\rho /c^2)}} = \frac{c}{\sqrt{3}}. \end{aligned}$$
(9)

The amplitude of the variations remains constant.

3.4.2 Matter

At late stages of the expansion, the energy-momentum tensor is matter dominated, so we can take the equation of state to be

$$\begin{aligned} p = 0 \end{aligned}$$
(10)

and the perturbation equations can be integrated to give the complex expression (4,11) for \(\delta \rho /\rho \), which includes curvature effects at late times. Before such effects are appreciable (\(\eta \ll 1\)), there is a decaying mode (4,12) and a growing mode. Lifshitz separates the latter into two cases: small n gives the curvature dominated form (4,13):

$$\begin{aligned} \delta \rho /\rho = \frac{C_1}{60}(n^2+4)\eta ^2 Q,\,\,\,(n\eta \ll 1), \end{aligned}$$
(11)

which does not become large, and large n gives the matter dominated form (4,14):

$$\begin{aligned} \delta \rho /\rho = \frac{C_1}{60}n^2\eta ^2 Q,\,\,\,(1/ n \ll \eta \ll 1). \end{aligned}$$
(12)

These perturbations increase proportionally to the radius and can become large, but if originating in statistical fluctuations are insignificant relative to those required to produce nebulae or even stars.

For late stages of the expansion (\(\eta \gg 1\)) curvature dominates, and one gets (4,15) for \(\delta \rho /\rho \), tending to a constant or decreasing.

3.4.3 Matter and radiation

When matter and radiation must both be taken into account, one can introduce the speed of sound \(u = \sqrt{dp/d\rho } \ll 1\). The pressure terms are only significant if \(un\eta \gg 1\). The solution for \(\delta \rho /\rho \) is then (4,16):

$$\begin{aligned} \delta \rho /\rho = \frac{C n^2}{3\sqrt{u}\eta }\exp \left( i\,n\int u d\eta \right) \end{aligned}$$
(13)

giving “sound waves” propagating at speed u. The condition for such “geometrical acoustics” is \(u n \eta \gg 1\). These perturbations also cannot become large.

These perturbations were assumed to be adiabatic. Lifshitz remarks that if this is not the case, one must take into account in addition entropy changes and thermal conduction. The gravitational field equations will contain additional terms due to the variation of entropy.

3.5 Rotational perturbations

Section 5 deals with vector perturbations, which can generate vorticity. The perturbation has the form

$$\begin{aligned} h^\beta _\alpha = \sigma (\eta )S^\beta _\alpha \end{aligned}$$
(14)

where again the comoving wave number n is suppressed. The perturbed field equations give the growth equation (5,2) for \(\sigma \), with solutions (5,3):

$$\begin{aligned} \sigma =const.\int \frac{d\eta }{a^2}. \end{aligned}$$
(15)

with perturbation of velocity (5,4). For the radiation dominated case (6), the solution is (5,5):

$$\begin{aligned} \sigma = - \frac{C}{\eta }, \quad \, a \delta u^\alpha = \frac{C}{8}\, . \end{aligned}$$
(16)

For the matter dominated case (10), the solution is (5,6), which when curvature is negligible gives

$$\begin{aligned} \sigma = -\frac{8C}{\eta ^3}. \end{aligned}$$
(17)

In all cases the perturbation decreases with time.
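As a quick check of these scalings, assume the small-\(\eta \) background forms \(a = a_1 \eta \) for radiation and \(a = a_0 \eta ^2\) for pressure-free matter (the constants \(a_1\), \(a_0\) are notation introduced here for illustration, not Lifshitz' normalisation). The integral in (15) then evaluates to

$$\begin{aligned} \int \frac{d\eta }{a_1^2 \eta ^2} = -\frac{1}{a_1^2\, \eta }, \qquad \int \frac{d\eta }{a_0^2 \eta ^4} = -\frac{1}{3 a_0^2\, \eta ^3}, \end{aligned}$$

which reproduces the \(\eta ^{-1}\) and \(\eta ^{-3}\) dependence of (16) and (17), up to constant factors absorbed into C.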

3.6 Gravitational waves

Section 6 deals with tensor perturbations, which generate gravitational waves. The perturbation has the form

$$\begin{aligned} h^\beta _\alpha = \nu (\eta )G^\beta _\alpha . \end{aligned}$$
(18)

The matter is unperturbed: \(\delta u_\alpha = 0, \,\, \delta \rho = 0\) and the growth equation for \(\nu \) is (6,2). For the case of radiation (6), the solution is (6,3):

$$\begin{aligned} \nu = \frac{1}{\sinh (\eta )}\left( C_1 \sin n\eta + C_2 \cos n\eta \right) \!. \end{aligned}$$
(19)

For the case of pressure-free matter (10) the solution is (6,4). For small \(\eta \) and not large values of n this gives (6,5):

$$\begin{aligned} \nu = C_1 + \frac{C_2}{\eta ^3}, \,\,\, \left( \eta \ll \frac{1}{n}\right) \end{aligned}$$
(20)

while for small \(\eta \) and very large n one gets (6,6):

$$\begin{aligned} \nu = \frac{4n}{\eta ^2}\left( C_1 \cos n\eta - C_2 \sin n\eta \right) ,\,\left( \frac{1}{n} \ll \eta \ll 1\right) \end{aligned}$$
(21)

The periodic factor in (19), (21) shows these are gravitational waves propagating at the speed of light (with wave vector \(k = n/a\) and phase \(\int c k dt = n \eta \)) and amplitude decreasing as \(1/a\).
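The quoted phase follows directly from the definition of conformal time. A short check, assuming the convention \(c\, dt = a\, d\eta \) (not spelled out in the summary above):

$$\begin{aligned} \int c\, k\, dt = \int \frac{n}{a}\, a\, d\eta = n \eta , \end{aligned}$$

so the phase advances uniformly in \(\eta \), as expected for a wave travelling at the speed of light in conformal coordinates.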

3.7 Extension

Lifshitz and Khalatnikov in 1963 [50] reproduced and extended the results of the Lifshitz (1946) paper. The abstract of this later paper includes

The second part of the paper contains an investigation of the gravitational stability of the isotropic model. There are grounds to believe that this model gives an adequate description of the present-day state of the universe considered on a large scale. The behaviour in time of various kinds of small perturbations to the isotropic model is studied. It is shown that perturbations which do not disturb the uniformity of the distribution of matter are either damped with time or remain constant. Perturbations which involve changes in the density of matter behave differently in expanding and contracting universes. In an expanding universe the changes in the density of matter grow slowly with time for long wavelength perturbations and decrease with time for short wavelength perturbations. The contracting universe, however, is essentially unstable against such perturbations.

The new part is the statement that contracting universes are unstable to scalar perturbations.

4 The formation of structure: pre inflation

Lifshitz’ paper was ignored for a while by many standard texts on cosmology, such as McVittie [51], Bondi [52], Misner, Thorne and Wheeler [46], and Sciama [53], though McVittie’s section 9.8 in [51] does consider non-uniform models using Newtonian like coordinates for scalar perturbations, representing N condensations immersed in a distribution of perfect fluid (equation (9.814)). But McVittie misses the harmonic decomposition and makes no reference to Lifshitz.

The Lifshitz paper was however noticed by some of those interested in structure formation.

4.1 The Newtonian version

Bonnor in 1957 [15] extended Lifshitz’ work by studying the Newtonian theory of cosmological perturbations in an expanding universe, and derived a Jeans’ length criterion for oscillations in these models. Growth will occur for wavelengths such that

$$\begin{aligned} \lambda ^2 > \frac{\pi }{G\rho _0}\frac{dp}{d\rho }; \end{aligned}$$
(22)

(his equation (3.21)); for perturbations larger than this wave length in an expanding model, the density perturbations have to solve his equation (5.6), giving a power law growth. This confirms that there is no hope of accounting for the formation of nebulae from statistical fluctuations ([15, pp. 112–113]).

Peebles in his 1971 book Physical cosmology [16] derives a modified Newtonian limit from the weak field limit of general relativity ([16, p. 214]): his equations (35), (36) imply

$$\begin{aligned} \nabla ^2(\phi ) = 4 \pi G (\rho + 3 p) \end{aligned}$$
(23)

so giving the general relativistic active gravitational mass \(\rho + 3p\) [54, 55] rather than the Newtonian version \(\rho \). From the Newtonian matter conservation and momentum equations he derives ([16, pp. 215–217]) the Lifshitz growth rate for pressure free matter, which as Bonnor [15] showed gives the same power law result (his (45)) as Lifshitz when curvature can be ignored:

$$\begin{aligned} \delta = A(r) t^{2/3} + B(r) t^{-1} \end{aligned}$$
(24)

so galaxies cannot grow from gravitational instability because this grows too slowly.
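To see how slow the power law in (24) is, one can compute the amplification of each mode between two epochs. The sketch below assumes a spatially flat, pressure-free background with \(a \propto t^{2/3}\), so that the growing mode scales as \(1/(1+z)\); the function names are illustrative.

```python
def growing_mode_factor(t_initial, t_final):
    """Amplification of the growing mode of Eq. (24): delta ~ t^(2/3)."""
    return (t_final / t_initial) ** (2.0 / 3.0)


def decaying_mode_factor(t_initial, t_final):
    """Suppression of the decaying mode of Eq. (24): delta ~ t^(-1)."""
    return t_initial / t_final


def growing_mode_factor_from_redshift(z_initial, z_final=0.0):
    """Same growing-mode factor in terms of redshift.

    Assumes a ~ t^(2/3), so delta ~ a = 1/(1+z).
    """
    return (1.0 + z_initial) / (1.0 + z_final)


if __name__ == "__main__":
    # An eightfold increase in cosmic time only quadruples the perturbation.
    print(growing_mode_factor(1.0, 8.0))
    # Growth between redshift 1100 and today in this idealised model.
    print(growing_mode_factor_from_redshift(1100.0))
```

In this idealised model a perturbation grows only by a factor of order \(10^3\) between redshift 1100 and today, whereas exponential growth would give many e-folds over the same interval.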

Adding in pressure (46), the Jeans length ([16, pp. 217–219]; see Footnote 2)

$$\begin{aligned} \lambda _J = \left( \frac{\pi k T}{G \rho m_p}\right) ^{1/2} = \,c_s \left( \frac{\pi }{G \rho }\right) ^ {1/2} \end{aligned}$$
(25)

separates long wavelengths that behave as (24), because gravitational attraction dominates, from short wavelengths that oscillate like an acoustic wave, because pressure prevents collapse. This is essentially what was proven by Lifshitz (see Sect. 3.4.1 above), although he did not define a Jeans’ length for his relativistic models.
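The two forms of (25) agree when the sound speed is taken as \(c_s^2 = kT/m_p\), which is what the equality implies. A minimal numerical sketch in SI units follows; the function names and sample values are illustrative assumptions.

```python
import math

G_SI = 6.674e-11       # Newtonian gravitational constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23     # Boltzmann constant, J/K
M_P = 1.67262192e-27   # proton mass, kg


def sound_speed_from_temperature(temperature):
    """Sound speed implied by Eq. (25): c_s = sqrt(k T / m_p)."""
    return math.sqrt(K_B * temperature / M_P)


def jeans_length_from_sound_speed(sound_speed, density):
    """Second form of Eq. (25): lambda_J = c_s * sqrt(pi / (G rho))."""
    return sound_speed * math.sqrt(math.pi / (G_SI * density))


def jeans_length_from_temperature(temperature, density):
    """First form of Eq. (25): lambda_J = sqrt(pi k T / (G rho m_p))."""
    return math.sqrt(math.pi * K_B * temperature / (G_SI * density * M_P))


def regime(wavelength, sound_speed, density):
    """Classify a perturbation: longer than lambda_J grows, shorter oscillates."""
    if wavelength > jeans_length_from_sound_speed(sound_speed, density):
        return "grows"
    return "oscillates"


if __name__ == "__main__":
    T, rho = 3000.0, 1.0e-18  # illustrative values, not from the source
    c_s = sound_speed_from_temperature(T)
    print(jeans_length_from_sound_speed(c_s, rho))
    print(jeans_length_from_temperature(T, rho))
```

Squaring the second form gives \(\lambda _J^2 = \pi c_s^2/(G\rho )\), which is the threshold appearing in Bonnor's criterion (22) with \(dp/d\rho = c_s^2\).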

4.2 Jeans’-length studies and exact spherical models

Many papers then developed structure formation studies largely based on the changing Jeans’ length as the key parameter, because of the realisation of the importance of the history of the thermal background radiation.

It had been known since the 1930s that the early universe might involve radiation; this was implied by Lemaître’s idea of the ‘primeval atom’, and inter alia, Tolman’s book [12] discussed matter, radiation, and entropy relations in the early universe. Lifshitz’ paper [1, 2] explicitly incorporated the idea of an early radiation dominated era, an intermediate matter dominated era, and a late curvature dominated era. What was new after the 1965 CMB discovery was the idea that matter would be ionised at early times, and so there would be tight coupling of the hydrogen-helium plasma with photons until recombination took place on a last scattering surface (LSS) at a redshift of about 1100 [56]. During this tight coupling time, photon oscillations would necessarily be accompanied by baryon oscillations. After decoupling, the resulting blackbody radiation would be freely propagating [16, 53, 57].

Many papers then proceeded on the basis one did not need a full relativistic analysis such as given by Lifshitz; the variation of the Jeans’ length with time would govern structure formation, which could be investigated using Newtonian theory and specifically the results (25), (24).

An example is Doroshkevich, Zeldovich, and Novikov’s 1967 paper [58]. A nice summary of these arguments is given in a popular book by Silk [59, pp. 172–182].

One could also use general relativistic models based on the spherical shell (Lemaître-Tolman) pressure free models [38, 39] where each shell evolves independently, as in Silk and Wilson [60]. Bonnor [61] studied the formation of the “nebulae” using an exact relativistic model, consisting of two FLRW regions with different densities, with an LT region interpolating between them. The conclusion was the same: statistical fluctuations in density are too small to produce the “nebulae”.

Rees in 1971 [62] gave an excellent summary of this phase of understanding, mainly based on (i) the spherical shell model for \(p=0\) ([62, pp. 316–318]); (ii) the way Jeans length varies with time together with the Lifshitz growth law (24), for \(p \ne 0\) ([62, pp. 316–318 and 322–326]). The outcome remained the same as found by Lifshitz:

The original studies were motivated by the hope that galaxies and clusters might have condensed from random \(\sqrt{N}/N\) fluctuations which would naturally be expected in a universe composed of discrete atoms. For a galactic mass of \(\simeq 10^{11}M_\odot \) however, “statistical” fluctuations are only \(\simeq 10^{-34}\), and these would not have condensed out by the present epoch unless one assumes that the growth was initiated at a stage when the particle horizon encompassed only a few atoms... the problem is that the overall expansion transforms the growth rate of the instability from an exponential to a (much slower) power law, which means one must either suppose that the galaxies come into being at an exceedingly early epoch, or else assume larger initial amplitudes. ([62, pp. 319–320])

A more optimistic conclusion is given in Peebles’ 1980 book [63, p. 22].

4.3 General relativistic studies

However, many writings considered the full relativistic perturbation theory, as initiated by Lifshitz, e.g. Harrison [64, 66], Field and Shepley [65], Zel’dovich [67], Weinberg [57] and Peebles [63].

Harrison [64] derived the equations for density perturbations in a longitudinal gauge, found to be free from gauge modes, and later [66] considered the relation of Jeans’ length to scale, deriving the scale free spectrum also found by Zel’dovich [67].

Weinberg [57, pp. 561–588] includes sections on formation of galaxies as affected by the changing Jeans Length, the Newtonian theory of small fluctuations, and the GR theory of small fluctuations, and refers to Lifshitz as showing that disturbances at wave numbers below \(k_J\) grow like powers of t rather than exponentially, as in Jeans’ static case.

Peebles [63, pp. 306–362], after discussing the Newtonian theory, gives a comprehensive survey of GR perturbation theory using the synchronous gauge. He considers acoustic waves, including identification of the baryon acoustic oscillation scale, obtains the transfer function, and predicts the effect of acoustic waves on it. He also considered effects on CMB anisotropies (see below).

4.4 Effect on radiation

One can observationally test cosmological perturbation theory firstly by observations of the power spectrum and angular n-point correlation functions of matter, as pioneered by Peebles [63], and secondly, by observing their effect on the CMB anisotropy power spectrum.

The latter approach was pioneered by Sachs and Wolfe in their 1967 paper “Perturbations of a Cosmological Model and Angular Variations of the Microwave Background” [7]. Based on Lifshitz’ 1946 work, they consider perturbed field equations for a \(k=0\) FLRW model, and characterise the solution for \(p=0\) and for \(p=\rho /3\) and its gauge freedom in their equations (22), (23) (which were not given by Lifshitz [1, 2]). The Sachs–Wolfe derivation of their form of the solution is rather enigmatic; Ehlers gives a full derivation in an introductory note to the reprinted Golden Oldie version of the paper [7]. Sachs and Wolfe comment on the physical meaning of the tensor, vector, and scalar perturbations (their §II.d), characterising the same oscillating and growing density modes as identified by Lifshitz. They then calculated photon orbits and redshifts in the perturbed model (their equations (37) and (39)) and hence the effect on the CMB anisotropies (their equation (43)). The effect usually called the Sachs–Wolfe effect is given by their equation (45). They also give the angular autocorrelation function (their equation (54)).

Sunyaev and Zeldovich in 1970 [17] developed the theory in the context of the thermal history of the universe, stating

A distinct periodic dependence of the spectral density of the perturbations on wavelength is peculiar to adiabatic perturbations.

This approach was developed further in Doroshkevich, Zel’dovich, and Sunyaev in 1978 [68]. These models are based on redshift effects for single photons. However a fuller analysis requires considering a distribution of photons.

4.5 General relativistic kinetic theory

Kinetic theory in FLRW models was presented by Robertson in 1936 [69], referring back to pioneering work by Lemaître in 1930 [70], Heckmann in 1931 [71], and Milne. He gives the solution of the Liouville equation in a FLRW universe. However to deal with structure formation and associated observations, one needs the Liouville and Boltzmann equations in generic spacetimes.

Following earlier work inter alia by Synge in 1934 [72], Walker in 1936 [73] (see Footnote 3), Taub, Synge, Sasaki, Ehlers [54], and others, Sachs and Ehlers in their 1968 Brandeis lectures [74] developed kinetic theory with an aim to application in cosmology. Kinetic theory in a general spacetime for both matter and massless particles is presented in detail by Ehlers in his 1969 Varenna lectures [75] and by Stewart [76]. Because this is a general theory applicable to any spacetime whatever, it applies in particular to perturbed FLRW models.

The specific application to such models was given by Peebles and Yu in 1970 [18], where the CMB fluctuation spectrum in a baryon-radiation universe is calculated using time orthogonal coordinates. They gave the distribution function relaxation collision equation, the zero pressure solution (their (49)), and derived the CMB fluctuation peak (Fig 8) and matter peaks (Fig 5). This is the start of the present day understanding of the matter and radiation power spectra.

Much work followed these pioneering papers. The theory was expanded in detail by Peebles in his 1980 book [63], using the Liouville equation for collisionless particles (pp. 345–352), obtaining the transfer function (pp. 358–363), and predicting acoustic waves (p. 362) and associated CMB anisotropies (p. 363).

The Boltzmann equation approach has been developed by many others since then, e.g. Ma and Bertschinger in 1995 [77], and is presented in depth in recent texts such as Dodelson [31] and Durrer [33].

5 Gauge problem: errors and misconceptions

The papers that used GR methods, following Lifshitz’ paper, came up against a key problem: the gauge issue, namely, that there is a gauge freedom in fitting coordinates to a perturbed FLRW model. This was already known to Milne in 1935, see [78, §§3 and 81]. One can choose a constant time surface \(\{t = const\}\) so as to make a density inhomogeneity

$$\begin{aligned} \delta \rho (t,x^i):= \rho (t,x^i+\delta x^i) - \rho (t,x^i) \end{aligned}$$
(26)

vanish: simply choose \(\{t =const\}\) surfaces to be the same as the \(\{\rho =const\}\) surfaces, then \( \rho (t,x^i+\delta x^i) = \rho (t,x^i)\, \Rightarrow \, \delta \rho = 0\). Indeed the value determined for \(\delta \rho (t,x^i)\) is arbitrary because of the coordinate freedom \(t \rightarrow t'= t'(t,x^i)\) allowed by General Relativity. This allows apparent density variations that are in fact gauge modes.
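To make this arbitrariness explicit, here is a first-order sketch. The notation is an assumption introduced only for illustration: \(\bar{\rho }(t)\) denotes the background FLRW density and \(\xi ^0(t,x^i)\) a small shift of the time coordinate. Since \(\rho \) is a scalar, measuring the perturbation relative to \(\bar{\rho }\) on the new time slices gives

$$\begin{aligned} t \rightarrow t' = t + \xi ^0(t,x^i) \quad \Rightarrow \quad \delta \rho ' = \delta \rho - \dot{\bar{\rho }}\,\xi ^0 , \end{aligned}$$

so the choice \(\xi ^0 = \delta \rho /\dot{\bar{\rho }}\), which is available in an expanding model where \(\dot{\bar{\rho }} \ne 0\), sets \(\delta \rho ' = 0\) to this order. Conversely, a non-zero \(\xi ^0\) applied to an exactly homogeneous model generates an apparent perturbation \(-\dot{\bar{\rho }}\,\xi ^0\) that is a pure gauge mode.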

The general coordinate freedom allowed by general relativity was what had already plagued studies of gravitational radiation. In the case of cosmological perturbations, Lifshitz was aware of it and handled it carefully in his 1946 paper [1, 2], as did Sachs and Wolfe in their 1967 paper [7]; but many papers did not, and many errors resulted. As stated by Kodama and Sasaki [79], referring to the Lifshitz and Khalatnikov paper [50]:

Although their analysis was entirely correct, their results were often misinterpreted and misused by a number of authors who subsequently considered the generation and growth of cosmological density perturbations on super-horizon scales. This unfortunate situation arose because too much attention was paid to the growth rate of the density perturbation without realizing that it essentially depends on the choice of coordinates. In addition the fact that the equations for density perturbations in the synchronous gauge, which was used in the analysis by Lifshitz and Khalatnikov, are too complicated to allow the elimination of unphysical gauge modes in general gave rise to a number of incorrect conclusions in the literature.

This was highlighted in a 1980 paper by Press and Vishniac [80] demonstrating a number of erroneous results in the literature because the gauge issue was not handled correctly. There are a number of ways to handle the issue: essentially, fix the gauge as far as possible and then track the remaining gauge freedom very carefully (Sect. 5.1 below), with various choices as to the coordinates chosen, or use gauge invariant quantities (Sects. 5.2 and 5.3).

5.1 Gauge fixing

The first method used was to track and use up gauge freedom as far as possible, as in Lifshitz [1, 2] and Sachs and Wolfe [7].

A key example is the Press and Vishniac paper [80], where they worked in the synchronous gauge but carefully eliminated two unphysical gauge modes associated with this gauge. This enabled them to reveal the source of some erroneous ideas about density perturbations on super-horizon scales explicitly.

While there is a whole variety of gauges that can be used (see Bardeen [81]), there are two commonly used gauges in cosmological perturbation theory: the Synchronous gauge (Sect. 5.1.1) and the Conformal Newtonian gauge (Sect. 5.1.2). They are compared in detail by Ma and Bertschinger [77].

5.1.1 Synchronous gauge

In general relativity, a synchronous reference system is a coordinate system in which the metric is fitted to a set of imagined geodesically and irrotationally moving observers by using comoving proper time for these observers:

$$\begin{aligned} ds^2=-dt^2+h_{ab}dx^adx^b,\,\,\,u^a = \delta ^a_0. \end{aligned}$$
(27)

(\(a, b = 1, 2, 3\)) where \(h_{ab}\) is a positive definite spatial metric. This is the same as the ADM formalism with the lapse function set to unity: \(N=1\), and the shift vector set to zero: \(N^a=0\). Any metric can locally be put into this form by a coordinate transformation by choosing a family of irrotational geodesic observers and an initial surface \(t=0\), and constructing the geodesics orthogonal to this surface. The coordinates are “synchronous” for these observers. However, if there are pressure gradients or matter has vorticity, these are fictitious observers (they do not correspond to the motion of any matter) and the coordinates are not uniquely defined because the initial spacelike hypersurface \(\{t=0\}\) can be chosen arbitrarily. The synchronous gauge is used in numerical studies because of its relation to ADM.
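As a reminder of the ADM form referred to here (a standard expression, written with lapse \(N\) and shift \(N^a\); it is quoted for orientation and is not taken from the papers cited above), the general line element split into time and space is

$$\begin{aligned} ds^2=-N^2dt^2+h_{ab}\left( dx^a+N^a dt\right) \left( dx^b+N^b dt\right) , \end{aligned}$$

and setting \(N=1\), \(N^a=0\) recovers the synchronous form (27).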

Its merit is that it gives the simplest form of the field equations because the orthogonal world lines \(x^a(t)\) with tangent vector \(u^a = dx^a/dt = \delta ^a_0\) are geodesic. However any matter or radiation moving non-geodesically or with vorticity cannot be comoving: its 4-velocity will differ from \(\delta ^a_0\), having non-zero spatial components (see Footnote 4). The gauge therefore does not relate well to the Newtonian limit. Furthermore in general the family of fictitious observers will develop singularities where the orthogonal curves intersect. The gauge is not completely fixed: there is a residual growing mode that remains, see [82] for a discussion.

This is the metric form used by Lifshitz [1, 2], Sachs and Wolfe [7], and Peebles and Yu [18].

5.1.2 Conformal Newtonian gauge

The conformal Newtonian gauge is discussed by Bardeen [81] and Mukhanov, Feldman and Brandenberger [83]. For scalar perturbations it has the form

$$\begin{aligned} ds^2 =a^2(\tau )\left[ -(1+2\varPsi )d\tau ^2+(1-2\varPhi )\delta _{ab}dx^adx^b\right] \end{aligned}$$
(28)

where the conformal time coordinate \(\tau \) is related to the proper time t by the transformation \(dt=a(t)d\tau \); when anisotropic stresses vanish, \(\varPhi = \varPsi \). It is most commonly used for structure formation studies in the linear regime. It still has residual coordinate freedom. This form was already given by McVittie in his 1932 paper “Condensations in an expanding universe” [84].

Its merit is that it relates well to the Newtonian limit because the orthogonal world lines \(x^a(t)\) with tangent vector \(U^a = dx^a/dt = \delta ^a_0\) are non-geodesic when \(\varPsi \) varies spatially, and \(\varPsi \) then corresponds to the Newtonian potential. However any matter or radiation moving geodesically or with vorticity will not be comoving: its 4-velocity \(u^a\) will differ from \(\delta ^a_0\), having non-zero spatial components. In those cases the world lines therefore correspond to a family of fictitious Newtonian-like observers relative to whom matter moves.
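The role of \(\varPsi \) can be made explicit with a short first-order sketch. It assumes the standard result that observers moving orthogonally to time slices with lapse \(N\) and vanishing shift have 4-acceleration \(\dot{u}_a = h_a^b\,\partial _b \ln N\); the symbol \(\dot{u}_a\) is introduced here only for illustration. From (28), \(N = a\sqrt{1+2\varPsi } \simeq a(1+\varPsi )\), so for the spatial components

$$\begin{aligned} \dot{u}_a = \partial _a \ln \left[ a(\tau )\,(1+\varPsi )\right] \simeq \partial _a \varPsi , \end{aligned}$$

which has the form of the gradient of a Newtonian potential and vanishes only if \(\varPsi \) is spatially constant. In the synchronous form (27), \(N=1\) and the same expression gives \(\dot{u}_a = 0\): the reference world lines are geodesic, as stated in Sect. 5.1.1.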

5.2 Gauge invariant formalism

By keeping track of the way gauge transformations affect geometrical and physical variables, one can form combinations that are gauge invariant. This approach, developed by Bardeen, is presented in his very influential 1980 paper “Gauge-invariant cosmological perturbations” [81].

For scalar perturbations, on defining vector and tensor harmonics from scalar harmonics as Lifshitz did (Sect. 3.3 above), the metric tensor is written in the form (2.14) (see Footnote 5), which is essentially a harmonically analysed form of the conformal Newtonian gauge.

The most general possible gauge transformation associated with a scalar perturbation is the result of the coordinate transformation

$$\begin{aligned} {\tilde{\tau }}= & {} \tau + T(\tau )\, Q^{(0)}(x^\mu ),\end{aligned}$$
(29)
$$\begin{aligned} {\tilde{x}}^\alpha= & {} x^\alpha + L^{(0)}(\tau ) \,Q^{(0)\alpha }(x^\mu ) \end{aligned}$$
(30)

with \(T(\tau )\) and \(L^{(0)}(\tau )\) arbitrary functions of \(\tau \), and \(Q^{(0)}\), \(Q^{(0)\alpha }\) scalar and vector harmonics. One can find variables invariant under these transformations, so defining gauge invariant potentials \(\varPhi _A\) (3.9) and \(\varPhi _H\) (3.10), a gauge invariant velocity \(v_S^{(0)}\) (3.11), and two gauge invariant density perturbation variables: \(\epsilon _m\) (3.13) and \(\epsilon _g\) (3.14). The corresponding harmonically decomposed field equations are (4.3) and (4.4), which have a very simple form; the momentum equation is (4.5), and the energy equation is (4.8). Bardeen gives solutions for these gauge invariant variables in specific models for the evolution of the universe. The approach is usually implemented using a harmonic decomposition of the Conformal Newtonian gauge.

The method is very effective and widely used, but the variables do not have a straightforward geometrical meaning: they have to be interpreted in various coordinate systems. Two major papers developing perturbation theory taking this all into account are those of Kodama and Sasaki [79] and of Mukhanov, Feldman and Brandenberger [83]. They both include the perturbed form of the Boltzmann equation.

5.3 The 3 + 1 covariant and gauge invariant approach

An alternative approach is to consider the gauge problem as being the way one fits a FLRW model to a lumpy universe [85]; the gauge freedom is really the freedom of the map from the background spacetime into the lumpy, more realistic model. To develop this one uses a 1+3 splitting relative to a preferred family of observers with 4-velocity \(u^a\) developed by the Hamburg group (Heckmann, Schücking, Ehlers, Kundt, Sachs, and Trümper), summarised by Hawking in 1966 [86], Kristian and Sachs in 1966 [87], and Ellis in 1971 [55]. From this one can define gauge-invariant and covariant geometric and physical variables for cosmology.

Hawking first used this method in a very nice paper in 1966 [86], but his density perturbation variable was not gauge invariant. The method was developed further by Olson in 1976 [88], but was still ambiguous [81]. Ellis et al. in 1989 [85, 90] and 1992 [20] completed the Hawking approach by using gauge invariant variables for the density perturbation.

This 3+1 gauge invariant and covariant formalism centres on the comoving fractional spatial density gradient, defined as

$$\begin{aligned} \mathcal{D}_a:= h_a^b \rho _{,b}/\rho \end{aligned}$$
(31)

for an observer with 4-velocity \(u^a\) (\(u_au^a = -1\)), where \(h_{ab}:=g_{ab}+u_au_b\) projects orthogonal to \(u^a\) [54, 55, 87]. The 3+1 choice of \(u^a\) is similar to gauge freedom, in the sense that any \(u^a\) can be chosen (as long as it reduces to the background Ricci eigenvector when perturbations are switched off) and this choice changes the perturbative variables like \(\mathcal{D}_a\). However it can always be chosen to be tangent to a physically well defined set of world lines, for example the timelike eigenvector of the Ricci tensor; then the associated quantities such as the expansion, shear, acceleration, and vorticity of these world lines [54, 55, 87] will be physically meaningful, as will the matter variables such as the energy density and pressure associated with this reference frame, and the electric and magnetic parts of the Weyl tensor. Its field equations derive from the Ricci identities for \(u^a\), and can be written out in detail in relation to any specific coordinate system.
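The gauge invariance of the variable (31) can be seen in one line. In an exact FLRW model the density depends only on cosmic time, so its gradient is parallel to \(u_a\) and the projection orthogonal to \(u^a\) removes it:

$$\begin{aligned} \rho _{,b} = -u_b\,{\dot{\rho }} \quad \Rightarrow \quad h_a^b \rho _{,b} = -\left( \delta _a^b + u_a u^b\right) u_b\,{\dot{\rho }} = -\left( u_a - u_a\right) {\dot{\rho }} = 0 , \end{aligned}$$

using \(u_bu^b=-1\) and \({\dot{\rho }}:= u^c\rho _{,c}\). A quantity that vanishes in the background is gauge invariant to first order (the Stewart-Walker lemma, quoted here as a standard result rather than from the text above), so a non-zero \(\mathcal{D}_a\) cannot be produced or removed by a change of the map between the background and the lumpy universe.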

The approach was developed by Ellis and Bruni for pressure-free matter [85], and extended to general fluids by Ellis, Hwang, Bruni and Dunsby [20, 90] (as summarised in [34]). It has been applied in exact parallel in the Newtonian case [89], giving a much more transparent derivation of the Newtonian results obtained by Bonnor [15].

When \(w = p/\rho = const\), \(\varLambda = 0\), and spatial curvature \(k=0\), the linearised growth equation for modes of wave number n obtained this way is

$$\begin{aligned} \ddot{\mathcal{D}}_a + \left( \frac{2}{3}-w\right) \theta \,\dot{\mathcal{D}}_a -\left( \frac{(1-w)(1+3w)}{2} \kappa \rho \right) \mathcal{D}_a + w \frac{n^2}{a^2}\mathcal{D}_a=0 \end{aligned}$$
(32)

which directly gives the general relativistic version of the Jeans’ length [90] and the density perturbation growth laws. When \(w=0\) this reduces to

$$\begin{aligned} \ddot{\mathcal{D}}_a + \frac{2}{3}\theta \,\dot{\mathcal{D}}_a -\frac{1}{2} \kappa \rho \mathcal{D}_a =0 \end{aligned}$$
(33)

which directly gives Lifshitz’ results [1, 2] for pressure free matter [85].
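The growth laws contained in (33) can be checked numerically. The sketch below is illustrative only and rests on assumptions not spelled out above: a pressure-free \(k=0\), \(\varLambda =0\) background with \(a \propto t^{2/3}\), so that \(\theta = 3H = 2/t\) and \(\kappa \rho = \theta ^2/3\), in units where the initial time is \(t=1\). The function names `growth_rhs` and `integrate_growth` are hypothetical. Inserting a trial power law \(t^m\) into (33) under these assumptions gives \(3m^2+m-2=0\), i.e. \(m=2/3\) or \(m=-1\); the code integrates (33) for each mode with a classical Runge-Kutta scheme and compares the results with these power laws.

```python
def growth_rhs(t, D, V):
    """Eq. (33) as a first-order system for (D, V = dD/dt).

    Assumed background: a ~ t^(2/3), hence theta = 2/t and
    kappa*rho = theta^2 / 3 (k = 0, Lambda = 0 Friedmann equation).
    """
    theta = 2.0 / t
    kappa_rho = theta ** 2 / 3.0
    dD = V
    dV = -(2.0 / 3.0) * theta * V + 0.5 * kappa_rho * D
    return dD, dV


def integrate_growth(t0, t1, D0, V0, steps=20000):
    """Integrate eq. (33) from t0 to t1 with fourth-order Runge-Kutta."""
    h = (t1 - t0) / steps
    t, D, V = t0, D0, V0
    for _ in range(steps):
        k1D, k1V = growth_rhs(t, D, V)
        k2D, k2V = growth_rhs(t + h / 2, D + h / 2 * k1D, V + h / 2 * k1V)
        k3D, k3V = growth_rhs(t + h / 2, D + h / 2 * k2D, V + h / 2 * k2V)
        k4D, k4V = growth_rhs(t + h, D + h * k3D, V + h * k3V)
        D += h / 6 * (k1D + 2 * k2D + 2 * k3D + k4D)
        V += h / 6 * (k1V + 2 * k2V + 2 * k3V + k4V)
        t += h
    return D


# Initial data at t = 1 chosen to lie on each pure mode:
# D = t^(2/3) has dD/dt = 2/3 there, D = 1/t has dD/dt = -1.
growing = integrate_growth(1.0, 8.0, 1.0, 2.0 / 3.0)
decaying = integrate_growth(1.0, 8.0, 1.0, -1.0)
print(growing, 8.0 ** (2.0 / 3.0))
print(decaying, 1.0 / 8.0)
```

The growing mode thus increases only as a power of time, in proportion to the scale factor in this background, which is the slow growth that underlies Lifshitz' pessimistic conclusion.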

The method has been applied to inflationary models by Challinor and Lasenby [91] and to CMB anisotropies from scalar perturbations of a CDM model [92]. The kinetic theory version has been developed by Lewis, Challinor and Lasenby [93] building on earlier work by Ellis, Treciokas, Matravers, Maartens, and Gebbie (see e.g. [94]). Its extension to the effect of gravitational waves on the CMB is given by Challinor [95] and to polarisation modes also by Challinor [96].

6 The formation of structure: post inflation

Through all of these studies, there was still no plausible mechanism for the origin of the perturbations that grew into galaxies and clusters of galaxies. The discovery of inflation transformed the situation by providing a specific mechanism for generating scale-free perturbations that could be large enough.

6.1 Inflation

The idea of an inflationary universe was in essence floated by Gliner in 1966 [19], Brout, Englert and Gunzig in 1978 [97], and Starobinsky in 1979 [21]. However it only gained traction when ‘old inflation’ was proposed by Guth in 1981 [22]. It was based on a scalar field that gave an early exponential expansion for a great many e-folds, producing a de Sitter expansion epoch cooling the universe down and making it very flat. This expansion was ended by a phase transition, assumed to be strongly first order, converting the inflaton field into radiation and so starting the hot big bang epoch of the universe. This solved various problems [22] but did not work (see Guth [29]), as random bubbles of the new phase nucleating and colliding produced an inhomogeneous froth which did not have the right properties.

The situation was rescued by the new inflation models of Linde [23] and Albrecht and Steinhardt [24], leading to a standard inflationary picture as described by Kolb and Turner [25], but with numerous variations: at present there are about 195 inflationary models [3].

The basic point of inflationary dynamics [22, 25] is that at early times slow roll of a scalar field \(\phi \) with potential \(V(\phi )\) would dominate the expansion (curvature, matter, and radiation are assumed to be subdominant then). The Friedmann equation for a FLRW model then becomes ([31, p. 153]):

$$\begin{aligned} H_{inf}\simeq \sqrt{\frac{8\pi G V(\phi )}{3}} \end{aligned}$$
(34)

where \(H_{inf}\) is the Hubble parameter \(H = {\dot{a}}/a\) at that time, and \(\phi \) rolls slowly along V, leading to an almost exponential expansion driven by a slowly varying scalar potential \(V(\phi )\).
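To see why the expansion is almost exponential, treat \(V(\phi )\) as exactly constant over the interval considered (an idealisation of slow roll; the symbols \(a_i\), \(t_i\), \(a_f\), \(t_f\) and \(N_e\) are introduced here only for illustration). Then (34) makes \(H_{inf}\) constant, and

$$\begin{aligned} \frac{{\dot{a}}}{a} = H_{inf} \quad \Rightarrow \quad a(t) = a_i \exp \left[ H_{inf}\,(t-t_i)\right] , \qquad N_e:= \ln \frac{a_f}{a_i} = H_{inf}\,(t_f-t_i) , \end{aligned}$$

so the number of e-folds \(N_e\) is simply the duration of the epoch measured in units of \(H_{inf}^{-1}\). In an actual slow-roll model \(V(\phi )\) decreases slowly, and the expansion departs correspondingly slowly from this de Sitter form.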

6.2 Inflation perturbations

The pioneering paper to demonstrate that particles may be created or annihilated in an expanding universe was by Schrödinger in 1939 [98], clearly identifying the mixing of positive and negative frequencies that is now taken to identify quantum particle creation in a curved spacetime (see [99]). He called this an “alarming phenomenon”, but did not take it further.

Before the idea of inflation, the possibility of quantum fluctuations as the source of galaxies was explored by Sakharov in 1965 [100] and Harrison in 1970 [66], but remained highly speculative.

Inflation completely changed this. A quantum origin of fluctuations in an inflationary universe was explored first by Mukhanov and Chibisov in 1982 [26] and Starobinsky also in 1982 [27]. It was separately developed in depth at a Nuffield workshop in Cambridge in 1982 [28], as described interestingly by Guth [29]. This led to a series of further papers, including Hawking’s 1982 paper [101] and Bardeen, Steinhardt and Turner’s 1983 paper [102]. This laid the foundation for the current standard model: quantum fluctuations in the inflationary era are the seed of fluctuations on the Last Scattering Surface that then, because of the presence of cold dark matter, led to the origin of structure in a bottom-up way [25, 31, 32, 79, 83].

6.3 The amplitude of the perturbations

Evaluation of the power spectrum of the inflationary perturbations gives ([31, p. 168])

$$\begin{aligned} P_{\varPhi }(k)= \frac{128\pi ^2G^2}{9k^3} \left( \frac{H_{inf}^2V^2}{V'^2}\right) _{aH=k} \end{aligned}$$
(35)

where the right-hand side is evaluated at first Hubble crossing, the instant during the inflationary era when the physical wavelength of the mode under consideration is equal to \(H^{-1}\). This depends on the potential \(V(\phi )\) at the time of inflation directly and via (34). This means that the perturbations at the Last Scattering Surface, mediated by the transfer function and growth function ([31, p. 183]), can be much larger than the statistical fluctuations envisaged by Lifshitz, Rees and others as quoted above, which were not large enough to produce the observed astronomical structures.
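Substituting (34) into (35) makes the dependence on the potential explicit. This is only algebra on the two expressions above, under the same slow-roll approximation:

$$\begin{aligned} P_{\varPhi }(k) \simeq \frac{128\pi ^2G^2}{9k^3}\left( \frac{8\pi G V}{3}\,\frac{V^2}{V'^2}\right) _{aH=k} = \frac{1024\,\pi ^3G^3}{27\,k^3} \left( \frac{V^3}{V'^2}\right) _{aH=k} . \end{aligned}$$

The amplitude is therefore controlled by the single combination \(V^3/V'^2\) at Hubble crossing: a flatter potential (smaller \(V'\)) or a higher energy scale (larger \(V\)) gives larger perturbations.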

Key point: We observe fluctuations on the LSS at \(\delta \rho /\rho \simeq 10^{-5}\), much larger than the random fluctuations arising by statistics alone. Hence they can lead to the observed large scale structure. How did they get to be that large? They are determined by the value of the potential \(V(\phi )\) during the time of inflation. As we have no fundamental theory of what the inflaton is, we can run the theory backwards to determine the required value of \(V(\phi )\) then and assume the effective theory does indeed have that value.

That is, we do not have a proper link to fundamental physics that will uniquely set \(\delta \rho /\rho _{LSS} \simeq 10^{-5}\) as observed (much larger than expected from statistical fluctuations); but we can make a phenomenological theory that works by adjusting the properties of \(V(\phi )\) suitably. If we adopted a specific theory that fixes \( V(\phi )\), such as minimal SU(5) grand unified theory with a Coleman-Weinberg potential as initially assumed [29], this arbitrariness would not be there; but that theory has fallen away. Thus as described by Guth [29],

The theoretical curve shown in Figure 4 has an amplitude which is normalized to the data; in practice inflationary models do not make any prediction for the overall amplitude of the fluctuations, though we could in principle make a prediction if we really knew the potential energy function of the inflaton field, the scalar field that drives inflation.

A Higgs inflaton with the right coupling can give the needed link to testable particle physics [101, 103] and can give the observed amplitude of the inhomogeneities as well as their power spectra [3]. In any other case inflationary theory does not explain why the Lifshitz problem (Sects. 3 and 4.2 above) is resolved; rather a free parameter in the theory is adjusted to make it work.

The result that \(\delta \rho \) scales as the scale factor \(a(\tau )\) for a \(k=0\) universe [7] is also an argument for the existence of dark matter, as follows ([35, §5.3.3.2]): if \(\delta \rho /\rho = 10^{-5}\) at the LSS at a redshift of \(z=10^3\), then it cannot be larger than \(\delta \rho /\rho =10^{-2}\) today, which is too small to account for the observed structures. Potential wells of larger amplitude must therefore already have formed by last scattering without affecting the CMB, and hence through a component that does not couple to radiation. The point is that, because photons and baryons are coupled, the baryonic perturbations cannot grow before decoupling (they oscillate), while perturbations in the dark matter (DM) can already start to grow.
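The arithmetic behind this bound can be written out as a small sketch. It assumes, as in the argument above, that the fractional perturbation grows in proportion to the scale factor all the way from last scattering to today, and it uses \(a \propto 1/(1+z)\); the function name `linear_growth_bound` is hypothetical.

```python
def linear_growth_bound(delta_lss, z_lss):
    """Bound on delta_rho/rho today if it grows in proportion to a since the LSS.

    Assumes a ~ 1/(1+z), so a_today / a_LSS = 1 + z_lss.
    """
    growth_factor = 1.0 + z_lss
    return delta_lss * growth_factor


# Values quoted in the text: 1e-5 at the LSS, at a redshift of 1e3.
delta_today = linear_growth_bound(1e-5, 1e3)
print(delta_today)
```

The result is of order \(10^{-2}\), far short of the order-unity contrast needed for perturbations to have become non-linear, which is the gap the dark matter component has to fill.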

6.4 Physical effects and observational outcomes

After this a great many papers developed the theory of relativistic structure formation in detail, involving a complex set of interactions between radiation and various matter components. These were supplemented by N-body simulations in the non-linear regime. They included:

Fig. 1 Cosmological observations. Observations determine the present day values of dark energy density \(\varOmega _\varLambda \) and matter density \(\varOmega _m\). The supernova data (blue ellipses) are based on observing the background model geometry directly. The BAO data (green, almost vertical straight lines) reflect how the background model has affected structure formation. The WMAP data (orange triangle sloping diagonally upwards) represent how this structure affects the observed CMB temperature power spectrum. It is the latter two, based on top-down effects from the cosmological background to smaller scales, that together give us the best estimates of the cosmological parameters \(\varOmega _\varLambda \), \(\varOmega _m\) [119]. From The Supernova Cosmology Project [http://supernova.lbl.gov/Union/]

  • Bardeen, Steinhardt and Turner in 1983 [102] developed inflationary perturbation theory and its effects on CMB anisotropy, and showed the simplest model of “new inflation” based on an SU(5) GUT with Coleman-Weinberg potential is in obvious conflict with the large-scale isotropy of the microwave background.

  • Bond and Efstathiou in 1984 [104] gave a modern unified treatment of the CMB fluctuations on all angular scales in CDM models, including polarisation effects, using the synchronous gauge and Boltzmann equations. They considered correlation function peaks (their eqn (2b)), normalized by the CfA redshift survey. Their paper did not refer to inflation.

  • Kodama and Sasaki in 1984 [79] presented perturbation theory in the light of inflation, relating gauge invariant formalism to gauge dependent methods and using the Boltzmann equation for the matter-radiation interaction.

  • Davis, Efstathiou, Frenk, and White in 1985 [105] used N-body Newtonian simulations and proposed biassed galaxy formation in a universe with bottom-up structure formation due to the presence of cold dark matter (“CDM”).

  • Kaiser in 1987 [106] examined the effects on LSS measurements that arise from observing on the past lightcone because of redshift space distortions.

  • Efstathiou, Sutherland, and Maddox in 1990 [107] used Newtonian N-body simulations to show that a cosmological constant together with CDM, thus “\(\varLambda \)CDM”, is needed to account simultaneously for the CMB spectrum and large scale structure. This was well in advance of the confirmation of acceleration due to dark energy by supernova observations in 1998.

  • Mukhanov, Feldman and Brandenberger in 1992 [83] studied general perturbations in an inflationary model, considering hydrodynamical and scalar field sources, and discussing gauge free variables and the synchronous and longitudinal gauges. They show how inflation leads to adiabatic perturbations and the Harrison-Zel’dovich [66, 67] scale-free spectrum, and consider CMB anisotropies and gravitational waves.

  • Ma and Bertschinger in 1995 [77] considered interacting cold dark matter and baryons (fluids), plus photons, massless neutrinos, and massive neutrinos, using a detailed phase space description. They give the full details of the cosmic microwave background anisotropy, and present accurate calculations of the angular power spectra in the two CDM+HDM models including photon polarization, higher neutrino multipole moments, and helium recombination.

  • Seljak and Zaldarriaga in 1996 [108] present a method for calculating linear CMB anisotropy spectra based on integration of the Boltzmann equation over sources along the photon past light cone, with baryons and CDM represented as fluids. The temperature anisotropy is written as a time integral over the product of a geometrical term and a source term, allowing very efficient computations and showing the CMB power spectrum peaks (their Figure 1).

  • Hu and White in 1996 [109] studied the acoustic pattern of CMB peak locations and relative heights predicted by the standard inflationary cold dark matter model and showed it is essentially unique.

  • Kamionkowski, Kosowsky and Stebbins [110] and Zaldarriaga and Seljak [111, 112] in 1997 showed how E and B polarisation modes in the CMB would be induced by the gravitational radiation predicted by inflation.

  • Lewis, Challinor, and Lasenby in 2000 [93] used the 3+1 covariant and gauge invariant method to give the first calculations in perturbed \(k=+1\) models.

  • Bacon, Refregier, and Ellis in 2000 [113] examined weak lensing as a probe of large scale structure.

  • Eisenstein et al. in 2005 [114] showed how acoustic oscillations imprinted into the late-time correlations of galaxies by baryonic physics at the epoch of recombination can be used as a cosmological standard ruler, allowing one to compute the angular diameter distance to, and the Hubble parameter at, the redshifts of the survey. These BAO oscillations are related to the CMB power spectrum peaks.

  • Pitrou and Uzan in 2007 showed how to quantize perturbations during inflation using the 1+3 covariant formalism [115].

  • Yoo in 2010 [116], Bonvin and Durrer in 2011 [117], and Challinor and Lewis in 2011 [118] investigated the effects of weak lensing on clustering observations (via magnification bias) and of other general relativistic effects that significantly alter the power spectrum on Hubble scales.
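For the standard-ruler item in the list above (Eisenstein et al. [114]), the two measurements can be sketched as follows. The notation is assumed here for illustration and is not taken from that paper: \(r_s\) is the comoving acoustic scale fixed at recombination, \(D_A(z)\) the angular diameter distance, \(\varDelta \vartheta \) the angle the feature subtends across the line of sight, and \(\varDelta z\) its extent in redshift along the line of sight. In the standard small-angle relations,

$$\begin{aligned} \varDelta \vartheta = \frac{r_s}{(1+z)\,D_A(z)} , \qquad \varDelta z = \frac{r_s\,H(z)}{c} , \end{aligned}$$

so with \(r_s\) known from the CMB, the transverse clustering signal yields \(D_A(z)\) and the radial signal yields \(H(z)\) at the redshift of the survey.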

6.5 The integrated whole

The theory of cosmological perturbations is now a complex whole, incorporating all the elements above: baryons, photons, neutrinos, plus dark matter and a cosmological constant (“\(\varLambda \)CDM”).

Earlier texts presenting much of this theory include Kolb and Turner [25] and Liddle and Lyth [30]. More recent integrative texts covering both the full perturbative theory and its relation to CMB anisotropies, polarisation, and gravitational radiation include Dodelson [31], Mukhanov [32], Durrer [33], and Peter and Uzan [35].

7 Conclusion

All this is based on and develops from Lifshitz’ path breaking paper [1, 2], but extended in important ways. The paper pioneered a key aspect of cosmology by developing cosmological perturbations and introducing, firstly, Fourier analysis into comoving wave numbers, and secondly, the splitting into scalar, vector, and tensor modes used in all subsequent structure formation analyses.

There are two interesting questions we can ask about it. First, what was missed by this paper that could have been there? This is considered in Sect. 7.1. Second, what has resulted from it? - what has been achieved in later studies developing from it? Three major consequences have followed from this line of research:

  • A theory of structure formation in the expanding universe, completing what Jeans hoped for and Lifshitz took further (Sect. 7.2);

  • An important contribution to cosmography: determining the geometry of the universe, completing what Sandage [6] hoped for (Sect. 7.3), but in quite a new way;

  • Because of the inflationary origin of perturbations, one can derive limits on some aspects of particle physics from CMB observations - an unexpected bonus (Sect. 7.4).

7.1 What was missed

What aspects did this innovative paper not include, that could have been there? In the clear light of hindsight, what was missed was the following:

  • It is noteworthy that it presents perturbation theory for \(k=+1\) and \(k=-1\) models but not for \(k=0\) models, which are the focus of a great deal of present day theory. The lacuna was filled by Sachs and Wolfe [7], who integrated the perturbation equations for the \(k=0\) case.

  • The paper did not include a cosmological constant, which nowadays is seen as playing a significant role in structure formation (cf. [107]).

  • It omitted the idea of Baryon Acoustic Oscillations, because (like all the other papers until Gamow 1948 [43]) it missed the idea of tight coupling of matter and radiation followed by decoupling ([31, pp. 70–73]).

  • Consequent on this, it did not investigate the effect of the perturbations on the CMB anisotropy. This was first considered by Sachs and Wolfe (1967) [7], and then many others (see [33] and references therein).

It was only later, following on the Sachs and Wolfe paper, that (i) a kinetic theory approach to the radiation was developed ([74, 75] and references therein) and (ii) CMB polarisation, and in particular the E-polarisation and B-polarisation CMB modes generated by gravitational waves ([33, pp. 176–209]), was considered.

7.2 Structure formation

The Lifshitz paper was the pioneering paper in generic relativistic structure formation in the expanding universe (previous relativistic papers had looked at spherical models only). Its methods have been used in all relativistic structure formation studies since. In the context of a realistic thermal history of the universe, it implies growth of structure due to gravitational instability, leading to the observed power spectrum of matter and the BAO peak.

What happened then to Lifshitz’ pessimistic conclusion (Sect. 3), confirmed by later analyses based on the Jeans length (see Sect. 4.2)?

Basically the freedom introduced by inflation allowed the initial fluctuations to be much larger than he had envisaged (cf. the discussion in Sect. 6.3), hence the slow growth rate that occurs in an expanding universe after decoupling is adequate to create the structures we observe.

7.3 Cosmography: cosmological parameters

An unexpected outcome is that perturbation studies have also provided the mainstay of the present day relation between theory and observational tests of cosmological models, e.g. through the Planck CMB anisotropy data [4] and observations of baryon acoustic oscillations [114]. This is illustrated in Figure 1.

The key point is that galaxy cluster and WMAP limits are derived by studying the effect of the background model on structure formation in the expanding universe [31, 32, 33, 35]. The resulting relations between cosmological parameters and CMB anisotropies are clarified for example in [35, pp. 362–367]. These limits are much tighter than those derived from the direct measurement of the background geometry by using the Supernova data alone. The possibility of determining these tight limits arises precisely because global cosmological variables have significant effects on the local physical processes of structure formation. Just like nucleosynthesis, this is a top-down effect from global variables to local physical effects [120], because global parameters (density \(\rho \), expansion rate \(\theta \)) enter as parameters in the differential equations (32), (33) above for the growth of perturbations. That is why the growth is power law [1, 2] and not exponential [37], as pointed out by Lifshitz [1, 2].

Cosmological tests: The most sensitive limits on inflationary cosmological models come not from direct measurement of spacetime curvature, for example by supernova observations, but from observations of structures that have formed and their effects on CMB anisotropies [4]: that is, the top-down effect of global cosmological parameters (which we want to measure) on structure formation.

The limits on inflationary models through these observations are studied exhaustively in the Encyclopaedia Inflationaris [3].

7.4 Cosmological bounds on particle physics

Most remarkably, the CMB data in addition give interesting limits on some aspects of particle physics. These include:

  • Limits on spatial variation of the fine-structure parameter [121].

  • Limits on neutrino interactions [122, 123].

  • Limits on thermal relic axions and axion-like particles [124,125,126,127].

  • In particular, Tereno et al. in 2009 [128] obtained the first evidence for a non-vanishing neutrino mass from cosmological data alone (CMB+lensing) by giving both an upper and a lower bound. They state:

    We obtain a 95% confidence level upper limit of 0.54 eV for the sum of the neutrino masses, and a lower limit of 0.03 eV. The preference is for massive neutrinos.

These kinds of limits can be obtained because particle interactions affect structure formation in inflationary universes, and hence affect CMB anisotropies through the general-relativistic gravitational instability effects studied by Lifshitz in his 1946 paper, extended to the context of an inflationary universe.

8 Evgenii Mikhailovich Lifshitz: a brief biography

By Andrzej Krasiński, abstracted from Refs. [129], [130] and [131] below (see Footnote 6).

E. M. Lifshitz was born on 21 February 1915 in Kharkov, Ukraine. He finished his high school education in 1929 (at the age of 14!). He attended school only for the last two grades of a 7-year course; before that he was educated at home. Beginning in 1929, he studied for two years at a chemical college, and then at the Physics and Mechanics Faculty of the Kharkov Mechanics and Machine Building Institute. He graduated from it in 1933 and began graduate study at the Ukrainian Physico-Technical Institute (UPTI), under Lev Landau. (Lifshitz was one of the first students, friends, and colleagues of Landau.) He took the PhD examination in 1934.

As seen from these dates, Lifshitz embarked on his scientific career at a very early age. He published his first paper with Landau in 1934 [132], while only 19 years old. Its subject was the formation of electron-positron pairs in heavy-particle collisions. This was a very hot topic at that time—only two years after the discovery of the positron.

He worked at UPTI until 1938 as a senior research scientist. In 1939 he obtained the DSc degree at the Leningrad State University. From 1939 on he worked at the Institute of Physical Problems of the Academy of Sciences of the USSR in Moscow.

Here is a short listing of Lifshitz’s most important scientific achievements, adapted from Ref. [130] (see there for more detailed descriptions):

\(\bullet \) The Landau–Lifshitz paper “Toward a Theory of the Dispersion of the Magnetic Permeability of Ferromagnetic Bodies” (1935) [133]. It contains a complete theory of domains in ferromagnets, an equation of motion of magnetic moments that takes account of external fields and spin-orbital interactions (now called the Landau–Lifshitz equation), and a theory of ferromagnetic resonance.

\(\bullet \) The derivation of an expression for the Coulomb collision integral of a plasma in a strong magnetic field (1937) [134]. It contains the first use of the well-known drift approximation to simplify the kinetic equation.

\(\bullet \) The calculation of deuteron dissociation on collision with a charged particle (1938) [135]. By virtue of its use of a highly general quasi-classical calculation method, this paper has not lost its significance even today.

\(\bullet \) The solution of the problem of which phase transitions can be brought about as second-order transitions (1941) [136]. Lifshitz established the properties of the possible transitions (the “Lifshitz criterion”) and listed all possible crystallographic changes that could accompany the transitions.

\(\bullet \) The paper on the second sound in superfluid helium (1944) [137], whose existence had been predicted by Landau in 1941. Attempts to excite this sound by conventional acoustic methods had failed. Lifshitz showed that it should be excited by a heater with oscillating temperature. It was in this way that the second sound was detected in V. P. Peshkov’s famous experiments in 1946.

\(\bullet \) The paper on the stability of Friedmann’s solutions of Einstein’s equations (1946) [138], which was the starting point of Lifshitz’s activity in the field of relativistic cosmology. More results in this area were published later, jointly with I. M. Khalatnikov and V. A. Belinskii [139, 140]. The later papers contain a perturbative discussion of the geometry of cosmological solutions of Einstein’s equations in the vicinity of the initial singularity. They indicated that the behaviour of a general model close to the singularity may be approximated by an infinite sequence of Kasner-like evolutions with the exponents alternating between different components of the metric. Work on various aspects of cosmological singularities was Lifshitz’s main occupation from then on until the end of his life.

\(\bullet \) A theory of molecular forces operating between condensed bodies (1955) [141], considered to be Lifshitz’s most elegant paper. Before, these forces had been evaluated in a rough approximation, by adding the forces of interaction between individual atoms. Lifshitz’s suggestion was that these forces are a manifestation of the pressure of a fluctuating electromagnetic field between the bodies; his formulas express these forces solely in terms of the dielectric constants of the bodies.

In addition to working as a scientist, he taught at Kharkov University, the Kharkov Mechanics and Machine Building Institute, the Kharkov Chemical Technology Institute, Moscow University and the Pedagogical Institute.

Ref. [131] rather cryptically mentions that 1937–1938 was a difficult period for Lifshitz. This was one of the periods of Stalin’s purges, during which people were murdered at random, on the basis of “court verdicts” stating that the accused had committed assorted crimes against the Soviet state. (Needless to say, the charges were made up, and so was the evidence, however absurd the alleged crime; the authorities made no effort to cover up the fraudulent character of the whole “investigation”.) No-one, however famous and outstanding, could be safe; even Lev Landau was arrested and held for a year, and was released thanks to a courageous intervention by Pyotr Kapitza (see Footnote 7).

Ref. [131] equally cryptically mentions Lifshitz’s work for the army during World War II, for which he received the Order of the Red Star in 1945. In 1954 he received another civil distinction: the Order of the Red Banner of Labour.

For more than twenty years he was the deputy editor of the Zhurnal Eksperimentalnoi i Teoreticheskoi Fiziki.

In 1966 he was elected a corresponding member of the USSR Academy of Sciences.

Among the scientific honours he received were the Lomonosov Prize of the Academy of Sciences in 1958, the Lenin Prize in 1962 (jointly with Lev Landau, for their Course of Theoretical Physics), the Lev Landau Prize in 1974, and the title of Foreign Member of the Royal Society in the UK in 1982.

Lifshitz is best known for the 10-volume Course of Theoretical Physics written jointly with Lev Landau. They began to work on this in the 1930s, and Lifshitz continued to work on the books after Lev Landau’s death. The course was completed in 1979. The work includes results of many papers of which Lifshitz was author or co-author, the co-authored ones written in large part with Lev Landau. The books have appeared in so many editions, in various countries and languages, that their number is difficult to count.

Evgenii Lifshitz had a younger brother, Ilya Mikhailovich (1917–1982), who was an equally outstanding physicist. He was a member of the USSR Academy of Sciences, a Lenin Prize awardee, a member of the National Academy of Sciences of the USA and a frequent consultant for the Course of Theoretical Physics, credited on the pages of the course where appropriate.

E. M. Lifshitz died on 29 October 1985 in Moscow, Russia, after a heart operation.

More extended accounts of Lifshitz’s biography can be found in Refs. [130] and [131] below (see Footnote 8). In particular, Ref. [131] contains a bibliography of Lifshitz’s papers, with pointers to the sections of the Landau–Lifshitz “Course of Theoretical Physics” where a given paper was used. Ref. [131] also gives many details of Lifshitz’s personal life and inside information on some of his scientific achievements. However, the bibliography in Ref. [131] seems to be incomplete: it does not contain some of the papers cited in Ref. [130], and does not contain even the paper reprinted here.