
Relativistic fluid dynamics: physics for many different scales

The Original Version of this article was published on 30 January 2007


The relativistic fluid is a highly successful model used to describe the dynamics of many-particle systems moving at high velocities and/or in strong gravity. It takes as input physics from microscopic scales and yields as output predictions of bulk, macroscopic motion. By inverting the process—e.g., drawing on astrophysical observations—an understanding of relativistic features can lead to insight into physics on the microscopic scale. Relativistic fluids have been used to model systems as “small” as colliding heavy ions in laboratory experiments, and as large as the Universe itself, with “intermediate” sized objects like neutron stars being considered along the way. The purpose of this review is to discuss the mathematical and theoretical physics underpinnings of the relativistic (multi-) fluid model. We focus on the variational principle approach championed by Brandon Carter and collaborators, in which a crucial element is to distinguish the momenta that are conjugate to the particle number density currents. This approach differs from the “standard” text-book derivation of the equations of motion from the divergence of the stress-energy tensor in that one explicitly obtains the relativistic Euler equation as an “integrability” condition on the relativistic vorticity. We discuss the conservation laws and the equations of motion in detail, and provide a number of (in our opinion) interesting and relevant applications of the general theory. The formalism provides a foundation for complex models, e.g., including electromagnetism, superfluidity and elasticity—all of which are relevant for state of the art neutron-star modelling.

Setting the stage

If one performs a search on the topic of relativistic fluids on any of the major physics article databases one is overwhelmed by the number of “hits”. This reflects the importance that the fluid model has long had for physics and engineering. For relativistic physics, in particular, the fluid model is essential. After all, many-particle astrophysical and cosmological systems are the best sources of detectable effects associated with General Relativity. Two obvious examples, the expansion of the Universe and oscillations (or, indeed, mergers) of neutron stars, indicate the vast range of scales on which relativistic fluids are relevant. A particularly topical context for general relativistic fluids is their use in the modelling of gravitational-wave sources. This includes the compact binary inspiral problem, either involving two neutron stars or a neutron star and a black hole, the collapse of stellar cores during supernovae, or various neutron star instabilities. One should also not forget the use of (special) relativistic fluids in modelling collisions of heavy nuclei, astrophysical jets, and gamma-ray burst emission.

This review provides an introduction to the modeling of fluids in General Relativity. As the (main) target audience is graduate students with a need for an understanding of relativistic fluid dynamics we have made an effort to keep the presentation pedagogical, carefully introducing the central concepts. The discussion will (hopefully) also be useful to researchers who work in areas outside of General Relativity and gravitation per se (e.g., a nuclear physicist who develops neutron star equations of state), but who require a working knowledge of relativistic fluid dynamics.

Throughout (most of) the discussion we will assume that General Relativity is the proper description of gravity. From a conservative point of view, this restriction is not too severe. Einstein’s theory is extremely well tested and it is natural to focus our attention on it. At the same time, it is important to realize that the problem of fluids in other theories of gravity has interesting aspects. And perhaps more importantly, we know that General Relativity cannot be the ultimate theory of gravity—it absolutely breaks on the quantum scale and may also have trouble on the large scales of cosmology (taking the presence of the mysterious dark energy as evidence that something is missing in our understanding). As we hope that the review will be used by students and researchers who are not necessarily experts in General Relativity and the techniques of differential geometry, we have included an introduction to the mathematical tools required to build relativistic models. Our summary is not a proper introduction to General Relativity, but we have made an effort to define all the tools we need for the discussion that follows. Hopefully, our description is sufficiently self-contained to provide a less experienced reader with a working understanding of (at least some of) the mathematics involved. In particular, the reader will find an extended discussion of the covariant and Lie derivatives. This is natural since many important properties of fluids, both relativistic and non-relativistic, can be established and understood by the use of parallel transport and Lie-dragging, and it is vital to appreciate the distinction between the two. As we do not want to make the initial learning curve too steep, we have tried to avoid the language of differential geometry. This makes the discussion less “elegant” in places, but we feel that this is a price worth paying if the aim is to make the material more generally accessible.

Ideally, the reader should have some familiarity with standard fluid dynamics, e.g., at the level of the discussion in Landau and Lifshitz (1959), basic thermodynamics (Reichl 1984), and the mathematics of action principles and how they are used to generate equations of motion (Lanczos 1949). Having stated this, it is clear that we are facing a challenge. We are trying to introduce a topic on which numerous books have been written (e.g., Tolman 1987; Landau and Lifshitz 1959; Lichnerowicz 1967; Anile 1989; Wilson and Mathews 2003; Rezzolla and Zanotti 2013), and which requires an understanding of a significant fraction of modern theoretical physics. This does not, however, mean that there is no place for this kind of survey. We continue to see exciting developments for multi-constituent systems, such as superfluid/superconducting neutron star cores (Footnote 1). Much of the recent theory work has been guided by the geometric approach to fluid dynamics championed by Carter (1983, 1989a, 1992), which provides a powerful framework that makes extensions to multi-fluid situations intuitive. A typical example of a phenomenon that arises naturally is the so-called entrainment effect, which plays a crucial role in a superfluid neutron star core. Given the flexible nature of the formalism, its natural connection with General Relativity and the potential for future applications, we have opted to base much of our description on the work of Carter and colleagues.

It is important to appreciate that, even though the subject of relativistic fluids is far from new, issues still remain to be resolved. The most obvious shortcoming of the available theory concerns dissipative effects. As we will see, different dissipation channels are (at least in principle) easy to incorporate in Newtonian theory but the extension to General Relativity remains “problematic”. This is an issue—with a number of notable recent efforts—of key importance for future gravitational-wave source modelling (e.g., in numerical relativity) as well as the description of laboratory systems (like heavy-ion collisions). In order to develop the required framework, we need to make progress on both the underpinning theory and implementations (e.g., computationally “affordable” simulations)—a real, but at the same time inspiring, challenge.

A brief history of fluids

The two fluids air and water are essential to human survival. This obvious fact implies a basic need to divine their innermost secrets. Homo Sapiens have always needed to anticipate air and water behaviour under a myriad of circumstances, such as those that concern water supply, weather, and travel. The essential importance of fluids for survival—and how they can be exploited to enhance survival—implies that the study of fluids likely reaches as far back into antiquity as the human race itself. Unfortunately, our historical records of this ever-ongoing study are not so great that we can reach very far accurately.

A wonderful account (now in affordable Dover print) is “A History and Philosophy of Fluid Mechanics” by Tokaty (1994). He points out that while early cultures may not have had universities, government sponsored laboratories, or privately funded centers pursuing fluids research (nor a Living Reviews journal on which to communicate results!), there was certainly some collective understanding. After all, there is a clear connection between the viability of early civilizations and their access to water. For example, we have the societies associated with the Yellow and Yangtze rivers in China, the Ganges in India, the Volga in Russia, the Thames in England, and the Seine in France, to name just a few. We must also not forget the Babylonians and their amazing technological (irrigation) achievements in the land between the Tigris and Euphrates, and the Egyptians, whose intimacy with the flooding of the Nile is well documented. In North America, we have the so-called Mississippians, who left behind their mound-building accomplishments. For example, the Cahokians (in Collinsville, Illinois) constructed Monk’s Mound (Footnote 2), the largest pre-Columbian earthen structure in existence that is “...over 100 feet tall, 1000 feet long, and 800 feet wide (larger at its base than the Great Pyramid of Giza)”.

In terms of ocean and sea travel, we know that the maritime ability of the Mediterranean people was the key to ensuring cultural and economic growth and societal stability. The finely-tuned skills of the Polynesians in the South Pacific allowed them to travel great distances, perhaps reaching as far as South America, and certainly making it to the “most remote spot on the Earth”, Easter Island. Apparently, they were adept at reading the smallest of signs—water colour, views of weather on the horizon, subtleties of wind patterns, floating objects, birds, etc.—as indications of nearby land masses. Finally, the harsh climate of the North Atlantic was overcome by the highly accomplished Nordic sailors, whose skills allowed them to reach North America. Perhaps it would be appropriate to think of these early explorers as adept geophysical fluid dynamicists/oceanographers?

Many great scientists are associated with the study of fluids. Lost are the names of the individuals who, almost 400,000 years ago, carved “aerodynamically correct” (Gad-el Hak 1998) wooden spears. Also lost are those who developed boomerangs and fin-stabilized arrows. Among those not lost is Archimedes, the Greek mathematician (287–212 BC), who provided a mathematical expression for the buoyant force on bodies. Earlier, Thales of Miletus (624–546 BC) asked the simple question: What is air and water? His question is profound as it represents a departure from the main, myth-based modes of inquiry at that time. Tokaty ranks Hero of Alexandria as one of the great, early contributors. Hero (c. 10–70) was a Greek scientist and engineer, who left behind writings and drawings that, from today’s perspective, indicate a good grasp of basic fluid mechanics. To make a complete account of individual contributions to our present understanding of fluid dynamics is, of course, impossible. Yet, it is useful to list some of the contributors to the field. We provide a highly subjective “timeline” in Fig. 1. The list is to a large extent focussed on the topics covered in this review, and includes chemists, engineers, mathematicians, philosophers, and physicists. It recognizes those that have contributed to the development of non-relativistic fluids, their relativistic counterparts, multi-fluid versions of both, and exotic phenomena like superfluidity. The list provides context—both historical and scientific—and also serves as an informal table of contents for this survey.

Fig. 1

A “timeline” focussed on the topics covered in this review, including chemists, engineers, mathematicians, philosophers, and physicists who have contributed to the development of non-relativistic fluids, their relativistic counterparts, multi-fluid versions of both, and exotic phenomena like superfluidity

Tokaty (1994) discusses the human propensity for destruction when it comes to water resources. Depletion and pollution are the main offenders. He refers to a “Battle of the Fluids” as a struggle between their destruction and protection. His context for this discussion was the Cold War. He rightly points out the failure to protect our water and air resources by the two dominant powers—the USA and USSR. In an ironic twist, modern study of the relativistic properties of fluids has its own “Battle of the Fluids”. A self-gravitating mass can become absolutely unstable and collapse to a black hole, the ultimate destruction of any form of matter.

Why are fluid models useful?

The Merriam-Webster online dictionary (Footnote 3) defines a fluid as “...a substance (as a liquid or gas) tending to flow or conform to the outline of its container” when taken as a noun and “...having particles that easily move and change their relative position without a separation of the mass and that easily yield to pressure: capable of flowing” when taken as an adjective. The best model of physics is the Standard Model which is ultimately the description of the “substance” that makes up our fluids. The substance of the Standard Model consists of a remarkably small set of elementary particles: leptons, quarks, and the so-called “force” carriers (gauge-vector bosons). Each elementary particle is quantum mechanical, but the Einstein equations require explicit trajectories. Effectively, there is a disconnect between the quantum scale and our classical description of gravity. Moreover, cosmology and neutron stars are (essentially) many particle systems and, even forgetting about quantum mechanics, it is not possible to track each and every “particle” that makes them up, regardless of whether these are elementary (leptons, quarks, etc.) or collections of elementary particles (e.g., individual stars in galaxies and the galaxies themselves in cosmology). In the fluid model, the inherent quantum mechanical behaviour and the existence of many particles are averaged over in such a way that the result can be implemented consistently in the Einstein equations.

Fig. 2

An object with a characteristic size D is modeled as a fluid that contains M fluid elements. From inside the object we magnify a generic fluid element of characteristic size L. In order for the fluid model to work we require \(M \gg N \gg 1\) and \(D \gg L\)

Central to the model is the notion of a “fluid element”, also known as a “fluid particle” or “material particle” (Lautrup 2005). This is an imagined, local “box” that is infinitesimal with respect to the system en masse and yet large enough to contain a large number of particles, N  (e.g., an Avogadro’s number of particles). The idea is illustrated in Fig. 2. We consider an object with characteristic size D that is modeled as a fluid that contains M fluid elements. From inside the object we magnify a generic fluid element of characteristic size L. In order for the fluid model to work we require \(M \gg N \gg 1\) and \(D \gg L\). Strictly speaking, the model has L infinitesimal, \(M \rightarrow \infty \), but with the total number of particles remaining finite. An operational point of view is that discussed by Lautrup in his fine text “Physics of Continuous Matter” (2005). He rightly points out the implicit connection to the intended precision. At some level, any real system will be discrete and no longer represented by a continuum. As long as the scale where the discreteness of matter and fluctuations are important is much smaller than the desired precision, the continuum approximation is valid. The key point is that the fluid model allows us to consider complex dynamical phenomena in terms of a (relatively) small number of variables. We do not have to keep track of individual particles. The connection between the different scales (macroscopic and microscopic) plays a role, but many of the tricky issues are assumed to be “known” (read: encoded in the matter equation of state, the determination of which may be someone else’s “problem”).
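
The scale hierarchy can be made concrete with an order-of-magnitude sketch. The numbers below are illustrative assumptions rather than values taken from this review: a star of 1.4 solar masses made purely of neutrons, a characteristic size of 10 km, uniform density, and an Avogadro's number of particles per fluid element. The helper `fluid_element_scales` is a hypothetical name introduced for the example.

```python
# Order-of-magnitude check of the fluid-element hierarchy M >> N >> 1, D >> L.
# All numbers are illustrative assumptions, not values from the review.

M_SUN_KG = 1.989e30        # solar mass
M_NEUTRON_KG = 1.675e-27   # neutron rest mass
AVOGADRO = 6.022e23        # assumed number of particles N per fluid element


def fluid_element_scales(total_particles, system_size_m, particles_per_element):
    """Return (M, L): the number of fluid elements and the size of one element.

    Assumes uniform density, so each element takes an equal share of the
    volume and L ~ D / M**(1/3).
    """
    num_elements = total_particles / particles_per_element
    element_size_m = system_size_m / num_elements ** (1.0 / 3.0)
    return num_elements, element_size_m


# A 1.4 solar-mass star made only of neutrons, with D ~ 10 km.
total_baryons = 1.4 * M_SUN_KG / M_NEUTRON_KG
D = 1.0e4
M, L = fluid_element_scales(total_baryons, D, AVOGADRO)

print(f"total particles ~ {total_baryons:.1e}")
print(f"M ~ {M:.1e} fluid elements, each of size L ~ {L:.1e} m")
print("hierarchy holds:", M > 1e3 * AVOGADRO > 1e6 and D > 1e3 * L)
```

With these assumed inputs the fluid elements come out far smaller than the star yet still hold a macroscopic number of particles, which is the regime the continuum description relies on.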

The aim of this review is to describe how the fluid model can be used (and understood) in the context of Einstein’s curved spacetime theory for gravity. As will become clear, this necessarily involves attention to detail. For example, we need to consider how the coordinate invariance of General Relativity (with no preferred observers) impacts on (by necessity) observer-dependent notions from thermodynamics and the underlying microphysics. We also need to explore to what extent the dynamics of spacetime enters the problem. This is particularly relevant in the context of numerical simulations of energetic gravitational-wave sources (like merging neutron stars or massive stars collapsing under their own weight). The first step we have to take is natural—we need to consider how a given fluid element moves through spacetime and how this fluid motion enters the Einstein field equations. To some extent, this is a text-book problem with a well-known solution (= the perfect fluid model). However, as we will learn along the way, more realistic matter descriptions (including for example superfluidity, as expected in the core of a mature neutron star, or the elasticity of the star’s crust) require a more sophisticated approach. Nevertheless, the first step we have to take is natural.

The explicit trajectories that enter the Einstein equations are those of the fluid elements, not the much smaller (generally fundamental) particles that are “confined” (on average) to the elements. Hence, when we talk about the fluid velocity, we mean the velocity of fluid elements. In this sense, the use of the phrase “fluid particle” is very apt. For instance, each fluid element traces out a timelike trajectory in spacetime \(x^a(\tau )\), such that the unit tangent vector

$$\begin{aligned} u^a = {dx^a \over d\tau } , \quad \text{ with } \quad u_a u^a = -1 \end{aligned}$$

where \(\tau \) is time measured on a co-moving clock (proper time), provides the four velocity of the particle. The idea is illustrated in Fig. 3.
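
As a minimal sketch of the normalisation condition, the snippet below builds a four-velocity in flat (Minkowski) spacetime from a coordinate three-velocity and checks that \(u_a u^a = -1\). Restricting to flat spacetime and the function names `four_velocity`, `lower` and `contract` are assumptions made for the illustration. In a curved spacetime the same contraction would use the metric \(g_{ab}\) at the relevant point.

```python
import math

# Minkowski metric in the MTW signature (-,+,+,+), geometrized units c = 1.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def four_velocity(v):
    """Four-velocity u^a = gamma * (1, v) for a coordinate three-velocity v."""
    speed_sq = sum(vi * vi for vi in v)
    if speed_sq >= 1.0:
        raise ValueError("a timelike worldline needs |v| < 1")
    gamma = 1.0 / math.sqrt(1.0 - speed_sq)
    return [gamma] + [gamma * vi for vi in v]


def lower(metric, vec):
    """Lower an index: u_a = g_ab u^b."""
    return [sum(metric[a][b] * vec[b] for b in range(4)) for a in range(4)]


def contract(covec, vec):
    """Contract a covector with a vector: w_a u^a."""
    return sum(w * u for w, u in zip(covec, vec))


u_up = four_velocity([0.6, 0.0, 0.0])
u_down = lower(ETA, u_up)
print(u_up)                     # gamma = 1.25 for |v| = 0.6
print(contract(u_down, u_up))   # equals -1 up to rounding
```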

Fig. 3

An illustration of the fibration of spacetime associated with a set of fluid “observers”, each with their own four velocity \(u^a\) and notion of time (the proper time measured on a co-moving clock). In the fluid model, individual worldlines are assigned to specific fluid elements (which involve averages over the large number of constituent particles)

The fundamental variable that enters the fluid equations is the particle flux density, in the following given by \(n^a = n u^a\), where \(n \approx N/L^3\) is the particle number density of the fluid element whose worldline is given by \(u^a\). An object like a neutron star is then modelled as a collection of particle flux density worldlines that continuously fill a portion of spacetime. In fact, we will see later that the relativistic Euler equation is little more than an “integrability” condition that guarantees that this filling (or fibration) of spacetime can be performed.
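
The relation \(n^a = n u^a\) can also be read in reverse: given a timelike flux, the number density follows from \(n^2 = - n_a n^a\) and the four-velocity from \(u^a = n^a / n\). The sketch below does this in flat spacetime. The Minkowski metric and the helper name `decompose_flux` are assumptions made for the illustration.

```python
import math


def decompose_flux(n_up):
    """Split a timelike flux n^a into (n, u^a), flat spacetime, signature (-,+,+,+)."""
    # n^2 = -n_a n^a; with the Minkowski metric only the time component flips sign.
    norm_sq = n_up[0] ** 2 - sum(c * c for c in n_up[1:])
    if norm_sq <= 0.0:
        raise ValueError("the flux must be timelike")
    n = math.sqrt(norm_sq)
    return n, [c / n for c in n_up]


n, u = decompose_flux([2.5, 1.5, 0.0, 0.0])
print(n)   # number density measured in the fluid rest frame
print(u)   # unit timelike four-velocity
```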

Equivalently, we may consider the family of three-dimensional hypersurfaces that are pierced by the worldlines at given instants of time, as illustrated later in Fig. 10. The integrability condition in this case guarantees that the family of hypersurfaces continuously fill a portion of spacetime. In this view, a fluid is a so-called three-brane (see Carter 1992 for a general discussion of branes). In fact, the strategy adopted in Sect. 6 to derive the relativistic fluid equations is based on thinking of a fluid as living in a three-dimensional “matter” space (i.e., the left-hand-side of Fig. 10). At first sight, this approach may seem confusing. However, as we will demonstrate, it allows us to develop a versatile framework for complicated systems which (in turn) enables progress on a number of relevant problems in astrophysics and cosmology.

Once we understand how to build a fluid model using the matter space, it is straight-forward to extend the technique to single fluids with several constituents, as in Sect. 8.1, and multiple fluid systems, as in Sect. 9. An example of the former would be a fluid with one species of particles at a non-zero temperature, i.e., non-zero entropy, that does not allow for heat conduction relative to the particles. (Of course, entropy still flows through spacetime.) The latter example can be obtained by relaxing the constraint of no heat conduction. In this case the particles and the entropy are both considered to be fluids (Footnote 4) that are dynamically independent, meaning that the entropy will have a four-velocity that is generally different from that of the particles. There is thus an associated collection of fluid elements for the particles and another for the entropy. At each point of spacetime that the system occupies there will be two fluid elements, in other words, there are two matter spaces (cf. Sect. 9). Perhaps the most important consequence of this is that there can be a relative flow of the entropy with respect to the particles. In general, relative flows lead to the so-called entrainment effect, i.e., the momentum of one fluid in a multiple fluid system is in principle a linear combination of all the fluid velocities (Andersson and Comer 2006). The canonical examples of two fluid models with entrainment are superfluid \(\mathrm {He}^4\) (Putterman 1974) at non-zero temperature and a mixture of superfluid \(\mathrm {He}^4\) and \(\mathrm {He}^3\) (Andreev and Bashkin 1975). We will develop a detailed understanding of all these concepts in due course, but as it is important to proceed with care we will first focus on the physics that provide input for the fluid model.

Notation and conventions

Throughout the article we assume the “MTW” (Misner et al. 1973) conventions. We also generally assume geometrized units \(c=G=1\), unless specifically noted otherwise, and set the Boltzmann constant \(k_B = 1\). A coordinate basis will always be used, with spacetime indices denoted by lowercase Latin letters \(\{a,b,\ldots \}\) etc. that range over \(\{0,1,2,3\}\) (time being the zeroth coordinate), and purely spatial indices denoted by lowercase Latin letters \(\{i,j,\ldots \}\) etc. that range over \(\{1,2,3\}\). Unless otherwise noted, we assume that the Einstein summation convention applies. Finally, we adopt the convention that \(u^{\mathrm {x}}_a=g_{ab} u^b_\mathrm {x}\) where \({\mathrm {x}}\) is a fluid constituent label. These are never summed over when repeated. Also note that, while it is possible to build a chemically covariant formalism (with the \({\mathrm {x}}\) treated on a par with spacetime indices) we will not do so here. Our approach has the “advantage” that the constituent labels can be placed up or down, without this having any particular meaning, which helps keep many of the expressions tidy. We will also regularly have to deal with expressions where more than two of these labels are repeated and this complicates a fully covariant approach.

Thermodynamics and equations of state

As fluids consist of many fluid elements, and each fluid element consists of many particles, the state of matter in a given fluid element is (inevitably) determined thermodynamically (Reichl 1984). This means that only a few parameters are tracked as the fluid element evolves. In a typical situation, not all the thermodynamic variables are independent: they are connected through the so-called equation of state. Moreover, the number of independent variables may be reduced if the system has an overall additivity property. As this is a very instructive example, we will illustrate this point in detail.

Fundamental, or Euler, relation

Consider the standard form of the combined First and Second Laws (Footnote 5) for a simple, single-species system:

$$\begin{aligned} d E = T \, d S - p \, d V + \mu \, d N. \end{aligned}$$

This follows because there is an equation of state, meaning that \(E = E(S,V,N)\) where

$$\begin{aligned} T = \left. \frac{\partial E}{\partial S} \right| _{V,N}, \quad \quad p = - \left. \frac{\partial E}{\partial V} \right| _{S,N}, \quad \quad \mu = \left. \frac{\partial E}{\partial N} \right| _{S,V} . \end{aligned}$$

The total energy E, entropy S, volume V, and particle number N are said to be extensive if when S, V, and N are doubled, say, then E will also double. Conversely, the temperature T, pressure p, and chemical potential \(\mu \) are called intensive if they do not change their values when V, N, and S are doubled. This is the additivity property and we will now show why it implies an Euler relation (also known as the “fundamental relation”; Reichl 1984) among the thermodynamic variables. This relation is essential for any effort to connect the microphysics and thermodynamics to the fluid dynamics.

Let a bar denote the value of a thermodynamic variable after S, V, and N have all been rescaled by the same factor \(\lambda \), i.e.,

$$\begin{aligned} \overline{S} = \lambda S, \qquad \overline{V} = \lambda V, \qquad \overline{N} = \lambda N. \end{aligned}$$

Taking E to be extensive (Footnote 6) then means

$$\begin{aligned} \overline{E}(\overline{S},\overline{V},\overline{N}) = \lambda E(S,V,N). \end{aligned}$$

Of course, we have for the intensive variables

$$\begin{aligned} \overline{T} = T, \qquad \overline{p} = p, \qquad \overline{\mu } = \mu . \end{aligned}$$


Now, differentiating the rescaled energy and applying the first law to the barred variables, we have

$$\begin{aligned} d \overline{E} &= \lambda \, d E + E \, d \lambda = \overline{T} \, d \overline{S} - \overline{p} \, d \overline{V} + \overline{\mu } \, d \overline{N} \\ &= \lambda \left( T d S - p d V + \mu d N \right) + \left( T S - p V + \mu N \right) d \lambda , \end{aligned}$$

and, since the terms proportional to \(\lambda \) cancel by the first law, the coefficients of \(d \lambda \) must also balance. This gives the Euler relation

$$\begin{aligned} E = T S - p V + \mu N. \end{aligned}$$

If we let \(\varepsilon = E / V\) denote the total energy density, \(s = S / V\) the total entropy density, and \(n = N / V\) the total particle number density, then

$$\begin{aligned} p + \varepsilon = T s + \mu n. \end{aligned}$$
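
The Euler relation can be checked numerically for any extensive equation of state. The sketch below uses a toy \(E(S,V,N)\) of monatomic-ideal-gas form, with \(k_B = 1\) and all dimensional constants dropped. This choice is an assumption made purely for illustration, not an equation of state used in this review. The derivatives are taken by central finite differences, so the relation holds only up to a small numerical residual.

```python
import math


def energy(S, V, N):
    """Toy extensive equation of state E(S, V, N) (ideal-gas form, k_B = 1)."""
    return N ** (5.0 / 3.0) * V ** (-2.0 / 3.0) * math.exp(2.0 * S / (3.0 * N))


def partial(f, args, i, rel_step=1e-6):
    """Central finite-difference derivative of f with respect to args[i]."""
    h = rel_step * abs(args[i])
    up = list(args)
    down = list(args)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def euler_residual(S, V, N):
    """Relative residual of the Euler relation E = T S - p V + mu N."""
    state = (S, V, N)
    T = partial(energy, state, 0)     # T = dE/dS at fixed V, N
    p = -partial(energy, state, 1)    # p = -dE/dV at fixed S, N
    mu = partial(energy, state, 2)    # mu = dE/dN at fixed S, V
    E = energy(*state)
    return (T * S - p * V + mu * N - E) / E


state = (3.0, 2.0, 5.0)
lam = 7.0
scaled = tuple(lam * x for x in state)
print(energy(*scaled) / energy(*state))   # equals lam for an extensive E, up to rounding
print(euler_residual(*state))             # close to zero
```

Replacing `energy` with a function that is not homogeneous of degree one in (S, V, N) makes both checks fail, which is one way to see that the Euler relation is a statement about extensivity rather than about the first law alone.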

The nicest feature of an extensive system is that the number of parameters required for a complete specification of the thermodynamic state can be reduced by one, in such a way that only intensive variables remain. To see this, let \(\lambda = 1/V\), in which case

$$\begin{aligned} \overline{S} = s, \qquad \overline{V} = 1, \qquad \overline{N} = n. \end{aligned}$$

The re-scaled energy becomes just the total energy density, i.e., \(\overline{E} = E / V = \varepsilon \), and moreover \(\varepsilon = \varepsilon (s,n)\) since

$$\begin{aligned} \varepsilon = \overline{E}(\overline{S},\overline{V},\overline{N}) = \overline{E}(S/V,1,N/V) = \overline{E}(s,n). \end{aligned}$$

The first law thus becomes

$$\begin{aligned} d \overline{E} = \overline{T} \, d \overline{S} - \overline{p} \, d \overline{V} + \overline{\mu } \, d \overline{N} = T \, d s + \mu \, d n, \end{aligned}$$

or

$$\begin{aligned} d \varepsilon = T \, d s + \mu \, d n. \end{aligned}$$

This implies

$$\begin{aligned} T = \left. \frac{\partial \varepsilon }{\partial s} \right| _n, \qquad \mu = \left. \frac{\partial \varepsilon }{\partial n} \right| _s. \end{aligned}$$

That is, \(\mu \) and T are the chemical potentials (Footnote 7) associated with the particles and entropy, respectively. The Euler relation (2.8) then yields the pressure as

$$\begin{aligned} p = - \varepsilon + s \left. \frac{\partial \varepsilon }{\partial s} \right| _n + n \left. \frac{\partial \varepsilon }{\partial n} \right| _s. \end{aligned}$$
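
In practice this means that a single function \(\varepsilon (s,n)\) is enough to recover the temperature, chemical potential and pressure. The sketch below illustrates the procedure with a toy equation of state, \(\varepsilon = n^{5/3} \exp [2s/(3n)]\), which is an assumed ideal-gas-like form with \(k_B = 1\) and constants dropped. The helper `thermodynamic_state` is a hypothetical name, and the derivatives are approximated by central finite differences.

```python
import math


def energy_density(s, n):
    """Toy equation of state eps(s, n) = n^(5/3) exp(2 s / (3 n)), with k_B = 1."""
    return n ** (5.0 / 3.0) * math.exp(2.0 * s / (3.0 * n))


def thermodynamic_state(eos, s, n, rel_step=1e-6):
    """Return (T, mu, p) from eps(s, n) using central finite differences."""
    hs = rel_step * abs(s)
    hn = rel_step * abs(n)
    T = (eos(s + hs, n) - eos(s - hs, n)) / (2.0 * hs)     # d eps / d s at fixed n
    mu = (eos(s, n + hn) - eos(s, n - hn)) / (2.0 * hn)    # d eps / d n at fixed s
    p = -eos(s, n) + s * T + n * mu                        # Euler relation
    return T, mu, p


T, mu, p = thermodynamic_state(energy_density, s=1.5, n=2.5)
eps = energy_density(1.5, 2.5)
print(T, mu, p)
print(p / eps)   # close to 2/3 for this toy model
```

For this particular toy model the analytic result is \(p = 2 \varepsilon / 3\), which gives a simple check on the numerical derivatives.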

In essence, we can think of a given relation \(\varepsilon (s,n)\) as the equation of state, to be determined in the flat, tangent space at each point of spacetime, or, physically, small enough patches across which the changes in the gravitational field are negligible, but also large enough to contain a large number of particles. For example, for a neutron star, Glendenning (1997) argues that the relative change in the metric over the size of a nucleon with respect to the change over the entire star is about \(10^{- 19}\), and thus one must consider many inter-nucleon spacings before a substantial change in the metric occurs. In other words, it is sufficient to determine the properties of matter in special relativity, neglecting effects due to the spacetime curvature (Footnote 8). The equation of state is the key link between the microphysics that governs the local fluid behaviour and global quantities (such as the mass and radius of a star).

In what follows we will use a thermodynamic formulation that satisfies the fundamental scaling relation, meaning that the local thermodynamic state (modulo entrainment, see later) is a function of the variables N/V, S/V, and so on. This is in contrast to the discussion in, for example, “MTW” (Misner et al. 1973). In their approach one fixes from the outset the total number of particles N, meaning that one simply sets \(d N = 0\) in the first law of thermodynamics. Thus, without imposing any scaling relation, one can write

$$\begin{aligned} d \varepsilon = d \left( E/V \right) = T \, d s + \frac{1}{n} \left( p + \varepsilon - T s \right) d n. \end{aligned}$$

This is consistent with our starting point, because we assume that the extensive variables associated with a fluid element do not change as the fluid element moves through spacetime. However, we feel that the scaling is necessary in that the fully conservative (read: non-dissipative) fluid formalism presented below can be adapted to non-conservative, or dissipative, situations where \(d N = 0\) cannot be imposed.
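
For reference, the fixed-N form of the first law quoted above can be recovered in a few steps. The following is a sketch that uses only the definitions \(n = N/V\), \(s = S/V\) and \(\varepsilon = E/V\), together with \(d N = 0\) in \(d E = T \, d S - p \, d V\):

$$\begin{aligned} d n &= - n \, \frac{d V}{V}, \qquad d s = \frac{d S}{V} - s \, \frac{d V}{V}, \\ d \varepsilon &= \frac{d E}{V} - \varepsilon \, \frac{d V}{V} = \frac{T \, d S - p \, d V}{V} - \varepsilon \, \frac{d V}{V} \\ &= T \, d s - \left( p + \varepsilon - T s \right) \frac{d V}{V} = T \, d s + \frac{1}{n} \left( p + \varepsilon - T s \right) d n . \end{aligned}$$

If the Euler relation is imposed as well, then \(p + \varepsilon - T s = \mu n\) and the expression reduces to \(d \varepsilon = T \, d s + \mu \, d n\), in agreement with the scaling argument.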

Case study: neutron stars

With a mass of more than that of the Sun squeezed inside a radius of about 10 km, a neutron star represents many extremes of physics. The relevant matter description involves issues that cannot be explored in terrestrial laboratories, yet relies on aspects similar to those probed by high-energy colliders. However, while the LHC at CERN and RHIC at Brookhaven (among others) probe low density matter at high temperatures, neutron stars are cold (on the nuclear physics temperature scale) and reach significantly higher densities. In effect, the problems are complementary; see Fig. 4 for a schematic illustration. Moreover, astrophysical modelling of neutron star dynamics (e.g., the global oscillations of the star) typically involves large enough scales that a fluid description is an absolute necessity. Yet, such models must build on appropriate microphysics input (encoded in the equation of state). This is problematic because first-principles calculations of the interactions for many-body QCD systems are not yet within reach (due to the fermion sign problem). In essence, we do not know the composition of matter. There may be a large population of hyperons present at densities relevant for neutron star cores. Perhaps the quarks are deconfined to form a quark-gluon plasma? Our models need to be flexible enough to account for different possibilities, and the problem is further complicated by the state of matter. At the relevant temperatures, many of the particle constituents (neutrons, protons, hyperons, etc.) are expected to exhibit Cooper pairing to form superfluid/superconducting condensates. This brings in aspects from low-temperature physics and a realistic neutron-star model must recognize this. In short, the problem is overwhelming and one would typically (at some point) have to resort to phenomenology, using experiments and observations to test predictions as new models become available (Watts et al. 2016).

Fig. 4

A broad-brush illustration of the phase space for dense matter physics, represented by the baryon chemical potential (\(\mu _{\mathrm b}\)) (horizontal axis) and the temperature (vertical axis). Experiments carried out using high-energy colliders, like the LHC and RHIC, aim to explore the nature of the quark-gluon plasma and the conditions of the early Universe—hot matter at relatively low densities. In contrast, an understanding of relativistic stars depends on the dense, low-temperature regime, which is unlikely to be within reach of laboratory efforts. First-principles calculations in the \(\mu _{\mathrm b}\rightarrow \infty \) limit of QCD suggest that the core of a mature neutron star may contain a colour superconductor, but the exact nature of the quark pairing at the relevant densities is not (particularly) well understood (Alford et al. 2008)

The details may be blurry but (at least) the rules that guide the exercise are fairly clear. We need to build models that allow for a complex matter composition and account for different states of matter (from solids to superfluids). This involves going beyond the single-fluid setting and considering systems with distinct components exhibiting relative flows. In short, we need to model multi-constituent multi-fluid systems. As both concepts will be central to the discussion, let us introduce the main ideas already at this point.

It is natural to start by considering the matter in the outer core of a neutron star, dominated by neutrons with a small fraction of protons and electrons. Assuming that the different constituents flow together (we will relax this assumption later), we have the thermodynamic relation (assuming matter at zero temperature, for simplicity)

$$\begin{aligned} p+\varepsilon = \sum _{{\mathrm {x}}} n_{\mathrm {x}}\mu _{\mathrm {x}}, \quad \text{ with } \quad {\mathrm {x}}= \mathrm {n},\mathrm {p},\mathrm{e}, \end{aligned}$$

where \(n_{\mathrm {x}}\) are the respective number densities and \(\mu _{\mathrm {x}}\) the corresponding chemical potentials. This is a straightforward extension of (2.14). At the microscopic scale (e.g., the level of the equation of state), it is usually assumed that the matter is charge neutral. The number of electrons must balance the number of protons. We have \(n_\mathrm {p}=n_\mathrm{e}\) and it follows that

$$\begin{aligned} p+\varepsilon = n_\mathrm {n}\mu _\mathrm {n}+ n_\mathrm {p}(\mu _\mathrm {p}+\mu _\mathrm{e}) \end{aligned}$$

Next, we need to consider the issue of chemical equilibrium. For the case under consideration this would involve the system being such that the Urca reactions are in balance. In essence, this means that we have

$$\begin{aligned} \beta \equiv \mu _\mathrm {n}- (\mu _\mathrm {p}+ \mu _\mathrm{e}) = 0 . \end{aligned}$$

This condition determines how many neutrons we need per proton, which means that the composition is specified. In general, we can rewrite the thermodynamical relation as (Footnote 9)

$$\begin{aligned} p+\varepsilon = n \mu _\mathrm {n}- n_\mathrm {p}\beta , \end{aligned}$$

where we have introduced the baryon number density \(n= n_\mathrm {n}+n_\mathrm {p}\). Assuming equilibrium, this leads to

$$\begin{aligned} p = n \mu _\mathrm {n}(n) - \varepsilon (n) ; \end{aligned}$$

that is, we have a one-parameter equation of state. It is common to think of the equation of state in this way—the pressure is provided as a function of the (baryon number) density.

Many formulations for numerical simulations take this “barotropic” model as the starting point. The usual logic works (in some sense) “backwards” by focussing on the mass density and separating out the mass density contribution to the chemical potential by introducing \(\rho = mn\) where m is the baryon mass. That is, we use

$$\begin{aligned} \mu _\mathrm {n}= m+ \overline{\mu }. \end{aligned}$$

This expression reflects the simple fact that the (rest) mass of a particle in isolation should be \(mc^2\), leaving the (to some extent) unknown aspects of the many-body interactions to be encoded in \(\overline{\mu }\). This allows us to write

$$\begin{aligned} p = \rho + (n\overline{\mu }- \varepsilon ) = n\overline{\mu }- \rho \epsilon , \end{aligned}$$

where \(\epsilon \) represents the (specific) internal energy. Numerical efforts often focus on \(\epsilon \). The reason for this will become clear shortly. First, it is easy to see that we also have

$$\begin{aligned} \varepsilon = \rho (1+\epsilon ) , \end{aligned}$$

and

$$\begin{aligned} \overline{\mu }= {d (\rho \epsilon ) \over dn}. \end{aligned}$$

It is also useful to note that

$$\begin{aligned} d\epsilon = {p\over \rho ^2} d\rho . \end{aligned}$$
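
The relations above can be checked numerically for a simple model. The sketch below assumes a polytropic energy density, \(\varepsilon (n) = mn + K n^\varGamma /(\varGamma -1)\), which is not taken from the text; the values of `m`, `K` and `Gamma` are arbitrary illustrative choices in units with \(c=1\). It computes the pressure from \(p+\varepsilon = n\mu \), the specific internal energy from \(\varepsilon = \rho (1+\epsilon )\), and compares \(d\epsilon /d\rho \) with \(p/\rho ^2\).

```python
# Assumed toy model (not from the text): polytropic energy density
#   eps_total(n) = m*n + K*n**Gamma/(Gamma - 1),  units with c = 1.
m, K, Gamma = 1.0, 0.1, 2.0   # hypothetical illustrative parameters


def energy_density(n):
    return m * n + K * n**Gamma / (Gamma - 1.0)


def chemical_potential(n):
    # mu = d(eps_total)/dn, evaluated analytically for the toy model
    return m + K * Gamma * n**(Gamma - 1.0) / (Gamma - 1.0)


def pressure(n):
    # zero-temperature relation p + eps_total = n*mu
    return n * chemical_potential(n) - energy_density(n)


def specific_internal_energy(n):
    # defined through eps_total = rho*(1 + e_spec) with rho = m*n
    rho = m * n
    return energy_density(n) / rho - 1.0


def check_first_law(n, h=1e-6):
    """Return (d e_spec/d rho, p/rho**2); the two should agree."""
    rho = m * n
    e_plus = specific_internal_energy(n + h)
    e_minus = specific_internal_energy(n - h)
    de_drho = (e_plus - e_minus) / (2.0 * h * m)   # d rho = m dn
    return de_drho, pressure(n) / rho**2


lhs, rhs = check_first_law(0.5)
print(lhs, rhs)
```

For this toy model the pressure reduces to \(p = K n^\varGamma \), so the two printed numbers should coincide up to finite-difference error.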

Let us now see what happens when we try to account for additional aspects, like the effects due to a finite temperature. Assuming that we are comfortable working with the chemical potential (as we will do throughout much of this review) the natural starting point would be (2.12). However, it could be that we would prefer to extend the discussion using the internal energy. In that case, we first of all need to convince ourselves that (2.22) and (2.23) remain valid when \(\varepsilon = \varepsilon (n,s)\). We then have \(\epsilon =\epsilon (\rho ,s)\), which leads to

$$\begin{aligned} {\partial \epsilon \over \partial s} = {T \over \rho } \end{aligned}$$

and we find that

$$\begin{aligned} d \epsilon = {p\over \rho ^2} d\rho + {T\over \rho } ds - {sT \over \rho ^2} d \rho = {p\over \rho ^2} d\rho + T d \hat{s} \end{aligned}$$

where we have introduced the specific entropy

$$\begin{aligned} \hat{s} = {s\over \rho } . \end{aligned}$$

If we want to progress beyond this point, we need to provide the form for the internal energy. This requires a finite temperature treatment on the microphysics level, as discussed by (for example) Constantinou et al. (2015) and Lattimer and Prakash (2016).

Before we move on, it is useful to note that many numerical simulations have been based on implementing a pragmatic result drawn from the ideal gas law

$$\begin{aligned} p = n k_B T , \end{aligned}$$

where \(k_B\) is Boltzmann’s constant. Noting that this model leads to \(\epsilon = C_v T\), with \(C_v\) the specific heat capacity (at fixed volume), Mayer’s relation

$$\begin{aligned} {k_B\over C_v} = m(\varGamma -1), \end{aligned}$$

where \(\varGamma \) is the adiabatic index, leads to

$$\begin{aligned} p = \rho \epsilon (\varGamma -1). \end{aligned}$$

For obvious reasons this is commonly referred to as the Gamma-law equation of state. It may not be particularly realistic—at least not for neutron stars—but it is simple (and relatively easy to implement). It also provides a straightforward measure of the temperature. Combining (2.29) and (2.31) we arrive at

$$\begin{aligned} T = {m\epsilon \over k_B} (\varGamma -1) = {m\over k_B} {p\over \rho }. \end{aligned}$$
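
As a minimal sketch of how the Gamma-law relations might be evaluated in practice, the snippet below computes the pressure and the associated temperature measure from \(\rho \), \(\epsilon \) and \(\varGamma \). The function names, the choice of code units with \(k_B = m = 1\), and the sample state are assumptions made for illustration only.

```python
# Gamma-law relations from the text, in units with c = 1:
#   p = rho * e_spec * (Gamma - 1),   T = (m / k_B) * p / rho
K_B = 1.0   # Boltzmann constant in code units (assumption: k_B = 1)


def gamma_law_pressure(rho, e_spec, Gamma):
    return rho * e_spec * (Gamma - 1.0)


def gamma_law_temperature(rho, e_spec, Gamma, m=1.0, k_B=K_B):
    # temperature measure obtained by combining the ideal gas law
    # with the Gamma-law pressure
    p = gamma_law_pressure(rho, e_spec, Gamma)
    return m * p / (k_B * rho)


rho, e_spec, Gamma = 2.0, 0.3, 5.0 / 3.0   # arbitrary illustrative state
p = gamma_law_pressure(rho, e_spec, Gamma)
T = gamma_law_temperature(rho, e_spec, Gamma)
print(p, T)
```

Both forms of the temperature expression, \(m\epsilon (\varGamma -1)/k_B\) and \(mp/(k_B\rho )\), give the same number here by construction.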

This is useful, but we need to be careful with this result. In a more general setting—like a multi-constituent system for which the ideal gas law argument is dubious—we are not quantifying the actual temperature. This would require use of the relevant physics from the beginning of the argument rather than at the end. However, sometimes you have to accept a bit of pragmatism as the price of progress.

Up to this point, we have separated the microphysics (determining the equation of state) from the hydrodynamics (governing stellar oscillations and the like). Let us now consider the scale associated with fluid dynamics. For ordinary matter, the relevant scale is set by interparticle collisions. Collisions tend to dissipate relative motion, leading to the system reaching (local dynamical and thermodynamical) equilibrium. Since we want to associate a single “velocity” with each fluid element, the particles must be able to equilibrate in a meaningful sense (e.g., have a velocity distribution with a well defined peak, allowing us to average over the system). The relevant length-scale is the mean-free path. This concept is closely related to the shear viscosity of matter (which arises due to particle scattering). In the case of neutrons (which dominate the outer core of a typical neutron star) we would have

$$\begin{aligned} \lambda \approx { \eta \over \rho v_F} \approx 10^{-4} \left( {\rho \over 10^{14}\ \text{ g/cm}^3} \right) ^{11/12} \left( {10^8 \ \text{ K } \over T}\right) ^2 \ \text{ cm } , \end{aligned}$$

where \(v_F\) is the relevant Fermi velocity and we have used the estimate for the neutron-neutron scattering shear viscosity \(\eta \) from Andersson et al. (2005). This estimate gives us an idea of the smallest scale on which it makes sense to consider the system as a fluid. Notably, the mean-free path is many orders of magnitude larger than the interparticle separation (typically, the Fermi scale). The actual scale assumed in a fluid model typically depends on the problem one wants to study and tends to be limited by computational resources. For example, in current state of the art simulations of neutron star mergers, the computational fluid elements tend to be of order a few tens to perhaps a hundred meters across. They are in no sense microscopic entities. It is important to appreciate that these models involve a significant amount of “extrapolation”.
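
To get a feel for the numbers, the scaling estimate for the mean-free path can be evaluated directly. The sketch below simply encodes the formula quoted above; the function name and the comparison scale of 1 fm \(= 10^{-13}\) cm (standing in for the interparticle separation) are assumptions for illustration.

```python
FERMI_CM = 1e-13   # 1 fm in cm (assumed stand-in for the interparticle separation)


def neutron_mean_free_path_cm(rho_cgs, T_kelvin):
    """lambda ~ 1e-4 (rho / 1e14 g/cm^3)^(11/12) (1e8 K / T)^2 cm."""
    return 1e-4 * (rho_cgs / 1e14) ** (11.0 / 12.0) * (1e8 / T_kelvin) ** 2


lam_cold = neutron_mean_free_path_cm(1e14, 1e8)   # fiducial values from the estimate
lam_warm = neutron_mean_free_path_cm(1e14, 1e9)   # ten times hotter
print(lam_cold, lam_warm, lam_cold / FERMI_CM)
```

Because of the \(T^{-2}\) scaling, raising the temperature by a factor of ten shortens the estimated mean-free path by a factor of one hundred.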

Assuming that the averaging procedure makes sense (we will have more to say about this later), the equations of hydrodynamics can be obtained from a set of (more or less) phenomenological balance laws representing the conservation (or not...) of the key quantities. The possibility that different fluid components may be able to flow (or perhaps rather “drift”) relative to one another, leads to a multi-fluid system. In order to model such systems we assume that the system contains a number of distinguishable components, the dynamics of which are coupled. The formalism that we will develop draws on experience from chemistry, where one regularly has to consider the mechanics of mixtures, but is adapted to the kind of systems that are relevant for General Relativity. The archetypal such system is (again) represented by the neutron star core, where we expect different components (neutrons, protons, hyperons) to be in a superfluid state. However, the formalism is general enough that it can be applied in a variety of contexts, including (as we shall see later) the problem of heat conduction and the charged flows relevant for electromagnetism.

As the concept may not be familiar, it is worth considering the notion of a multi-fluid system in a bit more detail before we move on. In principle, it is easy to see how such a system may arise. Recall the discussion of the mean-free path, but consider a system with two distinct particle species. Suppose that the mean-free path associated with scattering of particles of the same kind is (for some reason) significantly shorter than the scale for inter-species collisions. Then we have two clearly defined “fluids”. In fact, any system where it is meaningful to consider one component drifting (on average) relative to another one can be considered from this point of view (a liquid with gas bubbles would be an obvious example).

Another relevant context involves systems that exhibit superfluidity. At the most basic level, superfluidity implies that no friction impedes the flow. Technically, the previous argument leading to a scale for averaging does not work anymore. However, a superfluid system has a different scale associated with it; the so-called coherence length. The coherence length arises from the fact that a superfluid is a “macroscopic” quantum state, the flow of which depends on the gradient of the phase of the wave-function (the so-called order parameter, see Sect. 13.1). On some small scale, the superfluidity breaks down due to quantum fluctuations. This defines the coherence length. It can be taken as the typical “size” of a Cooper pair in a fermionic system. On any larger scale the system exhibits collective (fluid) behaviour.

For neutron-star superfluids, the coherence length is of the order of tens of Fermi; evidently, much smaller than the mean-free path in the normal fluid case. This means that superfluids can exhibit extremely small scale dynamics. Since a superfluid is inviscid, superfluid neutrons and superconducting protons (say) do not scatter (at least not as long as thermal excitations can be ignored) and hence the outer core of a neutron star demands a multi-fluid treatment (Glampedakis et al. 2011). One can meaningfully take the fluid elements to have a size of the order of the coherence length, i.e. they are tiny. However, in reality the problem is more complicated, as yet another length-scale needs to be considered. First of all, on scales larger than the Debye screening length, the electrons will be electromagnetically locked to the protons, forming a charge-neutral conglomerate that does exhibit friction (due to electron-electron scattering). This brings us back to the mean-free path argument. At finite temperatures we also need to consider thermal excitations for both neutrons and protons (which may scatter and dissipate), making the problem rather complex. Finally, ideal superfluids are irrotational and neutron stars are not. In order to mimic bulk rotation the neutron superfluid must form a dense array of vortices (locally breaking the superfluidity). This brings yet another length scale into the picture. In order to develop a useful fluid model, we need to average over the vortices, as well. This makes the effective fluid elements much larger. The typical vortex spacing in a neutron star is of the order

$$\begin{aligned} d_\mathrm {n}\approx 4\times 10^{-4} \left( {P \over 1\ \text{ ms }} \right) ^{1/2} \ \text{ cm } , \end{aligned}$$

where P is the star’s spin period. In other words, the fluid elements we consider may (at the end of the day) be quite large also in a superfluid system.
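
The separation between the superfluid length scales can be made concrete with a short calculation. The sketch below evaluates the vortex-spacing estimate quoted above and compares it with a coherence length of 10 fm; that specific value is an assumption chosen to represent "tens of Fermi", and the function name is hypothetical.

```python
COHERENCE_LENGTH_CM = 1e-12   # assumed 10 fm, representing "tens of Fermi"


def vortex_spacing_cm(period_ms):
    """d_n ~ 4e-4 (P / 1 ms)^(1/2) cm, with P the spin period in milliseconds."""
    return 4e-4 * period_ms ** 0.5


d_fast = vortex_spacing_cm(1.0)     # millisecond pulsar
d_slow = vortex_spacing_cm(100.0)   # slower rotator, P = 0.1 s
print(d_fast, d_slow, d_fast / COHERENCE_LENGTH_CM)
```

Even for the fastest rotators the vortex spacing exceeds the assumed coherence length by many orders of magnitude, which is why averaging over vortices leads to comparatively large fluid elements.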

Physics in a curved spacetime

There is an extensive literature on Special and General Relativity and the spacetime-based view (Footnote 10) of the laws of physics, providing historical context, technical insight and topical updates. For a student at any level interested in developing a working understanding we recommend Taylor and Wheeler (1992) for an introduction, followed by Hartle’s excellent text (2003) designed for students at the undergraduate level. The recent contribution from Poisson and Will (2014) provides a detailed discussion of the link between Newtonian gravity and Einstein’s four dimensional picture. For more advanced students, we suggest two of the classics, “MTW” (Misner et al. 1973) and Weinberg (1972), or the more contemporary book by Wald (1984). Finally, let us not forget the Living Reviews journal as a premier online source of up-to-date information!

In terms of the experimental and/or observational support for Special and General Relativity, we recommend two articles by Will that were written for the 2005 World Year of Physics celebration (2005, 2006). They summarize a variety of tests that have been designed to expose breakdowns in both theories. (We also recommend Will’s popular book Was Einstein Right? (1986) and his technical exposition Theory and Experiment in Gravitational Physics (1993).) Updates including the breakthrough observations of gravitational waves can be found in recent monographs (Maggiore 2018; Andersson 2019). There have been significant recent developments, but... to date, Einstein’s theoretical edifice is still standing!

For Special Relativity, this is not surprising, given its long list of successes: explanation of the Michelson–Morley result, the prediction and subsequent discovery of anti-matter, and the standard model of particle physics, to name a few. Will (2006) offers the observation that genetic mutations via cosmic rays require Special Relativity, since otherwise muons would decay before making it to the surface of the Earth. On a more somber note, we may consider the Trinity site in New Mexico, and the tragedies of Hiroshima and Nagasaki, as reminders of \(E = m c^2\).

In support of General Relativity, there are Eötvös-type experiments testing the equivalence of inertial and gravitational mass, detection of gravitational red-shifts of photons, the passing of the solar system tests, confirmation of energy loss via gravitational radiation in the Hulse–Taylor binary pulsar—and eventually the first direct detection of these faint whispers from the Universe in 2015—and the expansion of the Universe. Incredibly, General Relativity even finds a practical application in the GPS system. In fact, we need both of Einstein’s theories. The speed of the moving clock leads to it slowing down by 7 micro-seconds every day, while the fact that a clock in a gravitational field runs slow leads to the orbiting clock appearing to speed up by 45 micro-seconds each day. All in all, if we ignore relativity, position errors accumulate at a rate of about 10 km every day (Will 2006). This would make reliable navigation impossible.
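
The GPS numbers quoted above can be combined in a back-of-the-envelope estimate. The sketch assumes that the net daily clock offset translates directly into a ranging error \(c\,\Delta t\), which is a simplification of how positioning actually works; the variable names are illustrative.

```python
C_KM_PER_S = 299_792.458   # speed of light in km/s

sr_shift_us = -7.0    # special-relativistic slowing, microseconds per day (from the text)
gr_shift_us = +45.0   # gravitational speed-up, microseconds per day (from the text)

# net offset of the orbiting clock relative to a ground clock, per day
net_shift_us = sr_shift_us + gr_shift_us

# assumed conversion: ranging error = c * (clock offset)
range_error_km = net_shift_us * 1e-6 * C_KM_PER_S
print(net_shift_us, range_error_km)
```

The net offset of 38 micro-seconds per day corresponds to roughly 11 km, consistent with the order-of-magnitude figure of about 10 km per day.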

The evidence is overwhelming that General Relativity, or at least some closely related theory that passes the entire collection of tests, is the proper description of gravity. Given this, we assume the Einstein Equivalence Principle, i.e., that (Will 2006, 2005, 1993)

  • test bodies fall with the same acceleration independently of their internal structure or composition;

  • the outcome of any local non-gravitational experiment is independent of the velocity of the freely-falling reference frame in which it is performed;

  • the outcome of any local non-gravitational experiment is independent of where and when in the Universe it is performed.

If the Equivalence Principle holds, then gravitation must be described by a metric-based theory (Will 2006). This means that

  1. spacetime is endowed with a symmetric metric,

  2. the trajectories of freely falling bodies are geodesics of that metric, and

  3. in local freely falling reference frames, the non-gravitational laws of physics are those of Special Relativity.

For our present purposes this is very good news. The availability of a metric (Footnote 11) means that we can develop the theory without requiring much of the differential geometry edifice that would be needed in a more general case. We will develop the description of relativistic fluids with this in mind. Readers who find our approach too “pedestrian” may want to consult the article by Gourgoulhon (2006), which serves as a useful complement to our description.

The metric and spacetime curvature

Our strategy is to provide a “working understanding” of the mathematical objects that enter the Einstein equations of General Relativity. We assume that the metric is the fundamental “field” of gravity. For a four-dimensional spacetime the metric determines the distance between two spacetime points along a given curve, which can generally be written as a one parameter function with, say, components \(x^a(\tau )\). For a material body, it is natural to take the parameter to be proper time, but we may opt to make a different choice. As we will see, once a notion of parallel transport is established, the metric also encodes information about the curvature of spacetime, which is taken to be pseudo-Riemannian, meaning that the signature (Footnote 12) of the metric is \(-+++\) (cf. Eq. (3.2) below).

In a coordinate basis, which we will assume throughout this review, the metric is denoted by \(g_{a b} = g_{b a}\). The symmetry implies that there are in general ten independent components (modulo the freedom to set arbitrarily four components that is inherited from coordinate transformations; cf. Eqs. (3.8) and (3.9) below). The spacetime version of the Pythagorean theorem takes the form

$$\begin{aligned} d s^2 = g_{a b} \, d x^a \, d x^b , \end{aligned}$$

and in a local set of Minkowski coordinates \(\{t,x,y,z\}\) (i.e., in a local inertial frame, or small patch of the manifold) it looks like

$$\begin{aligned} d s^2 = - \left( d t \right) ^2 + \left( d x \right) ^2 + \left( dy \right) ^2 + \left( dz \right) ^2. \end{aligned}$$

This illustrates the \(-+++\) signature. The inverse metric \(g^{a b}\) is such that

$$\begin{aligned} g^{a c} g_{c b} = \delta ^a{}_b, \end{aligned}$$

where \(\delta ^a{}_b\) is the unit tensor. The metric is also used to raise and lower spacetime indices, i.e., if we let \(V^a\) denote a contravariant vector, then its associated covariant vector (also known as a covector or one-form) \(V_a\) is obtained as

$$\begin{aligned} V_a = g_{a b} V^b \qquad \Leftrightarrow \qquad V^a = g^{a b} V_b . \end{aligned}$$

We can now consider three different classes of curves: timelike, null, and spacelike. A vector is said to be timelike if \(g_{a b} V^a V^b < 0\), null if \(g_{a b} V^a V^b = 0\), and spacelike if \(g_{a b} V^a V^b > 0\). We can naturally define timelike, null, and spacelike curves in terms of the congruence of tangent vectors that they generate. A particularly useful timelike curve for fluids is one that is parameterized by the so-called proper time, i.e., \(x^a(\tau )\) where

$$\begin{aligned} d \tau ^2 = - d s^2. \end{aligned}$$

The tangent \(u^a\) to such a curve has unit magnitude; specifically,

$$\begin{aligned} u^a \equiv \frac{dx^a}{d\tau }, \end{aligned}$$

and thus

$$\begin{aligned} g_{a b} u^a u^b = g_{a b} \frac{d x^a}{d \tau } \frac{d x^b}{d\tau } = \frac{d s^2}{d \tau ^2} = - 1. \end{aligned}$$
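
The normalization can be illustrated in the simplest setting. The sketch below assumes the flat Minkowski metric in units with \(c = 1\) and a four-velocity of the familiar special-relativistic form \(u^a = \gamma (1, v^i)\); the helper names and the sample three-velocity are hypothetical choices made for illustration.

```python
import math

# Minkowski metric with signature (-, +, +, +), coordinates {t, x, y, z}
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def four_velocity(vx, vy, vz):
    """u^a = gamma * (1, v) for a three-velocity with |v| < 1 (c = 1)."""
    v2 = vx * vx + vy * vy + vz * vz
    gamma = 1.0 / math.sqrt(1.0 - v2)
    return [gamma, gamma * vx, gamma * vy, gamma * vz]


def lower_index(g, V):
    """V_a = g_{ab} V^b."""
    return [sum(g[a][b] * V[b] for b in range(4)) for a in range(4)]


def norm_squared(g, V):
    """g_{ab} V^a V^b: negative for timelike, zero for null, positive for spacelike."""
    V_low = lower_index(g, V)
    return sum(V_low[a] * V[a] for a in range(4))


u = four_velocity(0.3, 0.4, 0.0)          # arbitrary sample velocity
photon = [1.0, 1.0, 0.0, 0.0]             # tangent to a light ray along x
print(norm_squared(ETA, u), norm_squared(ETA, photon))
```

The first number should be \(-1\) to rounding accuracy, while the light-ray tangent has vanishing norm.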

Under a coordinate transformation \(x^a \rightarrow \overline{x}^a\), contravariant vectors transform as

$$\begin{aligned} \overline{V}^a = \frac{\partial \overline{x}^a}{\partial x^b} V^b \end{aligned}$$

and covariant vectors as

$$\begin{aligned} \overline{V}_a = \frac{\partial x^b}{\partial \overline{x}^a} V_b . \end{aligned}$$

Tensors with a greater rank (i.e., a greater number of indices), transform similarly by acting linearly on each index using the above two rules.

When integrating, as we have to when we discuss conservation laws for fluids, we must make use of an appropriate measure that ensures the coordinate invariance of the integration. In the context of three-dimensional Euclidean space this measure is referred to as the Jacobian. For spacetime, we use the so-called volume form \(\epsilon _{abcd}\). It is completely antisymmetric, and for four-dimensional spacetime, it has only one independent component, which is

$$\begin{aligned} \epsilon _{0 1 2 3} = \sqrt{- g} \qquad \text{ and } \qquad \epsilon ^{0 1 2 3} = \frac{1}{\sqrt{- g}}, \end{aligned}$$

where g is the determinant of the metric (cf. Appendix 1 for details). The minus sign is required under the square root because of the metric signature. By contrast, for three-dimensional Euclidean space (i.e., when considering the fluid equations in the Newtonian limit) we have

$$\begin{aligned} \epsilon _{1 2 3} = \sqrt{g} \qquad \text{ and } \qquad \epsilon ^{1 2 3} = \frac{1}{\sqrt{g}}, \end{aligned}$$

but now g is the determinant of the three-dimensional space metric. A general identity that is extremely useful for writing the fluid vorticity in three-dimensional, Euclidean space—using lower-case Latin indices and setting \(s = 0\), \(n = 3\) and \(j = 1\) in Eq. (A.2) of Appendix 1—is

$$\begin{aligned} \epsilon ^{m i j} \epsilon _{m k l} = \delta ^i{}_k \delta ^j{}_l - \delta ^j{}_k \delta ^i{}_l. \end{aligned}$$
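
The identity is easy to confirm by brute force. The sketch below assumes Cartesian coordinates, so that \(\sqrt{g} = 1\) and upper and lower indices need not be distinguished; the function names are illustrative.

```python
from itertools import product


def levi_civita(i, j, k):
    """Permutation symbol for indices in {0, 1, 2} (Cartesian, sqrt(g) = 1)."""
    # the product is +2 / -2 for even / odd permutations and 0 otherwise
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    return 1 if i == j else 0


def identity_holds():
    """Check eps^{mij} eps_{mkl} = delta^i_k delta^j_l - delta^j_k delta^i_l."""
    for i, j, k, l in product(range(3), repeat=4):
        lhs = sum(levi_civita(m, i, j) * levi_civita(m, k, l) for m in range(3))
        rhs = delta(i, k) * delta(j, l) - delta(j, k) * delta(i, l)
        if lhs != rhs:
            return False
    return True


print(identity_holds())
```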

The general identities in Eqs. (A.1)–(A.3) of Appendix 1 will be frequently used in the following.

Parallel transport and the covariant derivative

In order to have a generally covariant prescription for fluids—in terms of spacetime tensors—we must have a notion of derivative \(\nabla _a\) that is itself covariant. For example, when \(\nabla _a\) acts on a vector \(V^a\) a rank-two tensor of mixed indices must result:

$$\begin{aligned} \overline{\nabla }_b \overline{V}^a = \frac{\partial x^c}{\partial \overline{x}^b} \frac{\partial \overline{x}^a}{\partial x^d} \nabla _c V^d . \end{aligned}$$

The ordinary partial derivative does not work because under a general coordinate transformation

$$\begin{aligned} \frac{\partial \overline{V}^a}{\partial \overline{x}^b} = \frac{\partial x^c}{\partial \overline{x}^b} \frac{\partial \overline{x}^a}{\partial x^d} \frac{\partial V^d}{\partial x^c} + \frac{\partial x^c}{\partial \overline{x}^b} \frac{\partial ^2 \overline{x}^a}{\partial x^c \partial x^d} V^d . \end{aligned}$$

The second term spoils the general covariance, since it vanishes only for the restricted set of rectilinear transformations

$$\begin{aligned} \overline{x}^a = a^a{}_b x^b + b^a , \end{aligned}$$

where \(a^a{}_b\) and \(b^a\) are constants. Note that this includes the Lorentz transformation of Special Relativity.

For both physical and mathematical reasons, one expects a covariant derivative to be defined in terms of a limit. This is, however, a bit problematic. In three-dimensional Euclidean space limits can be defined uniquely as vectors can be moved around without their length and direction changing, for instance, via the use of Cartesian coordinates (the \(\{{\varvec{i}},{\varvec{j}},{\varvec{k}}\}\) set of basis vectors) and the usual dot product. Given these limits, those corresponding to more general curvilinear coordinates can be established. The same is not true for curved spaces and/or spacetimes because they do not have an a priori notion of parallel transport.

Consider the classic example of a vector on the surface of a sphere (illustrated in Fig. 5). Take this vector and move it along some great circle from the equator to the North pole in such a way as to always keep the vector pointing along the circle. Pick a different great circle, and without allowing the vector to rotate, by forcing it to maintain the same angle with the locally straight portion of the great circle that it happens to be on, move it back to the equator. Finally, move the vector in a similar way along the equator until it gets back to its starting point. The vector’s spatial orientation will be different from its original direction, and the difference is directly related to the particular path that the vector followed.

On the other hand, we could consider the sphere to be embedded in a three-dimensional Euclidean space, and let the two-dimensional vector on the sphere result from projection of a three-dimensional vector. Then we move the projection so that its higher-dimensional counterpart always maintains the same orientation with respect to its original direction in the embedding space. When the projection returns to its starting place it will have exactly the same orientation as it started out with (see Fig. 5). It is now clear that a derivative operation that depends on comparing a vector at one point to that of a nearby point is not unique, because it depends on the choice of parallel transport.

Pauli (1981) notes that Levi-Civita (1917) is the first to have formulated the concept of parallel “displacement”, with Weyl (1952) generalizing it to manifolds that do not have a metric. The point of view expounded in the books of Weyl and Pauli is that parallel transport is best defined as a mapping of the “totality of all vectors” that “originate” at one point of a manifold with the totality at another point. (In modern texts, this discussion tends to be based on fiber bundles.) Pauli points out that we cannot simply require equality of vector components as the mapping.

Let us examine the parallel transport of the force-free, point particle velocity in Euclidean three-dimensional space as a means for motivating the form of the mapping. As the velocity is constant, we know that the curve traced out by the particle will be a straight line. In fact, we can turn this around and say that the velocity parallel transports itself because the path traced out is a geodesic (i.e., the straightest possible curve allowed by Euclidean space). In our analysis we will borrow liberally from the excellent discussion of Lovelock and Rund (1989). Their text is comprehensive yet readable for anyone not well-versed with differential geometry. Finally, we note that this analysis will be relevant later when we consider the Newtonian limit of the relativistic equations, in an arbitrary coordinate basis.

Fig. 5

A schematic illustration of two possible versions of parallel transport. In the first case (a) a vector is transported along great circles on the sphere locally maintaining the same angle with the path. If the contour is closed, the final orientation of the vector will differ from the original one. In case (b) the sphere is considered to be embedded in a three-dimensional Euclidean space, and the vector on the sphere results from projection. In this case, the vector returns to the original orientation for a closed contour

We are all well aware that the points on the curve traced out by the particle can be described, in Cartesian coordinates, by three functions \(x^i(t)\) where t is the universal Newtonian time. Likewise, we know that the tangent vector at each point of the curve is given by the velocity components \(v^i(t) = d x^i/d t\), and that the force-free condition is equivalent to

$$\begin{aligned} a^i(t) = \frac{dv^i}{d t} = 0 \qquad \Rightarrow \qquad v^i(t) = \mathrm {const}. \end{aligned}$$

Hence, the velocity components \(v^i(0)\) at the point \(x^i(0)\) are equal to those at any other point along the curve, say \(v^i(T)\) at \(x^i(T)\), and so we could simply take \(v^i(0) = v^i(T)\) as the mapping. But as Pauli warns, we only need to reconsider this example using spherical coordinates to see that the velocity components \(\{\dot{r},\dot{\theta },\dot{\phi }\}\) must change as they undergo parallel transport along a straight-line path (assuming the particle does not pass through the origin). The question is what should be used in place of component equality? The answer follows once we find a curvilinear coordinate version of \(dv^i/dt = 0\).

What we need is a new “time” derivative \(\overline{D}/d t\), that yields a generally covariant statement

$$\begin{aligned} \frac{\overline{D} \overline{v}^i}{d t} = 0, \end{aligned}$$

where the \(\overline{v}^i(t) = d \overline{x}^i/d t\) are the velocity components in a curvilinear system of coordinates. Consider now a coordinate transformation to the new coordinate system \(\overline{x}^i\), the inverse being \(x^i = x^i(\overline{x}^j)\). Given that

$$\begin{aligned} v^i = \frac{\partial x^i}{\partial \overline{x}^j} \overline{v}^j \end{aligned}$$

we can write

$$\begin{aligned} \frac{d v^i}{d t} = \left( \frac{\partial x^i}{\partial \overline{x}^j} \frac{\partial \overline{v}^j}{\partial \overline{x}^k} + \frac{\partial ^2 x^i}{\partial \overline{x}^k \partial \overline{x}^j} \overline{v}^j \right) \overline{v}^k, \end{aligned}$$

where

$$\begin{aligned} \frac{d \overline{v}^i}{d t} = \frac{\partial \overline{v}^i}{\partial \overline{x}^j} \overline{v}^j. \end{aligned}$$

Again, we have an “offending” term that vanishes only for rectilinear coordinate transformations. However, we are now in a position to show the importance of this term to the definition of the covariant derivative.

First note that the metric \(\overline{g}_{i j}\) for our curvilinear coordinate system is obtained from

$$\begin{aligned} \overline{g}_{i j} = \frac{\partial x^k}{\partial \overline{x}^i} \frac{\partial x^l}{\partial \overline{x}^j} \delta _{k l}, \end{aligned}$$

where

$$\begin{aligned} \delta _{i j} = \left\{ \begin{array}{ll} 1 &{} \qquad \mathrm {for }\, i = j, \\ 0 &{} \qquad \mathrm {for }\, i \ne j. \end{array} \right. \end{aligned}$$

Differentiating Eq. (3.21) with respect to \(\overline{x}\), and permuting indices, we can show that

$$\begin{aligned} \frac{\partial ^2 x^h}{\partial \overline{x}^i \partial \overline{x}^j} \frac{\partial x^l}{\partial \overline{x}^k} \delta _{h l} = \frac{1}{2} \left( \overline{g}_{i k,j} + \overline{g}_{j k,i} - \overline{g}_{i j,k} \right) \equiv \overline{g}_{k l} \overline{ \left\{ \scriptstyle {\begin{array}{c} l \\ i~j \end{array}} \right\} }, \end{aligned}$$

where we use commas to indicate partial derivatives:

$$\begin{aligned} \overline{g}_{i j, k} \equiv \frac{\partial \overline{g}_{i j}}{\partial \overline{x}^k}. \end{aligned}$$

Using the inverse transformation of \(\overline{g}_{i j}\) to \(\delta _{i j}\) implied by Eq. (3.21), and the fact that

$$\begin{aligned} \delta ^i{}_j = \frac{\partial \overline{x}^k}{\partial x^j} \frac{\partial x^i}{\partial \overline{x}^k}, \end{aligned}$$

we get

$$\begin{aligned} \frac{\partial ^2 x^i}{\partial \overline{x}^j \partial \overline{x}^k} = \overline{\left\{ \scriptstyle {\begin{array}{c} l \\ j~k \end{array}}\right\} } \frac{\partial x^i}{\partial \overline{x}^l}. \end{aligned}$$

Now we substitute Eq. (3.26) into Eq. (3.19) and find

$$\begin{aligned} \frac{d v^i}{d t} = \frac{\partial x^i}{\partial \overline{x}^j} \frac{\overline{{D}} \overline{v}^j}{d t}, \end{aligned}$$

where


$$\begin{aligned} \frac{\overline{{D}} \overline{v}^i}{{d} t} = \overline{v}^j \left( \frac{\partial \overline{v}^i}{\partial \overline{x}^j} + \overline{\left\{ \scriptstyle {\begin{array}{c} i \\ k~j \end{array}} \right\} } \overline{v}^k \right) . \end{aligned}$$

The operator \(\overline{{D}}/{d} t\) is easily seen to be covariant with respect to general transformations of curvilinear coordinates.
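
As a concrete check of these expressions (an illustrative example of our choosing, not part of the derivation itself), take plane polar coordinates \(\overline{x}^i = (r, \theta )\), with \(x = r \cos \theta \) and \(y = r \sin \theta \). The expression for the metric given above yields

$$\begin{aligned} \overline{g}_{r r} = 1 , \qquad \overline{g}_{\theta \theta } = r^2 , \qquad \overline{g}_{r \theta } = 0 , \end{aligned}$$

and the only non-vanishing connection symbols are

$$\begin{aligned} \overline{\left\{ \scriptstyle {\begin{array}{c} r \\ \theta ~\theta \end{array}} \right\} } = - r , \qquad \overline{\left\{ \scriptstyle {\begin{array}{c} \theta \\ r~\theta \end{array}} \right\} } = \overline{\left\{ \scriptstyle {\begin{array}{c} \theta \\ \theta ~r \end{array}} \right\} } = \frac{1}{r} . \end{aligned}$$

The force-free condition \(\overline{D} \overline{v}^i/d t = 0\) then becomes

$$\begin{aligned} \ddot{r} - r \dot{\theta }^2 = 0 , \qquad \ddot{\theta } + \frac{2}{r} \dot{r} \dot{\theta } = 0 , \end{aligned}$$

where dots denote time derivatives along the trajectory. These are the familiar centrifugal and Coriolis-type terms, and the second equation expresses the conservation of \(r^2 \dot{\theta }\), the angular momentum per unit mass. Neither term signals a force; both arise entirely from the use of curvilinear coordinates.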

We now identify the generally covariant derivative (dropping the overline) as

$$\begin{aligned} \nabla _j v^i = \frac{\partial v^i}{\partial x^j} + \left\{ \scriptstyle {\begin{array}{c} i \\ k~j \end{array}}\right\} v^k \equiv v^i{}_{; j}. \end{aligned}$$

Similarly, the covariant derivative of a covector is

$$\begin{aligned} \nabla _j v_i = \frac{\partial v_i}{\partial x^j} - \left\{ \scriptstyle {\begin{array}{c} k \\ i~j \end{array}}\right\} v_k \equiv v_{i ; j}. \end{aligned}$$

One extends the covariant derivative to higher rank tensors by adding to the partial derivative each term that results by acting linearly on each index with \(\left\{ \scriptstyle {\begin{array}{c} i \\ j~k \end{array}}\right\} \) using the two rules given above.

Relying on our understanding of the force-free point particle, we have built a notion of parallel transport that is consistent with our intuition based on equality of components in Cartesian coordinates. We can now expand this intuition to see how the vector components in a curvilinear coordinate system must change under an infinitesimal, parallel displacement from \(x^i(t)\) to \(x^i(t + \delta t)\). Setting Eq. (3.28) to zero, and noting that \(v^i \delta t = \delta x^i\), implies

$$\begin{aligned} \delta v^i \equiv \frac{\partial v^i}{\partial x^j} \delta x^j = - \left\{ \scriptstyle {\begin{array}{c} i \\ k~j \end{array}} \right\} v^k \delta x^j. \end{aligned}$$

In General Relativity we assume that under an infinitesimal parallel transport from a spacetime point \(x^a(\tau )\) on a given curve to a nearby point \(x^a(\tau + \delta \tau )\) on the same curve, the components of a vector \(V^a\) will change in an analogous way, namely

$$\begin{aligned} \delta V^a_\parallel \equiv \frac{\partial V^a}{\partial x^b} \delta x^b = - \varGamma ^a_{c b} V^c \delta x^b , \end{aligned}$$

where


$$\begin{aligned} \delta x^a \equiv \frac{d x^a}{{d} \tau } \delta \tau . \end{aligned}$$

Weyl (1952) refers to the symbol \(\varGamma ^a_{b c}\) as the “components of the affine relationship”, but we will use the modern terminology and call it the connection. In the language of Weyl and Pauli, this is the mapping that we were looking for.

For Euclidean space, we can verify that the metric satisfies

$$\begin{aligned} \nabla _i g_{j k} = 0 \end{aligned}$$

for a general, curvilinear coordinate system. The metric is thus said to be “compatible” with the covariant derivative. Metric compatibility is imposed as an assumption in General Relativity. This results in the so-called Christoffel symbol for the connection, defined as

$$\begin{aligned} \varGamma ^a_{b c} = \frac{1}{2} g^{a d} \left( g_{b d, c} + g_{c d, b} - g_{b c, d}\right) . \end{aligned}$$

The rules for the covariant derivative of a contravariant vector and a covector are the same as in Eqs. (3.29) and (3.30), except that all indices are spacetime ones.


The Lie derivative and spacetime symmetries

From the above discussion it should be evident that there are other ways to take derivatives in a curved spacetime. A particularly important tool for measuring changes in tensors from point to point in spacetime is the Lie derivative. It requires a vector field, but no connection, and is a more natural definition in the sense that it does not even require a metric. The Lie derivative yields a tensor of the same type and rank as the tensor on which the derivative operated (unlike the covariant derivative, which increases the rank by one). It is as important for Newtonian, non-relativistic fluids as for relativistic ones (a fact which needs to be continually emphasized as it has not yet permeated the fluid literature for chemists, engineers, and physicists). For instance, the classic papers on the gravitational-wave driven Chandrasekhar–Friedman–Schutz instability (Friedman and Schutz 1978a, b) in rotating stars are great illustrations of the use of the Lie derivative in Newtonian physics. We recommend the book by Schutz (1980) for a complete discussion and derivation of the Lie derivative and its role in Newtonian fluid dynamics (see also the series of papers by Carter and Chamel 2004, 2005a, b). Here, we will adapt the coordinate-based discussion of Schouten (1989), as it may be more readily understood by readers not well-versed in differential geometry.

In a first course on classical mechanics, when students encounter rotations, they are introduced to the idea of active and passive transformations. An active transformation would be to fix the origin and axis-orientations of a given coordinate system with respect to some external observer, and then move an object from one point to another point of the same coordinate system. A passive transformation would be to place an object so that it remains fixed with respect to some external observer, and then induce a rotation of the object with respect to a given coordinate system by rotating the coordinate system itself with respect to the external observer. We will derive the Lie derivative of a vector by first performing an active transformation and then following it with a passive transformation to determine how the final vector differs from its original form. In the language of differential geometry, we will first “push-forward” the vector, and then subject it to a “pull-back”.


In the active (push-forward) sense we imagine that there are two spacetime points connected by a smooth curve \(x^a(\lambda )\). Let the first point be at \(\lambda = 0\), and the second, nearby point at \(\lambda = \epsilon \), i.e., \(x^a(\epsilon )\); that is,

$$\begin{aligned} x^a_\epsilon \equiv x^a(\epsilon ) \approx x^a_0 + \epsilon \, \xi ^a , \end{aligned}$$

where \(x^a_0 \equiv x^a(0)\) and

$$\begin{aligned} \xi ^a = \left. \frac{dx^a}{{d} \lambda } \right| _{\lambda = 0} \end{aligned}$$

is the tangent to the curve at \(\lambda = 0\). In the passive (pull-back) sense we imagine that the coordinate system itself is changed to \(\overline{x}{}^a =\overline{x}{}^a(x^b)\), but in the very special form

$$\begin{aligned} \overline{x}{}^a = x^a - \epsilon \, \xi ^a . \end{aligned}$$

In this second step the Lie derivative differs from the covariant derivative. If we insert Eq. (3.36) into Eq. (3.38) we find the result \(\overline{x}{}^a_\epsilon = x^a_0\). This is called “Lie-dragging” of the coordinate frame, meaning that the coordinates at \(\lambda = 0\) are carried along so that at \(\lambda = \epsilon \) (and in the new coordinate system) the coordinate labels take the same numerical values.

Fig. 6

A schematic illustration of the Lie derivative. The coordinate system is dragged along with the flow, and one can imagine an observer “taking derivatives” as he/she moves with the flow (see the discussion in the text)

As an interesting aside it is worth noting that Arnold (1989)—only a little whimsically—refers to this construction as the “fisherman’s derivative”. He imagines a fisherman sitting in a boat on a river, “taking derivatives” as the boat moves along with the current. Let us now see how Lie-dragging reels in vectors.

For some given vector field that takes values \(V^a(\lambda )\), say, along the curve, we write

$$\begin{aligned} V^a_0 = V^a(0) \end{aligned}$$

for the value of \(V^a\) at \(\lambda = 0\) and

$$\begin{aligned} V^a_\epsilon = V^a(\epsilon ) \end{aligned}$$

for the value at \(\lambda = \epsilon \). Because the two points \(x^a_0\) and \(x^a_\epsilon \) are infinitesimally close (\(\epsilon \ll 1\)) we have

$$\begin{aligned} V^a_\epsilon \approx V^a_0 + \epsilon \, \xi ^b \left. \frac{\partial V^a}{\partial x^b} \right| _{\lambda = 0} \end{aligned}$$

for the value of \(V^a\) at the nearby point and in the same coordinate system. However, in the new coordinate system (at the nearby point) we find

$$\begin{aligned} \overline{V}{}^a_\epsilon = \left. \left( \frac{\partial \overline{x}{}^a}{\partial x^b} V^b\right) \right| _{\lambda = \epsilon } \approx V^a_\epsilon - \epsilon \, V^b_0 \left. \frac{\partial \xi ^a}{\partial x^b} \right| _{\lambda = 0}. \end{aligned}$$

The Lie derivative now is defined to be

$$\begin{aligned} \mathcal{L}_\xi V^a= & {} \lim _{\epsilon \rightarrow 0} \frac{\overline{V}{}^a_\epsilon - V^a}{\epsilon } \nonumber \\= & {} \xi ^b \frac{\partial V^a}{\partial x^b} - V^b \frac{\partial \xi ^a}{\partial x^b} \nonumber \\= & {} \xi ^b \nabla _b V^a - V^b \nabla _b \xi ^a , \end{aligned}$$

where we have dropped the “0” subscript and the last equality follows easily by noting \(\varGamma ^c_{a b} = \varGamma ^c_{b a}\).
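
The coordinate expression for the Lie derivative is easy to probe numerically. The following is a minimal sketch, assuming a flat two-dimensional plane in Cartesian coordinates and using central finite differences; the function names (`lie_derivative`, `rotation`, `radial`, `uniform`) and the sample fields are hypothetical choices made purely for illustration.

```python
def lie_derivative(xi, V, x, h=1e-6):
    """Return L_xi V^a = xi^b d_b V^a - V^b d_b xi^a at the point x.

    xi and V are functions mapping a coordinate list to a list of components.
    Partial derivatives are approximated by central differences of step h.
    """
    n = len(x)

    def partial(field, b):
        # components of d(field^a)/dx^b, returned as a list indexed by a
        xp, xm = list(x), list(x)
        xp[b] += h
        xm[b] -= h
        fp, fm = field(xp), field(xm)
        return [(fp[a] - fm[a]) / (2 * h) for a in range(n)]

    xi0, V0 = xi(x), V(x)
    dV = [partial(V, b) for b in range(n)]     # dV[b][a] = d_b V^a
    dxi = [partial(xi, b) for b in range(n)]   # dxi[b][a] = d_b xi^a
    return [sum(xi0[b] * dV[b][a] - V0[b] * dxi[b][a] for b in range(n))
            for a in range(n)]


def rotation(x):
    """Generator of rotations about the origin, xi^a = (-y, x)."""
    return [-x[1], x[0]]


def radial(x):
    """Rotationally symmetric field V^a = (x, y)."""
    return [x[0], x[1]]


def uniform(x):
    """Constant field V^a = (1, 0), which singles out a direction."""
    return [1.0, 0.0]


point = [0.7, -0.4]
print("radial field: ", lie_derivative(rotation, radial, point))
print("uniform field:", lie_derivative(rotation, uniform, point))
```

The radial field is unchanged when dragged along the rotation, so its Lie derivative vanishes to within rounding error, while the uniform field picks up the non-zero value \(-V^b \partial _b \xi ^a = (0, -1)\). No connection or metric enters the calculation, in line with the remarks at the beginning of this section.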

The Lie derivative of a covector \(A_a\) is easily obtained by acting on the scalar \(A_a V^a\) for an arbitrary vector \(V^a\):

$$\begin{aligned} \mathcal{L}_\xi (A_a V^a)= & {} V^a \mathcal{L}_\xi A_a + A_a \mathcal{L}_\xi V^a \nonumber \\= & {} V^a \mathcal{L}_\xi A_a + A_a \left( \xi ^b \nabla _b V^a - V^b \nabla _b \xi ^a \right) . \end{aligned}$$

But, because \(A_a V^a\) is a scalar,

$$\begin{aligned} \mathcal{L}_\xi (A_a V^a)= & {} \xi ^b \nabla _b (A_a V^a) \nonumber \\= & {} \xi ^b \left( V^a \nabla _b A_a + A_a \nabla _b V^a \right) , \end{aligned}$$

and thus

$$\begin{aligned} V^a \left( \mathcal{L}_\xi A_a - \xi ^b \nabla _b A_a - A_b \nabla _a \xi ^b \right) = 0. \end{aligned}$$

Since \(V^a\) is arbitrary we have

$$\begin{aligned} \mathcal{L}_\xi A_a = \xi ^b \nabla _b A_a + A_b \nabla _a \xi ^b . \end{aligned}$$

Equation (3.32) introduced the effect of parallel transport on vector components. By contrast, the Lie-dragging of a vector causes its components to change as

$$\begin{aligned} \delta V^a_\mathcal{L} = \mathcal{L}_\xi V^a \, \epsilon . \end{aligned}$$

We see that if \(\mathcal{L}_\xi V^a = 0\), then the components of the vector do not change as the vector is Lie-dragged. Suppose now that \(V^a\) represents a vector field and that there exists a corresponding congruence of curves with tangent given by \(\xi ^a\). If the components of the vector field do not change under Lie-dragging, we can show that this implies a symmetry, meaning that a coordinate system can be found such that the vector components do not depend on one of the coordinates. This is a potentially very powerful statement.

Let \(\xi ^a\) represent the tangent to the curves drawn out by, say, the \(a = \phi \) coordinate. Then we can write \(x^\phi (\lambda ) = \lambda \), with the remaining coordinates held fixed along each curve, which means

$$\begin{aligned} \xi ^a = \delta ^a{}_\phi . \end{aligned}$$

If the Lie derivative of \(V^a\) with respect to \(\xi ^b\) vanishes we find

$$\begin{aligned} \xi ^b \frac{\partial V^a}{\partial x^b} = V^b \frac{\partial \xi ^a}{\partial x^b} = 0 . \end{aligned}$$

Using this in Eq. (3.41) implies \(V^a_\epsilon = V^a_0\), that is to say, the vector field \(V^a(x^b)\) does not depend on the \(\phi \) coordinate. Generally speaking, every \(\xi ^a\) for which the Lie derivative of a vector (or higher rank tensor) vanishes represents a symmetry.

Let us take the spacetime metric \(g_{a b}\) as an example. A spacetime symmetry can be represented by a generating vector field \(\xi ^a\) such that

$$\begin{aligned} \mathcal{L}_\xi g_{a b} = \nabla _a \xi _b + \nabla _b \xi _a = 0 . \end{aligned}$$

This is known as Killing’s equation, and solutions to this equation are naturally referred to as Killing vectors. It is now fairly easy to demonstrate the claim that the existence of a Killing vector relates to an underlying symmetry of the spacetime metric. First we expand (3.51) to get

$$\begin{aligned} g_{bc} \partial _a \xi ^c + g_{ac} \partial _b \xi ^c + \xi ^d \partial _d g_{ab} = 0 . \end{aligned}$$

Then we assume that the Killing vector is associated with one of the coordinates, e.g., by letting \(\xi ^a = \delta _0^a\). The first two terms in (3.52) then vanish by definition, and we are left with

$$\begin{aligned} \xi ^d \partial _d g_{ab} = \partial _0 g_{ab} = 0 , \end{aligned}$$

demonstrating that the metric does not depend on the \(x^0\) coordinate.
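
A simple example (chosen here for illustration) is the Euclidean plane. In Cartesian coordinates, with \(g_{a b} = \delta _{a b}\), the generator of rotations about the origin is

$$\begin{aligned} \xi ^a = (-y, x) , \qquad \partial _x \xi ^y = 1 = - \partial _y \xi ^x , \qquad \partial _x \xi ^x = \partial _y \xi ^y = 0 . \end{aligned}$$

The metric is constant, so the last term in the expanded form of Killing’s equation vanishes, and the first two terms combine to

$$\begin{aligned} \delta _{b c} \partial _a \xi ^c + \delta _{a c} \partial _b \xi ^c = \partial _a \xi ^b + \partial _b \xi ^a = 0 , \end{aligned}$$

since the only non-zero derivatives cancel in pairs. In polar coordinates \((r, \theta )\) the same vector field takes the form \(\xi ^a = \delta ^a{}_\theta \), and the metric components \(g_{r r} = 1\), \(g_{\theta \theta } = r^2\) are manifestly independent of \(\theta \), as the general argument requires.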

An important application of this idea is provided by stationary, axisymmetric, and asymptotically flat spacetimes—highly relevant in the present context as they capture the physics of rotating, equilibrium configurations. The associated geometries are fundamental for the relativistic astrophysics of spinning black holes and neutron stars. Stationary, axisymmetric, and asymptotically flat spacetimes are such that (Bonazzola et al. 1993)

  1. there exists a Killing vector \(t^a\) that is timelike at spatial infinity, and the independence of the metric on the associated time coordinate leads to the solution being stationary;

  2. there exists a Killing vector \(\phi ^a\) that vanishes on a timelike 2-surface (the axis of symmetry), is spacelike everywhere else, and whose orbits are closed curves; and

  3. asymptotic flatness means the scalar products \(t_a t^a\), \(\phi _a \phi ^a\), and \(t_a \phi ^a\) tend to, respectively, \(- 1\), \(+ \infty \), and 0 at spatial infinity.

Spacetime curvature

The main message of Sects. 3.2 and 3.3 is that one must have an a priori idea of how vectors and higher rank tensors are moved from point to point in spacetime. An immediate manifestation of the complexity associated with carrying tensors about in spacetime is that covariant derivatives do not commute. For a vector we find

$$\begin{aligned} \nabla _b \nabla _c V^a - \nabla _c \nabla _b V^a = R^a{}_{d b c} V^d , \end{aligned}$$

where \(R^a{}_{d b c}\) is the Riemann tensor. It is obtained from

$$\begin{aligned} R^a{}_{d b c} = \varGamma ^a_{d c, b} - \varGamma ^a_{d b, c} + \varGamma ^a_{e b} \varGamma ^e_{d c} - \varGamma ^a_{e c} \varGamma ^e_{d b} . \end{aligned}$$

Closely associated are the Ricci tensor \(R_{ab} = R_{ba}\) and scalar R that are defined by the contractions

$$\begin{aligned} R_{a b} = R^c{}_{a c b} , \qquad R = g^{a b} R_{a b} . \end{aligned}$$

We will also need the Einstein tensor, which is given by

$$\begin{aligned} G_{a b} = R_{a b} - \frac{1}{2} R g_{a b} . \end{aligned}$$

It is such that \(\nabla _b G^b{}_a\) vanishes identically. This is known as the (contracted) Bianchi identity.

A more intuitive understanding of the Riemann tensor is obtained by seeing how its presence leads to a path-dependence in the changes that a vector experiences as it moves from point to point in spacetime. Such a situation is known as a “non-integrability” condition, because the result depends on the whole path and not just the initial and final points. That is, it is not like a total derivative which can be integrated and depends on only the limits of integration. Geometrically we say that the spacetime is curved, which is why the Riemann tensor is also known as the curvature tensor.

To illustrate the meaning of the curvature tensor, let us suppose that we are given a surface that is parameterized by the two parameters \(\lambda \) and \(\eta \). Points that live on this surface will have coordinate labels \(x^a(\lambda ,\eta )\). We want to consider an infinitesimally small “parallelogram” whose four corners (moving counterclockwise with the first corner at the lower left) are given by \(x^a(\lambda ,\eta )\), \(x^a(\lambda ,\eta + \delta \eta )\), \(x^a(\lambda + \delta \lambda ,\eta + \delta \eta )\), and \(x^a(\lambda + \delta \lambda ,\eta )\). Generally speaking, any “movement” towards the right of the parallelogram is effected by varying \(\eta \), and any movement towards the top by varying \(\lambda \). The plan is to take a vector \(V^a(\lambda ,\eta )\) at the lower-left corner \(x^a(\lambda ,\eta )\), parallel transport it along a \(\lambda = \mathrm {const}\) curve to the lower-right corner at \(x^a(\lambda ,\eta + \delta \eta )\) where it will have the components \(V^a(\lambda ,\eta + \delta \eta )\), and end up by parallel transporting \(V^a\) at \(x^a(\lambda ,\eta + \delta \eta )\) along an \(\eta = \mathrm {const}\) curve to the upper-right corner at \(x^a(\lambda + \delta \lambda ,\eta + \delta \eta )\). We will call this path I and denote the final component values of the vector as \(V^a_\mathrm {I}\). We then repeat the process except that the path will go from the lower-left to the upper-left and then on to the upper-right corner. We will call this path II and denote the final component values as \(V^a_\mathrm {II}\).

Recalling Eq. (3.32) as the definition of parallel transport, we first of all have

$$\begin{aligned} V^a(\lambda ,\eta + \delta \eta ) \approx V^a(\lambda ,\eta ) + \delta _\eta V^a_\parallel (\lambda ,\eta ) = V^a(\lambda ,\eta ) - \varGamma ^a_{b c} V^b \delta _\eta x^c \end{aligned}$$

and


$$\begin{aligned} V^a(\lambda + \delta \lambda ,\eta ) \approx V^a(\lambda ,\eta ) + \delta _\lambda V^a_\parallel (\lambda ,\eta ) = V^a(\lambda ,\eta ) - \varGamma ^a_{b c} V^b \delta _\lambda x^c , \end{aligned}$$

where


$$\begin{aligned} \delta _\eta x^a \approx x^a(\lambda ,\eta + \delta \eta ) - x^a(\lambda ,\eta ) , \qquad \delta _\lambda x^a \approx x^a(\lambda + \delta \lambda ,\eta ) - x^a(\lambda ,\eta ) . \end{aligned}$$

Next, we need

$$\begin{aligned} V^a_\mathrm {I}\approx & {} V^a(\lambda ,\eta + \delta \eta ) + \delta _\lambda V^a_\parallel (\lambda ,\eta + \delta \eta ), \end{aligned}$$
$$\begin{aligned} V^a_\mathrm {II}\approx & {} V^a(\lambda + \delta \lambda ,\eta ) + \delta _\eta V^a_\parallel (\lambda + \delta \lambda ,\eta ). \end{aligned}$$

Working things out, we find that the difference between the two paths is

$$\begin{aligned} \varDelta V^a \equiv V^a_\mathrm {I} - V^a_\mathrm {II} = R^a{}_{d b c} V^d \delta _\lambda x^b \delta _\eta x^c , \end{aligned}$$

which follows because \(\delta _\lambda \delta _\eta x^a = \delta _\eta \delta _\lambda x^a\), i.e., we have closed the parallelogram.
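
To get a feel for the curvature tensor, it can help to evaluate it for a familiar curved surface. The sketch below assumes the unit 2-sphere with coordinates \((\theta , \phi )\) and metric \(\mathrm {diag}(1, \sin ^2 \theta )\), which is our choice of example. It builds the Christoffel symbols and the Riemann tensor directly from the formulas quoted in this section, using central finite differences in place of analytic derivatives. All function names are hypothetical.

```python
import math


def metric(x):
    """Unit 2-sphere metric g_ab in coordinates x = (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]


def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]


def shifted(x, index, h):
    """Copy of x with coordinate number `index` displaced by h."""
    y = list(x)
    y[index] += h
    return y


def christoffel(x, h=1e-5):
    """gam[a][b][c] = Gamma^a_{bc} = (1/2) g^{ad} (g_{bd,c} + g_{cd,b} - g_{bc,d})."""
    n = len(x)
    ginv = inverse_2x2(metric(x))
    dg = []  # dg[c][a][b] = d g_ab / d x^c
    for c in range(n):
        gp, gm = metric(shifted(x, c, h)), metric(shifted(x, c, -h))
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)]
                   for a in range(n)])
    return [[[0.5 * sum(ginv[a][d] * (dg[c][b][d] + dg[b][c][d] - dg[d][b][c])
                        for d in range(n))
              for c in range(n)] for b in range(n)] for a in range(n)]


def riemann(x, h=1e-4):
    """riem[a][d][b][c] = R^a_{dbc}, from derivatives and products of Gamma."""
    n = len(x)
    gam = christoffel(x)
    dgam = []  # dgam[b][a][d][c] = d Gamma^a_{dc} / d x^b
    for b in range(n):
        gp = christoffel(shifted(x, b, h))
        gm = christoffel(shifted(x, b, -h))
        dgam.append([[[(gp[a][d][c] - gm[a][d][c]) / (2 * h) for c in range(n)]
                      for d in range(n)] for a in range(n)])
    return [[[[dgam[b][a][d][c] - dgam[c][a][d][b]
               + sum(gam[a][e][b] * gam[e][d][c] - gam[a][e][c] * gam[e][d][b]
                     for e in range(n))
               for c in range(n)] for b in range(n)] for d in range(n)]
            for a in range(n)]


def ricci_scalar(x):
    """R = g^{ab} R_ab with R_ab = R^c_{acb}."""
    n = len(x)
    riem = riemann(x)
    ginv = inverse_2x2(metric(x))
    ricci = [[sum(riem[c][a][c][b] for c in range(n)) for b in range(n)]
             for a in range(n)]
    return sum(ginv[a][b] * ricci[a][b] for a in range(n) for b in range(n))


point = [1.0, 0.3]
print("R^theta_{phi theta phi} =", riemann(point)[0][1][0][1])
print("sin^2(theta)            =", math.sin(point[0]) ** 2)
print("Ricci scalar            =", ricci_scalar(point))
```

For the unit sphere the single independent component is \(R^\theta {}_{\phi \theta \phi } = \sin ^2 \theta \) and the Ricci scalar equals 2 everywhere, which the numerical values reproduce to within the finite-difference error. Replacing `metric` by the flat polar metric \(\mathrm {diag}(1, r^2)\) gives a vanishing Riemann tensor even though the Christoffel symbols do not vanish.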

The Einstein field equations

We now have the tools we need to outline the argument that leads to the field equations of General Relativity. This sketch will be complemented by a variational derivation in Sect. 4.4.

Consider two freely falling particles moving along neighbouring geodesics with a vector \(\xi ^a\) measuring the separation. Assuming that this vector is purely spatial according to one of the bodies, whose trajectory we also use to measure time (such that the corresponding four-velocity only has a time-component), we have

$$\begin{aligned} u^a \xi _a = 0 . \end{aligned}$$

The second derivative of the separation vector will be affected by the spacetime curvature. With this set-up it follows that

$$\begin{aligned} u^a \nabla _a \xi ^b - \xi ^a \nabla _a u^b = 0 \end{aligned}$$

and we find that

$$\begin{aligned} u^c \nabla _c (u^b\nabla _b \xi ^a) = u^c \xi ^b ( \nabla _c \nabla _b - \nabla _b \nabla _c) u^a = - R^a_{\ d b c} u^d \xi ^b u^c , \end{aligned}$$

where we have used the fact that the Riemann tensor encodes the failure of second covariant derivatives to commute. This is the equation of geodesic deviation.

At this point it is useful to introduce a total time derivative, such that

$$\begin{aligned} {D \over D\tau } = u^a \nabla _a \end{aligned}$$

which means that (3.66) becomes

$$\begin{aligned} {D^2 \xi ^a \over D\tau ^2} = - R^a_{\ d b c} u^d \xi ^b u^c . \end{aligned}$$

This provides us with an expression for the relative acceleration caused by the spacetime curvature. As gravity is a tidal interaction, we can meaningfully compare our relation to the corresponding relation in Newtonian gravity. This leads to the identification

$$\begin{aligned} R^j_{\ 0k0} = {\mathcal {E}}^j_{\ k} = \delta ^{jl} \left( { \partial ^2 \varPhi \over \partial x^l \partial x^k} \right) , \end{aligned}$$

where \(\mathcal {E}^j_{\ k}\) is the tidal tensor and \(\varPhi \) is the gravitational potential. This provides a constraint that the curved spacetime theory must satisfy (in the limit of weak gravity and low velocities).
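
As an illustration of the Newtonian side of this identification (a standard example, added here as a sketch), consider the potential outside a point mass M, \(\varPhi = - G M/r\). With \(n^j = x^j/r\) the unit radial vector, two derivatives give

$$\begin{aligned} {\mathcal {E}}^j_{\ k} = \frac{G M}{r^3} \left( \delta ^j{}_k - 3 n^j n_k \right) , \qquad {\mathcal {E}}^j_{\ j} = \nabla ^2 \varPhi = 0 . \end{aligned}$$

The Newtonian relative acceleration of two nearby test particles, \(\ddot{\xi }^j = - {\mathcal {E}}^j_{\ k} \xi ^k\), is therefore

$$\begin{aligned} \ddot{\xi }_\parallel = + \frac{2 G M}{r^3} \xi _\parallel , \qquad \ddot{\xi }_\perp = - \frac{G M}{r^3} \xi _\perp , \end{aligned}$$

for separations along and perpendicular to the radial direction, respectively. This is the familiar tidal pattern of radial stretching and transverse squeezing.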

After some deliberation, including a careful counting of the dynamical degrees of freedom (noting the freedom to introduce coordinates), one arrives at the field equations for General Relativity:

$$\begin{aligned} G_{a b} = {8 \pi G \over c^4} T_{a b} , \end{aligned}$$

where G is Newton’s constant and c is the speed of light.

At this point it is evident that any discussion of relativistic physics (involving matter) must include the energy-momentum-stress tensor, \(T_{a b}\). This is where the messy physics of reality enters the problem. Misner et al. (1973) refer to \(T_{a b}\) as “...a machine that contains a knowledge of the energy density, momentum density, and stress as measured by any and all observers at that event.” Encoding this is a severe challenge. However, we need to understand how this works, both phenomenologically (allowing us to move swiftly to the challenge of solving the equations) and from a detailed microphysics point of view (as required in order for our models to be realistic). We will develop this understanding step by step, starting with the simple perfect fluid model and proceeding towards more complex settings including distinct components exhibiting relative flows and dissipation. However, before we take the next step in this direction we need to introduce the main technical machinery that forms the basis for much of the discussion.

Variational analysis

The key geometric difference between generally covariant Newtonian fluids and their general relativistic counterparts is that the former have an a priori notion of time (Carter and Chamel 2004, 2005a, b). Newtonian fluids also have an a priori notion of space (cf. the discussion in Carter and Chamel 2004). Such a structure has clear advantages for evolution problems, where one needs to be unambiguous about the rate-of-change of a given system. However, once a problem requires, say, electromagnetism, then the a priori Newtonian time is at odds with the spacetime covariance of the electromagnetic fields (as the Lorentz invariance of Maxwell’s equations dictates that the problem is considered in—at least—Special Relativity). Fortunately, for spacetime covariant theories there is the so-called “3 + 1” formalism (see, for instance, Smarr and York 1978 and the discussion in Sect. 11) which allows one to define “rates-of-change” in an unambiguous manner, by introducing a family of spacelike hypersurfaces (the “3”) given as the level surfaces of a spacetime scalar (the “1”) associated with a timelike progression.

Something that Newtonian and relativistic fluids have in common is that there are preferred frames for measuring changes, namely those that are attached to the fluid elements. In the parlance of hydrodynamics, one refers to Lagrangian and Eulerian frames, or observers. In Newtonian theory, an Eulerian observer is one who sits at a fixed point in space, and watches fluid elements pass by, all the while taking measurements of their densities, velocities, etc. at the given location. In contrast, a Lagrangian observer rides along with a particular fluid element and records changes of that element as it moves through space and time. A relativistic Lagrangian observer is the same, but the relativistic Eulerian observer is more complicated to define (as we have to explain what we mean by a “fixed point” in space). One way to do this, see Smarr and York (1978), is to define such an observer as one who moves along a worldline that remains everywhere orthogonal to the family of spacelike hypersurfaces.

The existence of a preferred frame for a fluid system can be a great advantage. In Sect. 5.2 we will use an “off-the-shelf” approach that exploits a preferred frame to derive the standard perfect fluid equations. Later, we will use Eulerian and Lagrangian variations to build an action principle for both single and multiple fluid systems. In this problem the Lagrangian displacements play a central role, as they allow us to introduce the constraints that are required in order to arrive at the desired results. Moreover, these types of variations turn out to be useful for many applications, e.g., they can be used as the foundation for a linearized perturbation analysis of neutron stars (Kokkotas and Schmidt 1999). As we will see, the use of Lagrangian variations is essential for establishing instabilities in rotating fluids (Friedman and Schutz 1978a, b). However, it is worth noting already at this relatively early stage that systems with several distinct flows are more complex as they can have as many notions of Lagrangian observers as there are fluids in the system (Fig. 6).

A simple starting point: the point particle

The simplest physics problem, i.e. the motion of a point particle, serves as a guide to deep principles used in much harder problems. We have used it already to motivate parallel transport as the foundation for the covariant derivative. Let us call upon the point particle again to set the context for the action-based derivation of the fluid equations. We will simplify the discussion by considering only motion in one dimension—assuring the reader that we have good reasons for this, and asking for patience while we remind him/her of what may be very basic facts.

Early on in life (relatively!) we learn that an action appropriate for the point particle is

$$\begin{aligned} I = \int ^{t_f}_{t_i} T dt = \int ^{t_f}_{t_i} \left( \frac{1}{2} m \dot{x}^2\right) dt, \end{aligned}$$

where m is the mass and T the kinetic energy. A first-order variation of the action with respect to x(t) yields

$$\begin{aligned} \delta I = - \int ^{t_f}_{t_i} \left( m \ddot{x}\right) \delta x dt+ \left. \left( m \dot{x} \delta x\right) \right| ^{t_f}_{t_i} , \end{aligned}$$

see Fig. 7. If this is all the physics to be incorporated, i.e. if there are no forces acting on the particle, then we impose Hamilton’s principle of stationary action, which states that the trajectories x(t) that make the action stationary, i.e. \(\delta I = 0\), yield the true motion. We then see that functions x(t) that satisfy the boundary conditions

$$\begin{aligned} \delta x(t_i) = 0 = \delta x(t_f) , \end{aligned}$$

and the equation of motion

$$\begin{aligned} m \ddot{x} = 0 , \end{aligned}$$

will indeed make \(\delta I = 0\). The same logic applies in the substantially more difficult variational problems that will be considered later.
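
The statement that the true motion makes the action stationary can be checked with a few lines of code. The following is a minimal sketch, assuming a discretized version of the free-particle action and a one-parameter family of trial paths with fixed endpoints. The names `action` and `trial_path`, and the particular variation \(\delta x \propto s (1 - s)\), are illustrative choices only.

```python
def action(path, dt, m=1.0):
    """Discretized free-particle action: sum over segments of (1/2) m (dx/dt)^2 dt."""
    return sum(0.5 * m * ((path[k + 1] - path[k]) / dt) ** 2 * dt
               for k in range(len(path) - 1))


def trial_path(eps, n=200, t_i=0.0, t_f=1.0, x_i=0.0, x_f=1.0):
    """Straight line from x_i to x_f plus eps times a bump that vanishes at both ends."""
    dt = (t_f - t_i) / n
    path = []
    for k in range(n + 1):
        s = k / n                           # fraction of the time interval elapsed
        straight = x_i + (x_f - x_i) * s    # solution of m x'' = 0
        path.append(straight + eps * s * (1.0 - s))
    return path, dt


def action_of(eps):
    path, dt = trial_path(eps)
    return action(path, dt)


eps = 1e-3
first_variation = (action_of(eps) - action_of(-eps)) / (2 * eps)
print("I(0)              =", action_of(0.0))
print("I(0.1) - I(0)     =", action_of(0.1) - action_of(0.0))
print("dI/d(eps) at zero =", first_variation)
```

The first-order change in the action vanishes (to rounding error) for the straight-line path, while any finite deformation with the same endpoints increases the action at second order. For the free particle the stationary point is therefore a minimum.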

Fig. 7

A simple illustration of the variation that leads to the point particle equations of motion. The solid line in this parameter space represents a curve which is understood to be a solution to the equations of motion, while the dashed line is some arbitrarily specified curve. At a given value of time, the variation \(\delta x\) represents the vertical displacement between the curves; obviously, at the endpoints \(t = t_1\) and \(t = t_2\), the two curves meet and the displacement vanishes. Keeping the endpoints fixed, the equations of motion are obtained from the extrema of the action, as demonstrated in the main text. The same idea applies in the more complicated cases of field theories that we consider later; the fields have actions, and the field equations of motion are obtained by locating the extrema. The field values at the extrema are often referred to as being “on shell” (or “on the mass shell”) for reasons we do not really have to elaborate on here


In general we need to account for forces acting on the particle. First on the list are the so-called conservative forces, describable by a potential V(x), which are placed into the action according to:

$$\begin{aligned} I = \int ^{t_f}_{t_i} L(x,\dot{x}) dt = \int ^{t_f}_{t_i} \left[ \frac{1}{2} m \dot{x}^2 - V(x)\right] dt , \end{aligned}$$

where \(L = T - V\) is known as the Lagrangian. The variation now leads to

$$\begin{aligned} \delta I = - \int ^{t_f}_{t_i} \left( m \ddot{x} + \frac{\partial V}{\partial x}\right) \delta x dt + \left. \left( m \dot{x} \delta x\right) \right| ^{t_f}_{t_i} . \end{aligned}$$

Assuming no externally applied forces, the principle of stationary action yields the equation of motion

$$\begin{aligned} m \ddot{x} + \frac{\partial V}{\partial x} = 0 . \end{aligned}$$

An alternative way to write this is to introduce the momentum p (not to be confused with the fluid pressure introduced earlier) defined as

$$\begin{aligned} p = \frac{\partial L}{\partial \dot{x}} = m \dot{x} , \end{aligned}$$

in which case

$$\begin{aligned} \dot{p} + \frac{\partial V}{\partial x} = 0 . \end{aligned}$$

In the most honest applications, one has the obligation to incorporate dissipative, i.e., non-conservative, forces. Unfortunately, dissipative forces \(F_d\) cannot be put into action principles (at least not directly; see Sect. 16 for a discussion of recent progress towards dissipative variational models). Fortunately, Newton’s second law provides good guidance, since it states

$$\begin{aligned} m \ddot{x} + \frac{\partial V}{\partial x} = F_d , \end{aligned}$$

when both conservative and dissipative forces act. A crucial observation about Eq. (4.10) is that the “kinetic” (\(m \ddot{x} = \dot{p}\)) and conservative (\(\partial V/\partial x\)) forces, which enter the left-hand side, still follow from the action, i.e.,

$$\begin{aligned} \frac{\delta I}{\delta x} = - \left( m \ddot{x} + \frac{\partial V}{\partial x}\right) , \end{aligned}$$

where we have introduced the “variational derivative” \({\delta I}/{\delta x}\). When there are no dissipative forces acting, the action principle gives us the appropriate equation of motion. When there are dissipative forces, the action defines the kinetic and conservative force terms that are to be balanced by the dissipative contribution. It also defines the momentum. These are the key lessons from this toy-problem.

We should emphasize that this way of using the action to define the kinetic and conservative pieces of the equation of motion, as well as the momentum, can also be used in situations when a system experiences an externally applied force \(F_\mathrm {ext}\). The force can be conservative or dissipative (see, e.g., Galley 2013), and will enter the equation of motion in the same way as \(F_d\) did above. That is

$$\begin{aligned} - \frac{\delta I}{\delta x} = F_d + F_\mathrm {ext} . \end{aligned}$$

Like a dissipative force, the main effect of the external force can be to siphon kinetic energy from the system. Of course, whether a force is considered to be external or not depends on the a priori definition of the system.
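
The division of labour between the action and the dissipative force can be illustrated with a simple numerical experiment. The sketch below assumes a harmonic potential \(V = k x^2/2\) and a linear drag \(F_d = - b \dot{x}\); both are hypothetical choices made for illustration, as are the function name and the parameter values. The equation \(m \ddot{x} + \partial V/\partial x = F_d\) is integrated with a semi-implicit Euler scheme.

```python
def damped_oscillator_energies(m=1.0, k=1.0, b=0.1, x=1.0, v=0.0,
                               dt=1e-3, steps=10000):
    """Integrate m x'' + dV/dx = F_d with V = k x^2 / 2 and F_d = -b v.

    Returns the mechanical energy T + V recorded at every time step.
    """
    energies = [0.5 * m * v * v + 0.5 * k * x * x]
    for _ in range(steps):
        conservative = -k * x   # -dV/dx, the piece fixed by the action
        dissipative = -b * v    # F_d, supplied by hand through Newton's second law
        v += (conservative + dissipative) / m * dt   # update velocity first
        x += v * dt                                  # then position (semi-implicit Euler)
        energies.append(0.5 * m * v * v + 0.5 * k * x * x)
    return energies


damped = damped_oscillator_energies(b=0.1)
undamped = damped_oscillator_energies(b=0.0)
print("damped:   E_final / E_initial =", damped[-1] / damped[0])
print("undamped: E_final / E_initial =", undamped[-1] / undamped[0])
```

With \(b = 0\) the energy defined by the Lagrangian is conserved up to the small, bounded error of the integrator. With \(b > 0\) the same kinetic and conservative terms appear, but the energy drains away at the rate \(F_d \dot{x}\).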

More general Lagrangians

Returning to the discussion of the variational approach for obtaining the dynamical equations that govern a given system, let us consider a generalized version of the problem. Basically, we want to extend the idea to the case of a field theory in spacetime. To do this, we assume that the system is described by a set of fields \(\varPhi ^A\) defined on spacetime, i.e., depending on the coordinates \(x^a\). At this level, we can keep the discussion abstract and consider any number of fields, labelled by A. This set can (in principle) contain any number of scalar, vector or tensor fields. If we are interested in models containing vector fields, then the label A runs over all four components of each of the relevant fields. In that situation, the label A essentially becomes a spacetime index, like a. Tensor fields are treated in a similar way. As an example, discussed in more detail later, consider electromagnetism, for which the set of fields would be the vector potential \(A^a\) and the spacetime metric \(g_{a b}\), so that we have \(\varPhi ^A = \{A^a, g_{a b}\}\).

The action for the system should now take the form of an integral of a Lagrangian (density) \(\mathcal {L}\), which depends on the fields \(\varPhi ^A\) and their various derivatives (as “appropriate”). Integrating over a spacetime region R we would have

$$\begin{aligned} I= \int _R \mathcal {L} \left( \varPhi ^A, \partial _a \varPhi ^A, \partial _a \partial _b \varPhi ^A, \ldots \right) d^4 x \end{aligned}$$

Since we expect the theory to be covariant, we need the action to transform as a scalar under a general coordinate transformation. To ensure this, we need to involve the invariant volume element \(\sqrt{-g}d^4x\), where g is the determinant of the metric, as before. Defining the scalar Lagrangian L we then have

$$\begin{aligned} I= \int _R L \sqrt{-g}\ d^4 x \end{aligned}$$

(which is a scalar by construction).


As in the case of a point particle, we can derive the field equations by demanding that the action is stationary under variations in the fields. Letting

$$\begin{aligned} \varPhi ^A \rightarrow \varPhi ^A + \delta \varPhi ^A \end{aligned}$$

and assuming, for simplicity, that the theory is “local” (meaning that only first derivatives of the fields appear in the action) we need also

$$\begin{aligned} \partial _a \varPhi ^A \rightarrow \partial _a \varPhi ^A + \partial _a \left( \delta \varPhi ^A\right) = \partial _a \varPhi ^A + \delta \left( \partial _a \varPhi ^A\right) \end{aligned}$$

Given these relations, the action changes from \(I\) to \(I+\delta I\), where

$$\begin{aligned} \delta I = \int _R \delta \mathcal {L} d^4 x = \int _R \left[ {\partial \mathcal {L} \over \partial \varPhi ^A} \delta \varPhi ^A + {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \delta \left( \partial _a \varPhi ^A\right) \right] d^4x \end{aligned}$$

To make progress we need to factor out \(\delta \varPhi ^A\) from the second term in the integrand. This is achieved by integrating by parts:

$$\begin{aligned}&\int _R {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \delta \left( \partial _a \varPhi ^A\right) \ d^4x \nonumber \\&\quad = \int _R \partial _a \left[ {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \delta \varPhi ^A \right] d^4x - \int _R \partial _a \left[ {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \right] \delta \varPhi ^A \ d^4x \end{aligned}$$

At this point we make use of the fact that the first term is a total derivative, which can be turned into an integral over the bounding surface (in the usual way). Inspired by the boundary conditions imposed on the variations in the point-particle case, we then restrict ourselves to variations \(\delta \varPhi ^A\) that vanish on the boundary. Thus, we can neglect the first integral (later referred to as the “surface terms”), ending up with

$$\begin{aligned} \delta I = \int _R \left\{ {\partial \mathcal {L} \over \partial \varPhi ^A}- \partial _a \left[ {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \right] \right\} \delta \varPhi ^A \ d^4x \end{aligned}$$

Demanding that \(\delta I=0\) we see that the variational derivative satisfies

$$\begin{aligned} {\delta \mathcal {L} \over \delta \varPhi ^A } = {\partial \mathcal {L} \over \partial \varPhi ^A}- \partial _a \left[ {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \right] = 0 . \end{aligned}$$

These are the Euler-Lagrange equations that govern the evolution of the fields \(\varPhi ^A\).

So far, we have developed the theory for the Lagrangian density \(\mathcal {L}\), rather than the Lagrangian L itself. This is not a problem, as we can simply consider the components of the metric as belonging to the set of fields that we vary. However, the added complication (due to the presence of \(\sqrt{-g}\) and the derivatives that need to be evaluated) may be unnecessary in many cases. In such situations one can often express the Lagrangian in terms of the covariant derivative \(\nabla _a\) instead of the partial \(\partial _a\). Essentially, this involves reworking the algebra taking as starting point an action of the form

$$\begin{aligned} I = \int _R L \left( \varPhi ^A, \nabla _a \varPhi ^A, \ldots , g_{a b}, \partial _c g_{a b}, \ldots \right) \sqrt{-g}\ d^4x \end{aligned}$$

where the fields \(\varPhi ^A\) are now independent of the metric, although the Lagrangian may still contain \(g_{a b}\) in contractions of spacetime indices to construct the required scalar. After some algebra, we find that

$$\begin{aligned} {\delta L \over \delta \varPhi ^A } = {\partial L \over \partial \varPhi ^A} - \nabla _a \left[ {\partial L \over \partial \left( \nabla _a \varPhi ^A\right) } \right] = 0 \end{aligned}$$

This is the form of the Euler-Lagrange equations that we will be using in the following.
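To see these equations at work in the simplest possible setting, consider a single real scalar field \(\phi \) (so that the label A takes only one value). The Lagrangian below, with an unspecified potential \(V(\phi )\), is an illustrative assumption on our part and not one of the models developed in this review:

$$\begin{aligned} L = - {1 \over 2} g^{a b} \nabla _a \phi \nabla _b \phi - V(\phi ) \quad \Longrightarrow \quad {\partial L \over \partial \phi } = - {d V \over d \phi } , \qquad {\partial L \over \partial \left( \nabla _a \phi \right) } = - \nabla ^a \phi . \end{aligned}$$

The Euler-Lagrange equations then give

$$\begin{aligned} \nabla _a \nabla ^a \phi = {d V \over d \phi } , \end{aligned}$$

which, for \(V = m^2 \phi ^2/2\), is the Klein-Gordon equation on a curved background.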


As a first “explicit” example of the variational approach, let us derive the field equations for electromagnetism (Hobson et al. 2006). In this case, the starting point is the electromagnetic vector potential \(A^a\), which (in turn) leads to the Faraday tensor

$$\begin{aligned} F_{ab}=\nabla _a A_b - \nabla _b A_a \end{aligned}$$

Because of the anti-symmetry, this object has 6 components which can (as we will see later) be associated with the electric and magnetic fields, leading to a (presumably) more familiar picture. However, these fields are manifestly observer dependent (a moving charge leads to a magnetic field etc.) so, from a formal point of view, it is better to develop the theory in terms of \(F_{ab}\). Making contact with the previous discussion and the variational approach, the fields \(\varPhi ^A\) to be varied will be the four components of \(A^a\). The first step of the derivation is to construct a suitable scalar Lagrangian from \(A^a\) and its first derivatives. However, already at this point we run into “trouble”. We know that the theory is gauge-invariant, since we can add \(\nabla _a \psi = \partial _a \psi \) (where \(\psi \) is an arbitrary scalar) to the vector potential without altering the physics (read: \(F_{ab}\)). The upshot of this is that we need to ensure that the electromagnetic action is invariant under the transformation

$$\begin{aligned} A_a \rightarrow A_a + \nabla _a \psi \end{aligned}$$

This constrains the permissible Lagrangians. For example, we cannot use the contraction \(A^a A_a=g_{ab}A^a A^b\) since this combination is not gauge invariant. However, it is easy to see that \(F_{ab}\) exhibits the required invariance, so we can use it as our main building block. The obvious thing to do would be to try to use the scalar \(F_{a b} F^{a b}\) to build the Lagrangian. However, this would not account for the fact that the charge current \(j^a\) acts as a source of the electromagnetic field. To reflect this, we add an “interaction term” \(j^a A_a\) to the Lagrangian (leaving the details of this for later). At the end of the day, the Lagrangian takes the form

$$\begin{aligned} L= - {1 \over 4 \mu _0} F_{a b} F^{a b} + j^a A_a \end{aligned}$$

where \(\mu _0\) is a constant (describing the strength of the coupling).

At this point, we realize that the current term is not gauge-invariant. It would transform as

$$\begin{aligned} j^a A_a \rightarrow j^a A_a + j^a \nabla _a \psi = j^a A_a + \nabla _a \left( \psi j^a\right) - \psi \left( \nabla _a j^a\right) \end{aligned}$$

We already know that the second term contributes a surface term to the action integral, and hence can be “ignored”. The third term is different. In order to ensure that the action is gauge-invariant, we must demand that the current is conserved, i.e.

$$\begin{aligned} \nabla _a j^a = 0 . \end{aligned}$$

The field equations that we derive require this constraint to be satisfied. Later, when we consider the fluid problem, we will see that the conservation of the matter flux plays a similar role.
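For completeness, the gauge invariance of \(F_{ab}\) that was used in this argument can be checked in one line. Assuming (as throughout) a torsion-free connection, second covariant derivatives of a scalar commute, and hence

$$\begin{aligned} F_{a b} \rightarrow \nabla _a \left( A_b + \nabla _b \psi \right) - \nabla _b \left( A_a + \nabla _a \psi \right) = F_{a b} + \left( \nabla _a \nabla _b - \nabla _b \nabla _a \right) \psi = F_{a b} . \end{aligned}$$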

Having established an invariant scalar Lagrangian, we determine the Euler-Lagrange equations by varying the fields \(A_a\) (keeping the source \(j^a\) fixed). From (4.22) we then have

$$\begin{aligned} {\partial L \over \partial A_a} - \nabla _b \left[ {\partial L \over \partial \left( \partial _b A_a\right) } \right] = 0 . \end{aligned}$$

From the stated form of the action (and recalling the discussion of the point particle) we see that

$$\begin{aligned} {\partial L \over \partial A_a} = j^a \end{aligned}$$

The second term is messier, but after a bit of work we arrive at

$$\begin{aligned} {\partial L \over \partial \left( \partial _b A_a\right) } = - {1 \over \mu _0} F^{b a} = {1 \over \mu _0} F^{a b} \end{aligned}$$

which leads to the final field equation

$$\begin{aligned} \nabla _b F^{a b} = \mu _0 j^a . \end{aligned}$$
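The “bit of work” behind the derivative of the quadratic term can be sketched as follows, holding the metric fixed and using the fact that the connection terms cancel in \(F_{c d} = \partial _c A_d - \partial _d A_c\). We then have

$$\begin{aligned} {\partial F_{c d} \over \partial \left( \partial _b A_a\right) } = \delta ^b_c \delta ^a_d - \delta ^b_d \delta ^a_c \quad \Longrightarrow \quad {\partial \over \partial \left( \partial _b A_a\right) } \left( - {1 \over 4 \mu _0} F_{c d} F^{c d} \right) = - {1 \over 2 \mu _0} \left( F^{b a} - F^{a b} \right) = {1 \over \mu _0} F^{a b} , \end{aligned}$$

where the factor of 2 arises because the Lagrangian is quadratic in \(F_{c d}\). Inserting this in the Euler-Lagrange equations gives \(j^a - \nabla _b F^{a b}/\mu _0 = 0\), in agreement with the field equation above.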

The relativistic Maxwell equations are completed by

$$\begin{aligned} \nabla _{[c} F_{a b]} = 0 \quad \Longrightarrow \quad \nabla _c F_{a b} + \nabla _b F_{c a} + \nabla _a F_{b c} = 0 \end{aligned}$$

which is automatically satisfied because \(F_{a b}\) is defined as the anti-symmetrized derivative of the vector potential.
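To see why no additional input is required, note that (for a torsion-free connection) the Christoffel symbols cancel in the anti-symmetrized derivative, so we may replace \(\nabla _a\) by \(\partial _a\). A short check, written out as a sketch:

$$\begin{aligned} \partial _c \left( \partial _a A_b - \partial _b A_a \right) + \partial _b \left( \partial _c A_a - \partial _a A_c \right) + \partial _a \left( \partial _b A_c - \partial _c A_b \right) = 0 . \end{aligned}$$

Each term cancels against exactly one other because partial derivatives commute.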


The Einstein field equations

Having discussed the underlying principles and considered the explicit example of electromagnetism, we have reached the level of confidence required to derive the field equations of General Relativity. We know that the metric \(g_{a b}\) is the central object of the theory (essentially, because we are looking for a theory where the geometry plays a key role). To build the Lagrangian we therefore want to construct a simple (for elegance) scalar from the metric and its derivatives. The simplest object we can think of is the Ricci scalar, R. This is, in fact, the only scalar that contains only the metric and its first two derivatives. Moreover, it is natural that the Lagrangian involves a quantity which is directly linked to the spacetime curvature, and the Ricci scalar fits this bill, as well.

This argument leads to the celebrated Einstein–Hilbert action

$$\begin{aligned} I_\mathrm {EH} = \int _R R \sqrt{-g}\ d^4x . \end{aligned}$$

In this case, where the Lagrangian depends on the metric, it is natural to work directly with the density \(\mathcal {L} = R \sqrt{-g}\). From (4.20) we then see that

$$\begin{aligned} {\partial \mathcal {L}\over \partial g_{a b} } - \partial _c \left[ { \partial \mathcal {L} \over \partial \left( \partial _c g_{a b}\right) }\right] +\partial _d \partial _c \left[ { \partial \mathcal {L} \over \partial \left( \partial _d \partial _c g_{a b}\right) }\right] = 0 , \end{aligned}$$

where we have allowed for the fact that the Lagrangian also depends on the second derivatives of the metric (the extension of the analysis to allow for this is straightforward). Having a go at evaluating the required derivatives, we soon appreciate that this task is formidable. Luckily, there is an easier way to arrive at the answer.

Let us consider the variation in the action that results from a metric variation \(g_{a b} \rightarrow g_{a b} + \delta g_{a b}\). Carrying out this analysis we need the variation of the contravariant metric, which follows readily:

$$\begin{aligned} g^{a b} g_{b c} = \delta ^a_c \qquad \Longrightarrow \qquad \delta g^{a b} = - g^{a c} g^{b d} \delta g_{c d} . \end{aligned}$$

Making use of the fact that \(R = g^{a b} R_{a b}\), we then have

$$\begin{aligned} \delta I_\mathrm {EH} = \int _R \left[ \delta g^{a b} R_{a b} + g^{a b} \delta R_{a b} \right] \sqrt{-g}\ d^4x + \int _R g^{a b} R_{a b} \delta \sqrt{-g}\ d^4x . \end{aligned}$$

Since the metric is the fundamental variable, we need to factor out \(\delta g^{ab}\) (somehow). The terms in the second integral are easiest to deal with. Given that g is the determinant of the metric, the expression we need follows from (A.11). That is, we have

$$\begin{aligned} \delta \sqrt{-g} = -{1 \over 2} \sqrt{-g}\ g_{a b} \delta g^{a b} . \end{aligned}$$
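One route to this result can be sketched as follows. We assume the standard identity for the variation of a determinant (Jacobi's formula), \(\delta g = g \, g^{a b} \delta g_{a b}\), which may be stated in a different form in the appendix. Together with \(g^{a b} \delta g_{a b} = - g_{a b} \delta g^{a b}\), which follows from varying \(g^{a b} g_{a b} = 4\), this gives

$$\begin{aligned} \delta \sqrt{-g} = - {1 \over 2 \sqrt{-g}} \delta g = {1 \over 2} \sqrt{-g}\ g^{a b} \delta g_{a b} = - {1 \over 2} \sqrt{-g}\ g_{a b} \delta g^{a b} . \end{aligned}$$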

Turning to the second term in the first bracket of (4.36), the easiest way to progress is to consider the variation of the Riemann tensor and then construct the expression for the Ricci tensor by contraction. Moreover, noting that the Riemann tensor variation is expressed in terms of variations of the connection, \(\delta \varGamma ^c_{\ a b}\), which is a tensor, we can simplify the analysis by working in a local inertial frame (where \(\varGamma ^c_{\ a b}=0\)). Thus, we have

$$\begin{aligned} \delta R^d_{\ a b c} = \nabla _b \left( \delta \varGamma ^d_{\ a c}\right) - \nabla _c \left( \delta \varGamma ^d_{\ a b}\right) . \end{aligned}$$

As this is also a tensor expression it is valid in any coordinate system. Carrying out the required contraction, we find that

$$\begin{aligned} \delta R_{a b} = \nabla _b \left( \delta \varGamma ^c_{\ a c}\right) - \nabla _c \left( \delta \varGamma ^c_{\ a b}\right) . \end{aligned}$$

Using this expression we see that

$$\begin{aligned} g^{a b} \delta R_{a b} = \nabla _b \left( g^{a b} \delta \varGamma ^{c}_{\ a c} - g^{a c} \delta \varGamma ^b_{\ a c} \right) . \end{aligned}$$

In other words, the term that we need in (4.36) can be written as a total derivative. Given that this leads to a surface term, we duly neglect it and arrive at the final result:

$$\begin{aligned} \delta I_\mathrm {EH} = \int _R \left( R_{a b} -{1 \over 2} g_{a b} R \right) \delta g^{a b} \sqrt{-g} \ d^4x . \end{aligned}$$
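The bookkeeping behind this result can be sketched as follows. Writing the total derivative as \(\nabla _b W^b\), where \(W^b\) is simply a shorthand introduced here for the bracket in the expression for \(g^{a b} \delta R_{a b}\), we have

$$\begin{aligned} \int _R \nabla _b W^b \sqrt{-g}\ d^4x = \int _R \partial _b \left( \sqrt{-g}\ W^b \right) d^4x , \end{aligned}$$

which is a pure surface term. The remaining contributions combine as

$$\begin{aligned} \delta I_\mathrm {EH} = \int _R \left[ R_{a b} \delta g^{a b} + R \left( - {1 \over 2} g_{a b} \delta g^{a b} \right) \right] \sqrt{-g}\ d^4x , \end{aligned}$$

where the second term originates from the variation of \(\sqrt{-g}\).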

The vanishing of the variation leads to the vacuum Einstein equations

$$\begin{aligned} G_{a b} = R_{a b} - {1 \over 2} g_{a b} R = 0 . \end{aligned}$$

The derivation highlights the fact that Einstein’s theory is one of the most elegant constructions of modern physics.

The stress-energy tensor as obtained from the action principle

However aesthetically pleasing the theory may be, our main interest here is not in the vacuum dynamics of Einstein’s theory. Rather, we want to explore the matter sector. In Einstein’s Universe, matter plays a dual role—it (actively) provides the origin of the spacetime curvature and the gravitational field and (perhaps not quite passively) adjusts its motion according to this curvature.

In particular, we want to explore systems of astrophysical relevance for which general relativistic aspects are crucial. Inevitably, this involves some rather complex physics. However, the coupling to the spacetime curvature remains relatively straightforward as it is encoded in a single object: the stress-energy tensor \(T_{a b}\). This object is as important for General Relativity as the Einstein tensor \(G_{a b}\) in that it enters the Einstein equations in as direct a way as possible, i.e. (in geometric units)

$$\begin{aligned} G_{a b} = 8 \pi T_{a b} . \end{aligned}$$

From a conceptual point-of-view it is relatively easy to incorporate matter in the variational derivation from the previous section. Essentially, we add a matter component such that (cf. the argument for electromagnetism)

$$\begin{aligned} I = I_\mathrm {EH} + I_\mathrm {M} = \int _R \left( {1\over 2\kappa } R + L \right) \sqrt{-g}\ d^4x \end{aligned}$$

where \(\kappa = 8\pi G/c^4\) is a coupling constant fixed by Newtonian correspondence in the weak-field limit. Given the results for the vacuum gravity problem, it is easy to see that the matter contribution to the field equations follows from the variation of the matter action with respect to the metric. This insight will be very important later. In essence, the Einstein equations take the form

$$\begin{aligned} G_{a b} = \kappa T_{a b} \end{aligned}$$

provided that

$$\begin{aligned} T_{a b} = - \frac{2}{\sqrt{- g}} { \delta \mathcal {L}_\mathrm {M} \over \delta g^{a b}}= - \frac{2}{\sqrt{- g}} { \delta \left( \sqrt{- g} L \right) \over \delta g^{a b}} , \end{aligned}$$

or, equivalently,

$$\begin{aligned} T^{a b}= \frac{2}{\sqrt{- g}} {\delta \left( \sqrt{- g} L \right) \over \delta g_{a b}} . \end{aligned}$$

Applying this result to the case of electromagnetism and (4.25), we see that the relevant stress-energy tensor takes the form

$$\begin{aligned} T_{a b}^\mathrm {EM} = {1\over \mu _0} \left[ g^{c d}F_{a c}F_{b d}-{1\over 4}g_{a b} \left( F_{c d}F^{c d}\right) \right] . \end{aligned}$$
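The steps behind this expression can be sketched as follows, restricting attention to the source-free part of (4.25) and treating \(F_{a b}\) (with indices down) as independent of the metric. Since \(\delta \sqrt{-g} = - \sqrt{-g}\ g_{a b} \delta g^{a b}/2\), the definition of the stress-energy tensor reduces to

$$\begin{aligned} T_{a b} = - 2 {\partial L \over \partial g^{a b}} + g_{a b} L . \end{aligned}$$

Writing \(F_{c d} F^{c d} = g^{c e} g^{d f} F_{c d} F_{e f}\), each of the two inverse metrics contributes equally, so that

$$\begin{aligned} {\partial \left( F_{c d} F^{c d} \right) \over \partial g^{a b}} = 2 g^{c d} F_{a c} F_{b d} \quad \Longrightarrow \quad T_{a b} = {1 \over \mu _0} g^{c d} F_{a c} F_{b d} - {1 \over 4 \mu _0} g_{a b} F_{c d} F^{c d} . \end{aligned}$$

A useful check is that the trace vanishes in four dimensions, since \(g^{a b} g^{c d} F_{a c} F_{b d} = F_{c d} F^{c d}\) and \(g^{a b} g_{a b} = 4\).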

Case study: single fluids

Without an a priori, physics-based specification for \(T_{a b}\), solutions to the Einstein equations are void of physical content, a point which has been emphasized, for instance, by Geroch and Horowitz (in Hawking and Israel 1979). Unfortunately, the following algorithm for producing “solutions” has been much abused: (i) specify the form of the metric, typically by imposing some type of symmetry (or symmetries), (ii) work out the components of \(G_{a b}\) based on this metric, (iii) define the energy density to be \(G_{0 0}\) and the pressure to be \(G_{1 1}\), say, and thereby “solve” those two equations, and (iv) based on the “solutions” for the energy density and pressure solve the remaining Einstein equations. The problem is that this algorithm is little more than a mathematical parlour game. It is only by sheer luck that it will generate a physically relevant solution for a non-vacuum spacetime. As such, the strategy is antithetical to the raison d’être of, say, gravitational-wave astrophysics, which is to use observed data as a probe of the microphysics, say, in the cores of neutron stars. Much effort is currently going into taking given microphysics and combining it with the Einstein equations to model gravitational-wave emission from astrophysical scenarios, like binary neutron star mergers (Baiotti and Rezzolla 2017). To achieve this aim, we need an appreciation of the stress-energy tensor and how it encodes the physics.

General stress decomposition

Readers familiar with Newtonian fluids will be aware of the roles that the internal energy (recall the discussion in Sect. 2), the particle flux, and the stress tensor play in the fluid equations. In special relativity we learn that, in order to have spacetime covariant theories (e.g., well-behaved with respect to the Lorentz transformation) energy and momentum must be combined into a spacetime vector, whose zeroth component is the energy while the spatial components give the momentum (as measured by a given observer). The fluid stress must also be incorporated into a spacetime object, hence the necessity for \(T_{a b}\). Because the Einstein tensor’s covariant divergence vanishes identically, we must have

$$\begin{aligned} \nabla _b T^b{}_a = 0 . \end{aligned}$$

This provides us with four equations, often interpreted as the equations for relativistic fluid dynamics. As we will soon see, this interpretation makes “sense” (as the equations we arrive at reduce to the familiar Newtonian ones in the appropriate limit). However, from a formal point of view the argument is somewhat misleading. It leaves us with the impression that the job is done, but this is not (quite) the case. Sure, we are able to speedily write down the equations for a perfect fluid. But we still have work to do if we want to consider more complex settings (e.g., including relative flows). This requires additional assumptions or a different approach altogether. One of the main aims of this review is to develop such an alternative and explore the results in a variety of settings. Having done this, we will see that (5.1) follows automatically once the “fluid equations” are satisfied. This may seem like splitting hairs at the moment, but the point we are trying to make should become clear as we progress.

The fact that we advocate a different strategy does not mean that the importance of the stress-energy tensor is (somehow) reduced. Not at all. We still need \(T_{ab}\) to provide the matter input for the Einstein equations and we may opt to use (5.1) to get (some of) the dynamical equations we need. Given this, it is important to understand the physical meaning of the components of \(T_{a b}\). In order to do this, we need to introduce a suitable observer (someone has to measure energy etc. for us). This then allows us to express the tensor components in terms of projections into the timelike and spacelike directions associated with this observer, in essence providing a fibration of spacetime as illustrated in Fig. 8.

In order to project a tensor along an observer’s timelike direction we contract that index with the observer’s four-velocity, \(U^a\). The required projection of a tensor into spacelike directions perpendicular to the timelike direction defined by \(U^a\) is effected via the operator \(\perp ^a_b\), defined as

$$\begin{aligned} \perp ^a_b = \delta ^a{}_b + U^a U_b , \qquad U^a U_a = - 1 \quad \Longrightarrow \quad \perp ^a_b U^b = 0 \end{aligned}$$

Any tensor index that has been “hit” with the projection operator will be perpendicular to the timelike direction defined (locally) by \(U^a\). It is then easy to see that any vector can be expressed in terms of its component along a given \(U^a\) and components orthogonal (in the spacetime sense) to it. That is, we have

$$\begin{aligned} V^a = \delta ^a_b V^b + \underbrace{(U^a U_b V^b - U^a U_b V^b)}_{=0} = -(U_bV^b) U^a + \perp ^a_b V^b \end{aligned}$$

The two projections (of a vector \(V^a\) for an observer with unit four-velocity \(U^a\)) are illustrated in Fig. 8. More general tensors are projected by acting with \(U^a\) or \(\perp ^a_b\) on each index separately (i.e., multi-linearly).

Fig. 8

The projections of a vector \(V^a\) onto the worldline defined by \(U^a\) (providing a fibration of spacetime) and into the perpendicular hypersurface (obtained from a projection with \(\perp ^a_b\))
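Two properties of the projection operator are used repeatedly and are worth verifying explicitly (a short sketch, using only \(U^a U_a = -1\)). First, projecting twice is the same as projecting once,

$$\begin{aligned} \perp ^a_b \perp ^b_c = \left( \delta ^a{}_b + U^a U_b \right) \left( \delta ^b{}_c + U^b U_c \right) = \delta ^a{}_c + 2 U^a U_c + \left( U_b U^b \right) U^a U_c = \perp ^a_c , \end{aligned}$$

and, second, the trace counts the three spatial directions,

$$\begin{aligned} \perp ^a_a = \delta ^a{}_a + U^a U_a = 4 - 1 = 3 . \end{aligned}$$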

Let us now see how we can use the projection to give physical “meaning” to the components of the stress-energy tensor. The energy density \(\varepsilon \) as perceived by the observer is (see Eckart 1940 for one of the earliest discussions)

$$\begin{aligned} \varepsilon = U^a U^b T_{a b} , \end{aligned}$$

while

$$\begin{aligned} \mathcal{P}_a = - \perp ^b_a U^c T_{b c} \end{aligned}$$

is the spatial momentum density (as it does not have a contribution along \(U^a\) it is a three vector), and the spatial stresses are encoded in

$$\begin{aligned} \mathcal{S}_{a b} = \perp ^c_a \perp ^d_b T_{c d} . \end{aligned}$$

As usual, the manifestly spatial component \(\mathcal {S}_{i j}\) is understood to be the ith-component of the force across a unit area perpendicular to the jth-direction. With respect to the observer, the stress-energy tensor can now be written (in complete generality) as

$$\begin{aligned} T_{a b} = \varepsilon \, U_a U_b + 2 U_{(a} \mathcal{P}_{b)} + \mathcal{S}_{a b}, \end{aligned}$$

where \(2 U_{(a} \mathcal{P}_{b)} \equiv U_a \mathcal{P}_b + U_b \mathcal{P}_a\). Because \(U^a \mathcal{P}_a = 0\), we see that the trace \(T = T^a{}_a\) is

$$\begin{aligned} T = \mathcal{S} - \varepsilon , \end{aligned}$$

where \(\mathcal{S} = \mathcal{S}^a{}_a\).
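As a consistency check (a sketch using only \(U^a U_a = -1\), \(U^a \mathcal{P}_a = 0\) and \(U^a \mathcal{S}_{a b} = 0\)), the decomposition returns the quantities it was built from. Contracting once with the four-velocity gives

$$\begin{aligned} U^c T_{b c} = - \varepsilon \, U_b - \mathcal{P}_b \quad \Longrightarrow \quad U^b U^c T_{b c} = \varepsilon , \qquad - \perp ^b_a U^c T_{b c} = \mathcal{P}_a , \end{aligned}$$

while the trace follows from

$$\begin{aligned} g^{a b} T_{a b} = \varepsilon \, U^a U_a + 2 U^a \mathcal{P}_a + \mathcal{S}^a{}_a = \mathcal{S} - \varepsilon . \end{aligned}$$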

It is important at this stage to appreciate that we are discussing a mathematical construction. We need to take further steps to connect the phenomenology to the underlying physics.

“Off-the-shelf” analysis

As we have already suggested, there are different ways of deriving the general relativistic fluid equations. Our purpose here is not to review all possible approaches, but rather to focus on a couple: (i) an “off-the-shelf” consistency analysis for the simplest fluid à la Eckart (1940), to establish some of the key ideas, and then (ii) a more powerful method based on an action principle that varies fluid element world lines. We now consider the first of these. The second avenue will be explored in Sect. 6.

We have seen how the components of a general stress-energy tensor can be projected onto a coordinate system carried by an observer moving with four-velocity \(U^a\). Let us now connect this with the motion of a fluid. The simplest fluid is one for which there is only one four-velocity \(u^a\). As both four velocities are normalized (to unity) we must have

$$\begin{aligned} u^a = \gamma (U^a + v^a) , \quad \text{ with } \quad U_a v^a = 0 \quad \text{ and } \quad \gamma = (1-v^2)^{-1/2} \end{aligned}$$

where \(\gamma \) is the familiar Lorentz factor from special relativity. Clearly, the problem simplifies if we assume that the observer rides along with the fluid. That is, we introduce a preferred frame defined by \(u^a\), and then simply take \(U^a = u^a\). With respect to the fluid there will then (by definition) be no momentum flux, i.e., \(\mathcal{P}_a = 0\). Moreover, since we use a fully spacetime covariant formulation, i.e., there are only spacetime indices, the resulting stress-energy tensor will transform properly under general coordinate transformations, and hence can be used for any observer.

In general, the spatial stresses are given by a two-index, symmetric tensor, and the only objects that can be used to carry the indices (in the simple model we are considering at this point) are the four-velocity \(u^a\) and the metric \(g_{a b}\). Furthermore, because the spatial stress must also be symmetric, the only possibility is a linear combination of \(g_{a b}\) and \(u_a u_b\). Given that \(u^b \mathcal{S}_{b a} = 0\), we must have

$$\begin{aligned} \mathcal{S}_{a b} = \frac{1}{3} \mathcal{S} (g_{a b} + u_a u_b). \end{aligned}$$

As the system is assumed to be locally isotropic, it is possible to diagonalize the spatial stress tensor. This also implies that its three independent diagonal elements should actually be equal to the same quantity, which turns out to be the local pressure. Hence we have \(p = \mathcal{S}/3\) and

$$\begin{aligned} T_{a b} = \left( \varepsilon + p\right) u_a u_b + p g_{a b} = \varepsilon u_a u_b + p \perp _{ab} . \end{aligned}$$

This is the well-established result for a perfect fluid.
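As a concrete check (a sketch, assuming a local inertial frame that is comoving with the fluid, so that \(g_{a b} = \eta _{a b} = \mathrm {diag}(-1,1,1,1)\), \(u^a = (1,0,0,0)\) and \(u_a = (-1,0,0,0)\)), the components are

$$\begin{aligned} T_{0 0} = (\varepsilon + p) - p = \varepsilon , \qquad T_{0 i} = 0 , \qquad T_{i j} = p \, \delta _{i j} , \end{aligned}$$

so that \(T_{a b} = \mathrm {diag}(\varepsilon , p, p, p)\): an energy density and an isotropic pressure, with no momentum flux and no shear stresses.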

Given a relation \(p = p(\varepsilon )\) (an equation of state), there are four independent fluid variables. Because of this the equations of motion are often understood to be given by (5.1). Let us proceed along these lines, but first simplify matters by assuming that the equation of state is given by a relation of the form \(\varepsilon = \varepsilon (n)\) where n is the particle number density. As discussed in Sect. 2, the chemical potential \(\mu \) is then given by

$$\begin{aligned} {d} \varepsilon = \frac{d \varepsilon }{d n} {d} n \equiv \mu \, {d} n , \end{aligned}$$

and we know from the Euler relation (2.8) that

$$\begin{aligned} \mu n = p + \varepsilon . \end{aligned}$$

In essence, we have connected the model to the thermodynamics. This is an important step.

Let us now get rid of the free index of \(\nabla _b T^b{}_a = 0\) in two ways: first, by contracting with \(u^a\) and second, by projecting with \(\perp ^a_b\) (recalling that \(U^a = u^a\)). Given that \(u^a u_a = - 1\) we have the identity

$$\begin{aligned} \nabla _a \left( u^b u_b\right) = 0 \qquad \Longrightarrow \qquad u_b \nabla _a u^b = 0. \end{aligned}$$

Contracting (5.1) with \(u^a\) and using this identity gives

$$\begin{aligned} u^a \nabla _a \varepsilon + (\varepsilon + p) \nabla _a u^a = 0 . \end{aligned}$$

The definition of the chemical potential \(\mu \) and the Euler relation allow us to rewrite this as

$$\begin{aligned} \mu u^a \nabla _a n + \mu n \nabla _a u^a = 0 \qquad \Longrightarrow \qquad \nabla _a n^a = 0 , \end{aligned}$$

where we have introduced the particle flux, \(n^a \equiv n u^a\). This result simply represents the fact that the particles are conserved.

Meanwhile, projection of the free index in (5.1) using \(\perp ^b_a\) leads to

$$\begin{aligned} (\varepsilon + p) a_a = - \perp ^b_a \nabla _b p , \end{aligned}$$

where \(a_a \equiv u^b \nabla _b u_a\) is the fluid (four) acceleration. This is reminiscent of the Euler equation for Newtonian fluids. In fact, we demonstrate in Sect. 7.1 that the non-relativistic limit of (5.17) leads to the Newtonian result.
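Both projections follow from a single expansion of the divergence, which we sketch here for the perfect fluid stress-energy tensor \(T^b{}_a = (\varepsilon + p) u^b u_a + p \, \delta ^b{}_a\):

$$\begin{aligned} \nabla _b T^b{}_a = u_a \nabla _b \left[ (\varepsilon + p) u^b \right] + (\varepsilon + p) u^b \nabla _b u_a + \nabla _a p . \end{aligned}$$

Contracting with \(u^a\), and using \(u^a u_a = -1\) together with \(u^a u^b \nabla _b u_a = 0\), gives

$$\begin{aligned} u^a \nabla _b T^b{}_a = - \nabla _b \left[ (\varepsilon + p) u^b \right] + u^a \nabla _a p = - u^a \nabla _a \varepsilon - (\varepsilon + p) \nabla _a u^a , \end{aligned}$$

while acting with \(\perp ^a_c\) removes the first term (since \(\perp ^a_c u_a = 0\)) and leaves the acceleration untouched (since \(u^a a_a = 0\)):

$$\begin{aligned} \perp ^a_c \nabla _b T^b{}_a = (\varepsilon + p) a_c + \perp ^a_c \nabla _a p . \end{aligned}$$

Setting each expression to zero reproduces the two results above.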

However, we should not be too quick to think that this is the only way to understand (5.1)! There is an alternative form that makes the perfect fluid have more in common with vacuum electromagnetism. If we define

$$\begin{aligned} \mu _a = \mu u_a , \end{aligned}$$

then the stress-energy tensor can be written in the form

$$\begin{aligned} T^a{}_b = p \delta ^a{}_b + n^a \mu _b . \end{aligned}$$

We have here our first encounter with the fluid element momentum \(\mu _a\) that is conjugate to the particle flux, the number density current \(n^a\). Its importance will become clearer as this story develops, particularly when we discuss the multi-fluid problem. For now, we simply note that \(u_a {d} u^a = 0\) implies that we will have

$$\begin{aligned} {d} \varepsilon = - \mu _a \, {d} n^a . \end{aligned}$$

This relation will serve as the starting point for the fluid action principle in Sect. 6, where \(- \varepsilon \) will be taken to be the fluid Lagrangian.
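The intermediate step is short enough to spell out (a sketch using only \(n^a = n u^a\), \(\mu _a = \mu u_a\) and \(u^a u_a = -1\)):

$$\begin{aligned} {d} n^a = u^a \, {d} n + n \, {d} u^a \quad \Longrightarrow \quad \mu _a \, {d} n^a = \mu \left( u_a u^a \right) {d} n + \mu n \left( u_a \, {d} u^a \right) = - \mu \, {d} n = - {d} \varepsilon . \end{aligned}$$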

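If we instead take the divergence of this form of the stress-energy tensor, as required by (5.1), and use \({d} p = n \, {d} \mu \) (which follows from the Euler relation and the definition of the chemical potential), we arrive at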

$$\begin{aligned} f_a + \left( \nabla _b n^b \right) \mu _a = 0 , \end{aligned}$$

where the force density \(f_a\) is

$$\begin{aligned} f_a = n^b \omega _{b a} , \end{aligned}$$

and the vorticity \(\omega _{a b}\) is defined as

$$\begin{aligned} \omega _{a b} \equiv 2 \nabla _{[ a} \mu _{b ]} = \nabla _a \mu _b - \nabla _b \mu _a . \end{aligned}$$

Contracting Eq. (5.21) with \(n^a\) we see (since \(\omega _{a b} = - \omega _{b a}\)) that

$$\begin{aligned} \nabla _a n^a = 0 \end{aligned}$$

and, as a consequence, the equations of motion take the form

$$\begin{aligned} f_a = n^b \omega _{b a} = 0 . \end{aligned}$$
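The steps leading to this form can be sketched as follows. Taking the divergence of \(T^a{}_b = p \, \delta ^a{}_b + n^a \mu _b\) gives

$$\begin{aligned} \nabla _a T^a{}_b = \nabla _b p + \mu _b \nabla _a n^a + n^a \nabla _a \mu _b . \end{aligned}$$

The Euler relation \(p = \mu n - \varepsilon \) and \({d} \varepsilon = \mu \, {d} n\) imply \({d} p = n \, {d} \mu \), while \(u^a \nabla _b u_a = 0\) leads to \(n^a \nabla _b \mu _a = - n \nabla _b \mu \). Hence \(\nabla _b p = - n^a \nabla _b \mu _a\) and

$$\begin{aligned} \nabla _a T^a{}_b = n^a \left( \nabla _a \mu _b - \nabla _b \mu _a \right) + \mu _b \nabla _a n^a = n^a \omega _{a b} + \mu _b \nabla _a n^a . \end{aligned}$$

Contracting with \(n^b\), the first term vanishes by anti-symmetry, and since \(n^b \mu _b = - n \mu \ne 0\) we are left with \(\nabla _a n^a = 0\).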

The vorticity two-form \(\omega _{a b}\) has emerged quite naturally as an essential ingredient of the fluid dynamics (Lichnerowicz 1967; Carter 1989a; Bekenstein 1987; Katz 1984). This is a key result. Readers familiar with Newtonian fluids should be inspired by this, as the vorticity is used to establish theorems on fluid behaviour (for instance the Kelvin–Helmholtz theorem; Landau and Lifshitz 1959) and is at the heart of turbulence modelling (Pullin and Saffman 1998).


To demonstrate the role of \(\omega _{a b}\) as the vorticity, consider a small region of the fluid where the time direction \(t^a\), in local Minkowski coordinates, is adjusted to be the same as that of the fluid four-velocity so that \(u^a = t^a = (1,0,0,0)\). Eq. (5.25) and the antisymmetry then imply that \(\omega _{a b}\) can only have purely spatial components. Because the rank of \(\omega _{a b}\) is two, there are two “nulling” vectors, meaning their contraction with either index of \(\omega _{a b}\) yields zero (a condition which is true also for vacuum electromagnetism). We have arranged already that \(t^a\) be one such vector. By a suitable rotation of the coordinate system the other one can be taken to be \(z^a = (0,0,0,1)\), implying that the only non-zero component of \(\omega _{a b}\) is \(\omega _{x y}\).

Geometrically, this kind of two-form can be pictured as a collection of oriented worldtubes, whose walls lie in the \(x = \mathrm {const}\) and \(y = \mathrm {const}\) planes (Misner et al. 1973). Any contraction of a vector with a two-form that does not yield zero implies that the vector pierces the walls of the worldtubes. But when the contraction is zero, as in Eq. (5.25), the vector does not pierce the walls. This is illustrated in Fig. 9, where the red circles indicate the orientation of each worldtube. The individual fluid element four-velocities lie in the centers of the worldtubes. Finally, consider the closed contour in Fig. 9. If that contour is attached to fluid-element worldlines, then the number of worldtubes contained within the contour will not change because the worldlines cannot pierce the walls of the worldtubes. This is essentially the Kelvin–Helmholtz theorem on the conservation of vorticity. From this we learn that the Euler equation is (in fact) an integrability condition which ensures that the vorticity two-surfaces mesh together to fill spacetime.

Fig. 9

A local, geometrical view of the Euler equation as an integrability condition of the vorticity for a single-constituent perfect fluid


Conservation laws

The variational model we will develop contains the same information as the standard approach (a point that is emphasized by the Newtonian limit in Sect. 7.1)—as it must if we want it to be useful—but it is more directly linked to the conservation of vorticity. In fact, the definition of the vorticity implies that its exterior derivative vanishes. This means that

$$\begin{aligned} \nabla _{[a} \omega _{bc]} = 0. \end{aligned}$$

Whenever the Euler equation (5.25) holds, this leads to the vorticity being conserved along the flow. That is, we have

$$\begin{aligned} \mathcal {L}_u \omega _{ab} = 0. \end{aligned}$$

The upshot is that Eq. (5.25) can be used to discuss the conservation of vorticity in an elegant way. It can also be used as the basis for a derivation of other theorems in fluid mechanics.
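The step from the closure condition to the conservation law can be sketched as follows, assuming only the antisymmetry of \(\omega _{ab}\), the cyclic identity \(\nabla _c \omega _{ab} + \nabla _a \omega _{bc} + \nabla _b \omega _{ca} = 0\) and the Euler equation in the form \(u^a \omega _{ab} = 0\) (equivalent to \(n^a \omega _{ab} = 0\) since \(n^a = n u^a\)):

$$\begin{aligned} u^c \nabla _c \omega _{ab}= & {} - u^c \nabla _a \omega _{bc} - u^c \nabla _b \omega _{ca} = \omega _{bc} \nabla _a u^c + \omega _{ca} \nabla _b u^c , \nonumber \\ \mathcal {L}_u \omega _{ab}= & {} u^c \nabla _c \omega _{ab} + \omega _{cb} \nabla _a u^c + \omega _{ac} \nabla _b u^c = 0 . \end{aligned}$$

In the first line the derivatives have been moved onto \(u^c\), and the terms containing \(u^c \omega _{bc}\) and \(u^c \omega _{ca}\) drop out by the Euler equation. The cancellation in the second line is then just the antisymmetry of \(\omega _{ab}\).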

As is well known, constants of motion are often associated with symmetries of the problem under consideration. In General Relativity, spacetime symmetries can be expressed in terms of Killing vectors, \(\hat{\xi }^a\) (the hat is used to distinguish it from the Lagrangian displacement introduced later). As an example, let us assume that the spacetime does not depend on the coordinate \(a = X\). The corresponding Killing vector would be

$$\begin{aligned} \hat{\xi }^a = \delta ^a_X {\partial \over \partial X}, \end{aligned}$$

and the symmetry leads to Killing’s equation

$$\begin{aligned} \mathcal {L}_{\hat{\xi }} g_{a b} = 0 \qquad \Longrightarrow \qquad \nabla _a \hat{\xi }_b + \nabla _b \hat{\xi }_a = 0 . \end{aligned}$$

Associated with each such Killing vector will be a conserved quantity. In the vacuum case, it is easy to combine the geodesic equation

$$\begin{aligned} u^b \nabla _b u_a = 0 , \end{aligned}$$

with Killing’s equation to show that

$$\begin{aligned} u^b \nabla _b \left( \hat{\xi }^a u_a\right) = {d \over d\tau } \left( \hat{\xi }^a u_a\right) = 0 . \end{aligned}$$

In other words, the combination \(\hat{\xi }^a u_a\) remains constant along each geodesic.
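Written out, the two ingredients enter as follows (a short sketch of the intermediate step):

$$\begin{aligned} u^b \nabla _b \left( \hat{\xi }^a u_a\right) = u^a u^b \nabla _b \hat{\xi }_a + \hat{\xi }^a u^b \nabla _b u_a = u^a u^b \nabla _{(b} \hat{\xi }_{a)} = 0 . \end{aligned}$$

The second term vanishes by the geodesic equation. In the first term only the symmetric part of \(\nabla _b \hat{\xi }_a\) survives the contraction with \(u^a u^b\), and this part is zero by Killing’s equation.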

Let us now consider how this argument extends to the fluid case. Assuming that the flow is invariant with respect to transport by the vector field \(\hat{\xi }^a\), we have

$$\begin{aligned} \mathcal {L}_{\hat{\xi }} \mu _a = 0 , \qquad \Longrightarrow \qquad \hat{\xi }^b \nabla _b \mu _a + \mu _b \nabla _a\hat{\xi }^b = 0 . \end{aligned}$$

Now combine this with the equation of motion in the form (5.25) to find

$$\begin{aligned} \hat{\xi }^a n^b \left( \nabla _b \mu _a - \nabla _a \mu _b\right) = n^b \nabla _b \left( \hat{\xi }^a \mu _a\right) = 0 . \end{aligned}$$

Since \(n^a = n u^a\) we see that the quantity \(\hat{\xi }^a \mu _a\) is conserved along the fluid worldlines, reminding us of the vacuum result. The difference is due to the fact that pressure gradients in the fluid lead to the flow no longer being along geodesics. One may consider two specific situations. If \(\hat{\xi }^a\) is taken to be the four-velocity, then the scalar \(\hat{\xi }^a \mu _a\) represents the “energy per particle”. If instead \(\hat{\xi }^a\) represents an axial generator of rotation, then the scalar will correspond to an angular momentum. For the purposes of the present discussion we can leave \(\hat{\xi }^a\) unspecified, but it is still useful to keep these possibilities in mind.
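The intermediate step behind this conservation law can be made explicit (a sketch using only the product rule and the definition of the Lie derivative of a covector):

$$\begin{aligned} \hat{\xi }^a n^b \left( \nabla _b \mu _a - \nabla _a \mu _b\right) = n^b \nabla _b \left( \hat{\xi }^a \mu _a\right) - n^b \left( \hat{\xi }^a \nabla _a \mu _b + \mu _a \nabla _b \hat{\xi }^a \right) = n^b \nabla _b \left( \hat{\xi }^a \mu _a\right) - n^b \mathcal {L}_{\hat{\xi }} \mu _b . \end{aligned}$$

The left-hand side vanishes by the equation of motion and the last term vanishes by the assumed symmetry of the flow, which leaves the stated result.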

Given that the flux is conserved, i.e. (5.16) holds, we can take one further step to show that we have

$$\begin{aligned} n^a \nabla _a \left( \mu _b \hat{\xi }^b\right) = \nabla _a \left( n^a \mu _b\hat{\xi }^b\right) = 0 , \end{aligned}$$

and we have shown that \(n^a \mu _b \hat{\xi }^b\) is a conserved quantity.

In many cases one can also obtain integrals of the motion, analogous to the Bernoulli equation for stationary rotating Newtonian fluids. Quite generally, the derivation proceeds as follows. Assume that \(\hat{\xi }^a\) is such that

$$\begin{aligned} \hat{\xi }^b \omega _{b a} = 0 . \end{aligned}$$

This condition can be written

$$\begin{aligned} \mathcal {L}_{\hat{\xi }} \mu _a - \nabla _a \left( \hat{\xi }^b \mu _b \right) = 0 \end{aligned}$$

where the first term vanishes as long as (5.35) holds. Hence, we arrive at the first integral

$$\begin{aligned} \nabla _a \left( \hat{\xi }^b \mu _b \right) = 0 \qquad \Longrightarrow \qquad \hat{\xi }^b \mu _b = \mathrm {constant} . \end{aligned}$$
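The rewriting of the condition \(\hat{\xi }^b \omega _{ba} = 0\) used here follows directly from \(\omega _{ab} = 2 \nabla _{[a} \mu _{b]}\). As a brief sketch:

$$\begin{aligned} \hat{\xi }^b \omega _{ba} = \hat{\xi }^b \left( \nabla _b \mu _a - \nabla _a \mu _b \right) = \left( \hat{\xi }^b \nabla _b \mu _a + \mu _b \nabla _a \hat{\xi }^b \right) - \nabla _a \left( \hat{\xi }^b \mu _b \right) = \mathcal {L}_{\hat{\xi }} \mu _a - \nabla _a \left( \hat{\xi }^b \mu _b \right) . \end{aligned}$$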

An obvious version of this analysis is an irrotational flow, when \(\omega _{a b} = 0\). Another situation of direct astrophysical interest is “rigid” flow—when \(\hat{\xi }^a = \lambda u^a\) for some scalar field \(\lambda \). Rotating compact stars, in equilibrium, belong to this category. In that case, one would have \(\hat{\xi }^a =t^a + \varOmega \phi ^a\), where \(\varOmega \) is the rotation frequency and \(t^a\) and \(\phi ^a\) represent the timelike Killing vector and the spatial Killing vector associated with axisymmetry, respectively (the system permits a helical Killing vector).

A couple of steps towards relative flows

With the comments at the close of the previous section, we have reached the end of the road as far as the “off-the-shelf” strategy is concerned. We will now move towards an action-based derivation of the fluid equations of motion. As a first step, let us look ahead to see what is coming and why we need to go in this direction.

Return to the perfect fluid stress-energy tensor but now let us not associate the observer with the fluid flow. The thermodynamical relations still hold in the co-moving (fluid) frame associated with \(u^a\), but the observer sees the fluid flow by with the relative velocity \(v^a\) from (5.9). In essence, we then have

$$\begin{aligned} T_{ab} \!= & {} \! (p\!+\!\varepsilon )\gamma ^2 (U_a\!+\!v_a)(U_b +v_b)+pg_{ab} \nonumber \\ \!= & {} \! \varepsilon \gamma ^2 U_a U_b \!+\!p(U_aU_b\!+\!g_{ab})\!+\!2 (p\!+\!\varepsilon )\gamma ^2 U_{(a}v_{b)} \!+\! (p\!+\!\varepsilon )\gamma ^2 v_a v_b \end{aligned}$$

We learn several important lessons from this. The perfect fluid does not seem quite so simple in the frame of a general observer. First of all, the different thermodynamical quantities will be redshifted (as expected from Special Relativity) so we need to keep track of the \(\gamma \) factors. Secondly, we now appear to have both a momentum flux and anisotropic spatial stresses. In order to arrive at the main point we want to make, let us assume that the relative velocity is small enough that we can linearize the problem. As we will see later, this should be an adequate assumption in many situations of interest. Leaving out terms quadratic in \(v^a\) we lose the spatial stresses and \(\gamma \rightarrow 1\) (which is convenient as the thermodynamics then remains as before). We are left with

$$\begin{aligned} T_{ab} \approx \varepsilon U_a U_b +p(U_aU_b+g_{ab})+2 (p+\varepsilon ) U_{(a}v_{b)} . \end{aligned}$$

At this point, we can make use of the freedom to choose the observer. We may return to the case where the observer rides along with the fluid by setting \(v^a=0\). This choice is commonly called the Eckart frame, as it was first introduced in the discussion of relativistic heat flow (see Sect. 15). This is the obvious choice for a single fluid problem, but when we are dealing with multiple flows there are alternatives.

As an illustration, in the case of a problem with both matter and heat flowing, we have to replace the stress energy tensor by (don’t worry, we will derive this later)

$$\begin{aligned} T_{ab} \approx p g_{ab} + n\mu u_a u_b + sT u^\mathrm {s}_a u^\mathrm {s}_b , \end{aligned}$$

where s and T are the entropy (density) and temperature, respectively, and \(u_\mathrm {s}^a\) accounts for the heat flux. We have assumed that both flows may be linearized relative to the observer so

$$\begin{aligned} u_\mathrm {s}^a \approx U^a + q^a , \quad \text{ with } \quad U^aq_a = 0 , \end{aligned}$$

where \(q^a\) is the heat flux. This means that we have

$$\begin{aligned} T_{ab}\approx & {} p g_{ab} + (n\mu +sT) U_a U_b + 2 n\mu U_{(a}v_{b)} + 2 sT U_{(a}q_{b)} \nonumber \\= & {} \varepsilon U_a U_b +p(U_aU_b+g_{ab}) + 2 n\mu U_{(a}v_{b)} + 2 sT U_{(a}q_{b)} . \end{aligned}$$

In this case, the momentum flux relative to the observer will be

$$\begin{aligned} \mathcal {P}_a = - \perp ^b_a U^c T_{b c} = n\mu v_a + sT q_a . \end{aligned}$$

Basically, an observer riding along with the matter will experience heat flowing. We may, however, work with a different observer according to whom no energy flows. It is easy to see that this involves setting

$$\begin{aligned} v_a = -{sT \over n\mu } q_a = -{sT \over p+\varepsilon -sT} q_a . \end{aligned}$$

With this choice we are left with

$$\begin{aligned} T_{ab} \approx \varepsilon U_a U_b + p ( g_{ab} + U_a U_b ) , \end{aligned}$$

reminding us of the perfect fluid situation, even though we are considering a more complicated problem. It follows that

$$\begin{aligned} U^a T^b_{\ a} = - \varepsilon U^b . \end{aligned}$$

Formally, the energy density \(\varepsilon \) is an eigenvalue of the stress-energy tensor (with the observer four-velocity \(U^a\) the corresponding eigenvector). This choice of observer is usually referred to as the Landau–Lifshitz frame (Landau and Lifshitz 1959).
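The eigenvector property can be checked numerically. The following is a minimal sketch, assuming flat spacetime, an observer at rest, and made-up sample values for \(\varepsilon \), \(p\), \(sT\) and \(q_a\); the function names are illustrative only. It builds the linearized \(T_{ab}\) with both drift and heat-flux terms and contracts it with \(U^a\).

```python
# Minkowski metric eta_ab with signature (-,+,+,+); coordinates ordered (t, x, y, z).
ETA = [[0.0] * 4 for _ in range(4)]
for i, sign in enumerate((-1.0, 1.0, 1.0, 1.0)):
    ETA[i][i] = sign

U_UP = [1.0, 0.0, 0.0, 0.0]    # observer four-velocity U^a
U_LOW = [-1.0, 0.0, 0.0, 0.0]  # U_a = eta_ab U^b


def linearized_stress_energy(eps, p, sT, v_low, q_low):
    """Linearized T_ab for matter drift v_a and heat flux q_a (both orthogonal to U^a)."""
    n_mu = p + eps - sT  # n*mu, from the relation p + eps = n*mu + s*T
    T = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        for b in range(4):
            T[a][b] = (
                eps * U_LOW[a] * U_LOW[b]
                + p * (U_LOW[a] * U_LOW[b] + ETA[a][b])
                + n_mu * (U_LOW[a] * v_low[b] + U_LOW[b] * v_low[a])
                + sT * (U_LOW[a] * q_low[b] + U_LOW[b] * q_low[a])
            )
    return T


def landau_lifshitz_drift(eps, p, sT, q_low):
    """Matter drift v_a = -(sT / n mu) q_a, chosen so that no energy flows."""
    n_mu = p + eps - sT
    return [-(sT / n_mu) * q for q in q_low]


def observer_contraction(T):
    """Return U^a T_ab, which equals -eps U_b in the Landau-Lifshitz frame."""
    return [sum(U_UP[a] * T[a][b] for a in range(4)) for b in range(4)]


# Illustrative (made-up) values, in units where eps = 1.
eps, p, sT = 1.0, 0.2, 0.3
q_low = [0.0, 0.01, -0.02, 0.005]
v_low = landau_lifshitz_drift(eps, p, sT, q_low)
flux = observer_contraction(linearized_stress_energy(eps, p, sT, v_low, q_low))
print(flux)  # time component is eps; spatial components vanish up to rounding
```

Passing `v_low = [0.0] * 4` instead reproduces the Eckart choice, for which the spatial components of the contraction are \(-sT q_a\), i.e. the observer riding with the matter sees the heat flow.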

We are free to work with whatever observer we like—different options have different advantages—but there is no free lunch. For example, with the Landau–Lifshitz choice the fluid equations simplify, but the particle conservation law becomes more involved. We now have

$$\begin{aligned} \nabla _a n^a \approx \nabla _a ( nU^a + nv^a) = \nabla _a \left( nU^a - {n sT \over p+\varepsilon -sT} q^a \right) = 0 . \end{aligned}$$

The contribution from the heat flux is not particularly intuitive.

The main lesson we learn from this exercise is that any situation with relative flows involves making choices, and we have to keep careful track of how these choices impact on the connection with the underlying physics. This motivates the formal development of the variational approach for general relativistic multifluid systems, to be described in Sect. 9.

From microscopic models to the equation of state

We have discussed how the equations for relativistic fluid dynamics relate to a given stress-energy tensor, involving a set of suitably averaged variables (energy, pressure, four-velocity etc.). We have also seen how one can obtain the equations of motion from

$$\begin{aligned} \nabla _a T^{ab} = 0 , \end{aligned}$$

as required by the Einstein field equations (by virtue of the Bianchi identities). Moreover, in Sect. 4 we showed how the stress-energy tensor can be obtained via a variation of the Lagrangian with respect to the spacetime metric. This description is neatly self-consistent (and we will make frequent use of it later) but it is helpful to pause and consider the logic. In principle, the relation (5.51) follows from the fact that the Einstein tensor \(G_{ab}\) is divergence free, which in turn represents the fact that the problem involves four “unphysical” degrees of freedom, usually taken to mean that we have the freedom to choose the four spacetime coordinates. However, by turning (5.51) into the equations for fluid dynamics we are changing the perspective. The four degrees of freedom now represent the conservation of energy and momentum. Why are we allowed to do this? Is it simply a fluke that the four degrees of freedom involved can be suitably interpreted in a manner that fits our purpose? One can argue that this is, indeed, the case and we will discuss this later.

For the moment, we want to consider a different aspect of the problem. If it is the case that (5.51) encodes the fluid equations of motion, then there ought to be a way to derive the stress-energy tensor from some underlying microscopical theory (presumably involving quantum physics). This issue turns out to be somewhat involved. As a starting point, suppose we focus on a one-parameter system, with the parameter being the particle number density. The equation of state will then be of the form \(\varepsilon = \varepsilon (n)\), representing the energy density. In many-body physics (as studied in condensed matter, nuclear, and particle physics) one can then in principle construct the quantum mechanical particle number density \(n_{\mathrm {QM}}\), stress-energy tensor \(T^{\mathrm {QM}}_{a b}\), and associated conserved particle number density current \(n^a_{\mathrm {QM}}\) (starting from some fundamental Lagrangian, say; cf. Walecka 1995; Glendenning 1997; Weber 1999). But unlike in quantum field theory in a curved spacetime (Birrell and Davies 1982), one typically assumes that the matter exists in an infinite Minkowski spacetime.

Once \(T^{\mathrm {QM}}_{a b}\) is obtained, and after (quantum mechanical and statistical) expectation values with respect to the system’s (quantum and statistical) states are taken, one defines the energy density as

$$\begin{aligned} \varepsilon = u^a u^b \langle T^{\mathrm {QM}}_{a b} \rangle , \end{aligned}$$

where

$$\begin{aligned} u^a \equiv \frac{1}{n} \langle n^a_{\mathrm {QM}} \rangle , \qquad n = \langle n_{\mathrm {QM}} \rangle . \end{aligned}$$

Similarly, the pressure is obtained as

$$\begin{aligned} p = \frac{1}{3} \left( \langle T^{\mathrm {QM} a}{}_a \rangle + \varepsilon \right) \end{aligned}$$

and it will also be a function of n.
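The trace formula is easy to verify for the perfect-fluid form \(T_{ab} = (p+\varepsilon ) u_a u_b + p g_{ab}\). Using \(u^a u_a = -1\) and \(g^a{}_a = 4\) we have

$$\begin{aligned} T^a{}_a = - (p + \varepsilon ) + 4 p = 3p - \varepsilon \qquad \Longrightarrow \qquad p = \frac{1}{3} \left( T^a{}_a + \varepsilon \right) . \end{aligned}$$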

One must be very careful to distinguish \(T^{\mathrm {QM}}_{a b}\) from \(T_{a b}\). The former describes the states of elementary particles with respect to a fluid element, whereas the latter describes the states of fluid elements with respect to the system. Comer and Joynt (2003) have shown how this line of reasoning applies to the two-fluid case.

This outline description stays close to the fluid picture, but it does not shed much light on the origin of \(T^{\mathrm {QM}}_{a b}\). This is where we run into “trouble”. A typical field theory description would take a given symmetry of the system as its starting point, and then obtain equations of motion for conserved quantities associated with this symmetry. Let us consider this problem in flat space and use a scalar field with Lagrangian \(L=L(\phi , \partial _a \phi )\) as our example. Assuming that the system is symmetric under spacetime translations, we have four conserved (Noether) currents given by

$$\begin{aligned} \tau ^a_{\ b} = {\partial L \over \partial ( \partial _a \phi )} \partial _b \phi - \delta ^a_b L . \end{aligned}$$

That is, we have

$$\begin{aligned} \partial _a \tau ^a_{\ b} = 0 , \end{aligned}$$

which follows by virtue of the Euler-Lagrange equations:

$$\begin{aligned} \partial _a \left( {\partial L \over \partial ( \partial _a \phi )} \right) - {\partial L \over \partial \phi } = 0 , \end{aligned}$$

and the fact that we are working in flat space (so partial derivatives commute). It may seem tempting to take \(\tau ^a_{\ b}\) to be the stress-energy tensor: intuitively, we can change partial derivatives to covariant ones, introduce the spacetime metric (instead of \(\eta ^{ab}\), as appropriate), to arrive at an expression similar to (5.51). However, the Devil is in the detail. The flat-space field equations represent a true conservation law (with four conserved currents, one for each value of b in (5.56)), which is what we expect, but \(\tau ^a_{\ b}\) is (in general) not symmetric. Since symmetry is required for the gravitational stress-energy tensor \(T^{ab}\) (as long as we do not deviate from Einstein’s theory) we have a problem. The issue is resolved by invoking the Belinfante–Rosenfeld “correction” to \(\tau ^a_{\ b}\) (see for example Ilin and Paston 2018 for a recent discussion). This is a uniquely defined object which effects the change from a flat to a curved spacetime. While we will not need to understand the details of this procedure to make progress, it is important to be aware of it.
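As a simple worked example (an illustration chosen here, not part of the argument above), take \(L = -\frac{1}{2} \eta ^{ab} \partial _a \phi \, \partial _b \phi - V(\phi )\) with signature \((-,+,+,+)\). Then

$$\begin{aligned} {\partial L \over \partial ( \partial _a \phi )}= & {} - \partial ^a \phi , \qquad \tau ^a_{\ b} = - \partial ^a \phi \, \partial _b \phi - \delta ^a_b L , \nonumber \\ \partial _a \tau ^a_{\ b}= & {} - \left( \partial _a \partial ^a \phi - V'(\phi ) \right) \partial _b \phi = 0 , \end{aligned}$$

where the final equality uses the Euler-Lagrange equation, which for this Lagrangian reads \(\partial _a \partial ^a \phi = V'(\phi )\). In this particular case \(\tau _{ab} = \eta _{ac} \tau ^c_{\ b}\) happens to be symmetric. The asymmetry shows up for fields with more structure, the Maxwell field being the standard textbook example.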

Variational approach for a single-fluid system

Let us now consider the single-fluid problem from a different perspective and derive the equations of motion and the stress-energy tensor from an action principle. The ideas behind this variational approach can be traced back to Taub (1954) (see also Schutz 1970). Our approach relies heavily on the work of Brandon Carter, his students, and collaborators (Carter 1989a; Comer and Langlois 1993, 1994; Carter and Langlois 1995b, 1998; Langlois et al. 1998; Prix 2000, 2004). This strategy is attractive as it makes maximal use of the tools of the trade of relativistic fields, i.e., no special tricks or devices will be required (unlike even the case of the “off-the-shelf” approach). Our footing is made sure by well-grounded, action-based arguments. As Carter has made clear: When there are multiple fluids, of both the charged and uncharged variety, it is essential to distinguish the fluid momenta from the velocities, in order to make the geometrical and physical content of the equations transparent. A well-posed action is, of course, perfect for systematically constructing the momenta.

Specifically, we will make use of a “pull-back” approach (see, e.g., Comer and Langlois 1993, 1994; Comer 2002) to construct a Lagrangian displacement of the particle number density flux \(n^a\), whose magnitude n is the particle number density. This will form the basis for the variations of the fundamental fluid variables in the action principle.

The action principle

It is useful to begin by explaining why we need to develop a constrained action principle. The argument is quite simple. Consider a single matter component, represented by a flux \(n^a\). For an isotropic system the matter Lagrangian, which we will call \(\varLambda \) (taking over the role of L from Sect. 4), should be a relativistic invariant and hence depend only on \(n^2 = -g_{ab} n^a n^b\). In effect, this means that it depends on both the flux and the spacetime metric. This is, of course, important as the dependence on the metric leads to the stress-energy tensor (again, as in Sect. 4). An arbitrary variation of \(\varLambda =\varLambda (n^2)=\varLambda (n^a,g_{ab})\) now leads to (ignoring terms that can be written as total derivatives representing “surface terms”, as in the point-particle discussion)

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right) = \sqrt{- g} \left[ \ \mu _a \delta n^a + \frac{1}{2} \left( \varLambda g^{a b} + n^a \mu ^b\right) \delta g_{a b}\right] , \end{aligned}$$

where \(\mu _a\) is the canonical momentum, which is given by

$$\begin{aligned} \mu _{a} = {\partial \varLambda \over \partial n^a} = -2 {\partial \varLambda \over \partial n^2} g_{ab} n^b . \end{aligned}$$

We have also used (see Sect. 4.4)

$$\begin{aligned} \delta \sqrt{-g} = {1\over 2} \sqrt{-g}\, g^{ab} \delta g_{ab} . \end{aligned}$$

Here is the problem: As it stands, Eq. (6.1) suggests that the equations of motion would simply be \(\mu _a=0\), which means that the fluid carries neither energy nor momentum. This is obviously not what we are looking for.

In order to make progress, we impose the constraint that the flux is conserved (see footnote 14). That is, we insist that

$$\begin{aligned} \nabla _ a n^a = 0 . \end{aligned}$$

From a strict field theory point of view, it makes sense to introduce this constraint. The conservation of the particle flux (the number density current) should not be a part of the equations of motion, but rather should be automatically satisfied when evaluated on a solution of the “true” equations.

For reasons that will become clear shortly, it is useful to rewrite the conservation law in terms of the dual three-form (see footnote 15)

$$\begin{aligned} n_{abc} = \epsilon _{dabc} n^d, \end{aligned}$$

such that

$$\begin{aligned} n^a = {1\over 3!} \epsilon ^{bcda} n_{bcd} . \end{aligned}$$

It also follows that

$$\begin{aligned} n^2 = - g_{ab} n^a n^b = {1\over 3!} n_{abc}n^{abc} , \end{aligned}$$

which shows that \(n_{abc}\) acts as a volume measure which allows us to “count” the number of fluid elements. In Fig. 9 we have seen that a two-form is associated with worldtubes. A three-form is the next higher-ranked object and it can be thought of, in an analogous way, as leading to boxes (Misner et al. 1973). This is quite intuitive, and we will comment on it again later.
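The duality relations can be checked numerically. The sketch below assumes flat spacetime with signature \((-,+,+,+)\), the convention \(\epsilon _{0123} = +1\), and a made-up sample flux; the function names are illustrative. It builds \(n_{abc}\) from \(n^a\), recovers \(n^a\) again, and compares the two expressions for \(n^2\).

```python
from itertools import permutations

DIM = 4
ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)  # Minkowski metric, signature (-,+,+,+)


def permutation_sign(perm):
    """Return +1 or -1 for an even or odd permutation of (0, 1, 2, 3)."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]  # each swap flips the parity
            sign = -sign
    return sign


# Nonzero components of eps_{abcd}, with eps_{0123} = +1 (an assumed convention).
EPS_LOW = {perm: float(permutation_sign(perm)) for perm in permutations(range(DIM))}


def eps_up(a, b, c, d):
    """eps^{abcd}: raising all four indices with eta flips the overall sign."""
    return -EPS_LOW.get((a, b, c, d), 0.0)


def dual_three_form(n_up):
    """n_{abc} = eps_{dabc} n^d, stored as a dict keyed by (a, b, c)."""
    form = {}
    for (d, a, b, c), sign in EPS_LOW.items():
        form[(a, b, c)] = form.get((a, b, c), 0.0) + sign * n_up[d]
    return form


def flux_from_three_form(form):
    """n^a = (1/3!) eps^{bcda} n_{bcd}."""
    n_up = [0.0] * DIM
    for (b, c, d), value in form.items():
        for a in range(DIM):
            n_up[a] += eps_up(b, c, d, a) * value / 6.0
    return n_up


def norm_from_three_form(form):
    """(1/3!) n_{abc} n^{abc}, raising indices with the diagonal metric."""
    total = 0.0
    for (a, b, c), value in form.items():
        total += ETA_DIAG[a] * ETA_DIAG[b] * ETA_DIAG[c] * value * value
    return total / 6.0


n_up = [2.0, 0.3, -0.5, 1.1]  # made-up timelike flux
form = dual_three_form(n_up)
print(flux_from_three_form(form))  # recovers n_up up to rounding
print(norm_from_three_form(form), -sum(ETA_DIAG[a] * n_up[a] ** 2 for a in range(DIM)))
```

The overall signs depend on the orientation convention for \(\epsilon _{abcd}\), but the round trip \(n^a \rightarrow n_{abc} \rightarrow n^a\) and the value of \(n^2\) do not.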


With this set-up, the conservation of the matter flux is ensured provided that the three-form \(n_{abc}\) is closed. It is easy to see that

$$\begin{aligned} \partial _{[a} n_{bcd]}=\nabla _{[a} n_{bcd]} = 0\quad \Longrightarrow \quad \nabla _{a} n^{a} = 0 . \end{aligned}$$

Fig. 10

The pull-back from “fluid-particle” points in the three-dimensional matter space, labelled by the coordinates \(\{X^1,X^2,X^3\}\), to fluid-element worldlines in spacetime. Here, the pull-back of the “\(I^{\mathrm {th}}\)” (\(I = 1,2,\dots ,n\)) fluid-particle to, say, an initial point on a worldline in spacetime can be taken as \(X^A_I = X^A(0,x^i_I)\) where \(x^i_I\) is the spatial position of the intersection of the worldline with the \(t = 0\) time slice

The main reason for introducing the dual is that it is straightforward to construct a particle number density three-form that is automatically closed. We achieve this by introducing a three-dimensional “matter” space (the left-hand part of Fig. 10) which is labelled by coordinates \(X^A\), where \(A,B,C, \ldots = 1,2,3\). For each time slice in spacetime, we have the same configuration in the matter space. That is, as time moves forward, the fluid particle positions in the matter space remain fixed, even though the worldlines weave through spacetime. In this sense we are “pulling back” from the matter space to spacetime (cf. the discussion of the Lie derivative). The \(n_{abc}\) three-form can then be “pushed forward” to the three-dimensional matter space by using the map associated with the coordinates \(X^A\) (which represent scalar fields on spacetime):

$$\begin{aligned} \psi ^A_a = \partial _a X^A . \end{aligned}$$

This construction leads to a matter-space three-form \(n_{ABC}\),

$$\begin{aligned} n_{a b c} = \psi _a^A \psi _b^B \psi ^C_c n_{A B C} , \end{aligned}$$

which is completely anti-symmetric in its indices. The final step involves noting that

$$\begin{aligned} \partial _{[a}n_{bcd]} = \psi ^A_a \psi ^B_b \psi ^C_c\psi ^D_d \partial _{[A} n_{BCD]} = 0 , \end{aligned}$$

is automatically satisfied if

$$\begin{aligned} \partial _{[A} n_{BCD]} = 0 , \end{aligned}$$

which, in turn, follows if \(n_{ABC}\) is taken to be a function only of the \(X^A\) coordinates. This completes the argument.

Now we need to connect this idea to the variational principle. The key step involves introducing the Lagrangian displacement \(\xi ^a\), tracking the motion of a given fluid element. From the standard definition of Lagrangian variations, we have

$$\begin{aligned} \varDelta X^A = \delta X^A + \mathcal {L}_{\xi } X^A = 0 , \end{aligned}$$

where \(\delta X^A\) is the Eulerian variation and \(\mathcal {L}_{\xi }\) is the Lie derivative along \(\xi ^a\). This means that we have

$$\begin{aligned} \delta X^A = - \mathcal {L}_{\xi } X^A = - \xi ^a {\partial X^A \over \partial x^a} = -\xi ^a \psi ^A_a. \end{aligned}$$

It also follows that

$$\begin{aligned} \varDelta \psi ^A_a= & {} \delta \psi ^A_a + \xi ^b \partial _b \psi ^A_a + \psi ^A_b \partial _a \xi ^b = \partial _a \delta X^A + \xi ^b \partial _b \psi ^A_a + \psi ^A_b \partial _a \xi ^b \nonumber \\= & {} \partial _a \left( \varDelta X^A - \xi ^b \partial _b X^A \right) + \xi ^b \partial _b \psi ^A_a + \psi ^A_b \partial _a \xi ^b= 0 , \end{aligned}$$

since partial derivatives commute. Given these results, it is easy to show that

$$\begin{aligned} \varDelta n_{abc} = \psi ^A_a \psi ^B_b\psi ^C_c \partial _D n_{ABC} \varDelta X^D = 0 . \end{aligned}$$

This implies that

$$\begin{aligned} \delta n_{abc} = - \mathcal {L}_\xi n_{abc} , \end{aligned}$$

and hence

$$\begin{aligned} \delta n^a = {1\over 3!} \delta \left( \epsilon ^{bcda} n_{bcd} \right) = {1\over 3!} \left( \delta \epsilon ^{bcda} n_{bcd} - \epsilon ^{bcda} \mathcal {L}_\xi n_{bcd} \right) . \end{aligned}$$

Making use of a little bit of elbow grease and the standard relations

$$\begin{aligned} \delta g_{db} = - g_{da} g_{bc} \delta g^{ac} , \end{aligned}$$


$$\begin{aligned} \delta \epsilon ^{abcd} = {1\over 2} \epsilon ^{abcd} g_{ef} \delta g^{ef} , \end{aligned}$$

we arrive at

$$\begin{aligned} \delta n^a= & {} {1\over 3!} \delta ( \epsilon ^{ b c d a} n_{b c d} ) = n^b \nabla _b \xi ^a - \xi ^b \nabla _b n^a - n^a \left( \nabla _b \xi ^b - \frac{1}{2} g_{b c} \delta g^{b c}\right) \nonumber \\= & {} - \mathcal {L}_\xi n^a - n^a \left( \nabla _b \xi ^b - \frac{1}{2} g_{b c} \delta g^{b c}\right) , \end{aligned}$$

or

$$\begin{aligned} \varDelta n^a = - n^a \left( \nabla _b \xi ^b + { 1 \over 2} g^{bd} \delta g_{bd} \right) = - {1 \over 2} n^a \left( g^{bd} \varDelta g_{bd}\right) , \end{aligned}$$

where

$$\begin{aligned} \varDelta g_{ab} = \delta g_{a b} + 2 \nabla _{(a} \xi _{b)} , \end{aligned}$$

(the parentheses indicate symmetrization, as usual). Equation (6.22) has a natural interpretation: The variation of a fluid worldline with respect to its own Lagrangian displacement has to be along the worldline and can only measure the changes of the volume of its own fluid element. This is one of the advantages of the Lagrangian variation approach.
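The Lagrangian form of the flux variation can be checked against the Eulerian one in a few lines (a sketch using only relations already stated). Since \(\varDelta n^a = \delta n^a + \mathcal {L}_\xi n^a\), the Lie-derivative terms cancel, and we need

$$\begin{aligned} g_{bc} \delta g^{bc} = - g^{bc} \delta g_{bc} , \qquad g^{bd} \varDelta g_{bd} = g^{bd} \delta g_{bd} + 2 \nabla _b \xi ^b . \end{aligned}$$

The first relation follows from contracting \(\delta g_{db} = - g_{da} g_{bc} \delta g^{ac}\) with \(g^{db}\), and the second from the trace of the definition of \(\varDelta g_{ab}\). Together they give

$$\begin{aligned} \varDelta n^a = - n^a \left( \nabla _b \xi ^b - \frac{1}{2} g_{bc} \delta g^{bc} \right) = - n^a \left( \nabla _b \xi ^b + \frac{1}{2} g^{bd} \delta g_{bd} \right) = - \frac{1}{2} n^a g^{bd} \varDelta g_{bd} . \end{aligned}$$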


Expressing the variations of the matter Lagrangian in terms of the displacement \(\xi ^a\), rather than the perturbed flux, we ensure that the flux conservation is accounted for in the equations of motion. The variation of \(\varLambda \) now leads to

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right)= & {} \sqrt{- g} \left\{ f_a \xi ^a - \frac{1}{2}\left[ \left( \varLambda - n^c\mu _c\right) g_{a b} + n_a \mu _b \right] \delta g^{a b} \right\} \nonumber \\&+ \nabla _a \left( \frac{1}{2} \sqrt{- g} \mu ^{abc} n_{bcd} \xi ^d\right) , \end{aligned}$$

and the fluid equations of motion are given by

$$\begin{aligned} f_b \equiv 2 n^a \nabla _{[a} \mu _{b]} = 0 , \end{aligned}$$

(where the square brackets indicate anti-symmetrization, as usual). Finally, introducing the vorticity two-form

$$\begin{aligned} \omega _{ab} = 2\nabla _{[a} \mu _{b]} , \end{aligned}$$

we have the simple relation

$$\begin{aligned} n^a \omega _{ab} = 0 , \end{aligned}$$

which should be familiar (see Sect. 5.2).

We can also read off the stress-energy tensor from (6.27). We need (see Sect. 4)

$$\begin{aligned} T_{ab} = - {2 \over \sqrt{-g}} {\delta \left( \sqrt{-g}\varLambda \right) \over \delta g^{ab}} = \varLambda g_{ab} - 2 {\delta \varLambda \over \delta g^{ab}} . \end{aligned}$$

Finally, introducing the matter four-velocity, such that \(n^a=nu^a\) and \(\mu _a = \mu u_a\), where \(\mu \) is the chemical potential (as before), we see that the energy is

$$\begin{aligned} \varepsilon = u_a u_b T^{ab} = - \varLambda . \end{aligned}$$

Moreover, we identify the pressure from the thermodynamical relation:

$$\begin{aligned} p = -\varepsilon + n\mu = \varLambda - n^c\mu _c . \end{aligned}$$

This means that we have

$$\begin{aligned} T^{ab} = pg^{ab} + n^a \mu ^b = \varepsilon u^a u^b + p \perp ^{ab} , \end{aligned}$$

and it is straightforward to confirm that

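$$\begin{aligned} \nabla _a T^{ab} = f^b + \nabla ^b \varLambda - \mu _a \nabla ^b n^a = f^b = 0 , \end{aligned}$$

given that (i) \(\varLambda \) is a function only of \(n^a\) and \(g_{ab}\), and (ii) the definition of the momentum \(\mu _a\), which together imply \(\nabla ^b \varLambda = \mu _a \nabla ^b n^a\). Here the conservation of the flux has been used to drop a term proportional to \(\nabla _a n^a\).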
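The two forms of the stress-energy tensor quoted above can be compared numerically. The sketch below assumes flat spacetime, a constant three-velocity, and made-up sample values of \(n\), \(\mu \) and \(p\); the function names are illustrative. It checks that \(p g^{ab} + n^a \mu ^b\) agrees with \(\varepsilon u^a u^b + p \perp ^{ab}\) once \(\varepsilon = n\mu - p\), and that \(u_a u_b T^{ab}\) returns \(\varepsilon \).

```python
import math

ETA_INV = [[0.0] * 4 for _ in range(4)]  # inverse Minkowski metric, signature (-,+,+,+)
for i, sign in enumerate((-1.0, 1.0, 1.0, 1.0)):
    ETA_INV[i][i] = sign


def four_velocity(vx, vy, vz):
    """Unit timelike u^a for a three-velocity (vx, vy, vz), in units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    return [gamma, gamma * vx, gamma * vy, gamma * vz]


def variational_form(n, mu, p, u_up):
    """T^{ab} = p g^{ab} + n^a mu^b with n^a = n u^a and mu^a = mu u^a."""
    return [[p * ETA_INV[a][b] + (n * u_up[a]) * (mu * u_up[b]) for b in range(4)]
            for a in range(4)]


def perfect_fluid_form(n, mu, p, u_up):
    """T^{ab} = eps u^a u^b + p (g^{ab} + u^a u^b) with eps = n mu - p."""
    eps = n * mu - p
    return [[eps * u_up[a] * u_up[b] + p * (ETA_INV[a][b] + u_up[a] * u_up[b])
             for b in range(4)] for a in range(4)]


def energy_density(T, u_up):
    """eps = u_a u_b T^{ab}, lowering the index with the diagonal metric."""
    u_low = [ETA_INV[a][a] * u_up[a] for a in range(4)]
    return sum(u_low[a] * u_low[b] * T[a][b] for a in range(4) for b in range(4))


# Made-up sample values.
n, mu, p = 0.5, 2.0, 0.1
u_up = four_velocity(0.3, -0.2, 0.1)
T_var = variational_form(n, mu, p, u_up)
T_pf = perfect_fluid_form(n, mu, p, u_up)
max_diff = max(abs(T_var[a][b] - T_pf[a][b]) for a in range(4) for b in range(4))
print(max_diff, energy_density(T_var, u_up))
```

The agreement is just the statement \(n\mu = \varepsilon + p\), so the check is a consistency test of the identifications rather than an independent derivation.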


Lagrangian perturbations

Later, we will consider linear dynamics of different systems—both at the local level and for macroscopic bodies like rotating stars. This inevitably draws on an understanding of perturbation theory, which (in turn) makes contact with the variational argument we have just completed. Given this, it is worth making a few additional remarks before we move on.

First of all, an unconstrained variation of \(\varLambda (n^2)\) is with respect to \(n^a\) and the metric \(g_{a b}\), and allows the four components of \(n^a\) to be varied independently. It takes the form

$$\begin{aligned} \delta \varLambda = \mu _a \delta n^a + \frac{1}{2} n^a \mu ^b \delta g_{a b} , \end{aligned}$$

where

$$\begin{aligned} \mu _a = \mathcal {B}n_a , \qquad \mathcal {B}\equiv - 2 \frac{\partial \varLambda }{\partial n^2}. \end{aligned}$$

The use of the letter \(\mathcal {B}\) is to remind us that this is a bulk fluid effect, which is present regardless of the number of fluids and constituents. The momentum covector \(\mu _a\) is (as we have seen) dynamically, and thermodynamically, conjugate to \(n^a\), and its magnitude is the chemical potential of the particles (recalling that \(\varLambda = - \varepsilon \)).
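To make the role of \(\mathcal {B}\) concrete, the sketch below evaluates it for a hypothetical polytropic equation of state, \(\varepsilon (n) = m n + K n^\varGamma /(\varGamma - 1)\). This functional form and the parameters `m`, `K`, `Gamma` are assumptions made purely for illustration. With \(\varLambda = -\varepsilon \), the code estimates \(\mathcal {B} = -2\, \partial \varLambda / \partial n^2\) by a finite difference, and then forms \(\mu = \mathcal {B} n\) and \(p = n\mu - \varepsilon \).

```python
import math


def energy_density(n, m=1.0, K=0.1, Gamma=2.0):
    """Hypothetical polytrope: eps(n) = m n + K n^Gamma / (Gamma - 1)."""
    return m * n + K * n ** Gamma / (Gamma - 1.0)


def lagrangian(n_squared, **eos):
    """Lambda(n^2) = -eps(n), treating n^2 as the independent variable."""
    return -energy_density(math.sqrt(n_squared), **eos)


def bulk_coefficient(n, step=1e-6, **eos):
    """B = -2 dLambda/dn^2, estimated with a central finite difference."""
    x = n * n
    h = step * x
    return -2.0 * (lagrangian(x + h, **eos) - lagrangian(x - h, **eos)) / (2.0 * h)


def chemical_potential(n, **eos):
    """mu = B n, the magnitude of the momentum mu_a = B n_a."""
    return bulk_coefficient(n, **eos) * n


def pressure(n, **eos):
    """p = n mu - eps, equivalent to Lambda - n^c mu_c."""
    return n * chemical_potential(n, **eos) - energy_density(n, **eos)


n = 0.8  # made-up number density
print(chemical_potential(n), pressure(n))
```

For this particular equation of state the analytic results are \(\mu = m + K \varGamma n^{\varGamma - 1}/(\varGamma - 1)\) and \(p = K n^\varGamma \), which the numerical values should reproduce to within the finite-difference error.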

Next, by introducing the displacement \(\xi ^a\), effectively tracking the fluid elements, we have prepared the ground for a study of general Lagrangian perturbations (for example, those used in relativistic studies of neutron-star instabilities (Friedman 1978), see Sect. 7.4). In fact, given the results from the variational derivation it is straightforward to write down the perturbed fluid equations.

By introducing the decomposition \(n^a = n u^a\) we can show that the argument that led to (6.22) also provides (see footnote 16)

$$\begin{aligned} \delta n = - \nabla _a \left( n \xi ^a \right) - n \left( u_a u^b \nabla _b \xi ^a + \frac{1}{2} \perp ^{ab} \delta g_{a b}\right) , \end{aligned}$$

and

$$\begin{aligned} \delta u^a = \left( \delta ^a{}_b + u^a u_b \right) \left( u^c \nabla _c \xi ^b - \xi ^c \nabla _c u^b \right) + \frac{1}{2} u^a u^b u^c \delta g_{b c}. \end{aligned}$$

Similar arguments lead to

$$\begin{aligned} \varDelta u^a= & {} \frac{1}{2} u^a u^b u^c \varDelta g_{b c}, \end{aligned}$$
$$\begin{aligned} \varDelta \epsilon _{a b c d}= & {} \frac{1}{2} \epsilon _{a b c d} g^{e f} \varDelta g_{e f}, \end{aligned}$$
$$\begin{aligned} \varDelta n= & {} - \frac{n}{2} \perp ^{ab} \varDelta g_{a b}. \end{aligned}$$

These results and their Newtonian analogues were used by Friedman and Schutz in establishing the so-called Chandrasekhar–Friedman–Schutz (CFS) instability (Chandrasekhar 1970; Friedman and Schutz 1978a, b) (see Sect. 7.4).
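As a consistency check (a sketch using only the expressions above), the Lagrangian variation of the number density follows from \(\varDelta n = \delta n + \xi ^a \nabla _a n\) together with \(\varDelta g_{ab} = \delta g_{ab} + 2 \nabla _{(a} \xi _{b)}\):

$$\begin{aligned} \varDelta n= & {} - n \nabla _a \xi ^a - n u_a u^b \nabla _b \xi ^a - \frac{n}{2} \perp ^{ab} \delta g_{a b} , \nonumber \\ \perp ^{ab} \varDelta g_{a b}= & {} \perp ^{ab} \delta g_{a b} + 2 \nabla _a \xi ^a + 2 u^a u^b \nabla _a \xi _b . \end{aligned}$$

Multiplying the second line by \(-n/2\) reproduces the first, which gives \(\varDelta n = - (n/2) \perp ^{ab} \varDelta g_{a b}\).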

Working with the matter space

The derivation of the Euler equations (6.28) made “implicit” use of the matter space as a device to ensure the conservation of the particle flux. In many ways it makes sense to introduce the argument this way, but—as we will see when we consider elasticity—it can be useful to work more explicitly with the matter space quantities.

Let us first note that, as implied by Fig. 10, the \(X^A\) coordinates are comoving with their respective worldlines, meaning that they are independent of the proper time \(\tau \), say, that parameterizes each curve. This is easy to demonstrate. Introducing the four-velocity associated with the worldline through \(n^a = n u^a\), we have

$$\begin{aligned} n{dX^A \over d\tau }= & {} n {dx^a \over d\tau } \partial _a X^A = n^a \partial _a X^A = \mathcal{L}_n X^A \nonumber \\= & {} \frac{1}{3!} \epsilon ^{b c d a} \psi ^A_a \psi ^B_b \psi ^C_c \psi ^D_d n_{B C D} = 0. \end{aligned}$$

We see that the time part of the spacetime dependence of the \(X^A\) is somewhat ad hoc. If we take the flow of time \(t^a\) to provide the proper time of the worldlines (\(t^a\) is parallel to \(n^a\) and hence \(u^a\)), the \(X^A\) do not change. An apparent time dependence in spacetime means that \(t^a\) is such as to cut across fluid worldlines (\(t^a\) is not parallel to \(n^a\)), which of course have different values for the \(X^A\).
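A simple flat-space illustration (an example chosen here, with \(v\) a constant speed along the \(x\)-axis) makes the point explicit. For a uniform flow with \(u^a = \gamma (1,v,0,0)\), a natural set of labels is \(X^1 = x - vt\), \(X^2 = y\), \(X^3 = z\). Then

$$\begin{aligned} u^a \partial _a X^1 = \gamma \left( \partial _t + v \partial _x \right) (x - vt) = 0 , \qquad t^a \partial _a X^1 = \partial _t (x - vt) = - v , \end{aligned}$$

where \(t^a = (1,0,0,0)\) is the time direction of an observer at rest. The labels are constant along the fluid worldlines, while the observer who is not comoving with the flow sees an apparent time dependence.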

It is also worth noting the (closely related) fact that \(n_{abc}\) is a “fixed” tensor, in the sense that

$$\begin{aligned} u^a n_{abc} = n u^a u^d \epsilon _{dabc} = 0 , \end{aligned}$$

(i.e. the three-form is spatial) and

$$\begin{aligned} \mathcal {L}_u n_{abc} = 0 , \end{aligned}$$

(it does not change along the flow). The latter is equivalent to requiring that the three-form \(n_{abc}\) be closed; i.e.,

$$\begin{aligned} \nabla _{[a}n_{bcd]} = \partial _{[a}n_{bcd]} = 0, \end{aligned}$$

which, of course, holds by construction.
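As a quick check of this equivalence (a sketch using only \(n_{abc} = \epsilon _{dabc} n^d\) and the Lorentzian-signature contraction \(\epsilon ^{abcd} \epsilon _{ebcd} = -3!\, \delta ^a_e\)), note that a four-form in four dimensions has a single independent component, which can be extracted by contracting with \(\epsilon ^{abcd}\):

$$\begin{aligned} \epsilon ^{abcd} \nabla _a n_{bcd} = \epsilon ^{abcd} \epsilon _{ebcd} \nabla _a n^e = - 3! \, \nabla _a n^a . \end{aligned}$$

Closure of the three-form is therefore the same statement as conservation of the flux, \(\nabla _a n^a = 0\).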

From a formal point of view, we have changed perspective by taking the (scalar fields) \(X^A\) to be the fundamental variables. The construction also provides matter space with a geometric structure. As a first example of this note that, if integrated over a volume in matter space, \(n_{ABC}\) provides a measure of the number of particles in that volume. To see this, simply introduce a matter space three form \(\epsilon _{ABC}\) such that

$$\begin{aligned} n_{ABC} = n\epsilon _{ABC} , \end{aligned}$$

and recall that such an object represents a volume. Since n is the number density, it follows immediately that \(n_{ABC}\) represents the number of particles in the volume. This object is directly linked to the spacetime version:

$$\begin{aligned} n_{abc} = n u^d \epsilon _{dabc} \equiv n \epsilon _{abc} \end{aligned}$$

where \(\epsilon _{abc}\) is associated with a right-handed tetrad moving along \(u^a\). It then follows immediately that

$$\begin{aligned} \epsilon _{abc} = \psi ^A_a \psi ^B_b \psi ^C_c \epsilon _{ABC} . \end{aligned}$$

Inspired by this, we may also introduce

$$\begin{aligned} g^{A B} = \psi ^A_a \psi ^B_b g^{a b} = \psi ^A_a \psi ^B_b \perp ^{a b} , \end{aligned}$$

representing the induced metric on matter space.

Equipped with these matter space quantities, it is fairly natural to ask: is it possible to express the Lagrangian \(\varLambda (n^2)\) in terms of matter space quantities? The answer will soon be relevant, so let us consider it now. It is straightforward to show that we may consider \(\varLambda \) to be a function of \(g^{A B}\) and \(n_{ABC}\):

$$\begin{aligned} n^2= & {} - g_{ab} n^a n^b = {1\over 3!} n_{abc} n^{abc} \nonumber \\= & {} {1\over 3!} \left( \psi ^A_a g^{ad} \psi ^D_d\right) \left( \psi ^B_b g^{be} \psi ^E_e\right) \left( \psi ^C_c g^{cf} \psi ^F_f\right) n_{ABC} n_{DEF} \nonumber \\= & {} {1\over 3!} g^{AD} g^{BE} g^{CF} n_{ABC} n_{DEF} . \end{aligned}$$

It follows that, if we introduce

$$\begin{aligned} \gamma ^{AB} = \left( \sqrt{\det \left( g_{GH}\right) } n\right) ^{2/3} g^{AB} , \end{aligned}$$

then (using Eq. (B.8) from Appendix 2)

$$\begin{aligned} n^2 = \frac{1}{3!} \gamma ^{AD} \gamma ^{BE} \gamma ^{CF} [ABC] [DEF] = \det \left( \gamma ^{AB}\right) . \end{aligned}$$

That is, we have the correspondence

$$\begin{aligned} \varLambda (n^2)\quad \Leftrightarrow \quad \varLambda \left( \mathrm {det} \left( \gamma ^{AB}\right) \right) . \end{aligned}$$
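The determinant identity can be checked in a few lines. The sketch below assumes that \(\epsilon _{ABC} = \sqrt{\det \left( g_{GH}\right) }\, [ABC]\), with \([ABC]\) the alternating symbol, and that the rescaling acts on the contravariant components, \(\gamma ^{AB} = ( \sqrt{\det ( g_{GH}) }\, n) ^{2/3} g^{AB}\); both are assumptions about conventions made for this check:

$$\begin{aligned} n^2= & {} \frac{1}{3!} g^{AD} g^{BE} g^{CF} n_{ABC} n_{DEF} = \frac{1}{3!} \left( \sqrt{\det \left( g_{GH}\right) }\, n \right) ^2 g^{AD} g^{BE} g^{CF} [ABC] [DEF] \nonumber \\= & {} \frac{1}{3!} \gamma ^{AD} \gamma ^{BE} \gamma ^{CF} [ABC] [DEF] = \det \left( \gamma ^{AB}\right) . \end{aligned}$$

Consistently, \(\det ( \gamma ^{AB}) = ( \sqrt{\det ( g_{GH}) }\, n) ^2 \det ( g^{AB}) = n^2 \det ( g_{GH}) \det ( g^{AB}) = n^2\), since \(g^{AB}\) is the inverse of \(g_{AB}\).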

Finally, it is worth noting that, alongside the number three-form we may introduce the analogous object for the momentum:

$$\begin{aligned} \mu ^{abc} = \epsilon ^{dabc} \mu _d , \quad \mu _a = {1\over 3!} \epsilon _{bcda} \mu ^{bcd} . \end{aligned}$$

This then leads to

$$\begin{aligned} n\mu = -n^a \mu _a = {1\over 3!} n_{abc}\mu ^{abc} = {1\over 3!} n_{ABC} \mu ^{ABC} , \end{aligned}$$

where

$$\begin{aligned} \mu ^{ABC} = \psi ^A_a\psi ^B_b\psi ^C_c \mu ^{abc} . \end{aligned}$$

A step towards field theory

The quantities we introduced in the previous section may seem somewhat abstract at this point, but their meaning will (hopefully) become clearer later. As a first exercise in working with them, let us ask what happens if we consider the matter space “fields” as the fundamental variables of the theory.

In general, we might take the Lagrangian to be \(\varLambda = \varLambda (X^A, \psi ^A_a, g^{ab})\) (as in, for example, Jezierski and Kijowski 2011). This leads to

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right) = \sqrt{- g} \left\{ {\partial \varLambda \over \partial X^A} \delta X^A+ {\partial \varLambda \over \partial \psi ^A_a} \delta \psi ^A_a + \left[ {\partial \varLambda \over \partial g^{ab} }- {\varLambda \over 2} g_{ab}\right] \delta g^{ab} \right\} . \end{aligned}$$

If we introduce the Lagrangian displacement, as before, we already know that

$$\begin{aligned} \varDelta X^A = 0 , \end{aligned}$$


$$\begin{aligned} \varDelta \psi ^A_a = 0 \quad \Longrightarrow \quad \delta \psi ^A_a = - \xi ^c \nabla _c \psi ^A_a - \psi ^A_c \nabla _a \xi ^c = - \nabla _a \left( \xi ^c \psi ^A_c\right) , \end{aligned}$$

where we have used the fact that partial derivatives commute. It then follows that

$$\begin{aligned} {\partial \varLambda \over \partial X^A} \delta X^A+ {\partial \varLambda \over \partial \psi ^A_a} \delta \psi ^A_a = -\xi ^c \psi ^A_c \left[ {\partial \varLambda \over \partial X^A} - \nabla _a \left( {\partial \varLambda \over \partial \psi ^A_a} \right) \right] , \end{aligned}$$

and we see that the Euler–Lagrange equations are

$$\begin{aligned} \psi ^A_c \left[ {\partial \varLambda \over \partial X^A} - \nabla _a \left( {\partial \varLambda \over \partial \psi ^A_a} \right) \right] =0 . \end{aligned}$$
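The integration by parts behind these Euler–Lagrange equations can be spelled out as follows (a sketch, using \(\delta X^A = - \xi ^c \psi ^A_c\), which follows from \(\varDelta X^A = 0\), and discarding a total divergence that only contributes a surface term to the action):

$$\begin{aligned} {\partial \varLambda \over \partial \psi ^A_a} \delta \psi ^A_a = - {\partial \varLambda \over \partial \psi ^A_a} \nabla _a \left( \xi ^c \psi ^A_c\right) = - \nabla _a \left( {\partial \varLambda \over \partial \psi ^A_a} \xi ^c \psi ^A_c \right) + \xi ^c \psi ^A_c \nabla _a \left( {\partial \varLambda \over \partial \psi ^A_a} \right) . \end{aligned}$$

Adding the \((\partial \varLambda / \partial X^A)\, \delta X^A\) term and dropping the divergence leaves the bracket that multiplies \(\xi ^c \psi ^A_c\).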

We also see that the stress-energy tensor is

$$\begin{aligned} T_{ab} = - {2 \over \sqrt{-g}} {\delta \left( \sqrt{-g} \varLambda \right) \over \delta g^{ab} } = \varLambda g_{ab} - 2{\partial \varLambda \over \partial g^{ab} } . \end{aligned}$$

It is easy to see that these results lead us back to (4.46).

In order to compare the Euler–Lagrange equations for the fields to the Euler equations (5.25), we need two intermediate results. First of all,

$$\begin{aligned} {\partial \varLambda \over \partial \psi ^A_a} = \mu _b {\partial n^b \over \partial \psi ^A_a}= & {} {1\over 3!} \mu _b \epsilon ^{cdeb} n_{CDE} {\partial \over \partial \psi ^A_a} \left( \psi ^C_c \psi ^D_d \psi ^E_e \right) \nonumber \\= & {} {1\over 2} \mu _b \epsilon ^{adeb} \psi ^D_d \psi ^E_e n_{ADE} = - {1\over 2} \mu ^{ade} \psi ^D_d \psi ^E_e n_{ADE} \nonumber \\= & {} - {1\over 2} \mu ^{ade} \delta _A^B \psi ^D_d \psi ^E_e n_{BDE} = - {1\over 2} \mu ^{ade} \left( \psi _A^b \psi ^B_b \right) \psi ^D_d \psi ^E_e n_{BDE} \nonumber \\= & {} - {1\over 2} \psi _A^b\mu ^{ade} n_{bde} = \psi ^b_A\left[ \delta ^a_b \left( \mu _c n^c\right) - \mu _b n^a\right] . \end{aligned}$$

This is true because (i) the metric is held fixed in the partial derivative, and (ii) \(n_{ABC}\) depends only on the matter space coordinates \(X^A\). We then see that

$$\begin{aligned} \psi ^A_b {\partial \varLambda \over \partial \psi ^A_a} = \perp _b^c \left[ \delta ^a_c \left( \mu _d n^d\right) - \mu _c n^a\right] = - n\mu \perp _b^a , \end{aligned}$$

since \(n^a=nu^a\), \(\mu _a= \mu u_a\) and \(\perp ^c_b u_c=0\). Secondly, we need

$$\begin{aligned} \psi ^A_c {\partial \varLambda \over \partial X^A} = \nabla _c \varLambda - {\partial \varLambda \over \partial \psi ^A_b} \nabla _c \psi ^A_b . \end{aligned}$$

Making use of these results, we get

$$\begin{aligned} \psi ^A_b \left[ {\partial \varLambda \over \partial X^A} - \nabla _a \left( {\partial \varLambda \over \partial \psi ^A_a} \right) \right]= & {} \nabla _a \left[ \delta ^a_b \varLambda - \psi ^A_b {\partial \varLambda \over \partial \psi ^A_a} \right] = \nabla _a \left[ \delta ^a_b \varLambda + n\mu \perp ^a_b\right] \nonumber \\= & {} \nabla _a \left[ \delta ^a_b (\varLambda -n^c \mu _c) + n^a \mu _b\right] = \nabla _aT^a_{\ b} = 0 . \end{aligned}$$

In essence, the two descriptions are consistent—as they had to be.
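The first equality in the chain above combines the two intermediate results with the symmetry \(\nabla _b \psi ^A_a = \nabla _b \nabla _a X^A = \nabla _a \psi ^A_b\). Written out (a sketch of the omitted step):

$$\begin{aligned} \psi ^A_b {\partial \varLambda \over \partial X^A} - \psi ^A_b \nabla _a \left( {\partial \varLambda \over \partial \psi ^A_a} \right)= & {} \nabla _b \varLambda - {\partial \varLambda \over \partial \psi ^A_a} \nabla _b \psi ^A_a - \nabla _a \left( \psi ^A_b {\partial \varLambda \over \partial \psi ^A_a} \right) + {\partial \varLambda \over \partial \psi ^A_a} \nabla _a \psi ^A_b \nonumber \\= & {} \nabla _a \left[ \delta ^a_b \varLambda - \psi ^A_b {\partial \varLambda \over \partial \psi ^A_a} \right] , \end{aligned}$$

since the second and fourth terms on the first line cancel.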

What we have outlined is a field-theory approach to the problem, based on the idea that the matter space variables can be viewed as fields in spacetime (Endlich et al. 2011). It is, of course, not a truly independent variational approach, and (as we have seen) the equations of motion one obtains need to be massaged into a more intuitive form. However, this does not mean that the argument is without merit. Looking at a problem from different perspectives tends to help understanding. In this particular instance, we may explore the connection between the symmetries of the problem and the matter space variables. By changing the focus from the familiar macroscopic fluid degrees of freedom to three scalar functions \(X^A\) it is easy to keep track of the expected Poincaré invariance. First of all, if we expect the system to be homogeneous and isotropic we have to require the fields to be invariant under internal translations and rotations. This means that

$$\begin{aligned} X^A \rightarrow X^A + a^A , \end{aligned}$$

for constant \(a^A\), and

$$\begin{aligned} X^A \rightarrow O^A_{\ B} X^B , \end{aligned}$$

where \(O^A_{\ B}\) is an SO(3) matrix (associated with rotation). These conditions do not restrict us to fluids, however, as they will also hold for isotropic solids. The final condition we need relates to invariance under volume-preserving diffeomorphisms, leading to

$$\begin{aligned} X^A \rightarrow \xi ^A(X^B) , \ \text{ with } \ \mathrm {det} {\partial \xi ^A \over \partial X^B} = 1 . \end{aligned}$$

In practice, this corresponds to the dynamics being invariant as the fluid elements move around without expansion or contraction.

What are the implications of these conditions? First of all, we need each of the \(X^A\) fields to be acted on by at least one derivative (although see Andersson et al. 2017a for a discussion on how this assumption can be relaxed for dissipative systems). This means that the Lagrangian cannot depend on \(X^A\) directly (as we assumed). Moreover, taking a field-theory view of the problem (see the discussion of the fluid-gravity correspondence in Sect. 16.4) we may focus on low momenta/low frequencies, for which the most relevant terms are those with the fewest derivatives. In effect, the lowest order Lagrangian will involve exactly one derivative acting on each \(X^A\). The focus then shifts to the map, \(\psi ^A_a\). As we expect to work with Lorentz scalars, it would be natural to assume that the Lagrangian must involve the contraction

$$\begin{aligned} g^{AB}= g^{ab} \psi ^A_a \psi ^B_b , \end{aligned}$$

from before (i.e., the induced metric on the matter space). Moreover, we have already seen that the symmetries require us to work with invariant functions of \(g^{AB}\) and the volume preserving argument picks out the determinant as the key combination.
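To see why the determinant is singled out, consider how the induced metric responds to a relabelling \(X^A \rightarrow \xi ^A(X^B)\) (a short check, with \(J^A_{\ B} = \partial \xi ^A / \partial X^B\) introduced here as shorthand):

$$\begin{aligned} \psi ^A_a \rightarrow J^A_{\ B} \psi ^B_a , \qquad g^{AB} \rightarrow J^A_{\ C} J^B_{\ D} g^{CD} , \qquad \det \left( g^{AB}\right) \rightarrow \left( \det J \right) ^2 \det \left( g^{AB}\right) . \end{aligned}$$

For volume-preserving maps \(\det J = 1\), so \(\det ( g^{AB})\) is left unchanged by the full set of such relabellings.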

The connection with quantum field theory is explored by Endlich et al. (2011), with particularly interesting developments relating to symmetry breaking and the emergence of superfluidity (Dubovsky et al. 2006, 2012) and extensions to incorporate quantum anomalies in the field theory (Dubovsky et al. 2014). An example of the latter is the Wess–Zumino anomaly, which leads to terms that remain only after integration by parts. In effect, the action is invariant, but the Lagrangian is not. Somewhat simplistically, one may associate such terms with the surface terms we neglected in the variational argument. There has also been some effort to extend the approach to dissipative systems (Endlich et al. 2013).

Newtonian limit and Lagrangian perturbations

The Newtonian limit

Having written down the equations that govern a single (barotropic) relativistic fluid, it is natural to consider the connection between the final expressions and standard Newtonian fluid dynamics. In order to make this connection, we need to establish how one arrives at the Newtonian limit of the relativistic equations. It is useful to work this out because—even though the framework we are developing is intended to describe relativistic systems—modelling often draws on intuition gained from good old Newtonian physics. This is especially the case when one considers “new” applications. Useful qualitative understanding can often be obtained from a Newtonian analysis, but we need relativistic models for precision and in order to explore unique aspects, like rotational frame-dragging and gravitational radiation.

There has been much progress on the analysis of Newtonian multifluid systems. Prix (2004) has developed an action-based formalism, analogous to the model we consider here (based on the notion of time-shifts, closely related to the Lagrangian variations in spacetime). Carter and Chamel (2004, 2005a, 2005b) have done the same, except that they use a fully spacetime covariant formalism (taking the work of Milne and Cartan as starting points), taking full account of the fact that the Newtonian limit is singular. Our aim here is less ambitious. We simply want to demonstrate how the Newtonian fluid equations can be extracted as the non-relativistic limit of the relativistic model.

We take as the starting point the leading order line element in the weak-field limit:

$$\begin{aligned} {d} s^2 = - c^2 d\tau ^2 = - c^2 \left( 1 + \frac{2 \varPhi }{c^2}\right) {d} t^2 + \eta _{i j} { d} x^i { d} x^j , \end{aligned}$$

where \(x^i\ (i=1,2,3)\) are Cartesian coordinates, \(\eta _{i j}\) is the flat three-dimensional metric and \(\varPhi \) is the gravitational potential. The Newtonian limit then follows by writing the equations to leading order in an expansion in inverse powers of the speed of light c. Formally, the Newtonian results are obtained in the limit where \(c \rightarrow \infty \).

Let us apply this strategy to the equations of fluid dynamics. With \(\tau \) the proper time measured along a fluid element’s worldline, the curve it traces out can be written

$$\begin{aligned} x^{a}(\tau ) = \{c t(\tau ),x^i(\tau )\} . \end{aligned}$$

In order to work out the four-velocity,

$$\begin{aligned} u^{a} = \frac{{ d} x^a}{{ d} \tau } , \end{aligned}$$

we note that (7.1) leads to

$$\begin{aligned} { d} \tau ^2 = \left( 1 + \frac{2 \varPhi }{c^2} - \frac{\eta _{ij} v^i v^j}{c^2}\right) {d} t^2 , \end{aligned}$$

with \(v^i = { d}x^i/{d}t\) the Newtonian three-velocity of the fluid. Since the velocity is assumed to be small, in the sense that

$$\begin{aligned} {\left| v^i\right| \over c} \ll 1 , \end{aligned}$$

this leads to

$$\begin{aligned} {dt\over d\tau } \approx 1- {\varPhi \over c^2} + {v^2 \over 2 c^2} , \end{aligned}$$

where \(v^2 = \eta _{i j} v^i v^j\), and

$$\begin{aligned} u^0 = {dx^0 \over d\tau } = c {dt \over d\tau } \approx c\left( 1- {\varPhi \over c^2} + {v^2 \over 2 c^2} \right) . \end{aligned}$$

It is also easy to see that

$$\begin{aligned} u^i = {d x^i \over d\tau } = v^i {dt \over d\tau } \approx v^i . \end{aligned}$$

In order to obtain the covariant components, we use the metric (which is manifestly diagonal). Thus, we find that

$$\begin{aligned} u_0 = g_{00}u^0 = - c \left( 1 + \frac{2 \varPhi }{c^2}\right) \left( 1- {\varPhi \over c^2} + {v^2 \over 2 c^2} \right) \approx -c \left( 1+ {\varPhi \over c^2} + {v^2 \over 2 c^2} \right) , \end{aligned}$$


$$\begin{aligned} u_i = v_i . \end{aligned}$$

Note that these relations lead to

$$\begin{aligned} u^a u_a = -c^2 \left( 1- {\varPhi \over c^2} + {v^2 \over 2 c^2} \right) \left( 1+ {\varPhi \over c^2} + {v^2 \over 2 c^2}\right) + v^2 \approx - c^2 , \end{aligned}$$

as expected.
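The quality of these expansions is easy to probe numerically. The following is a minimal sketch, not part of the original derivation: it compares the exact \(dt/d\tau \) implied by the weak-field line element with the leading-order expansion, and checks the normalization of the four-velocity. The function names and the sample values of \(\varPhi /c^2\) and \(v^2/c^2\) are hypothetical choices made for illustration.

```python
import math

def dt_dtau_exact(phi_over_c2, v2_over_c2):
    """Exact dt/dtau from d tau^2 = (1 + 2 Phi/c^2 - v^2/c^2) dt^2."""
    return 1.0 / math.sqrt(1.0 + 2.0 * phi_over_c2 - v2_over_c2)

def dt_dtau_approx(phi_over_c2, v2_over_c2):
    """Leading-order expansion: 1 - Phi/c^2 + v^2/(2 c^2)."""
    return 1.0 - phi_over_c2 + 0.5 * v2_over_c2

def truncation_error(eps, a=-1.0, b=0.5):
    """Error of the expansion when Phi/c^2 = a*eps and v^2/c^2 = b*eps.

    a and b are arbitrary illustrative numbers (assumptions, not from the text).
    """
    return abs(dt_dtau_exact(a * eps, b * eps) - dt_dtau_approx(a * eps, b * eps))

def norm_u(phi_over_c2, v2_over_c2, c=1.0):
    """u^a u_a built from the exact dt/dtau and the weak-field metric."""
    gamma = dt_dtau_exact(phi_over_c2, v2_over_c2)
    u0 = c * gamma                      # u^0 = c dt/dtau
    g00 = -(1.0 + 2.0 * phi_over_c2)    # metric component for x^0 = c t
    v2 = v2_over_c2 * c**2
    return g00 * u0**2 + v2 * gamma**2  # spatial part: eta_ij u^i u^j

# Shrinking the small parameter by 10 should shrink the error by about 100,
# as expected for a truncation at first order in 1/c^2.
ratio = truncation_error(1e-2) / truncation_error(1e-3)
print(ratio)
print(norm_u(-1e-3, 5e-4))
```

The error ratio being close to 100 reflects the fact that the neglected terms are of second order in \(1/c^2\), while the normalization \(u^a u_a = -c^2\) holds exactly when the exact \(dt/d\tau \) is used.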

We can now work out the Newtonian limit for the conserved particle flux

$$\begin{aligned} \nabla _a ( n u^a)= & {} 0 \quad \Longrightarrow \quad {1\over c} \partial _t \left( n u^0 \right) + \nabla _i \left( nv^i \right) = 0 \nonumber \\&\Longrightarrow&\quad \partial _t n + \nabla _i \left( nv^i \right) = \mathcal {O}\left( c^{-1}\right) \end{aligned}$$

To leading order we retain the expected result

$$\begin{aligned} \partial _t n + \nabla _i \left( nv^i \right) = 0 , \end{aligned}$$

recovering the usual continuity equation by introducing the mass density \(\rho = mn\), with m the mass per particle.

In order to work out the corresponding limit of the Euler equations, we need the curvature contributions to the covariant derivative. However, from the definition (3.35) and the weak-field metric, we see that only \(g_{00}\) gives a non-vanishing contribution. Moreover, it is clear that

$$\begin{aligned} \varGamma ^a_{b c} = \mathcal {O}(1/c^2) , \end{aligned}$$

which is why we did not need to worry about this in the case of the flux conservation. The curvature contributes at higher orders.

Explicitly, we have

$$\begin{aligned} u^a \nabla _a u^b = u^a \partial _a u^b + \varGamma ^b_{ca} u^a u^c = {1\over c} u^0 \partial _t u^b + u^i \partial _i u^b + \varGamma ^b_{ca} u^a u^c . \end{aligned}$$

We only need the spatial components, so we set \(b=j\) to get

$$\begin{aligned} u^a \nabla _a u^j= & {} {1\over c} u^0 \partial _t u^j + u^i \partial _i u^j + \varGamma ^j_{ca} u^a u^c \nonumber \\= & {} \partial _t v^j + v^i \partial _i v^j + c^2 \varGamma ^j_{00} + \text{higher order terms} \nonumber \\= & {} \partial _t v^j + v^i \partial _i v^j + {c^2\over 2} \eta ^{jk} \partial _k \left( {2\varPhi \over c^2} \right) \nonumber \\= & {} \partial _t v^j + v^i \partial _i v^j + \eta ^{jk} \partial _k \varPhi . \end{aligned}$$
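The only connection coefficient needed here can be evaluated directly from the weak-field line element (a short check, taking \(x^0 = ct\) so that \(g_{00} = -(1 + 2\varPhi /c^2)\) and \(g_{k0} = 0\)):

$$\begin{aligned} \varGamma ^j_{00} = {1\over 2} g^{jk} \left( 2 \partial _0 g_{k0} - \partial _k g_{00} \right) = - {1\over 2} \eta ^{jk} \partial _k g_{00} = {1\over c^2} \eta ^{jk} \partial _k \varPhi . \end{aligned}$$

Since \(u^0 \approx c\) to leading order, the contribution to the acceleration is \(\varGamma ^j_{00} u^0 u^0 \approx c^2 \varGamma ^j_{00} = \eta ^{jk} \partial _k \varPhi \), which remains finite as \(c \rightarrow \infty \).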

Finally, we need the pressure contribution. For this we note that the projection becomes

$$\begin{aligned} \perp ^{ab} = g^{ab} + {1\over c^2} u^a u^b , \end{aligned}$$

in order to be dimensionally consistent. We also need \( \varepsilon \gg p \). This means that we have

$$\begin{aligned} \perp ^{ba} \nabla _a p \quad \Longrightarrow \quad \eta ^{jk} \partial _k p , \end{aligned}$$

and we (finally) arrive at the Euler equations

$$\begin{aligned} \partial _t v^j + v^i \partial _i v^j = - \eta ^{jk} \left( { 1\over \rho } \partial _k p + \partial _k \varPhi \right) , \end{aligned}$$

which represent momentum conservation.

Local dynamics

In principle, the fluid equations (from Sect. 5.2 or above) completely specify the problem for a single-component barotropic flow (once an equation of state has been provided, of course). In general, the problem is nonlinear and difficult to solve analytically. Once we couple the fluid motion to the dynamic spacetime of the Einstein equations, it becomes exceedingly so. However, if we want to understand the behaviour of a given system we can make progress using linearized theory. This approach would be suitable whenever the dynamics only deviates slightly from a known background/equilibrium state. The deviations should be small enough that we can neglect nonlinearities. This is a very common strategy, for example, to study the oscillations of neutron stars. Moreover, it is a good strategy if we want to explore the local dynamics of a given system.

Consider the case where the length and time scales of the deviations are such that the spacetime curvature can be ignored; then, we can work in the local inertial frame associated with the flow, i.e., use Minkowski coordinates \(x^a = [t,x^i]\) and assume that the spacetime is flat. Letting \(\tau \) be the proper time associated with a given fluid worldline, we see from Eqs. (7.2) and (7.3) and the normalization of the four-velocity \(u^a\) (i.e. \(u^a u_a = -1\) in units with \(c=1\)) that, in the local inertial frame, the particle flux density takes the form

$$\begin{aligned} n^a = n u^a = n \left( 1 - v^2\right) ^{- 1/2} [1,v^i] , \end{aligned}$$

where \(v^i = dx^i/dt\) is the local three-velocity and \(v^2 = \eta _{i j} v^i v^j\). In the linearized case, the three-velocity \(v^i\) is small and is therefore treated as a deviation from the background. The background four-velocity is thus uniform, taking the form \(u^a = [1,0,0,0]\), and it is obviously the case that \(\nabla _b u^a = 0\). As long as the associated scales of the deviations are sufficiently small, we should be able to take the background particle number density n to be uniform both temporally and spatially so that \(\nabla _a n = 0\). Therefore, it is easy to see that the background/equilibrium state trivially satisfies the dynamical equations.

Now consider (Eulerian) variations, such that \(n \rightarrow n + \delta n\) and \(v^i \rightarrow \delta v^i\) and let the deviations be expressed as plane waves (making use of a Fourier decomposition). The normalization of the four-velocity \(u^a\) demands that the perturbed velocity is spatial (\(u^a \delta u_a = 0\)), which is consistent with the linearization of Eq. (7.20):

$$\begin{aligned} \delta n^a = [\delta n ,n \delta v^i] . \end{aligned}$$

A standard sound speed derivation, however, takes the point of view that the energy density and four-velocity are the fundamental variables. For now, we adopt this approach in order to make contact with the well-known results.

From Eq. (5.12), we see that a perturbation in n leads to a perturbation in \(\rho \) (recall \(\varepsilon \approx \rho = mn\) in the weak-field limit); namely,

$$\begin{aligned} \delta \rho = \mu \delta n . \end{aligned}$$

Likewise, Eq. (5.13) shows that there are corresponding perturbations in the pressure and chemical potential. With that in mind, we linearize Eqs. (5.15) and (5.17), and find that the perturbation problem becomes

$$\begin{aligned} \partial _t \delta \rho + \left( p + \rho \right) \nabla _i \delta v^i = 0 , \end{aligned}$$


$$\begin{aligned} \left( p + \rho \right) \partial _t \delta v_i + \nabla _i \delta p = 0 . \end{aligned}$$

To close the system, we introduce a barotropic equation of state:

$$\begin{aligned} p = p(\rho ) \quad \longrightarrow \quad \delta p = \left( {dp \over d\rho } \right) \delta \rho \equiv C_s^2 \delta \rho . \end{aligned}$$

The plane-wave Ansatz means that we have

$$\begin{aligned} \delta p= & {} A_{p} e^{i k (- \sigma t + \hat{k}_j x^j)} \end{aligned}$$
$$\begin{aligned} \delta \rho= & {} A_{\rho } e^{i k (- \sigma t + \hat{k}_j x^j)} \end{aligned}$$


$$\begin{aligned} \delta v^i = A^i_v e^{i k (- \sigma t + \hat{k}_j x^j)} . \end{aligned}$$

In these expressions, the constant \(\sigma \) is the wave-speed, the constant \(k_i\) is the (spatial) wave-vector, such that \(k^2 = k_i k^i\) (\(k^i = g^{i j} k_j\)) and \(\hat{k}_i = k_i/k\). We see from Eq. (7.25) that the pressure amplitude \(A_p\) must satisfy (assuming that the perturbations are described by the same equation of state as the background)

$$\begin{aligned} A_p = C_s^2 A_{\rho } . \end{aligned}$$

Inserting the plane-wave decompositions for \(\delta \rho \) and \(\delta v^i\) into (7.23) and (7.24) we find

$$\begin{aligned} - \sigma A_{\rho } + (p + \rho ) \hat{k}_i A^i_v = 0 \end{aligned}$$

and

$$\begin{aligned} - (p + \rho ) \sigma A^i_v + C^2_s A _{\rho } \hat{k}^i = 0 . \end{aligned}$$

It is easy to see that we cannot have non-trivial transverse waves; i.e., if \(\hat{k} _iA^i_v = 0\) then we must have \(A_{\rho } = 0\) as well. Focussing on the longitudinal case, we can contract the second equation with \(\hat{k}_i\) to obtain a scalar equation. Making use of this equation, we obtain the dispersion relation

$$\begin{aligned} \sigma ^2 - C_s^2 = 0 \quad \Longrightarrow \quad \sigma = \pm C_s . \end{aligned}$$

In this simple situation it is obvious that we should identify \(C_s\) as the speed of sound.

It is worth noting that we can go back to the case where the particle flux \(n^a\) is taken to be fundamental and the equation of state has the form \(\rho = \rho (n)\). If we do that, then we have

$$\begin{aligned} d\rho = \mu dn \qquad \text{ and } \qquad dp = n d\mu \end{aligned}$$

and it follows that the speed of sound is given by

$$\begin{aligned} C_s^2 = {dp \over d \rho } = {n \over \mu } {d \mu \over dn} . \end{aligned}$$
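The equivalence of the two expressions for the sound speed can be illustrated numerically. The sketch below assumes a simple polytrope-like equation of state, \(\rho (n) = m n + K n^{\varGamma }/(\varGamma - 1)\) in units with \(c = 1\); this model, its parameter values and the function names are hypothetical choices for illustration and are not taken from the text.

```python
def energy_density(n, m=1.0, K=0.1, Gamma=2.0):
    """Hypothetical polytrope-like equation of state rho(n), units with c = 1."""
    return m * n + K * n**Gamma / (Gamma - 1.0)

def chemical_potential(n, m=1.0, K=0.1, Gamma=2.0):
    """mu = d rho / d n, evaluated analytically for the model above."""
    return m + K * Gamma * n**(Gamma - 1.0) / (Gamma - 1.0)

def pressure(n, **eos):
    """p = n mu - rho, which is consistent with d p = n d mu."""
    return n * chemical_potential(n, **eos) - energy_density(n, **eos)

def sound_speed_sq_from_mu(n, h=1e-6, **eos):
    """C_s^2 = (n / mu) d mu / d n, using a central finite difference."""
    dmu = chemical_potential(n + h, **eos) - chemical_potential(n - h, **eos)
    return n / chemical_potential(n, **eos) * dmu / (2.0 * h)

def sound_speed_sq_from_p(n, h=1e-6, **eos):
    """C_s^2 = d p / d rho, from central differences of p(n) and rho(n)."""
    dp = pressure(n + h, **eos) - pressure(n - h, **eos)
    drho = energy_density(n + h, **eos) - energy_density(n - h, **eos)
    return dp / drho

n0 = 0.5  # sample number density (arbitrary units, an assumption)
cs2_mu = sound_speed_sq_from_mu(n0)
cs2_p = sound_speed_sq_from_p(n0)
print(cs2_mu, cs2_p)
```

For this particular model the pressure reduces to \(p = K n^{\varGamma }\), and the two routes to \(C_s^2\) agree to within the finite-difference error.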

Newtonian fluid perturbations

Studies of the stability properties of rotating self-gravitating bodies are of obvious relevance to astrophysics. By improving our understanding of the relevant issues we can hope to shed light on the nature of the various dynamical and secular instabilities that may govern the spin-evolution of rotating stars. The relevance of such knowledge for neutron star astrophysics may be highly significant, especially since instabilities may lead to detectable gravitational-wave signals. In this section we will outline the Lagrangian perturbation framework developed by Friedman and Schutz (1978a, 1978b) for rotating non-relativistic stars, leading to criteria that can be used to decide when the oscillations of a rotating neutron star are unstable. We also provide an explicit example proving the instability of the so-called r-modes at all rotation rates in a perfect fluid star.

Following Friedman and Schutz (1978a, 1978b), we work with Lagrangian variations. We have already seen that the Lagrangian perturbation \(\varDelta Q\) of a quantity Q is related to the Eulerian variation \(\delta Q\) by

$$\begin{aligned} \varDelta Q = \delta Q + \mathcal {L}_\xi Q, \end{aligned}$$

where (as before) \(\mathcal {L}_\xi \) is the Lie derivative (introduced in Sect. 3). The Lagrangian change in the fluid velocity now follows from the Newtonian limit of Eq. (6.39):

$$\begin{aligned} \varDelta v^i = \partial _t \xi ^i, \end{aligned}$$

where \(\xi ^i\) is the Lagrangian displacement. Given this, and

$$\begin{aligned} \varDelta g_{ij} = \nabla _i \xi _j + \nabla _j \xi _i, \end{aligned}$$

where \(g_{ij}\) is the flat three-dimensional metric, we have

$$\begin{aligned} \varDelta v_i = \partial _t \xi _i + v^j\nabla _i \xi _j + v^j \nabla _j \xi _i. \end{aligned}$$

Let us consider the simplest case, namely a barotropic ordinary fluid for which \(\varepsilon =\varepsilon (n)\). Then we want to perturb the continuity and Euler equations. The conservation of mass for the perturbations follows immediately from the Newtonian limits of Eqs. (6.38) and (6.40) (which as we recall automatically satisfy the continuity equation):

$$\begin{aligned} \varDelta n = - n \nabla _i \xi ^i, \qquad \delta n = - \nabla _i (n \xi ^i). \end{aligned}$$
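The two expressions are related through the definition of the Lagrangian variation, with the Lie derivative of a scalar reducing to \(\xi ^i \nabla _i n\) (a one-line check):

$$\begin{aligned} \varDelta n = \delta n + \xi ^i \nabla _i n = - \nabla _i (n \xi ^i) + \xi ^i \nabla _i n = - n \nabla _i \xi ^i . \end{aligned}$$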

Consequently, the perturbed gravitational potential follows from

$$\begin{aligned} \nabla ^2 \delta \varPhi = 4\pi G \delta \rho = 4 \pi G m \, \delta n = - 4\pi G m \nabla _i(n \xi ^i). \end{aligned}$$

In order to perturb the Euler equations we first rewrite Eq. (7.19) as

$$\begin{aligned} (\partial _t +\mathcal {L}_v) v_i + \nabla _i \left( \tilde{\mu } + \varPhi - \frac{1}{2} v^2 \right) = 0, \end{aligned}$$

where \(\tilde{\mu }= \mu /m\). This form is particularly useful since the Lagrangian variation commutes with the operator \(\partial _t + \mathcal {L}_v\). Perturbing Eq. (7.41) we thus have

$$\begin{aligned} (\partial _t +\mathcal {L}_v) \varDelta v_i + \nabla _i \left( \varDelta \tilde{\mu } + \varDelta \varPhi - \frac{1}{2} \varDelta ( v^2) \right) = 0. \end{aligned}$$

We want to rewrite this equation in terms of the displacement vector \(\xi \). After some algebra we arrive at

$$\begin{aligned}&\partial _t^2 \xi _i + 2 v^j \nabla _j \partial _t \xi _i + (v^j \nabla _j)^2 \xi _i + \nabla _i \delta \varPhi + \xi ^j \nabla _i \nabla _j \varPhi \nonumber \\&- (\nabla _i \xi ^j) \nabla _j \tilde{\mu } + \nabla _i \varDelta \tilde{\mu } = 0. \end{aligned}$$

Finally, we need

$$\begin{aligned} \varDelta \tilde{\mu } = \delta \tilde{\mu } + \xi ^i\nabla _i \tilde{\mu } = \left( \frac{\partial \tilde{\mu }}{\partial n} \right) \delta n + \xi ^i\nabla _i \tilde{\mu } = - \left( \frac{\partial \tilde{\mu }}{\partial n} \right) \nabla _i (n \xi ^i) + \xi ^i\nabla _i \tilde{\mu }. \end{aligned}$$

Given this, we have arrived at the following form for the perturbed Euler equation:

$$\begin{aligned}&\partial _t^2 \xi _i + 2 v^j \nabla _j \partial _t \xi _i + (v^j \nabla _j)^2 \xi _i + \nabla _i \delta \varPhi + \xi ^j \nabla _i \nabla _j \left( \varPhi + \tilde{\mu } \right) \nonumber \\&- \nabla _i \left[ \left( \frac{\partial \tilde{\mu }}{\partial n} \right) \nabla _j (n \xi ^j) \right] = 0. \end{aligned}$$

This equation should be compared to Eq. (15) of Friedman and Schutz (1978a).
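The final step in this derivation amounts to expanding the gradient of \(\varDelta \tilde{\mu }\). As a sketch of the omitted algebra, we have

$$\begin{aligned} \nabla _i \varDelta \tilde{\mu } = \nabla _i \delta \tilde{\mu } + (\nabla _i \xi ^j) \nabla _j \tilde{\mu } + \xi ^j \nabla _i \nabla _j \tilde{\mu } . \end{aligned}$$

The middle term cancels the contribution \(- (\nabla _i \xi ^j) \nabla _j \tilde{\mu }\), the last term combines with \(\xi ^j \nabla _i \nabla _j \varPhi \), and inserting \(\delta \tilde{\mu } = - (\partial \tilde{\mu }/\partial n) \nabla _j (n \xi ^j)\) gives the final term of the perturbed Euler equation.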

The CFS instability

Having derived the perturbed Euler equations, we are interested in constructing conserved quantities that can be used to assess the stability of the system. To do this, we first multiply Eq. (7.45) by the number density n, and then write the result (schematically) as

$$\begin{aligned} A \partial _t^2 \xi + B \partial _t \xi + C \xi = 0, \end{aligned}$$

omitting the indices since there is little risk of confusion. Defining the inner product

$$\begin{aligned} \left\langle \eta ^i,\xi _i \right\rangle = \int \eta ^{i*} \xi _i \, \mathrm {d} V, \end{aligned}$$

where \(\eta \) and \(\xi \) both solve the perturbed Euler equation, and the asterisk denotes complex conjugation (and we integrate over the volume of the body, V), one can now show that

$$\begin{aligned} \left\langle \eta , A\xi \right\rangle = \left\langle \xi ,A\eta \right\rangle ^* \qquad \mathrm {and} \qquad \left\langle \eta ,B\xi \right\rangle = - \left\langle \xi ,B\eta \right\rangle ^*. \end{aligned}$$

The latter requires the background relation \(\nabla _i (n v^i) = 0\), and holds as long as \(n \rightarrow 0\) at the surface of the star. A slightly more involved calculation leads to

$$\begin{aligned} \left\langle \eta , C\xi \right\rangle = \left\langle \xi , C\eta \right\rangle ^*. \end{aligned}$$

Inspired by the fact that the momentum conjugate to \(\xi ^i\) is \(\rho (\partial _t + v^j \nabla _j)\xi _i\), we now consider the symplectic structure

$$\begin{aligned} W(\eta ,\xi ) = \left\langle \eta , A\partial _t \xi + \frac{1}{2} B \xi \right\rangle - \left\langle A\partial _t \eta + \frac{1}{2} B \eta , \xi \right\rangle . \end{aligned}$$

It is straightforward to show that \(W(\eta ,\xi )\) is conserved, i.e., \(\partial _t W = 0\). This leads us to define the canonical energy of the system as (with m the baryon mass, not to be confused with the angular multipole m later)

$$\begin{aligned} E_\mathrm {c} = \frac{m}{2} W (\partial _t \xi ,\xi ) = \frac{m}{2} \left\{ \left\langle \partial _t \xi , A \partial _t \xi \right\rangle + \left\langle \xi , C \xi \right\rangle \right\} . \end{aligned}$$
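The second equality can be verified using the schematic equation of motion together with the symmetry properties of A, B and C (a sketch of the intermediate steps):

$$\begin{aligned} W(\partial _t \xi , \xi )= & {} \left\langle \partial _t \xi , A \partial _t \xi \right\rangle + \frac{1}{2} \left\langle \partial _t \xi , B \xi \right\rangle - \left\langle A \partial _t^2 \xi + \frac{1}{2} B \partial _t \xi , \xi \right\rangle \nonumber \\= & {} \left\langle \partial _t \xi , A \partial _t \xi \right\rangle + \frac{1}{2} \left\langle \partial _t \xi , B \xi \right\rangle + \frac{1}{2} \left\langle B \partial _t \xi , \xi \right\rangle + \left\langle C \xi , \xi \right\rangle \nonumber \\= & {} \left\langle \partial _t \xi , A \partial _t \xi \right\rangle + \left\langle \xi , C \xi \right\rangle . \end{aligned}$$

Here \(A \partial _t^2 \xi = - B \partial _t \xi - C \xi \) was used in the second line, while the third line follows from \(\left\langle B \partial _t \xi , \xi \right\rangle = \left\langle \xi , B \partial _t \xi \right\rangle ^* = - \left\langle \partial _t \xi , B \xi \right\rangle \) and \(\left\langle C \xi , \xi \right\rangle = \left\langle \xi , C \xi \right\rangle ^* = \left\langle \xi , C \xi \right\rangle \).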

After some manipulations, we arrive at the explicit expression:

$$\begin{aligned} E_\mathrm {c}= & {} \frac{1}{2} \int \left\{ \rho |\partial _t \xi |^2 - \rho | v^j \nabla _j \xi _i|^2 + \rho \xi ^i \xi ^{j*}\nabla _i \nabla _j (\tilde{\mu } + \varPhi ) \right. \nonumber \\&\left. + \left( \frac{\partial \mu }{\partial n} \right) |\delta n|^2 - \frac{1}{4 \pi G} |\nabla _i \delta \varPhi |^2 \right\} \mathrm {d} V , \end{aligned}$$

which can be compared to Eq. (45) of Friedman and Schutz (1978a). In the case of an axisymmetric system, e.g., a rotating star, we can also define a canonical angular momentum as

$$\begin{aligned} J_\mathrm {c} = - \frac{m}{2} W (\partial _\varphi \xi , \xi ) = - \mathrm {Re} \left\langle \partial _\varphi \xi , A\partial _t \xi + \frac{1}{2} B\xi \right\rangle . \end{aligned}$$

The proof that this quantity is conserved relies on the fact that (i) \(W(\eta , \xi )\) is conserved for any two solutions to the perturbed Euler equations, and (ii) \(\partial _\varphi \) commutes with \(\rho v^j \nabla _j\) in axisymmetry, which means that if \(\xi \) solves the Euler equations then so does \(\partial _\varphi \xi \).
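The conservation of \(W\) and \(E_\mathrm {c}\) can be illustrated with a finite-dimensional analogue. The sketch below is a toy model, not the stellar problem: it integrates a two-component system of the schematic form \(A \partial _t^2 \xi + B \partial _t \xi + C \xi = 0\) with \(A\) the identity, \(B\) a real antisymmetric (Coriolis-like) matrix and \(C\) a symmetric matrix. The matrices, parameter values, initial data and the Runge-Kutta integrator are all assumptions made for illustration.

```python
def rhs(state, omega, ka, kb):
    """First-order form of xi'' + B xi' + C xi = 0 for a two-component toy model.

    B = [[0, -2*omega], [2*omega, 0]] is antisymmetric and C = diag(ka, kb)
    is symmetric; both are hypothetical choices.
    """
    x, y, vx, vy = state
    ax = 2.0 * omega * vy - ka * x    # -(B xi')_x - (C xi)_x
    ay = -2.0 * omega * vx - kb * y   # -(B xi')_y - (C xi)_y
    return (vx, vy, ax, ay)

def rk4_step(state, dt, *params):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    s1 = rhs(state, *params)
    s2 = rhs(shifted(state, s1, 0.5 * dt), *params)
    s3 = rhs(shifted(state, s2, 0.5 * dt), *params)
    s4 = rhs(shifted(state, s3, dt), *params)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, s1, s2, s3, s4))

def apply_B(vec, omega):
    """Action of the antisymmetric matrix B on a two-vector."""
    x, y = vec
    return (-2.0 * omega * y, 2.0 * omega * x)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def symplectic_product(eta, xi, omega):
    """W(eta, xi) = <eta, xi' + B xi / 2> - <eta' + B eta / 2, xi> with A = 1."""
    eta_q, eta_v = eta[:2], eta[2:]
    xi_q, xi_v = xi[:2], xi[2:]
    b_xi = apply_B(xi_q, omega)
    b_eta = apply_B(eta_q, omega)
    first = dot(eta_q, (xi_v[0] + 0.5 * b_xi[0], xi_v[1] + 0.5 * b_xi[1]))
    second = dot((eta_v[0] + 0.5 * b_eta[0], eta_v[1] + 0.5 * b_eta[1]), xi_q)
    return first - second

def canonical_energy(xi, ka, kb):
    """E_c = (|xi'|^2 + xi.C.xi) / 2, the toy analogue of the canonical energy."""
    x, y, vx, vy = xi
    return 0.5 * (vx * vx + vy * vy + ka * x * x + kb * y * y)

omega, ka, kb = 0.7, 1.0, 2.5
xi = (1.0, 0.0, 0.0, 0.3)      # (x, y, vx, vy) for the first solution
eta = (0.0, 0.5, -0.2, 0.0)    # independent second solution
w0 = symplectic_product(eta, xi, omega)
e0 = canonical_energy(xi, ka, kb)
dt = 1e-3
for _ in range(5000):
    xi = rk4_step(xi, dt, omega, ka, kb)
    eta = rk4_step(eta, dt, omega, ka, kb)
w_drift = abs(symplectic_product(eta, xi, omega) - w0)
e_drift = abs(canonical_energy(xi, ka, kb) - e0)
print(w_drift, e_drift)
```

Both drifts should be at the level of the integration error, reflecting the fact that the conservation relies only on \(B\) being antisymmetric and \(C\) symmetric, the finite-dimensional counterparts of the relations satisfied by the operators above.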

As discussed in Friedman and Schutz (1978a, 1978b), the stability analysis is complicated by the presence of so-called “trivial” displacements. These trivials can be thought of as representing a relabeling of the physical fluid elements. A trivial displacement \(\zeta ^i\) leaves the physical quantities unchanged, i.e., is such that \(\delta n = \delta v^i = 0\). This means that we must have

$$\begin{aligned} \nabla _i (\rho \zeta ^i)= & {} 0, \end{aligned}$$
$$\begin{aligned} \left( \partial _t + \mathcal {L}_v \right) \zeta ^i= & {} 0. \end{aligned}$$

The solution to the first of these equations can be written

$$\begin{aligned} \rho \zeta ^i = \epsilon ^{ijk} \nabla _j \chi _k , \end{aligned}$$

where, in order to satisfy the second equation, the vector \(\chi _k\) must have time-dependence such that

$$\begin{aligned} ( \partial _t + \mathcal {L}_v) \chi _k = 0. \end{aligned}$$

This means that the trivial displacement will remain constant along the background fluid trajectories. Or, as Friedman and Schutz (1978a) put it, the “initial relabeling is carried along with the unperturbed motion”.

The trivials cause trouble because they affect the canonical energy. Before one can use the canonical energy to assess the stability of a rotating configuration one must deal with this “gauge problem”. To do this one should ensure that the displacement vector \(\xi ^i\) is orthogonal to all trivials. A prescription for this is provided by Friedman and Schutz (1978a). In particular, they show that the required canonical perturbations preserve the vorticity of the individual fluid elements. Most importantly, one can also prove that a normal mode solution is orthogonal to the trivials. Thus, mode solutions can serve as canonical initial data, and be used to assess stability.

The importance of the canonical energy stems from the fact that it can be used to test the stability of the system. In particular:

  • Dynamical instabilities are only possible for motions such that \(E_\mathrm {c}=0\). This makes intuitive sense since the amplitude of a mode for which \(E_\mathrm {c}\) vanishes can grow without bound and still obey the conservation laws.

  • If the system is coupled to radiation (e.g., gravitational waves) which carries positive energy away from the system (which should be taken to mean that \(\partial _t E_\mathrm {c} < 0\)) then any initial data for which \(E_\mathrm {c}<0\) will lead to an unstable evolution.

Consider a real-frequency normal-mode solution to the perturbation equations, i.e., a solution of the form \(\xi = \hat{\xi } e^{i(\omega t+m\varphi )}\). One can readily show that the associated canonical energy becomes

$$\begin{aligned} E_\mathrm {c} = \omega \left[ \omega \left\langle {\xi }, A {\xi }\right\rangle - \frac{i}{2} \left\langle {\xi }, B{\xi }\right\rangle \right] , \end{aligned}$$

where the expression in the bracket is real. Similarly, for the canonical angular momentum, we get

$$\begin{aligned} J_\mathrm {c} = -m \left[ \omega \left\langle {\xi }, A {\xi } \right\rangle - \frac{i}{2} \left\langle {\xi }, B{\xi } \right\rangle \right] . \end{aligned}$$

Combining Eqs. (7.58) and (7.59) we see that, for real frequency modes, we have

$$\begin{aligned} E_\mathrm {c} = - \frac{\omega }{m} J_\mathrm {c} = \sigma _\mathrm {p} J_\mathrm {c}, \end{aligned}$$

where \(\sigma _\mathrm {p}\) is the pattern speed of the mode.
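To see why \(-\omega /m\) plays the role of a pattern speed, it may help to spell out one short step, using only the mode form \(e^{i(\omega t+m\varphi )}\) assumed above. A surface of constant phase satisfies

$$\begin{aligned} \omega t + m \varphi = \mathrm {const} \qquad \Longrightarrow \qquad \frac{\mathrm {d} \varphi }{\mathrm {d} t} = - \frac{\omega }{m} \equiv \sigma _\mathrm {p} , \end{aligned}$$

so the pattern appears to move forwards, according to an inertial observer, when \(\omega \) and m have opposite signs.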

Now note that Eq. (7.59) can be rewritten as

$$\begin{aligned} \frac{J_\mathrm {c}}{\left\langle \hat{\xi }, \rho \hat{\xi } \right\rangle } = - m\omega + m \frac{\left\langle {\xi }, i \rho v^j \nabla _j {\xi } \right\rangle }{\left\langle \hat{\xi }, \rho \hat{\xi } \right\rangle }. \end{aligned}$$

Using cylindrical coordinates, and \(v^j = \varOmega \varphi ^j \), one can show that

$$\begin{aligned} - i \rho {{\xi }}_i^* v^j \nabla _j {\xi }^i = \rho \varOmega \left[ m \left| \hat{\xi } \right| ^2 + i ({\hat{\xi }}^* \times \hat{\xi })_z \right] . \end{aligned}$$


We also have

$$\begin{aligned} \left| ({\hat{\xi }}^* \times \hat{\xi })_z \right| \le \left| \hat{\xi } \right| ^2 \end{aligned}$$

and hence we must have (for uniform rotation)

$$\begin{aligned} \sigma _\mathrm {p} - \varOmega \left( 1 + \frac{1}{m} \right) \le \frac{J_\mathrm {c}/m^2}{\left\langle \hat{\xi }, \rho \hat{\xi } \right\rangle } \le \sigma _\mathrm {p} - \varOmega \left( 1 - \frac{1}{m} \right) . \end{aligned}$$
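The bound (7.63) used in this step can be checked directly. As a sketch, let \(\hat{\xi }_1\) and \(\hat{\xi }_2\) denote the two components of \(\hat{\xi }\) orthogonal to the rotation axis in an orthonormal basis (these labels are introduced here for illustration only). Then

$$\begin{aligned} ({\hat{\xi }}^* \times \hat{\xi })_z = \hat{\xi }_1^* \hat{\xi }_2 - \hat{\xi }_2^* \hat{\xi }_1 = 2 i \, \mathrm {Im} \left( \hat{\xi }_1^* \hat{\xi }_2 \right) , \end{aligned}$$

$$\begin{aligned} \left| ({\hat{\xi }}^* \times \hat{\xi })_z \right| \le 2 \left| \hat{\xi }_1 \right| \left| \hat{\xi }_2 \right| \le \left| \hat{\xi }_1 \right| ^2 + \left| \hat{\xi }_2 \right| ^2 \le \left| \hat{\xi } \right| ^2 . \end{aligned}$$

Equality requires \(\hat{\xi }_z = 0\) and \(\hat{\xi }_2 = \pm i \hat{\xi }_1\), i.e., a circularly polarised displacement in the plane orthogonal to the rotation axis.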

Equation (7.64) forms a key part of the proof that rotating perfect fluid stars are generically unstable in the presence of radiation (Friedman and Schutz 1978b). The argument goes as follows: Consider modes with finite frequency in the \(\varOmega \rightarrow 0\) limit. Then Eq. (7.64) implies that co-rotating modes (with \(\sigma _\mathrm {p}>0\)) must have \(J_\mathrm {c}>0\), while counter-rotating modes (for which \(\sigma _\mathrm {p} < 0\)) will have \(J_\mathrm {c}<0\). In both cases \(E_\mathrm {c}>0\), which means that both classes of modes are stable. Now consider a small region near a point where \(\sigma _\mathrm {p}=0\) (at a finite rotation rate). Typically, this corresponds to a point where the initially counter-rotating mode becomes co-rotating. In this region \(J_\mathrm {c}<0\). However, \(E_\mathrm {c}\) will change sign at the point where \(\sigma _\mathrm {p}\) (or, equivalently, the frequency \(\omega \)) vanishes. Since the mode was stable in the non-rotating limit this change of sign indicates the onset of instability at a critical rate of rotation. The situation for the fundamental f-mode of a rotating star is illustrated in Fig. 11.

Fig. 11

An illustration of the instabilities affecting the fundamental f-mode of a rotating neutron star. The horizontal axis represents the rotation, expressed in terms of the ratio between the kinetic energy and the gravitational potential energy (\(\beta = T/|W|\)). The angular velocity is not a (particularly) useful parameter as values beyond (something like) \(\beta \approx 0.11\) require some degree of differential rotation. That is, rigidly rotating bodies never reach the dynamically unstable regime (at least not in Newtonian gravity). The vertical axis gives the pattern speed of the mode, with waves that appear to move forwards (according to a distant observer) having positive values, while backwards moving modes lead to negative values. The originally backwards moving f-mode becomes secularly unstable at \(\beta \approx 0.14\), at the point where the mode first appears to move forwards (because of the rotation of the star). The mode becomes dynamically unstable (this is the so-called bar-mode instability) when the two modes merge at \(\beta \approx 0.24\) (adapted from Andersson 2003)

In order to further demonstrate the usefulness of the canonical energy, let us prove the instability of the inertial r-modes (these are oscillation modes that owe their existence to the rotation of the star, and which are predominantly associated with the Coriolis force). For a general inertial mode we have (cf. Lockitch and Friedman 1999 for a discussion of the single fluid problem using notation which closely resembles the one we adopt here)

$$\begin{aligned} v^i \sim \delta v^i \sim \dot{\xi }^i \sim \varOmega \qquad \mathrm {and} \qquad \delta \varPhi \sim \delta n \sim \varOmega ^2. \end{aligned}$$

In particular, modes like the r-modes are dominated by convective currents, so we have \(\delta v_r \sim \varOmega ^2\) and the continuity equation leads to

$$\begin{aligned} \nabla _i \delta v^i \sim \varOmega ^3 \qquad \Longrightarrow \qquad \nabla _i \xi ^i \sim \varOmega ^2. \end{aligned}$$

Under these assumptions we find that \(E_\mathrm {c}\) becomes (to order \(\varOmega ^2\))

$$\begin{aligned} E_\mathrm {c} \approx \frac{1}{2} \int \rho \left[ \left| \partial _t {\xi } \right| ^2 - \left| v^i \nabla _i{\xi } \right| ^2 + \xi ^{i*} \xi ^{j} \nabla _i \nabla _j \left( \varPhi + \tilde{\mu } \right) \right] \mathrm {d} V. \end{aligned}$$

We can rewrite the last term using the equation governing the axisymmetric equilibrium. Keeping only terms of order \(\varOmega ^2\) we have

$$\begin{aligned} \xi ^{i*} \xi ^{j} \nabla _i\nabla _j \left( \varPhi + \tilde{\mu } \right) \approx \frac{1}{2} \varOmega ^2 \xi ^{i*} \xi ^{j} \nabla _i \nabla _j (r^2 \sin ^2 \theta ). \end{aligned}$$

A bit more work then leads to

$$\begin{aligned} \frac{1}{2} \varOmega ^2 \xi ^{i*} \xi ^{j} \nabla _i \nabla _j (r^2 \sin ^2 \theta ) = \varOmega ^2 r^2 \left[ \cos ^2 \theta \left| \xi ^\theta \right| ^2 + \sin ^2\theta \left| \xi ^\varphi \right| ^2 \right] , \end{aligned}$$


and

$$\begin{aligned} \left| v^i \nabla _i \xi _j \right| ^2= & {} \varOmega ^2 \left\{ m^2 \left| \xi \right| ^2 - 2imr^2 \sin \theta \cos \theta \left[ \xi ^\theta \xi ^{\varphi *} - \xi ^\varphi \xi ^{\theta *} \right] \right. \nonumber \\&+ \left. r^2 \left[ \cos ^2 \theta \left| \xi ^\theta \right| ^2 + \sin ^2\theta \left| \xi ^\varphi \right| ^2 \right] \right\} , \end{aligned}$$

which means that the canonical energy can be written in the form

$$\begin{aligned}&E_\mathrm {c} \approx - \frac{1}{2} \int \rho \left\{ (m \varOmega - \omega )(m \varOmega + \omega ) |\xi |^2 \right. \nonumber \\&\quad \left. - 2 i m \varOmega ^2 r^2 \sin \theta \cos \theta \left[ \xi ^\theta \xi ^{\varphi *} - \xi ^\varphi \xi ^{\theta *} \right] \right\} \mathrm {d} V. \end{aligned}$$

Introducing the axial stream function U we have

$$\begin{aligned} \xi ^\theta= & {} - \frac{iU}{r^2 \sin \theta } \partial _\varphi Y_l^m e^{i \omega t}, \end{aligned}$$
$$\begin{aligned} \xi ^\varphi= & {} \frac{iU}{r^2 \sin \theta } \partial _\theta Y_l^m e^{i\omega t}, \end{aligned}$$

where \(Y_l^m=Y_l^m(\theta ,\varphi )\) are the spherical harmonics. This leads to

$$\begin{aligned} |\xi |^2 = \frac{|U|^2}{r^2} \left[ \frac{1}{\sin ^2 \theta } |\partial _\varphi Y_l^m|^2 + |\partial _\theta Y_l^m|^2 \right] , \end{aligned}$$


and

$$\begin{aligned}&ir^2 \sin \theta \cos \theta \left[ \xi ^\theta \xi ^{\varphi *} - \xi ^\varphi \xi ^{\theta *} \right] \nonumber \\&\quad = \frac{1}{r^2} \frac{ \cos \theta }{\sin \theta } m |U|^2 \left[ Y_l^m \partial _\theta Y_l^{m*} + Y_l^{m *} \partial _\theta Y_l^{m}\right] . \end{aligned}$$

After performing the angular integrals, we find that

$$\begin{aligned} E_\mathrm {c} = - \frac{ l(l+1) }{2} \left\{ (m \varOmega - \omega )(m \varOmega + \omega ) - \frac{2 m^2 \varOmega ^2}{l(l+1)} \right\} \int \rho |U|^2 \, \mathrm {d} r. \end{aligned}$$

Combining this with the r-mode frequency (Lockitch and Friedman 1999)

$$\begin{aligned} \omega = m \varOmega \left[ 1 - \frac{2}{l(l+1)} \right] , \end{aligned}$$

we see that \(E_\mathrm {c} < 0\) for all \(l>1\) r-modes, i.e., they are all unstable. The \(l=m=1\) r-mode is a special case, as it leads to \(E_\mathrm {c}=0\).
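The sign of \(E_\mathrm {c}\) can be confirmed with a few lines of exact arithmetic. The following Python sketch evaluates the bracket in the expression for \(E_\mathrm {c}\) at the r-mode frequency, in units where \(\varOmega = 1\). The function names are ours, and the positive radial integral \(\int \rho |U|^2 \, \mathrm {d} r\) is simply factored out; both are assumptions of this illustration rather than part of the original analysis.

```python
from fractions import Fraction


def rmode_frequency(l, m, Omega=1):
    """r-mode frequency omega = m*Omega*[1 - 2/(l(l+1))], in exact arithmetic."""
    return m * Omega * (1 - Fraction(2, l * (l + 1)))


def canonical_energy_factor(l, m, Omega=1):
    """E_c divided by the (positive) radial integral of rho*|U|^2."""
    L = l * (l + 1)
    omega = rmode_frequency(l, m, Omega)
    bracket = (m * Omega - omega) * (m * Omega + omega) \
        - Fraction(2 * m * m * Omega * Omega, L)
    return -Fraction(L, 2) * bracket


if __name__ == "__main__":
    for l in range(1, 5):
        m = l  # illustrative choice; any 1 <= m <= l gives the same sign
        print(l, m, canonical_energy_factor(l, m))
```

With these definitions, `canonical_energy_factor(1, 1)` vanishes, while the factor is negative for every \(l>1\) that we have tried, in line with the statement above.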

The relativistic problem

The theoretical framework for studying stellar stability in General Relativity was mainly developed during the 1970s, with key contributions from Chandrasekhar and Friedman (1972a, 1972b) and Schutz (1972a, 1972b). Their work extends the Newtonian analysis discussed above. There are basically two reasons why a relativistic analysis is more complicated than the Newtonian one. First of all, the problem is algebraically more complex because one must solve the Einstein field equations in addition to the fluid equations of motion. This is apparent from the perturbation relations we have written down already. For any given equation of state—represented by \(\varLambda (n)\)—we can express the perturbed equations of motion in terms of the displacement vector \(\xi ^a\) and the Eulerian variation of the metric, \(\delta g_{ab}\). In doing this it is worth noting that the usual approach to relativistic stellar perturbations is to work with this combination of variables (see, e.g., Kojima 1992). Essentially, we need the Eulerian perturbation of the Einstein field equations and the Lagrangian variation of the momentum equation (6.28). The description of the perturbed Einstein equations is standard (see, e.g., Andersson 2019), so we focus on the fluid aspects here.

The perturbations of (5.25) are easy to work out once we note that the Lagrangian variation commutes with the exterior derivative. We immediately get

$$\begin{aligned} (\varDelta n^a) \nabla _{[a}\mu _{b]} + n^a \nabla _{[a}\varDelta \mu _{b]} = 0 . \end{aligned}$$

This simplifies further if we use (6.22) and assume that the background is such that (5.25) is satisfied. The first term then vanishes, and we are left with

$$\begin{aligned} n^a \nabla _{[a}\varDelta \mu _{b]} = 0 . \end{aligned}$$

To complete this expression, we need to work out \(\varDelta \mu _a\). This is a straightforward task given the above results, and we find

$$\begin{aligned} \varDelta \mu _a = \left( \mathcal {B} + n {d \mathcal {B} \over dn} \right) g_{ab} \varDelta n^b + \left( \mu ^b \delta _a^d - {d \mathcal {B} \over d n^2} n_a n^b n^d \right) \varDelta g_{b d} . \end{aligned}$$

An additional complication is associated with the fact that one must account for gravitational waves, leading to the system being dissipative. The work culminated in a series of papers (Friedman and Schutz 1975, 1978a, b; Friedman 1978) in which the role that gravitational radiation plays in these problems was explained, and a foundation for subsequent research in this area was established. The main result was that gravitational radiation acts in the same way in the full theory as in a post-Newtonian analysis of the problem. If we consider a sequence of equilibrium models, a mode becomes secularly unstable at the point where its frequency vanishes (in the inertial frame). Most importantly, the proof does not require the completeness of the modes of the system.

A step towards multi-fluids

Returning to the relativistic setting, let us consider what happens if one tries to extend the off-the-shelf analysis from Sect. 5.2 to the case of two components. Take, for example, the case of a single particle species at finite temperature; a case where we have to account for the presence of entropy. In general, one would have to allow for the heat (i.e. entropy) to flow relative to the matter (see Sect. 15), but we will assume that this is not the case here. If the entropy is carried along with the matter flow, we are dealing with a single-fluid problem and we should be able to make progress with the tools we have at hand. The equation of state is, however, no longer barotropic since we have \(\varepsilon =\varepsilon (n,s)\), with n the matter number density and s the entropy density (as before). Nevertheless, the stress-energy tensor can still be expressed in terms of the pressure p and the energy density \(\varepsilon \), as in Sect. 5.2. The fluid equations obtained from its divergence will take the same form as in the barotropic case. The difference becomes apparent only when we try to close the system of equations. Now the energy variation takes the form

$$\begin{aligned} d\varepsilon = \mu dn + T ds , \end{aligned}$$

where the temperature is identified as the chemical potential of the entropy:

$$\begin{aligned} T = \left( {\partial \varepsilon \over \partial s} \right) _n . \end{aligned}$$

This means that we have

$$\begin{aligned} T^{ab} = (n\mu + sT) u^a u^b + p g^{ab} \end{aligned}$$

and, if we note that

$$\begin{aligned} dp = n d\mu + s dT \quad \Longrightarrow \quad \nabla _a p = n \nabla _a \mu + s \nabla _a T , \end{aligned}$$

it follows that energy conservation leads to

$$\begin{aligned} \mu \nabla _a n^a + T \nabla _a s^a = 0 , \end{aligned}$$


or

$$\begin{aligned}&\mu \left( \dot{n} + n \nabla _a u^a \right) + T \left( \dot{s} + s \nabla _a u^a \right) = 0 , \end{aligned}$$

where

$$\begin{aligned}&\dot{n} = {dn \over d\tau } = u^a \nabla _a n, \end{aligned}$$

and similarly for \( \dot{s}\). At this point we need to make additional assumptions. If, for example, the motion is adiabatic then the entropy is conserved and the second term on the left-hand side of (8.6) vanishes. It then follows that the first bracket must vanish as well, so the matter flux is also conserved. If the flow is not adiabatic, the situation is different. Suppose there are no sources or sinks for the matter. Then the matter flux should still be conserved, but now the entropy is not. So the first term in (8.6) still vanishes, but the second cannot. We obviously have a problem, unless we relax the assumption that the entropy flows with the matter. Introducing a heat flux relative to the matter, we avoid the issue. However, by doing so, we introduce extra degrees of freedom that need to be accounted for and understood. We will consider this problem in detail once we have extended the variational formalism to deal with additional flows. We could also consider the implication the other way; in order for a single particle flow to be adiabatic, the entropy must be carried along with the matter.

Moving on to the momentum equations arising from \(\nabla _a T^{a b}=0\), replicating the analysis from Sect. 5.2, recalling the definition \(\mu _a = \mu u_a\) and introducing the analogous quantity \(\theta _a = T u_a\), we can write (5.25) as

$$\begin{aligned} 2 n^a \nabla _{[a}\mu _{b]}+ 2 s^a \nabla _{[a}\theta _{b]}=0 \end{aligned}$$

That is, we arrive at a “force balance” equation with two vorticity terms instead of the single one we had before. The implication is that, even in the absence of external agents, we have to consider possible interactions between the two components. By extending the variational approach we gain insight that helps address this issue (also in more complicated situations).

It is also worth noting that, by using notation that singles out the entropy component, we have made the problem look less “symmetric” than it really is. In many situations it is practical to introduce constituent indices (labels telling us which component a quantity belongs to), e.g., to use \(n_\mathrm {n}^a\) and \(n_\mathrm {s}^a\) instead of \(n^a\) and \(s^a\). Noting also that the temperature is the chemical potential associated with the entropy, i.e. \(\theta _a = \mu ^\mathrm {s}_a\), we can write the above result as

$$\begin{aligned} \sum _{{\mathrm {x}}=\mathrm {n},\mathrm {s}} f_a^{\mathrm {x}}= \sum _{{\mathrm {x}}=\mathrm {n},\mathrm {s}} 2 n_{\mathrm {x}}^b \nabla _{[b}\mu ^{\mathrm {x}}_{a]} = \sum _{{\mathrm {x}}=\mathrm {n},\mathrm {s}} 2 n_{\mathrm {x}}^b \omega ^{\mathrm {x}}_{b a} = 0 . \end{aligned}$$

The generalisation of this result to situations where additional components are carried along by the same four velocity is now obvious. The problem with distinct four velocities, which we turn to in Sect. 9, requires additional thinking.

The two-constituent, single fluid

Before we move on to the general problem, let us consider how the problem discussed in the previous section would be described in the variational approach. Generally speaking, the total energy density \(\varepsilon \) can be a function of independent parameters other than the particle number density \(n_\mathrm {n}\), like the entropy density \(s=n_\mathrm {s}\) in the case we just considered, assuming that the system scales in the manner discussed in Sect. 2 so that only densities need enter the equation of state.


As we have already suggested, if there is no heat flow (say) then this is a single fluid problem, meaning that there is still just one flow velocity \(u^a\). This is what we mean by a two-constituent, single fluid. We assume that the particle number and entropy are both conserved along the flow. Associated with each parameter there is then a conserved current flux, i.e. \(n^a_\mathrm {n}= n_\mathrm {n}u^a\) for the particles and \(n_\mathrm {s}^a = n_\mathrm {s}u^a\) for the entropy. Note that the ratio \(x_\mathrm {s}= n_\mathrm {s}/n_\mathrm {n}\) (the specific entropy) is co-moving in the sense that

$$\begin{aligned} u^a \nabla _a x_\mathrm {s}= \dot{x}_\mathrm {s}= 0 . \end{aligned}$$

This is, of course, the relation (8.6) from before.
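A short worked step makes the co-moving property (8.10) explicit. Assuming only that both currents are conserved, so that \(\dot{n}_{\mathrm {x}}= - n_{\mathrm {x}}\nabla _a u^a\) for each constituent, we have

$$\begin{aligned} \dot{x}_\mathrm {s}= \frac{\dot{n}_\mathrm {s}}{n_\mathrm {n}} - \frac{n_\mathrm {s}}{n_\mathrm {n}^2} \dot{n}_\mathrm {n}= - x_\mathrm {s}\nabla _a u^a + x_\mathrm {s}\nabla _a u^a = 0 . \end{aligned}$$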

Making use of the constituent indices, the associated first law can be written in the form

$$\begin{aligned} { d} \varepsilon = \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} \mu ^{\mathrm {x}}{d} n_{\mathrm {x}}= - \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} \mu ^{\mathrm {x}}_a {d} n^a_{\mathrm {x}}, \end{aligned}$$

since \(\varepsilon = \varepsilon (n_{\mathrm {n}},n_{\mathrm {s}})\), where

$$\begin{aligned} n^a_{\mathrm {x}}= n_{\mathrm {x}}u^a , \quad n^2_{{\mathrm {x}}} = - g_{a b} n^a_{\mathrm {x}}n^b_{\mathrm {x}}, \end{aligned}$$


and

$$\begin{aligned} \mu ^{\mathrm {x}}_a = g_{a b} \mathcal {B}^{{\mathrm {x}}} n^b_{\mathrm {x}}, \quad \mathcal {B}^{{\mathrm {x}}} \equiv 2 \frac{\partial \varepsilon }{\partial n^2_{{\mathrm {x}}}} . \end{aligned}$$

Given that we only have one four-velocity, the system will still just have one fluid element per spacetime point. But unlike before, there is an additional conserved number, \(N_\mathrm {s}\), that can be attached to each worldline, like the particle number \(N_\mathrm {n}\) of Fig. 10. In order to describe the worldlines we can use the same three scalars \(X^A(x^a)\) as before. But how do we get a construction that allows for the additional conserved number? Recall that the intersections of the worldlines with some hypersurface, say \(t = 0\), are uniquely specified by the three \(X^A(0,x^i)\) scalars. Each worldline will also have the conserved numbers \(N_\mathrm {n}\) and \(N_\mathrm {s}\) assigned to it. Thus, the values of these numbers can be expressed as functions of the \(X^A(0,x^i)\). But most importantly, the fact that each \(N_{\mathrm {x}}\) is conserved means that this specification must hold for all of spacetime, so that the ratio \(x_\mathrm {s}\) is of the form \(x_\mathrm {s}(x^a) = x_\mathrm {s}(X^A(x^a))\). Consequently, we now have a construction where this ratio identically satisfies Eq. (8.10), and the action principle remains a variational problem in terms of the three \(X^A\) scalars.

The variation of the action follows just like before, except now a constituent index \({\mathrm {x}}\) must be attached to the particle number density current and three-form:

$$\begin{aligned} n^{\mathrm {x}}_{a b c} = \epsilon _{dabc} n^d_{\mathrm {x}}. \end{aligned}$$

Once again it is convenient to introduce the momentum form, now defined as

$$\begin{aligned} \mu ^{a b c}_{\mathrm {x}}= \epsilon ^{ d a b c } \mu ^{\mathrm {x}}_d . \end{aligned}$$

Since the \(X^A\) are the same for each \(n^{\mathrm {x}}_{a b c}\), the above discussion indicates that the pull-back construction is now to be based on

$$\begin{aligned} n^{\mathrm {x}}_{a b c} = \psi ^A_a \psi ^B_b \psi ^C_c N^{\mathrm {x}}_{A B C} , \end{aligned}$$

where \(N^{\mathrm {x}}_{A B C}\) is completely antisymmetric and a function only of the \(X^A\). After a little thought, it should be obvious that the only thing required here (in addition to the single-component arguments) is to attach an \({\mathrm {x}}\) index to \(n^a\) and n in Eqs. (6.21) and (6.38), respectively.

If we now define the Lagrangian to be

$$\begin{aligned} \varLambda = - \varepsilon \end{aligned}$$

and the generalized pressure \(\varPsi \) as

$$\begin{aligned} \varPsi = \varLambda - \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} \mu ^{\mathrm {x}}_a n^a_{\mathrm {x}}= \varLambda + \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} \mu ^{\mathrm {x}}n_{\mathrm {x}}, \end{aligned}$$

then the first-order variation of \(\varLambda \) is (ignoring a surface term, as usual)

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right)= & {} \frac{1}{2} \sqrt{- g} \left[ \varPsi g^{a b} + \left( \varPsi - \varLambda \right) u^a u^b \right] \delta g_{a b} \nonumber \\&- \sqrt{- g} \left( \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} f^\mathrm {x}_a\right) \xi ^a + \nabla _a \left( \frac{1}{2} \sqrt{-g} \sum _{\mathrm {x}= \mathrm {n},\mathrm {s}} \mu ^{a b c}_{\mathrm {x}}n^{\mathrm {x}}_{b c d} \xi ^d\right) ,\qquad \qquad \end{aligned}$$


where

$$\begin{aligned} f^{\mathrm {x}}_a = 2 n^b_{\mathrm {x}}\omega ^{\mathrm {x}}_{b a} , \end{aligned}$$

and

$$\begin{aligned} \omega ^{\mathrm {x}}_{a b} = \nabla _{[a} \mu ^{\mathrm {x}}_{b]} . \end{aligned}$$

At the end of the day, the equations of motion are

$$\begin{aligned} \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} f^\mathrm {x}_a = 0 , \end{aligned}$$


and

$$\begin{aligned} \nabla _a n^a_{\mathrm {x}}= 0 , \end{aligned}$$

while the stress-energy tensor takes the form

$$\begin{aligned} T^{a b} = \varPsi g^{a b} + (\varPsi - \varLambda ) u^a u^b . \end{aligned}$$

Not surprisingly, these results accord with the expectations from the previous analysis.
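The agreement can be made explicit in one line. As a sketch, identify \(\mu ^\mathrm {n}= \mu \), \(\mu ^\mathrm {s}= T\), \(n_\mathrm {n}= n\) and \(n_\mathrm {s}= s\), and assume the Euler relation \(p + \varepsilon = n \mu + s T\) implied by the perfect fluid form of (8.3). Then

$$\begin{aligned} \varPsi = - \varepsilon + n \mu + s T = p , \qquad \varPsi - \varLambda = n \mu + s T = p + \varepsilon , \end{aligned}$$

so that (8.24) reduces to \(T^{a b} = (n\mu + sT) u^a u^b + p g^{a b}\), which is the expression (8.3).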

Speed of sound (again)

We have already considered the problem of wave propagation in the case of a single component (barotropic) fluid, see Sect. 7.2. Now we are equipped to revisit this problem in the more complex case of a two-constituent single-fluid—a fluid that is “stratified” either by thermal or composition gradients. As before, the analysis is local—assuming that the speed of sound is a locally defined quantity—and performed using local inertial frame (Minkowski) coordinates \(x^a = (t,x^i)\). The purpose of the analysis is twofold: The main aim is to illuminate how the presence of various constituents impacts on the local dynamics, but we also want to illustrate how the problem works out if we take the variational equations of motion as our starting point. An additional motivation is to develop notation that is flexible enough that we can deal with problems of increasing complexity, ideally without losing sight of the underlying physics.

Focussing on a small spacetime region, we can make the same argument as in Sect. 7.2 that the configuration of the matter with no waves present is locally isotropic, homogeneous, and static. Thus, the background has \(n^a_{\mathrm {x}}= [n_{\mathrm {x}},0,0,0]\) and the vorticity \(\omega ^{\mathrm {x}}_{a b}\) vanishes. The general form of the (Eulerian) variation of the force density \(f^{\mathrm {x}}_a\) for each constituent is then

$$\begin{aligned} \delta f_a^{\mathrm {x}}= 2 n^b_\mathrm {x}\partial _{[b} \delta \mu ^{\mathrm {x}}_{a]} . \end{aligned}$$

Similarly, the conservation of the flux \(n^a_{\mathrm {x}}\) gives

$$\begin{aligned} \partial _a \delta n^a_{\mathrm {x}}= 0 . \end{aligned}$$

We are now taking the view that the \(n^a_{\mathrm {x}}\) are the fundamental fluid fields and thus plane-wave propagation means that we have (the covariant analogue of (7.28))

$$\begin{aligned} \delta n^a_\mathrm {x}= A^a_{\mathrm {x}}e^{i k_b x^b} , \end{aligned}$$

where the amplitudes \(A^a_{\mathrm {x}}\) and the wave vector \(k_a\) are constant. Combining Eqs. (8.26) and (8.27) we see that

$$\begin{aligned} k_a \delta n^a_{\mathrm {x}}= 0 , \end{aligned}$$

i.e. the waves are “transverse” in the spacetime sense. It is worth pointing out that this requirement is not in contradiction with the fact that sound waves are longitudinal (in the spatial sense), as established in Sect. 7.2. It is easy to see that (8.28) is exactly what we should expect, if we note that \(\delta n_{\mathrm {x}}^a = \delta n_\mathrm {x}u^a + n_{\mathrm {x}}\delta v^a\) and identify \(k_0 = - k \sigma \) where, recall, \(\sigma \) is the mode speed and k is the spatial part magnitude obtained from \(k^2 = k_j k^j\) (\(k^i = g^{i j} k_j\)).
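Written out, this step reads as follows (a sketch in the local inertial frame, where \(u^a = [1,0,0,0]\) and \(\delta v^a\) is purely spatial):

$$\begin{aligned} k_a \delta n^a_{\mathrm {x}}= k_0 \delta n_{\mathrm {x}}+ n_{\mathrm {x}}k_i \delta v^i = - k \sigma \delta n_{\mathrm {x}}+ n_{\mathrm {x}}k_i \delta v^i = 0 \qquad \Longrightarrow \qquad \frac{\delta n_{\mathrm {x}}}{n_{\mathrm {x}}} = \frac{k_i \delta v^i}{k \sigma } . \end{aligned}$$

This is the linearised continuity equation for a plane wave: the density perturbation is tied to the component of \(\delta v^i\) along \(k^i\), i.e., to longitudinal motion in the spatial sense.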

Moving on to the equations of motion, as given by (8.25), we need the perturbed momentum \(\delta \mu ^{\mathrm {x}}_a\). For future reference, we will work out its general form, and only afterwards assume a static, homogeneous, and isotropic background. However, in order to establish the strategy, it is useful to start by revisiting the barotropic case. Suppose there is only one constituent, with index \({\mathrm {x}}= \mathrm {n}\). The Lagrangian \(\varLambda \) then depends only on \(n^2_{\mathrm {n}}\), and the variation in the chemical potential due to a small disturbance \(\delta n^a_\mathrm {n}\) is

$$\begin{aligned} \delta \mu ^\mathrm {n}_a = \mathcal {B}^{\mathrm {n}}_{a b} \delta n^b_\mathrm {n}, \end{aligned}$$


where

$$\begin{aligned} \mathcal {B}^{\mathrm {n}}_{a b} = \mathcal {B}^{\mathrm {n}} g_{a b} - 2 \frac{\partial \mathcal {B}^{\mathrm {n}}}{\partial n^2_{\mathrm {n}}} n^\mathrm {n}_a n^\mathrm {n}_b . \end{aligned}$$

There are two terms, simply because we need to perturb both \(\mathcal {B}^\mathrm {n}\) and \(n_\mathrm {n}^a\) in (8.13).

The single-component equation of motion is \(\delta f^\mathrm {n}_a = 0\). It is not difficult to show, by using the condition of transverse wave propagation, Eq. (8.28), and contracting with the spatial part of the wave vector \(k^i\) (the time part is trivial because (8.25) is orthogonal to \(n_\mathrm {n}^a\) which in turn is aligned with \(u^a\)), that the equation of motion reduces to

$$\begin{aligned} \left( \mathcal {B}^{\mathrm {n}} + \mathcal {B}^{\mathrm {n}}_{0 0} \frac{k_j k^j}{k^2_0}\right) k_i \delta n^i_\mathrm {n}= 0 . \end{aligned}$$
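The intermediate steps can be sketched as follows, under the stated assumptions of a static, homogeneous background with \(n^a_\mathrm {n}= [n_\mathrm {n},0,0,0]\) and \(g_{a b} = \eta _{a b}\), so that \(\mathcal {B}^{\mathrm {n}}_{0 i} = 0\) and \(\mathcal {B}^{\mathrm {n}}_{i j} = \mathcal {B}^{\mathrm {n}} \eta _{i j}\). For a plane wave we have \(\partial _a \rightarrow i k_a\), and the spatial components of \(\delta f^\mathrm {n}_a = 0\) give

$$\begin{aligned} k_0 \delta \mu ^\mathrm {n}_i - k_i \delta \mu ^\mathrm {n}_0 = 0 , \qquad \delta \mu ^\mathrm {n}_i = \mathcal {B}^{\mathrm {n}} \eta _{i j} \delta n^j_\mathrm {n}, \qquad \delta \mu ^\mathrm {n}_0 = \mathcal {B}^{\mathrm {n}}_{0 0} \delta n^0_\mathrm {n}. \end{aligned}$$

Contracting with \(k^i\) and using (8.28) in the form \(\delta n^0_\mathrm {n}= - k_i \delta n^i_\mathrm {n}/ k_0\) then leads to

$$\begin{aligned} k_0 \left( \mathcal {B}^{\mathrm {n}} + \mathcal {B}^{\mathrm {n}}_{0 0} \frac{k_j k^j}{k^2_0}\right) k_i \delta n^i_\mathrm {n}= 0 , \end{aligned}$$

which is the result above after dividing by \(k_0 \ne 0\).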

From this we see that the dispersion relation takes the form

$$\begin{aligned} \sigma ^2 = {k_0^2 \over k_jk^j} = - {\mathcal {B}^{\mathrm {n}}_{0 0}\over \mathcal {B}^\mathrm {n}} = 1 + 2 { n_\mathrm {n}^2 \over \mathcal {B}^\mathrm {n}} \frac{d \mathcal {B}^{\mathrm {n}}}{d n^2_{\mathrm {n}}} = 1 + \frac{d \ln \mathcal {B}^{\mathrm {n}}}{d \ln n_\mathrm {n}} . \end{aligned}$$

We have used the fact that we are working in a locally flat spacetime, so that \(g_{a b} = \eta _{a b}\). If we have done this right, then we should recover the expression for the speed of sound \(C_s^2\) from before, cf. Eq. (7.34). To see that this is the case, recall that \(\mu _\mathrm {n}= n_\mathrm {n}\mathcal {B}^\mathrm {n}\) and work out the required derivative. That is

$$\begin{aligned} C_s^2 = \sigma ^2 = {n\over \mu } {d \mu \over dn} = {dp \over d\varepsilon }. \end{aligned}$$

In order to ensure that the behaviour of the system is “physical”, we need to consider two conditions:

  1. absolute stability, \(\sigma ^2 \ge 0\), and

  2. causality, \(C^2_s \le 1\).

These conditions provide constraints which can be imposed on, say, parameters in equation of state models, the net effect being absolute limits on the possible forms for the master function \(\varLambda \). As an example, take the result from Eq. (8.32) and impose the two constraints to find that

$$\begin{aligned} 0 \le 1 + \frac{d \ln \mathcal {B}^{\mathrm {n}}}{d \ln n_\mathrm {n}} \le 1 \quad \implies \quad - 1 \le \frac{d \ln \mathcal {B}^{\mathrm {n}}}{d \ln n_\mathrm {n}} \le 0 . \end{aligned}$$

From the definition of \(\mathcal {B}^\mathrm {n}\), cf. Eq. (8.13), we have two bounds on \(\varLambda \).
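The equalities in Eq. (8.33) and the bounds in Eq. (8.34) are easy to check numerically for a given model. The Python sketch below does this for a toy polytropic-type equation of state, \(\varepsilon = M n + \kappa n^\varGamma /(\varGamma -1)\). This model, its parameter values and all function names are hypothetical choices made for illustration; they are not taken from the text. Derivatives are estimated with central finite differences.

```python
import math

# Hypothetical toy equation of state (an assumption of this sketch):
# eps(n) = M*n + KAPPA*n**GAMMA/(GAMMA - 1)
M, KAPPA, GAMMA = 1.0, 0.1, 2.0


def energy(n):
    return M * n + KAPPA * n**GAMMA / (GAMMA - 1.0)


def mu(n):
    """Chemical potential, d(eps)/dn, for the toy model."""
    return M + KAPPA * GAMMA * n**(GAMMA - 1.0) / (GAMMA - 1.0)


def pressure(n):
    """Euler relation p = n*mu - eps."""
    return n * mu(n) - energy(n)


def B(n):
    """Coefficient relating momentum and flux, mu = n*B."""
    return mu(n) / n


def dlnB_dlnn(n, h=1e-6):
    """Central finite-difference estimate of d ln B / d ln n."""
    up, down = n * (1.0 + h), n * (1.0 - h)
    return (math.log(B(up)) - math.log(B(down))) / (math.log(up) - math.log(down))


def sound_speed_sq_from_B(n):
    """sigma^2 = 1 + d ln B / d ln n, cf. Eq. (8.32)."""
    return 1.0 + dlnB_dlnn(n)


def sound_speed_sq_from_pressure(n, h=1e-6):
    """C_s^2 = dp/d(eps), cf. Eq. (8.33), by finite differences."""
    up, down = n * (1.0 + h), n * (1.0 - h)
    return (pressure(up) - pressure(down)) / (energy(up) - energy(down))


def is_stable_and_causal(n):
    """Bounds of Eq. (8.34): -1 <= d ln B / d ln n <= 0."""
    return -1.0 <= dlnB_dlnn(n) <= 0.0
```

For this toy model the two routes to the sound speed agree to within the finite-difference error, and `is_stable_and_causal(1.0)` returns `True`. Other parameter choices may of course violate the bounds.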

Even with the aid of the constraint from Eq. (8.34), the mode frequency solution in Eq. (8.32) is obviously less transparent than the simple statement of the speed of sound as the variation of the pressure with changing density. However, as we will establish, the formalism we are developing readily deals with much more complex situations (such as multiple sound speeds and so-called “two-stream” instabilities). The main reason is that the fluxes enter the formalism on equal footing as four-vectors, whereas starting with energy density typically requires the introduction of an ad-hoc reference frame (e.g., the \(U^a\) from Sect. 5), in order to define what the energy density is, and any independent fluid motion (like heat flow) is then defined as a three-velocity with respect to this frame.

As a further example, let us consider the case when there are two constituents with densities \(n_\mathrm {n}\) and \(n_\mathrm {s}\), two conserved density currents \(n^a_\mathrm {n}\) and \(n^a_\mathrm {s}\), two chemical potential covectors \(\mu ^\mathrm {n}_a\) and \(\mu ^\mathrm {s}_a\), but still only one four-velocity \(u^a\). (We are primarily thinking about matter and entropy, as before, but it could be any two individually conserved components which move together.) The matter Lagrangian \(\varLambda \) may now depend on both \(n^2_{\mathrm {n}}\) and \(n^2_{\mathrm {s}}\) meaning that

$$\begin{aligned} \delta \mu ^{\mathrm {x}}_a = \mathcal {B}^{{\mathrm {x}}}_{a b} \delta n^b_{\mathrm {x}}+ \mathcal{X}^{{\mathrm {x}}{\mathrm {y}}}_{a b} \delta n^b_{\mathrm {y}}, \quad {\mathrm {y}}\ne {\mathrm {x}}, \end{aligned}$$

where we recall that summation is not implied for repeated constituent indices, and we have defined

$$\begin{aligned} \mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a b}= - \mathcal {C}_{cc}\sqrt{\mathcal {B}^{\mathrm {x}}\mathcal {B}^{\mathrm {y}}} u^{\mathrm {x}}_a u^{\mathrm {y}}_b , \end{aligned}$$

(with \(u^{\mathrm {x}}_a = u^{\mathrm {y}}_a = u_a\) in this specific example) where

$$\begin{aligned} \mathcal {C}_{cc}^2 \equiv \frac{1}{\mathcal {B}^{\mathrm {x}}\mathcal {B}^{\mathrm {y}}} \left( 2 n_{\mathrm {x}}n_{\mathrm {y}}\frac{\partial \mathcal {B}^{\mathrm {x}}}{\partial n_{\mathrm {y}}^2}\right) ^2 . \end{aligned}$$

The \(\mathcal {B}^{\mathrm {n}}_{a b}\) coefficient is defined as before and \(\mathcal {B}^{\mathrm {s}}_{a b}\) is given by the same expression (Eq. (8.30)) with each \(\mathrm {n}\) replaced by \(\mathrm {s}\). The \(\mathcal {C}_{cc}\) coefficient represents a true multi-constituent effect, which depends on the composition (e.g., the entropy per baryon \(x_\mathrm {s}= n_\mathrm {s}/n_\mathrm {n}\) used in the discussion surrounding Eq. (8.10)).

The fact that \(n^a_\mathrm {s}\) is parallel to \(n^a_\mathrm {n}\) implies that it is only the magnitude of the entropy density current that is independent. One can show that the condition of transverse propagation, as applied to both currents, implies

$$\begin{aligned} \delta n^a_\mathrm {s}= x_\mathrm {s}\delta n^a_\mathrm {n}. \end{aligned}$$

It is worth taking a closer look at this condition. First of all, the time component leads to

$$\begin{aligned} \delta n_\mathrm {s}= x_\mathrm {s}\delta n_\mathrm {n}= \frac{n_\mathrm {s}}{n_\mathrm {n}} \delta n_\mathrm {n}\qquad \Longrightarrow \qquad \delta x_\mathrm {s}= 0 . \end{aligned}$$

That is, the entropy per particle is constant—the perturbations are adiabatic. Meanwhile, it is easy to show that the spatial part of (8.38) is trivial, since the two components move together.

Now, we proceed as in the previous example. Noting that the equation of motion is

$$\begin{aligned} \delta f^\mathrm {n}_a + \delta f^\mathrm {s}_a = 0, \end{aligned}$$

we find

$$\begin{aligned} \left[ \left( \mathcal {B}^{\mathrm {n}} + x^2_\mathrm {s}\mathcal {B}^{\mathrm {s}}\right) \sigma ^2 - \left( \mathcal {B}^{\mathrm {n}} c^2_\mathrm {n}+ x^2_\mathrm {s}\mathcal {B}^{\mathrm {s}} c^2_\mathrm {s}- 2 x_{\mathrm {s}} \mathcal{X}^{\mathrm {n}\mathrm {s}}_{0 0} \right) \right] k_i \delta n^i_\mathrm {n}= 0 , \end{aligned}$$

where, inspired by the result for the speed of sound in the single component case [cf. Eq. (8.32)], we have defined

$$\begin{aligned} c^2_{\mathrm {x}}\equiv 1 + \frac{\partial \ln \mathcal {B}^{{\mathrm {x}}}}{\partial \ln n_{\mathrm {x}}} . \end{aligned}$$

We find that the speed of sound is given by

$$\begin{aligned} C_s^2 = \sigma ^2 = \frac{\mathcal {B}^{\mathrm {n}} c^2_\mathrm {n}+ x^2_\mathrm {s}\mathcal {B}^{\mathrm {s}} c^2_\mathrm {s}- 2 x_\mathrm {s}\mathcal{X}^{\mathrm {n}\mathrm {s}}_{0 0}}{\mathcal {B}^{\mathrm {n}} + x^2_\mathrm {s}\mathcal {B}^{\mathrm {s}}} . \end{aligned}$$

As this result looks quite complicated, let us see if we can manipulate it to make it more intuitive. The obvious starting point is to replace the abstract coefficients we have introduced with the underlying thermodynamical quantities, i.e., use \(\mu _\mathrm {n}= n_\mathrm {n}\mathcal {B}^\mathrm {n}= \mu \) and \(\mu _\mathrm {s}= n_\mathrm {s}\mathcal {B}^\mathrm {s}= T\) (writing \(n = n_\mathrm {n}\) and \(s = n_\mathrm {s}\) for brevity), leading to

$$\begin{aligned} c_\mathrm {n}^2 = {n\over \mu } \left( {\partial \mu \over \partial n} \right) _s \qquad \text{ and } \qquad c_\mathrm {s}^2 = {s\over T} \left( {\partial T \over \partial s} \right) _n . \end{aligned}$$

We also see that

$$\begin{aligned} \mathcal{X}^{\mathrm {n}\mathrm {s}}_{0 0} = - \left( {\partial \mu \over \partial s} \right) _n = - \left( {\partial T \over \partial n} \right) _s , \end{aligned}$$

where the identity follows since we have mixed partial derivatives (both \(\mu \) and T arise as derivatives of \(\varepsilon \)). Given these results, we find that

$$\begin{aligned} C_s^2 = {1 \over p+\varepsilon } \left[ n^2 \left( {\partial \mu \over \partial n} \right) _s + 2sn \left( {\partial T \over \partial n} \right) _s + s^2 \left( {\partial T \over \partial s} \right) _n \right] , \end{aligned}$$

which already looks a little bit more transparent. However, we can also use the fact that \(dp = nd\mu +sdT\) to rewrite this as

$$\begin{aligned} C_s^2 = {1 \over p+\varepsilon } \left[ n \left( {\partial p \over \partial n} \right) _s + s \left( {\partial p \over \partial s} \right) _n \right] . \end{aligned}$$
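
The step from the previous expression can be checked in a few lines. As a sketch (taking \(\mu \) and \(T\) as functions of \(n\) and \(s\), as above), read off the partial derivatives of \(p\) from \(dp = nd\mu +sdT\) and use the symmetry of the mixed derivatives, \(\left( \partial \mu / \partial s \right) _n = \left( \partial T / \partial n \right) _s\):

$$\begin{aligned} n \left( {\partial p \over \partial n} \right) _s + s \left( {\partial p \over \partial s} \right) _n= & {} n \left[ n \left( {\partial \mu \over \partial n} \right) _s + s \left( {\partial T \over \partial n} \right) _s \right] + s \left[ n \left( {\partial \mu \over \partial s} \right) _n + s \left( {\partial T \over \partial s} \right) _n \right] \nonumber \\= & {} n^2 \left( {\partial \mu \over \partial n} \right) _s + 2sn \left( {\partial T \over \partial n} \right) _s + s^2 \left( {\partial T \over \partial s} \right) _n , \end{aligned}$$

which reproduces the bracket in the preceding form of \(C_s^2\).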

Finally, let us ask what happens if we work with \(x_\mathrm {s}\) instead of s.

To do this, we need

$$\begin{aligned} dp= & {} \left( {\partial p \over \partial n} \right) _{x_\mathrm {s}} dn + \left( {\partial p \over \partial x_\mathrm {s}} \right) _n d x_\mathrm {s}\nonumber \\= & {} \left[ \left( {\partial p \over \partial n} \right) _{x_\mathrm {s}} - {s \over n^2} \left( {\partial p \over \partial x_\mathrm {s}} \right) _{n} \right] dn + {1 \over n} \left( {\partial p \over \partial x_\mathrm {s}} \right) _n ds . \end{aligned}$$

From this we see that

$$\begin{aligned} \left( {\partial p \over \partial n} \right) _{x_\mathrm {s}} = \left( {\partial p \over \partial n} \right) _s + {s \over n} \left( {\partial p \over \partial s} \right) _n \end{aligned}$$

and, once we combine this with the fact that, when \(x_\mathrm {s}\) is kept constant, we have

$$\begin{aligned} d\varepsilon = {p+\varepsilon \over n} dn , \end{aligned}$$

we get the expected result for the adiabatic sound speed:

$$\begin{aligned} C_s^2 = \left( {\partial p \over \partial \varepsilon } \right) _{x_\mathrm {s}} . \end{aligned}$$
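
The final step simply combines the three preceding relations, and can be sketched as follows. The bracket in the expression for \(C_s^2\) is \(n\) times the derivative of \(p\) at fixed \(x_\mathrm {s}\), while the constant-\(x_\mathrm {s}\) energy variation gives \(\left( \partial \varepsilon / \partial n \right) _{x_\mathrm {s}} = (p+\varepsilon )/n\), so that

$$\begin{aligned} C_s^2 = {n \over p+\varepsilon } \left( {\partial p \over \partial n} \right) _{x_\mathrm {s}} = \left( {\partial p \over \partial n} \right) _{x_\mathrm {s}} \left( {\partial \varepsilon \over \partial n} \right) ^{-1}_{x_\mathrm {s}} = \left( {\partial p \over \partial \varepsilon } \right) _{x_\mathrm {s}} . \end{aligned}$$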

Multi-component cosmology

The modern description of cosmology draws on ideas from fluid dynamics. In the simplest picture—after averaging up to a suitably large scale—planets, stars and galaxies are treated as collisionless “dust”, represented by the simple stress-energy tensor

$$\begin{aligned} T^{ab} = \varepsilon u^a u^b . \end{aligned}$$

This introduces a natural flow of cosmological time (associated with the proper time linked to \(u^a\)) and the associated fibration of spacetime (Barrow et al. 2007). The focus on the “fluid observer” worldlines means that the model is closely related to our description of fluid dynamics, and it is fairly straightforward to build more complex (read: realistic) models by, for example, adding the cosmological constant to the Einstein equations (or viewing it as a “dark energy” contribution with negative pressure, \(p=-\varepsilon \)) or accounting for a more complicated description of the matter content in the Universe. The matter description relies on ideas we have already introduced. In particular, the cosmological principle states that the Universe is homogeneous and isotropic, suggesting that the relevant matter Lagrangian should be built from scalars. Given the increased quality of cosmological observations, this fundamental principle is now becoming testable, and (perhaps) questionable.

The most pressing issues that arise in cosmology relate to the simple fact that we do not have a good handle on the nature of dark components that appear to dominate the “standard model” (Peter and Uzan 2009). A number of alternative models—including alternatives to Einstein’s relativistic gravity—have been suggested, but few of these are compelling. The treatment of the different matter components, in particular, tends to remain based on the notion of coupled perfect fluids or scalar fields. If we are to understand the bigger picture, we may need to review this aspect, especially if we want to be able to consider issues like heat flow (Modak 1984; Triginer and Pavón 1995; Andersson and Lopez-Monsalvo 2011), dissipative mechanisms (Weinberg 1971; Patel and Koppar 1991; Velten and Schwarz 2011), Bose–Einstein condensation of dark matter (Sikivie and Yang 2009; Harko 2011) and possibly many others. Many issues are similar to ones that arise in more realistic models of neutron star astrophysics.

A particularly interesting aspect, given the focus of this review, may be the suggestion that there could have been phases during which the Universe would have effectively been anisotropic (see Tsagas et al. 2008 for a useful review), with different components evolving “independently” (Comer et al. 2012a, b). For the most part, models considered in the current literature, including initially anisotropic geometries, describe the matter content in terms of either effectively many component single fluid models (Gromov et al. 2004), or a single component (Gümrükçüoglu et al. 2007; Pitrou et al. 2008; Kim and Minamitsuji 2010); although an evolution towards isotropy is expected in such settings, as required to end up with a realistic (read: in agreement with observational data) model (Dechant et al. 2009). Having said that, interesting new consequences may be inferred by enhancing an initially vanishingly small non-Gaussian signal (Dey and Paban 2012).

Within this context, it is relevant to ask how distinct fluid flows may lead to anisotropy, with the spacetime metric taking the form of a Bianchi I solution of the Einstein equations. In this case there is a spacelike privileged vector, associated with the relative flow between two matter components. As we will soon establish, such a feature is natural in the multi-fluid context, but it can never arise in the usual multi-constituent single fluid. This point has been considered in some detail in Comer et al. (2012a, 2012b). It has been suggested (Barrow and Tsagas 2007; Adhav et al. 2011; Cataldo et al. 2011) that, since Bianchi universes—seen as averaged inhomogeneous and anisotropic spacetimes—can have effective strong energy condition violating stress-energy tensors, they could be part of a backreaction driven acceleration model.

Yet another reason for studying such cosmological models stems, perhaps surprisingly, from the observations: large-angle anomalies in the Cosmic Microwave Background (CMB) have been observed and discussed for quite some time (Schwarz et al. 2004; Copi et al. 2010; Perivolaropoulos 2011; Ma et al. 2011) and may be related to underlying Bianchi models (Pontzen and Challinor 2007; Pontzen 2009).

The “pull-back” formalism for two fluids

Having discussed the single fluid model, and how one accounts for stratification (either thermal or composition gradients), it is time to move on to the problem of modelling multi-fluid systems. We will experience for the first time novel effects due to a relative flow between two interpenetrating fluids, and the fact that there is no longer a single, preferred rest-frame. This kind of formalism is necessary, for example, for the simplest model of a neutron star, since it is generally accepted that the inner crust is permeated by an independent neutron superfluid, and the outer core is thought to contain superfluid neutrons, superconducting protons, and a highly degenerate gas of electrons. Still unknown is the number of independent fluids required for neutron stars that have deconfined quark matter in the deep core (Alford et al. 2000). The model can also be used to describe superfluid Helium and heat-conducting fluids, problems which relate to the incorporation of dissipation (see Sect. 16). We will focus on this example here, as a natural extension of the case considered in the previous section. It should be noted that, even though the particular system we concentrate on consists of only two fluids, it illustrates all new features of a general multi-fluid system. Conceptually, the greatest step is to go from one to two fluids. A generalization to a system with further degrees of freedom is straightforward.

In keeping with the previous section, we will rely on the use of constituent indices, which throughout this section will range over \({\mathrm {x}},{\mathrm {y}}= \mathrm {n},\mathrm {s}\). In the example we consider, the two fluids represent the particles (\(\mathrm {n}\)) and the entropy (\(\mathrm {s}\)). Once again, the number density four-currents, to be denoted \(n^a_{\mathrm {x}}\), are taken to be separately conserved, meaning that

$$\begin{aligned} \nabla _a n^a_{\mathrm {x}}= 0 . \end{aligned}$$

As before, we use the dual formulation, i.e., introduce the three-forms

$$\begin{aligned} n^{\mathrm {x}}_{a b c} = \epsilon _{d a b c } n^d_{\mathrm {x}}, \qquad n^a_{\mathrm {x}}= \frac{1}{3!} \epsilon ^{b c d a} n^{\mathrm {x}}_{b c d}. \end{aligned}$$

As before, the conservation rules are equivalent to the individual three-forms being closed (the argument proceeds in exactly the same way); i.e.

$$\begin{aligned} \nabla _{[a} n^{\mathrm {x}}_{b c d]} = 0. \end{aligned}$$
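
The equivalence can be verified directly. The following sketch assumes the standard Lorentzian contraction identity \(\epsilon ^{a b c d} \epsilon _{e b c d} = - 3! \, \delta ^a{}_e\) (a convention that is consistent with the pair of dual relations given above) and uses the fact that \(\epsilon _{a b c d}\) is covariantly constant:

$$\begin{aligned} \epsilon ^{a b c d} \nabla _{a} n^{\mathrm {x}}_{b c d} = \epsilon ^{a b c d} \epsilon _{e b c d} \nabla _a n^e_{\mathrm {x}}= - 3! \, \nabla _a n^a_{\mathrm {x}}. \end{aligned}$$

The left-hand side involves only the totally antisymmetric part \(\nabla _{[a} n^{\mathrm {x}}_{b c d]}\), which in four dimensions has a single independent component, so it vanishes if and only if the current is conserved.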

However, we need a formulation whereby such conservation obtains automatically, at least in principle.

We make this happen by introducing the three-dimensional matter space, the difference being that we now need two such spaces. These will be labelled by coordinates \(X^A_\mathrm {x}\), and we recall that \(A,B,C,\mathrm {etc.} = 1,2,3\). The idea is illustrated in Fig. 12, which indicates the important facts that (i) a given point in space can be intersected by each fluid’s worldline and (ii) the individual worldlines are not necessarily parallel at the intersection, i.e., the independent fluids are interpenetrating and can exhibit a relative flow with respect to each other. Although we have not indicated this in Fig. 12 (in order to keep the figure as uncluttered as possible) attached to each worldline of a given constituent will be a fixed number of particles \(N^\mathrm {x}_1\), \(N^\mathrm {x}_2\), etc. (cf. Fig. 10). For the same reason, we have also not labelled (as in Fig. 10) the “pull-backs” (represented by the arrows) from the matter spaces to spacetime.

Fig. 12

The pull-back from a point in the \({\mathrm {x}}^{ th }\)-constituent’s three-dimensional matter space (on the left) to the corresponding “fluid-particle” worldline in spacetime (on the right). The points in matter space are labelled by the coordinates \(\{X^1_{\mathrm {x}},X^2_{\mathrm {x}},X^3_{\mathrm {x}}\}\), and the constituent index \({\mathrm {x}}= \mathrm {n},\mathrm {s}\). There exist as many matter spaces as there are dynamically independent fluids, which for this case means two

By “pushing forward” each constituent’s three-form onto its respective matter space we can once again construct three-forms that are automatically closed on spacetime, i.e., let

$$\begin{aligned} n^{\mathrm {x}}_{a b c} = \psi _{{\mathrm {x}}a}^A \psi _{{\mathrm {x}}b}^B \psi _{{\mathrm {x}}c}^C N^{\mathrm {x}}_{A B C} , \end{aligned}$$

where

$$\begin{aligned} \psi _{{\mathrm {x}}a}^A = {\partial X_{\mathrm {x}}^A \over \partial x^a } , \end{aligned}$$

and \(N^{\mathrm {x}}_{A B C}\) is completely antisymmetric in its indices and is a function only of the \(X^A_{\mathrm {x}}\). Using the same reasoning as in the single fluid case, the construction produces three-forms that are automatically closed, i.e., they satisfy Eq. (9.3) identically. If we let the scalar fields \(X^A_{\mathrm {x}}\) (as functions on spacetime) be the fundamental variables, they yield a representation for each particle number density current that is automatically conserved. The variations of the three-forms can now be derived by varying them with respect to the \(X^A_{\mathrm {x}}\).
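
To see why closure is automatic, the argument can be sketched explicitly. Because the three-form is totally antisymmetric, the covariant derivative in the closure condition may be replaced by a partial derivative, and \(N^{\mathrm {x}}_{A B C}\) depends on the spacetime coordinates only through the \(X^A_{\mathrm {x}}\). Then

$$\begin{aligned} \partial _{[a} n^{\mathrm {x}}_{b c d]} = \partial _{[a} \left( \psi _{{\mathrm {x}}b}^A \psi _{{\mathrm {x}}c}^B \psi _{{\mathrm {x}}d]}^C \right) N^{\mathrm {x}}_{A B C} + \psi _{{\mathrm {x}}[a}^D \psi _{{\mathrm {x}}b}^A \psi _{{\mathrm {x}}c}^B \psi _{{\mathrm {x}}d]}^C \frac{\partial N^{\mathrm {x}}_{A B C}}{\partial X^D_{\mathrm {x}}} = 0 . \end{aligned}$$

The first term vanishes because \(\partial _{[a} \psi ^A_{{\mathrm {x}}b]} = \partial _{[a} \partial _{b]} X^A_{\mathrm {x}}= 0\). The second vanishes because the antisymmetrization over the four spacetime indices forces antisymmetry in the four matter-space indices \(D, A, B, C\), each of which takes only three values.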

The Lagrangian displacements on spacetime for each fluid, to be denoted \(\xi ^a_{\mathrm {x}}\), are related to the variations \(\delta X^A_{\mathrm {x}}\) via

$$\begin{aligned} \varDelta _{\mathrm {x}}X^A = \delta X^A_{\mathrm {x}}+\xi ^a_{\mathrm {x}}\partial _a X^A_{\mathrm {x}}= \delta X^A_{\mathrm {x}}+\xi ^a_{\mathrm {x}}\psi ^A_{{\mathrm {x}}a}= 0 . \end{aligned}$$

In general, the various single-fluid equations we have considered are easily extended to the two-fluid case, except that each displacement and four-current will now be associated with a constituent index, using the decomposition

$$\begin{aligned} n^a_{\mathrm {x}}= n_{\mathrm {x}}u^a_{\mathrm {x}}, \qquad u^{\mathrm {x}}_a u^a_{\mathrm {x}}= - 1 . \end{aligned}$$

Associated with each constituent’s Lagrangian displacement is its own Lagrangian variation. As above, these are naturally defined to be

$$\begin{aligned} \varDelta _{\mathrm {x}}\equiv \delta + \mathcal{L}_{\xi _{\mathrm {x}}}, \end{aligned}$$

so that it follows that

$$\begin{aligned} \varDelta _\mathrm {x}n^{\mathrm {x}}_{a b c} = 0, \end{aligned}$$

as expected for the pull-back construction. Likewise, two-fluid analogues of Eqs. (6.40)–(6.42) exist which take the same form except that the constituent index is attached. However, in contrast to the ordinary fluid case, there are more options to consider. For instance, we could also look at the Lagrangian variation of the first constituent with respect to the second constituent’s flow, i.e., \(\varDelta _\mathrm {s}n_\mathrm {n}\), or the other way around, i.e., \(\varDelta _\mathrm {n}n_\mathrm {s}\). The Newtonian analogues of these Lagrangian displacements were essential to an analysis of instabilities in rotating superfluid neutron stars (Andersson et al. 2004a).

We are now in a position to construct an action principle that yields the equations of motion and the stress-energy tensor. Again, the central quantity is the matter Lagrangian \(\varLambda \), which is now a function of all the different scalars that can be formed from the \(n^a_{\mathrm {x}}\), i.e., the scalars \(n_{\mathrm {x}}\) together with

$$\begin{aligned} n^2_{{\mathrm {x}}{\mathrm {y}}} = n^2_{{\mathrm {y}}{\mathrm {x}}} = - g_{a b} n^a_\mathrm {x}n^b_\mathrm {y}. \end{aligned}$$

In the limit where all the currents are parallel, i.e., the fluids are comoving, \(- \varLambda \) corresponds (as before) to the local thermodynamic energy density. In the action principle, \(\varLambda \) is the Lagrangian density for the fluids.


An unconstrained variation of \(\varLambda \) with respect to the independent vectors \(n^a_{\mathrm {x}}\) and the metric \(g_{a b}\) takes the form

$$\begin{aligned} \delta \varLambda = \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} \mu ^{\mathrm {x}}_a \, \delta n^a_{\mathrm {x}}+ \frac{1}{2} \left( \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} n^a_{\mathrm {x}}\mu ^b_{\mathrm {x}}\right) \delta g_{a b}, \end{aligned}$$

where


$$\begin{aligned} \mu ^{\mathrm {x}}_a= & {} \mathcal {B}^{{\mathrm {x}}} n_a^{\mathrm {x}}+ \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}} n_a^{\mathrm {y}}, \end{aligned}$$
$$\begin{aligned} \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}= & {} \mathcal {A}^{{\mathrm {y}}{\mathrm {x}}} = - \frac{\partial \varLambda }{\partial n^2_{{\mathrm {x}}{\mathrm {y}}}}, \qquad \mathrm {for\ } {\mathrm {x}}\ne {\mathrm {y}}. \end{aligned}$$

The momentum covectors \(\mu ^{\mathrm {x}}_a\) are each dynamically, and thermodynamically, conjugate to their respective number density currents \(n^a_{\mathrm {x}}\), and their magnitudes are the chemical potentials. Here we note something new: the \(\mathcal {A}^{\mathrm {x}\mathrm {y}}\) coefficient represents the fact that each fluid momentum \(\mu ^{\mathrm {x}}_a\) may, in general, be given by a linear combination of the individual currents \(n^a_{\mathrm {x}}\). That is, the current and momentum for a particular fluid do not have to be parallel. This is known as the entrainment effect. We have chosen to represent it by the letter \(\mathcal {A}\) for historical reasons. When Carter first developed his formalism he opted for this notation, referring to the “anomaly” of having misaligned currents and momenta. It has since been realized that the entrainment is a key feature of most multi-fluid systems and it would, in fact, be anomalous to leave it out!
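
The origin of the two terms in each momentum can be made explicit. As a sketch, assume (as in the single-fluid case) that \(\mathcal {B}^{\mathrm {x}}= - 2 \, \partial \varLambda / \partial n^2_{\mathrm {x}}\) with \(n^2_{\mathrm {x}}= - g_{a b} n^a_{\mathrm {x}}n^b_{\mathrm {x}}\), and vary the scalars at fixed metric, so that \(\delta n^2_{\mathrm {x}}= - 2 n^{\mathrm {x}}_a \delta n^a_{\mathrm {x}}\) and \(\delta n^2_{\mathrm {n}\mathrm {s}} = - n^\mathrm {s}_a \delta n^a_\mathrm {n}- n^\mathrm {n}_a \delta n^a_\mathrm {s}\). The chain rule then gives

$$\begin{aligned} \delta \varLambda= & {} \frac{\partial \varLambda }{\partial n^2_\mathrm {n}} \delta n^2_\mathrm {n}+ \frac{\partial \varLambda }{\partial n^2_\mathrm {s}} \delta n^2_\mathrm {s}+ \frac{\partial \varLambda }{\partial n^2_{\mathrm {n}\mathrm {s}}} \delta n^2_{\mathrm {n}\mathrm {s}} \nonumber \\= & {} \left( \mathcal {B}^\mathrm {n}n^\mathrm {n}_a + \mathcal {A}^{\mathrm {n}\mathrm {s}} n^\mathrm {s}_a \right) \delta n^a_\mathrm {n}+ \left( \mathcal {B}^\mathrm {s}n^\mathrm {s}_a + \mathcal {A}^{\mathrm {n}\mathrm {s}} n^\mathrm {n}_a \right) \delta n^a_\mathrm {s}, \end{aligned}$$

in agreement with the expression for \(\mu ^{\mathrm {x}}_a\) above. Writing \(\mu ^\mathrm {n}_a = \mathcal {B}^\mathrm {n}n_\mathrm {n}u^\mathrm {n}_a + \mathcal {A}^{\mathrm {n}\mathrm {s}} n_\mathrm {s}u^\mathrm {s}_a\) makes it clear that the momentum is parallel to its own current only if \(\mathcal {A}^{\mathrm {n}\mathrm {s}} = 0\) or the two four-velocities coincide.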

In the general case, the momentum of one constituent carries along some mass current of the other constituents. The entrainment only vanishes in the special case where \(\varLambda \) is independent of \(n^2_{{\mathrm {x}}{\mathrm {y}}}\) (\({\mathrm {x}}\ne {\mathrm {y}}\)) because then we obviously have \(\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}} = 0\). Entrainment is an observable effect in laboratory superfluids (Putterman 1974; Tilley and Tilley 1990) (e.g., via flow modifications in superfluid \({}^4{\mathrm {He}}\) and mixtures of superfluid \({}^3{\mathrm {He}}\) and \({}^4{\mathrm {He}}\)). In the case of neutron stars, entrainment (in this case related to the mobility of the superfluid neutrons that permeate the neutron star crust) plays a key role in the discussion of pulsar glitches (Radhakrishnan and Manchester 1969; Reichley and Downs 1969). As we will see later (in Sect. 15), these “anomalous” terms are necessary for causally well-behaved heat conduction in relativistic fluids, and by extension necessary for building well-behaved relativistic equations that incorporate dissipation (see also Andersson and Comer 2010, 2011).

In terms of the constrained Lagrangian displacements, a variation of \(\varLambda \) now yields

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right)= & {} \frac{1}{2} \sqrt{- g} \left( \varPsi g^{a b} + \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} n^a_{\mathrm {x}}\mu ^b_{\mathrm {x}}\right) \delta g_{a b} - \sqrt{- g} \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} f^{\mathrm {x}}_a \xi ^a_{\mathrm {x}}\nonumber \\&+ \nabla _a \left( \frac{1}{2} \sqrt{-g} \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} \mu ^{a b c}_{\mathrm {x}}n^{\mathrm {x}}_{b c d} \xi ^d_{\mathrm {x}}\right) , \end{aligned}$$

where \(f^{\mathrm {x}}_a\) is as defined in Eq. (8.20) except that the individual velocities are no longer parallel. The generalized pressure \(\varPsi \) is now

$$\begin{aligned} \varPsi = \varLambda - \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} n^a_{\mathrm {x}}\mu ^{\mathrm {x}}_a. \end{aligned}$$

At this point we return to the view that \(n^a_\mathrm {n}\) and \(n^a_\mathrm {s}\) are the fundamental variables. Because the \(\xi ^a_{\mathrm {x}}\) are independent variations, the equations of motion consist of the two original conservation conditions from Eq. (6.8), plus two Euler-type equations

$$\begin{aligned} f^{\mathrm {x}}_a = n_{\mathrm {x}}^b \omega ^{\mathrm {x}}_{ba} = 0 , \end{aligned}$$

and of course the Einstein equations (obtained exactly as before by adding in the Einstein–Hilbert term, see Sect. 4.4). We also find that the stress-energy tensor is

$$\begin{aligned} T^a{}_b = \varPsi \delta ^a{}_b + \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} n^a_{\mathrm {x}}\mu ^{\mathrm {x}}_b. \end{aligned}$$

When the complete set of field equations is satisfied then it is automatically true that \(\nabla _b T^b{}_a = 0\). One can also verify that \(T_{a b}\) is symmetric. The momentum form \(\mu ^{a b c}_{\mathrm {x}}\) entering the boundary term is the natural extension of Eq. (8.15) to this two-fluid case.
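
The symmetry of \(T_{a b}\) follows directly from the form of the momenta. As a brief check, lowering the index and inserting the expression for \(\mu ^{\mathrm {x}}_a\) in terms of the currents gives

$$\begin{aligned} T_{a b} = \varPsi g_{a b} + \mathcal {B}^\mathrm {n}n^\mathrm {n}_a n^\mathrm {n}_b + \mathcal {B}^\mathrm {s}n^\mathrm {s}_a n^\mathrm {s}_b + \mathcal {A}^{\mathrm {n}\mathrm {s}} \left( n^\mathrm {n}_a n^\mathrm {s}_b + n^\mathrm {s}_a n^\mathrm {n}_b \right) , \end{aligned}$$

which is manifestly symmetric. The entrainment contribution is symmetric precisely because \(\mathcal {A}^{\mathrm {n}\mathrm {s}} = \mathcal {A}^{\mathrm {s}\mathrm {n}}\).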

It must be noted that Eq. (9.16) is significantly different from the multi-constituent version from Eq. (8.22). This is true even if one is solving for a static and spherically symmetric configuration, where the fluid four-velocities would all necessarily be parallel. Simply put, Eq. (9.16) represents two independent equations. If one takes entropy as an independent fluid, then the static and spherically symmetric solutions will exhibit thermal equilibrium (Comer et al. 1999). This explains, for instance, why one must specify an extra condition (e.g., convective stability; Weinberg 1972) to solve for a double-constituent star with only one four-velocity.

Waves in multi-fluid systems

Crucial to the understanding of black holes is the effect of spacetime curvature on the light-cone structure, that is, the null vectors that emanate from each spacetime point. Crucial to the propagation of massless fields (and gravitational waves!) is the light-cone structure. In the case of fluids, it is both the speed of light and the speed (or speeds) of sound that dictate how waves propagate through the matter. We have already used a local analysis of plane-wave propagation to derive the speed of sound for both the single-fluid case (in Sect. 7.2) and the two-constituent single-fluid case (in Sect. 8.2). We will now repeat the analysis for a general two-fluid system, using the same assumptions as before (see Carter 1989a for a more rigorous derivation). However, we will provide an important extension by allowing a relative flow between the two fluids in the background/equilibrium state. While this extension is straightforward, we will see that the final results are quite astonishing, demonstrating the existence of a two-stream instability.

Two-fluid case

As a reminder, we first note that the analysis is, in principle, performed in a small region (where the meaning of “small” is dictated by the particular system being studied) and we assume that the configuration of the matter with no waves present is locally isotropic, homogeneous, and static. Thus, for the background, \(n^a_{\mathrm {x}}= [n_{\mathrm {x}},0,0,0]\) and the vorticity \(\omega ^{\mathrm {x}}_{a b}\) vanishes. The linearized fluxes take the plane-wave form given in Eq. (8.27).

The two-fluid problem is qualitatively different from the previous cases, since there are now two independent currents. This impacts on the analysis in two crucial ways: (i) The Lagrangian \(\varLambda \) depends on \(n^2_{\mathrm {n}}\), \(n^2_{\mathrm {s}}\), and \(n^2_{\mathrm {n}\mathrm {s}} = n^2_{\mathrm {s}\mathrm {n}}\) (i.e., entrainment is present), and (ii) the equations of motion, after taking into account the transverse flow condition of Eq. (8.28) for both fluids, are doubled to \(\delta f^\mathrm {n}_a = 0 = \delta f^\mathrm {s}_a\). The key point is that there can be two simultaneous wave propagations, with each distinct mode having its own sound speed.

Another ramification of having two fluids is that the variation \(\delta \mu ^{\mathrm {x}}_a\) has more terms than in the previous, single-fluid analysis. There are individual fluid bulk effects, cross-constituent effects due to coupling between the fluids, and entrainment. We can isolate these various effects by writing \(\delta \mu ^{\mathrm {x}}_a\) in the form

$$\begin{aligned} \delta \mu _a^{\mathrm {x}}= \left( \mathcal {B}^{\mathrm {x}}_{a b}+ {\mathcal {A}}^{\mathrm {x}}_{a b}\right) \delta n^b_{\mathrm {x}}+ \left( \mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a b}+ {\mathcal {A}}^{{\mathrm {x}}{\mathrm {y}}}_{a b}\right) \delta n^b_{\mathrm {y}}. \end{aligned}$$

The bulk effects are contained in

$$\begin{aligned} \mathcal {B}^{\mathrm {x}}_{a b}= \mathcal {B}^{\mathrm {x}}\left( \perp ^{\mathrm {x}}_{a b} - c^2_{\mathrm {x}}u^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) , \end{aligned}$$

which is just the two-fluid extension of Eq. (8.30) [with \(\mathrm {n}\) replaced by \({\mathrm {x}}\) and using Eq. (8.42)]. The cross-constituent coupling enters via \(\mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a b}\) [defined already in Eq. (8.36)]. Finally, entrainment enters through the coefficients \({\mathcal {A}}^{\mathrm {x}}_{a b}\) and \({\mathcal {A}}^{{\mathrm {x}}{\mathrm {y}}}_{a b}\) given by, respectively,

$$\begin{aligned} {\mathcal {A}}^{\mathrm {x}}_{a b}= & {} - \left[ \mathcal {B}^{\mathrm {x}}_{, {\mathrm {x}}{\mathrm {y}}} \left( u^{\mathrm {x}}_a u^{\mathrm {y}}_b + u^{\mathrm {x}}_b u^{\mathrm {y}}_a \right) + \frac{n_{\mathrm {y}}}{n_{\mathrm {x}}} \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{, {\mathrm {x}}{\mathrm {y}}} u^{\mathrm {y}}_a u^{\mathrm {y}}_b \right] , \end{aligned}$$
$$\begin{aligned} {\mathcal {A}}^{{\mathrm {x}}{\mathrm {y}}}_{a b}= & {} \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\perp ^{\mathrm {x}}_{a b} \nonumber \\&- \left[ \left( \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}+ \frac{n_{\mathrm {x}}}{n_{\mathrm {y}}} \mathcal {B}^{\mathrm {x}}_{, {\mathrm {x}}{\mathrm {y}}}\right) u^{\mathrm {x}}_a u^{\mathrm {x}}_b + \frac{n_{\mathrm {y}}}{n_{\mathrm {x}}} \mathcal {B}^{\mathrm {y}}_{, {\mathrm {x}}{\mathrm {y}}} u^{\mathrm {y}}_a u^{\mathrm {y}}_b + \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{, {\mathrm {x}}{\mathrm {y}}} u^{\mathrm {y}}_a u^{\mathrm {x}}_b\right] , \end{aligned}$$

where we have introduced the notation

$$\begin{aligned} \mathcal {B}^{\mathrm {x}}_{, {\mathrm {x}}{\mathrm {y}}} \equiv n_{\mathrm {x}}n_{\mathrm {y}}\frac{\partial \mathcal {B}^{\mathrm {x}}}{\partial n_{{\mathrm {x}}{\mathrm {y}}}^2} , \end{aligned}$$


$$\begin{aligned} \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{, {\mathrm {x}}{\mathrm {y}}} \equiv n_{\mathrm {x}}n_{\mathrm {y}}\frac{\partial \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}}{\partial n_{{\mathrm {x}}{\mathrm {y}}}^2} . \end{aligned}$$

The same procedure as in the previous two examples—the single fluid with one and then two constituents—leads to the dispersion relation

$$\begin{aligned}&\left( \mathcal {B}^{\mathrm {n}} \sigma ^2 - \left[ \mathcal {B}^{\mathrm {n}}_{0 0} + \mathcal {A}^{\mathrm {n}\mathrm {n}}_{0 0} \right] \right) \left( \mathcal {B}^{\mathrm {s}} \sigma ^2 - \left[ \mathcal {B}^{\mathrm {s}}_{0 0} + \mathcal {A}^{\mathrm {s}\mathrm {s}}_{0 0} \right] \right) \nonumber \\&\quad - \left( \mathcal {A}^{\mathrm {n}\mathrm {s}} \sigma ^2 - \left[ \mathcal{X}^{\mathrm {n}\mathrm {s}}_{0 0} + \mathcal {A}^{\mathrm {n}\mathrm {s}}_{0 0} \right] \right) ^2 = 0 , \end{aligned}$$

recalling from Eq. (8.32) that \(\sigma ^2 = k^2_0/k_i k^i\). This is a quadratic in \(\sigma ^2\), meaning that there are two sound speeds. This is a natural result of the doubling of fluid degrees of freedom.
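
For reference, the quadratic can be written out explicitly. The shorthand \(\beta _{\mathrm {x}}\equiv \mathcal {B}^{\mathrm {x}}_{0 0} + \mathcal {A}^{{\mathrm {x}}{\mathrm {x}}}_{0 0}\) and \(\chi \equiv \mathcal{X}^{\mathrm {n}\mathrm {s}}_{0 0} + \mathcal {A}^{\mathrm {n}\mathrm {s}}_{0 0}\) is introduced here purely for illustration and is not notation used elsewhere in the text. Expanding the products in the dispersion relation as written gives

$$\begin{aligned} \left[ \mathcal {B}^\mathrm {n}\mathcal {B}^\mathrm {s}- \left( \mathcal {A}^{\mathrm {n}\mathrm {s}} \right) ^2 \right] \sigma ^4 - \left( \mathcal {B}^\mathrm {n}\beta _\mathrm {s}+ \mathcal {B}^\mathrm {s}\beta _\mathrm {n}- 2 \mathcal {A}^{\mathrm {n}\mathrm {s}} \chi \right) \sigma ^2 + \left( \beta _\mathrm {n}\beta _\mathrm {s}- \chi ^2 \right) = 0 . \end{aligned}$$

Provided the coefficient of \(\sigma ^4\) is non-zero, the two values of \(\sigma ^2\) follow from the usual quadratic formula, and they are real when the discriminant of this quadratic is non-negative.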

To finish this discussion of local mode solutions in the two-fluid problem, it is useful to consider what constraints the simplest solutions, those with no interaction between the fluids (neither entrainment nor cross-constituent coupling), impose on the equation of state. In that case the dispersion relation becomes simply

$$\begin{aligned} (\sigma ^2 - c_\mathrm {n}^2)(\sigma ^2 - c_\mathrm {s}^2) = 0 , \end{aligned}$$

so the mode speed solutions \(\sigma _\mathrm {n}\) and \(\sigma _\mathrm {s}\) are

$$\begin{aligned} \sigma ^2_\mathrm {n}= c^2_\mathrm {n}= 1+ \frac{\partial \log \mathcal {B}^\mathrm {n}}{\partial \log n_\mathrm {n}} , \quad \sigma ^2_\mathrm {s}= c^2_\mathrm {s}= 1+ \frac{\partial \log \mathcal {B}^\mathrm {s}}{\partial \log n_\mathrm {s}} . \end{aligned}$$

The constraints of absolute stability and causality imply that \(\varLambda \) must be such that

$$\begin{aligned} - 1 \le \frac{\partial \log \mathcal {B}^\mathrm {n}}{\partial \log n} \le 0 , \quad - 1 \le \frac{\partial \log \mathcal {B}^\mathrm {s}}{\partial \log s} \le 0 . \end{aligned}$$
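
As a simple illustration of these bounds, suppose that \(\mathcal {B}^{\mathrm {x}}\propto n_{\mathrm {x}}^{-\alpha }\) with constant \(\alpha \). This power-law form is an assumption made here for the sake of the example, not a result from the text. Then

$$\begin{aligned} c^2_{\mathrm {x}}= 1 + \frac{\partial \log \mathcal {B}^{\mathrm {x}}}{\partial \log n_{\mathrm {x}}} = 1 - \alpha , \qquad 0 \le c^2_{\mathrm {x}}\le 1 \quad \Longleftrightarrow \quad 0 \le \alpha \le 1 . \end{aligned}$$

The limiting cases are familiar. With \(\mu _{\mathrm {x}}= n_{\mathrm {x}}\mathcal {B}^{\mathrm {x}}\), the value \(\alpha = 1\) corresponds to a constant chemical potential (pressureless dust, \(c^2_{\mathrm {x}}= 0\)), \(\alpha = 2/3\) gives \(\mu _{\mathrm {x}}\propto n_{\mathrm {x}}^{1/3}\) (a radiation-like fluid with \(c^2_{\mathrm {x}}= 1/3\)), and \(\alpha = 0\) is the stiff limit in which the mode travels at the speed of light.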

A general analysis which retains both entrainment and cross-constituent coupling has been performed by Samuelsson et al. (2010).

While the sound speed analysis is local, the doubling of the fluid degrees of freedom naturally carries over to the global scale relevant for the analysis of modes of oscillation of a fluid body.


The two-stream instability

Consider a system having two components between which there can be a relative flow, such as ions and electrons in a plasma, entropy and matter in a superfluid, or even the rotation of a neutron star as viewed from asymptotically flat infinity. If the relative flow reaches a speed where a mode in one of the components looks like it is going one direction with respect to that component, but the opposite direction with respect to the other component, then the mode will have a negative energy and become dynamically unstable. This kind of “two-stream” instability has a long history of investigation in the area of plasma physics (see Farley 1963; Buneman 1963). The Chandrasekhar–Friedman–Schutz (CFS) instability (Chandrasekhar 1970; Friedman and Schutz 1978a, b) (already discussed in Sect. 7.4) develops when a mode in a rotating star appears to be retrograde with respect to the star itself, and yet prograde with respect to an observer at infinity. The possible link between two-stream instability in the superfluid in the inner crust and pulsar glitches is more recent (Andersson et al. 2003, 2004b). Another relevant discussion considers a cosmological model consisting of a relative flow between matter and blackbody radiation (Comer et al. 2012a). Two-stream instability between two relativistic fluids in the linear regime has been examined in general by Samuelsson et al. (2010), and extended to the non-linear regime by Hawke et al. (2013). Finally, a discussion on the relationship between energetic and dynamical instabilities, starting from a Lagrangian for two complex scalar fields, was provided by Haber et al. (2016).

Repeating the key steps from Samuelsson et al. (2010), we start with a system having plane-wave propagation (as before, in a locally flat region of spacetime) on backgrounds such that \(\omega ^{\mathrm {x}}_{a b} = 0\). The various background quantities are considered constant, and there is a relative flow between the fluids. As in the previous sound-speed analyses, we let \(u^a_{\mathrm {x}}\) represent the background four-velocity of the \({\mathrm {x}}\)-fluid. Its total particle flux then takes the form

$$\begin{aligned} n^a_{\mathrm {x}}= n_{\mathrm {x}}u^a_{\mathrm {x}}+ A^a_{\mathrm {x}}e^{i k_b x^b} , \end{aligned}$$

where \(A^a_{\mathrm {x}}\) is the constant amplitude of the wave and \(k_a\) is the wave vector. Because \(\omega ^{\mathrm {x}}_{a b} = 0\) for the background and the fluxes are conserved, the analysis still leads to the linearized equations

$$\begin{aligned} \nabla _a \delta n^a_{\mathrm {x}}= 0 , \quad n_\mathrm {x}^a \nabla _{[a}\delta \mu ^{\mathrm {x}}_{b]} = 0 . \end{aligned}$$

The variation \(\delta \mu ^{\mathrm {x}}_a\) is the same as in Eq. (10.1).

However, the system flow is now such that \(u^a_{\mathrm {x}}\) does not equal \(u^a_{\mathrm {y}}\), the \({\mathrm {y}}\)-fluid four-velocity. There is a non-zero relative velocity of, say, the \({\mathrm {y}}\)-fluid with respect to the \({\mathrm {x}}\)-fluid given by

$$\begin{aligned} \gamma _{{\mathrm {x}}{\mathrm {y}}} v^a_{{\mathrm {x}}{\mathrm {y}}} = \perp ^{{\mathrm {x}}a}_b u^b_{\mathrm {y}}, \end{aligned}$$

where \(v_{{\mathrm {x}}{\mathrm {y}}} = v_{{\mathrm {y}}{\mathrm {x}}}\) represents the magnitude of the relative flow,

$$\begin{aligned} \perp ^{{\mathrm {x}}b}_a = \delta _a{}^b + u^{\mathrm {x}}_a u^b_{\mathrm {x}}, \quad \perp ^{{\mathrm {x}}b}_a u^a_{\mathrm {x}}= 0 , \end{aligned}$$

and

$$\begin{aligned} \gamma _{{\mathrm {x}}{\mathrm {y}}} = \gamma _{{\mathrm {y}}{\mathrm {x}}} = - u^c_{\mathrm {x}}u^{\mathrm {y}}_c = \frac{1}{\sqrt{1 - v^2_{{\mathrm {x}}{\mathrm {y}}}}} . \end{aligned}$$

This leads to (adapting (5.9) to the present context)

$$\begin{aligned} u^a_{\mathrm {y}}= \gamma _{{\mathrm {x}}{\mathrm {y}}} \left( u^a_{\mathrm {x}}+ v^a_{{\mathrm {x}}{\mathrm {y}}}\right) . \end{aligned}$$
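As a quick numerical sanity check of this decomposition, the following minimal sketch works in flat (Minkowski) spacetime with signature \((-,+,+,+)\). It builds two four-velocities, forms \(\gamma _{{\mathrm {x}}{\mathrm {y}}} = - u^c_{\mathrm {x}}u^{\mathrm {y}}_c\) and the projected relative velocity, and confirms the relation above. The helper names and the sample three-velocities are illustrative assumptions, not part of the original analysis.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # Minkowski metric, signature (-,+,+,+)


def dot(a, b):
    """Minkowski inner product a_c b^c."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))


def four_velocity(v3):
    """Four-velocity u^a for a three-velocity v3 (units with c = 1)."""
    w = 1.0 / math.sqrt(1.0 - sum(c * c for c in v3))
    return (w, w * v3[0], w * v3[1], w * v3[2])


def relative_flow(u_x, u_y):
    """Return (gamma_xy, v_xy^a), using gamma_xy = -u_x.u_y and
    gamma_xy v_xy^a = perp^{x a}_b u_y^b = u_y^a - gamma_xy u_x^a."""
    gamma = -dot(u_x, u_y)
    v = tuple((uy - gamma * ux) / gamma for ux, uy in zip(u_x, u_y))
    return gamma, v


# Hypothetical sample flows, chosen only for illustration.
u_x = four_velocity((0.1, 0.0, 0.0))
u_y = four_velocity((0.5, 0.2, 0.0))
gamma, v = relative_flow(u_x, u_y)
v2 = dot(v, v)  # squared magnitude of the relative velocity

print("gamma_xy =", gamma)
print("v_xy . u_x =", dot(v, u_x))               # orthogonal to u_x
print("1/sqrt(1 - v^2) =", 1.0 / math.sqrt(1.0 - v2))
print("u_y - gamma (u_x + v_xy) =",
      [uy - gamma * (ux + vv) for ux, uy, vv in zip(u_x, u_y, v)])
```

The relative velocity is spatial with respect to the \({\mathrm {x}}\)-fluid, so its contraction with \(u^{\mathrm {x}}_a\) vanishes to rounding error.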

For convenience, we will work in the material frame associated with the \({\mathrm {x}}\)-fluid component, meaning that \(k_a\) and \(A^a_{\mathrm {x}}\) will be decomposed into timelike and spatial pieces as defined locally by \(u^a_{\mathrm {x}}\). For \(k_a\) we write

$$\begin{aligned} k_a = k_{\mathrm {x}}\left( \sigma _{\mathrm {x}}u^{\mathrm {x}}_a + \hat{k}^{\mathrm {x}}_a\right) , \end{aligned}$$

where \(\sigma _{\mathrm {x}}\), \(k_{\mathrm {x}}\), and the unit wave vector \(\hat{k}^{\mathrm {x}}_a\) are obtained from \(k_a\) via

$$\begin{aligned} k_{\mathrm {x}}\sigma _{\mathrm {x}}= - k_a u^a_{\mathrm {x}}, \quad k^a k_a = k^2_{\mathrm {x}}\left( 1 - \sigma ^2_{\mathrm {x}}\right) , \quad \hat{k}^{\mathrm {x}}_a = \frac{1}{k_{\mathrm {x}}} \perp ^b_{{\mathrm {x}}a} k_b. \end{aligned}$$

Similarly, the wave amplitude \(A^a_{\mathrm {x}}\) becomes

$$\begin{aligned} A^a_{\mathrm {x}}= A^{\mathrm {x}}_{||} u^a_{\mathrm {x}}+ A_{{\mathrm {x}}\perp }^a , \end{aligned}$$

where

$$\begin{aligned} A^{\mathrm {x}}_{||} = - u^{\mathrm {x}}_a A^a_{\mathrm {x}}, \quad A_{{\mathrm {x}}\perp }^a = \perp ^a_{{\mathrm {x}}b} A^b_{\mathrm {x}}. \end{aligned}$$

It is necessary to point out that the three quantities \(\sigma _{\mathrm {x}}\), \(k^{\mathrm {x}}_a\), and \(v^a_{{\mathrm {x}}{\mathrm {y}}}\) are determined by an observer moving along with the \({\mathrm {x}}\)-fluid. Of course, we could choose the frame attached to the other fluid. Fortunately, there are well-defined transformations between the two frames, which we determine as follows: The relative flow \(v^a_{{\mathrm {y}}{\mathrm {x}}}\) of the \({\mathrm {x}}^\mathrm{th}\)-fluid with respect to the \({\mathrm {y}}^\mathrm{th}\)-fluid frame is related to \(v^a_{{\mathrm {x}}{\mathrm {y}}}\) via

$$\begin{aligned} v^a_{{\mathrm {y}}{\mathrm {x}}} = - \gamma _{{\mathrm {x}}{\mathrm {y}}} \left( v^2_{{\mathrm {x}}{\mathrm {y}}} u^a_{\mathrm {x}}+ v^a_{{\mathrm {x}}{\mathrm {y}}}\right) , \end{aligned}$$

using the fact that \(v_{{\mathrm {y}}{\mathrm {x}}} = v_{{\mathrm {x}}{\mathrm {y}}}\). Since \(k_a\) is a tensor, we must have

$$\begin{aligned} k_a = k_{\mathrm {y}}\left( \sigma _{\mathrm {y}}u^{\mathrm {y}}_a + \hat{k}^{\mathrm {y}}_a\right) = k_{\mathrm {x}}\left( \sigma _{\mathrm {x}}u^{\mathrm {x}}_a + \hat{k}^{\mathrm {x}}_a\right) . \end{aligned}$$

Noting that

$$\begin{aligned} u^a_{\mathrm {x}}= & {} - v^{- 2}_{{\mathrm {x}}{\mathrm {y}}} \left( v^a_{{\mathrm {x}}{\mathrm {y}}} + \gamma ^{- 1}_{{\mathrm {x}}{\mathrm {y}}} v^a_{{\mathrm {y}}{\mathrm {x}}}\right) , \end{aligned}$$
$$\begin{aligned} u^a_{\mathrm {y}}= & {} - v^{- 2}_{{\mathrm {x}}{\mathrm {y}}} \left( v^a_{{\mathrm {y}}{\mathrm {x}}} + \gamma ^{- 1}_{{\mathrm {x}}{\mathrm {y}}} v^a_{{\mathrm {x}}{\mathrm {y}}}\right) , \end{aligned}$$

and contracting each with the wave-vector \(k_a\), we obtain the matrix equation

$$\begin{aligned} \left[ \begin{array}{cc} v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}}- \cos \theta _{{\mathrm {x}}{\mathrm {y}}} &{} - \gamma ^{- 1}_{{\mathrm {x}}{\mathrm {y}}} \cos \theta _{{\mathrm {y}}{\mathrm {x}}} \\ - \gamma ^{- 1}_{{\mathrm {x}}{\mathrm {y}}} \cos \theta _{{\mathrm {x}}{\mathrm {y}}} &{} v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {y}}- \cos \theta _{{\mathrm {y}}{\mathrm {x}}} \end{array}\right] \left[ \begin{array}{c} k_{\mathrm {x}}\\ k_{\mathrm {y}}\end{array}\right] = \left[ \begin{array}{c} 0 \\ 0 \end{array}\right] . \end{aligned}$$

The non-trivial solution requires that the determinant of the \(2 \times 2\) matrix vanishes; therefore,

$$\begin{aligned} \sigma _{\mathrm {y}}= \cos \theta _{{\mathrm {y}}{\mathrm {x}}} \frac{\sigma _{\mathrm {x}}- v_{{\mathrm {x}}{\mathrm {y}}} \cos \theta _{{\mathrm {x}}{\mathrm {y}}}}{v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}}- \cos \theta _{{\mathrm {x}}{\mathrm {y}}}} . \end{aligned}$$

It is not difficult to show that if \(\sigma ^2_{\mathrm {x}}\le 1\) then \(\sigma ^2_{\mathrm {y}}\le 1\), and clearly if \(\sigma _{\mathrm {x}}\) is real then so is \(\sigma _{\mathrm {y}}\).

The equation of flux conservation is the same as (8.28) (except \({\mathrm {x}}\) ranges over two values). Here, it implies for each mode that

$$\begin{aligned} - \sigma _{\mathrm {x}}A^{\mathrm {x}}_{||} + \hat{k}^{\mathrm {x}}_a A_{{\mathrm {x}}\perp }^a = 0 . \end{aligned}$$

The two-fluid Euler equations become

$$\begin{aligned} 0= & {} K^{\mathrm {x}}_{a b} A^b_{\mathrm {x}}+ K^{{\mathrm {x}}{\mathrm {y}}}_{a b} A^b_{\mathrm {y}}, \end{aligned}$$
$$\begin{aligned} 0= & {} K^{\mathrm {y}}_{a b} A^b_{\mathrm {y}}+ K^{{\mathrm {y}}{\mathrm {x}}}_{a b} A^b_{\mathrm {x}}, \end{aligned}$$

where the “dispersion” tensors are

$$\begin{aligned} K^{\mathrm {x}}_{a b}= & {} n^c_{\mathrm {x}}\left( k_{[c} \mathcal {B}^{\mathrm {x}}_{a]b} + k_{[c} \mathcal {A}^{\mathrm {x}}_{a]b} \right) , \end{aligned}$$
$$\begin{aligned} K^{{\mathrm {x}}{\mathrm {y}}}_{a b}= & {} n^c_{\mathrm {x}}\left( k_{[c} \mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a]b} + k_{[c}\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{a]b}\right) . \end{aligned}$$

Note that \(K^{\mathrm {y}}_{a b}\) and \(K^{{\mathrm {y}}{\mathrm {x}}}_{a b}\) are obtained via the interchange of \({\mathrm {x}}\leftrightarrow {\mathrm {y}}\) in (10.30) and (10.31).

The general solution to (10.29) requires, say, using Eq. (10.29) to determine \(A^a_{\mathrm {y}}\), and then substituting that into Eq. (10.28). This means we need the four inverses

$$\begin{aligned} \tilde{K}^{a c}_{\mathrm {x}}K^{\mathrm {x}}_{c b} = \delta ^a{}_b , \quad \tilde{K}^{a c}_{{\mathrm {y}}{\mathrm {x}}} K^{{\mathrm {x}}{\mathrm {y}}}_{c b} = \delta ^a{}_b . \end{aligned}$$

With these in hand, we can write

$$\begin{aligned} 0 = \left( \tilde{K}^{a c}_{\mathrm {y}}K^{{\mathrm {y}}{\mathrm {x}}}_{c b} - \tilde{K}^{a c}_{{\mathrm {y}}{\mathrm {x}}} K^{\mathrm {x}}_{c b}\right) A^b_{\mathrm {x}}\equiv \mathcal{M}^a{}_b A^b_{\mathrm {x}}. \end{aligned}$$

Having a non-trivial solution requires that \(k_a\) be such that \(\det \mathcal {M}^a_{\ b} = 0 \). However, the examples which follow will be kept simple enough that the general procedure will not be required. For example, we will focus on the case of aligned flows.

Samuelsson et al. (2010) have shown that the relative flow between the two fluids enters through the inner product \(\hat{v}^a_{{\mathrm {x}}{\mathrm {y}}} \hat{k}^{\mathrm {x}}_a\) (where \(\hat{v}^a_{{\mathrm {x}}{\mathrm {y}}} = v^a_{{\mathrm {x}}{\mathrm {y}}}/v_{{\mathrm {x}}{\mathrm {y}}}\)), and so it is natural to introduce the angle \(\theta _{{\mathrm {x}}{\mathrm {y}}}\) between the two vectors. The inner product then becomes

$$\begin{aligned} \hat{v}^a_{{\mathrm {x}}{\mathrm {y}}} \hat{k}^{\mathrm {x}}_a = \cos \theta _{{\mathrm {x}}{\mathrm {y}}} . \end{aligned}$$

Having an aligned flow means, say, setting \(\theta _{{\mathrm {x}}{\mathrm {y}}} = 0\) and \(\theta _{{\mathrm {y}}{\mathrm {x}}} = \pi \). The wave vector takes the form

$$\begin{aligned} k^a = \frac{1}{\gamma _{{\mathrm {x}}{\mathrm {y}}} v_{{\mathrm {x}}{\mathrm {y}}}} \left( k_{\mathrm {x}}u^a_{\mathrm {y}}- k_{\mathrm {y}}u^a_{\mathrm {x}}\right) , \end{aligned}$$

and the flux conservation becomes

$$\begin{aligned} k_{\mathrm {x}}u_a^{\mathrm {y}}A^a_{\mathrm {x}}= k_{\mathrm {y}}u_a^{\mathrm {x}}A^a_{\mathrm {x}}. \end{aligned}$$

This, in turn, implies that the problem is reduced from four equations with four unknowns to a much simpler \(2\times 2\) system. Finally, we note that Eqs. (10.22) and (10.26) imply, respectively,

$$\begin{aligned} \frac{k_{\mathrm {y}}}{k_{\mathrm {x}}} = \sqrt{\frac{1- \sigma ^2_{\mathrm {x}}}{1 - \sigma ^2_{\mathrm {y}}}} \end{aligned}$$

and

$$\begin{aligned} \sigma _{\mathrm {y}}= \frac{\sigma _{\mathrm {x}}- v_{{\mathrm {x}}{\mathrm {y}}}}{1 - v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}}} . \end{aligned}$$

It will prove useful later to note that this last result implies

$$\begin{aligned} 1 - \sigma ^2_{\mathrm {y}}= \frac{1}{\gamma ^2_{{\mathrm {x}}{\mathrm {y}}}} \frac{1 - \sigma ^2_{\mathrm {x}}}{\left( 1 - v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}}\right) ^2} \end{aligned}$$

and therefore

$$\begin{aligned} \frac{k_{\mathrm {y}}}{k_{\mathrm {x}}} = \gamma _{{\mathrm {x}}{\mathrm {y}}} \sqrt{\left( 1- v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}}\right) ^2} . \end{aligned}$$
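The aligned-flow relations above are easy to check numerically. The sketch below (function names and sample values are illustrative assumptions) evaluates \(\sigma _{\mathrm {y}}\) from the velocity-addition form for a few real, subluminal values of \(\sigma _{\mathrm {x}}\). It then compares the two expressions for \(k_{\mathrm {y}}/k_{\mathrm {x}}\): the one following from the norm of \(k_a\) and the closed form involving \(\gamma _{{\mathrm {x}}{\mathrm {y}}}\).

```python
import math


def sigma_y_aligned(sigma_x, v):
    """Phase speed seen in the y-fluid frame for aligned flow (theta_xy = 0)."""
    return (sigma_x - v) / (1.0 - v * sigma_x)


def wavenumber_ratio(sigma_x, v):
    """Closed form k_y / k_x = gamma_xy * |1 - v sigma_x| for aligned flow."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * abs(1.0 - v * sigma_x)


v = 0.3  # hypothetical relative speed (c = 1)
checks = []
for sigma_x in (-0.9, -0.2, 0.0, 0.5, 0.99):
    sigma_y = sigma_y_aligned(sigma_x, v)
    # Ratio obtained from the invariance of k^a k_a in the two frames.
    from_norm = math.sqrt((1.0 - sigma_x**2) / (1.0 - sigma_y**2))
    checks.append((sigma_x, sigma_y, from_norm, wavenumber_ratio(sigma_x, v)))

for sigma_x, sigma_y, from_norm, closed_form in checks:
    print(f"sigma_x={sigma_x:+.2f}  sigma_y={sigma_y:+.4f}  "
          f"k_y/k_x: {from_norm:.6f} vs {closed_form:.6f}")
```

For every sampled value the two ratios agree, and \(|\sigma _{\mathrm {y}}| < 1\) whenever \(|\sigma _{\mathrm {x}}| < 1\), in line with the statement made for the general transformation.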

Another place where we will simplify the analysis is the choice of equation of state; namely, to consider forms with just enough complexity in the \(\mathcal {B}^{\mathrm {x}}_{a b}\), \(\mathcal {A}^{\mathrm {x}}_{a b}\), \(\mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a b}\), and \(\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{a b}\) coefficients to establish the main feature we are interested in: the two-stream instability. Obviously, any fluid must have non-zero bulk properties; the other two properties of entrainment and cross-constituent coupling depend on the particular features of the fluid system incorporated into the equation of state. We will first consider the case where only bulk features are present and then follow this up by incorporating entrainment.

Let us first set both the entrainment and cross-constituent coupling to zero. This implies \(K^{{\mathrm {x}}{\mathrm {y}}}_{a b} = 0\) and the mode equations are

$$\begin{aligned} 0= & {} K^{\mathrm {x}}_{a b} A^b_{\mathrm {x}}= - \frac{1}{2} \mathcal {B}^{\mathrm {x}}n_{\mathrm {x}}k_{\mathrm {x}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} + c^2_{\mathrm {x}}\hat{k}^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) A^b_{\mathrm {x}}, \end{aligned}$$
$$\begin{aligned} 0= & {} K^{\mathrm {y}}_{a b} A^b_{\mathrm {y}}= - \frac{1}{2} \mathcal {B}^{\mathrm {y}}n_{\mathrm {y}}k_{\mathrm {y}}\left( \sigma _{\mathrm {y}}\perp ^{\mathrm {y}}_{a b} + c^2_{\mathrm {y}}\hat{k}^{\mathrm {y}}_a u^{\mathrm {y}}_b\right) A^b_{\mathrm {y}}. \end{aligned}$$

We contract each mode equation with \(k^a\) to find

$$\begin{aligned} 0 = \left( \sigma ^2_{\mathrm {x}}- c^2_{\mathrm {x}}\right) A^{\mathrm {x}}_{||} , \quad 0 = \left( \sigma ^2_{\mathrm {y}}- c^2_{\mathrm {y}}\right) A^{\mathrm {y}}_{||} , \end{aligned}$$

and, as the solution reduces to the \(2\times 2\) matrix problem

$$\begin{aligned} \left[ \begin{array}{cc} \left( \sigma ^2_{\mathrm {x}}- c^2_{\mathrm {x}}\right) &{} 0 \\ 0 &{} \left( \sigma ^2_{\mathrm {y}}- c^2_{\mathrm {y}}\right) \end{array}\right] \left[ \begin{array}{c} A^{\mathrm {x}}_{||} \\ A^{\mathrm {y}}_{||} \end{array}\right] = \left[ \begin{array}{c} 0 \\ 0 \end{array}\right] , \end{aligned}$$

it is easy to see that the resulting dispersion relation is

$$\begin{aligned} \left( \sigma ^2_{\mathrm {x}}- c^2_{\mathrm {x}}\right) \left( \sigma ^2_{\mathrm {y}}- c^2_{\mathrm {y}}\right) = 0 . \end{aligned}$$

The modes of this system are the “bare” sound waves with speeds \(c_{{\mathrm {x}}}\) or \(c_{\mathrm {y}}\), as one would have expected. There are no interactions between the two fluids and so there is no sense in which they “see” each other. Generally, we conclude that the existence of a two-stream instability requires more than just a background relative flow. Some coupling agent is required.

With this in mind, we include coupling via entrainment. As we are ignoring the cross-constituent coupling term we still have \(\mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a b}= 0\). The simplest inclusion of entrainment is to set \(\mathcal {B}^{\mathrm {x}}_{, {\mathrm {x}}{\mathrm {y}}} = 0\) and \(\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{, {\mathrm {x}}{\mathrm {y}}} = 0\). This means \({\mathcal {A}}^{\mathrm {x}}_{a b}= 0\), \({\mathcal {A}}^{{\mathrm {x}}{\mathrm {y}}}_{a b}= \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}g_{a b}\), and therefore

$$\begin{aligned} K^{\mathrm {x}}_{a b}= & {} - \frac{1}{2} \mathcal {B}^{\mathrm {x}}n_{\mathrm {x}}k_{\mathrm {x}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} + c^2_{\mathrm {x}}\hat{k}^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) , \end{aligned}$$
$$\begin{aligned} K^{{\mathrm {x}}{\mathrm {y}}}_{a b}= & {} - \frac{1}{2} \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}n_{\mathrm {x}}k_{\mathrm {x}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} + \hat{k}^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) . \end{aligned}$$

The mode equations then become

$$\begin{aligned} 0= & {} \mathcal {B}^{\mathrm {x}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} + c^2_{\mathrm {x}}\hat{k}^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) A^b_{\mathrm {x}}+ \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} + \hat{k}^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) A^b_{\mathrm {y}}, \end{aligned}$$
$$\begin{aligned} 0= & {} \mathcal {B}^{\mathrm {y}}\left( \sigma _{\mathrm {y}}\perp ^{\mathrm {y}}_{a b} + c^2_{\mathrm {y}}\hat{k}^{\mathrm {y}}_a u^{\mathrm {y}}_b\right) A^b_{\mathrm {y}}+ \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\left( \sigma _{\mathrm {y}}\perp ^{\mathrm {y}}_{a b} + \hat{k}^{\mathrm {y}}_a u^{\mathrm {y}}_b\right) A^b_{\mathrm {x}}. \end{aligned}$$

By contracting each with \(k^a\), using Eqs. (10.35) and (10.36), we get

$$\begin{aligned} 0= & {} \frac{1}{k_{\mathrm {x}}}\left\{ \mathcal {B}^{\mathrm {x}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} k^a + c^2_{\mathrm {x}}k_{\mathrm {x}}u^{\mathrm {x}}_b\right) A^b_{\mathrm {x}}\right. \nonumber \\&\left. + \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\left[ \sigma _{\mathrm {x}}k_a + k_{\mathrm {x}}\left( 1 - \sigma ^2_{\mathrm {x}}\right) u^{\mathrm {x}}_a\right] A^a_{\mathrm {y}}\right\} \nonumber \\= & {} \mathcal {B}^{\mathrm {x}}\left[ \frac{\sigma _{\mathrm {x}}}{\gamma _{{\mathrm {x}}{\mathrm {y}}} v_{{\mathrm {x}}{\mathrm {y}}}} \left( \frac{k_{\mathrm {y}}}{k_{\mathrm {x}}} - \gamma _{{\mathrm {x}}{\mathrm {y}}} \right) + c^2_{\mathrm {x}}\right] u^{\mathrm {x}}_a A^a_{\mathrm {x}}+ \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\left( 1 - \sigma ^2_{\mathrm {x}}\right) \frac{k_{\mathrm {x}}}{k_{\mathrm {y}}} u^{\mathrm {y}}_a A^a_{\mathrm {y}}, \end{aligned}$$
$$\begin{aligned} 0= & {} \mathcal {B}^{\mathrm {y}}\left[ \frac{\sigma _{\mathrm {y}}}{\gamma _{{\mathrm {x}}{\mathrm {y}}} v_{{\mathrm {x}}{\mathrm {y}}}} \left( \frac{k_{\mathrm {x}}}{k_{\mathrm {y}}} - \gamma _{{\mathrm {x}}{\mathrm {y}}} \right) + c^2_{\mathrm {y}}\right] u^{\mathrm {y}}_a A^a_{\mathrm {y}}+ \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\left( 1 - \sigma ^2_{\mathrm {y}}\right) \frac{k_{\mathrm {y}}}{k_{\mathrm {x}}} u^{\mathrm {x}}_a A^a_{\mathrm {x}}. \end{aligned}$$

The dispersion relation now becomes

$$\begin{aligned} 0 = \left( \sigma ^2_{\mathrm {x}}- c^2_{\mathrm {x}}\right) \left( \sigma ^2_{\mathrm {y}}- c^2_{\mathrm {y}}\right) - \left( \frac{\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}}{\sqrt{\mathcal {B}^{\mathrm {x}}\mathcal {B}^{\mathrm {y}}}}\right) ^2 \left( 1 - \sigma ^2_{\mathrm {x}}\right) \left( 1 - \sigma ^2_{\mathrm {y}}\right) . \end{aligned}$$

This can be rewritten in a form more useful for numerical solutions; namely,

$$\begin{aligned} 0 = \left( x^2 - b^2\right) \left[ \left( x - y\right) ^2 - \left( 1 - c_{\mathrm {y}}^2 y x\right) ^2\right] - a^2 \frac{\left( 1 - c^2_{\mathrm {y}}x^2\right) ^2}{\gamma ^2_{{\mathrm {x}}{\mathrm {y}}}} , \end{aligned}$$

where \(x = \sigma _{\mathrm {x}}/c_{\mathrm {y}}\), \(y = v_{{\mathrm {x}}{\mathrm {y}}}/c_{\mathrm {y}}\), \(b = c_{\mathrm {x}}/c_{\mathrm {y}}\), and

$$\begin{aligned} a^2 = \left( \frac{\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}}{c^2_{\mathrm {y}}\sqrt{\mathcal {B}^{\mathrm {x}}\mathcal {B}^{\mathrm {y}}}}\right) ^2 . \end{aligned}$$
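The rescaling can be verified directly. As a sketch, using only the aligned-flow relation \(\sigma _{\mathrm {y}}= (\sigma _{\mathrm {x}}- v_{{\mathrm {x}}{\mathrm {y}}})/(1 - v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}})\) and the corresponding identity for \(1 - \sigma ^2_{\mathrm {y}}\) given earlier, the individual factors of the dispersion relation become

$$\begin{aligned} \sigma ^2_{\mathrm {x}}- c^2_{\mathrm {x}}&= c^2_{\mathrm {y}}\left( x^2 - b^2\right) , \\ \sigma ^2_{\mathrm {y}}- c^2_{\mathrm {y}}&= c^2_{\mathrm {y}}\, \frac{\left( x - y\right) ^2 - \left( 1 - c^2_{\mathrm {y}}y x\right) ^2}{\left( 1 - c^2_{\mathrm {y}}y x\right) ^2} , \\ \left( 1 - \sigma ^2_{\mathrm {x}}\right) \left( 1 - \sigma ^2_{\mathrm {y}}\right)&= \frac{\left( 1 - c^2_{\mathrm {y}}x^2\right) ^2}{\gamma ^2_{{\mathrm {x}}{\mathrm {y}}} \left( 1 - c^2_{\mathrm {y}}y x\right) ^2} . \end{aligned}$$

Substituting these into the dispersion relation and multiplying through by \(\left( 1 - c^2_{\mathrm {y}}y x\right) ^2/c^4_{\mathrm {y}}\) reproduces the rescaled form, with \(a^2\) as defined above.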

The immediate thing to note is that the relative speed changes the equation from a quadratic in \(\sigma ^2_{\mathrm {x}}\) to being fully quartic in \(\sigma _{\mathrm {x}}\); thus, it is inevitable that complex solutions will result. The question is whether the imaginary contributions can be realized for physical parameters. Recall that this means the system must exhibit absolute stability and causality. Samuelsson et al. (2010) have shown that these are guaranteed when

$$\begin{aligned} 0 \le \left( \frac{\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}}{\sqrt{\mathcal {B}^{\mathrm {x}}\mathcal {B}^{\mathrm {y}}}}\right) ^2 \le c^2_{\mathrm {x}}c^2_{\mathrm {y}}\quad \Longrightarrow \quad a^2 \le b^2 . \end{aligned}$$

In the Newtonian limit the dispersion relation takes the same mathematical form for entrainment as it does for non-zero cross-constituent coupling; namely,

$$\begin{aligned} \frac{\left( x^2 - b^2\right) }{a^2} \left[ \left( x - y\right) ^2 - 1\right] = 1 . \end{aligned}$$

As this is quartic in x, closed-form solutions exist. However, they are quite tedious and their main use is to serve as the basis for numerical evaluations of the modes. A basic algorithm would be to fix a and b, subject to the constraint in Eq. (10.55), and then evaluate the real and imaginary parts of \(\sigma _{\mathrm {x}}\) as functions of y. This procedure reveals that the instability exists in a “window” of y-values (Andersson et al. 2003, 2004b; Samuelsson et al. 2010). As an example, consider the case from Andersson et al. (2004b), illustrated in Fig. 13. A more recent study (Andersson and Schmitt 2019), in the framework of relativity, highlights the fact that the system will be prone to an energy instability (closely related to the CFS instability from Sect. 7.4, as it sets in at the point where originally backwards moving modes are dragged forwards by the background flow). As indicated by the left panel of Fig. 13, this energy instability tends to set in before the system suffers the (dynamical) two-stream instability.
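The basic algorithm described above can be sketched in a few lines. The example below solves the Newtonian-limit quartic \((x^2 - b^2)[(x - y)^2 - 1] = a^2\) for given y and counts the complex roots. The root finder is a plain Durand-Kerner iteration, chosen here only to keep the sketch self-contained; it is an assumption of this example and not the method used in the cited papers. The function names are likewise illustrative, while the parameter values are those quoted for Fig. 13.

```python
def newtonian_mode_roots(y, a2, b2, iterations=500):
    """Roots x = sigma_x / c_y of (x^2 - b^2) [(x - y)^2 - 1] = a^2.

    Expanded form:
      x^4 - 2y x^3 + (y^2 - 1 - b^2) x^2 + 2 b^2 y x - b^2 (y^2 - 1) - a^2 = 0.
    """
    coeffs = [1.0, -2.0 * y, y * y - 1.0 - b2,
              2.0 * b2 * y, -b2 * (y * y - 1.0) - a2]

    def poly(z):
        acc = 0j
        for c in coeffs:  # Horner evaluation of the monic quartic
            acc = acc * z + c
        return acc

    # Durand-Kerner: iterate all four root estimates simultaneously.
    roots = [(0.4 + 0.9j) ** k for k in range(4)]
    for _ in range(iterations):
        updated = []
        for i, r in enumerate(roots):
            denom = 1.0 + 0j
            for j, s in enumerate(roots):
                if j != i:
                    denom *= r - s
            updated.append(r - poly(r) / denom)
        roots = updated
    return sorted(roots, key=lambda z: (z.real, z.imag))


def count_complex(roots, tol=1e-6):
    """Number of roots with a non-negligible imaginary part."""
    return sum(1 for r in roots if abs(r.imag) > tol)


a2, b2 = 0.0249, 0.0379  # parameters quoted for Fig. 13; note a2 <= b2
for y in (0.0, 1.0, 2.0):
    roots = newtonian_mode_roots(y, a2, b2)
    print(f"y = {y}: complex roots = {count_complex(roots)}")
    for r in roots:
        print(f"    x = {r.real:+.5f} {r.imag:+.5f}i")
```

Scanning y over a fine grid and recording where `count_complex` is non-zero gives an estimate of the instability window for the chosen a and b.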

Fig. 13

Image reproduced with permission from Andersson et al. (2004b), copyright by RAS

An illustration of the two-stream instability, showing the real (left panel) and imaginary (right panel) parts of the four roots of the dispersion relation for the model parameters (\(a^2=0.0249\) and \(b^2=0.0379\)) used in Andersson et al. (2004b). For these parameters the quartic dispersion relation has four real roots for both \(y=0\) and \(y=2\), while it has two real roots and a complex conjugate pair for y in the range \(0.6<y<1.5\). In this range, the two-stream instability is active

Finally, let us take the opportunity to note that the relativistic two-stream instability has also been analyzed in the non-linear regime (Hawke et al. 2013). This first nonlinear numerical simulation of the effect in relativistic multi-species hydrodynamical systems shows that the onset and initial growth of the instability match closely the results of linear perturbation theory. But, in the later stages of the evolution, the linear and nonlinear descriptions have only qualitative overlaps. The main conclusion is that the instability does not saturate in the nonlinear regime by purely ideal hydrodynamic effects.

Numerical simulations: fluid dynamics in a live spacetime

Many astrophysical phenomena involve violent nonlinear matter dynamics. Such systems cannot (meaningfully) be described within perturbation theory. Instead, the modelling requires fully nonlinear (and, given the lack of symmetry of, say, turbulent flows, multi-dimensional) simulations, taking into account the live spacetime of General Relativity. The last decades have seen considerable progress in the development of the relevant computational tools, especially for gravitational-wave sources like supernova core collapse (Müller 2016) and neutron star mergers (Baiotti and Rezzolla 2017). The state-of-the-art technology includes the consideration of fairly sophisticated matter models. In the case of supernova modelling, neutrinos are expected to play an important role in triggering the explosion (Janka 2012) and the role of magnetic fields may also be significant (Mösta et al. 2015). Meanwhile, for neutron star mergers, finite temperature effects are central as shock heating ramps up the temperature of the merged object to levels beyond that expected even during core collapse (see, e.g., Bauswein et al. 2010 or Kastaun and Galeazzi 2015). Magnetic fields are expected to have a decisive impact on the post-merger dynamics and are likely to leave an observational signature, e.g., in terms of short gamma-ray bursts (e.g., Kumar and Zhang 2015).

Spacetime foliation

We have already explored some aspects of the problem (like the thermodynamics and the matter equation of state, see Sect. 2) and we have considered features that arise in models of increasing complexity (in particular when we need to account for the relative flow of distinct fluid components). So far, the discussion has assumed a fibration of spacetime associated with a family of fluid observers. This approach is natural if one is mainly interested in the local fluid dynamics (e.g., wave propagation) and it also leads to the 1+3 formulation often used in cosmology (where “clocks” associated with the fluid observers define the notion of cosmic time), see Barrow et al. (2007) for a relevant discussion. The strategy is, however, not natural for numerical simulations with a live spacetime. Instead, most such work makes use of a 3+1 spacetime foliation (see Baumgarte and Shapiro 2003 for a relevant discussion), where progression towards the “future” is associated with a set of Eulerian observers. Hence, we need to understand how we extend the multifluid model from fibration to foliation.

The standard approach to numerical simulations takes as its starting point a “foliation” of spacetime into a family of spacelike hypersurfaces, \(\varSigma _t\), which arise as level surfaces of a scalar time t (see, e.g., Alcubierre 2008). Given the normal to this surface

$$\begin{aligned} N_a = - \alpha \nabla _ a t , \end{aligned}$$

where the function \(\alpha \) is known as the lapse, we have

$$\begin{aligned} N_a = (-\alpha ,0,0,0) , \end{aligned}$$

and the normalisation \(N_a N^a=-1\) (we are thinking of the normal as associated with an observer moving through spacetime in the usual way) leads to \(\alpha ^2 = -1/g^{tt}\). The sign in (11.1) ensures that time flows into the future. The dual to \(\nabla _a t\) leads to a time vector

$$\begin{aligned} t^a = \alpha N^a + \beta ^a , \end{aligned}$$

where the so-called shift vector \(\beta ^a\) is spatial, in the sense that \(N_a \beta ^a = 0\). It follows that

$$\begin{aligned} N^a = \alpha ^{-1} ( 1,-\beta ^i) , \end{aligned}$$

and the spacetime can be written in the Arnowitt–Deser–Misner (ADM) form (Arnowitt et al. 2008; York 1979):

$$\begin{aligned} ds^2 = - \alpha ^2 dt^2 + \gamma _{ij} \left( dx^i + \beta ^i dt \right) \left( dx^j + \beta ^j dt \right) , \end{aligned}$$

where the (induced) metric on the spacelike hypersurface is

$$\begin{aligned} \gamma _{ab} = g_{ab} + N_a N_b . \end{aligned}$$

Note that \(\gamma ^a_b\) represents the projection orthogonal to \(N_a\) and that \(\gamma _{ab}\) and its inverse can be used to raise and lower indices of purely spatial tensors. For example, we have \(\beta _i = \gamma _{ij} \beta ^j\).

In essence, the lapse \(\alpha \) determines the rate at which proper time advances from one time slice to the next, along the normal \(N_a\), while the vector \(\beta ^i\) determines how the coordinates shift from one spatial slice to the next. This is illustrated in Fig. 14. The two functions encode the coordinate freedom of General Relativity.

Fig. 14

An illustration of the two formulations for the relativistic fluid problem. The fibration approach, which focuses on the worldline associated with a given fluid element (and a four velocity \(\varvec{u}\) with components \(u^a\)), provides a natural description of the microphysics and issues relating to thermodynamics. Meanwhile, a spacetime foliation, based on the use of spatial slices and normal observers (with the coordinate freedom encoded in the lapse \(\alpha \) and the shift vector \(\beta ^i\)), is typically used in numerical simulations. In order to ensure that the local physics is appropriately implemented in simulations, we need to understand the translation between the two descriptions

Reading off the metric from the line element, we have

$$\begin{aligned} g_{ab} = \left( \begin{array}{cc} -\alpha ^2 + \beta _i \beta ^i &{} \beta _i \\ \beta _i &{} \gamma _{ij} \end{array} \right) , \end{aligned}$$

with inverse

$$\begin{aligned} g^{ab} = \left( \begin{array}{cc} -1/\alpha ^2 &{} \beta ^i/\alpha ^2 \\ \beta ^i/\alpha ^2 &{} \gamma ^{ij}-\beta ^i \beta ^j /\alpha ^2 \end{array} \right) . \end{aligned}$$
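The two matrices above can be checked against each other numerically. The following minimal sketch assembles \(g_{ab}\) and \(g^{ab}\) from a lapse, a shift and a spatial metric, and multiplies them to confirm that the product is the identity. For simplicity the spatial metric is assumed diagonal, so that its inverse is trivial; this restriction, the function names and the sample values are assumptions of the example.

```python
def adm_metric(alpha, beta_up, gamma_low):
    """Covariant metric g_ab from lapse alpha, shift beta^i and gamma_ij."""
    beta_low = [sum(gamma_low[i][j] * beta_up[j] for j in range(3))
                for i in range(3)]
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = -alpha**2 + sum(beta_low[i] * beta_up[i] for i in range(3))
    for i in range(3):
        g[0][i + 1] = g[i + 1][0] = beta_low[i]
        for j in range(3):
            g[i + 1][j + 1] = gamma_low[i][j]
    return g


def adm_inverse_metric(alpha, beta_up, gamma_up):
    """Contravariant metric g^ab from lapse, shift beta^i and gamma^ij."""
    ginv = [[0.0] * 4 for _ in range(4)]
    ginv[0][0] = -1.0 / alpha**2
    for i in range(3):
        ginv[0][i + 1] = ginv[i + 1][0] = beta_up[i] / alpha**2
        for j in range(3):
            ginv[i + 1][j + 1] = (gamma_up[i][j]
                                  - beta_up[i] * beta_up[j] / alpha**2)
    return ginv


# Hypothetical sample data with a diagonal spatial metric.
alpha = 0.8
beta_up = [0.1, -0.05, 0.2]
diag = [1.2, 0.9, 2.0]
gamma_low = [[diag[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
gamma_up = [[1.0 / diag[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

g = adm_metric(alpha, beta_up, gamma_low)
ginv = adm_inverse_metric(alpha, beta_up, gamma_up)
product = [[sum(g[a][c] * ginv[c][b] for c in range(4)) for b in range(4)]
           for a in range(4)]
for row in product:
    print(["%+.3e" % entry for entry in row])
```

The same check also confirms \(\alpha ^2 = -1/g^{tt}\), since `ginv[0][0]` is set directly from the lapse.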

Having specified the spacetime foliation, we can decompose any tensor into time and space components (adapting the logic from the discussion of the stress-energy tensor in Sect. 5). Suppose, for example, that we have a fluid associated with a four velocity \(u^a\). Then we can introduce the decomposition

$$\begin{aligned} u^a = W (N^a + \hat{v}^a) = {W\over \alpha } \left( t^a - \beta ^a + \alpha \hat{v}^a \right) , \end{aligned}$$

where \(N_a \hat{v}^a =0\) and the Lorentz factor is given by

$$\begin{aligned} W= - N_a u^a = \alpha u^t = (1-\hat{v}^2)^{-1/2} , \end{aligned}$$

where \(\hat{v}^2 = \gamma _{ij} \hat{v}^i \hat{v}^j\) and the last equality follows from \(u^a u_a=-1\), as usual. From this relation it is easy to see that

$$\begin{aligned} \hat{v}^t = 0 , \qquad \hat{v}^i = {u^i\over W} - N^i = {1\over \alpha } \left( {u^i \over u^t} + \beta ^i\right) , \end{aligned}$$

and it follows that

$$\begin{aligned} \hat{v}_t = g_{ta} \hat{v}^a = \beta _i \hat{v}^i , \qquad \hat{v}_i = \gamma _{ia} \hat{v}^a = {\gamma _{ij} \over \alpha } \left( {u^j\over u^t} + \beta ^j \right) . \end{aligned}$$
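The translation between the fluid four-velocity and the velocity seen by the Eulerian observer can be illustrated with a short round trip. The sketch below starts from \(\hat{v}^i\), builds \(u^a = W (N^a + \hat{v}^a)\), verifies the normalisation \(u^a u_a = -1\) using the ADM line element, and then recovers \(\hat{v}^i\) from \(u^i/u^t\). A diagonal spatial metric, the function names and all numerical values are assumptions made for the example.

```python
import math


def four_velocity_from_eulerian(alpha, beta_up, gamma_diag, v_hat):
    """Return (W, u^a) with u^a = W (N^a + v_hat^a), N^a = (1, -beta^i)/alpha."""
    v2 = sum(g * v * v for g, v in zip(gamma_diag, v_hat))
    W = 1.0 / math.sqrt(1.0 - v2)
    u_t = W / alpha
    u_i = [W * (v - b / alpha) for v, b in zip(v_hat, beta_up)]
    return W, [u_t] + u_i


def eulerian_velocity(alpha, beta_up, u):
    """Recover v_hat^i = (u^i / u^t + beta^i) / alpha."""
    return [(u[i + 1] / u[0] + beta_up[i]) / alpha for i in range(3)]


def norm_squared(alpha, beta_up, gamma_diag, u):
    """g_ab u^a u^b evaluated from the ADM line element (diagonal gamma_ij)."""
    spatial = sum(g * (u[i + 1] + beta_up[i] * u[0]) ** 2
                  for i, g in enumerate(gamma_diag))
    return -(alpha * u[0]) ** 2 + spatial


# Hypothetical sample data.
alpha = 0.8
beta_up = [0.1, -0.05, 0.0]
gamma_diag = [1.2, 0.9, 2.0]
v_hat = [0.3, 0.2, -0.1]

W, u = four_velocity_from_eulerian(alpha, beta_up, gamma_diag, v_hat)
v_back = eulerian_velocity(alpha, beta_up, u)
norm = norm_squared(alpha, beta_up, gamma_diag, u)

print("W =", W, " alpha u^t =", alpha * u[0])
print("u.u =", norm)
print("recovered v_hat =", v_back)
```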

We also need to consider derivatives. First of all, we introduce a derivative associated with the hypersurface. Thus, we use the (totally) projected derivative

$$\begin{aligned} D_a = \gamma _a^b \nabla _b , \end{aligned}$$

where all free indices should be projected into the surface. This derivative is compatible with the spatial metric (see Sect. 3) in the sense that

$$\begin{aligned} D_a\gamma _{bc} = \gamma _a^d \gamma _b^e \gamma _c^f\nabla _d \gamma _{ef} = 0 , \end{aligned}$$

which means that it acts as a covariant derivative in the surface orthogonal to \(N^a\). The upshot of this is that we can construct a tensor algebra for the three-dimensional spatial slices. In particular, we can introduce a three-dimensional Riemann tensor. This projected Riemann tensor does not contain all the information from its four-dimensional cousin; the missing information is encoded in the extrinsic curvature, \(K_{ab}\). This is a symmetric spatial tensor, such that \(N^a K_{ab}=0\). The extrinsic curvature provides a measure of how the \(\varSigma _t\) surfaces curve relative to spacetime. In practice, we measure how the normal \(N_a\) changes as it is parallel transported along the hypersurface. That is, we define

$$\begin{aligned} K_{ac} = -D_a N_c = - \gamma _a^b \gamma _c^d \nabla _b N_d = - \nabla _a N_c - N_a (N^b\nabla _b N_c) , \end{aligned}$$

where the second term is analogous to the fluid four-acceleration. We also have

$$\begin{aligned} K= K^a_a = g^{ab}K_{ab} = - \gamma ^{ab} D_a N_b = - \nabla _a N^a . \end{aligned}$$

Alternatively, we can use the properties of the Lie derivative to show that

$$\begin{aligned} K_{ij} = - {1 \over 2}\mathcal {L}_N \gamma _{ij} , \end{aligned}$$

but since

$$\begin{aligned} \mathcal {L}_N = {1\over \alpha } ( \mathcal {L}_t - \mathcal {L}_\beta ) = {1\over \alpha } ( \partial _t - \mathcal {L}_\beta ) , \end{aligned}$$

we have

$$\begin{aligned} \partial _t \gamma _{ij} = - 2\alpha K_{ij} + \mathcal {L}_\beta \gamma _{ij} . \end{aligned}$$

From the trace of this expression we get

$$\begin{aligned} \alpha K = - \partial _t \ln \gamma ^{1/2} + D_i \beta ^i , \end{aligned}$$

where \(\gamma = \det (\gamma _{ij})\) is the determinant of the spatial metric, and we have used \(\gamma ^{ij} \partial _t \gamma _{ij} = \partial _t \ln \gamma \).
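For completeness, the intermediate steps behind the trace can be spelled out. This is a sketch that uses only the evolution equation for \(\gamma _{ij}\), the standard expression \(\mathcal {L}_\beta \gamma _{ij} = D_i \beta _j + D_j \beta _i\) for the Lie derivative of the spatial metric, and the derivative of a determinant, with \(\gamma \) understood as the determinant of \(\gamma _{ij}\):

$$\begin{aligned} \gamma ^{ij} \partial _t \gamma _{ij}&= - 2\alpha \gamma ^{ij} K_{ij} + \gamma ^{ij} \left( D_i \beta _j + D_j \beta _i \right) = - 2 \alpha K + 2 D_i \beta ^i , \\ \partial _t \ln \gamma ^{1/2}&= {1 \over 2} \gamma ^{ij} \partial _t \gamma _{ij} = - \alpha K + D_i \beta ^i . \end{aligned}$$

Rearranging the second line gives the expression for \(\alpha K\) quoted above.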

Perfect fluids

The spacetime foliation provides us with the tools we need to formulate relativistic fluid dynamics in a way suitable for numerical simulations (compatible with the solution of the Einstein field equations for the spacetime metric, which needs to be carried out in parallel; Alcubierre 2008; Baumgarte and Shapiro 2010). However, our immediate focus is on the equations of fluid dynamics (see Font 2008 for more details).

Let us start with the simple case of baryon number conservation. That is, we assume the flux \(n u^a\) is conserved, where n is the baryon number density according to an observer moving along with the fluid. Thus, we have

$$\begin{aligned} \nabla _a (n u^a) = \nabla _a [ Wn (N^a + \hat{v}^a) ]= 0 . \end{aligned}$$

First we note that the particle number density measured by the Eulerian observer is

$$\begin{aligned} \hat{n} =-N_a n u^a = nW , \end{aligned}$$

so we have

$$\begin{aligned} N^a \nabla _a \hat{n} + \nabla _i (\hat{n} \hat{v}^i) = - \hat{n} \nabla _a N^a = \hat{n} K , \end{aligned}$$

(since \(\hat{v}^i\) is spatial). Making use of the Lie derivative and (11.18) this can be written

$$\begin{aligned} N^a \nabla _a \hat{n} = \mathcal {L}_N \hat{n} = {1\over \alpha } ( \partial _t - \mathcal {L}_\beta ) \hat{n} = - \nabla _i (\hat{n} \hat{v}^i) + \hat{n} K , \end{aligned}$$

or

$$\begin{aligned} \partial _t \hat{n} + (\alpha \hat{v}^i - \beta ^i )\nabla _i \hat{n} + \alpha \hat{n} \nabla _i \hat{v}^i = \alpha \hat{n} K . \end{aligned}$$

Finally, since \(\hat{v}^i\) and \(\beta ^i\) are already spatial, we have

$$\begin{aligned} \partial _t \hat{n} + (\alpha \hat{v}^i - \beta ^i )D_i \hat{n} + \alpha \hat{n} D_i \hat{v}^i = \alpha \hat{n} K =- \hat{n} \partial _t \ln \gamma ^{1/2} + \hat{n} D_i \beta ^i , \end{aligned}$$

or

$$\begin{aligned} \partial _t \left( \gamma ^{1/2} \hat{n}\right) + D_i \left[ \gamma ^{1/2}\hat{n} (\alpha \hat{v}^i - \beta ^i )\right] = 0. \end{aligned}$$

This simply represents the advection of the baryons along the flow, as seen by an Eulerian observer. In arriving at this result, we have used the fact that

$$\begin{aligned} \left( -g\right) ^{1/2} = \alpha \gamma ^{1/2} , \end{aligned}$$

and

$$\begin{aligned} \nabla _a (-g)^{1/2} = \nabla _a ( \alpha \gamma ^{1/2}) = 0 . \end{aligned}$$

For future reference, it is also worth noting that

$$\begin{aligned} D_i \gamma ^{1/2} = \partial _i \gamma ^{1/2} - \varGamma ^j_{ji} \gamma ^{1/2} = 0 , \end{aligned}$$

where the Christoffel symbol is the one associated with the covariant derivative in the hypersurface.
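To illustrate how the conservative form of the baryon conservation law is typically used, here is a minimal first-order upwind finite-volume sketch for a heavily simplified special case: one spatial dimension, flat space (\(\alpha = 1\), \(\beta ^i = 0\), \(\gamma = 1\)), a constant positive advection speed and periodic boundaries. These simplifications, the grid and the function name are assumptions made purely for illustration; production codes rely on high-resolution shock-capturing schemes.

```python
def upwind_step(D, flux_speed, dx, dt):
    """Advance D = sqrt(gamma) * n_hat by one time step of
        dD/dt + d(D * flux_speed)/dx = 0
    on a periodic grid. flux_speed plays the role of (alpha v_hat - beta)
    and is assumed constant and positive, so the upwind flux through the
    right face of cell i is simply flux_speed * D[i].
    """
    flux = [flux_speed * d for d in D]
    # flux[i - 1] with i = 0 wraps around to the last cell (periodicity).
    return [D[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(len(D))]


# Hypothetical setup: a top-hat overdensity on a uniform background.
dx, dt, speed = 0.01, 0.005, 0.5  # Courant number speed * dt / dx = 0.25
D = [1.0 + (0.5 if 40 <= i < 60 else 0.0) for i in range(100)]

total_before = sum(D) * dx
for _ in range(200):
    D = upwind_step(D, speed, dx, dt)
total_after = sum(D) * dx

print("baryon number before:", total_before)
print("baryon number after: ", total_after)
```

Because each face flux is added to one cell and subtracted from its neighbour, the total baryon number is conserved to rounding error, which is the discrete counterpart of the advection statement above.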


Moving on, the fluid equations of motion follow from \(\nabla _a T^{ab}=0\), where we recall that a perfect fluid is described by the stress-energy tensor

$$\begin{aligned} T^{ab} = (p+\varepsilon ) u^a u^b + p g^{ab} . \end{aligned}$$

Here p and \(\varepsilon \) are the pressure and the energy density, respectively. As discussed in Sect. 2 these quantities are related by the equation of state, which encodes the relevant microphysics. In order to make contact with this discussion, a numerical simulation must allow us to extract these quantities from the evolved variables.

However, a numerical simulation is naturally carried out using quantities measured by the Eulerian observer. That is, we decompose the stress-energy tensor into normal and spatial parts as (again, see the discussion in Sect. 5)

$$\begin{aligned} T^{ab} = \rho N^a N^b + 2 N^{(a} S^{b)} + S^{ab} , \end{aligned}$$

with (noting the conflict in notation from the discussion in Sect. 11, where \(\rho \) represented the mass density)

$$\begin{aligned} \rho= & {} N_a N_b T^{ab} = \varepsilon W^2 - p \left( 1 - W^2\right) , \end{aligned}$$
$$\begin{aligned} S^i= & {} - \gamma ^i_c N_d T^{cd} = \left( p+\varepsilon \right) W^2 \hat{v}^i , \end{aligned}$$


$$\begin{aligned} S^{ij} = \gamma ^i_c \gamma ^j_d T^{cd} = p \gamma ^{ij} + \left( p +\varepsilon \right) W^2 \hat{v}^i \hat{v}^j . \end{aligned}$$
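As a sanity check on these projections, the following sketch builds the perfect-fluid stress-energy tensor in Minkowski space and compares its components with the closed-form expressions for \(\rho \), \(S^i\) and \(S^{ij}\). It assumes a flat metric \(g = \mathrm{diag}(-1,1,1,1)\) and an Eulerian observer at rest, \(N^a = (1,0,0,0)\), so that \(\rho = T^{00}\), \(S^i = T^{0i}\) and \(S^{ij} = T^{ij}\); the function names and numerical values are hypothetical.

```python
def lorentz_factor(v):
    """W = (1 - v^2)^(-1/2) for a flat spatial metric."""
    return (1.0 - sum(vi * vi for vi in v)) ** -0.5


def perfect_fluid_stress_energy(eps, p, v):
    """T^{ab} = (p + eps) u^a u^b + p g^{ab} with g = diag(-1, 1, 1, 1)."""
    W = lorentz_factor(v)
    u = [W] + [W * vi for vi in v]
    g_inv = [[0.0] * 4 for _ in range(4)]
    g_inv[0][0] = -1.0
    for i in range(1, 4):
        g_inv[i][i] = 1.0
    return [[(p + eps) * u[a] * u[b] + p * g_inv[a][b] for b in range(4)]
            for a in range(4)]


def eulerian_decomposition(eps, p, v):
    """Closed-form rho, S^i and S^{ij} measured by the Eulerian observer."""
    W2 = lorentz_factor(v) ** 2
    rho = eps * W2 - p * (1.0 - W2)
    S = [(p + eps) * W2 * vi for vi in v]
    S_ij = [[(p if i == j else 0.0) + (p + eps) * W2 * v[i] * v[j]
             for j in range(3)] for i in range(3)]
    return rho, S, S_ij


# Illustrative (assumed) fluid state.
eps, p, v = 1.1, 0.1, (0.3, 0.4, 0.0)
T = perfect_fluid_stress_energy(eps, p, v)
rho, S, S_ij = eulerian_decomposition(eps, p, v)
```

For a moving or curved-space observer the projections would have to be carried out with the full \(N^a\) and \(\gamma ^a_b\); the flat-space case is chosen only because the comparison is then component by component.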

A projection of the equations of motion along \(N_a\) then leads to the energy equation. From

$$\begin{aligned} N^a \nabla _a \rho + \rho \nabla _a N^a + \nabla _ a S^a - N_b N^a \nabla _a S^b - N_b \nabla _a S^{ab} = 0 , \end{aligned}$$

we get

$$\begin{aligned} N^a \nabla _a \rho + \nabla _a S^a= \rho K-S^b N^a\nabla _a N_b - S^{ab}\nabla _a N_b , \end{aligned}$$

where we have used

$$\begin{aligned} N^a\nabla _a N_b = D_b \ln \alpha . \end{aligned}$$

We also have

$$\begin{aligned} {1\over \alpha } \left( \partial _t - \mathcal {L}_\beta \right) \rho + \nabla _a S^a= \rho K-S^b D_b \ln \alpha + S^{ab}K_{ab} , \end{aligned}$$

leading to

$$\begin{aligned} \partial _t \left( \gamma ^{1/2} \rho \right) + D_i \left[ \gamma ^{1/2} \left( \alpha S^i -\rho \beta ^i\right) \right] = \gamma ^{1/2} \left( \alpha S^{ij}K_{ij} -S^i D_i \alpha \right) . \end{aligned}$$

Turning to the momentum equation, which is obtained by a projection orthogonal to \(N_a\), we have

$$\begin{aligned} \rho N^a \nabla _a N^c + \gamma ^c_{\ b}N^a \nabla _a S^b + S^c \nabla _a N^a + S^a \nabla _a N^c + \gamma ^c_{\ b} \nabla _a S^{ab} =0 , \end{aligned}$$

which leads to

$$\begin{aligned} \left( \partial _t - \mathcal {L}_\beta \right) S_i - S^j \left( \partial _t - \mathcal {L}_\beta \right) \gamma _{ij} - \alpha K S_i + \rho D_i \alpha + \alpha \gamma _{ij} D_k S^{kj} = 0 , \end{aligned}$$

where we have used

$$\begin{aligned} N^a \nabla _a S^c = \mathcal {L}_N S^c + S^a \nabla _a N^c = \mathcal {L}_N S^c - S^a K_a^c . \end{aligned}$$

This leads to the final result

$$\begin{aligned} \partial _t (\gamma ^{1/2} S_i) + D_j \left[ \gamma ^{1/2} \left( \alpha S_i^j -S_i \beta ^j \right) \right] = \gamma ^{1/2} \left( S_j D_i \beta ^j - \rho D_i \alpha \right) . \end{aligned}$$

This completes the set of equations we need in order to carry out a perfect fluid simulation. The extension to more general settings follows, at least formally, the same steps.

Conservative to primitive

We have written down the set of evolution equations we need for a single-component problem. This leaves us with one important issue to resolve. How do we connect the evolution to the underlying microphysics and the equation of state? In order to do this, we have to consider the inversion from the variables used in the evolution to the “primitive” fluid variables associated with the equation of state.

Let us, in the interest of conceptual clarity, focus on the case of a cold barotropic fluid, such that the equation of state provides the energy as a function of the baryon number density \(\varepsilon = \varepsilon (n)\) (see Sect. 2). This then leads to the chemical potential

$$\begin{aligned} \mu = {d\varepsilon \over dn} , \end{aligned}$$

and the pressure p follows from the thermodynamical relation:

$$\begin{aligned} p = n \mu - \varepsilon . \end{aligned}$$

We see that, in order to connect with the thermodynamics, we need the evolved number density. We also need to decide which observer measures the equation of state quantities. In the single-fluid case this question is relatively easy to answer; we need to express the equation of state in the fluid frame (using the fibration associated with \(u^a\)).

In the simple case we consider here the evolved system, (11.27) and (11.45), provides (assuming that \(\gamma ^{1/2}\) is known from the evolution of the Einstein equations)

$$\begin{aligned} \hat{n} = nW = n (1-\hat{v}^2)^{-1/2} , \end{aligned}$$


$$\begin{aligned} S^i = (p+\varepsilon ) W^2 \hat{v}^i . \end{aligned}$$

We need to invert these two relations to extract the primitive variables, n and \(\hat{v}^i\). This can be formulated as a one-dimensional root-finding problem. For example, we may start by guessing a value for \(n=\bar{n}\). This then allows us to work out \(\varepsilon \) from the equation of state and p from (11.47). With these variables in hand we can solve

$$\begin{aligned} {S^2 \over (p+ \varepsilon )^2} = W^4 \hat{v}^2 , \quad \text{ with } \quad S^2 = \gamma _{ij}S^i S^j , \end{aligned}$$

for \(\hat{v}^2\). This, in turn, allows us to work out the Lorentz factor W and then \(\hat{v}^i\) follows from (11.49). Finally, we get \(n=\hat{n}/W\) from (11.48). The result can be compared to our initial guess \(\bar{n}\). Iterating the procedure gives a solution consistent with the conserved quantities, and hence all primitive quantities.
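The logic can be sketched in a few lines. The example below is a hedged illustration rather than a production algorithm: it assumes a flat spatial metric (\(\gamma _{ij} = \delta _{ij}\)), a toy polytropic equation of state \(\varepsilon = m n + \kappa n^\Gamma /(\Gamma -1)\) with made-up constants, and it replaces the plain iteration described above with bisection on the mismatch \(\hat{n}/W(\bar{n}) - \bar{n}\), bracketed by \(0< n \le \hat{n}\). All function names are hypothetical.

```python
import math

# Assumed toy equation of state: eps(n) = m n + kappa n^Gamma / (Gamma - 1).
M_BARYON, KAPPA, GAMMA = 1.0, 0.1, 2.0


def energy_density(n):
    return M_BARYON * n + KAPPA * n ** GAMMA / (GAMMA - 1.0)


def chemical_potential(n):
    # mu = d(eps)/dn
    return M_BARYON + KAPPA * GAMMA * n ** (GAMMA - 1.0) / (GAMMA - 1.0)


def pressure(n):
    # p = n mu - eps
    return n * chemical_potential(n) - energy_density(n)


def lorentz_factor_for_guess(n_guess, S2):
    """Solve S^2 / (p + eps)^2 = W^4 v^2 for v^2 at fixed n_guess; return W."""
    h = pressure(n_guess) + energy_density(n_guess)
    q = S2 / (h * h)
    # Root with v^2 < 1 of q (1 - v^2)^2 = v^2, written to avoid cancellation.
    v2 = 2.0 * q / (2.0 * q + 1.0 + math.sqrt(4.0 * q + 1.0))
    return 1.0 / math.sqrt(1.0 - v2)


def conservative_to_primitive(n_hat, S, tol=1e-13, max_iter=200):
    """Recover (n, v^i) from n_hat = n W and S^i = (p + eps) W^2 v^i."""
    S2 = sum(Si * Si for Si in S)  # gamma_ij = delta_ij assumed
    lo, hi = 1e-12 * n_hat, n_hat  # W >= 1 implies n <= n_hat
    for _ in range(max_iter):
        n_guess = 0.5 * (lo + hi)
        W = lorentz_factor_for_guess(n_guess, S2)
        if n_hat / W > n_guess:
            lo = n_guess  # guess too small
        else:
            hi = n_guess  # guess too large
        if hi - lo < tol * n_hat:
            break
    n = 0.5 * (lo + hi)
    W = lorentz_factor_for_guess(n, S2)
    h = pressure(n) + energy_density(n)
    return n, [Si / (h * W * W) for Si in S]


# Round trip: build conserved variables from known primitives, then invert.
n_true, v_true = 1.0, (0.3, 0.4, 0.0)
W_true = (1.0 - sum(vi * vi for vi in v_true)) ** -0.5
h_true = pressure(n_true) + energy_density(n_true)
n_hat = n_true * W_true
S = [h_true * W_true ** 2 * vi for vi in v_true]
n_rec, v_rec = conservative_to_primitive(n_hat, S)
```

Bisection is used here only because it cannot diverge once the root is bracketed; faster schemes (for example Newton iterations) are commonly preferred in practice but need more care with the initial guess.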

Unfortunately, the numerical implementation of this strategy may not be as straightforward as it sounds. For example, the result may be sensitive to the initial guess and the algorithm may not converge. This is particularly true for more complex situations (e.g., multi-parameter equations of state or problems involving magnetic fields; Font 2000; Dionysopoulou et al. 2013). However, our aim here is not to resolve the possible numerical issues. We are only outlining the logic of the approach.

The state of the art

Without attempting an exhaustive survey of the relevant literature, it is useful to provide comments on the current state of the art along with suggestions for further reading. The area of numerical simulations of general relativistic fluids is developing rapidly, stimulated by the breakthrough discoveries in gravitational-wave astronomy—in particular, the astonishing GW170817 neutron star binary merger event (Abbott et al. 2017c, b), observations of which engaged a large fraction of the global astronomy community.

Focusing on nonlinear simulations with a live spacetime, one may identify (at least) four (more or less) separate bodies of work:

  • First of all, numerical simulations have been used to explore the problem of instabilities in rotating stars and disks. This is a classic problem in applied mathematics/fluid dynamics, where perturbative studies may be used to establish the existence of an instability (for simpler models) but where numerical simulations are required for a higher level of realism and also to investigate the nonlinear evolution of an unstable system (to what extent the nonlinear coupling of different oscillation modes leads to an instability saturating at some level, etcetera). The archetypal problems (basically because they involve instabilities that grow sufficiently rapidly that they can be tracked by expensive multi-dimensional simulations) are the bar-mode instability of (rapidly and differentially) rotating stars (Tohline et al. 1985; Williams and Tohline 1987; New et al. 2000; Shibata et al. 2000; Baiotti et al. 2007) and the run-away instability of (thick) accretion disks (Zanotti et al. 2003).

  • A second setting that has been explored since the early days of numerical relativity (Stark and Piran 1985; Piran and Stark 1986) involves the gravitational collapse to form a black hole (Baiotti et al. 2005; Ott et al. 2007, 2011). The typical collapse time-scale is short enough that these simulations can be carried out without extortionate cost, but the problem involves a number of complicating issues relating to the formation of the black-hole horizon. The typical set-up involves initial data representing a stable fluid body from which pressure support is artificially removed to trigger the collapse. The main conclusion drawn from this body of work may be that the gravitational-wave signal from collapse and black-hole formation tends to be dominated by quasinormal mode ringing.

  • Realistic modelling of the core-collapse of a star that reaches the endpoint of its main-sequence life is exceedingly complicated (Janka et al. 2007; Morozova et al. 2018). The problem involves complex physics and a vast range of scales that need to be accurately tracked in a simulation. In spite of the challenges, there has been huge progress on understanding the problem in the last two decades. From the fluid dynamics point of view, the main developments involve the implementation of a (more) realistic matter description (based on nuclear physics and accounting for thermal effects; Richers et al. 2017) and developments towards an accurate implementation of neutrinos (Roberts et al. 2016; Andresen et al. 2017; Glas et al. 2019; Endrizzi et al. 2020). The latter is crucial, as the neutrinos are thought to be necessary to trigger the supernova explosion.

  • The final problem setting, attracting a lot of interest at the present time (Baiotti and Rezzolla 2017; Bernuzzi 2020), involves the inspiral and merger of binary neutron stars. Many of the challenges, regarding the physics, are the same as in the case of core-collapse simulations. The problem involves a vast range of scales, associated not so much with an explosion as with the outflow of matter that is unbound during the merger, undergoes rapid nuclear reactions and gives rise to a kilonova signal (Goriely et al. 2011; Bauswein et al. 2012; Kasen et al. 2015; Radice et al. 2018; Margalit and Metzger 2019). At the same time the hot merger remnant oscillates wildly (Stergioulas et al. 2011; Bernuzzi et al. 2015; Rezzolla and Takami 2016) until it loses enough angular momentum (or cools enough) that it (most likely) collapses to form a black hole. An important additional complication involves the presence of magnetic fields (Palenzuela et al. 2009), hugely relevant as neutron star mergers are expected to be the source of observed short gamma-ray bursts (Rezzolla et al. 2011; Paschalidis et al. 2015). This connection was observationally confirmed by the GW170817 event, but numerical simulations have not yet reached the stage where the detailed engine of these events can be explored (Ciolfi 2020).

Relativistic elasticity

Shortly after a neutron star is born, the outer layers freeze to form an elastic crust and the temperature of the high-density core drops below the level where superfluid and superconducting components are expected to be present. The different phases of matter impact on the observations in a number of ways. The crust is important as

  • it anchors the star’s magnetic field (and provides dissipative channels leading to the gradual field evolution; Viganò et al. 2013),

  • there is an immediate connection between observed quasi-periodic oscillations in the tails of magnetar flares (Strohmayer and Watts 2005) (see Watts et al. 2016 for an overview of the relevant literature) and the dynamics of the elastic nuclear lattice. An understanding of the properties of the crust is essential for efforts to match the theory to observed seismology. The idea of associating observed variability with torsional oscillations of the crust was first put forward by Duncan (1998). Relativistic aspects of the problem (particularly relevant for the present discussion) were developed by Samuelsson and Andersson (2007, 2009).

  • the ability of the crust to sustain elastic strain is key to the formation of asymmetries which may lead to detectable gravitational waves from a mature spinning neutron star. Continuous gravitational-wave searches with the LIGO-Virgo network of interferometers are beginning to set interesting upper limits for such signals for a number of known pulsars (Abbott et al. 2017a), in some instances reaching significantly below the expected maximum “mountain” size estimated from state of the art molecular dynamics simulations of the crustal breaking strain (Horowitz and Kadau 2009; Johnson-McDaniel and Owen 2013) (see Baiko and Chugunov 2018 for an alternative estimate, and note that while the model of a lattice of point-like ions may apply to the outer crust, the situation in the inner crust, with a superfluid component and possible pasta phases, is much less clear).

In essence, the elastic properties of the crust are crucial for an understanding of neutron-star phenomenology. In order for such models to reach the required level of realism we must consider the problem in the context of General Relativity. Interestingly, relativistic elasticity turns out to represent a (more or less) natural extension of the variational framework, with the key step involving the structure of matter space.

The matter space metric

The modern view of elasticity (Carter and Quintana 1972, 1975a, b; Kijowski and Magli 1992, 1997; Beig and Schmidt 2003a, b; Carter et al. 2006a) relies on comparing the actual matter configuration to an unstrained/relaxed reference shape (see Carter and Chachoua 2006; Carter and Samuelsson 2006 for discussions of how the problem changes when an interpenetrating superfluid component is present, as in the inner crust of a neutron star). In order to keep track of the reference state relative to which the strain is measured, we introduce a positive definite and symmetric tensor field, \(k_{a b}\) (Karlovini and Samuelsson 2003). The geometric meaning of this object is quite intuitive; it encodes the (three-)geometry of the solid (as seen by the solid itself). We will mostly cite key results from the extant literature about the properties of \(k_{a b}\); in particular, for the discussion that follows, the Appendix of Andersson et al. (2019) may be the most relevant.

From the point of view of the variational framework, the tensor \(k_{a b}\) is similar to \(n_{a b c}\) in the sense that it is flow-line orthogonal (Carter and Quintana 1972)

$$\begin{aligned} u^a k_{a b} = 0 . \end{aligned}$$

The main properties of \(k_{a b}\) are established by introducing the corresponding matter space object, \(k_{A B} (= k_{B A})\), via the usual map:

$$\begin{aligned} k_{a b} = \psi ^A_a \psi ^B_b k_{A B} . \end{aligned}$$

The tensor \(k_{A B}\) is “fixed” on matter space, in the same sense as \(n_{A B C}\), because it is (assumed to be) a function of its own matter space coordinates \(X^A\) only. The associated volume form is \(n_{A B C}\) (see Andersson et al. 2019). If we introduce

$$\begin{aligned} g^{A B} = \psi ^A_a \psi ^B_b g^{a b} = \psi ^A_a \psi ^B_b \perp ^{a b} , \end{aligned}$$

as before, and use Eqs. (6.5) and (6.10), then we can show that (Footnote 20)

$$\begin{aligned} n^2 = - g_{a b} n^a n^b = \frac{1}{3!} \det {\left( k_{A B}\right) } \det {\left( g^{A B}\right) } . \end{aligned}$$

Moreover, using the relations (6.13) and (12.2), we can easily establish that the Lagrangian variation of \(k_{a b}\) vanishes. That is, we have

$$\begin{aligned} \delta k_{a b} = - {\mathcal {L}}_\xi k_{a b} \quad \Longrightarrow \quad \varDelta k_{a b} = 0 . \end{aligned}$$

Finally, since \(u^a \psi ^A_a = 0\), and \(k_{A B}\) is a function of \(X^A\), we have

$$\begin{aligned} {\mathcal {L}}_u k_{A B} = u^a \psi ^C_a \frac{\partial k_{A B}}{\partial X^C} = 0 , \end{aligned}$$

and it follows that

$$\begin{aligned} {\mathcal {L}}_u k_{a b}= & {} k_{A B} {\mathcal {L}}_u \left( \psi ^A_a \psi ^B_b \right) \nonumber \\= & {} k_{A B} \left[ u^c \frac{\partial }{\partial x^c} \left( \psi ^A_a \psi ^B_b\right) + \psi ^A_c \psi ^B_b \frac{\partial u^c}{\partial x^a} + \psi ^A_a \psi ^B_c \frac{\partial u^c}{\partial x^b}\right] \nonumber \\= & {} k_{A B} u^c \left[ \frac{\partial ^2 X^A}{\partial x^c \partial x^a} \psi ^B_b + \psi ^A_a \frac{\partial ^2 X^B}{\partial x^c \partial x^b} - \frac{\partial ^2 X^A}{\partial x^a \partial x^c} \psi ^B_b - \psi ^A_a \frac{\partial ^2 X^B}{\partial x^b \partial x^c}\right] = 0 . \end{aligned}$$

Following Karlovini and Samuelsson (2003) we now introduce the matter space tensor \(\eta _{A B}\) to quantify the unsheared state. Its defining characteristic is that it is the inverse to \(g^{A B}\) but only for the relaxed configuration (when the energy density \(\varepsilon = \check{\varepsilon }\), using a check to indicate the reference shape from now on):

$$\begin{aligned} g^{A C} \eta _{C B} = \delta ^A_B , \quad \varepsilon = \check{\varepsilon } . \end{aligned}$$

If we introduce

$$\begin{aligned} \epsilon ^{A B C} = \psi ^A_a \psi ^B_b \psi ^C_c u_d \epsilon ^{d a b c} , \end{aligned}$$

then it follows from (6.10) that

$$\begin{aligned} n_{A B C} = n \epsilon _{A B C} . \end{aligned}$$

In other words,

$$\begin{aligned} \epsilon _{A B C} = \sqrt{\det {\left( \eta _{A B}\right) }} \left[ A \ B \ C\right] . \end{aligned}$$

The tensor \(\eta _{AB}\) is useful because it provides us with a straightforward way to model conformal elastic deformations. Specifically, if f is the conformal factor, we let

$$\begin{aligned} k_{A B} = f \eta _{A B} \quad \Longrightarrow \quad \det {\left( k_{A B}\right) } = f^3 \det {\left( \eta _{A B}\right) } . \end{aligned}$$

It then follows that

$$\begin{aligned} n_{A B C} = \sqrt{ \det {\left( k_{A B}\right) }} \left[ A \ B \ C\right] = n \epsilon _{A B C} = n \sqrt{\det {\left( \eta _{A B}\right) }} \left[ A \ B \ C\right] , \end{aligned}$$

which shows that \(f = n^{2/3}\). This demonstrates that k (a suitably defined 3D determinant of \(k_{a b}\), see Andersson et al. 2019) is such that \(k = n^2\) (Karlovini and Samuelsson 2003), even though \(k_{a b}\) does not itself depend on the number density.


Elastic variations

Let us now consider the variational derivation of the equations of motion for an elastic system. First of all, the fact that the Lagrangian variation of \(k_{a b}\) vanishes means that \(k_{a b}\), in addition to being a natural quantity for describing the elastic configuration, is useful in the development of Lagrangian perturbation theory.

Letting the Lagrangian \(\varLambda \) depend also on the new tensor (in essence, incorporating the energy associated with elastic strain) we have

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right) = \sqrt{- g} \left[ \mu _a \delta n^a + \left( \frac{1}{2}\varLambda g^{a b} + {\partial \varLambda \over \partial g_{ab}} \right) \delta g_{a b} + {\partial \varLambda \over \partial k_{ab} }\delta k_{ab} \right] . \end{aligned}$$

We proceed as in Sect. 6 and replace \(\delta n^a\) with the Lagrangian displacement \(\xi ^a\). In addition, it follows from (12.5) that

$$\begin{aligned} \delta k_{ab} = - \xi ^d \nabla _d k_{ab} - k_{d b} \nabla _a \xi ^d - k_{a d} \nabla _b \xi ^d . \end{aligned}$$

Again ignoring surface terms, we have (as \(k_{ab}\) is symmetric)

$$\begin{aligned} {\partial \varLambda \over \partial k_{ab} }\delta k_{ab} = \xi ^a \left[ 2 \nabla _b \left( {\partial \varLambda \over \partial k_{bd} } k_{a d}\right) - {\partial \varLambda \over \partial k_{bd} }\nabla _a k_{b d} \right] . \end{aligned}$$

Making use of this result, we arrive at

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right) = \sqrt{- g} \left\{ \left[ \frac{1}{2}\left( \varLambda - n^d \mu _d\right) g^{a b} + {\partial \varLambda \over \partial g_{ab}} \right] \delta g_{a b} + \tilde{f}_a \xi ^a \right\} , \end{aligned}$$

where

$$\begin{aligned} \tilde{f}_a = 2 n^b \nabla _{[a}\mu _{b]} + 2 \nabla _b \left( {\partial \varLambda \over \partial k_{bd} } k_{a d} \right) - {\partial \varLambda \over \partial k_{bd} }\nabla _a k_{b d} = 0 . \end{aligned}$$

As in the fluid case, this result provides the equations of motion for the system. However, we need to do a bit of work in order to get the result into a more user-friendly form. To start with, we read off the stress-energy tensor from (12.17):

$$\begin{aligned} T^{ab} = \left( \varLambda - n^d \mu _d\right) g^{a b} + 2 {\partial \varLambda \over \partial g_{ab}} . \end{aligned}$$

The next step involves giving physical meaning to \(k_{ab}\). This involves quantifying the deviation of a given state from the relaxed configuration. This is where the additional matter space tensor \(\eta _{A B}\) comes into play (Karlovini and Samuelsson 2003). This object depends on n, and relates directly to the relaxed state, see (12.8). Its spacetime counterpart is

$$\begin{aligned} \eta _{a b} = \psi ^A_a \psi ^B_b \eta _{A B} , \end{aligned}$$

and we have already seen that

$$\begin{aligned} \eta _{a b} = n^{- 2/3} k_{a b} . \end{aligned}$$

This relation is important, as we have already established that \(k_{ab}\) is a fixed matter space tensor.

Let us now imagine that the system evolves away from the relaxed state. This means that (12.8) no longer holds: \(\eta _{AB}\) retains the value set by the initial state, but \(g^{AB}\) evolves along with the spacetime. This leads to the build up of elastic strain, simply quantified in terms of the strain tensor

$$\begin{aligned} s_{a b} = {1\over 2} ( \perp _{a b} - \eta _{a b}) = {1\over 2} \left( \perp _{a b} - n^{-2/3} k_{a b} \right) . \end{aligned}$$

In the relaxed configuration, we have \(\eta _{ab} = \perp _{ab}\) by construction so it is obvious that \(s_{ab}\) vanishes.

This model is fairly intuitive, but in practice it is more natural to work with scalars formed from \(\eta _{ab}\) (which can be viewed as “invariant”). This helps make the model less abstract. Hence, we introduce the strain scalar \(s^2\) (not to be confused with the entropy density from before) as a suitable combination of the invariants of \(\eta _{ab}\):

$$\begin{aligned} I_1 = \eta ^a_{\ a} = g^{A B} \eta _{A B} , \end{aligned}$$
$$\begin{aligned} I_2 = \eta ^a_{\ b} \eta ^b_{\ a} = g^{A D} g^{B E} \eta _{E A} \eta _{D B} , \end{aligned}$$
$$\begin{aligned} I_3 = \eta ^a_{\ b} \eta ^b_{\ d} \eta ^d_{\ a} = g^{A E} g^{B F} g^{D G} \eta _{E B} \eta _{F D} \eta _{G A} . \end{aligned}$$

However, the number density n also can be seen to be a combination of invariants, since

$$\begin{aligned} k = n^2 = {1\over 3!} \left( I_1^3 - 3 I_1I_2+2I_3 \right) . \end{aligned}$$

Given this, it makes sense to replace one of the \(I_N\) (\(N=1, 2, 3\)) with n, which now becomes one of the required invariants. Then we define \(s^2\) to be a function of two of the other invariants. We can choose different combinations, but we must ensure that \(s^2\) vanishes for the relaxed state. For example, Karlovini and Samuelsson (2003) work with

$$\begin{aligned} s^2 = {1\over 36} \left( I_1^3- I_3-24 \right) . \end{aligned}$$

In the limit \(\eta _{ab} \rightarrow \perp _{ab}\) we have \(I_1 , I_3 \rightarrow 3\) and we see that the combination for \(s^2\) in Eq. (12.27) vanishes.
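These statements are easy to check numerically. The sketch below works in a local rest frame where the projected metric is assumed to be \(\perp _{ab} = \delta _{ab}\), so that index placement is immaterial and \(k_{ab}\), \(\eta _{ab}\) become \(3\times 3\) matrices. It forms \(\eta _{ab} = n^{-2/3} k_{ab}\) with \(n^2 = \det k\), evaluates the invariants and the strain scalar of Eq. (12.27), and verifies the determinant identity behind (12.25) for a generic symmetric matrix. The helper names and sample matrices are hypothetical.

```python
def matmul3(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace3(A):
    return A[0][0] + A[1][1] + A[2][2]


def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))


def invariants(M):
    """I_1, I_2, I_3 as traces of M, M^2 and M^3."""
    M2 = matmul3(M, M)
    M3 = matmul3(M2, M)
    return trace3(M), trace3(M2), trace3(M3)


def eta_from_k(k):
    """eta_ab = n^(-2/3) k_ab, using n^2 = det(k)."""
    f = det3(k) ** (1.0 / 3.0)  # conformal factor f = n^(2/3)
    return [[k[i][j] / f for j in range(3)] for i in range(3)]


def strain_scalar(eta):
    """s^2 = (I_1^3 - I_3 - 24) / 36, as in Eq. (12.27)."""
    I1, _, I3 = invariants(eta)
    return (I1 ** 3 - I3 - 24.0) / 36.0


# Pure (conformal) compression: k is proportional to the identity.
k_compressed = [[8.0, 0.0, 0.0], [0.0, 8.0, 0.0], [0.0, 0.0, 8.0]]
# Volume-preserving shear on top of the same compression.
a = 1.01
k_sheared = [[8.0 * a, 0.0, 0.0], [0.0, 8.0 / a, 0.0], [0.0, 0.0, 8.0]]

s2_compressed = strain_scalar(eta_from_k(k_compressed))
s2_sheared = strain_scalar(eta_from_k(k_sheared))

# Generic symmetric matrix for the determinant identity.
M = [[2.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.0]]
I1, I2, I3 = invariants(M)
det_from_invariants = (I1 ** 3 - 3.0 * I1 * I2 + 2.0 * I3) / 6.0
```

The compressed configuration returns a vanishing strain scalar (up to round-off), reflecting that a conformal rescaling is absorbed by the factor \(n^{-2/3}\), while the sheared one gives \(s^2 \approx (a-1)^2/2\) for small \(a-1\).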

Next, we assume that the Lagrangian of the system depends on \(s^2\), rather than the tensor \(k_{ab}\). In doing this, we need to keep in mind that Eqs. (12.21) and (12.25) show that the invariants \(I_N\) depend on n (and hence both \(n^a\) and \(g_{ab}\)) as well as \(k_{ab}\).

So far, the description is nonlinear, but in most situations of astrophysical interest it should be sufficient to consider a slightly deformed configuration (Footnote 21). In effect, we may focus on a Hookean model, such that

$$\begin{aligned} \varLambda = - \check{\varepsilon }(n) - \check{\mu }(n) s^2 = - \varepsilon , \end{aligned}$$

where \(\check{\mu }\) is the usual shear modulus, associated with a linear stress-strain relation for small deviations away from the relaxed state. (It should not be confused with the chemical potential!) As mentioned earlier, the checks indicate that quantities are calculated for the unstrained state, with the specific understanding that \(s^2=0\), and it should be apparent from (12.28) that we have an expansion in (a supposedly small) \(s^2\).

Since the strain scalar is given in terms of invariants, as in (12.27), it might be tempting to suggest a change of variables such that \(s^2=s^2(I_1,I_3)\). Our final equations of motion will, indeed, reflect this, but it would be premature to make the change at this point. Instead we note that the momentum is now given by

$$\begin{aligned} \mu _a= & {} {\partial \varLambda \over \partial n^a} = {\partial n^2 \over \partial n^a} {\partial \varLambda \over \partial n^2} \nonumber \\= & {} - {1 \over n} {\partial \varLambda \over \partial n} g_{ab}n^b = {1\over n} \left( {d \check{\varepsilon } \over dn} + {d\check{\mu }\over dn} s^2 + \check{\mu } {\partial s^2 \over \partial n} \right) g_{ab}n^b , \end{aligned}$$

and

$$\begin{aligned} {\partial \varLambda \over \partial g_{ab} }=- \left( {d\check{\varepsilon } \over dn} + {d\check{\mu }\over dn} s^2 + \check{\mu } {\partial s^2 \over \partial n} \right) {\partial n \over \partial g_{ab}} - \check{\mu } {\partial s^2 \over \partial g_{ab}} . \end{aligned}$$

Here we need (note that \(n^a\) is held fixed in the partial derivative)

$$\begin{aligned} {\partial n \over \partial g_{ab}} = - {1\over 2n} n^a n^b , \end{aligned}$$

and it is useful to note that

$$\begin{aligned} {\partial s^2 \over \partial g_{ab}} = - g^{ad} g^{be} {\partial s^2 \over \partial g^{de}} . \end{aligned}$$

Also, when working out this derivative, we need to hold n fixed [as is clear from (12.30)]. At the end of the day, we have for the stress-energy tensor

$$\begin{aligned} T^{ab}= & {} \left[ \varLambda + n \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 + \check{\mu } {\partial s^2 \over \partial n} \right) \right] g^{a b} \nonumber \\&+ {1\over n} \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 + \check{\mu } {\partial s^2 \over \partial n} \right) n^a n^b +2 \check{\mu } g^{ad} g^{be} {\partial s^2 \over \partial g^{de}} \nonumber \\= & {} \varLambda g^{ab} + n \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 + \check{\mu } {\partial s^2 \over \partial n} \right) \perp ^{ab} + 2 \check{\mu } g^{ad} g^{be} {\partial s^2 \over \partial g^{de}} . \end{aligned}$$

Let us now make the change of variables we hinted at previously. In order to establish the procedure, let us consider a situation where \(s^2\) depends only on \(I_1\). Then we need

$$\begin{aligned}&I_1 = \eta ^a_{\ a} = n^{-2/3} g^{ab} k_{ab} , \end{aligned}$$
$$\begin{aligned}&\left( {\partial s^2 \over \partial n}\right) _1= - {2I_1 \over 3n} { \partial s^2 \over \partial I_1 } , \end{aligned}$$
$$\begin{aligned}&\left( {\partial \varLambda \over \partial k_{ab}}\right) _1 = - \check{\mu } {\partial s^2 \over \partial k_{ab} } = - \check{\mu } n^{-2/3} g^{ab} { \partial s^2 \over \partial I_1 } , \end{aligned}$$

(recall the comment on the partial derivative from before) and

$$\begin{aligned} \left( {\partial s^2 \over \partial g^{de}}\right) _1 = { \partial s^2 \over \partial I_1 }\eta _{de} . \end{aligned}$$

Making use of these results, we readily find

$$\begin{aligned} T^{ab}= & {} -\varepsilon g^{ab} + n \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 \right) \perp ^{ab} + 2 \check{\mu } { \partial s^2 \over \partial I_1 } \left( \eta ^{ab} - {1\over 3} I_1 \perp ^{ab} \right) \nonumber \\= & {} -\varepsilon g^{ab} + n \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 \right) \perp ^{ab} + 2 \check{\mu } { \partial s^2 \over \partial I_1 } \eta ^{\langle ab \rangle } , \end{aligned}$$

where the \(\langle \ldots \rangle \) brackets indicate the symmetric, trace-free part of a tensor with two free indices. In our case, we have

$$\begin{aligned} \eta _{\langle ab \rangle } = \eta _{(ab)} - { 1 \over 3} \eta ^d_{\ d} \perp _{ab} . \end{aligned}$$

Comparing this result to the standard decomposition of the stress-energy tensor,

$$\begin{aligned} T^{ab} = \varepsilon u^a u^b + \bar{p} \perp ^{ab} + \pi ^{ab}, \qquad \text{ where } \qquad \pi ^a_{\ a} = 0 , \end{aligned}$$

and \(\bar{p}\) is the isotropic pressure (which differs from the fluid pressure, p, as it accounts for the elastic contribution), we see that elasticity introduces an anisotropic contribution

$$\begin{aligned} \pi ^1_{ab} = 2 \check{\mu } {\partial s^2 \over \partial I_1} \eta _{\langle ab \rangle } . \end{aligned}$$

Following the same steps for the other two invariants (see Andersson et al. 2019 for details), \(I_2\) and \(I_3\), we find that

$$\begin{aligned} \pi ^2_{ab} = 4 \check{\mu } {\partial s^2 \over \partial I_2} \eta _{d \langle a} \eta _{b \rangle }^{\ d} , \end{aligned}$$


$$\begin{aligned} \pi ^3_{ab} = 6 \check{\mu } {\partial s^2 \over \partial I_3} \eta ^{d e} \eta _{d \langle a} \eta _{b \rangle e} , \end{aligned}$$

respectively. Combining these results with (12.27), we have

$$\begin{aligned} \pi _{ab} = \sum _N \pi ^N_{ab} = {\check{\mu }\over 6} \left[ \left( \eta ^d_{\ d}\right) ^2 \eta _{\langle ab\rangle }- \eta ^{d e} \eta _{d \langle a} \eta _{b\rangle e}\right] , \end{aligned}$$

which agrees with equation (128) from Karlovini and Samuelsson (2003).
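As a quick consistency check, taking the strain scalar (12.27) at face value, the derivatives entering the three contributions \(\pi ^N_{ab}\) are

$$\begin{aligned} {\partial s^2 \over \partial I_1} = {I_1^2 \over 12} , \qquad {\partial s^2 \over \partial I_2} = 0 , \qquad {\partial s^2 \over \partial I_3} = - {1\over 36} , \end{aligned}$$

so that

$$\begin{aligned} \pi ^1_{ab} = {\check{\mu }\over 6} I_1^2 \eta _{\langle ab \rangle } , \qquad \pi ^2_{ab} = 0 , \qquad \pi ^3_{ab} = - {\check{\mu }\over 6} \eta ^{d e} \eta _{d \langle a} \eta _{b \rangle e} . \end{aligned}$$

Since \(I_1 = \eta ^d_{\ d}\), the sum of these terms reproduces the combined expression for \(\pi _{ab}\) given above.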

Now consider the final stress-energy tensor. Note first of all that, if we consider n and \(s^2\) as the independent variables of the energy functional, then the isotropic pressure should follow from

$$\begin{aligned} \bar{p} = n \left( {\partial \varepsilon \over \partial n} \right) _{s^2} - \varepsilon = \check{p} + \left( \frac{n}{\check{\mu }} {d\check{\mu }\over dn} -1 \right) {\check{\mu }} s^2 , \end{aligned}$$

where

$$\begin{aligned} \check{p} = n {d\check{\varepsilon }\over dn} - \check{\varepsilon }, \end{aligned}$$

is identical to the fluid pressure from before. However, we may also introduce a corresponding momentum, such that

$$\begin{aligned} \bar{\mu }_a = - \left( {\partial \varLambda \over \partial n^a} \right) _{s^2} = \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 \right) n_a, \end{aligned}$$

which leads to

$$\begin{aligned} \bar{p} = \varLambda - n^a \bar{\mu }_a = \check{p}+ \left( {n \over \check{\mu }} {d \check{\mu }\over dn} - 1 \right) \check{\mu }s^2 . \end{aligned}$$
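The closed-form expression for the isotropic pressure can be checked against a direct numerical derivative of the Hookean energy \(\varepsilon = \check{\varepsilon }(n) + \check{\mu }(n) s^2\). The sketch below assumes toy functional forms (a polytropic \(\check{\varepsilon }\) and a power-law shear modulus \(\check{\mu } \propto n^{4/3}\)) with made-up constants; neither choice comes from the text, and the function names are hypothetical.

```python
def eps_relaxed(n):
    """Assumed toy relaxed energy density: eps = n + 0.1 n^2."""
    return n + 0.1 * n ** 2


def d_eps_relaxed(n):
    return 1.0 + 0.2 * n


def shear_modulus(n):
    """Assumed toy shear modulus: mu = 0.01 n^(4/3)."""
    return 0.01 * n ** (4.0 / 3.0)


def d_shear_modulus(n):
    return 0.01 * (4.0 / 3.0) * n ** (1.0 / 3.0)


def energy(n, s2):
    """Hookean energy density eps(n, s^2) = eps_relaxed + mu s^2."""
    return eps_relaxed(n) + shear_modulus(n) * s2


def isotropic_pressure_numerical(n, s2, h=1e-6):
    """p_bar = n (d eps / d n) at fixed s^2, minus eps, by central differences."""
    d_energy = (energy(n + h, s2) - energy(n - h, s2)) / (2.0 * h)
    return n * d_energy - energy(n, s2)


def isotropic_pressure_closed_form(n, s2):
    """p_bar = p_check + (n/mu dmu/dn - 1) mu s^2."""
    p_check = n * d_eps_relaxed(n) - eps_relaxed(n)
    log_slope = n * d_shear_modulus(n) / shear_modulus(n)
    return p_check + (log_slope - 1.0) * shear_modulus(n) * s2


n_sample, s2_sample = 1.3, 1.0e-3
p_numerical = isotropic_pressure_numerical(n_sample, s2_sample)
p_closed = isotropic_pressure_closed_form(n_sample, s2_sample)
```

For the assumed power law the logarithmic slope is \(4/3\), so the elastic correction to the pressure is simply \(\check{\mu } s^2/3\); other choices for \(\check{\mu }(n)\) change the prefactor but not the structure.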

Finally, in order to obtain the equations of motion for the system we can either take the divergence of (12.40) or return to (12.18) and make use of our various definitions. The results are the same (as they have to be). After a little bit of work we find that (12.18) leads to

$$\begin{aligned} 2n^b\nabla _{[b}\bar{\mu }_{a]} + \perp _a^d \left( \nabla ^b \pi _{b d} - \check{\mu }\nabla _d s^2\right) = 0 , \end{aligned}$$

where it is worth noting that the combination in the parentheses is automatically flow line orthogonal.

Lagrangian perturbations of an unstrained medium

Many applications of astrophysical interest—ranging from neutron star oscillations to tidal deformations in binary systems and mountains on spinning neutron stars—are adequately modelled within perturbation theory. As should be clear from the development of the elastic model, this requires the use of a Lagrangian framework. Luckily, we have already done most of the work needed to consider this problem. In particular, we know that

$$\begin{aligned} \varDelta k_{ab} = 0 . \end{aligned}$$

We now make maximal use of this fact.

If we assume that the background configuration is relaxed, i.e. that \(s^2\) vanishes for the configuration we are perturbing with respect to, then the fluid results from Sect. 6 together with (12.50) make the elastic perturbation problem straightforward (although it still involves some algebra).

Consider, first of all, the strain scalar. A few simple steps lead to

$$\begin{aligned} \varDelta s^2 = 0 . \end{aligned}$$

To see this, recall that \(s^2\) is a function of the invariants, \(I_N\). Express these in terms of the number density n, the spacetime metric and \(k_{ab}\). Once this is done, make use of (12.50) and the fact that the background is unstrained, i.e. \(\eta _{ab} = \perp _{ab}\), to see that \(\varDelta I_N=0\), which makes intuitive sense. Since the strain scalar is quadratic, linear perturbations away from a relaxed configuration should vanish. An important implication of this result is that the last term in (12.49) does not contribute to the perturbed equations of motion.

This leads to

$$\begin{aligned} \varDelta \eta _{ab} = {1\over 3} \eta _{ab} \perp ^{de} \varDelta g_{de} , \end{aligned}$$


$$\begin{aligned} \varDelta \eta ^{ab} = \left[ - 2 g^{a(e} \eta ^{d)b} + {1\over 3} \eta ^{ab} \perp ^{de} \right] \varDelta g_{de} . \end{aligned}$$

It then follows from (12.22) and (12.44), that

$$\begin{aligned} \varDelta \pi _{ab} = - 2 \check{\mu }\varDelta s_{ab} , \end{aligned}$$


$$\begin{aligned} 2 \varDelta s_{ab} = \left( \perp ^e_{\ a} \perp ^d_{\ b} - \frac{1}{3} \perp _{ab} \perp ^{de} \right) \varDelta g_{de} . \end{aligned}$$

It is worth noting that the final result for an isotropic material agrees with, for example, Schumaker and Thorne (1983) where the relevant strain term is simply added to the stress-energy tensor (without detailed justification).
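As a rough numerical illustration of the projection in the expression for \(\varDelta s_{ab}\), the following sketch evaluates the perturbed strain for a given metric perturbation. It assumes a \((-,+,+,+)\) signature, a four-velocity normalised so that \(u^a u_a = -1\), and \(\perp _{ab} = g_{ab} + u_a u_b\). The function name `strain_perturbation`, the value `mu_check` and the sample numbers are hypothetical and only serve the example.

```python
import numpy as np


def strain_perturbation(delta_g, g, u_up):
    """Sketch of 2*Delta s_ab = (perp^e_a perp^d_b - perp_ab perp^de / 3) Delta g_de.

    Assumes signature (-,+,+,+) and u^a u_a = -1 (assumptions of this example).
    """
    g_inv = np.linalg.inv(g)
    u_down = g @ u_up                                # u_a = g_ab u^b
    perp_down = g + np.outer(u_down, u_down)         # perp_ab
    perp_up = g_inv + np.outer(u_up, u_up)           # perp^ab
    perp_mixed = np.eye(4) + np.outer(u_up, u_down)  # perp^a_b (row = upper index)
    projected = perp_mixed.T @ delta_g @ perp_mixed  # perp^e_a perp^d_b Delta g_ed
    trace = np.sum(perp_up * delta_g)                # perp^de Delta g_de
    return 0.5 * (projected - perp_down * trace / 3.0)


# Illustrative input: flat background, medium at rest, a small shear plus a
# time-time component that the projection should remove.
g = np.diag([-1.0, 1.0, 1.0, 1.0])
u_up = np.array([1.0, 0.0, 0.0, 0.0])
delta_g = np.zeros((4, 4))
delta_g[1, 2] = delta_g[2, 1] = 0.02
delta_g[0, 0] = 0.5

delta_s = strain_perturbation(delta_g, g, u_up)
mu_check = 1.0                        # hypothetical shear modulus, arbitrary units
delta_pi = -2.0 * mu_check * delta_s  # Delta pi_ab = -2 mu_check Delta s_ab
```

In this setup the result is orthogonal to the flow and trace-free, and a purely isotropic perturbation \(\varDelta g_{ab} \propto \perp _{ab}\) gives no strain, as one would expect from the structure of the projector.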

Next, let us consider the perturbed equations of motion. In the case of an unstrained background, it is easy to see that the argument that led to (7.79) still holds. This gives us the perturbation of the first term in (12.49) (after replacing \(\mu _a\rightarrow \bar{\mu }_a\)). Similarly, since \(\pi _{ab}\) vanishes in the background, the Lagrangian variation commutes with the covariant derivative in the second term. Thus, we end up with a perturbation equation of the form

$$\begin{aligned} 2n^a\nabla _{[a} \varDelta \bar{\mu }_{b]} + \nabla ^a \varDelta \pi _{ab} = 0 . \end{aligned}$$

This is the final result, but in order to arrive at an explicit expression for the perturbed momentum, it is useful to note that

$$\begin{aligned} \varDelta \mu _a = - {1 \over 2n} \check{\beta }u_a \perp ^{{b} d} \varDelta g_{{b} d} + \mu \left( \delta _a^{{b}} u^d + { 1 \over 2} u_a u^{{b}} u^d \right) \varDelta g_{{b} d} , \end{aligned}$$

where we have defined the bulk modulus \(\check{\beta }\) as

$$\begin{aligned} \check{\beta }= n {d\check{p}\over dn} = (\check{p}+ \check{\varepsilon }) {d\check{p}\over d \check{\varepsilon }} = (\check{p}+ \check{\varepsilon }) \check{C}^2_s , \end{aligned}$$

Here \(\check{C}_s\) is the sound speed in the elastic medium, and we have used the fundamental relation \(\check{p}+ \check{\varepsilon } = n \mu \). It also follows that

$$\begin{aligned} \varDelta p = - {\check{\beta } \over 2} \perp ^{ab} \varDelta g_{ab} . \end{aligned}$$

When we consider perturbations of an elastic medium we need to pay careful attention to the magnitude of the deviation away from the relaxed state. If the perturbation is too large, the material will yield (Horowitz and Kadau 2009). It may fracture or behave in some other fashion that is not appropriately described by the equations of perfect elasticity. We need to quantify the associated breaking strain. In applications involving neutron stars, this is important if we want to consider star quakes in a spinning down pulsar, establish to what extent crust quakes in a magnetar lead to the observed flares (Watts et al. 2016) and whether the crust breaks due to the tidal interaction in an inspiralling binary (Strohmayer and Watts 2005; Penner et al. 2012; Tsang et al. 2012). A commonly used criterion to discuss elastic yield strains in engineering involves the von Mises stress, defined as

$$\begin{aligned} \varTheta _{\mathrm {vM}} = \sqrt{\frac{3}{2} s_{ab}s^{ab}} \end{aligned}$$

When this scalar exceeds some critical value \(\varTheta _{\mathrm {vM}} > \varTheta ^{\mathrm {crit}}_{\mathrm {vM}}\), say, the material no longer behaves elastically. In order to work out the dominant contribution to the von Mises stress in general we need to (at least formally) consider second order perturbation theory (Andersson et al. 2019), but in the simple case of an unstrained background we have

$$\begin{aligned} \varTheta _{\mathrm {vM}} = \sqrt{\frac{3}{2} \varDelta s_{ab} \varDelta s^{ab}} = \sqrt{\frac{3}{8} \perp ^{a\langle c}\perp ^{d\rangle b}\varDelta g_{ab}\varDelta g_{cd}} \end{aligned}$$

This allows us to quantify when a strained crust reaches the point of failure, and hence to work out the maximal deformation. Unfortunately, it is difficult to model what happens beyond this point. The same is true for terrestrial materials.
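The yield criterion can be sketched in a few lines. The example below assumes that we work in the local rest frame of the medium with an orthonormal spatial triad, so that \(\varDelta s_{ab}\) reduces to a symmetric, trace-free \(3\times 3\) array and indices are raised trivially. The function names and the threshold values are placeholders chosen for illustration, not values taken from the text.

```python
import numpy as np


def von_mises_scalar(delta_s):
    """Theta_vM = sqrt((3/2) Delta s_ab Delta s^ab) for a 3x3 rest-frame strain."""
    delta_s = np.asarray(delta_s, dtype=float)
    return float(np.sqrt(1.5 * np.sum(delta_s * delta_s)))


def has_yielded(delta_s, theta_crit):
    """True when the von Mises scalar exceeds the (assumed) critical value."""
    return von_mises_scalar(delta_s) > theta_crit


# Pure shear in the x-y plane with hypothetical amplitude 0.05.
shear = 0.05
delta_s = np.array([[0.0, shear, 0.0],
                    [shear, 0.0, 0.0],
                    [0.0, 0.0, 0.0]])

theta = von_mises_scalar(delta_s)  # equals sqrt(3) * shear for pure shear
```

For a pure shear of amplitude \(\gamma \) the scalar is \(\sqrt{3}\,\gamma \), which gives a quick way to translate a quoted breaking strain into a bound on the shear amplitude.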


Low temperature physics continues to be a vibrant area of research, providing interesting and exciting challenges, many of which are associated with the properties of superfluids/superconductors. Basically, matter appears to have two options when the temperature decreases towards absolute zero. According to classical physics one would expect the atoms in a liquid to slow down and come to rest, forming a crystalline structure. It is, however, possible that quantum effects become relevant before the liquid solidifies, leading to the formation of a superfluid condensate (a quantum liquid). This will only happen if the interaction between the atoms is attractive and relatively weak. The archetypal superfluid system is Helium. It is well established that \(^4\)He exhibits superfluidity below \(T=2.17\) K. Above this temperature liquid Helium is accurately described by the Navier-Stokes equations. Below the critical temperature the modelling of superfluid \(^4\)He requires a “two-fluid” description. Two fluid degrees of freedom are required to explain, for example, “clamped” flow through narrow capillaries and the presence of a second sound (associated with heat flow).

Many other low temperature systems are known to exhibit superfluid properties. The different phases of \(^3\)He have been well studied, both theoretically and experimentally, and there is considerable current interest in atomic Bose–Einstein condensates. The relevance of superfluid dynamics reaches beyond systems that are accessible in the laboratory. It is generally expected that neutron stars will contain a number of superfluid phases. This expectation is natural given the extreme core density (reaching several times the nuclear saturation density) and low temperature (compared to the nuclear scale of the Fermi temperatures of the different constituents, about \(10^{12}~\hbox {K}\)) of these stars.

The rapid spin-up and subsequent relaxation associated with radio pulsar glitches provide strong, albeit indirect, evidence for neutron-star superfluidity (Haskell and Sedrakian 2018). The standard model for these events is based on, in the first instance, the pinning of superfluid vortices (e.g., to the crust lattice) which allows a rotational lag to build up between the superfluid and the part of the star that spins down electromagnetically, and secondly the sudden unpinning which transfers angular momentum from one component to the other, leading to the observed spin-change. Recent observations of the youngest known neutron star in the galaxy, the compact object in the Cassiopeia A supernova remnant, with an estimated age of around 330 years, are also relevant in this context. The cooling of this object seems to accord with our understanding of neutron stars with a superfluid component in the core (Page et al. 2011; Shternin et al. 2011). The idea remains somewhat controversial (see, for example, Elshamouty et al. 2013; Posselt et al. 2013; Ho et al. 2015; Posselt and Pavlov 2018), but in principle the data can be used to infer the pairing gap for neutron superfluidity in the core, which helps constrain current theory. Similarly, the slow thermal relaxation observed in neutron stars that enter quiescence at the end of an accretion phase requires a superfluid component to be present in the neutron star crust (Wijnands et al. 2017).

Basically, neutron star astrophysics provides ample motivation for us to develop a relativistic description of superfluid systems. At one level this turns out to be straightforward, given the general variational multi-fluid model. However, when we consider the fine print we uncover a number of hard physics questions. In particular, we need to make contact with microphysics calculations that determine the various parameters of the relevant multi-fluid systems. We also need to understand how to incorporate quantized vortices (Barenghi et al. 2001), and the associated mutual friction, in the relativistic context. In order to establish the proper context for the discussion, it makes sense to first discuss the multi-fluid approach to Newtonian superfluids. We do this for the particular case of Helium, the archetypal laboratory two-fluid system.

Bose–Einstein condensates

In order to understand the key aspects of the connection between the fluid model and the underlying quantum system, it is natural to consider the problem of a single component Bose–Einstein condensate. In recent years there has been a virtual explosion of interest in such systems. A key reason for this is that atomic condensates lend themselves to precision experiments, allowing researchers to probe the nature of the associated macroscopic quantum behaviour (Pethick and Smith 2008). In addition, from the relativity point of view, the description of Bose–Einstein condensates is relevant as it connects with issues that may play a role in cosmology (Sikivie and Yang 2009; Harko 2011).

On a sufficiently large scale, atomic condensates are accurately represented by a fluid model, similar to that used for superfluid Helium (described below). Consider as an example a uniform Bose gas, in a volume V, with an effective (long-range) interaction energy \(U_0\). The relevant interaction arises in the Born approximation, and is related to the s-wave scattering length a through

$$\begin{aligned} U_0 = {4\pi \hbar ^2 a\over m}, \end{aligned}$$

where m is the atomic mass. This effectively means that the model is appropriate only for dilute gases, where short-range corrections to the interaction can be ignored. In essence, we are focussing on the long-wavelength behaviour. Given the interaction, the energy of a state with N bosons (recalling that we need to multiply by the number of ways that these can be arranged in pairs) is

$$\begin{aligned} E = {N(N-1)\over 2} {U_0\over V} \approx {N^2\over 2} {U_0\over V} = {1\over 2} n^2 V U_0 , \end{aligned}$$

where we have defined the number density \(n=N/V\). From this we see that the chemical potential is

$$\begin{aligned} \mu = {dE\over dN } = {N\over V} U_0 = n U_0 . \end{aligned}$$

Alternatively, we may work with the energy density

$$\begin{aligned} \varepsilon = {E \over V} \qquad \Longrightarrow \qquad \mu = {d\varepsilon \over dn} , \end{aligned}$$

as in Sect. 2. From the usual thermodynamical relation we see that the pressure of the system follows from

$$\begin{aligned} dp = n d\mu . \end{aligned}$$
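To make these relations concrete, the sketch below evaluates \(U_0\), \(\mu \), \(\varepsilon \) and p for a uniform gas. Integrating \(dp = n\,d\mu \) with \(\mu = nU_0\) from zero density gives \(p = n^2 U_0/2\), which the code compares with the combination \(n\mu - \varepsilon \). The mass, scattering length and density are order-of-magnitude values assumed purely for illustration, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s


def interaction_strength(a, m):
    """U_0 = 4 pi hbar^2 a / m for scattering length a and atomic mass m."""
    return 4.0 * math.pi * HBAR**2 * a / m


def energy_density(n, U0):
    """epsilon = E / V = n^2 U_0 / 2 for the uniform gas."""
    return 0.5 * n**2 * U0


def chemical_potential(n, U0):
    """mu = d(epsilon)/dn = n U_0."""
    return n * U0


def pressure(n, U0):
    """p = n^2 U_0 / 2, from integrating dp = n dmu with p(n=0) = 0."""
    return 0.5 * n**2 * U0


# Assumed illustrative values (not taken from the text).
m_atom = 87 * 1.66053906660e-27  # kg, an atom of roughly 87 atomic mass units
a_s = 5.0e-9                     # m, assumed s-wave scattering length
n = 1.0e20                       # m^-3, assumed number density

U0 = interaction_strength(a_s, m_atom)
mu = chemical_potential(n, U0)
p = pressure(n, U0)
```

The check \(p = n\mu - \varepsilon \) is the same fundamental relation that is used for the fluid models elsewhere in the review, here in its simplest single-component form.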

The main theoretical tool for studying the dynamics of atomic Bose–Einstein condensates is the Gross–Pitaevskii equation. This equation, which takes the form

$$\begin{aligned} - {\hbar ^2 \over 2m} \nabla ^2 \varPsi +V_\mathrm {ext} \varPsi + U_0 |\varPsi |^2\varPsi = i \hbar \partial _t\varPsi , \end{aligned}$$

encodes the dependence of the order parameter \(\varPsi \) (note that this is not the many-body quantum wave-function) on the interaction \(U_0\) and an external potential \(V_\mathrm {ext}\). In laboratory systems the external potential usually represents an optical trap. In an astrophysical setting it can be taken as a proxy for the coupling to the gravitational field.

At low temperatures (such that we can ignore thermal excitations) the order parameter is normalized in such a way that the density of the condensate equals the density of the gas

$$\begin{aligned} |\varPsi |^2 = n. \end{aligned}$$

With this identification, we may consider the simplest problem: the stationary solution to (13.6), representing the ground state of the system. Letting the time dependence be of the form \(\varPsi =\varPsi _0 \exp (-i\mu t/\hbar ) \) we see that a uniform, stationary solution corresponds to

$$\begin{aligned} \mu = n U_0+V_\mathrm {ext} . \end{aligned}$$

Moving on to the time-dependent dynamics, we note that (13.6) describes a complex-valued function \(\varPsi \). In effect, there are two degrees of freedom to consider. Given the connection to n it is useful to consider the magnitude of \(\varPsi \). Multiplying (13.6) with \(\varPsi ^*\) (where the asterisk represents complex conjugation) and subtracting the result from its own complex conjugate, we readily arrive at

$$\begin{aligned} \partial _t |\varPsi |^2 + {\hbar \over 2mi} \nabla _i \left( \varPsi ^* \nabla ^i \varPsi - \varPsi \nabla ^i \varPsi ^* \right) = 0 . \end{aligned}$$

Comparing this result with the continuity equation, we see that the two take the same form provided that we identify (in analogy with the momentum operator in quantum mechanics) the velocity

$$\begin{aligned} v^i = {p^i \over m} = {\hbar \over 2mi} {1\over |\varPsi |^2} \left( \varPsi ^* \nabla ^i \varPsi - \varPsi \nabla ^i \varPsi ^* \right) . \end{aligned}$$

In other words, we have

$$\begin{aligned} \partial _t n + \nabla _i \left( n v^i \right) = 0 . \end{aligned}$$
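The velocity identification can be checked numerically. The sketch below builds a one-dimensional order parameter from an amplitude and a phase, \(\varPsi = \sqrt{n} \exp (iS)\) (the decomposition the text adopts shortly), evaluates the current expression by finite differences, and compares the result with \(\hbar \nabla S/m\). Units with \(\hbar = m = 1\), the grid, and the profiles chosen for n and S are all assumptions made for the example.

```python
import numpy as np

hbar = 1.0  # units with hbar = m = 1 (assumption of this sketch)
m = 1.0

x = np.linspace(0.0, 2.0 * np.pi, 2001)
n = 1.0 + 0.1 * np.cos(x)   # hypothetical density profile
S = 0.3 * np.sin(x)         # hypothetical phase profile
psi = np.sqrt(n) * np.exp(1j * S)

# Finite-difference gradient of the complex order parameter.
dpsi = np.gradient(psi, x)

# Current (hbar / 2mi)(psi* grad psi - psi grad psi*), divided by |psi|^2.
current = (hbar / (2j * m)) * (np.conj(psi) * dpsi - psi * np.conj(dpsi))
v_from_psi = (current / np.abs(psi) ** 2).real

# Analytic gradient of the phase: v = (hbar / m) dS/dx.
v_from_phase = (hbar / m) * 0.3 * np.cos(x)

# Compare away from the end points, where the one-sided stencil is less accurate.
max_error = np.max(np.abs(v_from_psi - v_from_phase)[1:-1])
```

The amplitude drops out of the velocity, which depends only on the gradient of the phase. This is the origin of the statement that the flow is potential.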

Having already made use of the magnitude, it makes sense to let the second degree of freedom in the problem be represented by the phase of \(\varPsi \). Letting \(\varPsi = \sqrt{n} \exp (iS)\) we can write the real part of (13.6) as

$$\begin{aligned} -\hbar \partial _t S = \mu + V_\mathrm {ext} + {mv^2 \over 2}- {\hbar ^2 \over 2m} {1\over \sqrt{n}} \nabla ^2 \sqrt{n} . \end{aligned}$$

Here we have identified the chemical potential as before. We have also used

$$\begin{aligned} {\hbar ^2 \over 2m} (\nabla _i S) (\nabla ^i S) = {mv^2 \over 2} , \end{aligned}$$

which follows from (13.10). Finally, we take the gradient of (13.12) to get

$$\begin{aligned}&m \partial _t v_i + \nabla _i \left[ \mu + V_\mathrm {ext} + {mv^2 \over 2}- {\hbar ^2 \over 2m} {1\over \sqrt{n}} \nabla ^2 \sqrt{n} \right] \nonumber \\&\quad = m \left( \partial _t +v^j \nabla _j \right) v_i + \nabla _i \left( \mu + V_\mathrm {ext} \right) \nonumber \\&\qquad +\,m \epsilon _{ijk} v^j \left( \epsilon ^{klm} \nabla _l v_m\right) - \nabla _i \left( {\hbar ^2 \over 2m} {1\over \sqrt{n}} \nabla ^2 \sqrt{n} \right) = 0 . \end{aligned}$$

By definition, the flow is potential and hence irrotational (at least as long as we ignore quantum vortices, which we consider later), so

$$\begin{aligned} m \left( \partial _t +v^j \nabla _j \right) v_i + \nabla _i \left( \mu + V_\mathrm {ext} \right) - \nabla _i \left( {\hbar ^2 \over 2m} {1\over \sqrt{n}} \nabla ^2 \sqrt{n} \right) = 0 . \end{aligned}$$

Comparing to the standard fluid result, we see that only the last term differs. Notably, it is also the only term that (explicitly) retains the quantum origins of the model (Planck’s constant!).

So far, we have not made any simplifications. The two Eqs. (13.11) and (13.15) contain the same information as the Gross–Pitaevskii equation (13.6). The equations differ from those for irrotational fluid flow only by the presence of the final term in (13.15). This term, which represents a “quantum pressure” is, however, irrelevant as long as we focus on the large-scale dynamics. To see this, assume that the order parameter varies on some length-scale L. It then follows that

$$\begin{aligned} \nabla \mu \sim {nU_0 \over L} \qquad \text{ and } \qquad \nabla \left( {\hbar ^2 \over 2m} {1\over \sqrt{n}} \nabla ^2 \sqrt{n} \right) \sim {\hbar ^2 \over mL^3} . \end{aligned}$$

In other words, the quantum pressure can be neglected as long as

$$\begin{aligned} {\hbar ^2 \over mn L^2 U_0} \ll 1 . \end{aligned}$$

In order to give this relation a clearer meaning, we introduce the coherence length \(\xi \), roughly the length-scale on which the kinetic energy balances the pressure. This leads to

$$\begin{aligned} {\hbar ^2 \over 2m \xi ^2 } \approx n U_0 , \end{aligned}$$

and we can neglect the quantum pressure as long as

$$\begin{aligned} \left( {\xi \over L} \right) ^2 \ll 1 . \end{aligned}$$

As long as this condition is satisfied, a low temperature Bose–Einstein condensate is faithfully represented by a fluid model. In the atomic condensate literature this regime is sometimes referred to as the Thomas–Fermi limit. It is worth noting that, even though the above condition implies that the fluid model is appropriate on larger scales, it is fundamentally not the same averaging argument that leads to the notion of a fluid element in the usual discussion. In the case of quantum condensates, the fluid model may in fact be appropriate at much shorter scales since it tends to be the case that the coherence length is vastly smaller than the mean-free path of the various particles that make up a normal “fluid”. This scale enters the quantum problem once we consider finite temperature excitations, being relevant for the second component that then comes into play.
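A short numerical sketch may help to give a feeling for the scales involved. Taking the defining relation \(\hbar ^2/2m\xi ^2 \approx nU_0\) as an equality gives \(\xi = \hbar /\sqrt{2mnU_0}\), which reduces to \(1/\sqrt{8\pi n a}\) once \(U_0 = 4\pi \hbar ^2 a/m\) is inserted. The atomic mass, scattering length, density and system size used below are assumed order-of-magnitude values, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s


def coherence_length(n, U0, m):
    """xi from hbar^2 / (2 m xi^2) = n U_0, treating the estimate as an equality."""
    return HBAR / math.sqrt(2.0 * m * n * U0)


def quantum_pressure_ratio(xi, L):
    """(xi / L)^2, which should be much smaller than one for the fluid model."""
    return (xi / L) ** 2


# Assumed illustrative values (not taken from the text).
m_atom = 87 * 1.66053906660e-27  # kg
a_s = 5.0e-9                     # m, assumed s-wave scattering length
n = 1.0e20                       # m^-3, assumed number density
L = 1.0e-5                       # m, assumed scale of variation

U0 = 4.0 * math.pi * HBAR**2 * a_s / m_atom
xi = coherence_length(n, U0, m_atom)
ratio = quantum_pressure_ratio(xi, L)
```

With these assumed numbers the coherence length is a fraction of a micron, so the quantum pressure is a small correction on scales of ten microns or more.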


Helium: the original two-fluid model

Phenomenologically, the behaviour of superfluid Helium is “easy” to understand if one first considers a system at absolute zero temperature. Then the dynamics is entirely due to the quantum condensate (as in the previous example). There exists a single quantum wavefunction, and the momentum of the flow follows directly from the gradient of its phase. This immediately implies that the flow is irrotational. At finite temperatures, one must also account for thermal excitations (like phonons). A second dynamical degree of freedom arises since the excitation gas may drift relative to the atoms. In the standard two-fluid model, one makes a distinction between a “normal” fluid component and a superfluid part. The identification of the associated densities is to a large extent “statistical” as one cannot physically separate the “normal” component from the “superfluid” one. It is important to keep this in mind.

We take as our starting point the Newtonian version of the multi-fluid framework. We consider the simplest conducting system corresponding to a single particle species exhibiting superfluidity. Such systems typically have two degrees of freedom, cf. \(\mathrm {He}^4\) (Putterman 1974; Tilley and Tilley 1990) where the entropy can flow independently of the superfluid Helium atoms. Superfluid \(\mathrm {He}^3\) can also be included in the mixture, in which case there will be a relative flow of the \(\mathrm {He}^3\) isotope with respect to \(\mathrm {He}^4\), and relative flows of each with respect to the entropy (Vollhardt and Wölfle 2002). The model we advocate here distinguishes the atoms from the massless “entropy”; the former will be identified by a constituent index \(\mathrm {n}\), while the latter is represented by \(\mathrm {s}\). As this description is different (in spirit) from the standard two-fluid model for Helium, it is relevant to explain how the two descriptions are related.

First of all, we need to allow for a difference in the two three-velocities

$$\begin{aligned} w_i^{\mathrm {y}\mathrm {x}} = {v}_{i}^{\mathrm {y}} - {v}_{i}^{\mathrm {x}} , \quad {\mathrm {y}}\ne {\mathrm {x}}. \end{aligned}$$

Letting the square of this difference be given by \(w^2\), the equation of state then takes the form \(\mathcal{E} = \mathcal{E}(n_{\mathrm {n}},n_{\mathrm {s}},w^2)\). Hence, we have

$$\begin{aligned} {d} \mathcal{E} = \mu ^{\mathrm {n}} \, {d} n_{\mathrm {n}} + \mu ^{\mathrm {s}} \, {d} n_{\mathrm {s}} + \alpha \, {d} w^2, \end{aligned}$$


$$\begin{aligned} \mu ^{\mathrm {n}} = \left. \frac{\partial \mathcal {E}}{\partial n_{\mathrm n}} \right| _{n_{\mathrm {s}},w^2}, \qquad \mu ^{\mathrm {s}} = \left. \frac{\partial \mathcal {E}}{\partial n_{\mathrm s}} \right| _{n_{\mathrm {n}},w^2}, \qquad \alpha = \left. \frac{\partial \mathcal {E}}{\partial w^2} \right| _{n_{\mathrm {n}},n_{\mathrm {s}}}. \end{aligned}$$

The \(\alpha \) coefficient reflects the effect of entrainment on the equation of state. Similarly, entrainment causes the fluid momenta to be modified to

$$\begin{aligned} \frac{p^\mathrm {x}_i}{m^\mathrm {x}} = v_i^\mathrm {x}+ 2 \frac{\alpha }{\rho _\mathrm {x}} w_i^{\mathrm {y}\mathrm {x}}. \end{aligned}$$

The number density of each fluid obeys a continuity equation:

$$\begin{aligned} \frac{\partial n_{\mathrm {x}}}{\partial t} + \nabla _{j} (n_{\mathrm {x}} v_{\mathrm {x}}^{j}) = 0. \end{aligned}$$

Each fluid also satisfies an Euler-type equation, which ensures the conservation of total momentum. This equation can be written

$$\begin{aligned} \left( \frac{\partial }{\partial t} + {v}^{j}_{\mathrm {x}}\nabla _{j} \right) \left[ {v}_{i}^{\mathrm {x}} + \varepsilon _{\mathrm {x}} w_i^{\mathrm {y}\mathrm {x}} \right] + \nabla _{i} (\varPhi + \tilde{\mu }_{\mathrm {x}}) + \varepsilon _{\mathrm {x}} w_j^{\mathrm {y}\mathrm {x}} \nabla _{i} v^{j}_{\mathrm {x}} = 0, \end{aligned}$$


$$\begin{aligned} \tilde{\mu }_\mathrm {x}= \frac{\mu ^\mathrm {x}}{m^\mathrm {x}} , \end{aligned}$$

and the entrainment is now included via the coefficients

$$\begin{aligned} \varepsilon _\mathrm {x}= \frac{2 \alpha }{\rho _\mathrm {x}} . \end{aligned}$$

For a detailed discussion of these equations, see Prix (2004) and Andersson and Comer (2006).

We have already seen that the entrainment means that each momentum does not have to be parallel to the associated flux. In the case of a two-component system, with a single species of particle flowing with \(n^\mathrm {n}_i = n v^\mathrm {n}_i\) and a massless entropy with flux \(n^\mathrm {s}_i = sv^\mathrm {s}_i\) (i.e., letting \(n_\mathrm {n}=n\) and \(n_\mathrm {s}=s\), where n is the particle number density and s represents the entropy per unit volume), the momentum densities are

$$\begin{aligned} \pi _i^\mathrm {n}= n p_i^\mathrm {n}= mn v_i^\mathrm {n}- 2 \alpha w_i^{\mathrm {n}\mathrm {s}} , \end{aligned}$$


$$\begin{aligned} \pi ^\mathrm {s}_i = s p^\mathrm {s}_i = 2 \alpha w_i^{\mathrm {n}\mathrm {s}} . \end{aligned}$$

In order to understand the physical relevance of the entrainment better, let us compare the two-fluid model to the orthodox model used to describe laboratory superfluids. This also clarifies the dynamical role of the thermal excitations in the system.

Expressed in terms of the momentum densities, the two momentum equations can be written, cf. (13.25),

$$\begin{aligned} \partial _t \pi _i^\mathrm {n}+ \nabla _j \left( v_\mathrm {n}^j \pi _i^\mathrm {n}\right) + n \nabla _i \left( \mu _\mathrm {n}- \frac{1}{2} m v_\mathrm {n}^2 \right) + \pi _j^\mathrm {n}\nabla _i v_\mathrm {n}^j = 0 , \end{aligned}$$


$$\begin{aligned} \partial _t \pi _i^\mathrm {s}+ \nabla _j \left( v_\mathrm {s}^j \pi _i^\mathrm {s}\right) + s \nabla _i T + \pi _j^\mathrm {s}\nabla _i v_\mathrm {s}^j = 0 , \end{aligned}$$

where we have used the fact that the temperature follows from \(\mu _\mathrm {s}= T\). Let us now assume that we are considering a superfluid system. For low temperatures and velocities the fluid described by (13.30) should be irrotational. In order to impose this constraint we need to appreciate that it is the momentum that is quantized in a rotating superfluid, not the velocity. This means that we require

$$\begin{aligned} \epsilon ^{klm} \nabla _l p^\mathrm {n}_m = 0 . \end{aligned}$$

To see how this affects the equations of motion, we rewrite (13.30) as

$$\begin{aligned} n \partial _t p_i^\mathrm {n}+ n \nabla _i \left[ \mu _\mathrm {n}- \frac{m}{2} v_\mathrm {n}^2 + v_\mathrm {n}^j p_j^\mathrm {n}\right] - n \epsilon _{ijk} v_\mathrm {n}^j (\epsilon ^{klm} \nabla _l p^\mathrm {n}_m) = 0 \end{aligned}$$

Using (13.32) we have

$$\begin{aligned} \partial _t p_i^\mathrm {n}+ \nabla _i \left[ \mu _\mathrm {n}- \frac{m}{2} v_\mathrm {n}^2 + v_\mathrm {n}^j p_j^\mathrm {n}\right] = 0 . \end{aligned}$$

We now have all the expressions we need to make a direct comparison with the standard two-fluid model for Helium.

It is natural to begin by identifying the drift velocity of the quasiparticle excitations in the two models. After all, this is the variable that leads to the “two-fluid” dynamics. Moreover, since it distinguishes the part of the flow that is affected by friction it has a natural physical interpretation. In the standard two-fluid model this velocity, \(v_{\mathrm {N}}^i\), is associated with the “normal fluid” component. In the variational framework, the excitations are directly associated with the entropy of the system, which flows with \(v_\mathrm {s}^i\). These two quantities should be the same, and hence we have

$$\begin{aligned} v_{\mathrm {N}}^i = v_\mathrm {s}^i . \end{aligned}$$

The second fluid component, the “superfluid”, is usually associated with a “velocity” \(v_{\mathrm {S}}^i\). This quantity is directly linked to the gradient of the phase of the superfluid condensate wave function, so it is, in fact, a rescaled momentum. We should therefore identify

$$\begin{aligned} v_{\mathrm {S}}^i = \frac{\pi ^i_\mathrm {n}}{\rho _\mathrm {n}} = \frac{p_\mathrm {n}^i}{m} . \end{aligned}$$

These identifications lead to

$$\begin{aligned} \rho v_{\mathrm {S}}^i = \rho \left[ \left( 1 - \varepsilon \right) v_\mathrm {n}^i + \varepsilon v_{\mathrm {N}}^i \right] , \end{aligned}$$

where \(\varepsilon = 2\alpha /\rho \) and \(\rho \) is the total mass density. We see that the total mass current is

$$\begin{aligned} \rho v_\mathrm {n}^i = \frac{\rho }{1 - \varepsilon } v_{\mathrm {S}}^i - \frac{\varepsilon \rho }{1 - \varepsilon } v_{\mathrm {N}}^i . \end{aligned}$$

If we introduce the superfluid and normal fluid densities,

$$\begin{aligned} \rho _{\mathrm {S}}= \frac{\rho }{1 - \varepsilon } , \qquad \text{ and } \qquad \rho _{\mathrm {N}}= - \frac{\varepsilon \rho }{1 - \varepsilon } , \end{aligned}$$

we arrive at the usual result (Khalatnikov 1965; Putterman 1974)

$$\begin{aligned} \rho v_\mathrm {n}^i = \rho _{\mathrm {S}}v_{\mathrm {S}}^i + \rho _{\mathrm {N}}v_{\mathrm {N}}^i . \end{aligned}$$

Obviously, it is the case that \(\rho = \rho _{\mathrm {S}}+ \rho _{\mathrm {N}}\). This completes the translation between the two formalisms. Comparing the two descriptions, it is clear that the variational approach has identified the natural physical variables—the average drift velocity of the excitations and the total momentum flux. Since the system can be “weighed” the total density \(\rho \) also has a clear interpretation. Moreover, the variational derivation identifies the truly conserved fluxes. In contrast, the standard model uses quantities that only have a statistical meaning. The density \(\rho _{\mathrm {N}}\) is inferred from the mean drift momentum of the excitations. That is, there is no “group” of excitations that can be identified with this density. Since the superfluid density \(\rho _{\mathrm {S}}\) is inferred from \(\rho _{\mathrm {S}}= \rho -\rho _{\mathrm {N}}\), it is a statistical concept, as well. Furthermore, the two velocities, \(v_{\mathrm {N}}^i\) and \(v_{\mathrm {S}}^i\), are not individually associated with a conservation law. From a practical point of view, this is not a problem. The various quantities can be calculated from microscopic theory and the results are known to compare well to experiments. At the end of the day, the two descriptions are (as far as applications are concerned) identical and the preference of one over the other is very much a matter of taste (or convention).
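The translation between the two sets of variables is simple enough to sketch in code. The function below takes the variational quantities (total density \(\rho \), entrainment coefficient \(\alpha \), and the velocities \(v_\mathrm {n}^i\) and \(v_\mathrm {s}^i\)) and returns the orthodox two-fluid quantities, using only the relations given above. The function name and the sample values are hypothetical. Note that a positive normal fluid density requires a negative \(\alpha \).

```python
import numpy as np


def to_standard_two_fluid(rho, alpha, v_n, v_s):
    """Map (rho, alpha, v_n, v_s) to (rho_S, rho_N, v_S, v_N)."""
    eps = 2.0 * alpha / rho                  # epsilon = 2 alpha / rho
    v_N = np.asarray(v_s, dtype=float)       # excitations move with the entropy
    v_S = (1.0 - eps) * np.asarray(v_n, dtype=float) + eps * v_N
    rho_S = rho / (1.0 - eps)
    rho_N = -eps * rho / (1.0 - eps)
    return rho_S, rho_N, v_S, v_N


# Hypothetical sample values in arbitrary units.
rho = 1.0
alpha = -0.1
v_n = np.array([1.0, 0.0, 0.0])
v_s = np.array([0.2, 0.1, 0.0])

rho_S, rho_N, v_S, v_N = to_standard_two_fluid(rho, alpha, v_n, v_s)

# Total mass current evaluated in both descriptions.
current_variational = rho * v_n
current_standard = rho_S * v_S + rho_N * v_N
```

The two expressions for the mass current agree, and the densities add up to \(\rho \), for any choice of input with \(\varepsilon \ne 1\).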

The above results show that the entropy entrainment coefficient follows from the “normal fluid” density according to

$$\begin{aligned} \alpha = - \frac{\rho _{\mathrm {N}}}{2} \left( 1 - \frac{\rho _{\mathrm {N}}}{\rho } \right) ^{-1} . \end{aligned}$$

This shows that the entrainment coefficient diverges as the temperature increases towards the superfluid transition and \(\rho _{\mathrm {N}}\rightarrow \rho \). At first sight, this may seem an unpleasant feature of the model. However, it is simply a manifestation of the fact that the two fluids must lock together as one passes through the phase transition. The model remains non-singular as long as \(v_i^\mathrm {n}\) approaches \(v_i^\mathrm {s}\) sufficiently fast as the critical temperature is approached. More detailed discussions of entrainment and finite temperature superfluids can be found in Andersson et al. (2013), Gusakov and Andersson (2006), Kantor and Gusakov (2011), Gusakov et al. (2009), Gusakov and Haensel (2005), Leinson (2018), Dommes et al. (2020) and Rau and Wasserman (2020).
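The behaviour near the transition can be illustrated with a minimal sketch that evaluates \(\alpha \) for a sequence of normal fluid fractions approaching unity. The function names and the sample fractions are hypothetical. The round trip through \(\varepsilon = 2\alpha /\rho \) is included only as a consistency check against the definition of \(\rho _{\mathrm {N}}\).

```python
def entrainment_from_normal_density(rho, rho_N):
    """alpha = -(rho_N / 2) / (1 - rho_N / rho); singular when rho_N = rho."""
    return -0.5 * rho_N / (1.0 - rho_N / rho)


def normal_density_from_entrainment(rho, alpha):
    """Inverse map: rho_N = -eps rho / (1 - eps) with eps = 2 alpha / rho."""
    eps = 2.0 * alpha / rho
    return -eps * rho / (1.0 - eps)


rho = 1.0
fractions = [0.1, 0.5, 0.9, 0.99]  # hypothetical values of rho_N / rho
alphas = [entrainment_from_normal_density(rho, f * rho) for f in fractions]
```

The magnitude of \(\alpha \) grows without bound as the fraction approaches one, in line with the discussion above. A numerical implementation would therefore need to handle the limit with care, for example by enforcing the locking of the two velocities.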

Having related the main variables, let us consider the form of the equations of motion. We start with the inviscid problem. It is common to work with the total momentum. Thus, we combine (13.30) and (13.31) to get

$$\begin{aligned}&\partial _t \left( \pi _i^\mathrm {n}+ \pi _i^\mathrm {s}\right) + \nabla _l \left( v_\mathrm {n}^l \pi ^\mathrm {n}_i + v_\mathrm {s}^l \pi _i^\mathrm {s}\right) + n \nabla _i \mu _\mathrm {n}+ s \nabla _i T \nonumber \\&\quad - n \nabla _i \left( \frac{1}{2} m v_\mathrm {n}^2 \right) + \pi _l^\mathrm {n}\nabla _i v_\mathrm {n}^l + \pi _l^\mathrm {s}\nabla _i v_\mathrm {s}^l = 0 . \end{aligned}$$

Here we have

$$\begin{aligned} \pi _i^\mathrm {n}+ \pi _i^\mathrm {s}= \rho v_i^\mathrm {n}\equiv j_i \end{aligned}$$

which defines the total momentum density. From the continuity equations (13.24) we see that

$$\begin{aligned} \partial _t \rho + \nabla _i j^i = 0 . \end{aligned}$$

The pressure \(\varPsi \) follows from

$$\begin{aligned} \nabla _i \varPsi = n \nabla _i \mu _\mathrm {n}+ s \nabla _i T - \alpha \nabla _i w_{\mathrm {n}\mathrm {s}}^2 , \end{aligned}$$

and we also need the relation

$$\begin{aligned} v_\mathrm {n}^l \pi _i^\mathrm {n}+ v_\mathrm {s}^l \pi _i^\mathrm {s}= v^{\mathrm {S}}_i j^l + v_{\mathrm {N}}^l j^0_i , \end{aligned}$$

where we have defined

$$\begin{aligned} j^0_i = \rho _{\mathrm {N}}(v_i^{\mathrm {N}}- v_i^{\mathrm {S}}) = \pi _i^\mathrm {s}, \end{aligned}$$


$$\begin{aligned} \pi _l^\mathrm {n}\nabla _i v_\mathrm {n}^l + \pi _l^\mathrm {s}\nabla _i v_\mathrm {s}^l = n \nabla _i \left( \frac{1}{2} m v_\mathrm {n}^2 \right) - 2 \alpha w_l^{\mathrm {n}\mathrm {s}} \nabla _i w^l _{\mathrm {n}\mathrm {s}} . \end{aligned}$$

Putting all the pieces together we have

$$\begin{aligned} \partial _t j_i + \nabla _l \left( v_i^{\mathrm {S}}j^l + v_{\mathrm {N}}^l j^0_i\right) + \nabla _i \varPsi = 0 . \end{aligned}$$

The second equation of motion follows directly from (13.34);

$$\begin{aligned} \partial _t v_i^{\mathrm {S}}+ \nabla _i \left( \tilde{\mu }_{\mathrm {S}}+ \frac{1}{2} v_{\mathrm {S}}^2 \right) = 0 , \end{aligned}$$

where we have defined

$$\begin{aligned} \tilde{\mu }_{\mathrm {S}}= \frac{1}{m} \mu _\mathrm {n}- \frac{1}{2} \left( v_\mathrm {n}^i - v_{\mathrm {S}}^i\right) ^2 . \end{aligned}$$
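As a brief consistency check (a sketch that uses only the identification \(v_{\mathrm {S}}^i = p_\mathrm {n}^i/m\) made earlier), dividing the bracket in (13.34) by m and completing the square gives

$$\begin{aligned} \frac{1}{m} \left[ \mu _\mathrm {n}- \frac{m}{2} v_\mathrm {n}^2 + v_\mathrm {n}^j p_j^\mathrm {n}\right]&= \frac{1}{m} \mu _\mathrm {n}- \frac{1}{2} v_\mathrm {n}^2 + v_\mathrm {n}^j v_j^{\mathrm {S}} \nonumber \\&= \frac{1}{m} \mu _\mathrm {n}- \frac{1}{2} \left( v_\mathrm {n}^i - v_{\mathrm {S}}^i\right) ^2 + \frac{1}{2} v_{\mathrm {S}}^2 = \tilde{\mu }_{\mathrm {S}}+ \frac{1}{2} v_{\mathrm {S}}^2 , \end{aligned}$$

which is the combination that appears in the equation of motion for \(v_i^{\mathrm {S}}\).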

The above relations show that our inviscid equations of motion are identical to the standard ones (Khalatnikov 1965; Putterman 1974). The identified relations between the different variables also provide a direct way to translate the quantities in the two descriptions. For example, we can write down a generalized first law, starting from (13.21). The key point is that we have demonstrated how the “normal fluid density” corresponds to the entropy entrainment in the variational model. This clarifies the role of the entropy entrainment; a quantity that arises in a natural way within the variational framework.

Relativistic models

Neutron star physics provides ample motivation for the need to develop a relativistic description of superfluid systems. As the typical core temperatures (below \(10^8~\mathrm {K}\)) are far below the Fermi temperature of the various constituents (of the order of \(10^{12}~\mathrm {K}\) for baryons), mature neutron stars are extremely cold on the nuclear temperature scale. This means that, just like ordinary matter at near absolute zero temperature, the matter in the star will most likely freeze to a solid or become superfluid. While the outer parts of the star, the so-called crust, form an elastic lattice, the inner parts of the star are expected to be superfluid. In practice, this means that we must be able to model mixtures of superfluid neutrons and superconducting protons. It is also likely that we need to understand superfluid hyperons and colour superconducting quarks. There are many hard physics questions that need to be considered if we are to make progress in this area. In particular, we need to make contact with microphysics calculations that determine parameters of such multi-fluid systems.

One of the key features of a pure superfluid is that it is irrotational. On a larger scale, bulk rotation is mimicked by the formation of vortices, slim “tornadoes” representing regions where the superfluid degeneracy is broken (Barenghi et al. 2001). In practice, this means that one would often, e.g., when modelling global neutron star oscillations, consider a macroscopic model based on “averaging” over a large number of vortices. The resulting model closely resembles the standard fluid model. Of course, it is important to remember that the vortices are present on the microscopic scale and that they may affect the parameters in the problem. There are also unique effects that are due to the vortices, e.g., the mutual friction that is thought to be the key agent that counteracts relative rotation between the neutrons and protons in a superfluid neutron star core (Mendell 1991b).

For the present discussion, let us focus on the case of superfluid \(\mathrm {He}^4\). We then have two fluids, the superfluid Helium atoms with particle number density \(n_\mathrm {n}\) and the entropy with particle number density \(n_\mathrm {s}\), as before. From the derivation in Sect. 9 we know that the equations of motion can be written

$$\begin{aligned} \nabla _a n_{\mathrm {x}}^a = 0 , \end{aligned}$$


$$\begin{aligned} n_{\mathrm {x}}^b \nabla _{[b} \mu ^{\mathrm {x}}_{a]} = 0. \end{aligned}$$

To make contact with other discussions of the superfluid problem (Carter and Khalatnikov 1992, 1994; Carter and Langlois 1995a, 1998), we will use the notation \(s^a= n_\mathrm {s}^a\) and \(\varTheta _a = \mu _a^\mathrm {s}\). Then the equations that govern the motion of the entropy become

$$\begin{aligned} \nabla _a s^a = 0 \qquad \mathrm {and} \qquad s^b \nabla _{[b} \varTheta _{a]} = 0 . \end{aligned}$$

Now, since the superfluid constituent is irrotational we also have

$$\begin{aligned} \nabla _{[a} \mu ^\mathrm {n}_{b]} = 0 . \end{aligned}$$

The particle conservation law for the matter component is, of course, unaffected by this constraint. This shows how easy it is to restrict the multi-fluid equations to the case where one (or several) components are irrotational. It is worth emphasizing that it is the momentum that is quantized, not the velocity. This is an important distinction in situations where entrainment plays a role.

It is instructive to contrast this description with other models, like the potential formulation due to Khalatnikov and Lebedev (1982) and Lebedev and Khalatnikov (1982). We arrive at this alternative formulation in the following way (Carter and Khalatnikov 1994). First of all, we know that the irrotationality condition implies that the particle momentum can be written as a gradient of a scalar potential, \(\varphi \) (say). That is, we have

$$\begin{aligned} V_a = - \frac{\mu ^\mathrm {n}_a}{m} = - \nabla _a \varphi . \end{aligned}$$

Here m is the mass of the Helium atom and \(V_a\) is traditionally (and somewhat confusingly, see the previous section) referred to as the “superfluid velocity”. It really is a rescaled momentum. Next assume that the momentum of the remaining fluid (in this case, the entropy) is written

$$\begin{aligned} \mu ^\mathrm {s}_a = \varTheta _a = \kappa _a + \nabla _a \phi . \end{aligned}$$

Here \(\kappa _a\) is Lie transported along the entropy flow provided that \(s^a \kappa _a = 0\) (assuming that the equation of motion (13.54) is satisfied). This leads to

$$\begin{aligned} s^a \nabla _a \phi = s^a \varTheta _a . \end{aligned}$$

There is now no loss of generality in introducing further scalar potentials \(\beta \) and \(\gamma \) such that \(\kappa _a = \beta \nabla _a \gamma \), where the potentials are constant along the flow-lines as long as

$$\begin{aligned} s^a \nabla _a \beta = s^a \nabla _a \gamma = 0. \end{aligned}$$

Given this, we have

$$\begin{aligned} \varTheta _a = \nabla _a \phi + \beta \nabla _a \gamma . \end{aligned}$$
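As a quick consistency check (a sketch added for illustration, not part of the original derivation), one can verify that this decomposition solves the entropy equation of motion in (13.54). The gradient term drops out of the antisymmetrized derivative, leaving

$$\begin{aligned} 2 \nabla _{[b} \varTheta _{a]} = \nabla _b \beta \nabla _a \gamma - \nabla _a \beta \nabla _b \gamma \quad \Longrightarrow \quad 2 s^b \nabla _{[b} \varTheta _{a]} = \left( s^b \nabla _b \beta \right) \nabla _a \gamma - \left( s^b \nabla _b \gamma \right) \nabla _a \beta = 0 , \end{aligned}$$

where the final equality uses the fact that \(\beta \) and \(\gamma \) are constant along the flow-lines.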

Finally, comparing to Khalatnikov’s formulation (Khalatnikov and Lebedev 1982; Lebedev and Khalatnikov 1982) we define \(\varTheta _a = - \kappa w_a\) and let \(\phi \rightarrow \kappa \zeta \) and \(\beta \rightarrow \kappa \beta \). Then we arrive at the final equation of motion

$$\begin{aligned} - \frac{\varTheta _a}{\kappa } = w_a = - \nabla _a \zeta - \beta \nabla _a \gamma . \end{aligned}$$

Equations (13.56) and (13.61), together with the standard particle conservation laws, are the key equations of the potential formulation. The content of this description is (obviously) identical to that of the variational picture, and we have now seen how the various quantities can be related.

This example shows how easy it is to specify the equations that we derived earlier to the case when one (or several) components are irrotational/superfluid.

Another alternative approach, related to the field theory inspired discussion in Sect. 6.4, is based on the notion of broken symmetries. At a very basic level, a model with a broken U(1) symmetry corresponds to the superfluid model described above. In essence, the superfluid flow introduces a preferred direction which breaks the assumption that the model is isotropic. At first sight our equations differ from those used in, for example, Son (2001), Pujol and Davesne (2003) and Zhang (2002), but it is easy to demonstrate that we can reformulate our equations to get those written down for a system with a broken U(1) symmetry. The exercise is of interest since it connects with models that have been used to describe other superfluid systems.

Take as starting point the general two-fluid system. From the discussion in Sect. 9, we know that the momenta are in general related to the fluxes via

$$\begin{aligned} \mu ^{\mathrm {x}}_a = \mathcal{B}^{\mathrm {x}}n_a^{\mathrm {x}}+ \mathcal{A}^{{\mathrm {x}}{\mathrm {y}}} n_a^{\mathrm {y}}. \end{aligned}$$

Suppose that, instead of using the fluxes as our key variables, we consider a “hybrid” formulation based on a mixture of fluxes and momenta. In the case of the particle-entropy system, we may use

$$\begin{aligned} n_a^\mathrm {n}= \frac{1}{\mathcal{B}^\mathrm {n}} \mu _a^\mathrm {n}- \frac{\mathcal{A}^{\mathrm {n}\mathrm {s}}}{\mathcal{B}^\mathrm {n}} n_a^\mathrm {s}. \end{aligned}$$

Let us impose irrotationality on the fluid by representing the momentum as the gradient of a scalar potential \(\varphi \). With \(\mu _a^\mathrm {n}= \nabla _a \varphi \) we get

$$\begin{aligned} n_a^\mathrm {n}= \frac{1}{\mathcal{B}^\mathrm {n}} \nabla _a \varphi - \frac{\mathcal{A}^{\mathrm {n}\mathrm {s}}}{\mathcal{B}^\mathrm {n}} n_a^\mathrm {s}. \end{aligned}$$

Now take the preferred frame to be that associated with the entropy flow, i.e. introduce the unit four velocity \(u^a\) such that \(n_\mathrm {s}^a = n_\mathrm {s}u^a = s u^a\). Then we have

$$\begin{aligned} n_a^\mathrm {n}= n u_a - V^2 \nabla _a \varphi \end{aligned}$$

where we have defined

$$\begin{aligned} n \equiv - \frac{s \mathcal{A}^{\mathrm {n}\mathrm {s}}}{\mathcal{B}^\mathrm {n}} \qquad \text{ and } \qquad V^2 = - \frac{1}{\mathcal{B}^\mathrm {n}} . \end{aligned}$$

With these definitions, the particle conservation law becomes

$$\begin{aligned} \nabla _a n_\mathrm {n}^a = \nabla _a \left( n u^a - V^2 \nabla ^a \varphi \right) = 0 . \end{aligned}$$

Meanwhile, the chemical potential in the entropy frame follows from

$$\begin{aligned} \mu = - u^a \mu ^\mathrm {n}_a = - u^a \nabla _a \varphi . \end{aligned}$$

One can also show that the stress-energy tensor becomes

$$\begin{aligned} T^a{}_b = \varPsi \delta ^a{}_b + (\varPsi + \rho ) u^a u_b - V^2 \nabla ^a \varphi \nabla _b \varphi , \end{aligned}$$

where the generalized pressure is given by \(\varPsi \) as usual, and we have introduced

$$\begin{aligned} \varPsi + \rho = \mathcal{B}^\mathrm {s}s^2 + \mathcal{A}^{\mathrm {s}\mathrm {n}} s n . \end{aligned}$$

The equations of motion can now be obtained from \(\nabla _b T^b{}_a = 0\), keeping in mind that the equation of motion for \(\mathrm {x}=\mathrm {n}\) is automatically satisfied once we impose irrotationality, as before. This essentially completes the set of equations written down by, for example, Son (2001) (see also Gusakov and Andersson 2006; Kantor and Gusakov 2011). The argument in favour of this formulation is that it is close to the microphysics calculations, which means that the parameters may be relatively straightforward to obtain. Against the description is the fact that it is a (not very elegant) hybrid where the inherent symmetry amongst the different constituents is lost, and there is also a risk of confusion since one is treating a momentum as if it were a velocity.

In the case when the superfluid rotates, the two-fluid equations apply as long as the rotation is sufficiently fast that one can meaningfully average over the vortex array. In effect, we assume that we can “ignore” the smaller scales associated with, for example, the vortex cores. This may not be possible in all situations, and even if it is, the “effective” parameters on the averaged scale may depend on the more local physics. For example, averaging may be appropriate to describe rotating superfluid neutron stars, but it is easy to construct laboratory systems where averaging is not appropriate. One may also envisage cosmological settings, e.g., involving dark matter condensates (Harko 2011), where averaging is not possible. In such situations we have to pay more careful attention to the forces acting on the vortices and the ensuing motion.


Vortices and mutual friction

Due to the fundamental quantum nature of superfluid (and for that matter, superconducting) condensates, the neutron component in a neutron star core will be quantized into localized vortices that each carry a single quantum of momentum circulation. For simplicity, we will assume that the vortices are locally arranged in a rectilinear array, directed along a unit vector \(\hat{\kappa }^i\), with surface density \(\mathcal {N} \). At the hydrodynamics level, after averaging and in the Newtonian gravity framework, we then have

$$\begin{aligned} \mathcal {W}^i_\mathrm {n}= \frac{1}{m} \epsilon ^{ijk} \nabla _j p^\mathrm {n}_k = \mathcal {N} \kappa ^i , \end{aligned}$$

where we have used \(\kappa ^i = \kappa \hat{\kappa }^i \) with \(\kappa = h / 2m\) the quantum of circulation (the factor of 2 arises from the underlying Cooper pairing, relevant for superfluid neutrons). It is important to note that the quantized “vorticities” refer to the circulation of the canonical momentum \(p^i_\mathrm {n}\) rather than the circulation of velocity. It is the canonical momentum which is related to the gradient of each condensate’s wavefunction phase \( \varphi \), leading to the Onsager–Feynman quantization condition

$$\begin{aligned} \oint p^i_\mathrm {n}dl_i= (\hbar /2) \oint (\nabla ^i \varphi ) dl_i = h/2 . \end{aligned}$$
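To get a feeling for the numbers involved, the following minimal Python sketch evaluates the quantum of circulation \(\kappa = h/2m\) for neutrons and the corresponding vortex surface density. The function names are hypothetical, the constants are CODATA 2018 values, and the estimate rests on two simplifying assumptions that are not made in the text: the star rotates rigidly with angular frequency \(\varOmega \), and entrainment is ignored so that \(p^i_\mathrm {n} = m v^i_\mathrm {n}\) and the averaged relation above reduces to \(\mathcal {N} \kappa = 2 \varOmega \). The spin period of 33 ms is an illustrative choice, roughly that of the Crab pulsar.

```python
import math

# CODATA 2018 values in SI units
PLANCK_H = 6.62607015e-34         # Planck constant [J s]
NEUTRON_MASS = 1.67492749804e-27  # neutron mass [kg]


def circulation_quantum(mass=NEUTRON_MASS, pairing_factor=2.0):
    """Quantum of circulation kappa = h / (pairing_factor * m).

    pairing_factor = 2 corresponds to Cooper-paired neutrons, as in the text.
    """
    return PLANCK_H / (pairing_factor * mass)


def vortex_surface_density(spin_period, kappa):
    """Vortices per unit area for rigid rotation, from N * kappa = 2 * Omega.

    Assumption: no entrainment (p = m v), so the averaged canonical
    vorticity equals the curl of the velocity of a uniformly rotating fluid.
    """
    omega = 2.0 * math.pi / spin_period
    return 2.0 * omega / kappa


kappa = circulation_quantum()
density = vortex_surface_density(0.033, kappa)  # illustrative 33 ms period
spacing = 1.0 / math.sqrt(density)              # typical inter-vortex distance

print(f"kappa   = {kappa:.3e} m^2/s")
print(f"N       = {density:.3e} m^-2")
print(f"spacing = {spacing:.3e} m")
```

Under these assumptions the inter-vortex spacing is many orders of magnitude smaller than the stellar radius, which is the sense in which averaging over a large number of vortices is reasonable for neutron stars.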

The variational analysis has already provided us with a two-fluid model that allows for vorticity (obviously). However, if we want to understand the role of the vortices it is useful to consider the problem from a more intuitive (albeit less general) point of view. To do this we generalize an approach that was originally developed in the context of two-fluid hydrodynamics for superfluid Helium (Hall and Vinen 1956). This provides a conceptually different derivation of the Euler equations, based on the kinematics of a conserved number of vortices. It also requires the input of the forces that determine the motion of a single isolated vortex. Thus, consistency between the two derivations allows us to identify the total conservative force exerted on a single vortex, without any need to study the detailed mesoscopic vortex-fluid interaction—at least as long as the vortices are locally aligned. This will be useful when we consider the vortex mediated friction later.


The starting point of the derivation is the Onsager–Feynman condition (13.71). We also need to use the fact that the vortex number density is conserved, i.e. \(\mathcal {N} \) obeys a continuity equation of the form

$$\begin{aligned} \partial _t \mathcal {N} + \nabla _j \left( \mathcal {N} v_\mathrm {v}^j\right) =0 , \end{aligned}$$

where \(v_\mathrm {v}^i\) is the collective vortex velocity within a typical fluid element—in a sense, this relation defines this averaged vortex velocity. Taking the time derivative of (13.71) we have

$$\begin{aligned} \partial _t \mathcal {W}^i_\mathrm {n}= -\kappa ^i \nabla _j ( \mathcal {N} v_\mathrm {v}^j ) + \mathcal {N} \partial _t \kappa ^i . \end{aligned}$$

Reshuffling terms and using the identity \(\nabla _i \mathcal {W}^i_\mathrm {n}=0 \) we obtain

$$\begin{aligned} \partial _t \mathcal {W}^i_\mathrm {n}= \nabla _j \left( \mathcal {W}^j_\mathrm {n}v_\mathrm {v}^i \right) - \nabla _j \left( \mathcal {W}^i_\mathrm {n}v_\mathrm {v}^j \right) + \mathcal {N} \left( \partial _t \kappa ^i + v_\mathrm {v}^j \nabla _j \kappa ^i -\kappa ^j \nabla _j v_\mathrm {v}^i \right) . \end{aligned}$$

The motion of a single vortex can be expressed as the Lie-dragging of the vector \(\kappa ^i \) (which designates the local vortex direction) by the \(v_\mathrm {v}^i \) flow, leading to

$$\begin{aligned} \partial _t \kappa ^i + \mathcal{L}_{v_\mathrm {v}} \kappa ^i=0 . \end{aligned}$$

Then (13.75) reduces to

$$\begin{aligned} \partial _t \mathcal {W}^i_\mathrm {n}+ \epsilon ^{ijk} \nabla _j \left( \epsilon _{klm} \mathcal {W}^l_\mathrm {n}v_\mathrm {v}^m\right) =0 , \end{aligned}$$

which states that the canonical vorticity \( \mathcal {W}^i_\mathrm {n}\) is locally conserved and advected by the \(v_\mathrm {v}^i \) flow. Rewriting the result in terms of the momentum, we have

$$\begin{aligned} \partial _t p^i_\mathrm {n}-\epsilon ^{ijk} v^{\mathrm {v}}_j \epsilon _{klm} \nabla ^l p^m_\mathrm {n}= \nabla ^i \varPsi , \end{aligned}$$

where \(\varPsi \) is a (so far unspecified) scalar potential.
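The algebra behind the last two steps relies on the standard contraction of two Levi-Civita symbols. As a sketch of the intermediate step, we have

$$\begin{aligned} \epsilon ^{ijk} \epsilon _{klm} = \delta ^i_l \delta ^j_m - \delta ^i_m \delta ^j_l \quad \Longrightarrow \quad \epsilon ^{ijk} \nabla _j \left( \epsilon _{klm} \mathcal {W}^l_\mathrm {n}v_\mathrm {v}^m \right) = \nabla _j \left( \mathcal {W}^i_\mathrm {n}v_\mathrm {v}^j \right) - \nabla _j \left( \mathcal {W}^j_\mathrm {n}v_\mathrm {v}^i \right) , \end{aligned}$$

which is minus the first two terms on the right-hand side of (13.75). The remaining bracket in (13.75) vanishes once the Lie-dragging condition is imposed. Since \(\mathcal {W}^i_\mathrm {n}\) is proportional to the curl of the momentum, the conservation law is an overall curl, and removing that curl leaves the gradient of an arbitrary scalar.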

Making use of the relative velocity, \( w^i_\mathrm{nv} = v^i_\mathrm {n}- v_\mathrm {v}^i\), we subsequently write (13.78) as

$$\begin{aligned} n_\mathrm {n}\partial _t p^i_\mathrm {n}- \epsilon ^{ijk} n_j^\mathrm {n}\epsilon _{klm} \nabla ^l p^m_\mathrm {n}-n_\mathrm {n}\nabla ^i \varPsi _\mathrm {n}= \mathcal {N} \rho _\mathrm {n}\epsilon ^{ijk} \kappa _j w_k^\mathrm{nv}. \end{aligned}$$

The left-hand-side of this equation coincides with the vortex-free Euler equations of motion (13.33) after a suitable identification of the potential \(\varPsi \). The right-hand side appears only in the presence of vortices. We can trace the origin of this contribution back to the Magnus force exerted on a vortex (per unit length) by the associated fluid given by

$$\begin{aligned} f^i_\mathrm{M} = -\rho _\mathrm {n}\epsilon ^{ijk} \kappa _j w^\mathrm{nv}_k . \end{aligned}$$

Thus, we identify \(-\mathcal {N} f^i_\mathrm{M}\), the right-hand side of (13.79), as the averaged reaction force exerted on a fluid element by the vortex array. In the absence of balancing forces, like dissipative scattering off thermal excitations, the equation of motion for a single vortex leads to \(f^i_\mathrm{M} =0\), implying that the vortices must move along with the \(v_\mathrm {n}^i\) flow. In this case, we retain (13.33) as the appropriate equation of motion.
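For completeness, the step from (13.78) to (13.79) can be sketched as follows, assuming (as the notation suggests) that \(\rho _\mathrm {n}= m n_\mathrm {n}\) and \(n^i_\mathrm {n}= n_\mathrm {n}v^i_\mathrm {n}\). Using \(v_\mathrm {v}^i = v_\mathrm {n}^i - w^i_\mathrm{nv}\) together with \(\epsilon _{klm} \nabla ^l p^m_\mathrm {n}= m \mathcal {N} \kappa _k\), the second term of (13.78) becomes

$$\begin{aligned} \epsilon ^{ijk} v^{\mathrm {v}}_j \epsilon _{klm} \nabla ^l p^m_\mathrm {n}= \epsilon ^{ijk} v^{\mathrm {n}}_j \epsilon _{klm} \nabla ^l p^m_\mathrm {n}+ m \mathcal {N} \epsilon ^{ijk} \kappa _j w_k^\mathrm{nv} . \end{aligned}$$

Multiplying (13.78) by \(n_\mathrm {n}\) and moving the gradient of the potential to the left-hand side then leads to (13.79), with the potential relabelled as \(\varPsi _\mathrm {n}\).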

This situation is, of course, somewhat artificial. In order for the argument to make sense, something must prevent the vortices from moving with the bulk flow. Indeed, in order to describe a real superfluid, either at finite temperatures or co-existing with some other component (as in a neutron star core), we need (at least) two components. The interaction between the vortices and this second component affects the relative vortex flow. This interaction tends to be dissipative. The standard example of this is the so-called mutual friction, which assumes that the Magnus force acting on each vortex is balanced by a resistive force with respect to the second component in the system (e.g., the thermal excitations in Helium, represented by \({\mathrm {x}}=\mathrm {p}\) here). That is, we have (Hall and Vinen 1956; Mendell 1991b; Andersson et al. 2006)

$$\begin{aligned} f^i_\mathrm{M} = -\rho _\mathrm {n}\epsilon ^{ijk} \kappa _j w^\mathrm{nv}_k = - \mathcal {R} w^i_\mathrm{vp} \end{aligned}$$

which leads to (after repeated cross products to isolate the vortex velocity) the force acting on the superfluid neutrons (Footnote 25):

$$\begin{aligned} f^\mathrm {n}_i = \rho _\mathrm {n}\mathcal {N} \kappa \left( \mathcal {B}' \epsilon _{ijk} \hat{\kappa }^j w_{\mathrm {n}\mathrm {p}}^k + \mathcal {B} \epsilon _{ijk}\hat{\kappa }^j \epsilon ^{klm}\hat{\kappa }_l w_m^{\mathrm {n}\mathrm {p}} \right) , \end{aligned}$$


where

$$\begin{aligned} \mathcal {B}' = \mathcal {R} \mathcal {B} = { \mathcal {R}^2 \over 1 + \mathcal {R}^2 } . \end{aligned}$$
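The dependence of the two mutual friction coefficients on the drag can be illustrated with a short Python sketch. It assumes that \(\mathcal {R}\) in the relation above is a dimensionless drag parameter (our reading is that it is the drag coefficient of the force balance scaled by \(\rho _\mathrm {n}\kappa \), which the text does not state explicitly), and the function name is hypothetical.

```python
def mutual_friction_coefficients(drag):
    """Return (B, B') for a dimensionless drag parameter R >= 0.

    Uses B' = R * B = R**2 / (1 + R**2), as quoted in the text,
    which implies B = R / (1 + R**2).
    """
    if drag < 0:
        raise ValueError("drag parameter must be non-negative")
    b = drag / (1.0 + drag**2)
    b_prime = drag * b
    return b, b_prime


# Weak drag, intermediate drag, and strong drag
for drag in (1e-4, 1.0, 1e4):
    b, b_prime = mutual_friction_coefficients(drag)
    print(f"R = {drag:g}: B = {b:.3e}, B' = {b_prime:.3e}")
```

With these expressions, \(\mathcal {B}\) is small in both the weak and the strong drag limits and takes its largest value, 1/2, at \(\mathcal {R} = 1\), while \(\mathcal {B}'\) grows monotonically from 0 towards 1.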

The mutual friction has a decisive impact on superfluid dynamics. In particular, it provides one of the main mechanisms for damping (or even preventing) the CFS instability in rotating superfluid neutron stars (Lindblom and Mendell 1995).

The Kalb–Ramond variation

Moving on to the relativistic description of the quantized vortex problem, we have two options. We could “simply” generalize the steps from the Newtonian case. This is helpful, as it assists the intuition. However, it may be more instructive to take an alternative route. Opting for this strategy—with the view that it will allow us to introduce additional aspects—we now set out to derive the fluid results from a different perspective. The ultimate aim is to arrive at an alternative description of the (suitably averaged) dynamics of a collection of quantized vortices.

The new strategy builds on efforts to relate string dynamics to the forces acting on a superfluid vortex (Lund and Regge 1976; Kalb and Ramond 1974; Davis and Shellard 1988, 1989). We start by recalling that the superfluid velocity (technically, the momentum) can be linked to the gradient of a scalar potential \(\varphi \). We identify this velocity as the dual (Footnote 26)

$$\begin{aligned} \tilde{H}_a = \eta \partial _a \varphi = {1\over 3!} \epsilon _{abcd} H^{bcd} , \end{aligned}$$

where \(\eta \) is a constant, and introduce the so-called Kalb–Ramond field (Kalb and Ramond 1974), such that

$$\begin{aligned} H^{abc} = \partial ^{[a} B^{bc]} . \end{aligned}$$

It is now easy to see that the scalar wave equation

$$\begin{aligned} \Box \varphi = 0 , \end{aligned}$$

is automatically satisfied, as long as

$$\begin{aligned} \nabla _a \left( \nabla ^a B^{bc} + \nabla ^c B^{ab} + \nabla ^b B^{ca} \right) = 0 . \end{aligned}$$

In effect, we can shift the focus from \(\varphi \) to \(B^{ab}\), treating this object as an independent variable. The relevant dynamical equations are then automatically solved by expressing this field in terms of a scalar potential. The two descriptions are complementary, as they have to be (Davis and Shellard 1988). However, as we will soon demonstrate, the Kalb–Ramond representation makes the introduction of topological defects (vortices/strings) intuitive.

First, let us return to the fluid problem but shift the attention from the matter flux to the vorticity. Following Carter (1994, 2000) and Carter and Langlois (1995b), we do this by noting that we can ensure that the conservation law (6.8) is automatically satisfied by introducing a two-form \(B_{ab}\) (the Kalb–Ramond field) such that

$$\begin{aligned} n_{abc} = 3 \nabla _{[a}B_{bc]} . \end{aligned}$$

That is, we have

$$\begin{aligned} n^a = {1\over 2} \epsilon ^{abcd} \nabla _b B_{cd} \end{aligned}$$

and the flux conservation (6.8) follows as an identity—we no longer need to introduce the three-dimensional matter space.
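To see why the conservation law holds identically, note (as a sketch) that for a torsion-free connection the antisymmetrized covariant derivative of a form reduces to the antisymmetrized partial derivative. The three-form \(n_{abc}\) is then exact, and hence closed:

$$\begin{aligned} n_{abc} = 3 \nabla _{[a}B_{bc]} = 3 \partial _{[a}B_{bc]} \quad \Longrightarrow \quad \nabla _{[a} n_{bcd]} = 3 \partial _{[a} \partial _{b} B_{cd]} = 0 , \end{aligned}$$

because partial derivatives commute. Since \(n_{abc}\) is the dual of \(n^a\), the vanishing of \(\nabla _{[a} n_{bcd]}\) is equivalent to \(\nabla _a n^a = 0\).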

Second, in order to find an action that reproduces the perfect fluid results, we elevate the vorticity \(\omega _{ab}\) to an additional variable. A Legendre transformation—designed in such a way that the stress-energy tensor remains unchanged (Carter and Langlois 1995b)—leads to the Lagrangian

$$\begin{aligned} \bar{\varLambda } = \varLambda - {1\over 4} \epsilon ^{abcd} B_{ab} \omega _{cd} = \varLambda - {1\over 2} \tilde{\omega }^{ab} B_{ab} , \end{aligned}$$

where we have used the dual

$$\begin{aligned} \tilde{\omega }^{ab} = {1\over 2} \epsilon ^{abcd} \omega _{cd} . \end{aligned}$$

Assuming that \(\varLambda =\varLambda (n)\) we get (ignoring the perturbed metric for clarity)

$$\begin{aligned} \delta \bar{\varLambda } = -{1\over 3!} \mu ^{abc} \delta n_{abc} - {1\over 2} B_{ab} \delta \tilde{\omega }^{ab} - {1\over 2} \tilde{\omega }^{ab} \delta B_{ab} , \end{aligned}$$

where we note that, cf. Sect. 6,

$$\begin{aligned} {\partial \varLambda \over \partial n_{abc} } = - {1\over 3!} \mu ^{abc} . \end{aligned}$$

However, we now have

$$\begin{aligned} \delta n_{abc} = 3 \nabla _{[a} \delta B_{bc]} , \end{aligned}$$

which means that

$$\begin{aligned} \delta \bar{\varLambda } ={1\over 2} \left( \nabla _{a} \mu ^{abc} - \tilde{\omega }^{bc} \right) \delta B_{bc}- {1\over 2} B_{ab} \delta \tilde{\omega }^{ab} -{1\over 2} \nabla _a \left( \mu ^{abc} \delta B_{bc}\right) . \end{aligned}$$
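The intermediate step is an integration by parts. Inserting the variation of \(n_{abc}\) into the first term of \(\delta \bar{\varLambda }\) and using the total antisymmetry of \(\mu ^{abc}\), we have (spelled out here as a sketch)

$$\begin{aligned} -{1\over 3!} \mu ^{abc} \delta n_{abc} = -{1\over 2} \mu ^{abc} \nabla _{a} \delta B_{bc} = {1\over 2} \left( \nabla _a \mu ^{abc} \right) \delta B_{bc} - {1\over 2} \nabla _a \left( \mu ^{abc} \delta B_{bc} \right) , \end{aligned}$$

which combines with the term proportional to \(\tilde{\omega }^{ab} \delta B_{ab}\) to give the expression above.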

Ignoring the surface term (as usual), we see that a variation with respect to \(B_{ab}\) requires

$$\begin{aligned} \tilde{\omega }^{bc} = \nabla _{a} \mu ^{abc} , \end{aligned}$$

which leads back to (6.29). However, with a free variation we would also have \(B_{ab}=0\). That is, we need to constrain the variation of \(\tilde{\omega }^{ab}\) (or rather \(\omega _{ab}\)). Fortunately, the matter space argument comes to the rescue, providing us with the strategy for doing this. The only difference is that we now make use of a two-dimensional space with coordinates \(\chi ^I\) (here, and in the following \(I,J,\ldots \) represent two-dimensional coordinates). We obtain this two-dimensional space either via a map from the original matter space

$$\begin{aligned} \hat{\psi }^I_A = {\partial \chi ^I \over \partial X^A} , \end{aligned}$$

or directly from spacetime, using

$$\begin{aligned} \bar{\psi }^I_a = {\partial \chi ^I \over \partial x^a}. \end{aligned}$$

The two descriptions are consistent since

$$\begin{aligned} \bar{\psi }^I_a =\hat{\psi }^I_A \psi ^A_a = {\partial \chi ^I \over \partial X^A} {\partial X^A \over \partial x^a} = {\partial \chi ^I \over \partial x^a} . \end{aligned}$$

The different coordinates and the maps are illustrated in Fig. 15.

Fig. 15

An illustration of the matter space maps and the coordinates used in the analysis of vortex dynamics and elasticity

The third step involves introducing the four velocity \(u^a\) associated with the motion of the vortices in spacetime, which may be different from the motion of the “fluid” (in turn related to \(n^a\)). In order for the vorticity to be a purely spatial object—orthogonal to the flow—we must have

$$\begin{aligned} u^a\omega _{ab} = 0 . \end{aligned}$$

In addition, we want it to be “fixed” in the (new) matter space, in the sense that

$$\begin{aligned} \mathcal {L}_u \omega _{ab} = 0 . \end{aligned}$$

Since \(\omega _{ab}\) is anti-symmetric, this leads to

$$\begin{aligned} u^c \nabla _{[a}\omega _{bc]} = 0 , \end{aligned}$$

which will be satisfied if

$$\begin{aligned} \nabla _{[a}\omega _{bc]} = \partial _{[a}\omega _{bc]} = 0 . \end{aligned}$$
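The step from the Lie derivative to the antisymmetrized condition can be sketched as follows. Writing out the Lie derivative in terms of covariant derivatives, and using \(u^a \omega _{ab} = 0\) to move the derivatives from \(u^a\) onto \(\omega _{ab}\), we have

$$\begin{aligned} \mathcal {L}_u \omega _{ab} = u^c \nabla _c \omega _{ab} + \omega _{cb} \nabla _a u^c + \omega _{ac} \nabla _b u^c = u^c \left( \nabla _c \omega _{ab} + \nabla _a \omega _{bc} + \nabla _b \omega _{ca} \right) = 3 u^c \nabla _{[a}\omega _{bc]} , \end{aligned}$$

where the last equality relies on the antisymmetry of \(\omega _{ab}\).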

Adapting the logic that led to the conserved matter flux in Sect. 6, we introduce the matter space tensor \(\omega _{IJ}\) (associated with two-dimensional space orthogonal to the vortex world sheet), such that

$$\begin{aligned} \omega _{ab} = \psi ^A_a \psi ^B_b \omega _{AB} = \bar{\psi }^I_a \bar{\psi }^J_b \omega _{IJ} . \end{aligned}$$

Noting that (13.103) becomes

$$\begin{aligned} \partial _{[a}\omega _{bc]} = \bar{\psi }^I_a \bar{\psi }^J_b \bar{\psi }^K_c \partial _{[I} \omega _{JK]} = 0 , \end{aligned}$$

it follows that the condition holds as long as \(\omega _{IJ}\) only depends on the \(\chi ^I\) coordinates. It should (by now) be a familiar argument.

Next, we introduce Lagrangian perturbations such that

$$\begin{aligned} \varDelta \chi ^I = 0 \longrightarrow \delta \chi ^I = - \mathcal {L}_\xi \chi ^I , \end{aligned}$$

and we have

$$\begin{aligned} \varDelta \omega _{ab} = 0 . \end{aligned}$$

Again leaving out the metric variations, we have

$$\begin{aligned} \delta \tilde{\omega }^{ab} = {1\over 2} \epsilon ^{abcd} \delta \omega _{cd} = - \xi ^c \nabla _c \tilde{\omega }^{ab} - \epsilon ^{abcd} \omega _{ed} \nabla _c \xi ^e , \end{aligned}$$

and, after a little bit of work, the middle term in (13.95) becomes

$$\begin{aligned} - {1\over 2} B_{ab} \delta \tilde{\omega }^{ab} = {3\over 2} \xi ^c \tilde{\omega }^{ab} \nabla _{[c} B_{ab]} + \nabla _c \left( \omega ^{ab} B_{ab} \xi ^c\right) . \end{aligned}$$
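The expression for \(\delta \tilde{\omega }^{ab}\) follows from \(\varDelta \omega _{ab} = 0\), which means that the Eulerian variation is minus the Lie derivative along \(\xi ^a\). As a sketch of the algebra (ignoring metric variations, as in the text), we have

$$\begin{aligned} \delta \omega _{cd} = - \mathcal {L}_\xi \omega _{cd} = - \xi ^e \nabla _e \omega _{cd} - \omega _{ed} \nabla _c \xi ^e - \omega _{ce} \nabla _d \xi ^e . \end{aligned}$$

Contracting with \({1\over 2} \epsilon ^{abcd}\), the last two terms give equal contributions (relabel \(c \leftrightarrow d\) and use the antisymmetry of both \(\epsilon ^{abcd}\) and \(\omega _{ab}\)), which leads to the single term \(- \epsilon ^{abcd} \omega _{ed} \nabla _c \xi ^e\).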

We have noted that (13.96) implies that

$$\begin{aligned} \nabla _a \tilde{\omega }^{ab} = 0 . \end{aligned}$$

Finally, we see that a variation with respect to \(\xi ^a\) leads to

$$\begin{aligned} {3\over 2} \tilde{\omega }^{ab} \nabla _{[c} B_{ab]} = {1\over 4} \epsilon ^{abde} \omega _{de} n_{cab} = n^d \omega _{dc} = 0 , \end{aligned}$$

and we recover the usual fluid equations. This completes the initial argument. The introduction of the Kalb–Ramond field shifts the focus onto the vorticity, which is associated with a two-dimensional subspace (replacing the usual three-dimensional matter space). The key point is that we arrive at fluid equations without explicitly associating the fluid flux \(n^a\) with the four-velocity \(u^a\).

String fluids

In order to form a complete picture, including connections with related problems, and develop the tools we need to make progress, it is useful to take a slight detour in the direction of string theory. The key point is that a one-dimensional string moving through spacetime traces out a two-dimensional world sheet. This world sheet is spanned by two vectors, one timelike (here taken to be the four velocity of the string, \(u^a\)) and one spacelike (intuitively, the tangent vector to the string, represented by \(\hat{\kappa }^a\)). These vectors are associated with two-dimensional coordinates (Footnote 27) such that \(x^a = x^a (\phi ^I)\), leading to the tangent surface element

$$\begin{aligned} S^{ab} = \bar{\epsilon }^{IJ} {\partial x^a \over \partial \phi ^I} {\partial x^b \over \partial \phi ^J} , \end{aligned}$$

with \(\bar{\epsilon }^{IJ}\) the (normalized) two-dimensional Levi-Civita tensor (density), representing the measure tensor for the two-dimensional surface orthogonal to the vortex world sheet.

Associated with this world sheet we have a bivector (read: an anti-symmetric tensor of rank 2), to be denoted \(\varSigma ^{ab}\). This object can be expressed in terms of the linearly independent vectors that span the surface; as the bivector spans a surface, it is natural to think of it as a contravariant object. Noting that a simple timelike bivector can be written as the alternating product of a timelike and a spacelike vector (Stachel 1980) (such that its dual will be a simple spacelike bivector) and assuming the normalisation

$$\begin{aligned} \varSigma _{ab} \varSigma ^{ab} = -2 , \end{aligned}$$

we may use