1 Introduction

Two of the most tantalizing mysteries of modern astrophysics are known as the dark matter and dark energy problems. These problems come from the discrepancies between, on one side, the observations of galactic and extragalactic systems (as well as the observable Universe itself in the case of dark energy) by astronomical means, and on the other side, the predictions of general relativity from the observed amount of matter-energy in these systems. In short, what astronomical observations are telling us is that the dynamics of galactic and extragalactic systems, as well as the expansion of the Universe itself, do not correspond to the observed mass-energy as they should if our understanding of gravity is complete. Thus, this indicates either (i) the presence of unseen (and yet unknown) mass-energy, or (ii) a failure of our theory of gravity, or (iii) both.

The third case is a priori the most plausible, as there are good reasons for there being more particles than those of the standard model of particle physics [257] (actually, even in the case of baryons, we suspect that a lot of them have not yet been seen and, thus, literally make up unseen mass, in the form of “missing baryons”), and as there is a priori no reason that general relativity should be valid over a wide range of scales, where it has never been tested [45], and where the need for a dark sector actually prevents the theory from being tested until this sector has been detected by other means than gravity itself (Footnote 1). However, either of the first two cases could be the dominant explanation of the discrepancies in a given class of astronomical systems (or even in all astronomical systems), and this is actually testable.

For instance, as far as (ii) is concerned, if the mass discrepancies in a class of systems are mostly caused by some subtle change in gravitational physics, then there should be a clear signature of a single, universal force law at work in this whole class of systems. If instead there is a distinct dark matter component in these systems, the kinematics of any given system should then depend on the particular distribution of both dark and luminous mass. This distribution would vary from system to system, depending on their environment and past history of formation, and should, in principle, not result in anything like an apparent universal force law (Footnote 2).

Over the years, there have been a large variety of such attempts to alter the theory of gravity in order to remove the need for dark matter and/or dark energy. In the case of dark energy, there is some wiggle room, but in the case of dark matter, most of these alternative gravity attempts fail very quickly, and for a simple reason: once a force law is specified, it must fit all relevant kinematic data in a given class of systems, with the mass distribution specified by the visible matter only. This is a tall order with essentially zero wiggle room: at most one particular force law can work. However, among all these attempts, there is one survivor: the Modified Newtonian Dynamics (MOND) hypothesized by Milgrom almost 30 years ago [294, 295, 293] seems to come close to satisfying the criterion of a universal force law in a whole class of systems, namely galaxies. This success implies a unique relationship between the distribution of baryons and the gravitational field in galaxies and is extremely hard to understand within the present dominant paradigm of the concordance cosmological model, hypothesizing that general relativity is correct on every relevant scale in cosmology including galactic scales, and that the dark sector in galaxies is made of non-baryonic dissipationless and collisionless particles. Even if such particles are detected directly in the near to far future, the success of MOND on galaxy scales as a phenomenological law, as well as the associated appearance of a universal critical acceleration constant \(a_0 \simeq 10^{-10}\,\mathrm{m\,s^{-2}}\) in various, seemingly unrelated, aspects of galaxy dynamics, will still have to be explained and understood by any successful model of galaxy formation and evolution. Previous reviews of various aspects of MOND, at an observational and theoretical level, can be found in [34, 81, 100, 151, 279, 311, 318, 401, 407, 429]. A website dedicated to this topic is also maintained, with all the relevant literature as well as introductory level articles [263] (see also [238]).

Here, we first review the basics of the dark matter problem (Section 2) as well as the basic ingredients of the present-day concordance model of cosmology (Section 3). We then point out a few outstanding challenges for this model (Section 4), both from the point of view of unobserved predictions of the model, and from the point of view of unpredicted observations (all uncannily involving a common acceleration constant a0). Up to that point, the challenges presented are purely based on observations, and are fully independent of any alternative theoretical framework (Footnote 3). We then show that, surprisingly, many of these puzzling observations can be summarized within one single empirical law, Milgrom’s law (Section 5), which can be most easily (although not necessarily uniquely) interpreted as the effect of a single universal force law resulting from a modification of Newtonian dynamics (MOND) in the weak-acceleration regime a < a0, for which we present the current observational successes and problems (Section 6). We then summarize the various attempts currently made to embed this modification in a generally-covariant relativistic theory of gravity (Section 7) and how such theories allow new predictions on gravitational lensing (Section 8) and cosmology (Section 9). We finally draw conclusions in Section 10.

2 The Missing Mass Problem in a Nutshell

There exists overwhelming evidence for mass discrepancies in the Universe from multiple independent observations. This evidence involves the dynamics of extragalactic systems: the motions of stars and gas in galaxies and clusters of galaxies. Further evidence is provided by gravitational lensing, the temperature of hot, X-ray emitting gas in clusters of galaxies, the large scale structure of the Universe, and the gravitating mass density of the Universe itself (Figure 1). For an exhaustive historical review of the problem, we refer the reader to [394].

Figure 1

Summary of the empirical roots of the missing mass problem (below line) and the generic possibilities for its solution (above line). Illustrated lines of evidence include the approximate flatness of the rotation curves of spiral galaxies, gravitational lensing in a cluster of galaxies, and the growth of large-scale structure from an initially very-nearly-homogeneous early Universe. Other historically-important lines of evidence include the Oort discrepancy, the need to stabilize galactic disks, motions of galaxies within clusters of galaxies and the hydrodynamics of hot, X-ray emitting gas therein, and the apparent excess of gravitating mass density over the mass density of baryons permitted by Big-Bang nucleosynthesis. From these many distinct problems grow several possible solutions. Generically, the observed discrepancies either imply the existence of dark matter, or the necessity to modify dynamical laws. Dark matter could, in principle, be any combination of non-luminous baryons and/or some non-baryonic form of mass, like neutrinos (hot dark matter) or some new particle whose mass makes it dynamically cold or perhaps warm. Alternatively, the observed discrepancies might point to the need to modify the equation of gravity that is employed to infer the existence of dark matter, or perhaps some other fundamental dynamical assumption like the equivalence of inertial mass and gravitational charge. Many specific ideas of each of these types have been considered over the years. Note that none of these ideas are mutually exclusive, and that some form or the other of dark matter could happily cohabit with a modification of the gravitational law, or could even be itself the cause of an effective modification of the gravitational law. Question marks on some tree branches represent the fruit of ideas yet to be had. Perhaps these might also address the dark energy problem, with the most satisfactory result being a theory that would simultaneously explain the acceleration scale in the dark matter problem as well as the accelerating expansion of the Universe, and explain the coincidence of scales between these two problems, a coincidence exhibited in Section 4.1.

The data leave no doubt that when the law of gravity as currently known is applied to extragalactic systems, it fails if only the observed stars and gas are included as sources in the stress-energy tensor. This leads to a stark choice: either the Universe is pervaded by some unseen form of mass — dark matter — or the dynamical laws that lead to this inference require revision. Though the mass discrepancy problem is now well established [394, 465], such a dramatic assertion warrants a brief review of the evidence.

Historically, the first indications of the modern missing mass problem came in the 1930s shortly after galaxies were recognized to be extragalactic in nature. Oort [342] noted that the sum of the observed stars in the vicinity of the sun fell short of explaining the vertical motions of stars in the disk of the Milky Way. The luminous matter did not provide a sufficient restoring force for the observed stellar vertical oscillations. This became known as the Oort discrepancy. Around the same time, Zwicky [518] reported that the velocity dispersion of galaxies in clusters of galaxies was far too high for these objects to remain bound for a substantial fraction of cosmic time. The Oort discrepancy was approximately a factor of two in amplitude, and confined to the Galactic disk — it required local dark matter, not necessarily the quasi-spherical halo we now envision. It was long considered a serious problem, but has now largely (though perhaps not fully) gone away [194, 240]. The discrepancy Zwicky reported was less subtle, as the required dark mass outweighed the visible stars by a factor of at least 100. This result was apparently not taken seriously at the time.

One of the first indications of the need for dark matter in modern times came from the stability of galactic disks. Stars in spiral galaxies like the Milky Way are predominantly on approximately circular orbits, with relatively few on highly eccentric orbits [132]. The small velocity dispersion of stars relative to their circular velocities makes galactic disks dynamically cold. Early simulations [343] revealed that cold, self-gravitating disks were subject to severe instabilities. In order to prevent the rapid, self-destructive growth of these instabilities, and hence preserve the existence of spiral galaxies over a sizable fraction of a Hubble time, it was found to be necessary to embed the disk in a quasi-spherical potential well — a role that could be played by a halo of dark matter, as first proposed in 1973 by Ostriker & Peebles [343].

Perhaps the most persuasive piece of evidence was then provided, notably through the seminal works of Bosma and Rubin, by establishing that the rotation curves of spiral galaxies are approximately flat [67, 370]. A system obeying Newton’s law of gravity should have a rotation curve that, like the Solar system, declines in a Keplerian manner once the bulk of the mass is enclosed: \(V_c \propto r^{-1/2}\). Instead, observations indicated that spiral galaxy rotation curves tended to remain approximately flat with increasing radius: \(V_c \sim\) constant. This was shown to happen over and over and over again [370] with the approximate flatness of the rotation curve persisting to the largest radii observable [67], well beyond where the details of each galaxy’s mass distribution mattered, so that Keplerian behavior should have been observed. Again, a quasi-spherical halo of dark matter as proposed by Ostriker and Peebles was implicated.
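The contrast between a Keplerian decline and a flat rotation curve can be made concrete with a short numerical sketch. The galaxy mass and the two radii below are illustrative assumptions, not values taken from the text; the only physics used is the point-mass circular speed \(V_c = \sqrt{GM/r}\).

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.0857e19    # one kiloparsec, m


def keplerian_speed(enclosed_mass_kg, radius_m):
    """Circular speed around a fully enclosed mass: V_c = sqrt(G M / r)."""
    return math.sqrt(G * enclosed_mass_kg / radius_m)


# Hypothetical galaxy (assumed numbers): 5e10 solar masses inside 10 kpc.
visible_mass = 5e10 * M_SUN
v_inner = keplerian_speed(visible_mass, 10 * KPC)
v_outer = keplerian_speed(visible_mass, 40 * KPC)

# With no extra mass, quadrupling the radius halves the circular speed.
keplerian_ratio = v_outer / v_inner

# Enclosed mass needed at 40 kpc to keep V_c at its 10 kpc value: M = V^2 r / G.
mass_for_flat_curve = v_inner**2 * (40 * KPC) / G
mass_excess = mass_for_flat_curve / visible_mass

print(f"V_c(10 kpc) = {v_inner / 1e3:.0f} km/s, V_c(40 kpc) = {v_outer / 1e3:.0f} km/s")
print(f"enclosed mass needed for a flat curve: {mass_excess:.1f} x visible mass")
```

In this toy setup a flat curve requires the enclosed mass to grow linearly with radius, which is the behavior that a quasi-spherical halo is invoked to supply.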

Other types of galaxies exhibit mass discrepancies as well. Perhaps most notable are the dwarf spheroidal galaxies that are satellites of the Milky Way [427, 477] and of Andromeda [217]. These satellites are tiny by galaxy standards, possessing only millions, or in the case of the ultrafaint dwarfs, thousands, of individual stars. They are close enough that the line-of-sight velocities of individual stars can be measured, providing for a precise measurement of the system’s velocity dispersion. The mass inferred from these motions (roughly, \(M \sim r\sigma^2/G\)) greatly exceeds the mass visible in luminous stars. Indeed, these dim satellite galaxies exhibit some of the largest mass discrepancies observed. In contrast, bright giant elliptical galaxies (often composed of much more than the \(\sim 10^{11}\) stars of the Milky Way) exhibit remarkably modest and hard to detect mass discrepancies [367]. Thus, it is inferred that fainter galaxies are progressively more dark-matter dominated than bright ones. However, as we shall expand on in Section 4.3, the primary correlation is not with luminosity, but with surface brightness: the lower the surface brightness of a system, the larger its mass discrepancy [279].
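As a rough illustration of how a velocity dispersion translates into a mass, the sketch below evaluates the order-of-magnitude estimator \(M \sim r\sigma^2/G\) for a hypothetical dwarf spheroidal. The radius, dispersion and stellar mass are assumed round numbers chosen for illustration only, and the order-unity prefactor of a real mass estimator is deliberately dropped.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
PC = 3.0857e16     # one parsec, m


def dynamical_mass(radius_m, sigma_m_s):
    """Order-of-magnitude dynamical mass, M ~ r * sigma^2 / G."""
    return radius_m * sigma_m_s**2 / G


# Hypothetical dwarf spheroidal (all three numbers are assumptions):
radius = 300 * PC             # characteristic radius
sigma = 10e3                  # line-of-sight velocity dispersion, m/s
stellar_mass = 3e5 * M_SUN    # mass in luminous stars

m_dyn_msun = dynamical_mass(radius, sigma) / M_SUN
discrepancy = m_dyn_msun * M_SUN / stellar_mass

print(f"dynamical mass ~ {m_dyn_msun:.1e} M_sun")
print(f"mass discrepancy ~ {discrepancy:.0f}")
```

Because the estimate scales as \(\sigma^2\), even a modest dispersion of order 10 km/s in a system with so few stars implies a dynamical mass far above the luminous one.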

On larger scales, groups and clusters of galaxies also show mass discrepancies, just as individual galaxies do. One of the earliest lines of evidence comes from the “timing argument” in the Local Group [213]. Presumably the material that was to become the Milky Way and Andromeda (M31) was initially expanding apart with the general Hubble expansion. Currently they are approaching one another at \(\sim 100\,\mathrm{km\,s^{-1}}\). In order for the Milky Way and M31 to have overcome the initial expansion and fallen back towards one another, there must be a greater-than-average gravitating mass between the two. To arrive at their present separation with the observed blueshifted line-of-sight velocity after a Hubble time requires a dynamical mass-to-light ratio M/L > 80. This greatly exceeds the mass-to-light ratio of the stars themselves, which is of order unity in Solar units [42] (the Sun is a fairly average star, so averaged over many stars each Solar mass produces roughly one Solar luminosity).

Rich clusters of galaxies are rare structures containing dozens or even hundreds of bright galaxies. These objects exhibit mass discrepancies in several distinct ways. Measurements of the redshifts of individual cluster members give velocity dispersions in the vicinity of \(1{,}000\,\mathrm{km\,s^{-1}}\), typically implying dynamical mass-to-light ratios in excess of 100 [24]. The actual mass discrepancy is not this large, as most of the detected baryonic mass in clusters is in a diffuse intracluster gas rather than in the stars in the galaxies (something Zwicky was not aware of back in 1933). This gas is heated to the virial temperature and emits X-rays. Mapping the temperature and emission of this X-ray gas provides another probe of the cluster mass through the equation of hydrostatic equilibrium. In order to hold the gas in the clusters at the observed temperatures, the dark matter must outweigh the gas by a factor of ∼ 8 [175]. Furthermore, some clusters are observed to gravitationally lens background galaxies (Figure 1). Once again, mass above and beyond that observed is required to explain this phenomenon [227]. Thus, three independent methods all imply the need for about the same amount of dark matter in clusters of galaxies.

In addition to the abundant evidence for mass discrepancies in the dynamics of extragalactic systems, there are also strong motivations for dark matter in cosmology. Two observations are particularly important: (i) the small baryonic mass density Ωb inferred from Big-Bang nucleosynthesis (BBN) (and from the measured Hubble parameter), and (ii) the growth of large scale structure by a factor of \(\sim 10^5\) from the surface of last scattering of the cosmic microwave background at redshift z ∼ 1000 until present-day z = 0, implying Ωm > Ωb. Together, these observations imply not only the need for dark matter, but for some exotic new form of non-baryonic cold dark matter. Indeed, observational estimates of the gravitating mass density of the Universe Ωm, measured, for instance, from peculiar galaxy (or large-scale) velocity fields, have, for several decades, persistently returned values in the range 1/4 < Ωm < 1/3 [116]. While shy of the value needed for a flat Universe, this mass density is well in excess of the baryon density inferred from BBN. The observed abundances of the light isotopes deuterium, helium, and lithium are consistent with having been produced in the first few minutes after the Big Bang if the baryon density is just a few percent of the critical value: Ωb < 0.05 [480, 107]. Thus, Ωm > Ωb. Consequently, we do not just need dark matter, we need the dark matter to be non-baryonic.

Another early Universe constraint is provided by the Cosmic Microwave Background (CMB). The small (microKelvin) amplitude of the temperature fluctuations at the time of baryon-photon decoupling (z ∼ 1000) indicates that the Universe was initially very homogeneous, roughly to one part in \(10^5\). The Universe today (z = 0) is very inhomogeneous, at least on “small” scales of less than ∼ 100 Mpc (\(\sim 3 \times 10^8\) ly), with huge density contrasts between planets, stars, galaxies, clusters, and empty intergalactic space. The only attractive long-range force acting on the entire Universe that can make such structures is gravity. In a process where the rich get richer while the poor get poorer, the small initial over-densities attract more mass and grow into structures like galaxies while under-dense regions become less dense, leading to voids. The catch is that gravity is rather weak, so this process takes a long time. If the baryon density from BBN is all we have to work with, we can only obtain a growth factor of \(\sim 10^2\) in a Hubble time [424], orders of magnitude short of the observed \(10^5\). The solution is to boost the growth rate with extra invisible mass displaying larger density fluctuations: dark matter. In order not to make the same mark on the CMB that baryons would, this dark matter must not interact with the photons. So, in effect, the density fluctuations in the dark matter can already be very large at the epoch of baryon-photon decoupling, especially if the dark matter is cold (i.e., with effectively zero Jeans length). The baryons fall into the already deep dark matter potential wells only after that, once released from their electromagnetic link to the photon bath. Before decoupling, the fluctuations in the baryon-photon fluid did not grow but were oscillating in the form of acoustic waves, maintaining the same amplitude as when they entered the horizon; actually they were even slightly diffusion-damped. In principle, at baryon-photon decoupling, CMB fluctuations on smaller angular scales, having entered the horizon earlier, would have been damped with respect to those on larger scales (Silk damping). Nevertheless, the presence of decoupled non-baryonic dark matter would provide a net forcing term countering the damping of the oscillations at recombination, meaning that the second and third acoustic peaks of the CMB could then be of equal amplitude rather than exhibiting a damping tail. The actual observation of a high third peak in the CMB angular power spectrum is another piece of compelling evidence for non-baryonic dark matter (see, e.g., [229]). Both BBN and the CMB thus drive us to consider a form of mass that is non-baryonic and which does not interact electromagnetically. Moreover, in order to form structure (see Section 3.2), the mass must be dynamically cold (i.e., moving much slower than the speed of light when it decouples from the photon bath), and is known as cold dark matter (CDM).

Now, in addition to CDM, modern cosmology also requires something even more mysterious, dubbed dark energy. The fact that the baryon fraction in clusters of galaxies was such that Ωm was implied to be much smaller than 1 (the value needed for a flat Euclidean Universe favored by inflationary models), as well as tensions between the measured Hubble parameter and independent estimates of the age of the Universe, led Ostriker & Steinhardt [344] to propose in 1995 a “concordance model of cosmology” or ΛCDM model, where a cosmological constant Λ, supposed to represent vacuum energy or dark energy, provided the major contribution to the Universe’s energy density. Three years later, the observations of Type Ia supernovae (SNIa) [351, 365], indicating late-time acceleration of the Universe’s expansion, led most people to accept this model. This concordance model has since been refined and calibrated through subsequent large-scale observations of the CMB and of the matter power spectrum, to lead to the favored cosmological model prevailing today (see Section 3). However, as we shall see, curious coincidences of scales between the dark matter and dark energy sectors (see Section 4.1) have prompted the question of whether these two sectors are really physically independent, and the existence of dark energy itself has led to a renewed interest in modified gravity theories as a possible alternative to this exotic fluid [100].

3 A Brief Overview of the ΛCDM Cosmological Model

General relativity provides a clear and compelling cosmology, the Friedmann-Lemaître-Robertson-Walker (FLRW) model. The expansion of the Universe discovered by Hubble and Slipher found a natural explanation (Footnote 4) in this context. The picture of a hot Big-Bang cosmology that emerged from this model famously predicted the existence of the 3 degree CMB and the abundances of the light isotopes via BBN.

Within the FLRW framework, we are inexorably driven to infer the existence of both non-baryonic cold dark matter and a non-zero cosmological constant as discussed in Section 2. The resulting concordance ΛCDM model (first proposed in 1995 by Ostriker and Steinhardt [344]) is encouraged by a wealth of observations: the consistency of the Hubble parameter with the ages of the oldest stars [344], the consistency between the dynamical mass density of the Universe, that of baryons from BBN (see also discussion in Section 9.2), and the baryon fraction of clusters [486], as well as the power spectrum of density perturbations [103, 452]. A prediction of the concordance model is that the expansion rate of the Universe should be accelerating; this was confirmed by observations of high redshift Type Ia supernovae [351, 365]. Another successful prediction was the scale of the baryonic acoustic oscillation [134]. Perhaps the most emphatic support for ΛCDM comes from fits to the acoustic power spectrum of temperature fluctuations in the CMB [229].

For a brief review of the basics and successes of the concordance cosmological model we refer the reader to, e.g., [87, 349] and all references therein. We note that, while most of the cosmological probes in the above list are not uniquely fit by the ΛCDM model on their own, when they are taken together they provide a remarkably tight set of constraints. The success of this now favoured cosmological model on large scales is, thus, remarkable indeed, as there was a priori no reason that such a parameterized cosmology could explain all these completely independent data sets with such outstanding consistency.

In this model, the Hubble constant is \(H_0 = 70\,\mathrm{km\,s^{-1}\,Mpc^{-1}}\) (i.e., h = 0.7), the amplitude of density fluctuations within a top-hat sphere of \(8h^{-1}\) Mpc is \(\sigma_8 = 0.8\), the optical depth to reionization is τ = 0.08, and the spectral index measuring how fluctuations change with scale is \(n_s = 0.97\). The price we pay for the outstanding success of the model is new physics in the form of a dark sector. This dark sector makes up 95% of the mass-energy content of the Universe in ΛCDM: it is composed separately of a dark energy sector and a cold dark matter sector, which we briefly describe below.

3.1 Dark Energy (Λ)

In ΛCDM, dark energy is a non-vanishing vacuum energy represented by the cosmological constant Λ in the field equations of general relativity. Einstein’s cosmological constant is equivalent to vacuum energy with equation of state p/ρ = w = −1. In principle, the equation of state could be merely close to, but not exactly, w = −1. In this case, the dark energy could evolve and clump, depending on the value of w and its evolution. However, to date, there is no compelling observational reason to require any form of dark energy more complex than the simple cosmological constant introduced by Einstein.

The various observational datasets discussed above constrain the ratio of the dark energy density to the critical density to be \({\Omega _\Lambda} = \Lambda/3H_0^2 = 0.73\), where \(H_0\) is Hubble’s constant and Λ is expressed in \(\mathrm{s^{-2}}\). This value, together with the matter density Ωm (see below), leads to a total \(\Omega = \Omega_\Lambda + \Omega_m = 1\), i.e., a spatially-flat Euclidean geometry in the Robertson-Walker sense that is nicely consistent with the expectations of inflation. It is important to stress that this model relies on the cosmological principle, i.e., that our observational location in the Universe is not special, and on the fact that on large scales, the Universe is isotropic and homogeneous. For possible challenges to these assumptions and their consequences, we refer the reader to, e.g., [83, 487, 488].
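For orientation, the relation \(\Omega_\Lambda = \Lambda/3H_0^2\) can be inverted numerically to give Λ in \(\mathrm{s^{-2}}\). The sketch below uses the values \(H_0 = 70\,\mathrm{km\,s^{-1}\,Mpc^{-1}}\) and \(\Omega_\Lambda = 0.73\) quoted in this section; the critical density is computed from its standard definition \(\rho_{\rm crit} = 3H_0^2/(8\pi G)\), which is not written out in the text and is included here only as a point of reference.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
MPC = 3.0857e22   # one megaparsec, m

# Concordance values quoted in this section.
H0 = 70e3 / MPC        # Hubble constant converted to s^-1
OMEGA_LAMBDA = 0.73

# Invert Omega_Lambda = Lambda / (3 H0^2) to obtain Lambda in s^-2.
lambda_s2 = 3 * H0**2 * OMEGA_LAMBDA

# Standard critical density (definition assumed, not given in the text).
rho_crit = 3 * H0**2 / (8 * math.pi * G)

print(f"H0       = {H0:.3e} s^-1")
print(f"Lambda   = {lambda_s2:.3e} s^-2")
print(f"rho_crit = {rho_crit:.3e} kg m^-3")
```

Both numbers scale as \(H_0^2\), so adopting a different Hubble constant rescales them by the corresponding factor.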

3.2 Cold Dark Matter (CDM)

In ΛCDM, dark matter is assumed to be made of non-baryonic dissipationless massive particles [48], the “cold dark matter” (CDM). This dark matter outweighs the baryons that participate in BBN by about 5:1. The density of baryons from the CMB is Ωb = 0.046, grossly consistent with BBN [229]. This is a small fraction of the critical density; with the non-baryonic dark matter the total matter density is Ωm = Ωcdm + Ωb = 0.27.
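The density budget quoted here and in Section 3.1 can be checked with a few lines of arithmetic. This is only a bookkeeping sketch using the rounded parameter values given in the text; the variable names are illustrative.

```python
# Rounded concordance values quoted in the text.
OMEGA_B = 0.046        # baryon density
OMEGA_M = 0.27         # total matter density
OMEGA_LAMBDA = 0.73    # dark energy density

# Cold dark matter is what remains of the matter budget after baryons.
omega_cdm = OMEGA_M - OMEGA_B

# Ratio behind the "about 5:1" statement.
cdm_to_baryon = omega_cdm / OMEGA_B

# Flatness check and the share of the budget held by the dark sector.
omega_total = OMEGA_M + OMEGA_LAMBDA
dark_fraction = (omega_cdm + OMEGA_LAMBDA) / omega_total

print(f"Omega_cdm       = {omega_cdm:.3f}")
print(f"CDM : baryons   = {cdm_to_baryon:.2f} : 1")
print(f"Omega_total     = {omega_total:.2f}")
print(f"dark sector     = {100 * dark_fraction:.1f}% of the total")
```

With these inputs the dark sector accounts for roughly 95% of the total, in line with the figure stated for ΛCDM above.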

The “cold” in cold dark matter means that CDM moves slowly so that it is non-relativistic when it decouples from photons. This allows it to condense and begin to form structure, while the baryons are still electromagnetically coupled to the photon fluid. After recombination, when protons and electrons first combine to form neutral atoms so that the cross-section for interaction with the photon bath suddenly drops, the baryons can fall into the potential wells already established by the dark matter, leading to a hierarchical scenario of structure formation with the repeated merger of smaller CDM clumps to form ever larger clumps.

Particle candidates for the CDM must be massive, non-baryonic, and immune to electromagnetic interactions. The currently preferred CDM candidates are Weakly Interacting Massive Particles (WIMPs, [46, 47, 48]) that condensed from the thermal bath of the early Universe. These should have masses on the order of about 100 GeV so that (i) the free-streaming length is small enough to create small-scale structures as observed (e.g., dwarf galaxies), and (ii) that thermal relics with cross-sections typical for weak nuclear reactions account for the right amount of matter density Ωm (see, e.g., Eq. 28 of [48]). This last point is known as the WIMP miracle (Footnote 5).

For lighter particle candidates (e.g., ordinary neutrinos or light sterile neutrinos), the damping scale becomes too large. For instance, a hot dark matter (HDM) particle candidate with mass of a few to 15 eV would have a free-streaming length of about 100 Mpc, leading to too little power at the small-scale end of the matter power spectrum. The existence of galaxies at redshift z ∼ 6 implies that the coherence length should have been smaller than 100 kpc or so, meaning that even warm dark matter (WDM) particles with masses between 1 and 10 keV are close to being ruled out as well (see, e.g., [348]). Thus, ΛCDM presently remains the state-of-the-art in cosmology, although some of the challenges listed in Section 4 are leading to a slow drift of the standard concordance model from CDM to WDM [252]. This drift brings along its own problems, however, and fails to address most of the current observational challenges summarized in Section 4, which might perhaps point to a more radical alternative to the model.

4 Some Challenges for the ΛCDM Model

The great concordance of independent cosmological observables from Gpc to Mpc scales lends a certain air of inevitability to the ΛCDM model. If we accept these observables as sufficient to prove the model, then any discrepancy appears as trivia that will inevitably be explained away. If instead we require a higher standard, such as positive laboratory evidence for the dark sectors, then ΛCDM appears as a yet unproven hypothesis that relies heavily on two potentially fictitious invisible entities. Thus, an important test of ΛCDM as a scientific hypothesis is the existence of dark matter. By this we mean not just unseen mass, but specifically CDM: some novel form of particle with the right microscopic properties and correct cosmic mass density. Searches for WIMPs are now rather mature and not particularly encouraging. Direct detection experiments have as yet no positive detections, and have now excluded [19] the bulk of the parameter space (interaction cross-section and particle mass) where WIMPs were expected to reside. Indirect detection searches, through the observation of γ-rays produced by the self-annihilation (Footnote 6) of WIMPs in the galactic halo and in nearby satellite galaxies, have similarly returned null results [6, 84, 172] at interestingly restrictive levels. For the most-plausible minimally-supersymmetric models, particle colliders should already have produced evidence for WIMPs [2, 1, 23]. The right model need not be minimal. It is always possible to construct a more complicated model that manages to evade all experimental constraints. Indeed, it is readily possible to imagine dark matter candidates that do not interact at all with the rest of the Universe except through gravity. Though logically possible, such dark matter candidates are profoundly unsatisfactory in that they could not be detected in the laboratory: their hypothesized existence could neither be confirmed nor falsified.

Apart from this current non-detection of CDM candidates, there also exist prominent observational challenges for the ΛCDM model, which might point towards the necessity of an alternative model (or, at the very least, an improved one). These challenges are that (i) some of the parameters of the model appear fine-tuned (Section 4.1); (ii) as far as galaxy formation and evolution are concerned (mainly processes happening on kpc scales, so that the predictions are more difficult to make because the baryon physics should play a more prominent role), many predictions that have been made were not successful (Section 4.2); and (iii) what is more, a number of observations on these galactic scales do exhibit regularities that are fully unexpected in any CDM context without a substantial amount of fine-tuning in terms of baryon feedback (Section 4.3).

4.1 Coincidences

What is generally considered the biggest problem for the ΛCDM model is that it requires a large and still unexplained fine-tuning to reduce by 120 orders of magnitude the theoretical expectation of the vacuum energy to yield the observed cosmological-constant value, and, even more importantly, that it faces a coincidence problem to explain why the dark energy density ΩΛ is precisely of the same order of magnitude as the other cosmological components today (Footnote 7). This uncanny coincidence is generally seen as evidence for some yet-to-be-discovered underlying cosmological mechanism ruling the evolution of dark energy (such as quintessence or generalized additional fluid components, see, e.g., [106]). But it could also indicate that the effect attributed to dark energy is rather due to a breakdown of general relativity (GR) on the largest scales [158].

Then, as we shall see in more detail in Section 4.3, another coincidence, which is central to this whole review, is the appearance of a characteristic scale, dubbed a0, in the behavior of the dark matter sector, a scale with units of acceleration. This acceleration scale appears in various seemingly unrelated galactic scaling relations, mostly unpredicted by the ΛCDM model (see Section 4.3). The value of this scale is \(a_0 \simeq 10^{-10}\,\mathrm{m\,s^{-2}}\), which yields in natural units (Footnote 8) \(a_0 \sim H_0\) (or, more precisely, \(a_0 \approx cH_0/2\pi\)). It is perhaps even more meaningful [51, 298, 304] to note that, in these same units:

$$a_0^2 \sim \Lambda ,$$

where Λ is the currently-favored value of the cosmological constantFootnote 9. Whether these numerical coincidences are physically relevant or just true (insignificant) coincidences remains an open question, closely related to the nature of the dark sector, which we are going to elaborate on in Sections 5 to 10. But, at this stage, it is in any case striking that the dark matter and dark energy sectors do have such a common scale. This coincidence of scales, together with the coincidence of energy densities at redshift zero, might perhaps be a strong indication that one should cease to consider dark energy as an additional component physically independent from the dark matter sector [7], and/or cease to consider that GR correctly describes gravity on the largest scales and in extremely weak gravitational fields, in order to perhaps address the two above coincidence problems at the same time.
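As a rough numerical check of these coincidences, the following sketch evaluates cH0/2π and, by analogy, c²Λ^(1/2)/2π. The inputs H0 = 70 km/s/Mpc and Λ = 1.1 × 10⁻⁵² m⁻² are illustrative assumptions, not values quoted in this review, and the 2π normalization in the second expression is likewise an assumption made only for comparison. Only the orders of magnitude are meaningful.

```python
import math

C = 2.99792458e8          # speed of light [m/s]
MPC_IN_M = 3.0857e22      # one megaparsec [m]

# Assumed illustrative inputs (not taken from the text).
H0 = 70.0e3 / MPC_IN_M    # Hubble constant of 70 km/s/Mpc, in [1/s]
LAMBDA = 1.1e-52          # cosmological constant [1/m^2]

# Acceleration built from the Hubble constant: c * H0 / (2 pi).
a_from_h0 = C * H0 / (2.0 * math.pi)

# Acceleration built from Lambda: c^2 * sqrt(Lambda) / (2 pi).
# The 2 pi factor is an assumed normalization, chosen to mirror the line above.
a_from_lambda = C**2 * math.sqrt(LAMBDA) / (2.0 * math.pi)

print(f"c H0 / 2pi           = {a_from_h0:.2e} m/s^2")
print(f"c^2 sqrt(Lambda)/2pi = {a_from_lambda:.2e} m/s^2")
```

Both numbers come out at about 10⁻¹⁰ m s⁻², the same order as a0, which is all the coincidence claims.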

Finally, let us note that the existence of the a0-scale is actually not the only dark-matter-related coincidence, as there is also, in principle, absolutely no reason why the mechanism leading to the baryon asymmetry (between baryonic matter and antimatter) would simultaneously leave both the baryon and dark matter densities with a similar order of magnitude (Ωdm/Ωb ≈ 5). If the effects we attribute to dark matter are actually also due to a breakdown of GR on cosmological scales, then such a coincidence might perhaps appear more natural as the baryons would then be the actual source of the effect attributed to the dark matter sector.

4.2 Unobserved predictions

Apart from the above puzzling coincidences, the concordance ΛCDM model also has a few more concrete empirical challenges to address, in the sense of having made a few predictions in contradiction with observations (with the caveat in mind that the model itself is not always that predictive on small scales). These include the following non-exhaustive list:

  1.

    The bulk flow challenge. Peculiar velocities of galaxy clusters are predicted to be on the order of 200 km/s in the ΛCDM model. These can actually be measured by studying the fluctuations in the CMB generated by the scattering of the CMB photons by the hot X-ray-emitting gas inside clusters (the kinematic SZ effect). This yields an observed coherent bulk flow of order 1000 km/s (5 times more than predicted) on scales out to at least 400 Mpc [221]. This bulk flow challenge appears not only in SZ studies but also in galaxy studies [483]. A related problem is the collision velocity larger than 3100 km/s for the merging bullet cluster 1E0657-56 at z = 0.3, much too high to be accounted for by ΛCDM [249, 455]. These observations would seem to indicate that the attractive force between DM particles is enhanced compared to what ΛCDM predicts, and changing CDM into WDM would not solve the problem.

  2.

    The high-z clusters challenge. Observation of even a single massive cluster at high redshift can falsify ΛCDM [331]. In this respect the existence of the galaxy cluster XMMU J2235.3-2557 [368] with a mass of ∼ \(4 \times 10^{14}\,M_\odot\) at z = 1.4, even though not sufficient to rule out the model, is very surprising and could indicate that structure formation is actually taking place earlier and faster than in ΛCDM (see also [420] on the Shapley supercluster and the Sloan Great Wall).

  3.

    The Local Void challenge. The Local Volume is composed of 562 known galaxies at distances smaller than 8 Mpc from the center of the Local Group, and the region known as the “Local Void” hosts only 3 of them. This is much less than the expected ∼ 20 for a typical similar void in ΛCDM [350]. What is more, in the Local Volume, large luminous galaxies are over-represented by a factor of 6 in the underdense regions, exactly opposite to what is expected from ΛCDM. This could mean that the Local Volume is just a statistical anomaly, but it could also point, in line with the two previous challenges, towards more rapid structure formation, allowing sparse regions to more quickly form large galaxies cleaning their environment, making the galaxies larger and the voids emptier at early times [350].

  4.

    The missing satellites challenge. It has long been known that the model predicts an overabundance of dark subhalos orbiting Milky-Way-sized galaxies compared to the observed number of satellite galaxies around the Milky Way [329]. This is a different problem from the above-predicted overabundance of small galaxies in voids. It has subsequently been suggested that stellar feedback and heating processes limit baryonic growth, that re-ionisation prevents low-mass dark halos from forming stars, and that tidal forces from the host halo limit growth of the dark-matter sub-halos and lead to their truncation. This important theoretical effort has led recent semi-analytic models to predict a reduced number of ∼ 100 to 600 faint satellites rather than the original thousands. Moreover, during the past 15 years 13 “new” and mostly ultra-faint satellite galaxies have been found in addition to the 11 previously-known classical bright ones. Since these new galaxies have been largely discovered with the Sloan Digital Sky Survey (SDSS), and since this survey covered only one fifth of the sky, it has been argued that the problem was solved. However, there are actually still missing satellites at both the low-mass and high-mass ends of the mass function predicted by “ΛCDM+re-ionisation” semi-analytic models. This is best illustrated in Figure 2 of [239], which shows the cumulative distribution of the predicted and observationally-derived masses within the central 300 pc of Milky Way satellites. A lot of low-mass satellites are still missing, and the most massive predicted subhaloes are also incompatible with hosting any of the known Milky Way satellites [73, 75, 74]. This is the modern version of the missing satellites challenge. An obvious but rather discomforting way out would be to simply state that the Milky Way must be a statistical outlier, but this is contradicted by the study of [447] on the abundance of bright satellites around Milky Way-like galaxies in SDSS. Another solution would be to change from CDM to WDM [252] (it is actually one of the few listed challenges that such a change would probably immediately solve).

  5.

    The satellites phase-space correlation challenge. In addition to the above challenge, the distribution of dark subhalos around the Galaxy is also predicted by ΛCDM to be isotropic, or quasi-isotropic. However, the Milky Way satellites are currently observed to be correlated in phase-space: they lie within a seemingly rotation-supported disk [239]. Young halo globular clusters define the same disk, and streams of stars and gas, tracing the orbits of the objects from which they are stripped, preferentially lie in this disk, too [347]. Since SDSS covered only one fifth of the sky, it will be interesting to see whether future surveys such as Pan-STARRS will confirm this state of affairs. Whether or not this phase-space correlation would be unique to the Milky Way should also be carefully checked, the evidence in M31 being currently much less convincing, with a richer and more complex satellite population [289]. But in any case, the current distribution of satellites around the Milky Way is statistically incompatible with the predictions of ΛCDM at a very high level of confidence, even when taking into account the observational bias from SDSS [239]. While this might perhaps have been explained by the infall of a small group of galaxies that would have retained correlated orbits, this solution is ruled out by the fact that no nearby groups are observed to be anywhere near as spatially small as the disk of satellites [290]. Another solution might be that most Milky Way satellites are actually not primordial galaxies but old tidal dwarf galaxies created in an early major merger event, accounting for their presently-correlated phase-space distribution [346]. Note in passing that if only one or two long-lived tidal dwarfs are created in each gas-dissipational galaxy encounter, they could probably account for most of the dwarf galaxy population in the Universe, leaving no room for small CDM subhalos to create galaxies, which would transform the missing satellites challenge into a missing satellites catastrophe [239].

  6.

    The cusp-core challenge. Another long-standing problem of ΛCDM is the fact that the simulations of the collapse of CDM halos lead to a density distribution as a function of radius, ρ(r), which is well fitted by a smooth function asymptoting to a central cusp with slope d ln ρ/d ln r = −1 in the central parts [126, 332], while observations clearly point towards large constant density cores in the central parts [118, 169, 479]. Even though the latest simulations [333] rather point towards Einasto [133] profiles with \(d\ln \rho /d\ln r \propto - r^{1/n}\) (with n slightly varying with halo mass, and n ∼ 6 for a Milky Way-sized halo, meaning that the slope is zero only very close to the nucleus [177], and is still ∼ −1 at 200 pc from the center), fitting such profiles to observed galactic kinematical data such as rotation curves [88] leads to values of n that are much smaller than simulated values (meaning that they have much larger cores), which is another way of restating the old cusp problem of ΛCDM. Note that a change from CDM to WDM could solve the problem in dwarf galaxies, by leading to the formation of small cores, but certainly not in large galaxies where large cores are needed from observations. Thus, one has to rely on baryon feedback to erase the cusp from all galaxies. But this is not easily done, as the adiabatic cooling of baryons in the center of dark matter halos should lead to an even more concentrated dark matter distribution. A possibility would be that angular momentum transfer from a rotating stellar bar destroys dark-matter cusps: however, significant cusp destruction requires substantially more angular momentum than is realistically available in stellar bars [89, 286]. Note also that not all galaxies are barred (e.g., M33 is not). The state-of-the-art solution nowadays is to enforce strong supernovae outflows that move large amounts of low-angular-momentum gas from the central parts and that “pull” on the central dark matter concentration to create a core [176], but this is still a highly fine-tuned process, which fails to address the baryon fraction problem (see challenge 10 below).

  7.

    The angular momentum challenge. As a consequence of the merger history of galaxy disks in a hierarchical formation scenario, as well as of the associated transfer of angular momentum from the baryonic disk to the dark halo, the specific angular momentum of the baryons ends up being much too small in simulated disks, which in turn end up much smaller than the observed ones [4]. Similarly, elliptical systems end up too concentrated as well. Addressing this challenge within the standard paradigm essentially relies on forming disks through late-time quiescent gas accretion from large-scale filaments, with far fewer late-time mergers than presently predicted in ΛCDM.

  8.

    The pure disk challenge. Related to the previous challenge, large bulgeless thin disk galaxies are extremely difficult to produce in simulations. This is because major mergers, at any time in the galaxy formation process, typically create bulges, so bulgeless galaxies would represent the quiescent tail of a distribution of merger histories for galaxies of the Local Volume. However, these bulgeless disk galaxies represent more than half of large galaxies (with Vc > 150 km/s) in the Local Volume [178, 231]. Solving this problem would rely, e.g., on suppressing central spheroid formation for mergers with mass ratios lower than 30% [228].

  9.

    The stability challenge. Round CDM halos tend to stabilize very low surface density disks against the formation of bars and spirals, due to a lack of disk self-gravity [291]. The observation [282] of Low Surface Brightness (LSB) disk galaxies with strong bars and spirals is thus challenging in the absence of a significant disk component of dark matter. What is more, in the absence of such a disk DM component, the lack of disk self-gravity prevents the creation of very-large razor-thin LSB disks, but these are observed [222, 260]. In the standard context, these observations would tend to point towards an additional disk DM component, either a CDM-one linked to in-plane accretion of satellites or a baryonic one in the form of molecular gas.

  10.

    The missing baryons challenge(s). As mentioned above, constraints from the CMB imply Ωm = 0.27 and Ωb = 0.046. However, our inventory of known baryons in the local Universe, summing over all observed stars, gas, etc., comes up short of the total. For example, [42] estimate that the sum of stars and cold gas is only ∼ 5% of Ωb. While there now seems to be a good chance that many of the missing baryons are in the form of highly ionized gas in the warm-hot intergalactic medium (WHIM), we are still far from being able to give a confident account of where all the baryons reside. Indeed, there could be multiple distinct reservoirs in addition to the WHIM, each comparable to the mass in stars, within the current uncertainties. But there is another missing baryons challenge, namely the halo-by-halo missing baryons. Indeed, each CDM halo can, to a first approximation, be thought of as a microcosm of the whole. As such, one would naively expect each halo to have the same baryon fraction as the whole Universe, fb = Ωb/Ωm = 0.17. On the scale of clusters of galaxies, this is approximately true (but still systematically low), but for individual galaxies, observations depart from this in a systematic way which we have yet to understand, and which has nothing to do with the truncation radius. The ratio of the galaxy-detected baryon fraction over the cosmological one, fd, is plotted as a function of the potential well of the systems in Figure 2 [284]. There is a clear correlation, less massive objects being much more dark-matter dominated than massive ones. This correlation is a priori not predicted at all by ΛCDM, at least not with the correct shape [273]. This missing baryons challenge is actually closely related to the baryonic Tully-Fisher relation, which we expand on in Section 4.3.1.

Figure 2

The fraction of the expected baryons that are detected as a function of potential-well depth (bottom axis) and mass (top). Measurements are referenced to the radius R500, where the enclosed density is 500 times the cosmic mean [284]. The detected baryon fraction fd = Mb/(0.17 M500), where Mb is the detected baryonic mass, 0.17 is the universal baryon fraction [229], and M500 is the dynamical mass (baryonic + dark mass) enclosed by R500. Each point is a bin representing many objects. Gray triangles represent galaxy clusters, which come close to containing the cosmic fraction. The detected baryon fraction declines systematically for smaller systems. Dark-blue circles represent star-dominated spiral galaxies. Light-blue circles represent gas-dominated disk galaxies. Orange squares represent Local Group dwarf satellites for which the baryon content can be less than 1% of the cosmic value. Where these missing baryons reside is one of the challenges currently faced by ΛCDM.
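The bookkeeping behind the quantities fb and fd can be sketched in a few lines. The density parameters are those quoted in the text, while the example system (10¹⁰ M⊙ of detected baryons inside a dynamical mass of 10¹² M⊙) is purely hypothetical and chosen only to illustrate the arithmetic.

```python
OMEGA_M = 0.27    # total matter density parameter (quoted in the text)
OMEGA_B = 0.046   # baryon density parameter (quoted in the text)


def cosmic_baryon_fraction(omega_b=OMEGA_B, omega_m=OMEGA_M):
    """Universal baryon fraction f_b = Omega_b / Omega_m."""
    return omega_b / omega_m


def detected_baryon_fraction(m_b, m_500, f_b=0.17):
    """f_d = M_b / (f_b * M_500); both masses in the same units."""
    return m_b / (f_b * m_500)


f_b = cosmic_baryon_fraction()
dm_to_baryon = (OMEGA_M - OMEGA_B) / OMEGA_B

# Hypothetical system, for illustration only.
f_d_example = detected_baryon_fraction(1.0e10, 1.0e12)

print(f"f_b             = {f_b:.3f}")
print(f"Omega_dm/Omega_b = {dm_to_baryon:.2f}")
print(f"f_d (example)   = {f_d_example:.3f}")
```

With these inputs fb rounds to 0.17 and Ωdm/Ωb to about 5, consistent with the values used in the text.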

Let us note that, while challenges 1 to 3 are not yet real smoking guns against the ΛCDM model, challenges 4 to 10 concern processes happening on kpc scales, for which it is fair to consider that the model is not very predictive because baryon physics should play a more important role, and this is hard to take into account rigorously. Nevertheless, it is not sufficient to qualitatively invoke hand-waving baryon physics to avoid confronting the predictions of ΛCDM with observations. It is also mandatory to show that the feedback from the baryons, which is needed to solve the observational problems, is what would quantitatively happen in a physical galaxy. This, presently, is not yet the case for the aforementioned challenges. These challenges are, however, “model-dependent problems”, in the sense of being failed predictions of a given model that would not have appeared a priori surprising without the standard concordance model at hand. This means that subtly changing some parameters of the model (e.g., swapping CDM for WDM, making DM more self-interacting, etc.) might help solve at least a few of them. But what is even more challenging is a set of observations that appear surprising independently of any specific dark matter model, as they involve a fine-tuned relation between the distribution of visible and dark matter. These are what we call hereafter “unpredicted observations”.

4.3 Unpredicted observations

There are several important examples of systematic relations between the dynamics of galaxies (in theory presumed to be dominated by dark matter) and their baryonic content. These relations are fully empirical, and as such must be explained by any viable theory. As we shall see, they inevitably involve a critical acceleration scale, or equivalently, a critical surface density of baryonic matter.

4.3.1 Baryonic Tully-Fisher relation

One of the strongest correlations in extragalactic astronomy is the Tully-Fisher relation [467]. Originally identified as an empirical relation between a galaxy’s luminosity and its HI line-width, it has been widely employed as a distance indicator. Though extensively studied for decades, the physical basis of the relation remains unclear.

Luminosity and line-width are readily accessible observational quantities. The optical luminosity of a galaxy is a proxy for its stellar mass, and the HI line-width is a proxy for its rotation velocity. The quality of the correlation improves as more accurate indicators of these quantities are employed. For example, resolved rotation curves, where the flat portion of the rotation curve Vf or the maximum peak velocity Vp can be measured, give relations that are tighter than those utilizing only line-width information [108]. Similarly, the scatter declines as we shift from optical luminosities to those in the near-infrared [475] as the latter are expected to give a more reliable mapping of starlight to stellar mass [42].

It was then realized [322, 157, 283] that a more fundamental relation was that between the total observed baryonic mass and the rotation velocity. In most bright galaxies, the stars harbor the majority of the detected baryonic mass, so luminosity suffices as a proxy for mass. The next-most-important known reservoir of baryons is the neutral atomic hydrogen (HI) of the interstellar medium. As studies have probed down the mass spectrum to lower-mass, more slowly rotating systems, a higher preponderance of gas-rich galaxies is found. The luminous Tully-Fisher relation breaks down [283, 272], but a tight relation persists if, instead of luminosity, the detected baryonic mass Mb = M* + Mg is used [283, 475, 42, 272, 353, 31, 445, 462, 276]. This is the Baryonic Tully-Fisher Relation (BTFR), plotted in Figure 3.

Figure 3

The Baryonic Tully-Fisher (mass-rotation velocity) relation for galaxies with well-measured outer velocities Vf. The baryonic mass is the combination of observed stars and gas: Mb = M* + Mg. Galaxies have been selected that have well observed, extended rotation curves from 21 cm interferometric observations providing a good measure of the outer, flat rotation velocity. The dark blue points are galaxies with M* > Mg [272]. The light blue points have M* < Mg [276] and are generally less precise in velocity, but more accurate in the sense that the result is less sensitive to possible systematics on the stellar mass-to-light ratio. For a detailed discussion of the stellar mass-to-light ratios used here, see [272, 276]. The dotted line has slope 4 corresponding to a constant acceleration parameter, \(1.2 \times 10^{-10}\,{\rm m\,s^{-2}}\). The dashed line has slope 3 as expected in ΛCDM with the normalization expected if all of the baryons associated with dark matter halos are detected. The difference between these two lines is the origin of the variation in the detected baryon fraction in Figure 2.

The luminous Tully-Fisher relation extends over about two decades in luminosity. Recent work extending the relation to low mass, typically LSB and gas rich galaxies [31, 445, 462] extends the dynamic range of the BTFR to five decades in baryonic mass. Over this range, the BTFR has remarkably little intrinsic scatter (consistent with zero given the observational errors) and is well described as a power law, or equivalently, as a straight line in log-log space:

$$\log {M_b} = \alpha \log {V_f} - \log \beta$$

with slope α = 4 [272, 445, 276]. This slope is consistent with a constant acceleration scale \(a = V_f^4/(G{M_b})\) such thatFootnote 10 the normalization constant β = Ga.
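To make the scaling concrete, here is a minimal sketch of the BTFR with slope 4 and β = Ga, using the acceleration parameter of the dotted line in Figure 3. The function names and the example velocity of 200 km/s are illustrative choices, not taken from the cited fits.

```python
G = 6.674e-11        # Newton's constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30     # solar mass [kg]
A_BTFR = 1.2e-10     # acceleration parameter of the dotted line [m/s^2]


def btfr_mass_msun(v_f_kms, a=A_BTFR):
    """Baryonic mass [Msun] from M_b = V_f^4 / (G a), with V_f in km/s."""
    v = v_f_kms * 1.0e3                 # convert km/s to m/s
    return v**4 / (G * a) / M_SUN


def acceleration_parameter(v_f_kms, m_b_msun):
    """Acceleration a = V_f^4 / (G M_b) [m/s^2] for a single galaxy."""
    v = v_f_kms * 1.0e3
    return v**4 / (G * m_b_msun * M_SUN)


m_200 = btfr_mass_msun(200.0)   # roughly 1e11 Msun
m_100 = btfr_mass_msun(100.0)   # a factor 2^4 = 16 lower

print(f"M_b(200 km/s) = {m_200:.2e} Msun")
print(f"M_b(100 km/s) = {m_100:.2e} Msun")
```

Halving the rotation velocity lowers the baryonic mass by a factor of 16, which is what a slope of 4 in log-log space means.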

The acceleration scale \(a \approx 10^{-10}\,{\rm m\,s^{-2}} \sim \Lambda^{1/2}\) (Eq. 1) is thus present in the data. Figure 4 shows the distribution of this acceleration \(V_f^4/(G{M_b})\), around the best fit line in Figure 3, strongly peaked around \(\sim 2 \times 10^{-62}\) in natural units. As we shall see, this acceleration scale arises empirically in a variety of distinct situations involving the mass discrepancy problem.

Figure 4

Histogram of the accelerations \(a = V_f^4/(G{M_b})\) (bottom axis) and in natural units [\(c^4/(G m_P)\), where \(m_P\) is the Planck mass] for galaxies with well measured Vf. The data are peaked around a characteristic value of \(\sim 10^{-10}\,{\rm m\,s^{-2}}\) (\(\sim 2 \times 10^{-62}\) in natural units).
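The conversion between the two sets of units in Figure 4 can be checked directly. The sketch below divides an acceleration in m s⁻² by the natural unit c⁴/(G mP); the constants are standard rounded values and the function name is an illustrative choice.

```python
C = 2.99792458e8       # speed of light [m/s]
G = 6.674e-11          # Newton's constant [m^3 kg^-1 s^-2]
M_PLANCK = 2.176e-8    # Planck mass [kg]

# Natural unit of acceleration, c^4 / (G m_P), in m/s^2.
PLANCK_ACCELERATION = C**4 / (G * M_PLANCK)


def to_natural_units(a_si):
    """Express an acceleration given in m/s^2 in units of c^4 / (G m_P)."""
    return a_si / PLANCK_ACCELERATION


a0_natural = to_natural_units(1.2e-10)
print(f"c^4/(G m_P)        = {PLANCK_ACCELERATION:.2e} m/s^2")
print(f"a0 in natural units = {a0_natural:.1e}")
```

For 1.2 × 10⁻¹⁰ m s⁻² this gives about 2 × 10⁻⁶², matching the peak quoted in the caption.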

A BTFR of the observed form does not arise naturally in ΛCDM. The naive expectation is \(\alpha = 3\) and \(\beta = 10f_V^3G{H_0}\) [446]Footnote 11 where H0 is the Hubble constant and fV is a factor of order unity (currently estimated to be ≈ 1.3 [361]) that relates the observed Vf to the circular velocity of the potential at the virial radiusFootnote 12. This modest fudge factor is necessary because ΛCDM does not explicitly predict either axis of the observed BTFR. Rather, there is a relationship between total (baryonic plus dark) mass and rotation velocity at very large radii. This simple scaling fails (dashed line in Figure 3), obliging us to introduce an additional fudge factor fd [273, 284] that relates the detected baryonic mass to the total mass of baryons available in a halo. This mismatch drives the variation in the detected baryon fraction fd seen in Figure 2. A constant fd is excluded by the difference between the observed and predicted slopes; fd must vary with Vf, or M, or the gravitational potential Φ.

This brings us to the first fine-tuning problem posed by the data. There is essentially zero intrinsic scatter in the BTFR [276], while the detected baryon fraction fd could, in principle, obtain any value between zero and unity. Somehow galaxies must “know” what the circular velocity of the halo they reside in is so that they can make observable the correct fraction of baryons.

Quantitatively, in the ΛCDM picture, the baryonic mass plotted in the BTFR (Figure 3) is Mb = M* + Mg while the total baryonic mass available in a halo is fbMtot. The difference between these quantities implies a reservoir of dark baryons in some undetected form, Mother. It is commonly speculated that the undetected baryons could be in a hard-to-detect hot, diffuse, ionized phase mixed in with the dark matter halo (and extending to comparable radius), or that the missing baryons have been entirely blown away by winds from supernovae. For the purposes of this argument, it does not matter which form the dark baryons take. All that matters is that a substantial mass of them are required so that [283]

$${f_d} = {{{M_b}} \over {{f_b}{M_{{\rm{tot}}}}}} = {{{M_\ast} + {M_g}} \over {{M_\ast} + {M_g} + {M_{{\rm{other}}}}}}.$$

Since there is negligible intrinsic scatter in the observed BTFR, there must be effectively zero scatter in fd. By inspection of Eq. 3, it is apparent that small scatter in fd can only be obtained naturally in the limits M* + Mg ≫ Mother so that fd → 1 or M* + Mg ≪ Mother so that fd → 0. Neither of these limits applies. We require not only an appreciable mass in dark baryons Mother, but we need the fractional mass of these missing baryons to vary in lockstep with the observed rotation velocity Vf. Put another way, for any given galaxy, we know not only how many baryons we see, but also how many we do not see: a remarkable feat of non-observation.
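The required lockstep can be illustrated by dividing the observed slope-4 relation by the naive slope-3 expectation. The sketch below assumes that the ΛCDM normalization β = 10 fV³ G H0 applies to the total halo mass, so that Mtot = Vf³/β and the available baryons are fb Mtot. This reading, together with H0 = 70 km/s/Mpc, is an assumption made for illustration. The resulting numbers are indicative only.

```python
G = 6.674e-11            # Newton's constant [m^3 kg^-1 s^-2]
MPC_IN_M = 3.0857e22     # one megaparsec [m]

H0 = 70.0e3 / MPC_IN_M   # assumed Hubble constant, 70 km/s/Mpc, in [1/s]
A0 = 1.2e-10             # BTFR acceleration parameter [m/s^2]
F_B = 0.17               # cosmic baryon fraction (from the text)
F_V = 1.3                # velocity fudge factor f_V (from the text)


def implied_detected_fraction(v_f_kms):
    """f_d = M_b / (f_b M_tot) implied by the two scaling relations."""
    v = v_f_kms * 1.0e3
    m_b = v**4 / (G * A0)                     # observed BTFR, slope 4
    m_tot = v**3 / (10.0 * F_V**3 * G * H0)   # assumed halo scaling, slope 3
    return m_b / (F_B * m_tot)


for v_kms in (20.0, 50.0, 100.0, 200.0):
    print(f"V_f = {v_kms:5.0f} km/s  ->  f_d ~ {implied_detected_fraction(v_kms):.2f}")
```

Under these assumptions fd is simply proportional to Vf, so a galaxy rotating ten times more slowly must hide ten times more of its baryons, with essentially no scatter.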

Another remarkable fact about the BTFR is that it shows no residuals with variations in the distribution of baryons [517, 443, 109, 271]. Figure 5 shows deviations from the BTFR as a function of the characteristic baryonic surface density of the galaxies, as defined in [271], i.e., \({\Sigma _b} = 0.75{M_b}/R_p^2\) where Rp is the radius at which the rotation curve Vb(r) of baryons peaks. Over several decades in surface density, the BTFR is completely insensitive to variations in the mass distribution of the baryons. This is odd because, a priori, \(V^2 \sim M/R\), and thus \(V^4 \sim M\Sigma\). Yet the BTFR is \({M_b} \sim V_f^4\) with no dependence on Σ. This brings us to a second fine-tuning problem. For some time, it was thought [156] that spiral galaxies all had very nearly the same surface brightness (a condition formerly known as “Freeman’s Law”). If this is indeed the case, the observed BTFR naturally follows from the constancy of Σ. However, there do exist many LSB galaxies [264] that violate the constancy of surface brightness implied in Freeman’s Law. Thus, one would expect them to deviate systematically from the Tully-Fisher relation, with lower surface brightness galaxies having lower rotation velocities at a given mass. Yet they do not. Thus, one must fine-tune the mass surface density of the dark matter to precisely make up for that of the baryons [279]. As the surface density of baryons declines, that of the dark matter must increase just so as to fill in the difference (Figure 6 [271]). The relevant quantity is the dynamical surface density enclosed within the radius, where the velocity is measured. The latter matters little along the flat portion of the rotation curve, but the former is the sum of dark and baryonic matter.
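The size of the residual one would naively expect can be estimated with a short sketch. It assumes that the velocity is set by the baryons alone through V² = GM/Rp, with Rp fixed by Σb = 0.75 Mb/Rp², ignoring disk geometry and any dark matter. The mass and surface densities used are hypothetical round numbers.

```python
import math

G_GAL = 4.301e-6   # Newton's constant [kpc (km/s)^2 / Msun]


def newtonian_peak_velocity(m_b, sigma_b):
    """Velocity [km/s] from V^2 = G M_b / R_p, with Sigma_b = 0.75 M_b / R_p^2.

    m_b is in Msun and sigma_b in Msun/kpc^2.
    """
    r_p = math.sqrt(0.75 * m_b / sigma_b)   # radius implied by Sigma_b [kpc]
    return math.sqrt(G_GAL * m_b / r_p)


def residual_slope(m_b, sigma_lo, sigma_hi):
    """d log V / d log Sigma_b at fixed baryonic mass."""
    d_log_v = math.log10(newtonian_peak_velocity(m_b, sigma_hi)
                         / newtonian_peak_velocity(m_b, sigma_lo))
    d_log_sigma = math.log10(sigma_hi / sigma_lo)
    return d_log_v / d_log_sigma


# Hypothetical galaxy of 1e10 Msun over two decades in surface density.
slope = residual_slope(1.0e10, 1.0e7, 1.0e9)
print(f"expected d log V / d log Sigma = {slope:.2f}")
```

The slope of 1/4 follows from V⁴ ∼ MΣ, so two decades in surface density would shift the velocity by about half a dex (a factor of roughly 3) at fixed mass.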

Figure 5

Residuals (δ log Vf) from the baryonic Tully-Fisher relation as a function of a galaxy’s characteristic baryonic surface density (\({\Sigma _b} = 0.75{M_b}/R_p^2\) [271], Rp being the radius at which the contribution of baryons to the rotation curve peaks). Color differentiates between star (dark blue) and gas (light blue) dominated galaxies as in Figure 3, but not all galaxies there have sufficient data (especially of Rp) to plot here. Stellar masses have been estimated with stellar population synthesis models [42]. More accurate data, with uncertainty on rotation velocity less than 5%, are shown as larger points; less accurate data are shown as smaller points. The rotation velocity of galaxies shows no dependence on the distribution of baryons as measured by Σb or Rp. This is puzzling in the conventional context, where \(V^2 = GM/r\) should lead to a strong systematic residual [109].

Figure 6

The fractional contribution to the total velocity Vp at the radius Rp where the contribution of the baryons peaks for both baryons (Vb/Vp, top) and dark matter (Vdm/Vp, bottom). Points as per Figure 5. As the baryonic surface density increases, the contribution of the baryons to the total gravitating mass increases. The dark matter contribution declines in compensation, maintaining a see-saw balance that manages to leave no residual in the BTFR (Figure 5). The absolute amplitude of Vb and Vdm depends on the choice of stellar mass estimator, but the fine-tuning between them must persist for any choice of M*/L.

One might be able to avoid fine-tuning if all galaxies are dark-matter dominated [109]. In the limit Σdm ≫ Σb, the dynamics are entirely dark-matter dominated and the distribution of the baryons is irrelevant. There is some systematic uncertainty in the mass-to-light ratios of stellar populations [42], making such an approach a priori tenable. In effect, we return to the interpretation of Σ ∼ constant originally made by [3] in the context of Freeman’s Law, but now we invoke a constant surface density of CDM rather than of baryons. But as we will see, such an interpretation, i.e., that Σb ≪ Σdm in all disk galaxies, is flatly contradicted by other observations (e.g., Figure 9 and Figure 13).

The Tully-Fisher relation is remarkably persistent. Originally posited for bright spirals, it applies to galaxies that one would naively expect to deviate from it. This includes low-luminosity, gas-dominated irregular galaxies [445, 462, 276], LSB galaxies of all luminosities [517, 443], and even tidal dwarfs formed in the collision of larger galaxies [165]. Such tidal dwarfs may be especially important in this context (see also Section 6.5.4). Galactic collisions should be very effective at segregating dark and baryonic matter. The rotating gas disks of galaxies that provide the fodder for tidal tails and the tidal dwarfs that form within them initially have nearly circular, coplanar orbits. In contrast, the dark-matter particles are on predominantly radial orbits in a quasi-spherical distribution. This difference in phase space leads to tidal tails that themselves contain very little dark matter [72]. When tidal dwarfs form from tidal debris, they should be largely devoidFootnote 13 of dark matter. Nevertheless, tidal dwarfs do appear to contain dark matter [72] and obey the BTFR [165].

The critical acceleration scale of Eq. 1 also appears in non-rotating galaxies. Elliptical galaxies are three-dimensional stellar systems supported more by random motions than organized rotation. First of all, in such systems of measured velocity dispersion σ, the typical acceleration \(\sigma^2/R\) is also on the order of a0 within a factor of a few, where R is the effective radius of the system [401]. Moreover, they obey an analogous relation to the Tully-Fisher one, known as the Faber-Jackson relation (Figure 7). In bulk, the data for these star-dominated galaxies follow the relation \(\sigma^4/(G{M_\ast}) \propto a_0\) (dotted line in Figure 7). This is not strictly analogous to the flat part of the rotation curves of spiral galaxies, the dispersion typically being measured at smaller radii, where the equivalent circular velocity curve is often falling [367, 323], or in a temporary plateau before falling again (see also Section 6.6.1). Indeed, unlike the case in spiral galaxies, where the distribution of stars is irrelevant, it clearly does matter in elliptical galaxies (the Faber-Jackson relation is just one projection of the “fundamental plane” of elliptical galaxies [85]). This is comforting: at small radii in dense stellar systems where the baryonic mass of stars is clearly important, the data behave as Newton predicts.

Figure 7

The Faber-Jackson relation for spheroidal galaxies, including both elliptical galaxies (red squares, [85, 232]) and Local Group dwarf satellites [285] (orange squares are satellites of the Milky Way; pink squares are satellites of M31). In analogy with the Tully-Fisher relation for spiral galaxies, spheroidal galaxies follow a relation between stellar mass and line of sight velocity dispersion (σ). The dotted line represents a constant value of the acceleration parameter \(\sigma^4/(G{M_\ast})\). Note, however, that this relation is different from the BTFR because it applies to the bulk velocity dispersion while the BTFR applies to the asymptotic circular velocity. In the context of Milgrom’s law (Section 5) the Faber-Jackson relation is predicted only when relying on assumptions such as isothermality, isotropy, and the slope of the baryonic density distribution (see 3rd law of motion in Section 5.2). In addition, not all pressure-supported systems are in the weak-acceleration regime. So, in the context of Milgrom’s law, deviations from the weak-field regime, from isothermality and from isotropy, as well as variations in the baryonic density distribution slope, would thus explain the scatter in this relation.

The acceleration scale a0 is clearly imprinted on the data for local galaxies. This is an empirical statement that might not hold at all times, perhaps evolving over cosmic time or evaporating altogether. Substantial efforts have been made to investigate the Tully-Fisher relation to high redshift. To date, there is no persuasive evidence of evolution in the zero point of the BTFR out to z = 0.6 [356, 357] and perhaps even to z = 1 [485]. One must exercise caution in interpreting such results given the difficulty inherent in peering many Gyr back in cosmic time. Nonetheless, it appears that the scale a0 remains present in the data and has not obviously changed over the more recent half of the age of the Universe.

4.3.2 The role of surface density

The Freeman limit [156] is the maximum central surface brightness in the distribution of galaxy surface brightnesses. Originally thought to be a universal surface brightness, it has since become clear that instead galaxies exist over a wide range in surface brightness [264]. In the absence of a perverse and fine-tuned anti-correlation between surface brightness and stellar mass-to-light ratio [517], this implies a comparable range in baryonic surface density (Figure 8).

Figure 8

Size and surface density. The characteristic surface density of baryons as defined in Figure 5 is plotted against their dynamical scale length Rp in the left panel. The dark-blue points are star-dominated galaxies and the light-blue ones gas-dominated. High characteristic surface densities at low Rp in the left panel are typical of bulge-dominated galaxies. The stellar disk component of most spiral galaxies is well approximated by the exponential disk with \(\Sigma (R) = {\Sigma _0}{e^{- R/{R_d}}}\). This disk-only central surface density and the exponential scale length of the stellar disk are plotted in the right panel. Galaxies exist over a wide range in both size and surface density. There is a maximum surface density threshold (sometimes referred to as Freeman’s limit) above which disks become very rare [264]. This is presumably a stability effect, as purely Newtonian disks are unstable [343, 415]. Stable disks only appear below a critical surface density \({\Sigma _\dagger} \approx {a_0}/G\) [299, 77].

An upper limit to the surface brightness distribution is interesting in the context of disk stability. Recall that dynamically cold, purely Newtonian disks are subject to potentially-self-destructive instabilities, one cure being to embed them in the potential wells of spherical dark-matter halos [343]. While the proper criterion for stability is much debated [131, 415], it is clear that the dark matter halo moderates the growth of instabilities and that the ratio of halo to disk self gravity is a relevant quantity. The more self-gravitating a disk is, the more likely it is to suffer undamped growth of instabilities. But, in principle, galaxies with a baryonic disk and a dark matter halo are totally scalable: if a galaxy model has a certain dynamics, and one multiplies all densities by any (positive) constant (and also scales the velocities appropriately) one gets another galaxy with exactly the same dynamics (with scaled time scales). So if one is stable, so is the other. In turn, the mere fact that there might be an upper limit to Σb is a priori surprising, and even more so that there might be a coincidence of this upper limit with the acceleration scale a0 identified dynamically.

The scale \({\Sigma _\dagger} = {a_0}/G\) is clearly present in the data (Figure 8). Selection effects make high-surface-brightness (HSB) galaxies easy to detect and hence discover, but their intrinsic numbers appear to decline exponentially when the central surface density of the stellar disk \({\Sigma _0} > {\Sigma _\dagger}\) [264]. It seems natural to associate the dynamical scale a0 with the disk stability scale since they are numerically indistinguishable and both arise in the context of the mass discrepancy. However, there is no reason to expect this in ΛCDM, which predicts denser dark matter halos than observed [280, 169, 167, 241, 243, 478, 118]. Such dense dark matter halos could stabilize much higher density disks than are observed to exist. Lacking a clear mechanism to specify this scale, it is introduced into models by hand [115].
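To give a feel for the size of the surface density scale a0/G, the short sketch below converts it to the units usually quoted for galaxy disks. The numerical value a0 = 1.2 × 10⁻¹⁰ m s⁻² is an assumption made here for illustration (the text only fixes its order of magnitude), and the function names are hypothetical.

```python
# Assumed SI constants; A0 is an illustrative value, not one fixed by the text.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
A0 = 1.2e-10         # acceleration scale a0, m s^-2 (assumption)
M_SUN = 1.989e30     # solar mass, kg
PARSEC = 3.0857e16   # parsec, m


def critical_surface_density(a0=A0):
    """Return the surface density scale a0/G in kg m^-2."""
    return a0 / G


def to_msun_per_pc2(sigma_si):
    """Convert a surface density from kg m^-2 to solar masses per square parsec."""
    return sigma_si * PARSEC**2 / M_SUN


sigma_crit = critical_surface_density()
print(f"a0/G = {sigma_crit:.2f} kg/m^2 = {to_msun_per_pc2(sigma_crit):.0f} Msun/pc^2")
```

With these assumed constants the scale comes out at several hundred solar masses per square parsec, which is the regime of the highest-surface-density disks discussed above.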

Poisson’s equation provides a direct relation between the force per unit mass (centripetal acceleration in the case of circular orbits in disk galaxies), the gradient of the potential, and the surface density of gravitating mass. If there is no dark matter, the observed surface density of baryons must correlate perfectly with the dynamical acceleration. If, on the other hand, dark matter dominates the dynamics of a system, as we might infer from Figure 5 [279, 109], then there is no reason to expect a correlation between acceleration and the dynamically-insignificant baryons. Figure 9 shows the dynamical acceleration as a function of baryonic surface density in disk galaxies. The acceleration \({a_p} = V_p^2/{R_p}\) is measured at the radius Rp, where the rotation curve Vb(r) of baryons peaks. Given the systematic variation of rotation curve shape [376, 495], the specific choice of radii is unimportant. Nevertheless, this radius is advocated by [109] since this maximizes the possibility of perceiving the baryonic contribution in the plot of Figure 5. That this contribution is not present leads to the inference that Σb ≪ Σdm in all disk galaxies [109]. This is directly contradicted by Figure 9, which shows a clear correlation between ap and Σb. The higher the surface density of baryons, the higher the observed acceleration. The slope of the relation is not unity, ap ∝ Σb, as we would expect in the absence of a mass discrepancy, but rather \({a_p} \propto \Sigma _b^{1/2}\). To simultaneously explain Figure 5 and Figure 9, there must be a strong fine-tuning between dark and baryonic surface densities (i.e., Figure 6), a sort of repulsion between them. Such a repulsion is, however, contradicted by the correlations between baryonic and dark matter bumps and wiggles in rotation curves (see Section 4.3.4).

Figure 9

The dynamical acceleration \({a_p} = V_p^2/{R_p}\) in units of a0 plotted against the characteristic baryonic surface density [275]. Points as per Figure 5. The dotted line shows the relation ap = GΣb that would be obtained if the visible baryons sufficed to explain the observed velocities in Newtonian dynamics. Though the data do not follow this line, they do show a correlation \(({a_p} \propto \Sigma _b^{1/2})\). This clearly indicates a dynamical role for the baryons, in contradiction to the simplest interpretation [109] of Figure 5 that dark matter completely dominates the dynamics.

4.3.3 Mass discrepancy-acceleration relation

So far we have discussed total quantities. For the BTFR, we use the total observed mass of a galaxy and its characteristic rotation velocity. Similarly, the dynamical acceleration-baryonic surface density relation uses a single characteristic value for each galaxy. These are not the only ways in which the “magical” acceleration constant a0 appears in the data. In general, the mass discrepancy only appears at very low accelerations a < a0 and not (much) above a0. Equivalently, the need for dark matter only becomes clear at very low baryonic surface densities \(\Sigma < {\Sigma _\dagger} = {a_0}/G\). Indeed, the amplitude of the mass discrepancy in galaxies anti-correlates with acceleration [270].

Reference [270] examined the role of various possible scales, as well as the effects of different stellar mass-to-light ratio estimators, on the mass discrepancy problem. The amplitude of the mass discrepancy, as measured by \({(V/{V_b})^2}\), the squared ratio of the observed velocity to that predicted by the observed baryons, depends on the choice of estimator for stellar M*/L. However, for any plausible (non-zero) M*/L, the amplitude of the mass discrepancy correlates with acceleration (Figure 10) and baryonic surface density, as originally noted in [382, 266, 406]. It does not correlate with radius and only weakly with orbital frequencyFootnote 14.

Figure 10

The mass discrepancy in spiral galaxies. The mass discrepancy is defined [270] as the ratio \({V^2}/V_b^2\) where V is the observed velocity and Vb is the velocity attributable to visible baryonic matter. The ratio of squared velocities is equivalent to the ratio of total-to-baryonic enclosed mass for spherical systems. No dark matter is required when V = Vb, only when V > Vb. Many hundreds of individual resolved measurements along the rotation curves of nearly one hundred spiral galaxies are plotted. The top panel plots the mass discrepancy as a function of radius. No particular linear scale is favored. Some galaxies exhibit mass discrepancies at small radii while others do not appear to need dark matter until quite large radii. The middle panel plots the mass discrepancy as a function of centripetal acceleration \(a = {V^2}/r\), while the bottom panel plots it against the acceleration \({g_N} = V_b^2/r\) predicted by Newton from the observed baryonic surface density Σb. Note that the correlation appears a little better with gN because the data are stretched out over a wider range in gN than in a. Note also that systematics on the stellar mass-to-light ratios can make this relation slightly more blurred than shown here, but the relation is nevertheless always present irrespective of the assumptions on stellar mass-to-light ratios [270]. Thus, there is a clear organization: the amplitude of the mass discrepancy increases systematically with decreasing acceleration and baryonic surface density.
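The three quantities plotted in Figure 10 follow directly from a rotation-curve point (r, V, Vb). As a minimal sketch, the snippet below evaluates the mass discrepancy, the centripetal acceleration and the Newtonian baryonic acceleration for a few made-up points. The function name and the sample numbers are hypothetical and are not taken from the data of [270].

```python
KPC_IN_M = 3.0857e19   # metres per kiloparsec
KMS_IN_MS = 1.0e3      # metres per second in one km/s


def mass_discrepancy(r_kpc, v_obs_kms, v_bar_kms):
    """Return (D, a, g_N) for one rotation-curve point.

    D   = (V / V_b)^2   mass discrepancy
    a   = V^2 / r       observed centripetal acceleration, m s^-2
    g_N = V_b^2 / r     Newtonian acceleration from the baryons, m s^-2
    """
    r = r_kpc * KPC_IN_M
    v = v_obs_kms * KMS_IN_MS
    v_b = v_bar_kms * KMS_IN_MS
    return (v / v_b) ** 2, v**2 / r, v_b**2 / r


# Hypothetical points (radius in kpc, V in km/s, V_b in km/s); not real data.
points = [(2.0, 100.0, 95.0), (10.0, 120.0, 80.0), (20.0, 120.0, 60.0)]
for r_kpc, v, v_b in points:
    d, a, g_n = mass_discrepancy(r_kpc, v, v_b)
    print(f"r = {r_kpc:5.1f} kpc  D = {d:5.2f}  a = {a:.2e}  g_N = {g_n:.2e} m/s^2")
```

By construction D = a/gN, so the middle and bottom panels show the same discrepancy against two different acceleration axes.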

There is no reason in the dark matter picture why the mass discrepancy should correlate with any physical scale. Some systems might happen to contain lots of dark matter; others very little. In order to make a prediction with a dark matter model, it is necessary to model the formation of the dark matter halo, the condensation of gas within it, the formation of stars therefrom, and any feedback processes whereby the formation of some stars either enables or suppresses the formation of further stars. This complicated sequence of events is challenging to model. Baryonic “gastrophysics” is particularly difficult, and has thus far precluded the emergence of a clear prediction for galaxy dynamics from ΛCDM.

ΛCDM does make a prediction for the distribution of mass in baryonless dark matter halos: the NFW halo [332, 333]. These are remarkable for being scale free. Small halos have a profile similar to large halos. No feature stands out that marks a unique physical scale as observed. Galaxies do not resemble pure NFW halos [416], even when dark matter dominates the dynamics as in LSB galaxies [241, 243, 118]. The inference in ΛCDM is that gastrophysics, especially the energetic feedback from stellar winds and supernova explosions, plays a critical role in sculpting observed galaxies. This role is not restricted to the minority baryonic constituents; it must also affect the majority dark matter [176]. Simulations incorporating these effects in a quasi-realistic way are extremely expensive computationally, so a comprehensive survey of the plausible parameter space occupied by such models has yet to be made. We have no reason to expect that a particular physical scale will generically emerge as the result of baryonic gastrophysics. Indeed, feedback from star formation is inherently a random process. While it is certainly possible for simple laws to emerge from complicated physics (e.g., the fact that SNIa are standard candles despite the complicated physics involved), the more common situation is for chaos to beget chaos. Therefore, it seems unnatural to imagine feedback processes leading to the orderly behavior that is observed (Figure 10); nor is it obvious how they would implicate any particular physical scale. Indeed, the dark matter halos formed in ΛCDM simulations [332, 333] provide an initial condition with greater scatter than the final observed one [280, 478], so we must imagine that the chaotic processes of feedback not only impart order, but do so in a way that cancels out some of the scatter in the initial conditions.

In any case, and whatever the reason for it, a physical scale is clearly observationally present in the data: a0 (Eq. 1). At high accelerations a ≫ a0, there is no indication of the need for dark matter. Below this acceleration, the mass discrepancy appears. It cannot be emphasized enough that the role played by a0 in the BTFR and its role as a transition acceleration have strictly no intrinsic link with each other; they are fully independent. There is nothing in ΛCDM that stipulates that these two relations (the existence of a transition acceleration and the BTFR) should exist at all, and even less that these should harbour an identical acceleration scale.

Thus, it is important to realize not only that the relevant dynamical scale is one of acceleration, not size, but also that the mass discrepancy appears only at extremely low accelerations. Just as galaxies are much bigger than the Solar system, so too are the centripetal accelerations experienced by stars orbiting within a galaxy much smaller than those experienced by planets in the Solar system. Many of the precise tests of gravity that have been made in the Solar system do not explore the relevant regime of physical parameter space. This is emphasized in Figure 11, which extends the mass discrepancy-acceleration relation to Solar system scales. Many decades in acceleration separate the Solar system from galaxies. Aside from the possible exception of the Pioneer anomaly, there is no hint of a discrepancy in the Solar system: V = Vb. Even the Pioneer anomalyFootnote 15 is well removed from the regime where the mass discrepancy manifests in galaxies, and is itself much too subtle to be perceptible in Figure 11. Indeed, to within a factor of ∼ 2, no system exhibits a mass discrepancy at accelerations a ≫ a0.
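The gulf between Solar system and galactic accelerations can be checked with a few lines of arithmetic. The sketch below evaluates the Newtonian acceleration GM☉/r² at the orbital radii of some planets and compares it with a0. The value a0 = 1.2 × 10⁻¹⁰ m s⁻² and the rounded orbital radii are assumptions introduced for illustration.

```python
import math

G_M_SUN = 1.327e20   # heliocentric gravitational parameter, m^3 s^-2
AU = 1.496e11        # astronomical unit, m
A0 = 1.2e-10         # acceleration scale a0, m s^-2 (assumed value)

# Approximate semi-major axes in AU (rounded, for illustration only).
ORBITS_AU = {"Mercury": 0.387, "Earth": 1.0, "Jupiter": 5.20, "Neptune": 30.07}


def solar_acceleration(r_au):
    """Newtonian acceleration toward the Sun at heliocentric distance r_au, in m s^-2."""
    return G_M_SUN / (r_au * AU) ** 2


for name, r_au in ORBITS_AU.items():
    a = solar_acceleration(r_au)
    print(f"{name:8s} a = {a:.2e} m/s^2   log10(a/a0) = {math.log10(a / A0):.1f}")
```

Under these assumptions even the outermost planet sits more than four decades above a0, consistent with the statement that Solar system tests do not probe the regime where the discrepancy appears.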

Figure 11

The mass-discrepancy-acceleration relation from Figure 10 extended to solar-system scales (each planet is labelled). This illustrates the large gulf in scale between galaxies and the Solar system where high precision tests are possible. The need for dark matter only appears at very low accelerations.

The systematic increase in the amplitude of the mass discrepancy with decreasing acceleration and baryonic surface density has a remarkable implication. Even though the observed velocity is not correctly predicted by the observed baryons, it is predictable from them. Independent of any theory, we can simply fit a function D(GΣ) to describe the variation of the discrepancy \({(V/{V_b})^2}\) with baryonic surface density [270]. We can then apply it to any new system we encounter to predict \(V = {D^{1/2}}{V_b}\). In effect, D boosts the velocity already predicted by the observed baryons. While this is a purely empirical exercise with no underlying theory, it is quite remarkable that the distribution of dark matter required in a galaxy is entirely predictable from the distribution of its luminous mass (see also [167]). In the conventional picture, dark matter outweighs baryonic matter by a factor of five, and more in individual galaxies given the halo-by-halo missing baryon problem (Figure 2), but apparently the baryonic tail wags the dark matter dog. And it does so again through the acceleration scale a0. Indeed, at very low accelerations, the mass discrepancy is precisely defined by the inverse of the square-root of the gravitational acceleration generated by the baryons in units of a0. This actually asymptotically leads to the BTFR.
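The last two statements can be illustrated numerically. The sketch below takes the low-acceleration form of the discrepancy, D = (gN/a0)^(-1/2), applies it to the Newtonian velocity of a point mass, and checks that the boosted velocity no longer depends on radius and satisfies V⁴ = a0GM. The point-mass idealization, the chosen mass and radii, and the value of a0 are all assumptions made for this example.

```python
import math

G = 6.674e-11       # m^3 kg^-1 s^-2
A0 = 1.2e-10        # m s^-2 (assumed value of a0)
M_SUN = 1.989e30    # kg
KPC = 3.0857e19     # m


def low_accel_discrepancy(g_newton, a0=A0):
    """D = (g_N / a0)^(-1/2); only meaningful where g_N << a0."""
    return math.sqrt(a0 / g_newton)


def predicted_velocity(mass_kg, r_m):
    """Boost the Newtonian point-mass velocity: V = D^(1/2) * V_b."""
    g_newton = G * mass_kg / r_m**2
    v_bar = math.sqrt(G * mass_kg / r_m)
    return math.sqrt(low_accel_discrepancy(g_newton)) * v_bar


mass = 1e10 * M_SUN                         # hypothetical baryonic mass
radii = [50 * KPC, 100 * KPC, 200 * KPC]    # radii where g_N << a0 for this mass
velocities = [predicted_velocity(mass, r) for r in radii]
v_flat = (A0 * G * mass) ** 0.25            # BTFR form: V^4 = a0 * G * M

for r, v in zip(radii, velocities):
    print(f"r = {r / KPC:5.0f} kpc  V = {v / 1e3:.1f} km/s")
print(f"(a0 G M)^(1/4) = {v_flat / 1e3:.1f} km/s")
```

The radius cancels because D² scales as r² while Vb⁴ scales as r⁻², which is the sense in which the low-acceleration discrepancy asymptotically yields the BTFR.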

So, up to now, we have seen five roles of a0 in galaxy dynamics. (i) It defines the zero point of the Tully-Fisher relation, (ii) it appears as the characteristic acceleration at the effective radius of spheroidal systems, (iii) it defines the Freeman limit for the maximum surface density of pure disks, (iv) it appears as a transition-acceleration above which no dark matter is needed, and below which it appears, and (v) it defines the amplitude of the mass-discrepancy in the weak-field regime (this last point is not a fully independent role as it leads to the Tully-Fisher relation). Let us finally note that there is yet another role played by a0, which is that it defines the central surface density of all dark matter halos as being on the order of a0/(2πG) [129, 167, 313].

4.3.4 Renzo’s rule

The relation between dynamical and baryonic surface densities appears as a global scaling relation in disk galaxies (Figure 9) and as a local correspondence within each galaxy (Figure 10). When all galaxies are plotted together as in Figure 10, this connection appears as a single smooth function D(a). This does not suffice to illustrate that individual galaxies have features in their baryon distribution that are reflected in their dynamics. While the above correlations could be interpreted as a sort of repulsion between dark and baryonic matter, the following rather indicates closer-than-natural attraction.

Figure 12 shows the spiral galaxy NGC 6946. Two multi-color images of the stellar component are given. The optical bands provide a (nearly) true color picture of the galaxy, which is perceptibly redder near the center and becomes progressively more blue further out. This is typical of spiral galaxies and reflects real differences in stellar content: the stars towards the center tend to be older and more dominated by the light of red giants, while those further out are younger on average so the light has a greater fractional contribution from bright-but-short-lived main sequence stars. The near-infrared bands [209] give a more faithful map of stellar mass, and are less affected by dust obscuration. Radio synthesis imaging of the 21 cm emission from the hydrogen spin-flip transition maps the atomic gas in the interstellar medium, which typically extends to rather larger radii than the stars.

Figure 12

The spiral galaxy NGC 6946 as it appears in the optical (color composite from the BVR bands, left; image obtained by SSM with Rachel Kuzio de Naray using the Kitt Peak 2.1 m telescope), near-infrared (JHK bands, middle [209]), and in atomic gas (21 cm radiation, right [481]). The images are shown at the same physical scale, illustrating how the atomic gas typically extends to greater radii than the stars. Images like these are used to construct mass models representing the observed distribution of baryonic mass.

Surface density profiles of galaxies are constructed by fitting ellipses to images like those illustrated in Figure 12. The ellipses provide an axisymmetric representation of the variation of surface brightness with radius. This is shown in the top panels of Figure 13 for NGC 6946 (Figure 12) and the nearby, gas rich, LSB galaxy NGC 1560. The K-band light distribution is thought to give the most reliable mapping of observed light to stellar mass [42], and has been used to trace the run of stellar surface density in Figure 13. The sharp feature at the center is a small bulge component visible as the red central region in Figure 12. The bulge contains only 4% of the K-band light. The remainder is the stellar disk; a straight line fit to the data outside the central bulge region gives the parameters of the exponential disk approximation, Σ0 and Rd. Similarly, the surface density of atomic gas is traced by the 21 cm emission, with a correction for the cosmic abundance of helium — the detected hydrogen represents 75% of the gas mass believed to be present, with most of the rest being helium, in accordance with BBN.

Figure 13

Surface density profiles (top) and rotation curves (bottom) of two galaxies: the HSB spiral NGC 6946 (Figure 12, left) and the LSB galaxy NGC 1560 (right). The surface density of stars (blue circles) is estimated by azimuthal averaging in ellipses fit to the K-band (2.2µm) light distribution. Similarly, the gas surface density (green circles) is estimated by applying the same procedure to the 21 cm image. Note the different scale between LSB and HSB galaxies. Also note features like the central bulge of NGC 6946, which corresponds to a sharp increase in stellar surface density at small radius. In the lower panels, the observed rotation curves (data points) are shown together with the baryonic mass models (lines) constructed from the observed distribution of baryons. Velocity data for NGC 6946 include both HI data that define the outer, flat portion of the rotation curve [66] and Hα data from two independent observations [54, 114] that define the shape of the inner rotation curve. Velocity data for NGC 1560 come from two independent interferometric HI observations [28, 163]. Baryonic mass models are constructed from the surface density profiles by numerical solution of the Poisson equation using GIPSY [472]. The dashed blue line is the stellar disk, the red dot-dashed line is the central bulge, and the green dotted line is the gas. The solid black line is the sum of all baryonic components. This provides a decent match to the rotation curve at small radii in the HSB galaxy, but fails to explain the flat portion of the rotation curve at large radii. This discrepancy, and its systematic ubiquity in spiral galaxies, ranks as one of the primary motivations for dark matter. Note that the mass discrepancy is large at all radii in the LSB galaxy.

Mass models (bottom panels of Figure 13) are constructed from the surface density profiles by numerical solution of the Poisson equation [52, 472]. No approximations (like sphericity or an exponential disk) are made at this step. The disks are assumed to be thin, with radial scale length exceeding their vertical scale by 8:1, as is typical of edge-on disks [236]. Consequently, the computed rotation curves (various broken lines in Figure 13) are not smooth, but reflect the observed variations in the surface density profiles of the various components. The sum (in quadrature) leads to the total baryonic rotation curve Vb(r) (the solid lines in Figure 13): this is what would be observed if no dark matter were implicated. Instead, the observed rotation (data points in Figure 13) exceeds that predicted by Vb(r): this is the mass discrepancy.
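The quadrature sum described above is simple enough to write down explicitly. The following sketch combines the circular velocities of the stellar disk, bulge and gas at a single radius into the total baryonic velocity Vb. The function name and the sample velocities are hypothetical, and the component curves themselves are assumed to have been computed already from the Poisson equation.

```python
import math


def baryonic_velocity(v_disk, v_bulge, v_gas):
    """Total baryonic circular velocity: components add in quadrature.

    Each argument is the circular velocity (same units) that one component
    alone would produce at the radius of interest.
    """
    return math.sqrt(v_disk**2 + v_bulge**2 + v_gas**2)


# Hypothetical component velocities at one radius, in km/s.
v_b = baryonic_velocity(v_disk=80.0, v_bulge=30.0, v_gas=40.0)
v_observed = 120.0   # hypothetical observed rotation velocity, km/s

print(f"V_b = {v_b:.1f} km/s, mass discrepancy (V/V_b)^2 = {(v_observed / v_b) ** 2:.2f}")
```

Velocities add in quadrature because each V² is proportional to the radial force contributed by that component, and the forces add linearly.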

It is often merely stated that flat rotation curves require dark matter. But there is considerably more information in rotation curve data than asymptotic flatness. For example, it is common that the rotation curve in the inner parts of HSB galaxies like NGC 6946 is well described by the baryons alone. The data are often consistent with a very low density of dark matter at small radii with baryons providing the bulk of the gravitating mass. This condition is referred to as maximum disk [471], and also runs contrary to our inferences of dark matter dominance from Figure 5 [414]. More generally, features in the baryonic rotation curve Vb(r) often correspond to features in the total rotation Vc(r).

Perhaps the most succinct empirical statement of the detailed connection between baryons and dynamics has been given by Renzo Sancisi, and is known as Renzo’s rule [379]: “For any feature in the luminosity profile there is a corresponding feature in the rotation curve.” Both galaxies shown in Figure 13 illustrate this statement. In the inner region of NGC 6946, the small but compact bulge component causes a sharp feature in Vb(r) that declines rapidly before the rotation curve rises again, as mass from the disk begins to contribute. The up-down-up morphology predicted by the observed distribution of the baryons is observed in high resolution observations [54, 114]. A dark matter halo with a monotonically-varying density profile cannot produce such a morphology; the stellar bulge must be the dominant mass component at small radii in this galaxy.

A surprising aspect of Renzo’s rule is that it applies to LSB galaxies as well as those of high surface brightness. That the baryons should have some dynamical impact where their surface density is highest is natural, though there is no reason to demand that they become competitive with dark matter. What is distinctly unnatural is for the baryons to have a perceptible impact where dark matter must clearly dominate. NGC 1560 provides an example where they appear to do just that. The gas distribution in this galaxy shows a substantial kink in its surface density profile [28] (recently confirmed by [163]) that has a distinct impact on Vb(r). This occurs at a radius where V ≫ Vb, so dark matter should be dominant. A spherical dark-matter halo with particles on randomly oriented, highly radial orbits cannot support the same sort of structure as seen in the gas disk, and the spherical geometry, unlike a disk geometry, would smear the effect on the local acceleration. And yet the wiggle in the baryonic rotation curve is reflected in the total, as per Renzo’s rule.Footnote 16

One inference that might be made from these observations is that the dark matter is baryonic. This is unacceptable from a cosmological perspective, but it is possible to have a multiplicity of dark matter components. That is, we could have baryonic dark matter in the disks of galaxies in addition to a halo of non-baryonic cold dark matter. It is often possible to scale up the atomic gas component to fit the total rotation [193]. That implies a component of mass that is traced by the atomic gas (presumably some other dynamically cold gas component) and that outweighs the observed hydrogen by a factor of six to ten [193]. One hypothesis for such a component is very cold molecular gas [352]. It is difficult to exclude such a possibility, though it also appears to be hard to sustain in LSB galaxies [292]. Dynamically, one might expect the extra mass to destabilize the LSB disk. One also returns to a fine-tuning between baryonic surface density and mass-to-light ratio. In order to maintain the balance observed in Figure 5, relatively more dark molecular gas will be required in LSB galaxies so as to maintain a constant surface density of gravitating mass, but given the interactions at hand, this might be at least a bit more promising than explaining it with CDM halos.

As a matter of fact, LSB galaxies play a critical role in testing many of the existing models for dark matter. This happens in part because they were appreciated as an important population of galaxies only after many relevant hypotheses were established, and thus provide good tests of their a priori expectations. Observationally, we infer that LSB disks exhibit large mass discrepancies down to small radii [119]. Conventionally, this means that dark matter completely dominates their dynamics: the surface density of baryons in these systems is never high enough to be relevant. Nevertheless, the observed distribution of baryons suffices to predict the total rotation [279, 120]. Once again, the baryonic tail wags the dark matter dog, with the observations of the minority baryonic component sufficing to predict the distribution of the dominant dark matter. Note that, conversely, nothing is “observable” about the dark matter, in present-day simulations, that predicts the distribution of baryons.

Thus, we see that there are many observations, mostly on galaxy scales, that are unpredicted, and perhaps unpredictable, in the standard dark matter context. They mostly involve a unique relationship between the distribution of baryons and the gravitational field, as well as an acceleration constant a0 on the order of the square-root of the cosmological constant, and they represent the most significant challenges to the current ΛCDM model.

5 Milgrom’s Empirical Law and “Kepler Laws” of Galactic Dynamics

Up to this point in this review, the challenges that we have presented have been purely based on observations, and fully independent of any alternative theoretical framework. However, at this point, it would obviously be a step forward if at least some of these puzzling observations could be summarized and empirically unified in some way, as such a unifying process is largely what physics is concerned with, rather than simply exposing a jigsaw of apparently unrelated empirical observations. And such an empirical unification is actually feasible for many of the unpredicted observations presented in the previous Section 4.3, and goes back to a rather old idea of the Israeli physicist Mordehai Milgrom.

Almost 30 years ago, back in 1983 (and thus before most of the aforementioned observations had been carried out), simply prompted by the question of whether the missing mass problem could perhaps reflect a breakdown of Newtonian dynamics in galaxies, Milgrom [293] devised a formula linking the Newtonian gravitational acceleration gN to the true gravitational acceleration g in galaxies. Such attempts to rectify the mass discrepancy by gravitational means often begin by noting that galaxies are much larger than the solar system. It is easy to imagine that at some suitably large scale, let’s say on the order of 1 kpc, there is a transition from the usual dynamics applicable in the comparatively-tiny solar system to some more general theory that applies on the scale of galaxies in order to explain the mass discrepancy problem. If so, we would expect the mass discrepancy to manifest itself at a particular length scale in all systems. However, as already noted, there is no universal length scale apparent in the data (Figure 10) [382, 266, 406, 279, 270]. The mass discrepancy appears already at small radii in some galaxies; in others there is no apparent need for dark matter until very large radii. This now observationally excludes all hypotheses that simply alter the force law at a linear length-scale.

5.1 Milgrom’s law and the dielectric analogy

Before such precise data were available, Milgrom [293] already noted that other scales were also possible, and that one that is as unique to galaxies as size is acceleration. The typical centripetal acceleration of a star in a galaxy is of order \({10^{- 10}}\;{\rm{m}}\;{{\rm{s}}^{- 2}}\). This is eleven orders of magnitude less than the surface gravity of the Earth. As we have seen in Section 4, this acceleration constant appears “miraculously” in very different scaling relations that should not, in principle, be related to each otherFootnote 17. The universal appearance of \({a_0} \simeq {10^{- 10}}\;{\rm{m}}\;{{\rm{s}}^{- 2}}\) in galactic scaling relations was not at all observationally evident back in 1983. What Milgrom [293] then hypothesized was a modification of Newtonian dynamics below this acceleration constant a0, appropriate to the tiny accelerations encountered in galaxiesFootnote 18. This new constant a0 would then play a similar role as the Planck constant h in quantum physics or the speed of light c in special relativity. For large acceleration (or force per unit mass), F/m = g ≫ a0, everything would be normal and Newtonian, i.e., g = gN. Or, put differently, formally taking a0 → 0 should make the theory tend to standard physics, just like recovering classical mechanics for h → 0. On the other hand, formally taking a0 → ∞ (and G → 0), or equivalently, in the limit of small accelerations g ≪ a0, the modification would apply in the form:

$$g = \sqrt {{g_N}{a_0}} ,$$

where g = |g| is the true gravitational acceleration, and gN = |gN| the Newtonian one as calculated from the observed distribution of visible matter. Note that this limit follows naturally from the scale-invariance symmetry of the equations of motion under transformations (t, r) → (λt, λr) [315]. This particular modification was only suggested in 1983 by the asymptotic flatness of rotation curves and the slope of the Tully-Fisher relation. It is indeed trivial to see that the desired behavior follows from equation (4). For a test particle in circular motion around a point mass M, equilibrium between the radial component of the force and the centripetal acceleration yields \(V_c^2/r = {g_N} = GM/{r^2}\). In the weak-acceleration limit this becomes

$${{V_c^2} \over r} = \sqrt {{{GM{a_0}} \over {{r^2}}}} .$$

The terms involving the radius r cancel, simplifying to

$${V_c}^4(r) = V_f^4 = {a_0}GM.$$

The circular velocity no longer depends on radius, asymptoting to a constant Vf that depends only on the mass of the central object and fundamental constants. The equation above is the equivalent of the observed baryonic Tully-Fisher relation. It is often wrongly stated that Milgrom’s formula was constructed in an ad hoc way in order to reproduce galaxy rotation curves, while this statement is only true of these two observations: (i) the asymptotic flatness of the rotation curves, and (ii) the slope of the baryonic Tully-Fisher relation (but note that, at the time, it was not clear at all that this slope would hold, nor that the Tully-Fisher relation would correlate with baryonic mass rather than luminosity, and even less clear that it would hold over orders of magnitude in mass). All the other successes of Milgrom’s formula related to the phenomenology of galaxy rotation curves were pure predictions of the formula made before the observational evidence. The predictions that are encapsulated in this simple formula can be thought of as sort of “Kepler-like laws” of galactic dynamics. These various laws only make sense once they are unified within their parent formula, exactly as Kepler’s laws only make sense once they are unified under Newton’s law.
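As a numerical illustration of the flat-velocity relation above, the following minimal sketch evaluates \(V_f = (a_0GM)^{1/4}\) for a point mass and checks that the deep-regime circular velocity is independent of radius. The value a0 = 1.2 × 10⁻¹⁰ m s⁻² and the function names are assumptions made for this example only; the text itself only quotes the order of magnitude of a0.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
A0 = 1.2e-10       # ASSUMED illustrative value of a0, m s^-2


def flat_velocity(mass_kg, a0=A0):
    """Asymptotic circular velocity V_f = (a0 * G * M)^(1/4), in m/s."""
    return (a0 * G * mass_kg) ** 0.25


def circular_velocity_deep(mass_kg, r_m, a0=A0):
    """Circular velocity at radius r in the weak-acceleration limit,
    using g = sqrt(g_N * a0) and V_c^2 / r = g."""
    g_newton = G * mass_kg / r_m**2
    g = math.sqrt(g_newton * a0)
    return math.sqrt(g * r_m)


if __name__ == "__main__":
    mass = 1e11 * M_SUN
    print("V_f [km/s]:", flat_velocity(mass) / 1e3)
```

For a baryonic mass of 10¹¹ solar masses this gives a plateau velocity close to 200 km/s, and `circular_velocity_deep` returns the same value at any radius, since the radius cancels exactly as in the derivation.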

In order to ensure a smooth transition between the two regimes \(g \gg a_0\) and \(g \ll a_0\), Milgrom’s law is written in the following way:

$$\mu \left({{g \over {{a_0}}}} \right){\bf{g}} = {{\bf{g}}_{\bf{N}}},$$

where the interpolating function μ satisfies

$$\mu (x) \rightarrow 1\;{\rm{for}}\;x \gg 1\;{\rm{and}}\;\mu (x) \rightarrow x\;{\rm{for}}\;x \ll 1.$$

Written like this, the analogy between Milgrom’s law and Coulomb’s law in a dielectric medium is clear, as noted in [56]. Indeed, inside a dielectric medium, the amplitude of the electric field E generated by an external point charge Q located at a distance r obeys the following equation:

$$\mu (E)E = {Q \over {4\pi {\epsilon _0}{r^2}}},$$

where μ is the relative permittivity of the medium, and can depend on E. In the case of a gravitational field generated by a point mass M, it is then clear that Milgrom’s interpolating function plays the role of “gravitational permittivity”. Since it is smaller than 1, it makes the gravitational field stronger than Newtonian (rather than weaker, as for the electric field in a dielectric medium, where μ > 1). In other words, the gravitational susceptibility coefficient χ (such that μ = 1 + χ) is negative, which is correct for a force law where like masses attract rather than repel [56]. This dielectric analogy has been explicitly used in devising a theory [60] where Milgrom’s law arises from the existence of a “gravitationally polarizable” medium (see Section 7).

Of course, inverting the above relation, Milgrom’s law can also be written as

$${\bf{g}} = \nu \left({{{{g_N}} \over {{a_0}}}} \right){{\bf{g}}_{\bf{N}}},$$

where

$$\nu (y) \rightarrow 1\;{\rm{for}}\;y \gg 1\;{\rm{and}}\;\nu (y) \rightarrow {y^{- 1/2}}\;{\rm{for}}\;y \ll 1.$$
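To make the relation between the two forms of Milgrom’s law concrete, here is a minimal sketch using one particular interpolating function. The choice μ(x) = x/(1 + x) is an assumption made for illustration (the passage above only fixes the asymptotic limits); solving μ(x)x = y for x then gives ν(y) = 1/2 + (1/4 + 1/y)^{1/2}.

```python
import math


def mu_simple(x):
    """ASSUMED example interpolating function: mu -> 1 for x >> 1, mu -> x for x << 1."""
    return x / (1.0 + x)


def nu_simple(y):
    """Inverse-form function matching mu_simple: nu -> 1 for y >> 1, nu -> y^(-1/2) for y << 1."""
    return 0.5 + math.sqrt(0.25 + 1.0 / y)


def g_from_gn(g_newton, a0):
    """True acceleration from the Newtonian one, g = nu(g_N / a0) * g_N."""
    return nu_simple(g_newton / a0) * g_newton
```

With this pair, feeding g = ν(gN/a0) gN back into μ(g/a0) g returns gN, so the two expressions describe the same algebraic relation.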

However, as we shall see in Section 6, in order for g to remain a conservative force field, these expressions (Eqs. 7 and 10) cannot be rigorous outside of highly symmetrical situations. Nevertheless, they allow one to make numerous very general predictions for galactic systems, or, in other words, to derive “Kepler-like laws” of galactic dynamics, unified under the banner of Milgrom’s law. As we shall see, many of the observations unpredicted by ΛCDM on galaxy scales naturally ensue from this very simple law. However, even though Milgrom originally devised this as a modification of dynamics, this law is a priori nothing more than an algorithm, which allows one to calculate the distribution of force in an astronomical object from the observed distribution of baryonic matter. Its success would simply mean that the observed gravitational field in galaxies is mimicking a universal force law generated by the baryons alone, meaning that either (i) the force law itself is modified, or (ii) there exists an intimate connection between the distribution of baryons and dark matter in galaxies.

It was suggested, for instance, [218] that such a relation might arise naturally in the CDM context, if halos possess a one-parameter density profile that leads to a characteristic acceleration profile that is only weakly dependent upon the mass of the halo. Then, with a fixed collapse factor for the baryonic material, the transition from dominance of dark over baryonic occurs at a universal acceleration, which, by numerical coincidence, is on the order of cH0 and thus of a0 (see also [411]). While, still today, it remains to be seen whether this scenario would quantitatively hold in numerical simulations, it was noted by Milgrom [306] that this scenario only explained the role of a0 as a transition radius between baryon and dark matter dominance in HSB galaxies, precluding altogether the existence of LSB galaxies where dark matter dominates everywhere. The real challenge for ΛCDM is rather to explain all the different roles played by a0 in galaxy dynamics, different roles that can all be summarized within the single law proposed by Milgrom, just like Kepler’s laws are unified under Newton’s law. We list these Kepler-like laws of galactic dynamics hereafter, and relate each of them with the unpredicted observations of Section 4, keeping in mind that these were mostly a priori predictions of Milgrom’s law, made before the data were as good as today, not “postdictions” like we are used to in modern cosmology.

5.2 Galactic Kepler-like laws of motion

  1.

    Asymptotic flatness of rotation curves. The rotation curves of galaxies are asymptotically flat, even though this flatness is not always attained at the last observed point (see point hereafter about the shapes of rotation curves as a function of baryonic surface density). What is more, Milgrom’s law can be thought of as including the total acceleration with respect to a preferred frame, which can lead to the prediction of asymptotically-falling rotation curves for a galaxy embedded in a large external gravitational field (see Section 6.3).

  2.

    Ga0 defining the zero-point of the baryonic Tully-Fisher relation. The plateau of a rotation curve is \(V_f = (GMa_0)^{1/4}\). The true Tully-Fisher relation is predicted to be a relation between this asymptotic velocity and baryonic mass, not luminosity. Milgrom’s law yields immediately the slope (precisely 4) and zero-point of this baryonic Tully-Fisher law. The observational baryonic Tully-Fisher relation should thus be consistent with zero scatter around this prediction of Milgrom’s law (the dotted line of Figure 3). And indeed it is. All rotationally-supported systems in the weak acceleration limit should fall on this relation, irrespective of their formation mechanism and history, meaning that completely isolated galaxies or tidal dwarf galaxies formed in interaction events all behave as every other galaxy in this respect.

  3.

    Ga0 defining the zero-point of the Faber-Jackson relation. For quasi-isothermal systems [296], such as elliptical galaxies, the bulk velocity dispersion depends only on the total baryonic mass via \(\sigma^4 \sim GMa_0\). Indeed, since the equation of hydrostatic equilibrium for an isotropic isothermal system in the weak field regime reads \(d(\sigma^2\rho)/dr = -\rho(GMa_0)^{1/2}/r\), one has \(\sigma^4 = \alpha^{-2} \times GMa_0\) where α = d ln ρ/d ln r. This underlies the Faber-Jackson relation for elliptical galaxies (Figure 7), which is, however, not predicted by Milgrom’s law to be as tight and precise (because it relies, e.g., on isothermality and on the slope of the density distribution) as the BTFR.

  4.

    Mass discrepancy defined by the inverse of the acceleration in units of a0. Or alternatively, defined by the inverse of the square-root of the gravitational acceleration generated by the baryons in units of a0. The mass discrepancy is precisely equal to this in the very-low-acceleration regime, and leads to the baryonic Tully-Fisher relation. In the low-acceleration limit, gN/g = g/a0, so in the CDM language, inside the virial radius of any system whose virial radius is in the weak acceleration regime (well below a0), the baryon fraction is given by the acceleration in units of a0. If we adopt a rough relation \({M_{500}} \simeq 1.5 \times {10^5}{M_ \odot} \times V_c^3{({\rm{km/s)}}^{- 3}}\), we get that the acceleration at R500, and thus the system baryon fraction predicted by Milgrom’s formula, is \(M_b/M_{500} = a_{500}/a_0 \simeq 4 \times 10^{-4} \times V_c\,({\rm km/s})^{-1}\). Divided by the cosmological baryon fraction, this explains the trend for fd = Mb/(0.17 M500) with potential \((\Phi = V_c^2)\) in Figure 2, thereby naturally explaining the halo-by-halo missing baryon challenge in galaxies. No baryons are actually missing; rather, we infer their existence because the natural scaling between mass and circular velocity \({M_{500}} \propto V_c^3\) in ΛCDM differs by a factor of Vc from the observed scaling \({M_b} \propto V_c^4\).

  5.

    a0 as the characteristic acceleration at the effective radius of isothermal spheres. As a corollary to the Faber-Jackson relation for isothermal spheres, let us note that the baryonic isothermal sphere would not require any dark matter up to the point where the internal gravity falls below a0, and would thus resemble a purely baryonic Newtonian isothermal sphere up to that point. But at larger distances, in the presence of the added force due to Milgrom’s law, the density of the baryonic isothermal sphere would fall [296] as \(r^{-4}\), making the radius at which the gravitational acceleration is a0 the effective baryonic radius of the system. This explains why, at this radius R in quasi-isothermal systems, the typical acceleration \(\sigma^2/R\) is almost always observed to be on the order of a0. Of course, this is valid for systems where such a transition radius does exist, but going to very-LSB systems, if the internal gravity is everywhere below a0, one can then have typical accelerations as low as one wishes.

  6.

    a0/G as a critical mean surface density for stability. Disks with mean surface density \(\langle\Sigma\rangle \leq \Sigma_\dagger = a_0/G\) have added stability. Most of the disk is then in the weak-acceleration regime, where accelerations scale as \(a \propto \sqrt M\), instead of \(a \propto M\). Thus, δa/a = (1/2)δM/M instead of δa/a = δM/M, leading to a weaker response to small mass perturbations [299]. This explains the Freeman limit (Figure 8).

  7.

    a0 as a transition acceleration. The mass discrepancy in galaxies always appears (transition from baryon dominance to dark matter dominance) when \(V_c^2/R \sim {a_0}\), yielding a clear mass-discrepancy acceleration relation (Figure 10). This, again, is the case for every single rotationally-supported system irrespective of its formation mechanism and history. For HSB galaxies, where there exist two distinct regions where \(V_c^2/R > {a_0}\) in the inner parts and \(V_c^2/R < {a_0}\) in the outer parts, locally measured mass-to-light ratios should show no indication of hidden mass in the inner parts, but rise beyond the radius where \(V_c^2/R \approx {a_0}\) (Figure 14). Note that this is the only role of a0 that the scenario of [218] was poorly trying to address (forgetting, e.g., about the existence of LSB galaxies).

  8.

    a0/G as a transition central surface density. The acceleration a0 defines the transition from HSB galaxies to LSB galaxies: baryons dominate in the inner parts of galaxies whose central surface density is higher than some critical value on the order of \(\Sigma_\dagger = a_0/G\), while in galaxies whose central surface density is much smaller (LSB galaxies), DM dominates everywhere, and the magnitude of the mass discrepancy is given by the inverse of the acceleration in units of a0; see (5). Thus, the mass discrepancy appears at smaller radii and is more severe in galaxies of lower baryonic surface densities (Figure 14). The shapes of rotation curves are predicted to depend on surface density: HSB galaxies are predicted to have rotation curves that rise steeply, then become flat, or even fall somewhat to the not-yet-reached asymptotic flat velocity, while LSB galaxies are predicted to have rotation curves that rise slowly to the asymptotic flat velocity. This is precisely what is observed (Figure 15), and is in accordance [162] with the more complex empirical parametrization of observed rotation curves that has been proposed in [376]. Finally, the total (baryons+DM) acceleration is predicted to decline with the mean baryonic surface density of galaxies, exactly as observed (Figure 16), in the form \(a \propto \Sigma _b^{1/2}\) (see also Figure 9).

  9.

    a0/G as the central surface density of dark halos. Provided they are mostly in the Newtonian regime, galaxies are predicted to be embedded in dark halos (whether real or virtual, i.e., “phantom” dark matter) with a central surface density on the order of a0/(2πG), as observed (Footnote 19). LSBs should have a halo surface density scaling as the square-root of the baryonic surface density, in a much more compressed range than for the HSB ones, explaining the consistency of observed data with a constant central surface density of dark matter [167, 313].

  10.

    Features in the baryonic distribution imply features in the rotation curve. Because a small variation in gN will be directly translated into a similar one in g, Renzo’s rule (Section 4.3.4) is explained naturally.
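Two of the laws listed above can be checked with a few lines of arithmetic. The sketch below works purely in the weak-acceleration limit g = (gN a0)^{1/2}; the function names are hypothetical and introduced only for this illustration. It evaluates the mass discrepancy g/gN (law 4) and the fractional response of the acceleration to a small mass perturbation (law 6).

```python
import math


def deep_mond_g(g_newton, a0):
    """Weak-acceleration limit of Milgrom's law: g = sqrt(g_N * a0)."""
    return math.sqrt(g_newton * a0)


def mass_discrepancy(g_newton, a0):
    """Ratio g / g_N, i.e. the apparent total-to-baryonic mass ratio."""
    return deep_mond_g(g_newton, a0) / g_newton


def fractional_response(g_newton, a0, dm_over_m=1e-3):
    """(delta g / g) / (delta M / M) for a small perturbation, using g_N proportional to M."""
    g_before = deep_mond_g(g_newton, a0)
    g_after = deep_mond_g(g_newton * (1.0 + dm_over_m), a0)
    return (g_after - g_before) / g_before / dm_over_m
```

In this limit the mass discrepancy equals a0/g, and the response coefficient comes out close to 1/2 rather than the Newtonian value of 1.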

Figure 14

The mass discrepancy (as in Figure 10) as a function of radius in observed spiral galaxies. The curves for individual galaxies (lines) are color-coded by their characteristic baryonic surface density (as in Figure 5). In order to be completely empirical and fully independent of any assumption such as maximum disk, stellar masses have been estimated with population synthesis models [42]. The amplitude of the mass discrepancy is initially small in high-surface-density galaxies, and grows only slowly at large radii. As the baryonic surface densities of galaxies decline, the mass discrepancy becomes more severe and appears at smaller radii. This trend confirms one of the predictions of Milgrom’s law [294].

Figure 15

The shapes of observed rotation curves depend on baryonic surface density (color coding as per Figure 14). High-surface-density galaxies have rotation curves that rise steeply then become flat, or even fall somewhat to the asymptotic flat velocity. Low-surface-density galaxies have rotation curves that rise slowly to the asymptotic flat velocity. This trend confirms one of the predictions of Milgrom’s law [294].

Figure 16

Centripetal acceleration as a function of radius and surface density (color coding as per Figure 14). The critical acceleration a0 is denoted by the dotted line. Milgrom’s formula predicts that acceleration should decline with baryonic surface density, as observed. Moreover, high-surface-density galaxies transition from the Newtonian regime at small radii to the weak-field regime at large radii, whereas low-surface-density galaxies fall entirely in the regime of low acceleration a < a0, as anticipated by Milgrom [294].

As a conclusion, all the apparently independent roles that the characteristic acceleration a0 plays in the unpredicted observations of Section 4.3 (see end of Section 4.3.3 for a summary), as well as Renzo’s rule (Section 4.3.4), have been elegantly unified by the single law proposed by Milgrom [293] in 1983 as a unique scaling relation between the gravitational field generated by observed baryons and the total observed gravitational force in galaxies.

6 Milgrom’s Law as a Modification of Classical Dynamics: MOND

Thus, it appears that many puzzling observations that are difficult to understand in the ΛCDM context (and/or require an extreme fine-tuning of the DM distribution) are well summarized by a single heuristic law. Therefore, it would appear natural that this law derives from a universal force law, and reflects a modification of dynamics rather than the addition of massive particles interacting (almost) only gravitationally with baryonic matter (Footnote 20). However, blindly applying Eq. 7 to a set of massive bodies directly leads to serious problems [150, 293] such as the non-conservation of momentum. In a two-body configuration, as the implied force is not symmetric in the two masses, Newton’s third law (action and reaction principle) does not hold, so the momentum is not conserved. Consider a translationally invariant isolated system of two such masses m1 and m2, small enough to be in the very weak acceleration limit, and placed at rest on the x-axis. The amplitude of the Newtonian force is then \(F_N = Gm_1m_2/(x_2 - x_1)^2\), and blindly applying Eq. 7 would lead to individual accelerations \(\vert {\bf a}_i \vert = \sqrt{F_N a_0/m_i}\). This then immediately leads to

$$\dot p = \sqrt {{a_0}{F_N}} (\sqrt {{m_1}} - \sqrt {{m_2}}) \neq 0\;{\rm{if}}\;{m_1} \neq {m_2},$$

meaning that for different masses, the momentum of this isolated system is not conserved. This means that Eq. 7 cannot truly represent a universal force law. If Eq. 7 is to be more than just a heuristic law summarizing how dark matter is arranged in galaxies with respect to baryonic matter, it must then be an approximation (valid only in highly symmetric configurations) of a more general force law deriving from an action and a variational principle. Such theories at the classical level can be classified under the acronym MOND, for Modified Newtonian Dynamics (Footnote 21). In this section, we sketch how to devise such theories at the classical level, and list detailed tests of these theories at all astrophysical scales.
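The momentum argument above is easy to reproduce numerically. The following minimal sketch applies the weak-acceleration form of Milgrom’s law separately to each of the two bodies and returns the resulting rate of change of total momentum. The function name and the use of arbitrary units are choices made for this example.

```python
import math


def naive_momentum_rate(m1, m2, separation, G, a0):
    """Rate of change of total momentum when g = sqrt(g_N * a0) is
    applied blindly to each body of an isolated two-body system.

    The bodies attract each other along the x-axis, so their
    accelerations have opposite signs."""
    f_newton = G * m1 * m2 / separation**2
    a1 = math.sqrt(f_newton * a0 / m1)   # body 1, directed toward body 2
    a2 = math.sqrt(f_newton * a0 / m2)   # body 2, directed toward body 1
    return m1 * a1 - m2 * a2
```

For equal masses the two contributions cancel, but for m1 = 4, m2 = 1 (with G = a0 = separation = 1) the function returns 2, in agreement with \(\sqrt{a_0F_N}(\sqrt{m_1} - \sqrt{m_2})\).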

6.1 Modified inertia or modified gravity: Non-relativistic actions

If one wants to modify dynamics in order to reproduce Milgrom’s heuristic law while still benefiting from usual conservation laws such as the conservation of momentum, one can start from the action at the classical level. Clearly such theories are only toy-models until they become the weak-field limit of a relativistic theory (see Section 7), but they are useful both as targets for such relativistic theories, and as internally consistent models allowing one to make predictions at the classical level (i.e., in neither the relativistic nor the quantum regime).

A set of particles of mass \(m_i\) moving in a gravitational field generated by the matter density distribution \(\rho = \sum_i m_i\,\delta({\bf x} - {\bf x}_i)\) and described by the Newtonian potential ΦN has the following action (Footnote 22):

$${S_N} = {S_{{\rm{kin}}}} + {S_{{\rm{in}}}} + {S_{{\rm{grav}}}} = \int {{{\rho {{\rm{v}}^2}} \over 2}{d^3}x\,dt -} \int {\rho {\Phi _N}{d^3}x\,dt} - \int {{{|\nabla {\Phi _N}{|^2}} \over {8\pi G}}} {d^3}x\,dt.$$

Varying this action with respect to configuration space coordinates yields the equations of motion \(d^2{\bf x}/dt^2 = -\nabla\Phi_N\), while varying it with respect to the potential leads to the Poisson equation \(\nabla^2\Phi_N = 4\pi G\rho\). Modifying the first (kinetic) term is generally referred to as “modified inertia” and modifying the last term as “modified gravity” (Footnote 23).
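For reference, the variation with respect to the potential can be spelled out in one line. This is the standard textbook step, written here under the assumption that boundary terms vanish at infinity:

$$\delta {S_N} = \int {\left[ {- \rho \,\delta {\Phi _N} - {{\nabla {\Phi _N}.\nabla \delta {\Phi _N}} \over {4\pi G}}} \right]{d^3}x\,dt} = \int {\left[ {- \rho + {{{\nabla ^2}{\Phi _N}} \over {4\pi G}}} \right]\delta {\Phi _N}\,{d^3}x\,dt} ,$$

where the second equality follows from an integration by parts. Requiring δSN = 0 for arbitrary δΦN gives the Poisson equation with source 4πGρ.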

6.1.1 Modified inertia

The first possibility, modified inertia, has been investigated by Milgrom [300, 321], who constructed modified kinetic actions (Footnote 24), i.e., modifications of the first term Skin in Eq. 13, that are functionals depending on the trajectory of the particle as well as on the acceleration constant a0. By construction, the gravitational potential is then still determined from the Newtonian Poisson equation, but the particle equation of motion becomes, instead of Newton’s second law,

$${\bf{A}}[\{{\bf{x}}(t)\} ,{a_0}] = - \nabla {\Phi _N},$$

where A is a functional of the whole trajectory {x(t)}, with the dimensions of acceleration. The Newtonian and MOND limits correspond to \([{a_0} \rightarrow 0, {\bf{A}} \rightarrow d^2{\bf{x}}/dt^2]\) and \([{a_0} \rightarrow \infty, {\bf{A}}[\{{\bf{x}}(t)\}, {a_0}] \rightarrow a_0^{- 1}{\bf{Q}}(\{{\bf{x}}(t)\})]\), where Q has dimensions of acceleration squared.

Milgrom [300] investigated theories of this kind and rigorously showed that they always have to be time-nonlocal (see also Section 7.10) to be Galilean invariant (Footnote 25). Interestingly, he also showed that quantities such as energy and momentum have to be redefined but then obey conservation laws: this even leads to a generalized virial relation for bound trajectories, and in turn to an important and robust prediction for circular orbits in an axisymmetric potential, shared by all such theories. Eq. 14 becomes for such trajectories:

$$\mu \left({{{V_c^2} \over {R{a_0}}}} \right){{V_c^2} \over R} = - {{\partial {\Phi _N}} \over {\partial R}},$$

where Vc and R are the orbital speed and radius, and μ(x) is universal for each theory, and is derived from the expression of the action specialized to circular trajectories. Thus, for circular trajectories, these theories recover exactly the heuristic Milgrom’s law. Interestingly, it is this law that is used to fit galaxy rotation curves, while in the modified gravity framework of MOND (see hereafter), one should actually calculate the exact predictions of the modified Poisson formulations, which can differ slightly from Milgrom’s law. However, for orbits other than circular, it becomes very difficult to make predictions in modified inertia, as the time non-locality can make the anomalous acceleration at any location depend on properties of the whole orbit. For instance, if the accelerations are small on some segments of a trajectory, MOND effects can also be felt on segments where the accelerations are high, and conversely [321]. This can give rise to different effects on bound and unbound orbits, as well as on circular and highly elliptic orbits, meaning that “predictions” of modified inertia in pressure-supported systems could differ significantly from those derived from Milgrom’s law per se. Let us finally note that testing modified inertia on Earth would require one to properly define an inertial reference frame, contrary to what has been done in [5, 179] where the laboratory itself was not an inertial frame. Proper set-ups for testing modified inertia on Earth have been described, e.g., in [201, 202]: under the circumstances described in these papers, modified inertia would inevitably predict a departure from Newtonian dynamics, even if the exact departure cannot be predicted at present, except for circular motion.

6.1.2 Bekenstein-Milgrom MOND

The idea of modified gravity is, on the one hand, to preserve the particle equation of motion by preserving the kinetic action, but, on the other hand, to change the gravitational action, and thus modify the Poisson equation. In that case, all the usual conservation laws will be preserved by construction.

A very general way to do so is to write [38]:

$${S_{{\rm{grav}}\;{\rm{BM}}}} \equiv - \int {{{a_{0}^2 F(\vert \nabla \Phi \vert ^{2}/a_0^2)} \over {8\pi G}}} {d^3}x\;dt,$$

where F can be any dimensionless function. The Lagrangian being non-quadratic in |∇Φ|, this has been dubbed by Bekenstein & Milgrom [38] Aquadratic Lagrangian theory (AQUAL). Varying the action with respect to Φ then leads to a non-linear generalization of the Newtonian Poisson equation (Footnote 26):

$$\nabla .\left[ {\mu \left({{{\vert \nabla \Phi \vert} \over {{a_0}}}} \right)\nabla \Phi} \right] = 4\pi G\rho$$

where μ(x) = F′(z) and \(z = x^2\). In order to recover the μ-function behavior of Milgrom’s law (Eq. 7), i.e., μ(x) → 1 for x ≫ 1 and μ(x) → x for x ≪ 1, one needs to choose:

$$F(z) \rightarrow z\;{\rm{for}}\;z \gg 1\;{\rm{and}}\;F(z) \rightarrow {2 \over 3}{z^{3/2}}\;{\rm{for}}\;z \ll 1.$$
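As a quick consistency check of these limits, using only the definitions μ(x) = F′(z) and z = x² given above, one finds:

$$F(z) = z \Rightarrow \mu = F\prime (z) = 1,\quad \quad F(z) = {2 \over 3}{z^{3/2}} \Rightarrow \mu = F\prime (z) = {z^{1/2}} = x,$$

so the two asymptotic forms of F reproduce the Newtonian and weak-acceleration limits of the interpolating function, respectively.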

The general solution of the boundary value problem for Eq. 17 leads to the following relation between the acceleration g = −∇Φ and the Newtonian one, \({\bf g}_N = -\nabla\Phi_N\):

$$\mu \left({{g \over {{a_0}}}} \right){\bf{g}} = {{\bf{g}}_N} + {\bf{S}},$$

where g = |g|, and S is a solenoidal vector field with no net flow across any closed surface (i.e., a curl field S = ∇ × A such that ∇.S = 0). Thus, it is equivalent to Milgrom’s law (Eq. 7) up to a curl field correction, and is precisely equal to Milgrom’s law in highly symmetric one-dimensional systems, such as spherically-symmetric systems or flattened systems for which the isopotentials are locally spherically symmetric. For instance, the Kuzmin disk [52] is an example of a flattened axisymmetric configuration for which Milgrom’s law is precisely valid, as its Newtonian potential \({\Phi _N} = - GM/\sqrt {{R^2} + {{(b + \vert z\vert)}^2}}\) is equivalent on both sides of the disk to that of a point mass above or below the disk respectively.

In vacuum and at very large distances from a body of mass M, the isopotentials always tend to become spherical and the curl field tends to zero, while the gravitational acceleration falls well below a0 (a regime known as the “deep-MOND” regime), so that:

$$\Phi (r) \sim \sqrt {GM{a_0}} \ln (r).$$

An important point, demonstrated by Bekenstein & Milgrom [38], is that a system with a low center-of-mass acceleration, with respect to a larger (more massive) system, sees the motion of its constituents combine to give a MOND motion for the center-of-mass even if it is made up of constituents whose internal accelerations are above a0 (for instance a compact globular cluster moving in an outer galaxy). The center-of-mass acceleration is independent of the internal structure of the system (if the mass of the system is small), namely the Weak Equivalence Principle is satisfied.

In a modified gravity theory, any time-independent system must still satisfy the virial theorem:

$$2K + W = 0.$$

where \(K = M\langle v^2\rangle/2\) is the total kinetic energy of the system, \(M = \sum_i m_i\) being the total mass of the system, \(\langle v^2\rangle\) the second moment of the velocity distribution, and \(W = - \int {\rho {\rm{x}}{.}\nabla \Phi {d^3}x}\) is the “virial”, proportional to the total potential energy. Milgrom [301, 302] showed that, in Bekenstein-Milgrom MOND, the virial is given by:

$$W = - {2 \over 3}\sqrt {G{M^3}{a_0}} - {1 \over {4\pi G}}\int {\left[ {{3 \over 2}a_0^2F(\vert \nabla \Phi \vert ^{2}/a_0^2) - \mu (\vert \nabla \Phi \vert /{a_0})\vert \nabla \Phi \vert ^{2}} \right]} \;{d^3}x.$$

For a system entirely in the extremely weak field limit (the “deep-MOND” limit x = g/a0 ≪ 1), where μ(x) = x and \(F(z) = (2/3)z^{3/2}\), the second term vanishes and we get \(W = -(2/3)\sqrt {G{M^3}{a_0}}\) (see [301] for the specific conditions for this to be valid). In this case, we can get an analytic expression for the two-body force under the approximation that the two bodies are very far apart compared to their internal sizes [301, 509, 511]. Since the kinetic energy K = Korb + Kint can be separated into the orbital energy \({K_{{\rm{orb}}}} = {m_1}{m_2}v_{{\rm{rel}}}^2/(2M)\) and the internal energy of the bodies \({K_{{\rm{int}}}} = \sum\nolimits_i (1/3)\sqrt {Gm_i^3{a_0}}\), we get from the scalar virial theorem of a stationary system:

$${{{m_1}{m_2}v_{{\rm{rel}}}^2} \over M} = {2 \over 3}\left[ {\sqrt {G{M^3}{a_0}} - \sum\limits_i {\sqrt {Gm_i^3{a_0}}}} \right].$$

We can then assume an approximately circular velocity such that the two-body force (satisfying the action and reaction principle) can be written analytically in the deep-MOND limit as:

$${F_{2{\rm{body}}}} = {{{m_1}{m_2}} \over {{m_1} + {m_2}}}{{v_{{\rm{rel}}}^2} \over r} = {2 \over 3}\left[ {{{({m_1} + {m_2})}^{3/2}} - m_1^{3/2} - m_2^{3/2}} \right]{{\sqrt {G{a_0}}} \over r}.$$
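The analytic two-body force can be evaluated directly. The sketch below is a plain transcription of the expression above into a function (the name and the arbitrary-unit usage are assumptions of this example); it also illustrates that, when one mass is negligible, the force reduces to that on a test particle in the weak-acceleration limit, m2 (G m1 a0)^{1/2}/r.

```python
import math


def two_body_force_deep_mond(m1, m2, r, G, a0):
    """Deep-MOND two-body force for well-separated bodies on an
    approximately circular relative orbit:
    F = (2/3) [ (m1+m2)^(3/2) - m1^(3/2) - m2^(3/2) ] sqrt(G a0) / r."""
    bracket = (m1 + m2) ** 1.5 - m1 ** 1.5 - m2 ** 1.5
    return (2.0 / 3.0) * bracket * math.sqrt(G * a0) / r
```

The expression is symmetric under exchange of the two masses, as required by the action and reaction principle, and falls off as 1/r rather than 1/r².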

The latter equation is not valid for N-body configurations, for which the Bekenstein-Milgrom (BM) modified Poisson equation (Eq. 17) must be solved numerically (apart from highly-symmetric N-body configurations). This equation is a non-linear elliptic partial differential equation. It can be solved numerically using various methods [50, 77, 96, 147, 250, 457]. One of them [77, 457] is to use a multigrid algorithm to solve the discrete form of Eq. 17 (see also Figure 17):

$$\begin{array}{rl} 4\pi G{\rho _{i,j,k}} = \big[ & ({\Phi _{i + 1,j,k}} - {\Phi _{i,j,k}}){\mu _{{M_1}}} - ({\Phi _{i,j,k}} - {\Phi _{i - 1,j,k}}){\mu _{{L_1}}} \\ & + ({\Phi _{i,j + 1,k}} - {\Phi _{i,j,k}}){\mu _{{M_2}}} - ({\Phi _{i,j,k}} - {\Phi _{i,j - 1,k}}){\mu _{{L_2}}} \\ & + ({\Phi _{i,j,k + 1}} - {\Phi _{i,j,k}}){\mu _{{M_3}}} - ({\Phi _{i,j,k}} - {\Phi _{i,j,k - 1}}){\mu _{{L_3}}} \big]/{h^2}, \end{array}$$

where


  • \(\rho_{i,j,k}\) is the density discretized on a grid of step h,

  • \(\Phi_{i,j,k}\) is the MOND potential discretized on the same grid of step h,

  • \(\mu_{M_1}\) and \(\mu_{L_1}\) are the values of μ(x) at points M1 and L1 corresponding to (i + 1/2, j, k) and (i − 1/2, j, k) respectively (Figure 17).

The gradient (∂/∂x, ∂/∂y, ∂/∂z) of Φ entering μ(x) is approximated in the case of \(\mu_{M_1}\) by \(([\Phi (B) - \Phi (A)]/h,[\Phi (I) + \Phi (H) - \Phi (K) - \Phi (J)]/(4h),[\Phi (C) + \Phi (D) - \Phi (E) - \Phi (F)]/(4h))\) (see Figure 17).
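The gradient estimate at the half-step point M1 can be sketched as follows. The figure is not reproduced here, so the mapping of the labels onto grid neighbours is an assumption of this example: A = (i, j, k) and B = (i + 1, j, k), with the transverse components averaged over the four nodes surrounding M1 in each transverse direction. The callable `phi` and both function names are hypothetical.

```python
import math


def grad_at_m1(phi, i, j, k, h):
    """Estimate grad(Phi) at M1 = (i + 1/2, j, k) on a grid of step h.

    phi is a callable phi(i, j, k) returning the potential at a node.
    ASSUMPTION: the transverse derivatives average the centred
    differences taken at nodes (i, j, k) and (i + 1, j, k)."""
    dx = (phi(i + 1, j, k) - phi(i, j, k)) / h
    dy = (phi(i, j + 1, k) + phi(i + 1, j + 1, k)
          - phi(i, j - 1, k) - phi(i + 1, j - 1, k)) / (4.0 * h)
    dz = (phi(i, j, k + 1) + phi(i + 1, j, k + 1)
          - phi(i, j, k - 1) - phi(i + 1, j, k - 1)) / (4.0 * h)
    return dx, dy, dz


def mu_at_m1(phi, i, j, k, h, a0, mu):
    """Value of the interpolating function mu(|grad Phi| / a0) at M1."""
    gx, gy, gz = grad_at_m1(phi, i, j, k, h)
    return mu(math.sqrt(gx * gx + gy * gy + gz * gz) / a0)
```

For a potential that is linear in the coordinates, this stencil recovers the gradient exactly, which is a convenient sanity check before using it inside a relaxation scheme.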

Figure 17

Discretisation scheme of the BM modified Poisson equation (Eq. 17) and of the phantom dark matter derivation in QUMOND. The node (i,j,k) corresponds to A on the upper panel. The gradient components in μ(x) (for Eq. 25) and ν(y) (for Eq. 35) are estimated at the Li and Mi points. Image courtesy of Tiret, reproduced by permission from [457], copyright by ESO.

In [457] the Gauss-Seidel relaxation with red and black ordering is used to solve this discretized equation, with the boundary condition for the Dirichlet problem given by Eq. 20 at large radii. It is obvious that subsequently devising an evolving N-body code for this theory can only be done using particle-mesh techniques rather than the gridless multipole expansion treecode schemes widely used in standard gravity.

Finally, let us note that it could be imagined that MOND, given some of its observational problems (developed in Section 6.6), is incomplete and needs a new scale in addition to a0. There are several ways to implement such an idea, but for instance, Bekenstein [36] proposed in this vein a generalization of the AQUAL formalism by adding a velocity scale s0, in order to allow for effective variations of the acceleration constant as a function of the deepness of the potential, namely:

$${S_{{\rm{grav}}\;{\rm{Bek}}}} \equiv - {1 \over {8\pi G}}\int {a_0^2{e^{- 2\Phi /s_0^2}}F(\vert \nabla \Phi \vert ^{2}{e^{2\Phi /s_0^2}}/a_0^2){d^3}x\;dt,}$$

leading to

$$\nabla .\left[ {\mu \left({{{\vert \nabla \Phi \vert} \over {{a_{0{\rm{eff}}}}}}} \right)\nabla \Phi} \right] - {{\vert \nabla \Phi \vert ^{2}} \over {s_0^2}}\mu \left({{{\vert \nabla \Phi \vert} \over {{a_{0{\rm{eff}}}}}}} \right) + {{a_{0{\rm{eff}}}^2} \over {s_0^2}}F\left({{{\vert \nabla \Phi \vert ^{2}} \over {a_{0{\rm{eff}}}^2}}} \right) = 4\pi G\rho ,$$

where \({a_{0{\rm{eff}}}} = {a_0}{e^{- \Phi/s_0^2}}\). Interestingly, with this “modified MOND”, Gauss’ theorem (or Newton’s second theorem) would no longer be valid in spherical symmetry. A suitable choice of s0 (e.g., on the order of \(10^3\) km/s; see [36]) could affect the dynamics of galaxy clusters (by boosting the modification with an effectively higher value of a0) compared to the previous MOND equation, while leaving less massive systems such as galaxies typically unaffected compared to usual MOND. Other (lower) values of s0 could allow (modulo a renormalization of a0) for a stronger modification in galaxy clusters as well as a milder modification in subgalactic systems such as globular clusters, which, as we shall soon see, could be interesting from a phenomenological point of view (see Section 6.6). However, the possibility of too strong a modification should be carefully investigated, as well as, in a relativistic (see Section 7) version of the theory, the consequences on the dynamics of a scalar-field with a similar action.
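The dependence of the effective acceleration constant on the depth of the potential can be illustrated with a one-line function. This is only a sketch of the exponential factor quoted above; the function name is hypothetical, and any potential depths fed to it (and the zero-point convention for Φ) are assumptions of the user, not values taken from [36].

```python
import math


def a0_effective(a0, phi, s0):
    """Effective acceleration constant a0_eff = a0 * exp(-phi / s0^2).

    phi is the (negative) gravitational potential and s0 the velocity
    scale, both in consistent units (e.g. phi in (km/s)^2, s0 in km/s)."""
    return a0 * math.exp(-phi / s0**2)
```

With s0 = 10³ km/s, a potential depth of order −(200 km/s)² changes a0 by only a few percent, whereas a depth of order −(1000 km/s)² raises it by a factor of e, which is the qualitative behavior described in the text.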

6.1.3 QUMOND

Another way [319] of modifying gravity in order to reproduce Milgrom’s law is to still keep the “matter action” unchanged, \(S_{\rm kin} + S_{\rm in} = \int \rho(v^2/2 - \Phi)\,d^3x\,dt\), thus ensuring that varying the action of a test particle with respect to the particle degrees of freedom leads to \(d^2{\bf x}/dt^2 = -\nabla\Phi\), but to invoke an auxiliary acceleration field gN = −∇ΦN in the gravitational action instead of invoking an aquadratic Lagrangian in |∇Φ|. The addition of such an auxiliary field can of course be done without modifying Newtonian gravity, by writing the Newtonian gravitational action in the following way (Footnote 27):

$${S_{{\rm{grav}}\;{\rm{N}}}} = - {1 \over {8\pi G}}\int {(2\nabla \Phi {.}{{\bf{g}}_N} - {\bf{g}}_N^2){d^3}x\;dt.}$$

It gives, after variation over \({\bf g}_N\) (or over \(\Phi_N\)): \({\bf g}_N = -\nabla \Phi\). And after variation of the full action over Φ: \(-\nabla . {\bf g}_N = 4\pi G\rho\), i.e., Newtonian gravity. One can then introduce a MONDian modification of gravity by modifying this action in the following way, replacing \({\bf{g}}_N^2\) by a non-linear function of it and assuming that it derives from an auxiliary potential, \({\bf g}_N = -\nabla \Phi_N\), so that the new degree of freedom is this new potential:

$${S_{{\rm{grav}}\;{\rm{QUMOND}}}} \equiv - {1 \over {8\pi G}}\int {[2\nabla \Phi .\nabla {\Phi _N} - a_0^2Q(\vert \nabla {\Phi _N}\vert ^{2}/a_0^2)]{d^3}x\;dt.}$$

Varying the total action with respect to Φ yields: \(\nabla^2 \Phi_N = 4\pi G\rho\). And varying it with respect to the auxiliary (Newtonian) potential \(\Phi_N\) yields:

$${\nabla ^2}\Phi = \nabla .\left[ {\nu \left({{{\vert \nabla {\Phi _N}\vert} \over {{a_0}}}} \right)\nabla {\Phi _N}} \right]$$

where ν(y) = Q′(z) and \(z = y^2\). Thus, the theory only requires solving the linear Newtonian Poisson equation twice, with a single non-linear step in calculating the rhs term of Eq. 30. For this reason, it is called the quasi-linear formulation of MOND (QUMOND). In order to recover the ν-function behavior of Milgrom’s law (Eq. 10), i.e., ν(y) → 1 for y ≫ 1 and \(\nu(y) \rightarrow y^{-1/2}\) for y ≪ 1, one needs to choose:

$$Q(z) \rightarrow z\;{\rm{for}}\;z \gg 1\;{\rm{and}}\;Q(z) \rightarrow {4 \over 3}{z^{3/4}}{\rm{for}}\;z \ll 1.$$
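
As a quick check of these asymptotic forms (a worked step added here, using only ν(y) = Q′(z) with \(z = y^2\)), differentiating the deep-MOND form of Q gives

$$Q(z) = {4 \over 3}{z^{3/4}}\; \Rightarrow \;\nu (y) = {Q{\prime}}(z) = {z^{- 1/4}} = {y^{- 1/2}},$$

while Q(z) = z gives ν = 1, as required in the Newtonian regime.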

The general solution of the system of partial differential equations is equivalent to Milgrom’s law (Eq. 10) up to a curl field correction, and is precisely equal to Milgrom’s law in highly-symmetric one-dimensional systems. However, this curl-field correction is different from the one of AQUAL. This means that, outside of high symmetry, AQUAL and QUMOND cannot be precisely equivalent. An illustration of this is given in [509]: for a system with all its mass in an elliptical shell (in the sense of a squashed homogeneous spherical shell), the effective density of matter that would source the MOND force field in Newtonian gravity is uniformly zero in the void inside the shell for QUMOND, but nonzero for AQUAL.

The concept of the effective density of matter that would source the MOND force field in Newtonian gravity is extremely useful for an intuitive comprehension of the MOND effect, and/or for interpreting MOND in the dark matter language: indeed, subtracting from this effective density the baryonic density yields what is called the “phantom dark matter” distribution. In AQUAL, it requires deriving the Newtonian Poisson equation after having solved for the MOND one. On the other hand, in QUMOND, knowing the Newtonian potential yields direct access to the phantom dark matter distribution even before knowing the MOND potential. After choosing a ν-function, one defines

$$\tilde \nu (y) = \nu (y) - 1,$$

and one has, for the phantom dark matter density,

$${\rho _{{\rm{ph}}}} = {{\nabla {.}(\tilde \nu \nabla {\Phi _N})} \over {4\pi G}}.$$

This \({\tilde \nu}\)-function appears naturally in an alternative formulation of QUMOND where one writes the action as a function of an auxiliary potential Φph:

$${S_{{\rm{grav}}\;{\rm{QUMOND}}}} = - {1 \over {8\pi G}}\int {[\vert \nabla \Phi \vert ^{2} - \vert \nabla {\Phi _{{\rm{ph}}}}\vert ^{2} - a_0^2H(\vert \nabla \Phi - \nabla {\Phi _{{\rm{ph}}}}\vert ^{2}/a_0^2)]{d^3}x\;dt,}$$

leading to a potential Φph obeying a QUMOND equation with \(\tilde \nu (y) = {H{\prime}}({y^2})\) and Φ = ΦN + Φph.

Numerically, for a given Newtonian potential discretized on a grid of step h, the discretized phantom dark matter density is given on grid points (i,j,k) by (see Figure 17 and cf. Eq. 25, see also [11]):

$$\begin{array}{*{20}c}{{\rho _{{\rm{ph}}\;(i,j,k)}} =}\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\\{\left[({\Phi _{N(i + 1,j,k)}} - {\Phi _{N(i,j,k)}}){{\tilde \nu}_{{M_1}}}\right. - ({\Phi _{N(i,j,k)}} - {\Phi _{N(i - 1,j,k)}}){{\tilde \nu}_{{L_1}}}}\quad\quad\quad\quad\\{+ ({\Phi _{N(i,j + 1,k)}} - {\Phi _{N(i,j,k)}}){{\tilde \nu}_{{M_2}}} - ({\Phi _{N(i,j,k)}} - {\Phi _{N(i,j - 1,k)}}){{\tilde \nu}_{{L_2}}}}\quad\quad\quad\quad\\{+ ({\Phi _{N(i,j,k + 1)}} - {\Phi _{N(i,j,k)}}){{\tilde \nu}_{{M_3}}} -\left.({\Phi _{N(i,j,k)}} - {\Phi _{N(i,j,k - 1)}}){{\tilde \nu}_{{L_3}}}\right]/(4\pi G{h^2}){.}}\\\end{array}$$
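
As an illustration of the discretization above, here is a minimal NumPy sketch. It assumes a cubic grid of uniform step h, and it evaluates \({\tilde \nu}\) at the half-way points \(M_i\) and \(L_i\) from a gradient built as a forward difference along the axis and averaged central differences across it (our reading of the stencil of Figure 17). The function names and the default “simple” \({\tilde \nu}\) are illustrative assumptions, not the implementation of [11] or [253].

```python
import numpy as np


def nu_tilde_simple(y):
    """nu(y) - 1 for the "simple" nu-function (illustrative choice; needs y > 0)."""
    return 0.5 * (1.0 + np.sqrt(1.0 + 4.0 / y)) - 1.0


def phantom_density(phi_n, h, a0, G, nu_tilde=nu_tilde_simple):
    """Discretized phantom density on the interior nodes of a grid of step h.

    phi_n is a 3D array with the Newtonian potential at the grid nodes.
    The result has shape (nx - 2, ny - 2, nz - 2).
    """
    # Central differences at the nodes; averaging two neighbouring nodes
    # gives the transverse gradient components at the half-way points.
    node_grad = np.gradient(phi_n, h)
    interior = (slice(1, -1),) * 3
    div = np.zeros_like(phi_n[interior], dtype=float)
    for ax in range(3):
        lo = [slice(None)] * 3
        hi = [slice(None)] * 3
        lo[ax], hi[ax] = slice(None, -1), slice(1, None)
        lo, hi = tuple(lo), tuple(hi)
        # Gradient component along `ax` at the midpoints between nodes.
        d_par = (phi_n[hi] - phi_n[lo]) / h
        g_sq = d_par**2
        for other in range(3):
            if other != ax:
                g_sq += (0.5 * (node_grad[other][lo] + node_grad[other][hi])) ** 2
        flux = nu_tilde(np.sqrt(g_sq) / a0) * d_par
        # Flux through M (midpoint index i) minus flux through L (index i - 1),
        # kept only for interior nodes.
        m_idx = [slice(1, -1)] * 3
        l_idx = [slice(1, -1)] * 3
        m_idx[ax], l_idx[ax] = slice(1, None), slice(None, -1)
        div += (flux[tuple(m_idx)] - flux[tuple(l_idx)]) / h
    return div / (4.0 * np.pi * G)
```

A caller would pass the Newtonian potential sampled on the grid, with h, a0 and G in consistent units. Any other \({\tilde \nu}\) can be supplied through the `nu_tilde` argument, as long as it accepts NumPy arrays.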

This means that any N-body technique (e.g., treecodes or fast multipole methods) can be adapted to QUMOND (a grid being necessary as an intermediate step). Once the Newtonian potential (or force) is locally known, the phantom dark matter density can be computed and then represented by weighted particles, whose gravitational attraction can then be computed in any traditional manner. An example is given in Figure 18, where one considers a rather typical baryonic galaxy model with a small bulge and a large disk. Applying Eq. 35 (with the ν-function of Eq. 43) then yields the phantom density [253]. Interestingly, this phantom density is composed of a round “dark halo” and a flattish “dark disk” (see [305] for an extensive discussion of how such a dark disk component comes about; see also [50] and Section 6.5.2 for observational considerations). Let us note that this phantom dark matter density can be slightly separated from the baryonic density distribution in non-spherical situations [226], and that it can be negative [297, 490], contrary to normal dark matter. Finding such a local negative dark matter density could be a way of exhibiting a clear signature of MOND.

Figure 18

(a) Baryonic density of a model galaxy made of a small Plummer bulge with a mass of \(2 \times 10^8\,M_\odot\) and Plummer radius of 185 pc, and of a Miyamoto-Nagai disk of \(1.1 \times 10^{10}\,M_\odot\), a scale-length of 750 pc and a scale-height of 300 pc. (b) The derived phantom dark matter density distribution: it is composed of a spheroidal component similar to a dark matter halo, and of a thin disk-like component (Figure made by Fabian Lüghausen [253]).

Finally, let us note that, as shown in [319, 509], (i) a system made of high-acceleration constituents, but with a low-acceleration center-of-mass, moves according to a low-acceleration MOND law, while (ii) the virial of a system is given by

$$W = - {2 \over 3}\sqrt {G{M^3}{a_0}} - {1 \over {4\pi G}}\int {\left[ {- {3 \over 2}a_0^2Q(\vert \nabla {\Phi _N}\vert ^{2}/a_0^2) + 2\nu (\vert \nabla {\Phi _N}\vert /{a_0})\vert \nabla {\Phi _N}\vert ^{2}} \right]} {d^3}x,$$

meaning that for a system entirely in the extremely weak field limit where \(\nu(y) = y^{-1/2}\) and \(Q(z) = (4/3)z^{3/4}\), the second term vanishes and we get \(W = (- 2/3)\sqrt {G{M^3}{a_0}}\) precisely as in Bekenstein-Milgrom MOND. This means that, although the curl-field correction is in general different in AQUAL and QUMOND, the two-body force in the deep-MOND limit is the same [509].
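
The vanishing of the second term can be verified directly (a short worked step added here, writing \(\vert \nabla {\Phi _N}\vert = {a_0}y\) so that \(z = y^2\)): in the deep-MOND limit the integrand becomes

$$- {3 \over 2}a_0^2\,{4 \over 3}{y^{3/2}} + 2{y^{- 1/2}}a_0^2{y^2} = - 2a_0^2{y^{3/2}} + 2a_0^2{y^{3/2}} = 0.$$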

6.2 The interpolating function

The basis of the MOND paradigm is to reproduce Milgrom’s law, Eq. 7, in highly symmetrical systems, with an interpolating function asymptotically obeying the conditions of Eq. 8, i.e., μ(x) → 1 for x ≫ 1 and μ(x) → x for x ≪ 1. Obviously, in order for the relation between g and \(g_N\) to be uniquely determined, another constraint is that xμ(x) must be a monotonically increasing function of x, or equivalently

$$\mu (x) + x{\mu {\prime}}(x) > 0,$$

or equivalently

$${{d\ln \mu} \over {d\ln x}} > - 1.$$

Even though this leaves some freedom for the exact shape of the interpolating function, leading to the various families of functions hereafter, let us insist that it is already extremely surprising, from the dark matter point of view, that the MOND prescriptions for the asymptotic behavior of the interpolating function did predict all the aspects of the dynamics of galaxies listed in Section 5.

As we have seen in Section 6.1, an alternative formulation of the MOND paradigm relies on Eq. 10, based on an interpolating function

$$\nu (y) = 1/\mu (x)\;{\rm{where}}\;y = x\mu (x).$$

In that case, we also have that yν(y) must be a monotonically increasing function of y.

Finally, as we shall see in detail in Section 7, many MOND relativistic theories boil down to multifield theories where the weak-field limit can be represented by a potential \(\Phi = \sum\nolimits_i {\phi _i}\), where each \(\phi_i\) obeys a generalized Poisson equation, the most common case being

$$\Phi = {\Phi _N} + \phi ,$$

where \(\Phi_N\) obeys the Newtonian Poisson equation and the scalar field ϕ (with dimensions of a potential) plays the role of the phantom dark matter potential and obeys an equation of either the type of Eq. 17 or of Eq. 30. When it obeys a QUMOND type of equation (Eq. 30), the ν-function must be replaced by the \({\tilde \nu}\)-function of Eq. 32. When it obeys a BM-like equation (Eq. 17), the classical interpolating function μ(x) acting on x = |∇Φ|/a0 must be replaced by another interpolating function \(\tilde \mu (s)\) acting on \(s = \vert \nabla \phi \vert/{a_0}\), in order for the total potential Φ to conform to Milgrom’s lawFootnote 28. In the absence of a renormalization of the gravitational constant, the two functions are related through [145]

$$\tilde \mu (s) = (x - s){s^{- 1}}{\rm{where}}\;s = x[1 - \mu (x)].$$

For x ≪ 1 (the deep-MOND regime), one has \(s = x(1 - x) \ll 1\) and \(x \sim s(1 + s)\), yielding \(\tilde \mu (s) \sim s\), i.e., although it is generally different, \({\tilde \mu}\) has the same low-gravity asymptotic behavior as μ.

In spherical symmetry, all these different formulations can be made equivalent by choosing equivalent interpolating functions, but the theories will typically differ slightly outside of spherical symmetry (i.e., the curl field will be slightly different). As an example, let us consider a widely-used interpolating function [141, 166, 402, 508] yielding excellent fits in the intermediate to weak gravity regime of galaxies (but not in the strong gravity regime of the Solar system), known as the “simple” μ-function (see Figure 19):

$$\mu (x) = {x \over {1 + x}}.$$

This yields \(y = x^2/(1 + x)\), and thus \(x = (y + \sqrt {{y^2} + 4y})/2\), and ν = (1 + x)/x yields the “simple” ν-function:

$$\nu (y) = {{1 + {{(1 + 4{y^{- 1}})}^{1/2}}} \over 2}.$$

It also yields s = x[1 − μ(x)] = x/(1 + x) = μ, and hence x = s/(1 − s), yielding for the “simple” \({\tilde \mu}\)-function:

$$\tilde \mu (s) = {s \over {1 - s}}.$$
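
The passage from a μ-function to its ν-function can also be done numerically when no closed form is available. The sketch below is only an illustration: it assumes that xμ(x) is monotonically increasing, inverts y = xμ(x) by bisection, and compares the result with the closed-form simple ν-function given above. The helper name `nu_from_mu` and the bracketing value `x_hi` are our own choices.

```python
import math


def mu_simple(x):
    """The "simple" mu-function, mu(x) = x / (1 + x)."""
    return x / (1.0 + x)


def nu_simple(y):
    """Closed-form "simple" nu-function."""
    return 0.5 * (1.0 + math.sqrt(1.0 + 4.0 / y))


def nu_from_mu(mu, y, x_hi=1.0e12, tol=1.0e-14):
    """Solve y = x * mu(x) for x by bisection and return nu(y) = 1 / mu(x).

    Relies on x * mu(x) being monotonically increasing, and on the
    root lying below x_hi (an arbitrary bracketing value).
    """
    lo, hi = 0.0, x_hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * mu(mid) < y:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * hi:
            break
    x = 0.5 * (lo + hi)
    return 1.0 / mu(x)


if __name__ == "__main__":
    for y in (1.0e-3, 1.0, 1.0e3):
        print(y, nu_from_mu(mu_simple, y), nu_simple(y))
```

The same routine can be applied to any other monotonic μ-function by passing it as the first argument.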

A more general family of \({\tilde \mu}\)-functions is known as the α-family [15], valid for 0 ≤ α ≤ 1 and including the simple function of the α = 1 case Footnote 29:

$${\tilde \mu _\alpha}(s) = {s \over {1 - \alpha s}}$$

corresponding to the following family of μ-functions:

$${\mu _\alpha}(x) = {{2x} \over {1 + (2 - \alpha)x + {{[{{(1 - \alpha x)}^2} + 4x]}^{1/2}}}}.$$
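
The correspondence between the two α-families can be checked numerically. The following sketch is illustrative only: it evaluates s = x[1 − μ(x)] and (x − s)/s from the μ-family and compares the latter with the closed-form \({\tilde \mu _\alpha}(s)\). The function names are ours.

```python
import math


def mu_alpha(x, alpha):
    """alpha-family of mu-functions."""
    root = math.sqrt((1.0 - alpha * x) ** 2 + 4.0 * x)
    return 2.0 * x / (1.0 + (2.0 - alpha) * x + root)


def mu_tilde_alpha(s, alpha):
    """alpha-family of mu-tilde functions, defined for 0 < s < 1/alpha."""
    return s / (1.0 - alpha * s)


def mu_tilde_from_mu(x, alpha):
    """Return (s, mu_tilde) computed from mu_alpha without renormalizing G."""
    s = x * (1.0 - mu_alpha(x, alpha))
    return s, (x - s) / s


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        s, mt = mu_tilde_from_mu(1.0, alpha)
        print(alpha, s, mt, mu_tilde_alpha(s, alpha))
```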

The α = 0 case is sometimes referred to as “Bekenstein’s μ-function” (see Figure 19) as it was used in [33]. The problem here is that all these μ-functions approach 1 quite slowly, with ζ ≤ 1 in their asymptotic expansion for x → ∞, \(\mu(x) \sim 1 - Ax^{-\zeta}\). Indeed, since s = x[1 − μ(x)], its asymptotic behavior is \(s \sim Ax^{-\zeta + 1}\). So, if ζ > 1, s → 0 for x → ∞ as well as for x → 0, which would imply that \(x(s) = s \tilde \mu (s) + s\) would be a multivalued function, and that the gravity would be ill-defined. This is problematic because even for the extreme case ζ = 1, the anomalous acceleration does not go to zero in the strong gravity regime: there is still a constant anomalous “Pioneer-like” acceleration x[1 − μ(x)] → A, which is observationally excludedFootnote 30 from very accurate planetary ephemerides [154]. What is more, these \({\tilde \mu}\)-functions, defined only in the domain \(0 < s < \alpha^{-1}\), would need very carefully chosen boundary conditions to avoid covering values of s outside of the allowed domain when solving the Poisson equation for the scalar field.

Figure 19

Various μ-functions. Dotted green line: the α = 0 “Bekenstein” function of Eq. 46. Dashed red line: the α = n = 1 “simple” function of Eq. 46 and Eq. 49. Dot-dashed black line: the n = 2 “standard” function of Eq. 49. Solid blue line: the γ = δ = 1 μ-function corresponding to the ν-function defined in Eq. 52 and Eq. 53. The latter function closely retains the virtues of the n = 1 simple function in galaxies (x ≲ 10), but approaches 1 much more quickly and connects with the n = 2 standard function as x ≫ 10.

The way out, in order to design \({\tilde \mu}\)-functions corresponding to acceptable μ-functions in the strong gravity regime, is to proceed to a renormalization of the gravitational constant [145]: this means that the bare value of G in the Poisson and generalized Poisson equations ruling the bare Newtonian potential \(\phi_N\) and the scalar field ϕ in Eq. 40 is different from the gravitational constant measured on Earth, \(G_N\) (related to the true Newtonian potential \(\Phi_N\)). One can assume that the bare gravitational constant G is related to the measured one through

$${G_N} = \xi G,$$

meaning that x = y + s where \(x = \vert \nabla \Phi \vert/{a_0}\), \(y = \vert \nabla {\phi _N}\vert/{a_0} = \vert \nabla {\Phi _N}\vert/(\xi {a_0})\), and \({s}\tilde \mu ({s}) = y\). We then have for Milgrom’s law:

$$x\mu (x) = \xi (x - s) = \xi s\tilde \mu (s).$$

In order to recover μ(x) → 1 for x → ∞, it is straightforward to show [145] that it suffices that \(\tilde \mu ({s}) \rightarrow {{\tilde \mu}_0}\) for s → ∞, and that \(\xi = 1 + \tilde \mu _0^{- 1}\). Then, if ζ > 1 in the asymptotic expansion \(\mu(x) \sim 1 - x^{-\zeta}\), one has \({s} \sim {(1 + \tilde \mu _0^{- 1})^{- 1}}{x^{- \zeta + 1}} + {(1 + {{\tilde \mu}_0})^{- 1}}x\). This second, linear term allows s to go to infinity for large x and thus x(s) to be single-valued. On the other hand, for the deep-MOND regime, the renormalization of G implies that \(\tilde \mu ({s}) \rightarrow {s}/\xi\) for \({s} \ll 1\).

We can then use, even in multifield theories, μ-functions quickly asymptoting to 1. For each of these functions, there is a one-parameter family of corresponding \({\tilde \mu}\)-functions (labelled by the parameter \(\tilde \mu (\infty) = {{\tilde \mu}_0})\), obtained by inserting μ(x) into \({s} = x[1 - {\xi ^{- 1}}\mu (x)]\) and making sure that the function is increasing and thus invertible. A useful family of such μ-functions asymptoting more quickly towards 1 than the α-family is the n-family:

$${\mu _n}(x) = {x \over {{{(1 + {x^n})}^{1/n}}}}.$$

The case n = 1 is again the simple function, while the case n = 2 has been extensively used in rotation curve analysis from the very first analyses [28, 223] to this day [401], and is thus known as the “standard” μ-function (see Figure 19). The corresponding \({\tilde \mu}\)-function for n ≥ 2 has a very peculiar shape of the type shown in Figure 3 of [81] (which might be considered a fine-tuned shape, necessary to account for solar system constraints). On the other hand, the corresponding ν-function family is:

$${\nu _n}(y) = {\left[ {{{1 + {{(1 + 4{y^{- n}})}^{1/2}}} \over 2}} \right]^{1/n}}.$$

As the simple μ-function (α = 1 or n = 1) fits galaxy rotation curves well (see Section 6.5.1) but is excluded in the solar system (see Section 6.4), it can be useful to define μ-functions that have a gradual transition similar to the simple function in the low to intermediate gravity regime of galaxies, but a more rapid transition towards one than the simple function. Two such families are described in [325] in terms of their ν-function:

$${\nu _\beta}(y) = {(1 - {e^{- y}})^{- 1/2}} + \beta {e^{- y}}$$


$${\nu _\gamma}(y) = {(1 - {e^{- {y^{\gamma /2}}}})^{- 1/\gamma}} + (1 - {\gamma ^{- 1}}){e^{- {y^{\gamma /2}}}}.$$

Finally, yet another family was suggested in [274], obtained by deleting the second term of the γ-family, and retaining the virtues of the n-family in galaxies, but approaching one more quickly in the solar system:

$${\nu _\delta}(y) = {(1 - {e^{- {y^{\delta /2}}}})^{- 1/\delta}}.$$
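
To get a feel for these families, the following sketch evaluates them and compares their approach to the Newtonian regime with that of the simple ν-function. It is only an illustration; the function names are ours, and `expm1` is used to keep 1 − e^(−t) accurate at small t.

```python
import math


def nu_simple(y):
    """Simple nu-function (n = 1), for comparison."""
    return 0.5 * (1.0 + math.sqrt(1.0 + 4.0 / y))


def nu_beta(y, beta):
    """beta-family: (1 - exp(-y))**(-1/2) + beta * exp(-y)."""
    return (-math.expm1(-y)) ** -0.5 + beta * math.exp(-y)


def nu_gamma(y, gamma):
    """gamma-family, with t = y**(gamma/2)."""
    t = y ** (gamma / 2.0)
    return (-math.expm1(-t)) ** (-1.0 / gamma) + (1.0 - 1.0 / gamma) * math.exp(-t)


def nu_delta(y, delta):
    """delta-family: the gamma-family without its second term."""
    t = y ** (delta / 2.0)
    return (-math.expm1(-t)) ** (-1.0 / delta)


if __name__ == "__main__":
    for y in (0.01, 1.0, 20.0):
        print(y, nu_simple(y), nu_beta(y, 1.0), nu_gamma(y, 2.0), nu_delta(y, 1.0))
```

For γ = 1 the second term of the γ-family vanishes, so the γ = 1 and δ = 1 functions coincide, which is why the text refers to a single “γ = δ = 1” function.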

To be complete, it should be noted that other μ-functions considered in the literature include [304, 505] (see also Section 7.10):

$$\mu (x) = {{{{(1 + 4{x^2})}^{1/2}} - 1} \over {2x}},$$


$$\mu (x) = 1 - {(1 + x/3)^{- 3}}.$$

This simply shows the variety of shapes that the interpolating function of MOND can in principle takeFootnote 31. Very precise data for rotation curves, including negligible errors on the distance and on the stellar mass-to-light ratios (or, in that case, purely gaseous galaxies) should allow one to pin down its precise form, at least in the intermediate gravity regime and for “modified inertia” theories (Section 6.1.1) where Milgrom’s law is exact for circular orbits. Nowadays, galaxy data still allow some, but not much, wiggle room: they tend to favor the α = n = 1 simple function [166] or some interpolation between n = 1 and n = 2 [141], while combined data of galaxies and the solar system (see Sections 6.4 and 6.5) rather tend to favor something like the γ = δ = 1 function of Eq. 52 and Eq. 53 (which effectively interpolates between n = 1 and n = 2, see Figure 19), although slightly higher exponents (i.e., γ > 1 or δ > 1) might still be needed in the weak gravity regime in order to pass solar system tests involving the external field from the Galaxy [62]. Again, it should be stressed that the most salient aspect of MOND is not its precise interpolating function, but rather its successful predictions on galactic scaling relations and Kepler-like laws of galactic dynamics (Section 5.2), as well as its various beneficial effects on, e.g., disk stability (see Section 6.5), all predicted from its asymptotic form. The very concept of a pre-defined interpolating function should even in principle fully disappear once a more profound parent theory of MOND is discovered (see, e.g., [22]).

To end this section on the interpolating function, let us stress that if the μ-function asymptotes as μ(x) = x for x → 0, then the energy of the gravitational field surrounding a massive body is infinite [38]. What is more, if the \({\tilde \mu}\) function of relativistic multifield theories asymptotes in the same way to zero before going to negative values for time-evolution dominated systems (see Section 9.1), then a singular surface exists around each galaxy, on which the scalar degree of freedom does not propagate, and can therefore not provide a consistent picture of collapsed matter embedded into a cosmological background. A simple solution [145, 380] consists in assuming a modified asymptotic behavior of the μ-function, namely of the form

$$\mu (x) \sim {\varepsilon _0} + x\;{\rm{for}}\;x \ll 1.$$

In that case there is a return to a Newtonian behavior (but with a very strong renormalized gravitational constant GN/ε0) at a very low acceleration scale x ≪ ε0, and rotation curves of galaxies are only approximately flat until the galactocentric radius

$$R \sim {1 \over {{\varepsilon _0}}}\sqrt {{{{G_N}M} \over {{a_0}}}} .$$

Thus, one must have \(\varepsilon_0 \ll 1\) to not affect the observed phenomenology in galaxies. Note that the μ-function will never go to zero, even at the center of a system. Conversely, in QUMOND and the like, one can modify the ν-function in the same way:

$$\nu (y) \sim {1 \over {{\varepsilon _0} + {y^{1/2}}}}{\rm{for}}\;y \ll 1.$$

6.3 The external field effect

The above return to a rescaled Newtonian behavior at very large radii and in the central parts of isolated systems, in order to avoid theoretical problems with the interpolating function, would happen anyway, even with the interpolating function going to zero, for any non-isolated system in the universe (and this return to Newtonian behavior could actually happen at much lower radii) because of a very peculiar aspect of MOND: the external field effect, which appeared in its full significance already in the pristine formulation of MOND [293].

In practice, no objects are truly isolated in the Universe and this has wider and more subtle implications in MOND than in Newton-Einstein gravity. In the linear Newtonian dynamics, the internal dynamics of a subsystem (a cluster in a galaxy, or a galaxy in a galaxy cluster for instance) in the field of its mother system decouples. Namely, the internal dynamics is always the same independent of any external field (constant across the subsystem) in which the system is embedded (of course, if the external field varies across the subsystem, it manifests itself as tides). This has subsequently been built in as a fundamental principle of GR: the Strong Equivalence Principle (see Section 7). But MOND has to break this fundamental principle of GR. This is because, as it is an acceleration-based theory, what counts is the total gravitational acceleration with respect to a pre-defined frame (e.g., the CMB frameFootnote 32). Thus, the MOND effects are only observed in systems where the absolute value of the gravity both internal, g, and external, ge (from a host galaxy, or astrophysical system, or large scale structure), is less than a0. If ge < g < a0 then we have standard MOND effects. However, if the hierarchy goes as g < a0 < ge, then the system is purely NewtonianFootnote 33, and if g < ge < a0 then the system is Newtonian with a renormalized gravitational constant. Ultimately, whenever g falls below ge (which always happens at some point) the gravitational attraction falls again as 1/r2. This is most easily illustrated in a thought experiment where one considers MOND effects in one dimension. In Eq. 17, one has ∇Φ = g + ge and 4πGρ = ∇.(gN + gNe), which in one dimension leads to the following revised Milgrom’s law (Eq. 7) including the external field:

$$g\mu \left({{{g + {g_e}} \over {{a_0}}}} \right) + {g_e}\left[ {\mu \left({{{g + {g_e}} \over {{a_0}}}} \right) - \mu \left({{{{g_e}} \over {{a_0}}}} \right)} \right] = {g_N},$$

such that, when g → 0, we have Newtonian gravity with a renormalized gravitational constant \(G_{\rm norm} \approx G/[\mu_e(1 + L_e)]\), where \(\mu_e = \mu(g_e/a_0)\) and \(L_e = (d\ln \mu/d\ln x)_{x = g_e/a_0}\), assuming, as before, that the external field only varies on a much larger scale than the internal system. Similarly, for QUMOND (Eq. 30) in one dimension, one gets the equivalent of Eq. 10:

$$g = {g_N}\nu \left({{{{g_N} + {g_{Ne}}} \over {{a_0}}}} \right) + {g_{Ne}}\left[ {\nu \left({{{{g_N} + {g_{Ne}}} \over {{a_0}}}} \right) - \nu \left({{{{g_{Ne}}} \over {{a_0}}}} \right)} \right].$$
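
To make the two one-dimensional formulae concrete, here is a minimal Python sketch. It evaluates the QUMOND-type expression, checks that the result also satisfies the BM-type one, and compares the weak-field limit with the renormalized-constant factor 1/[μe(1 + Le)]. It assumes the “simple” μ- and ν-functions of Section 6.2, accelerations expressed in units of a0 by default, and a Newtonian external field matched to the total one through \(g_{Ne} = g_e\,\mu(g_e/a_0)\). All function names are illustrative.

```python
import math


def mu_simple(x):
    return x / (1.0 + x)


def nu_simple(y):
    return 0.5 * (1.0 + math.sqrt(1.0 + 4.0 / y))


def g_internal_1d(gn, ge, a0=1.0):
    """Internal gravity g from the QUMOND-type 1D formula.

    gn is the internal Newtonian gravity and ge the total external field;
    the Newtonian external field is taken as gne = ge * mu(ge / a0).
    """
    gne = ge * mu_simple(ge / a0)
    nu_tot = nu_simple((gn + gne) / a0)
    return gn * nu_tot + gne * (nu_tot - nu_simple(gne / a0))


def bm_residual_1d(g, gn, ge, a0=1.0):
    """Left-hand side minus right-hand side of the BM-type 1D formula."""
    mu_tot = mu_simple((g + ge) / a0)
    return g * mu_tot + ge * (mu_tot - mu_simple(ge / a0)) - gn


def g_norm_factor(ge, a0=1.0):
    """G_norm / G = 1 / [mu_e (1 + L_e)] for the simple mu-function."""
    xe = ge / a0
    l_e = 1.0 / (1.0 + xe)  # d ln(mu) / d ln(x) for mu = x / (1 + x)
    return 1.0 / (mu_simple(xe) * (1.0 + l_e))


if __name__ == "__main__":
    gn, ge = 1.0e-6, 1.5
    g = g_internal_1d(gn, ge)
    print(g / gn, g_norm_factor(ge), bm_residual_1d(g, gn, ge))
```

With matched interpolating functions, the two one-dimensional expressions describe the same relation, so the residual printed last should be at the level of rounding errors.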

When dealing in the future with very extended rotation curves whose last observed point is in the extreme weak-field limit, it could be interesting, as a first-order approximation, to use the latter formulaeFootnote 34, adding the external field as an additional parameter of the MOND fit to the external parts of the rotation curve. Of course, this would only be a first-order approximation because it would neglect the three-dimensional nature of the problem and the direction of the external field.

Now, in three dimensions, the problem can be analytically solved only in the extreme case of the completely-external-field-dominated part of the system (where \(g \ll g_e\)) by considering the perturbation generated by a body of low mass m inside a uniform external field, assumed along the z-direction, \({\bf g}_e = g_e {\bf 1}_z\). Eq. 17 can then be linearized and solved with the boundary condition that the total field equals the external one at infinity [38] to yield:

$$\Phi (x,y,z) = - {{Gm} \over {{\mu _e}\tilde r}},$$


$$\tilde r = r{(1 + {L_e}({x^2} + {y^2})/{r^2})^{1/2}},$$

squashing the isopotentials along the external field direction. Thus, this is the asymptotic behavior of the gravitational field in any system embedded in a constant external field. Similarly, in QUMOND (Eq. 30), one gets

$$\Phi (x,y,z) = - {{Gm{\nu _e}} \over {\tilde r}},$$


$$\tilde r = r/[1 + ({L_{Ne}}/2)({x^2} + {y^2})/{r^2}],$$

where \(L_{Ne} = (d\ln \nu/d\ln y)_{y = g_{Ne}/a_0}\).

For the exact behavior of the MOND gravitational field in the regime where g and \(g_e\) are of the same order of magnitude, one again resorts to a numerical solver, both for the BM equation case and for the QUMOND case (see Eq. 25 and Eq. 35). For the BM case, one adds the three components of the external field (no longer assumed to be in the z-direction only) in the argument of \(\mu_{M_1}\), which becomes \(\{[(\Phi(B) - \Phi(A))/h - g_{ex}]^2 + [(\Phi(I) + \Phi(H) - \Phi(K) - \Phi(J))/(4h) - g_{ey}]^2 + [(\Phi(C) + \Phi(D) - \Phi(E) - \Phi(F))/(4h) - g_{ez}]^2\}^{1/2}\), and similarly for the other \(M_i\) and \(L_i\) points on the grid (Figure 17). One also adds the respective component of the external field to the term estimating the force at the \(M_i\) and \(L_i\) points in Eq. 25. With \(M_1\), for instance, one changes \((\Phi_{i+1,j,k} - \Phi_{i,j,k}) \rightarrow (\Phi_{i+1,j,k} - \Phi_{i,j,k} - h g_{ex})\) in the first term of Eq. 25. One then solves this discretized equation with the large radius boundary condition for the Dirichlet problem given by Eq. 61 instead of Eq. 20. Exactly the same is applicable to calculating the phantom dark matter component of QUMOND with Eq. 35, except that now the Newtonian external field is added to the terms of the equation in exactly the same way.

This external field effect (EFE) is a remarkable property of MONDian theories, and because it breaks the strong equivalence principle, it allows us to derive properties of the gravitational field in which a system is embedded from its internal dynamics (and not only from tides). For instance, the return to a Newtonian (Eq. 61 or Eq. 63) instead of a logarithmic (Eq. 20) potential at large radii is what defines the escape speed in MOND. By observationally estimating the escape speed from a system (e.g., the Milky Way escape speed from our local neighborhood; see discussion in Section 6.5.2), one can estimate the amplitude of the external field in which the system is embedded, and by measuring the shape of its isopotential contours at large radii, one can determine the direction of that external field, without resorting to tidal effects. It is also noteworthy that the phantom dark matter has a tendency to become negative in “conoidal” regions perpendicular to the external field direction (see Figure 3 of [490]): with accurate-enough weak-lensing data, detecting these pockets of negative phantom densities could, in principle, be a smoking gun for MOND [490], but such an effect would be extremely sensitive to the detailed distribution of the baryonic matter. A final important remark about the EFE is that it prevents most possible MOND effects in Galactic disk open clusters or in wide binaries, apart from a possible rescaling of the gravitational constant. Indeed, for wide binaries located in the solar neighborhood, the galactic EFE (coming from the distribution of mass in our galaxy) is about 1.5 × a0. The corresponding rescaling of the gravitational constant then depends on the choice of the μ-function, but could typically account for up to a 50% increase of the effective gravitational constant. Although this is not, properly speaking, a MOND effect, it could still perhaps imply a systematic offset of mass for very-long-period binaries. However, any effect of the type claimed to be observed by [188] would not be expected in MOND due to the external field effect.

6.4 MOND in the solar system

The primary place to test modified gravity theories is, of course, the solar system, where general relativity has, until now, passed all the proposed tests. Detecting a deviation from Einsteinian gravity in our backyard would actually be the holy grail of modified gravity theories, in the same sense as direct detection in the lab is the holy grail of the CDM paradigm. However, MOND anomalies typically manifest themselves only in the weak-gravity regime, several orders of magnitude below the typical gravitational field exerted by the Sun on, e.g., the inner planets. But in the case of modified inertia (Section 6.1.1), the anomalous acceleration at any location depends on properties of the whole orbit (non-locality), so that anomalies may appear in the motion of solar system bodies that are on highly-eccentric trajectories taking them to large distances (e.g., long period comets or the Pioneer spacecraft), where accelerations are low [314]. Such MOND effects have been proposed as a possible mechanism for generating the Pioneer anomaly [314, 469], without affecting the motions of planets, whose orbits are fully in the high acceleration regime. On the other hand, in classical, non-relativistic modified gravity theories (Sections 6.1.2 and 6.1.3), small effects could still be observable and would primarily probe two aspects of the theory: (i) the shape of the interpolating function (Section 6.2) in the regime x ≫ 1, and (ii) the external Galactic gravitational field (Section 6.3) acting on the solar system, testing the interpolating function in the regime x ≪ 1.

If, as a first approximation, one considers the solar system as isolated, and the Sun as a point mass, the MOND effect in the inner solar system appears as an anomalous acceleration field in addition to the Newtonian one. In units of \(a_0\), the amplitude of the anomalous acceleration is given by x[1 − μ(x)], which can be constrained from the motion of the inner planets, typically their perihelion precession and the (non)-variation of Kepler’s constant [293, 391, 417]. These constraints typically exclude the whole α-family of interpolating functions (Eq. 46) that are natural for multifield theories such as TeVeS (see Section 6.2 and Section 7) because they yield \(x[1 - \mu(x)] \rightarrow \alpha^{-1} \geq 1\) for x ≫ 1 while it must be smaller than 0.04 at the orbit of Mars [391]Footnote 35. Of course, this does not mean that the μ-function cannot be represented by the α-family in the intermediate gravity regime characterizing galaxies, but it must be modified in the strong gravity regimeFootnote 36. Another potential effect of MOND is anomalously strong tidal stresses in the vicinity of saddle points of the Newtonian potential, which might be tested with the LISA pathfinder [37, 49, 255, 464]. The MOND bubble can be quite big and clearly detectable, or the effect could be small and undetectable, depending on the interpolating function [255, 161].
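
The size of this anomalous acceleration is easy to estimate. The sketch below is an illustration under stated assumptions: it uses the n-family of μ-functions from Section 6.2 (n = 1 being the simple function), the value a0 = 1.2 × 10^−10 m s^−2 quoted in this review, and standard approximate values for the solar gravitational parameter and for the semi-major axis of Mars. The bound 0.04 is the one quoted above from [391].

```python
import math

A0 = 1.2e-10         # m s^-2, value of a0 quoted in this review
GM_SUN = 1.32712e20  # m^3 s^-2, solar gravitational parameter (approximate)
R_MARS = 2.279e11    # m, semi-major axis of Mars, about 1.52 AU (approximate)


def anomalous_accel_n(x, n):
    """x * [1 - mu_n(x)] in units of a0, for mu_n(x) = x / (1 + x**n)**(1/n).

    Written with expm1/log1p because 1 - mu_n(x) is far below the rounding
    error of mu_n(x) itself when x is very large.
    """
    return -x * math.expm1(-math.log1p(x ** (-n)) / n)


# Newtonian gravity of the Sun at the orbit of Mars, in units of a0.
x_mars = GM_SUN / R_MARS**2 / A0

if __name__ == "__main__":
    print("x at Mars:", x_mars)
    for n in (1, 2):
        print(n, anomalous_accel_n(x_mars, n))
```

With these inputs, the simple function leaves an anomalous acceleration of order a0 at Mars, far above the quoted bound, whereas for n = 2 it scales as 1/(2x) and is negligible.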

The approximation of an isolated solar system being incorrect, it is also important to add the effect of the external field from the Galaxy. Its amplitude is typically on the order of 1.5 × a0. From there, Milgrom [314] has predicted (both analytically and numerically) a subtle anomaly in the form of a quadrupole field that may be detected in planetary and spacecraft motions (as subsequently confirmed by [62, 185]). This has been used to constrain the form of the interpolating function in the weak acceleration regime characteristic of the external field itself. Constraints have essentially been set on the n-family of μ-functions from the perihelion precession of Saturn [63, 154], namely that one must have n > 8 in order to fit these dataFootnote 37.

However, it should be noted that it is slightly inconsistent to compare the classical predictions of MOND with observational constraints obtained by a global fit of solar system orbits using a fully-relativistic first-post-Newtonian model. Although the above constraints on classical MOND models are useful guides, proper constraints can only truly be set on the various relativistic theories presented in Section 7, the first-order constraints on these theories coming from their own post-Newtonian parameters [65, 99, 173, 372, 391, 450]. What is more, and perhaps making all these tests unnecessary, it has recently been shown that it is possible to cancel any deviation from general relativity at small distances in most of these relativistic theories, independently of the form of the μ-function [22].

6.5 MOND in rotationally-supported stellar systems

6.5.1 Rotation curves of disk galaxies

The root and heart of MOND, as modified inertia or modified gravity, is Milgrom’s formula (Eq. 7). Up to some small corrections outside of symmetrical situations, this formula yields (once \(a_0\) and the form of the transition function μ are chosen) a unique prediction for the total effective gravity as a function of the gravity produced by the visible baryons. It is absolutely remarkable that this formula, devised 30 years ago, has been able to successfully predict an impressive number of galactic scaling relations (the “Kepler-like” laws of Section 5.2, backed by the modern data of Section 4.3) that were very imprecise and/or unobserved at the time, and which still are a puzzle to understand in the ΛCDM framework. What is more, this formula not only predicts global scaling relations successfully: we show in this section that it also predicts the shape and amplitude of galactic rotation curves at all radii with uncanny precision, and this for all disk galaxy Hubble types [168, 402]. Of course, the exact prediction of MOND depends on the exact formulation of MOND (as modified inertia or some form or other of modified gravity), but the differences are small compared to observational error bars, and even compared with the differences between various μ-functions.

In order to illustrate this, we plot in Figure 20 the theoretical rotation curve of an HSB exponential disk (see [145] for exact parameters) computed with three different formulations of MONDFootnote 38: Milgrom’s formula (Eq. 7), representative of circular orbits in modified inertia, AQUAL (Eq. 17), and a multi-field theory (Eq. 40) representative of a whole class of relativistic theories (see Sections 7.1 to 7.4), all with the α = n = 1 “simple” μ-function of Eq. 46 and Eq. 49. One can see velocity differences of only a few percent in this case, while, in general, it has been shown that the maximum difference between formulations is on the order of 10% for any type of disk [76]. This justifies using Milgrom’s formula as a proxy for MOND predictions on rotation curves, keeping in mind that, in order to constrain MOND within the modified gravity framework, one should actually calculate predictions of the various modified Poisson formulations of Section 6.1 for each galaxy model, and for each choice of galaxy parameters [18].

Figure 20

Comparison of theoretical rotation curves for the inner parts (before the rotation curve flattens) of an HSB exponential disk [145], computed with three different formulations of MOND. Green: Milgrom’s formula; Blue: Bekenstein-Milgrom MOND (AQUAL); Red: TeVeS-like multi-field theory. Image reproduced by permission from [145], copyright by APS.

The procedure is then the following (see Section 4.3.4 for more detail). One usually assumes that light traces stellar mass (constant mass-to-light ratio, but see the counter-example M33), and one adds to this baryonic density the contribution of observed neutral hydrogen, scaled up to account for the contribution of primordial helium. The Newtonian gravitational force of baryons is then calculated via the Newtonian Poisson equation, and the MOND force is simply obtained via Eq. 7 or Eq. 10. First, an interpolating function must be chosen; one can then determine the value of a0 by fitting, all at once, a sample of high-quality rotation curves with small distance uncertainties and no obvious non-circular motions. Then, all individual rotation curve fits can be performed with the mass-to-light ratio of the disk as the single free parameter of the fitFootnote 39. It turns out that using the simple interpolating function (α = n = 1, see Eqs. 46 and 49) yields a value of \(a_0 = 1.2 \times 10^{-10}\ \mathrm{m\,s^{-2}}\), and excellent fits to galaxy rotation curves [166]. However, as already pointed out in Sections 6.3 and 6.4, this interpolating function yields too strong a modification in the solar system, so hereafter we use the γ = δ = 1 interpolating function of Eqs. 52 and 53 (solid blue line on Figure 19), which is very similar to the simple interpolating function in the intermediate to weak gravity regime.
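The last step of this procedure (going from the Newtonian baryonic acceleration to the MOND one) can be sketched in a few lines. This is only an illustration, not the fitting code used for the figures: it assumes the algebraic form μ(x) = x/(1 + x) for the simple interpolating function, whose inverse is ν(y) = 1/2 + (1/4 + 1/y)^(1/2), and the function and constant names are hypothetical.

```python
import math

KPC_IN_M = 3.0857e19   # metres per kiloparsec
A0 = 1.2e-10           # m s^-2, the value of a0 quoted in the text


def nu_simple(y):
    """Boost factor g/g_N for the assumed 'simple' function mu(x) = x / (1 + x)."""
    return 0.5 + math.sqrt(0.25 + 1.0 / y)


def mond_velocity(v_newton_kms, r_kpc, a0=A0):
    """Circular speed (km/s) from Milgrom's formula, given the Newtonian speed."""
    r_m = r_kpc * KPC_IN_M
    g_newton = (v_newton_kms * 1.0e3) ** 2 / r_m    # acceleration due to baryons
    g_mond = nu_simple(g_newton / a0) * g_newton   # total effective acceleration
    return math.sqrt(g_mond * r_m) / 1.0e3
```

At high acceleration the returned speed stays close to the Newtonian input, while at low acceleration it approaches the deep-MOND value (V_N² R a0)^(1/4). In an actual fit, the Newtonian speed would itself be rescaled by the trial stellar mass-to-light ratio before this mapping is applied.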

Figure 21 shows two examples of detailed MOND fits to rotation curves of Figure 13. The black line represents the Newtonian contribution of stars and gas and the blue line is the MOND fit, the only free parameter being the stellar mass-to-light ratioFootnote 40. Not only does MOND predict the general trend for LSB and HSB galaxies, it also predicts the observed rotation curves in great detail. This procedure has been carried out for 78 nearby galaxies (all galaxy rotation curves to which the authors have access), and the residuals between the observed and predicted velocities, at every point in all these galaxies (thus about two thousand individual measurements), are plotted in Figure 23. As an illustration of the variety and richness of rotation curves fitted by MOND, as well as of the range of magnitude of the discrepancies covered, we display in Figure 24 fits to rotation curves of extremely massive HSB early-type disk galaxies [402] with Vf up to 400 km/s, and in Figure 25 fits to very low mass LSB galaxies [324] with Vf down to 15 km/s. In the latter, gas-rich, small galaxies, the detailed fits are insensitive to the exact form of the interpolating function (Section 6.2) and to the stellar mass-to-light ratio [168, 324]. We then display in Figure 26 eight fits for representative galaxies from the latest high-resolution THINGS survey [166, 481], and in Figure 27 six fits of yet other LSB galaxies (as these provide strong tests of MOND and depend less on the exact form of the interpolating function than HSB ones) from [120], updated with high resolution Hα data [242, 241]. The overall results for the whole sample of 78 nearby galaxies (Figure 23) are globally very impressive, although there are a few outliers among the 2000 measurements. These are but a few trees outlying from a very clear forest. It is actually only as the quality of the data declines [384] that one begins to notice small disparities.
These are sometimes attributable to external disturbances that invalidate the assumption of equilibrium [403], non-circular motions or poor observational resolution. For targets that are intrinsically difficult to observe, minor problems become more common [120, 448]. These typically have to do with the challenges inherent in combining disparate astronomical data sets (e.g., rotation curves measured independently at optical and radio wavelengths) and constraining the inclinations. A single individual galaxy that can be considered as a bit problematic is NGC 3198 [68, 166], but this could simply be due to a problem with the potentially too high Cepheid-based distance (reddening problem mentioned in [254]). Indeed, the adopted distance plays an important role in the MOND fitting procedure, as the value of the centripetal acceleration \(V_c^2/R\) depends on the distance through the conversion of the observed angular radius in arcsec into the physical radius R in kpc. Note that other galaxies such as NGC 2841 have historically posed problems for MOND, but these have largely gone away with modern data (see [166] and Figure 26).
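The distance dependence mentioned above follows from simple geometry, and a short sketch may make it concrete. The helper names below are hypothetical and the input values are arbitrary; the only physics assumed is the small-angle conversion R = D θ and the definition of the centripetal acceleration.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)   # radians per arcsecond
KPC_IN_M = 3.0857e19                         # metres per kiloparsec


def physical_radius_kpc(theta_arcsec, distance_mpc):
    """Small-angle conversion of an angular radius into a physical radius in kpc."""
    return distance_mpc * 1.0e3 * theta_arcsec * ARCSEC_IN_RAD


def centripetal_acceleration(v_kms, theta_arcsec, distance_mpc):
    """V^2 / R in m s^-2 for a velocity measured at a given angular radius."""
    r_m = physical_radius_kpc(theta_arcsec, distance_mpc) * KPC_IN_M
    return (v_kms * 1.0e3) ** 2 / r_m
```

Since the observed velocity does not depend on distance, the inferred acceleration scales as 1/D: an overestimated distance pushes every point of the rotation curve to artificially lower accelerations, and hence deeper into the MOND regime than it really is.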

Figure 21

Examples of detailed MOND rotation curve fits of the HSB and LSB galaxies of Figure 13 (NGC 6946 on the left and NGC 1560 on the right). The black line represents the Newtonian contribution of stars and gas as determined by numerical solution of the Newtonian Poisson equation for the observed light distribution, as per Figure 13. The blue line is the MOND fit with the γ = δ = 1 function of Eq. 52 and Eq. 53, the only free parameter being the stellar mass-to-light ratio. In the K-band, the best fit value is 0.37 M⊙/L⊙ for NGC 6946 and 0.18 M⊙/L⊙ for NGC 1560. In practice, the best fit mass-to-light ratio can co-vary with the distance to the galaxy and a0; here a0 is held fixed (\(1.2 \times 10^{-10}\ \mathrm{m\,s^{-2}}\)) and the distance has been held fixed to the best observed value (5.9 Mpc for NGC 6946 [220] and 3.45 Mpc for NGC 1560 [219]). Milgrom’s formula provides an effective mapping between the rotation curve predicted by the observed baryons and the observed rotation, including the bumps and wiggles.

Figure 22

The rotation curve [124] and MOND fit [384] of the Local Group spiral M33 assuming a constant stellar mass-to-light ratio (top panel). While the overall shape is a good match, there is a slight mismatch at ∼ 3 kpc and above 7 kpc. The observed color gradient implies a slight variation in the mass-to-light ratio, in the sense that the stars at small radii are slightly redder and heavier than those at large radii. Applying stellar population models [42] to the observed color gradient produces a slight adjustment of the Newtonian mass model. The dotted line in the lower panel reiterates the constant M/L model from the top panel, while the solid line has been corrected for the observed color gradient. This slight adjustment to the baryonic mass distribution considerably improves the fit.

Figure 23

Residuals of MOND fits to the rotation curves of 78 nearby galaxies (all data to which the authors have access) including about two thousand individual resolved measurements. Data for 21 galaxies are either new or improved in terms of spatial resolution and velocity accuracy over those in [401]. More accurate points are illustrated with larger symbols. The histogram of residuals is plotted on the right panel, and is well fitted by a Gaussian of width Δv/v ∼ 0.04. The bulk of the more accurate data are in good accord with MOND. There are a few deviant points, mostly at small radii where non-circular motions are ubiquitous and observational resolution (beam smearing) can be a challenge. These are but a few trees outlying from a very clear forest.

Figure 24

Examples of MOND fits (blue lines, using Eq. 53 with δ = 1) to two massive galaxies [402]. With baryonic masses in excess of \(10^{11}\,M_\odot\), these are among the most massive, rapidly rotating disk galaxies known. Stars dominate the mass, and Newtonian dynamics suffices to explain the innermost regions because of the high acceleration, but the mass discrepancy becomes apparent as the Keplerian decline (black lines) falls well below the data at the enormous radii spanned by these giant disks (the diameter of UGC 2487 spans half a million light-years).

Figure 25

Examples of MOND fits (blue lines) to two dwarf galaxies [324]. The data for DDO 210 come from [29], and those for UGC 11583 (also known as KK98 250) are from [30] augmented with high resolution data from [281, 242]. The high gas content of these galaxies makes them strong tests of MOND, as the one fit parameter (the mass-to-light ratio of the stars) has only a minor impact on the fit. What is more, as they are deep in the MOND regime, the exact form of the interpolating function (Section 6.2) also has little impact on the fits, making them the cleanest tests of MOND, with essentially no wiggle room. Note that, with a mass of only a few million solar masses (comparable in mass to the largest globular clusters), the Local Group dwarf DDO 210 is the smallest galaxy known to show clear rotation (Vf ∼ 15 km/s). It is the lowest point in Figure 3.
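The mass scale quoted for DDO 210 can be checked against its asymptotic velocity with a one-line calculation. The sketch below assumes the deep-MOND relation Vf⁴ = G M a0 between the flat rotation velocity and the total baryonic mass; the constants are standard rounded values and the function name is hypothetical.

```python
G = 6.674e-11      # m^3 kg^-1 s^-2
A0 = 1.2e-10       # m s^-2, the value of a0 quoted in the text
M_SUN = 1.989e30   # kg


def btfr_mass_msun(v_flat_kms, a0=A0):
    """Baryonic mass (solar masses) implied by the assumed relation V_f^4 = G M a0."""
    return (v_flat_kms * 1.0e3) ** 4 / (G * a0) / M_SUN
```

For Vf = 15 km/s this gives a mass of order a few million solar masses, in line with the figure quoted in the caption. Because the mass scales as the fourth power of the velocity, a 10% error on Vf translates into an error of roughly 40% on the implied mass.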

Figure 26

MOND rotation curve fits for representative galaxies from the THINGS survey [121, 166, 481]. Galaxies are chosen to illustrate a broad range of mass, from \(M_b \sim 3 \times 10^{8}\,M_\odot\) to \(\sim 3 \times 10^{11}\,M_\odot\). All galaxies have high resolution interferometric 21 cm data for the gas and 3.6 μm photometry for mapping the stars. The Newtonian baryonic mass model is shown as a black line and the MOND fit as a blue line (as in Figure 21). The fits use the interpolating function of Eq. 53 with δ = 1.

Figure 27

MOND rotation curve fits for LSB galaxies [120] updated with high resolution Hα data [242, 241] and using Eq. 53 with δ = 1. LSB galaxies are important tests of MOND because their low surface densities (Σ ≪ a0/G) place them well into the MOND regime everywhere, and the exact form of the interpolating function is rather unimportant. Their baryonic mass models fall well short of explaining the observed rotation at any but the smallest radii in Newtonian dynamics, and MOND nevertheless provides the necessary additional force everywhere (lines as per Figure 21).
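To give a sense of scale for the condition Σ ≪ a0/G, the characteristic surface density a0/G can be converted into the units usually used for galactic disks. This is a rough sketch with standard rounded constants; the function names are hypothetical.

```python
G = 6.674e-11        # m^3 kg^-1 s^-2
A0 = 1.2e-10         # m s^-2, the value of a0 quoted in the text
M_SUN = 1.989e30     # kg
PC_IN_M = 3.0857e16  # metres per parsec


def critical_surface_density_msun_pc2(a0=A0):
    """Characteristic MOND surface density a0 / G in solar masses per square parsec."""
    return (a0 / G) / (M_SUN / PC_IN_M ** 2)


def surface_density_ratio(sigma_msun_pc2, a0=A0):
    """Ratio Sigma / (a0 / G); values far below unity indicate the deep-MOND regime."""
    return sigma_msun_pc2 / critical_surface_density_msun_pc2(a0)
```

With these constants a0/G comes out at several hundred solar masses per square parsec, so a disk with a surface density of order one solar mass per square parsec sits nearly three orders of magnitude below the transition.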

We finally note that what makes all these rotation curve fits really impressive is that either (i) stellar mass-to-light ratios are unimportant (in the case of gas-rich galaxies) yielding excellent fits with essentially zero free parameters (apart from some wiggle room on the distance), or (ii) stellar mass-to-light ratios are important, and their best-fit values, obtained on purely dynamical grounds assuming MOND, vary with galaxy color as one would expect on purely astrophysical grounds from stellar population synthesis models [42]. There is absolutely nothing built into MOND that would require that redder galaxies should have higher stellar mass-to-light ratios in the B-band, but this is what the rotation curve fits require. This is shown in Figure 28, where the best-fit mass-to-light ratio in the B-band is plotted against B − V color index (left panel), and the same for the K-band (right panel).

Figure 28

A comparison of the mass-to-light ratios obtained from MOND rotation curve fits (points) with the independent expectations of stellar population synthesis models (lines) [42]. The mass-to-light ratio in the optical (blue B-band, left) and near-infrared (2.2 µm K-band, right) are shown as a function of B − V color (the ratio of blue to green light). The one free parameter of MOND rotation curve fits reproduces the normalization, slope, and scatter expected from what we know about stars. Not all galaxies illustrated here have both B and K-band data. Some have neither, instead having photometry in some other bandpass (e.g., V or R or I).

6.5.2 The Milky Way

Our own Milky Way galaxy (an HSB galaxy) is a unique laboratory within which present and future surveys will allow us to perform many precision tests of MOND (at a level of precision that might even discriminate between the various versions of MOND described in Section 6.1) that are not feasible with external galaxies. However, concerning the rotation curve, the test is, at present, not the most conclusive, as the outer rotation curve of the Milky Way is paradoxically much less precisely known than that of external galaxies (the forthcoming Gaia mission should improve this situation, although the rotation curve will not be measured directly). Nevertheless, past studies of the inner rotation curve of the Milky Way [141, 142, 274], measured with the tangent point method, compared to the baryonic content of the inner Galaxy [53, 155], have shown full agreement between the rotation curve and MOND, assuming, as usual, the simple interpolating function (α = n = 1 in Eqs. 46 and 49) or the γ = δ = 1 interpolating function (Eqs. 52 and 53). The inverse problem was also tackled, i.e., deriving the surface density of the inner Milky Way disk from its rotation curve (see Figure 29): this exercise [274] led to a derived surface density fully consistent with star count data, and even reproducing the details of bumps and wiggles in the surface brightness (Renzo’s rule, Section 4.3.4), while being fully consistent with the (somewhat imprecise) constraints on the outer rotation curve of the galaxy [494].

Figure 29

The mass distribution of the Milky Way disk (left) inferred from fitting in MOND the observed bumps and wiggles in the rotation curve of the galaxy (right) [274]. The Newtonian contributions of the stellar and gas disk are shown as dashed and dotted lines as per Figure 13. The resulting model is consistent with independent star count data [155] and compares favorably to constraints on the rotation curve at radii beyond those included in the fit [494]. The prominent feature at R ≈ 6 kpc corresponds to the Centaurus spiral arm.

However, especially with the advent of present and future astrometric and spectroscopic surveys, the Milky Way offers a unique opportunity to test many other predictions of MOND. These include the effect of the “phantom dark disk” (see Figure 18) on vertical velocity dispersions and on the tilt of the stellar velocity ellipsoid, the precise shape of tidal streams around the galaxy, or the effects of the external gravitational field in which the Milky Way is embedded on fundamental parameters such as the local escape speed. All these predictions can nevertheless vary slightly depending on the exact formulation of MOND (mainly Bekenstein-Milgrom MOND, QUMOND, or multi-field theories, the predictions being in any case difficult to make in modified inertia versions of MOND when non-circular orbits are considered). Most of the predictions made to date and reviewed hereafter have used the Bekenstein-Milgrom version of MOND (Eq. 17).

Based on the baryonic distribution from, e.g., the Besançon model of the Milky Way [366], one can compute the MOND gravitational field of the Galaxy by solving the BM-equation (Eq. 17). This has been done in [490]. Then one can apply the Newtonian Poisson equation to it, in order to recover the density distribution that would have yielded this potential within Newtonian dynamics [50, 140]. In this context, as already shown (Figure 18), MOND predicts a disk of “phantom dark matter” allowing one to clearly differentiate it from a Newtonian model with a dark halo:

(i) By measuring the force perpendicular to the galactic plane: at the solar radius, MOND predicts a 60 percent enhancement of the dynamic surface density at 1.1 kpc above the plane compared to the baryonic surface density, a value in agreement with current data (Table 1, see also [339]). The enhancement would become more apparent at large galactocentric radii where the stellar disk mass density becomes negligible.

(ii) By determining dynamically the scale length of the disk mass density distribution. This scale length is a factor ∼ 1.25 larger than the scale length of the visible stellar disk if Bekenstein-Milgrom MOND applies. Such a test could be applied with existing RAVE data [423], but the accuracy of available proper motions still limits the possibility to explore the gravitational forces too far from the solar neighborhood.

(iii) By measuring the velocity ellipsoid tilt angle within the meridional galactic plane. This tilt is different within the MOND and Newton+dark halo cases in the inner part of the Galactic disk. The tilt of about 6 degrees at z = 1 kpc at the solar radius is in agreement with the recent determination of 7.3 ± 1.8 degrees obtained by [422]. The difference between MOND and a Newtonian model with a spherical halo becomes significant at z = 2 kpc. Interestingly, recent data [328] on the tilt of the velocity ellipsoid at these heights clearly favor the MOND prediction [50].
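The “Newtonist” reading that underlies these three tests can be written compactly. The following is a sketch in generic notation, which may differ from the numbered equations of the review: Φ denotes the MOND potential obtained from Eq. 17, ρ_b the baryonic density, ρ_ph the phantom dark matter density, and Σ_dyn(z_0) the dynamical surface density within a height z_0 of the plane.

```latex
\rho_{\rm ph} = \frac{\nabla^{2}\Phi}{4\pi G} - \rho_{b},
\qquad
\Sigma_{\rm dyn}(z_{0}) = \int_{-z_{0}}^{+z_{0}} \left( \rho_{b} + \rho_{\rm ph} \right) \, dz
```

In this notation, the enhancement quoted in (i) is the ratio of Σ_dyn to the purely baryonic surface density, both evaluated at z_0 = 1.1 kpc at the solar radius.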

Table 1 Values predicted from the Besançon model of the Milky Way in MOND as seen by a Newtonist (i.e., in terms of phantom dark matter contributions) compared to current observational constraints in the Milky Way, for the local dynamical surface density and the tilt of the stellar velocity ellipsoid [50]. Predictions for a round dark halo without a dark disk are also compatible with the current constraints, though [194, 422]. The tilt at z =2 kpc should be more discriminating.

Such tests of MOND could be applied with the first release of future Gaia data. To fix ideas on the current local constraints, the predictions of the Besançon MOND model are compared with the relevant observations in Table 1. However, let us note that these predictions are extremely dependent on the baryonic content of the model [53, 155, 366], so that testing MOND at the precision available in the Milky Way heavily relies on star counts, stellar population synthesis, census of the gaseous content (including molecular gas), and inhomogeneities in the baryonic distribution (clusters, gas clouds).

Another test of the predictions of MOND for the gravitational potential of the Milky Way is the thickness of the HI layer as a function of position in the disk (see Section 6.5.3): it has been found [378] that Bekenstein-Milgrom MOND and its phantom disk successfully account for the most recent and accurate measurements of the flaring of the HI layer beyond 17 kpc from the center, but slightly underpredict the scale-height in the region between 10 and 15 kpc. This could indicate that the local stellar surface density in this region should be slightly smaller than usually assumed, in order for MOND to predict a less massive phantom disk and hence a thicker HI layer. Another explanation for this discrepancy would rely on non-gravitational phenomena, namely ordered and small-scale magnetic fields and cosmic rays contributing to support the disk.

Yet another test would be the comparison of the observed Sagittarius stream [198, 248] with the predictions made for a disrupting galaxy satellite in the MOND potential of the Milky Way. Basic comparisons of the stream with the orbit of a point mass have shown agreement at zeroth order [358]. In reality, such an analysis is not straightforward because streams do not delineate orbits, and because of the non-linearity of MOND. However, by combining a MOND N-body code with a Bayesian technique [474] in order to efficiently explore the parameter space, it should be possible to rigorously test MOND with such data in the near future, including for external galaxies, which will lead to an exciting battery of new observational tests of MOND.

Finally, a last test of MOND in the Milky Way involves the external field effect of Section 6.3. As explained there, the return to a Newtonian (Eq. 61 or Eq. 63) instead of a logarithmic (Eq. 20) potential at large radii defines the escape speed in MOND. By observationally estimating the escape speed from a system (e.g., the Milky Way escape speed from our local neighborhood), one can estimate the amplitude of the external field in which the system is embedded. With simple analytical arguments, it was found [144] that with an external field of 0.01 a0, the local escape speed at the Sun’s radius was about 550 km/s, exactly as observed (within the observational error range [433]). This was later confirmed by rigorous modeling in the context of Bekenstein-Milgrom MOND and with the Besançon baryonic model of the Milky Way [492]. This value of the external field, \(10^{-2}\,a_0\), corresponds to the order of magnitude of the gravitational field exerted by Large Scale Structure, estimated from the acceleration endured by the Local Group during a Hubble time in order to attain a peculiar velocity of 600 km/s.
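The order-of-magnitude estimate in the last sentence is easy to reproduce. The sketch below simply divides the peculiar velocity by a Hubble time; the Hubble constant of 70 km/s/Mpc is an assumption made here for illustration (it is not taken from the text), and the function name is hypothetical.

```python
A0 = 1.2e-10          # m s^-2, the value of a0 quoted in the text
MPC_IN_M = 3.0857e22  # metres per megaparsec


def external_field_in_a0(v_pec_kms=600.0, h0_kms_mpc=70.0):
    """Mean acceleration needed to reach v_pec in one Hubble time, in units of a0."""
    hubble_time_s = MPC_IN_M / (h0_kms_mpc * 1.0e3)   # 1 / H0, assumed value of H0
    g_ext = v_pec_kms * 1.0e3 / hubble_time_s         # constant-acceleration estimate
    return g_ext / A0
```

With these inputs the result is close to 0.01, i.e., an external field of about one percent of a0, matching the order of magnitude quoted above. The estimate scales linearly with both the peculiar velocity and the assumed Hubble constant.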

6.5.3 Disk stability and interacting galaxies

A lot of questions in galaxy dynamics require using N-body codes. This is notably necessary for studying stability of galaxy disks, the formation of bars and spirals, or highly time-varying configurations such as galaxy mergers. As we have seen in Section 6.1.2, the BM modified Poisson equation (Eq. 17) can be solved numerically using various methods [50, 77, 96, 147, 250, 457]. Such a Poisson solver can then be used in particle-mesh N-body codes. More general codes based on QUMOND (Section 6.1.3) are currently under development.

The main results obtained via these simulations are the following (the comparison with observations will be discussed below):

(i) LSB disks are more unstable regarding bar and spiral instabilities in MOND than in the Newton+spherical halo equivalent case,

(ii) Bars always tend to appear more quickly in MOND than in the Newton+spherical halo equivalent, and are not slowed down by dynamical friction, leading to fast bars,

(iii) LSB disks can be both very thin and extended in MOND thanks to the effect of the “phantom disk”, and vertical velocity dispersions level off at 8 km/s, instead of 2 km/s for Newtonian disks,

(iv) Warps can be created in apparently isolated galaxies from the external field effect of large scale structure in MOND,

(v) Merging time-scales are longer in MOND for interacting galaxies,

(vi) Reproducing interacting systems such as the Antennae requires relatively fine-tuned initial conditions in MOND, but the resulting galaxy is more extended and thus closer to observations, thanks to the absence of angular momentum transfer to the dark halo.

Concerning the first point (i), Brada & Milgrom [77] investigated the important problem of stability of disk galaxies. They demonstrated that MOND, as anticipated [299], has an effect similar to a dark halo in stabilizing a rotationally-supported disk, thereby explaining the upper limit in surface density seen in the data (Section 4.3.2), and also showing how it damps the growth-rate of bar-forming modes in the weak gravitational field regime. In a comparison of MOND disks with the equivalent Newtonian+halo counterpart (with identical rotation curves), they found that, as the surface density of the disk decreases, the growth-rate of the bar-forming mode decreases similarly in both cases. However, in the limit of very low surface densities, typical of LSB galaxies, the MOND growth rate stops decreasing, contrary to the Newton+dark halo case (Figure 30). This could provide a solution to the stability challenge of Section 4.2, as observed LSBs do exhibit bars and spirals, which would require an ad hoc dark component within the self-gravitating disk of the Newtonian system. One can also see in this figure that if the surface density is typical of intermediate HSB galaxies, the bar systematically forms more quickly in MOND.

Figure 30

The scaled growth-rate of the m = 2 instability in Newtonian disks with a dark halo (dotted line) and MONDian disks (solid line) as a function of disk mass. In the MOND case, as the disk mass decreases, the surface density decreases and the disk sinks deeper into the MOND regime. However, at very low masses the growth-rate saturates. In the equivalent Newtonian case, the rotation curve is maintained at the MOND level by supplementing the force with a round stabilizing dark halo, which causes the growth-rate to crash [77, 401]. An ad-hoc dark disk could help maintain the growth rate in the dark matter context. Image reproduced by permission from [401].

This was confirmed in recent simulations [104, 457], where it was additionally found that (ii) the bar is sustained longer, and is not slowed down by dynamical friction against the dark halo, which leads to fast bars, consistent with the observed fast bars in disk galaxies (measured through the position of resonances). However, when gas inflow and external gas accretion are included, a wider range of pattern speeds is found in MOND, all compatible with observations [458]. Since the bar pattern speed has a tendency to stay constant, the resonances remain at the same positions, and particles are trapped on these orbits more easily than in the Newtonian case, which leads to the formation of rings and pseudo-rings as observed (see Figure 31 and Figure 32). All these results have been shown to be independent of the exact choice of interpolating μ-function [458].

Figure 31

(a) The galaxy ESO 509-98. (b) The galaxy NGC 1543. These are two examples of galaxies that exhibit clear ring and pseudo-ring structures. Image courtesy of Tiret, reproduced by permission from [458], copyright by ESO.

Figure 32

Simulations of ESO 509-98 and NGC 1543 in MOND, to be compared with Figure 31. Rings and pseudo-ring structures are well reproduced with modified gravity. Image courtesy of Tiret, reproduced by permission from [458], copyright by ESO.

What is more, (iii) LSB disks can be both very thin and extended in MOND thanks to the stabilizing effect of the “phantom disk”, and vertical velocity dispersions level off at 8 km/s, as typically observed [25, 241], instead of 2 km/s for Newtonian disks with \(\Sigma = 1\,M_\odot\,\mathrm{pc}^{-2}\) (depending on the thickness of the disk). However, the observed value is usually attributed to non-gravitational phenomena. Note that [279] utilized this fact to predict that conventional analyses of LSB disks would infer abnormally high mass-to-light ratios for their stellar populations, a prediction that was subsequently confirmed [159, 371]. But let us also note that this stabilizing effect of the phantom disk, leading to very thin stellar and gaseous layers, could even be too strong in the region between 10 and 15 kpc from the galactic center in the Milky Way (see Section 6.5.2), and in external galaxies [497], even though, as said, non-gravitational effects such as ordered and small-scale magnetic fields and cosmic rays could significantly contribute to the prediction in these regions.

Via these simulations, it has also been shown (iv) that the external field effect of MOND (Section 6.3) offers a mechanism other than the relatively weak effect of tides in inducing and maintaining warps [79]. It was demonstrated that a satellite at the position and with the mass of the Magellanic clouds can produce a warp in the plane of the galaxy with the right amplitude and form [79], and even more importantly, that isolated galaxies could be affected by the external field of large scale structure, inducing a differential precession over the disk, in turn causing a warp [104]. This could provide a new explanation for the puzzle of isolated warped galaxies.

Interactions and mergers of galaxies are (v) very important in the cosmological context of galaxy formation (see also Section 9.2). It has been found [95] from analytical arguments that dynamical friction should be much more efficient in MOND, so that, for instance, bars should slow down and mergers should occur more quickly. But simulations display exactly the opposite effect, in the sense of bars not slowing down and merger time-scales being much longer in MOND [338, 459]. Concerning bars, Nipoti [335] found that they were indeed slowed down more in MOND, as predicted analytically [95], but this is because their bars were unrealistically small compared to observed ones. In reality, the bar takes up a significant fraction of the baryonic mass, and the reservoir of particles to interact with, assumed infinite in the case of the analytic treatment [95], is in reality insufficient to affect the bar pattern speed in MOND. Concerning long merging time-scales, an important constraint from this would be that, in a MONDian cosmology, there should perhaps be fewer mergers, but longer ones than in ΛCDM, in order to keep the total observed amount of interacting galaxies unchanged. This is indeed what is expected (see Section 9.2). What is more, the long merging time-scales would imply that compact galaxy groups do not evolve statistically over more than a crossing time. In contrast, in the Newtonian+dark halo case, the merging time scale would be about one crossing time because of dynamical friction, such that compact galaxy groups ought to undergo significant merging over a crossing time, contrary to what is observed [239]. Let us also note that, in MOND, many passages in binary galaxies will happen before the final merging, with a starburst triggered at each passage, meaning that the number of observed starbursts as a function of redshift cannot be used as an estimate of the number of mergers [104].

Finally, (vi) at a more detailed level, the Antennae system, the prototype of a major merger, has been shown to be nicely reproducible in MOND [459]. This is illustrated in Figure 33. On the contrary, while it is well established that CDM models can result in nice tidal tails, it turns out to be difficult to simultaneously match the narrow morphology of many observed tidal tails with rotation curves of the systems from which they come [130]. In MOND, reproducing the Antennae requires relatively fine-tuned initial conditions, but the resulting tidal tails are narrow and the galaxy is more extended and thus closer to observations than with CDM, thanks to the absence of angular momentum transfer to the dark halo (solution to the angular momentum challenge of Section 4.2).

Figure 33

Simulation of the Antennae with MOND (right, [459]) compared to the observations (left, [190]). In the observations, the gas is represented in blue and the stars in green. In the simulation the gas is in blue and the stars are in yellow/red. Image courtesy of Tiret, reproduced by permission from [459], copyright by ASP.

6.5.4 Tidal dwarf galaxies

As seen in, e.g., Figure 33, left panel, major mergers between spiral galaxies are frequently observed with dwarf galaxies at the extremity of their tidal tails, called Tidal Dwarf Galaxies (TDG). These young objects are formed through gravitational instabilities within the tidal tails, leading to local collapse of gas and star formation. These objects are very common in interacting systems: in some cases dozens of such condensations are seen in the tidal tails, with a few having a mass typical of other dwarf galaxies in the Universe. However, in the ΛCDM model, these objects are difficult to form, and require very extended dark matter distributions [71]. In MOND simulations [459, 104], the exchange of angular momentum occurs within the disks, whose sizes are inflated. For this reason, it is much easier with MOND to form TDGs in extended tidal tails.

What is more, in the ΛCDM context, these objects are not expected to drag CDM around them, the reason being that these objects are formed out of the material in the tidal tails, itself made of the dynamically cold, rotating, material in the progenitor disk galaxies. In these disks, the local ratio of dark matter to baryons is close to zero. For this reason, the ΛCDM prediction is that these objects should not exhibit a mass discrepancy problem. However, the first ever measurement of the rotation curve of three TDGs in the NGC 5291 ring system (Figure 34) has revealed the presence of dark matter in these three objects [72]. A solution to explain this in the standard picture could then be to resort to dark baryons in the form of cold molecular gas in the disks of the progenitor galaxies. However, it is very surprising that a very different kind of dark matter, in this case baryonic dark matter, would conspire to assemble itself precisely in the right way such as to put the three TDGs (see Section 4.3.1) on the baryonic Tully-Fisher relation (when this baryonic dark matter is not taken into account in the baryonic budget of the BTF). Another possibility, not resorting to baryonic dark matter, would be that, by chance, the three TDGs have been observed precisely edge-on. However, if we simply consider the most natural inclination coming from the geometry of the ring (i = 45°, see [72]), and apply Milgrom’s formula to the visible matter distribution with zero free parameters [165, 309], one gets very reasonable curves (Figure 35). Playing around a little bit with the inclinations allows perfect fits to these rotation curves [165], while the influence of the external field effect has been shown not to significantly change the result. Therefore, we can conclude that ΛCDM has severe problems with these objects, while MOND does exceedingly well in explaining their observed rotation curves.
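
As a purely illustrative sketch of what applying Milgrom’s formula to the visible matter involves, the snippet below converts a Newtonian (baryons-only) circular velocity into its MOND counterpart. It assumes the “simple” μ-function in the form μ(x) = x/(1 + x), the algebraic relation μ(g/a0) g = g_N, and a value a0 ≈ 1.2 × 10⁻¹⁰ m s⁻²; the function names and the input numbers are hypothetical, and this is not the fitting procedure of [165].

```python
import math

# Assumed value: a0 ~ 1.2e-10 m/s^2, expressed in (km/s)^2 per kpc.
A0 = 3.7e3


def mond_acceleration(g_newton, a0=A0):
    """Solve mu(g/a0) * g = g_N for g, with the 'simple' mu(x) = x / (1 + x)."""
    # The relation reduces to the quadratic g^2 - g_N g - g_N a0 = 0.
    return 0.5 * g_newton + math.sqrt(0.25 * g_newton ** 2 + g_newton * a0)


def mond_circular_velocity(v_newton, radius_kpc, a0=A0):
    """MOND circular velocity (km/s) from the Newtonian baryonic one (km/s)."""
    g_newton = v_newton ** 2 / radius_kpc
    return math.sqrt(mond_acceleration(g_newton, a0) * radius_kpc)


# Hypothetical input: the baryons alone would give 20 km/s at 4 kpc.
v_mond = mond_circular_velocity(20.0, 4.0)
print(f"Newtonian: 20.0 km/s, MOND: {v_mond:.1f} km/s")
```

Since no parameter other than a0 enters, the only freedom left in such a calculation lies in the observational inputs (here the inclination and the baryonic mass distribution), which is the sense in which the prediction has zero free parameters.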

Figure 34

The NGC 5291 system [72]. VLA atomic hydrogen 21-cm map (blue) superimposed on an optical image (white). The UV emission observed by GALEX (red) traces dense star-forming concentrations. The most massive of these objects are rotating with the projected spin axis as indicated by dashed arrows. The three most massive ones are denoted as NGC5291N, NGC5291S, and NGC5291W. Image courtesy of Bournaud, reproduced by permission from [72].

Figure 35

Rotation curves of the three TDGs in the NGC 5291 system. In red: ΛCDM prediction (with no additional cold molecular gas), with the associated uncertainties. In black: MOND prediction with the associated uncertainties (prediction with zero free parameter, “simple” μ-function assumed). Image reproduced by permission from [165], copyright by ESO.

However, the observations of only three TDGs are, of course, not enough, from a statistical point of view, in order for this result to be as robust as needed. Many other TDGs should be observed to randomize the uncertainties, and consolidate (or invalidate) this potentially extremely important result, that could allow one to really discriminate between Milgrom’s law being either a consequence of some fundamental aspect of gravity (or of the nature of dark matter), or simply a mere recipe for how CDM organizes itself inside spiral galaxies. As a summary, since the internal dynamics of tidal dwarfs should not be affected by CDM, they cannot obey Milgrom’s law for a statistically-significant sample of TDGs if Milgrom’s law is only linked to the way CDM assembles itself in galaxies. Thus, observations of the internal dynamics of TDGs should be one of the observational priorities of the coming years in order to settle this debate.

Finally, let us note that it has been suggested [239], as a possible solution to the satellites phase-space correlation problem of Section 4.2, that most dwarf satellites of the Milky Way could have been formed tidally, thereby being old tidal dwarf galaxies. They would then naturally appear in closely related planes, explaining the observed disk-of-satellites. While this scenario would lead to a missing satellites catastrophe in ΛCDM (see Section 4.2), it could actually make sense in a MONDian Universe (see Section 9.2).

6.6 MOND in pressure-supported stellar systems

We have already outlined (Section 5.2) how Milgrom’s formula accounts for general scaling relations of pressure-supported systems such as the Faber-Jackson relation (Figure 7 and see [395]), and that isothermal systems have a finite mass in MOND with the density at large radii falling approximately as \(r^{-4}\) [296]. Note also that, in order to match the observed fundamental plane, MOND models must actually deviate somewhat from being strictly isothermal and isotropic: a radial orbit anisotropy in the outer regions is needed [388, 86]. Here we concentrate on slightly more detailed predictions and scaling relations. In general, these detailed predictions are less obvious to make than in rotationally-supported systems, precisely because of the new degree of freedom introduced by the anisotropy of the velocity distribution, very difficult to constrain observationally (as higher-order moments than the velocity dispersions would be needed to constrain it). As we shall see, the successes of MOND are in general a bit less impressive in pressure-supported systems than in rotationally-supported ones, and even in some cases really problematic (e.g., in the case of galaxy clusters, see Section 6.6.4). Whether this is due to the fact that predictions are less obvious to make, or whether this truly reflects a breakdown of Milgrom’s formula for these objects (or the fact that certain theoretical versions of MOND would explicitly deviate from Milgrom’s formula in pressure-supported systems, see Section 6.1.1) remains unclear.

6.6.1 Elliptical galaxies

Luminous elliptical galaxies are dense bodies of old stars with very little gas and typically large internal accelerations. The age of the stellar populations suggests they formed early and all the gas has been used to form stars. To form early, one might expect the presence of a massive dark-matter halo, but the study of, e.g., [367] showed that actually, there is very little evidence for dark matter within the effective radius, and even several effective radii, in ellipticals. On the other hand, these are very-HSB objects and would thus not be expected to show a large mass discrepancy within the bright optical object in MOND. And indeed, the results of [367] were shown to be in perfect agreement with MOND predictions, assuming very reasonable anisotropy profiles [323]. On the theoretical side, it was also importantly shown that triaxial elliptical galaxies can be reproduced using the Schwarzschild orbit superposition technique [482], and that these models are stable [493]Footnote 41.

Interestingly, some observational studies circumvented the mass-anisotropy degeneracy by constructing non-parametric models of observed elliptical galaxies, from which equivalent circular velocity curves, radial profiles of mass-to-light ratio, and anisotropy profiles, as well as high-order moments, could be computed [171]. Thanks to these studies, it was, e.g., shown [171] that, although not much dark matter is needed, the equivalent circular velocity curves (see also [484] where the rotation curve could be measured directly) tend to become flat at much larger accelerations than in thin exponential disk galaxies. This would seem to contradict the MOND prescription, for which flat circular velocities typically occur well below the acceleration threshold a0, but not at accelerations on the order of a few times a0 as in ellipticals. However, as shown in [363], if one assumes the simple interpolating function (α = n = 1 in Eq. 46 and Eq. 49), known to yield excellent fits to spiral galaxy rotation curves (see Section 6.5.1), one finds that MONDian galaxies exhibit a flattening of their circular velocity curve at high accelerations if they can be described by a Jaffe profile [208] in the region where the circular velocity is constant. Since this flattening at high accelerations is not possible for exponential profiles, it is remarkable that such flattenings of circular velocity curves at high accelerations are only observed in elliptical galaxies. What is more, [171], as well as [454], derived from their models scaling relations for the configuration space and phase-space densities of dark matter in ellipticals, and these DM scaling relations have been shown [363] to be in very good agreement with the MOND predictions on “phantom DM” (Eq. 33) scaling relations. This is displayed on Figure 36. 
Of course, some of these galaxies are residing in clusters, and the external field effect (see Section 6.3) could modify the predictions, but this was shown to be negligible for most of the analyzed sample, because the galaxies are far away from the cluster center [363]. Note that when closer to the center of galaxy clusters, interesting behaviors such as lopsidedness caused by the external field effect could allow new tests of MOND in the near future [491]. However, this would require modelling both the orbit of the galaxy in the cluster to take into account time-variations of the external field, as well as a precise estimate of the external field from the cluster itself, which can be tricky as the whole cluster should be modelled at once due to the non-linearity of MOND [113, 259].
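
The link between the simple interpolating function and the Jaffe profile can be made explicit with a short calculation. The following is an illustrative reconstruction, not the derivation of [363]: it assumes the simple μ-function in the form μ(x) = x/(1 + x), the algebraic form of Milgrom’s law, \(\mu(g/a_0)\,g = g_N\), and a spherical system; the symbols V (the constant circular velocity), M and \(r_J\) (the Jaffe mass and scale radius) are introduced here for illustration only. Requiring a constant circular velocity at all radii gives

$$g(r) = {{{V^2}} \over r}\quad \Rightarrow \quad {g_N} = \mu \left({{g \over {{a_0}}}} \right)g = {{{g^2}} \over {g + {a_0}}} = {{{V^4}/{a_0}} \over {r\,(r + {V^2}/{a_0})}},$$

which is the Newtonian field of a Jaffe sphere,

$${g_N}(r) = {{GM} \over {r\,(r + {r_J})}}\quad {\rm{with}}\quad GM = {{{V^4}} \over {{a_0}}},\quad {r_J} = {{{V^2}} \over {{a_0}}}.$$

Under these assumptions the circular velocity stays flat even at \(r \ll r_J\), where \(g = V^2/r \gg a_0\), i.e., at accelerations well above a0.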

Figure 36

MOND phantom dark matter scaling relations in ellipticals. The circles display central density ρ0, and central phase space density f of the phantom dark halos predicted by MOND for different masses of baryonic Hernquist profiles (with scale-radius rh related to the effective radius by Reff = 1.815 rh). The dotted lines are the scaling relations of [171], and the dashed lines those of [454], which exhibit a very large observational scatter in good agreement with the MOND prediction [363]. Image reproduced by permission from [363], copyright by ESO.

At a more detailed level, precise full line-of-sight velocity dispersion profiles of individual ellipticals, typically measured with tracers such as PNe or globular-cluster populations, have been reproduced by solving the Jeans equation in spherical symmetry:

$${{d{\sigma ^2}} \over {dr}} + {\sigma ^2}{{(2\beta + \alpha)} \over r} = - g(r),$$

where σ is the radial velocity dispersion, α = d ln ρ/d ln r is the slope of the tracer density ρ, and \(\beta = 1 - (\sigma _\theta ^2 + \sigma _\phi ^2)/2{\sigma ^2}\) is the velocity anisotropy. Note that on the left-hand side, one uses the density and the velocity dispersion of the tracers only, which can be different from the density producing the gravity on the right-hand side, if a specific population of tracers such as globular clusters is used. When the global kinematics of a galaxy is analyzed, we do expect in MOND that the gravity on the right-hand side of Eq. 65 is generated by the observed mass distribution, so both should be fit simultaneously: Figure 37 (provided by [399]) shows an example. In general, it was found that field galaxies all fit very naturally with MOND [461, 410] (see also [484]). On the other hand, the MOND modification has been found to slightly underpredict the velocity dispersions in large elliptical galaxies at the very center of galaxy clusters [364], which is just the small-scale equivalent of the problem of MOND in clusters, pointing towards missing baryons (see Section 6.6.4).
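
To make the use of the Jeans equation concrete, here is a minimal numerical sketch under stated assumptions; it is not the procedure of any of the cited works. It assumes a constant anisotropy β, for which the equation integrates to \(\rho {\sigma ^2}{r^{2\beta}} = \int\nolimits_r^\infty {\rho \,g\,{s^{2\beta}}ds}\), a baryonic Hernquist sphere acting as both tracer and source of gravity, the simple μ-function μ(x) = x/(1 + x), and a0 ≈ 1.2 × 10⁻¹⁰ m s⁻². All function names and the galaxy parameters are hypothetical, and the projection along the line of sight is omitted.

```python
import numpy as np

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun
A0 = 3.7e3      # assumed a0 ~ 1.2e-10 m/s^2, in (km/s)^2 / kpc


def hernquist_density(r, mass, r_h):
    """Hernquist density (Msun / kpc^3) for total mass and scale radius r_h."""
    return mass * r_h / (2.0 * np.pi * r * (r + r_h) ** 3)


def hernquist_newtonian_g(r, mass, r_h):
    """Magnitude of the Newtonian acceleration of a Hernquist sphere."""
    return G * mass / (r + r_h) ** 2


def simple_mond_g(g_newton, a0=A0):
    """Solve mu(g/a0) g = g_N with the 'simple' mu(x) = x / (1 + x)."""
    return 0.5 * g_newton + np.sqrt(0.25 * g_newton ** 2 + g_newton * a0)


def radial_dispersion(r, rho, g, beta=0.0):
    """Radial velocity dispersion from the spherical Jeans equation.

    For constant beta the equation integrates to
        rho sigma^2 r^(2 beta) = int_r^inf rho g s^(2 beta) ds.
    The integral is truncated at r[-1], so values near the outer edge of
    the (increasing) grid are underestimated.
    """
    integrand = rho * g * r ** (2.0 * beta)
    # Trapezoidal area of each grid interval.
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)
    # Reverse cumulative sum: integral from r[i] to r[-1].
    tail = np.concatenate((np.cumsum(steps[::-1])[::-1], [0.0]))
    return np.sqrt(tail / (rho * r ** (2.0 * beta)))


# Hypothetical galaxy: 1e11 Msun of stars, r_h = 2 kpc, isotropic orbits.
r = np.logspace(-1, 3, 4000)  # kpc
rho = hernquist_density(r, 1.0e11, 2.0)
g_n = hernquist_newtonian_g(r, 1.0e11, 2.0)
sigma_newton = radial_dispersion(r, rho, g_n)
sigma_mond = radial_dispersion(r, rho, simple_mond_g(g_n))
for radius in (1.0, 10.0, 50.0):
    i = np.searchsorted(r, radius)
    print(f"r = {r[i]:6.1f} kpc: Newton {sigma_newton[i]:6.1f} km/s, "
          f"MOND {sigma_mond[i]:6.1f} km/s")
```

Passing a tracer density that differs from the mass density (e.g., for globular clusters) only requires changing the `rho` argument, while `g` is still computed from the observed mass distribution, mirroring the distinction between the two sides of the equation drawn above.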

Figure 37

The surface brightness (a) and velocity dispersion (b) profiles of the elliptical galaxy NGC 7507 [375] fitted by MOND (lines [399]). Elliptical galaxies can be approximated in MOND as high-order polytropes with some radial orbit anisotropy [388]. This particular case has a polytropic index of 14 with anisotropy of the Osipkov-Merritt form with an anisotropy radius of 5 kpc and maximum anisotropy β = 0.75 at large radii [399]. The stellar mass-to-light ratio is \(\Upsilon _\ast^B = 3.03{M_ \odot}/{L_ \odot}\). This simple model captures the gross properties of both the surface brightness and velocity dispersion profiles. The galaxy is well-fitted by MOND, contrary to the claim of [375].

On the other hand, [225] used satellite galaxies of ellipticals to test MOND at distances of several hundred kpc. They used the stacked SDSS satellites to generate a pair of mock galaxy groups with reasonably precise line-of-sight velocity dispersions as a function of radius across the group. When these systems were first analysed, [225] claimed that MOND was excluded at 10σ, but this was only for models that had constant velocity anisotropy. It was then found [14] that with varying anisotropy profiles similar to those found in simulations of the formation of ellipticals by dissipationless collapse in MOND [337], excellent fits to the line-of-sight velocity dispersions of both mock galaxies could be found. This can be taken as strong evidence that MOND describes the dynamics in the surroundings of relatively isolated ellipticals very well.

Finally, let us note an intriguing possibility in a MONDian universe (see also Section 9.2). While massive ellipticals would form at z ≈ 10 [393] from monolithic dissipationless collapse [337], dwarf ellipticals could be more difficult to form. A possibility to form those would then be that tidal dwarf galaxies would be formed and survive more easily (see Section 6.5.4) in major mergers, and could then evolve to lead to the population of dwarf ellipticals seen today, thereby providing a natural explanation for the observed density-morphology relation [239] (more dwarf ellipticals in denser environments).

6.6.2 Dwarf spheroidal galaxies

Dwarf spheroidal (dSph) satellites of the Milky Way [427, 477] exhibit some of the largest mass discrepancies observed in the universe. In this sense, they are extremely interesting objects in which to test MOND. Observationally, let us note that there are essentially two classes of objects in the galactic stellar halo: globular clusters (see Section 6.6.3) and dSph galaxies. These overlap in baryonic mass, but not in surface brightness, nor in age or uniformity of the stellar populations. The globular clusters are generally composed of old stellar populations, they are HSB objects and mostly exhibit no mass discrepancy problem, as expected for HSB objects in MOND. The dSphs, on the contrary, generally contain slightly younger stellar populations covering a range of ages, they are extreme LSB objects and exhibit, as said before, an extreme mass discrepancy, as generically expected from MOND. So, contrary to the case of ΛCDM where different formation scenarios have to be invoked (see Section 6.6.3), the different mass discrepancies in these objects find a natural explanation in MOND.

At a more detailed level, MOND should also be able to fit the whole velocity dispersion profiles, and not only give the right ballpark prediction. This analysis has recently been possible for the eight “classical” dSph around the Milky Way [477]. Solving the Jeans equation (Eq. 65), it was found [8] that the four most massive and distant dwarf galaxies (Fornax, Sculptor, Leo I and Leo II) have typical stellar mass-to-light ratios, exactly within the expected range. Assuming equilibrium, two of the other four (smallest and most nearby) dSphs have mass-to-light ratios that are a bit higher than expected (Carina and Ursa Minor), and two have very high ones (Sextans and Draco). For all these dSphs, there is a remarkable correlation between the stellar M/L inferred from MOND and the ages of their stellar populations [189]. Concerning the high inferred stellar M/L, note that it has been shown [78] that a dSph will begin to suffer tidal disruption at distances from the Milky Way that are 4–7 times larger in MOND than in CDM; Sextans and Draco could thus actually be partly tidally disrupted in MOND. And indeed, after subjecting the five dSphs with published data to an interloper removal algorithm [418], it was found that Sextans was probably littered with unbound stars, which inflated the computed M/L, while Draco’s projected distance-l.o.s. velocity diagram actually looks as out-of-equilibrium as that of Sextans. Ursa Minor, on the other hand, is the typical example of an out-of-equilibrium system, elongated and showing evidence of tidal tails. In the end, only Carina has a suspiciously high M/L (> 4; see [418]).

What is more, there is a possibility that, in a MONDian Universe, dSphs are not primordial objects but have been tidally formed in a major merger (see Section 9.2 as a solution to the phase-space correlation challenge of Section 4.2). In addition to the MOND effect, it would be possible that these objects never really reach a stable equilibrium [237], and exhibit an artificially high M/L ratio. This is even more true for the recently discovered “ultra-faint” dwarf spheroidals, which are also, due to their extremely low density, very much prone to tidal heating in MOND. Indeed, at face value, if these ultrafaints are equilibrium objects, their velocity dispersions are much too high compared to what MOND predicts, and rule out MOND straightforwardly. However, unless this is due to systematic errors linked with the smallness of the velocity dispersion to measure (one must distinguish between \(\sigma \approx 2\,{\rm{km}}\,{{\rm{s}}^{- 1}}\) and \(\sigma \approx 5\,{\rm{km}}\,{{\rm{s}}^{- 1}}\)), and/or to high intrinsic stellar M/L ratios related to stochastic effects linked with the small number of stars [186], it was also found [285] that these objects are all close to filling their MONDian tidal radii, and that their stars can complete only a few orbits for every orbit of the satellite itself around the Milky Way (see Figure 38). As Brada & Milgrom [78] have shown, it then comes as no surprise that they are displaying out-of-equilibrium dynamics in MOND (and even more so in the case of a tidal formation scenario [237]).

Figure 38

The characteristic acceleration, in units of a0, in the smallest galaxies known: the dwarf satellites of the Milky Way (orange squares) and M31 (pink squares) [285]. The classical dwarfs, with thousands of velocity measurements of individual stars [477], are largely consistent with MOND. The more recently discovered “ultrafaint” dwarfs, tiny systems with only a handful of stars [427], typically are not, in the sense that their measured velocity dispersions and accelerations are too high. This could be due to systematic uncertainties in the data [230], as we must distinguish between \(\sigma \approx 2\,{\rm{km}}\,{{\rm{s}}^{- 1}}\) and \(\sigma \approx 5\,{\rm{km}}\,{{\rm{s}}^{- 1}}\). Nevertheless, there may be a good physical reason for the non-compliance of the ultrafaint galaxies in the context of MOND. The deviation of these objects only occurs in systems where the stars are close to filling their MONDian tidal radii: the left panel shows the half light radius relative to the tidal radius. Such systems may not be in equilibrium. Brada & Milgrom [78] note that systems will no longer respond adiabatically to the influence of their host galaxy when a star in a satellite galaxy can complete only a few orbits for every orbit the satellite makes about its host. The deviant dwarfs are in this regime (right panel).

6.6.3 Star clusters

Star clusters come in two types: open clusters and globular clusters. Most observed open clusters are in the inner parts of the Milky Way disk, and for that reason, the prediction of MOND is that their internal dynamics is Newtonian [293] with, perhaps, a slightly renormalized gravitational constant and slightly squashed isopotentials, due to the external field effect (Section 6.3). Therefore, the possibility of distinguishing Newtonian dynamics from MOND in these objects would require extreme precision. On the other hand, globular clusters are mostly HSB halo objects (see Section 6.6.2), and are consequently predicted to be Newtonian, and most of those that are fluffy enough to display MONDian behavior are close enough to the Galactic disk to be affected by the external field effect (Section 6.3), and so are Newtonian, too. Interestingly, MOND thus provides a natural explanation for the dichotomy between dwarf spheroidals and globular clusters. In ΛCDM, this dichotomy is rather explained by the formation history [235, 397]: globular clusters are supposedly formed in primordial disk-bound supermassive molecular clouds with high baryon-to-dark matter ratio, and later become more spheroidal due to subsequent mergers. In MOND, it is, of course, not implied that the two classes of objects have necessarily the same formation history, but the different dynamics are qualitatively explained by MOND itself, not by the different formation scenarios.

However, there exist a few globular clusters (roughly, less than ∼ 10 compared to the total number of ∼ 150) both fluffy enough to display typical internal accelerations well below a0, and far away enough from the galactic plane to be more or less immune from the external field effect [27, 182, 181, 436]. Thus, these should, in principle, display a MONDian mass discrepancy. They include, e.g., Pal 14 and Pal 3, or the large fluffy globular cluster NGC 2419. Pal 3 is interesting, because it indeed tends to display a larger-than-Newtonian global velocity dispersion, broadly in agreement with the MOND prediction (Baumgardt & Kroupa, private communication). However, it is difficult to draw too strong a conclusion from this (e.g., on excluding Newtonian dynamics), since there are not many stars observed, and one or two outliers would be sufficient to make the dispersion grow artificially, while a slightly-higher-than-usual mass-to-light ratio could reconcile Newtonian dynamics with the data. Other clusters such as NGC 1851 and NGC 1904 apparently display the same MONDian behavior [408] (see also [187]). On the other hand, Pal 14 displays exactly the opposite behavior: the measured velocity dispersion is Newtonian [212], but again the number of observed stars is too small to draw a statistically significant conclusion [164], and it is still possible to reconcile the data with MOND assuming a slightly low stellar mass-to-light ratio [437]. Note that if the cluster is on a highly eccentric orbit, the external gravitational field could vary very rapidly both in amplitude and direction, and it is possible that the cluster could take some time to accommodate this, still displaying a Newtonian signature in its kinematics after a sudden decrease of the external field.

NGC 2419 is an interesting case, because it allows not only for a measure of the global velocity dispersion, but also of the detailed velocity dispersion profile [199]. And, again, like in the case of Pal 14 (but contrary to Pal 3), it displays Newtonian behavior. More precisely, it was found, solving the Jeans equation (Eq. 65), that the best MOND fit, although not extremely bad in itself, was 350 times less likely than the best Newtonian fit without DM [199, 200]. However, the stability [336] of this best MOND fit has not been checked in detail. These results are heavily debated as they rely on the small quoted measurement errors on the surface density, and even a slight rotation of only the outer parts of this system near the plane of the sky (which would not show up in the velocity data) would make a considerable difference in the right direction for MOND [398]. Still, these observations, together with the results on Pal 14, although not ruling out any theory, are not a resounding success for MOND. They could perhaps indicate that globular clusters are generically on highly eccentric orbits, and out of equilibrium due to this (although the effect would have to be opposite to that prevailing in ultra-faint dwarfs, where the departure from equilibrium would boost the velocity dispersion instead of decreasing it). A stronger view on these results could indicate that MOND as formulated today is an incomplete paradigm (see, e.g., Eq. 27), or that MOND is an effect due to the fundamental nature of the DM fluid in galaxies (see Sections 7.6 and 7.9), which is absent from globular clusters. Concerning NGC 2419, it is perhaps useful to remind oneself that it is very plausibly not a globular cluster. It is part of the Virgo stream and is thus most probably the remaining nucleus of a disrupting satellite galaxy in the halo of the Milky Way, on a generically-highly-eccentric orbit.
Detailed N-body simulations of such an event, and of the internal dynamics of the remaining nucleus, would thus be the key to confront MOND with observations in this object. All in all, the situation regarding MOND and the internal dynamics of globular clusters remains unclear.

On the other hand, it has been noted that MOND seems to overpredict the Roche lobe volume of globular clusters [499, 500, 512]. Again, the fact that globular clusters could generically be on highly eccentric orbits could come to the rescue here. What is more, it was shown that, in MOND, globular clusters can have a cutoff radius, which is unrelated to the tidal radius when non-isothermal [397]. In general, the cutoff radii of dwarf spheroidals, which have comparable baryonic masses, are larger than those of the globular clusters, meaning that those may well extend to their tidal radii because of a possibly different formation history than globular clusters.

Finally, a last issue for MOND related to globular clusters [335, 377] is the existence of five such objects surrounding the Fornax dwarf spheroidal galaxy. Indeed, under similar environmental conditions, dynamical friction occurs on significantly shorter timescales in MOND than in standard dynamics [95], which could cause the globular clusters to spiral in and merge within at most 2 Gyr [377]. However, this strongly depends on the orbits of the globular clusters, and, in particular, on their initial radius [10], which can allow the orbits to survive for a Hubble time in MOND.

6.6.4 Galaxy groups and clusters

As pointed out earlier (3rd Kepler-like law of Section 5.2), it is a natural consequence of Milgrom’s law that, at the effective baryonic radius of the system, the typical acceleration \(\sigma^2/R\) is always observed to be on the order of a0, thereby naturally explaining the linear relation between size and temperature for galaxy clusters [327, 392]. However, one of the main predictions of Milgrom’s formula is the baryonic Tully-Fisher relation (circular velocity vs. baryonic mass, Figure 3), and its equivalent for isotropic pressure-supported systems, the Faber-Jackson relation (stellar velocity dispersion vs. baryonic mass, Figure 7), both for their slope and normalization. For systems such as galaxy clusters, where the hot intra-cluster gas is the major baryonic component, this relation can also be translated into a “gas temperature vs. baryonic mass” relation, \(M_b \propto T^2\), plotted in Figure 39 as the line \(\log ({M_b}/{M_ \odot}) = 2\log (T/{\rm{keV}}) + 12.9\) (note that this differs slightly from [389] where solar metallicity gas is assumed). Note on this figure that observations are closer to the MOND predicted slope than to the conventional prediction of \(M \propto T^{3/2}\) in ΛCDM, without the need to invoke preheating (a need that may arise as an artifact of the mismatch in slopes).
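
For orientation, the MOND line quoted above, log(M_b/M_⊙) = 2 log(T/keV) + 12.9, can be evaluated directly. The short sketch below does only that, plus the ratio of predicted to observed baryonic mass; the function names, the sample temperatures and the sample observed mass are hypothetical and are not data from Figure 39.

```python
import math


def mond_cluster_baryonic_mass(temperature_kev):
    """Baryonic mass (Msun) on the line log10(M_b/Msun) = 2 log10(T/keV) + 12.9."""
    return 10.0 ** (2.0 * math.log10(temperature_kev) + 12.9)


def residual_mass_factor(observed_baryonic_mass, temperature_kev):
    """Ratio of the mass on the MOND line to the observed baryonic mass."""
    return mond_cluster_baryonic_mass(temperature_kev) / observed_baryonic_mass


# Hypothetical temperatures, chosen only to show the M_b ~ T^2 scaling.
for t_kev in (1.0, 3.0, 10.0):
    mass = mond_cluster_baryonic_mass(t_kev)
    print(f"T = {t_kev:4.1f} keV -> M_b = {mass:.2e} Msun")

# Hypothetical cluster: 4e13 Msun of observed baryons at T = 3 keV.
print(f"residual factor: {residual_mass_factor(4.0e13, 3.0):.2f}")
```

A residual factor above unity means that the system carries less observed baryonic mass than the MOND line requires at its temperature.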

Figure 39

The baryonic mass-X-ray temperature relation for rich clusters (gray triangles [359, 389]) and groups of galaxies (green triangles [12]). The solid line indicates the prediction of MOND: the data are reasonably consistent with the slope (\(M \propto T^2\)), but not with the normalization. This is the residual missing baryon problem in MOND: there should be roughly twice as much mass (on average) as observed. Also shown is the scaling relation expected in ΛCDM (dashed line [137]). This is in better (if not perfect) agreement with the normalization of the data for rich clusters, but not the slope. The difference is sometimes attributed to preheating of the gas [496], which might also occur in MOND.

So, interestingly, the data are still reasonably consistent with the slope predicted by MOND [383], but not with the normalization. There is roughly a factor of two of residual missing mass in these objects [170, 354, 387, 389, 392, 453]. This conclusion, reached from applying the hydrostatic equilibrium equation to the temperature profile of the X-ray emitting gas of these objects, has also been reached for low mass X-ray emitting groups [12]. This is essentially because, contrary to the case of galaxies, there is observationally a need for “Newtonian” missing mass in the central partsFootnote 42 of clusters, where the observed acceleration is usually slightly larger than a0, meaning that the MOND prescription is not enough to explain the observed discrepancy between visible and dynamical mass there. For this reason, the residual missing mass in MOND is essentially concentrated in the central parts of clusters, where the ratio of MOND dynamical mass to observed baryonic mass reaches a value of 10, to then only decrease to a value of ∼ 2 in the very outer parts, where almost no residual mass is present. The profile of this residual mass would thus consist of a large constant density core of about 100–200 kpc in size (depending on the size of the group/cluster in question), followed by a sharp cutoff.

The need for this residual missing mass in MOND might be taken in one of the five following ways:

  (i) Practical falsification of MOND,
  (ii) Evidence for missing baryons in the central parts of clusters,
  (iii) Evidence for non-baryonic dark matter (existing or exotic),
  (iv) Evidence that MOND is an incomplete paradigm,
  (v) Evidence for the effect of additional fields in the parent relativistic theories of MOND, not included in Milgrom’s formula.

If (i) is correct, one still needs to explain the success of MOND on galaxy scales with ΛCDM. Such an explanation has yet to be offered. Thus, tempting as case (i) is, it is worth giving a closer inspection to the four other possibilities.

The second case (ii) would be most in line with the elegant absence of need for any non-baryonic mass in MOND (however, see the “dark fields” invoked in Section 7). It has happened before that most of the baryonic mass was in an unobserved component. From the 1930s when Zwicky first discovered the missing mass problem in clusters till the 1980s, it was widely presumed that the stars in the observed galaxies represented the bulk of baryonic mass in clusters. Only after the introduction of MOND (in 1983) did it become widely appreciated that the diffuse X-ray emitting intracluster gas (the ICM) greatly outweighed the stars. That is to say, some of the missing mass problem in clusters was due to optically dark baryons — instead of the enormous mass discrepancies implied by cluster dynamical mass to optical light ratios in excess of 100 [24], the ratio of dark to baryonic mass is only ∼ 8 conventionally [175, 278]. So we should not be too hasty in presuming we now have a complete census of baryons in clusters. Indeed, in the global baryon inventory of the universe, ∼ 30% of the baryons produced during BBN are missing (Figure 40), and presumably reside in some, as yet undetected, (dark) form. It is estimated [160, 421] that the observed baryons in clusters only account for about 4% of those produced during BBN (Figure 40). This is much less than the 30% of baryons that are still missing. Consequently, only a modest fraction of the dark baryons need to reside in clusters to solve the problem of missing mass in the central regions of clusters in MOND. It should be highlighted that this missing mass only appears in MOND for systems with a high abundance of ionised gas and X-ray emission. Indeed, for even smaller galaxy groups, devoid of gas, the MOND predictions for the velocity dispersions of individual galaxies are again perfectly in line with the observations [303, 307]. 
It is then no stretch of the imagination to surmise that these gas-rich systems, where the residual missing-baryons problem arises, have equal quantities of molecular hydrogen or other molecules. Milgrom [310] has, e.g., proposed that the missing mass in MOND could entirely be in the form of cold, dense gas clouds. There is an extensive literature discussing searches for cold gas in the cores of galaxy clusters, but what is usually meant there is quite different from what is meant here, since those searches consisted in trying to find the signature of diffuse cold molecular gas at a temperature of ∼ 30 K. The proposition of Milgrom [310] rather relies on the work of Pfenniger & Combes [352], where dense gas clouds with a temperature of only a few Kelvin (∼ 3 K), solar-system size, and of a Jupiter mass, were considered to be possible candidates for both galactic and extragalactic dark matter. These clouds would behave in a collisionless way, just like stars. However, since the dark mass considered in the context of MOND cannot be present in galaxies, it is not subject to the galactic constraints on such gas clouds. Note that the total sky covering factor of such clouds in the core of the clusters would be on the order of only \(10^{-4}\), so that they would only occult a minor fraction of the X-rays emitted by the hot gas (and it would be a rather constant fraction). For the same reason, the chances of a given quasar having light absorbed by them are very small. Still, [310] notes that these clouds could be probed through X-ray flashes coming out of individual collisions between them. Of course, this speculative idea also raises a number of questions, the most serious one being how these clumps form and stabilize, and why they form only in clusters, X-ray emitting groups and some ellipticals at the center of these groups and clusters, but not in individual spiral galaxies.
As noted above, the fact that missing mass in MOND is necessarily associated with an abundance of ionised gas could be a hint at a formation and stabilization process somehow linked with the presence of hot gas and X-ray emission themselves. Then, there is the issue of knowing whether the cloud formation would be prior to or posterior to the cluster formation. We note that a rather late formation mechanism could help increase the metal abundance, solving the problem of small-scale variations of metallicity in clusters when the clouds are destroyed [330]. Milgrom [310] also noted that these clouds could alleviate the cooling flow conundrum, because whatever destroys them (e.g., cloud-cloud collisions and dynamical friction between the clouds and the hot gas) is conducive to heating the core gas, and thus preventing it from cooling too quickly. Such a heating source would not be transient and would be quite isotropic, contrary to AGN heating.

Figure 40

The baryon budget in the low redshift universe adopted from [421]. The census of baryons includes the detected Warm-Hot Intergalactic Medium (WHIM), the Lyman-α forest, stars in galaxies, detected cold gas in galaxies (atomic HI and molecular \({{\rm{H}}_2}\)), other gas associated with galaxies (the Circumgalactic Medium, CGM), and the Intracluster Medium (ICM) of groups and clusters of galaxies. The sum of known baryons falls short of the density of baryons expected from BBN: ∼ 30% are missing. These missing baryons presumably exist in some as yet undetected (i.e., dark) form. If a fraction of these dark baryons reside in clusters (an amount roughly comparable to that in the ICM), it would suffice to explain the residual mass discrepancy problem MOND suffers in galaxy clusters.
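
As an order-of-magnitude illustration using only the round numbers quoted above (about 4% of the BBN baryons observed in clusters, about 30% missing), and assuming that the dark baryons needed in clusters are comparable in mass to the observed cluster baryons, the required share of the missing baryons would be

$${{0.04} \over {0.30}} \approx 0.13,$$

i.e., roughly one eighth of the dark baryons would have to reside in clusters. This is only a sketch under the stated assumption, not a substitute for the detailed census of [421].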

Another possibility (iii) would be that this residual missing mass in clusters is in the form of non-baryonic matter. There is one obviously existing form of such matter: neutrinos. If \({m_\nu} \approx \sqrt {\Delta {m^2}}\) [434], then the neutrino mass is too small to be of interest in this context. But there is nothing that prevents it from being larger (note that the “cosmological” constraints from structure formation in the ΛCDM context obviously do not apply in MOND). Actual model-independent experimental limits on the electron neutrino mass from the Mainz/Troitsk experiments, counting the highest energy electrons in the β-decay of Tritium [234], are \({m_\nu} < 2.2\) eV. Interestingly, the KATRIN experiment (the KArlsruhe TRItium Neutrino experiment, under construction) will be able to falsify these 2 eV electron neutrinos at 95% confidence. If the neutrino mass is substantially larger than the mass differences, then all types have about the same mass, and the cosmological density of three left-handed neutrinos and their antiparticles [392] would be

$${\Omega _\nu} = 0.062\,{m_\nu},$$

where \({m_\nu}\) is the mass of a single neutrino type in eV. If one assumes that clusters of galaxies respect the baryon-neutrino cosmological ratio, and that the MOND missing mass is mostly made of neutrinos as suggested by [389, 392], then the mass of neutrinos must indeed be around 2 eV. Combined with the effect of additional degrees of freedom in relativistic MOND theories (Section 7), it has been shown that the CMB anisotropies could also be reproduced (see Section 9.2 and [430]), while this hot dark matter would obviously free-stream out of spiral galaxies and would thus not perturb the MOND fits of Section 6.5.1. The main limit on the ability of neutrinos to condense in clusters comes from the Tremaine-Gunn limit [463], stating that the phase space density must be preserved during collapse. This is a density level half the quantum mechanical degeneracy level in phase-space:

$${f_{\max}} = {1 \over 2}\sum\limits_{i = 1}^{i = 6} {{{m_{{\nu _i}}^4} \over {{h^3}}}.}$$

Converting this into configuration space, the maximum density for a cluster of a given temperature, T, is defined for a given mass of one neutrino type as [463]:

$${{\rho _\nu ^{\max}} \over {7 \times {{10}^{- 5}}{M_ \odot}\;{\rm{p}}{{\rm{c}}^{- 3}}}} = {\left({{T \over {1keV}}} \right)^{1.5}}{\left({{{{m_\nu}} \over {2eV}}} \right)^4}.$$
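
The two scaling relations above are easy to evaluate numerically. The following minimal Python sketch simply encodes the cosmological density relation and the Tremaine-Gunn density as written in the text; the function names `omega_nu` and `rho_nu_max` are hypothetical labels chosen for illustration and do not come from the cited works.

```python
# Illustrative helpers; names are hypothetical and the coefficients are
# taken directly from the two relations quoted in the text.

def omega_nu(m_nu_ev: float) -> float:
    """Cosmological density of three neutrino types (plus antiparticles)
    sharing a common mass m_nu_ev, given in eV."""
    return 0.062 * m_nu_ev


def rho_nu_max(temp_kev: float, m_nu_ev: float) -> float:
    """Tremaine-Gunn maximum neutrino density in M_sun / pc^3 for a
    cluster of temperature temp_kev (keV) and one neutrino type of
    mass m_nu_ev (eV)."""
    return 7e-5 * temp_kev ** 1.5 * (m_nu_ev / 2.0) ** 4


if __name__ == "__main__":
    print(f"Omega_nu for 2 eV neutrinos: {omega_nu(2.0):.3f}")
    for temp in (1.0, 4.0, 8.0):
        rho = rho_nu_max(temp, 2.0)
        print(f"T = {temp:.0f} keV: rho_max = {rho:.2e} M_sun/pc^3")
```

Because the limit scales as the fourth power of the mass, it is far more sensitive to the assumed neutrino mass than to the cluster temperature.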

Assuming the temperature of the neutrino fluid as being equal (due to violent relaxation) to the mean emission weighted temperature of the gas, Sanders [389] showed that such 2 eV neutrinos at the limit of experimental detection could indeed account for the bulk of the dynamical mass in his sample of galaxy clusters of T > 4 keV (see also Section 8.3 for gravitational lensing constraints). This has the great advantage of naturally reproducing the proportionality of the electron density in the cores of clusters to \({T^{3/2}}\), as observed in [392]. However, looking at the central region of low-temperature X-ray emitting galaxy groups, it was found [12] that the needed central density of missing mass far exceeded this limit by a factor of several hundred. One would need one neutrino species with m ∼ 10 eV to reach the required densities. One exotic possibility is then the idea of right-handed eV-scale sterile neutrinos [13]: as strange as this sounds, this mass for sterile neutrinos could also provide a good fit to the CMB acoustic peaks (see Section 9.2). This could indeed sound like the strangest and most complicated universe possible, combining true non-baryonic (hot) dark matter with a modification of gravity, but if this is what it takes to simultaneously explain the Kepler-like laws of galactic dynamics and the extragalactic evidence for dark matter, it is useful to remember that there are both good reasons for there being more particles than those of the standard model of particle physics and that there is no reason that general relativity should be valid over a wide range of scales where it has never been tested. 
In any case, experiments that can address the existence of such a ∼ 10 eV-scale sterile neutrino would thus be very interesting, as this kind of particle could provide the dark matter candidate only in a modified gravity framework, since such a hot dark matter particle would be unable to form small structures and to provide the dark matter that would be needed in galaxies.
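
As a rough consistency check of the ∼ 10 eV mass scale, and assuming for simplicity that the comparison is made at fixed temperature T (the groups in question are cooler than the T > 4 keV clusters, so this is only indicative), the \(m_\nu ^4\) scaling of the Tremaine-Gunn density gives

$${{\rho _\nu ^{\max}(10\,{\rm{eV}})} \over {\rho _\nu ^{\max}(2\,{\rm{eV}})}} = {\left({{{10} \over 2}} \right)^4} = 625,$$

i.e., a gain of several hundred, of the same order as the shortfall found for 2 eV neutrinos in galaxy groups.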

Yet another possibility (iv) would be that MOND is incomplete, and that a new scale should be introduced, in order to effectively enhance the value of a0 in galaxy clusters, while lowering it to its preferred value in galaxies. There are several ways to implement such an idea. For instance, Bekenstein [36] proposed adding a second scale in order to allow for effective variations of the acceleration constant as a function of the deepness of the potential (Eq. 27). This idea should be investigated more in the future, but it is not clear that such a simple rescaling of a0 would account for the exact spatial distribution of the residual missing mass in MOND clusters, especially in cases where it is displaced from the baryonic distribution (see Section 8.3). However, as even Gauss’ theorem would not be valid anymore in spherical symmetry, the high non-linearity might provide non-intuitive results, and it would thus clearly be worth investigating this suggestion in more detail, as well as developing similar ideas with other additional scales in the future (such as, for instance, the baryonic matter density; see [82, 143] and Section 7.6).

Finally, as we shall see in Section 7, parent relativistic theories of MOND often require additional degrees of freedom in the form of “dark fields”, which can nevertheless be globally subdominant to the baryon density, and thus do not necessarily act precisely as true “dark matter”. Thus, the last possibility (v) is that these fields, which are obviously not included in Milgrom’s formula, are responsible for the cluster missing mass in MOND. An example of such fields are the vector fields of TeVeS (Section 7.4) and Generalized Einstein-Aether theories (Section 7.7). It has been shown (see Section 9.2) that the growth of the spatial part of the vector perturbation in the course of cosmological evolution can successfully seed the growth of baryonic structures, just as dark matter does. If these seeds persist, it was shown [112] that they could behave in very much the same way as a dark matter halo in relatively unrelaxed galaxy clusters. However, it remains to be seen whether the spatially-concentrated distribution of missing mass in MOND would be naturally reproduced in all clusters. In other relativistic versions of MOND (see, e.g., Sections 7.6 and 7.9), the “dark fields” are truly massive and can be thought of as true dark matter (although more complex than simple collisionless dark matter), whose energy density outweighs the baryonic one, and could provide the missing mass in clusters. However, again, it is not obvious that the centrally-concentrated distribution of residual missing mass in clusters would be naturally reproduced. All in all, there is no obviously satisfactory explanation for the problem of residual missing mass in the center of galaxy clusters, which remains one of the most serious problems facing MOND.

7 Relativistic MOND Theories

In Section 6, we have considered the classical theories of MOND and their predictions in a vast number of astrophysical systems. However, as already stated at the beginning of Section 6, these classical theories are only toy-models until they become the weak-field limit of a relativistic theory (with invariant physical laws under differentiable coordinate transformations), i.e., an extension of general relativity (GR) rather than an extension of Newtonian dynamics. Here, we list the various existing relativistic theories boiling down to MOND in the quasi-static weak-field limit. It is useful to restate here that the motivation for developing such theories is not to get rid of dark matter but to explain the Kepler-like laws of galactic dynamics predicted by Milgrom’s law (see Section 5). As we shall see, many of these theories include new fields, so that dark matter is often effectively replaced by “dark fields” (although, contrary to dark matter, their energy density can be subdominant to the baryonic one; note that, even more importantly, in a static configuration these dark fields are fully determined by the baryons, contrary to the traditional dark matter particles, which may, in principle, be present independent of baryons).

These theories are great advances because they enable us to calculate the effects of gravitational lensing and the cosmological evolution of the universe in MOND, which are beyond the capabilities of classical theories. However, as we shall see, many of these relativistic theories still have their limitations, ranging from true theoretical or observational problems to more aesthetic problems, such as the arbitrary introduction of an interpolating function (Section 6.2) or the absence of an understanding of the \(\Lambda \sim a_0^2\) coincidence. What is more, the new fields introduced in these theories have no counterpart yet in microphysics, meaning that these theories are, at best, only effective. So, despite the existing effective relativistic theories presented here, the quest for a more profound relativistic formulation of MOND continues. Excellent reviews of existing theories can also be found in, e.g., [34, 35, 81, 100, 136, 183, 318, 429, 431].

The heart of GR is the equivalence principle(s), in its weak (WEP), Einstein (EEP) and strong (SEP) forms. The WEP states the universality of free fall, while the EEP states that one recovers special relativity in the freely falling frame of the WEP. These equivalence principles are obtained by assuming that all known matter fields are universally and minimally coupled to one single metric tensor, the physical metric. It is perfectly fine to keep these principles in MOND, although certain versions can involve another type of (dark) matter not following the same geodesics as the known matter, and thus effectively violating the WEP. Additionally, note that the local Lorentz invariance of special relativity could be spontaneously violated in MOND theories. The SEP, on the other hand, states that all laws of physics, including gravitation itself, are fully independent of velocity and location in spacetime. This is obtained in GR by making the physical metric itself obey the Einstein-Hilbert action. This principle has to be broken in MOND (see also Section 6.3). We now recall how GR connects with Newtonian dynamics in the weak-field limit, which is actually the regime in which the modification must be set in order to account for the MOND phenomenology of the ultra-weak-field limit. The action of GR is written as the sum of the matter action and the Einstein-Hilbert (gravitational) action (Footnote 43):

$${S_{{\rm{GR}}}} \equiv {S_{{\rm{matter}}}}[{\rm{matter}},{g_{\mu \nu}}] + {{{c^4}} \over {16\pi G}}\int {{d^4}x\sqrt {- g} R,}$$

where g denotes the determinant of the metric tensor \({g_{\mu \nu}}\) with (−, +, +, +) signature (Footnote 44), and \(R = {R_{\mu \nu}}{g^{\mu \nu}}\) is its scalar curvature, \({R_{\mu \nu}}\) being the Ricci tensor (involving second derivatives of the metric). The matter action is a functional of the matter fields, depending on them and their first derivatives. For instance, the matter action of a free point particle, \({S_{{\rm{pp}}}}\), reads:

$${S_{{\rm{pp}}}} \equiv - \int {mcds = - \int {mc\sqrt {- {g_{\mu \nu}}(x){v^\mu}{v^\nu}} dt,}}$$

depending on the positions x and on their time-derivatives \({v^\mu}\). Varying the matter action with respect to (w.r.t.) the matter-field degrees of freedom yields the equations of motion, i.e., the geodesic equation in the case of a point particle:

$${{{d^2}{x^\mu}} \over {d{\tau ^2}}} = - \Gamma _{\alpha \beta}^\mu {{d{x^\alpha}} \over {d\tau}}{{d{x^\beta}} \over {d\tau}},$$

where the proper time τ = s is approximately equal to ct for slowly moving non-relativistic particles, and \(\Gamma _{\alpha \beta}^\mu\) is the Christoffel symbol involving first derivatives of the metric. On the other hand, varying the total action w.r.t. the metric yields Einstein’s field equations:

$${R_{\mu \nu}} - {1 \over 2}R{g_{\mu \nu}} = {{8\pi G} \over {{c^4}}}{T_{\mu \nu}},$$

where \({T_{\mu \nu}}\) is the stress-energy tensor defined as the variation of the Lagrangian density of the matter fields over the metric.

In the static weak-field limit, the metric is written as (up to third-order corrections in \(1/{c^3}\); see Footnote 45):

$${g_{0i}} = {g_{i0}} = 0,\quad {g_{00}}\underbrace = _{{\rm{Taylor}}} - 1 - {{2\Phi} \over {{c^2}}},\quad {g_{ij}}\underbrace = _{{\rm{Taylor}}}\left({1 + {{2\Psi} \over {{c^2}}}} \right){\delta _{ij}},$$

where, in GR,

$$\Phi = {\Phi _N}\;{\rm{and}}\;\Psi = - {\Phi _N},$$

and \({\Phi _N}\) is the Newtonian gravitational potential. From the (0, 0) component of the weak-field metric, one gets back Newton’s second law for massive particles \({d^2}{x^i}/d{t^2} = - \Gamma _{00}^i = - \partial {\Phi _N}/\partial {x^i}\) from the geodesic equation (Eq. 71). On the other hand, Einstein’s equations (Eq. 72) give back the Newtonian Poisson equation \({\nabla ^2}{\Phi _N} = 4\pi G\rho\). Thus, the metric plays the role of the gravitational potential, and the Christoffel symbol plays the role of acceleration. Note, however, that if timelike geodesics are determined by the (0, 0) component of the metric, this is not the case for null geodesics. While the gravitational redshift for light-rays is solely governed by the \({g_{00}}\) component of the metric too, the deflection of light is, on the other hand, also governed by the \({g_{ij}}\) components (more specifically by Φ − Ψ in the weak-field limit). This means that, in order for the anomalous effects of any modified gravity theory on lensing and dynamics to correspond to a similar (Footnote 46) amount of “missing mass” in GR, it is crucial that Ψ ≃ −Φ in Eq. 73.
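
To illustrate why the relation between Ψ and Φ matters for lensing, the following Python sketch evaluates the standard weak-field deflection angle of a light ray passing a point mass at impact parameter b, α = 2(1 + γ)GM/(c²b), where γ ≡ −Ψ/Φ (so γ = 1 in GR with the sign conventions above). The function names and the choice of the Sun as a test case are assumptions made for illustration; they are not taken from the review.

```python
import math

# Physical constants (SI units).
GM_SUN = 1.32712440018e20   # solar gravitational parameter, m^3 s^-2
C_LIGHT = 299_792_458.0     # speed of light, m s^-1
R_SUN = 6.957e8             # nominal solar radius, m


def deflection_angle(gm: float, impact_parameter: float, gamma: float = 1.0) -> float:
    """Weak-field light deflection (radians) by a point mass.

    gamma = -Psi/Phi measures how much the spatial part of the metric
    contributes relative to g_00; gamma = 1 corresponds to Psi = -Phi.
    """
    return 2.0 * (1.0 + gamma) * gm / (C_LIGHT ** 2 * impact_parameter)


def to_arcsec(angle_rad: float) -> float:
    """Convert an angle from radians to arcseconds."""
    return math.degrees(angle_rad) * 3600.0


if __name__ == "__main__":
    gr_value = to_arcsec(deflection_angle(GM_SUN, R_SUN, gamma=1.0))
    no_psi = to_arcsec(deflection_angle(GM_SUN, R_SUN, gamma=0.0))
    print(f"Psi = -Phi : {gr_value:.2f} arcsec at the solar limb")
    print(f"Psi = 0    : {no_psi:.2f} arcsec at the solar limb")
```

Setting γ = 0 halves the deflection, which shows how a theory that modifies only \({g_{00}}\) would infer less “missing mass” from lensing than from dynamics.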

7.1 Scalar-tensor k-essence

MOND is an acceleration-based modification of gravity in the ultra-weak-field limit, but since the Christoffel symbol, playing the role of acceleration in GR, is not a tensor, it is, in principle, not possible to make a general relativistic theory depend on it. Another natural way to account for the departure from Newtonian gravity in the weak-field limit and to account for the violation of the SEP inherent to the external field effect is to resort to a scalar-tensor theory, as first proposed by [38]. The added scalar field can play the role of an auxiliary potential, and its gradient then has the dimensions of acceleration and can be used to enforce the acceleration-based modification of MOND.

The relativistic theory of [38] depends on two fields, an “Einstein metric” \({{\tilde g}_{\mu \nu}}\) and a scalar field ϕ. The physical metric \({g_{\mu \nu}}\) entering the matter action is then given by a conformal transformation of the Einstein metric (Footnote 47) through an exponential coupling function:

$${g_{\mu \nu}} \equiv {e^{2\phi}}{\tilde g_{\mu \nu}}.$$

In order to recover the MOND dynamics, the Einstein-Hilbert action (involving the Einstein metric) remains unchanged \(\left({\int {{d^4}x\sqrt {- \tilde g} \tilde R}} \right)\), and the dimensionless scalar field is given a k-essence action, with no potential and a non-linear, aquadratic, kinetic term (Footnote 48) inspired by the AQUAL action of Eq. 16:

$${S_\phi} \equiv - {{{c^4}} \over {2{k^2}{l^2}G}}\int {{d^4}x\sqrt {- \tilde g} f(X),}$$

where k is a dimensionless constant, l is a length-scale, \(X = k{l^2}{{\tilde g}^{\mu \nu}}\phi {,_\mu}\phi {,_\nu}\), and f (X) is the “MOND function”. Since the action of the scalar field is similar to that of the potential in the Bekenstein-Milgrom version of classical MOND, this relativistic version is known as the Relativistic Aquadratic Lagrangian theory, RAQUAL.

Varying the action w.r.t. the scalar field yields, in a static configuration, the following modified Poisson’s equation for the scalar field:

$${c^2}\nabla .[\nabla \phi {f{\prime}}(k{l^2}\vert \nabla \phi \vert ^{2})] = kG\rho ,$$

and the (0, 0) component of the physical metric is given by \({g_{00}} = - {e^{2({\Phi _N} + {c^2}\phi)/{c^2}}}\), leading us precisely to the situation of Eq. 40 in the weak-field, with \(\Phi = {\Phi _N} + {c^2}\phi\), with

$$s = ({c^2}/{a_0})\vert \nabla \phi \vert = {(X{c^4}/k{l^2}a_0^2)^{1/2}}$$


$$\tilde \mu (s) = (4\pi {c^2}/k){f{\prime}}(X),$$

whose finely tuned relation with the μ-function of Milgrom’s law is extensively described in Section 6.2. We note that the standard choice for X</