Abstract
A wealth of astronomical data indicate the presence of mass discrepancies in the Universe. The motions observed in a variety of classes of extragalactic systems exceed what can be explained by the mass visible in stars and gas. Either (i) there is a vast amount of unseen mass in some novel form (dark matter), or (ii) the data indicate a breakdown of our understanding of dynamics on the relevant scales, or (iii) both. Here, we first review a few outstanding challenges for the dark matter interpretation of mass discrepancies in galaxies, purely based on observations and independently of any alternative theoretical framework. We then show that many of these puzzling observations are predicted by one single relation, Milgrom’s law, involving an acceleration constant a_{0} (or a characteristic surface density Σ_{†} = a_{0}/G) on the order of the square root of the cosmological constant in natural units. This relation can at present most easily be interpreted as the effect of a single universal force law resulting from a modification of Newtonian dynamics (MOND) on galactic scales. We exhaustively review the current observational successes and problems of this alternative paradigm at all astrophysical scales, and summarize the various theoretical attempts (TeVeS, GEA, BIMOND, and others) made to effectively embed this modification of Newtonian dynamics within a relativistic theory of gravity.
Introduction
Two of the most tantalizing mysteries of modern astrophysics are known as the dark matter and dark energy problems. These problems come from the discrepancies between, on one side, the observations of galactic and extragalactic systems (as well as the observable Universe itself in the case of dark energy) by astronomical means, and on the other side, the predictions of general relativity from the observed amount of matter-energy in these systems. In short, what astronomical observations are telling us is that the dynamics of galactic and extragalactic systems, as well as the expansion of the Universe itself, do not correspond to the observed mass-energy as they should if our understanding of gravity is complete. Thus, this indicates either (i) the presence of unseen (and yet unknown) mass-energy, or (ii) a failure of our theory of gravity, or (iii) both.
The third case is a priori the most plausible, as there are good reasons for there being more particles than those of the standard model of particle physics [257] (actually, even in the case of baryons, we suspect that a lot of them have not yet been seen and, thus, literally make up unseen mass, in the form of “missing baryons”), and as there is a priori no reason that general relativity should be valid over a wide range of scales, where it has never been tested [45], and where the need for a dark sector actually prevents the theory from being tested until this sector has been detected by other means than gravity itself^{Footnote 1}. However, either of the first two cases could be the dominant explanation of the discrepancies in a given class of astronomical systems (or even in all astronomical systems), and this is actually testable.
For instance, as far as (ii) is concerned, if the mass discrepancies in a class of systems are mostly caused by some subtle change in gravitational physics, then there should be a clear signature of a single, universal force law at work in this whole class of systems. If instead there is a distinct dark matter component in these, the kinematics of any given system should then depend on the particular distribution of both dark and luminous mass. This distribution would vary from system to system, depending on their environment and past history of formation, and should, in principle, not result in anything like an apparent universal force law^{Footnote 2}.
Over the years, there have been a large variety of such attempts to alter the theory of gravity in order to remove the need for dark matter and/or dark energy. In the case of dark energy, there is some wiggle room, but in the case of dark matter, most of these alternative gravity attempts fail very quickly, and for a simple reason: once a force law is specified, it must fit all relevant kinematic data in a given class of systems, with the mass distribution specified by the visible matter only. This is a tall order with essentially zero wiggle room: at most one particular force law can work. However, among all these attempts, there is one survivor: the Modified Newtonian Dynamics (MOND) hypothesized by Milgrom almost 30 years ago [294, 295, 293] seems to come close to satisfying the criterion of a universal force law in a whole class of systems, namely galaxies. This success implies a unique relationship between the distribution of baryons and the gravitational field in galaxies and is extremely hard to understand within the present dominant paradigm of the concordance cosmological model, hypothesizing that general relativity is correct on every relevant scale in cosmology including galactic scales, and that the dark sector in galaxies is made of nonbaryonic dissipationless and collisionless particles. Even if such particles are detected directly in the near to far future, the success of MOND on galaxy scales as a phenomenological law, as well as the associated appearance of a universal critical acceleration constant a_{0} ≃ 10^{−10} m s^{−2} in various, seemingly unrelated, aspects of galaxy dynamics, will still have to be explained and understood by any successful model of galaxy formation and evolution. Previous reviews of various aspects of MOND, at an observational and theoretical level, can be found in [34, 81, 100, 151, 279, 311, 318, 401, 407, 429]. 
A website dedicated to this topic is also maintained, with all the relevant literature as well as introductory level articles [263] (see also [238]).
Here, we first review the basics of the dark matter problem (Section 2) as well as the basic ingredients of the present-day concordance model of cosmology (Section 3). We then point out a few outstanding challenges for this model (Section 4), both from the point of view of unobserved predictions of the model, and from the point of view of unpredicted observations (all uncannily involving a common acceleration constant a_{0}). Up to that point, the challenges presented are purely based on observations, and are fully independent of any alternative theoretical framework^{Footnote 3}. We then show that, surprisingly, many of these puzzling observations can be summarized within one single empirical law, Milgrom’s law (Section 5), which can be most easily (although not necessarily uniquely) interpreted as the effect of a single universal force law resulting from a modification of Newtonian dynamics (MOND) in the weak-acceleration regime a < a_{0}, for which we present the current observational successes and problems (Section 6). We then summarize the various attempts currently made to embed this modification in a generally-covariant relativistic theory of gravity (Section 7) and how such theories allow new predictions on gravitational lensing (Section 8) and cosmology (Section 9). We finally draw conclusions in Section 10.
The Missing Mass Problem in a Nutshell
There exists overwhelming evidence for mass discrepancies in the Universe from multiple independent observations. This evidence involves the dynamics of extragalactic systems: the motions of stars and gas in galaxies and clusters of galaxies. Further evidence is provided by gravitational lensing, the temperature of hot, X-ray emitting gas in clusters of galaxies, the large scale structure of the Universe, and the gravitating mass density of the Universe itself (Figure 1). For an exhaustive historical review of the problem, we refer the reader to [394].
The data leave no doubt that when the law of gravity as currently known is applied to extragalactic systems, it fails if only the observed stars and gas are included as sources in the stress-energy tensor. This leads to a stark choice: either the Universe is pervaded by some unseen form of mass (dark matter), or the dynamical laws that lead to this inference require revision. Though the mass discrepancy problem is now well established [394, 465], such a dramatic assertion warrants a brief review of the evidence.
Historically, the first indications of the modern missing mass problem came in the 1930s shortly after galaxies were recognized to be extragalactic in nature. Oort [342] noted that the sum of the observed stars in the vicinity of the sun fell short of explaining the vertical motions of stars in the disk of the Milky Way. The luminous matter did not provide a sufficient restoring force for the observed stellar vertical oscillations. This became known as the Oort discrepancy. Around the same time, Zwicky [518] reported that the velocity dispersion of galaxies in clusters of galaxies was far too high for these objects to remain bound for a substantial fraction of cosmic time. The Oort discrepancy was approximately a factor of two in amplitude, and confined to the Galactic disk — it required local dark matter, not necessarily the quasispherical halo we now envision. It was long considered a serious problem, but has now largely (though perhaps not fully) gone away [194, 240]. The discrepancy Zwicky reported was less subtle, as the required dark mass outweighed the visible stars by a factor of at least 100. This result was apparently not taken seriously at the time.
One of the first indications of the need for dark matter in modern times came from the stability of galactic disks. Stars in spiral galaxies like the Milky Way are predominantly on approximately circular orbits, with relatively few on highly eccentric orbits [132]. The small velocity dispersion of stars relative to their circular velocities makes galactic disks dynamically cold. Early simulations [343] revealed that cold, self-gravitating disks were subject to severe instabilities. In order to prevent the rapid, self-destructive growth of these instabilities, and hence preserve the existence of spiral galaxies over a sizable fraction of a Hubble time, it was found to be necessary to embed the disk in a quasispherical potential well, a role that could be played by a halo of dark matter, as first proposed in 1973 by Ostriker & Peebles [343].
Perhaps the most persuasive piece of evidence was then provided, notably through the seminal works of Bosma and Rubin, by establishing that the rotation curves of spiral galaxies are approximately flat [67, 370]. A system obeying Newton’s law of gravity should have a rotation curve that, like the Solar system, declines in a Keplerian manner once the bulk of the mass is enclosed: V_{ c } ∝ r^{−1/2}. Instead, observations indicated that spiral galaxy rotation curves tended to remain approximately flat with increasing radius: V_{ c } ∼ constant. This was shown to happen over and over and over again [370] with the approximate flatness of the rotation curve persisting to the largest radii observable [67], well beyond where the details of each galaxy’s mass distribution mattered, so that Keplerian behavior should have been observed. Again, a quasispherical halo of dark matter as proposed by Ostriker and Peebles was implicated.
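The contrast between a Keplerian decline and a flat rotation curve can be illustrated numerically. The following is a minimal sketch, not taken from the works cited above: the galaxy mass and radii are hypothetical illustrative values, and the enclosed-mass formula M(r) = V_c^2 r/G assumes a spherically symmetric mass distribution.

```python
import math

# Gravitational constant in kpc (km/s)^2 / Msun
G = 4.30091e-6

def keplerian_speed(mass_msun, radius_kpc):
    """Circular speed at radius_kpc around a mass entirely enclosed within it."""
    return math.sqrt(G * mass_msun / radius_kpc)

def enclosed_mass_for_speed(speed_kms, radius_kpc):
    """Enclosed mass Newtonian gravity needs to sustain speed_kms at radius_kpc
    (assumes spherical symmetry)."""
    return speed_kms ** 2 * radius_kpc / G

# Hypothetical galaxy: 5e10 Msun of stars and gas, all inside 10 kpc
visible_mass = 5.0e10
radii_kpc = [10.0, 20.0, 40.0, 80.0]

v_kepler = [keplerian_speed(visible_mass, r) for r in radii_kpc]
v_flat = v_kepler[0]  # a flat curve keeps the speed reached at 10 kpc
mass_needed = [enclosed_mass_for_speed(v_flat, r) for r in radii_kpc]

for r, v, m in zip(radii_kpc, v_kepler, mass_needed):
    print(f"r = {r:4.0f} kpc  Keplerian V_c = {v:5.1f} km/s  "
          f"mass needed for flat curve = {m:.2e} Msun")
```

In this sketch the Keplerian speed halves for every factor of four in radius, whereas keeping the curve flat requires an enclosed mass that grows linearly with radius, which is the behavior attributed to the dark halo.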
Other types of galaxies exhibit mass discrepancies as well. Perhaps most notable are the dwarf spheroidal galaxies that are satellites of the Milky Way [427, 477] and of Andromeda [217]. These satellites are tiny by galaxy standards, possessing only millions, or in the case of the ultrafaint dwarfs, thousands, of individual stars. They are close enough that the line-of-sight velocities of individual stars can be measured, providing for a precise measurement of the system’s velocity dispersion. The mass inferred from these motions (roughly, M ∼ rσ^{2}/G) greatly exceeds the mass visible in luminous stars. Indeed, these dim satellite galaxies exhibit some of the largest mass discrepancies observed. In contrast, bright giant elliptical galaxies (often composed of much more than the ∼ 10^{11} stars of the Milky Way) exhibit remarkably modest and hard to detect mass discrepancies [367]. Thus, it is inferred that fainter galaxies are progressively more dark-matter dominated than bright ones. However, as we shall expand on in Section 4.3, the primary correlation is not with luminosity, but with surface brightness: the lower the surface brightness of a system, the larger its mass discrepancy [279].
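The order-of-magnitude estimate M ∼ rσ^{2}/G quoted above can be evaluated directly. The sketch below uses hypothetical values for the radius, velocity dispersion, luminosity, and stellar mass-to-light ratio of a dwarf spheroidal; they are chosen for illustration only and are not measurements of any particular galaxy.

```python
# Gravitational constant in pc (km/s)^2 / Msun
G = 4.30091e-3

def dynamical_mass(radius_pc, sigma_kms):
    """Rough dynamical mass M ~ r * sigma^2 / G, in solar masses."""
    return radius_pc * sigma_kms ** 2 / G

# Hypothetical dwarf spheroidal (illustrative values, not real data)
radius_pc = 300.0        # characteristic radius
sigma_kms = 8.0          # line-of-sight velocity dispersion
luminosity_lsun = 3.0e5  # total stellar luminosity
stellar_m_over_l = 1.0   # assumed stellar mass-to-light ratio, Solar units

m_dyn = dynamical_mass(radius_pc, sigma_kms)
m_stars = stellar_m_over_l * luminosity_lsun
discrepancy = m_dyn / m_stars

print(f"dynamical mass ~ {m_dyn:.2e} Msun")
print(f"stellar mass   ~ {m_stars:.2e} Msun")
print(f"mass discrepancy ~ {discrepancy:.0f}")
```

With these assumed inputs the dynamical mass exceeds the stellar mass by more than an order of magnitude, which is the sense in which such systems are called dark-matter dominated.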
On larger scales, groups and clusters of galaxies also show mass discrepancies, just as individual galaxies do. One of the earliest lines of evidence comes from the “timing argument” in the Local Group [213]. Presumably the material that was to become the Milky Way and Andromeda (M31) was initially expanding apart with the general Hubble expansion. Currently they are approaching one another at ∼ 100 km s^{−1}. In order for the Milky Way and M31 to have overcome the initial expansion and fallen back towards one another, there must be a greater-than-average gravitating mass between the two. To arrive at their present separation with the observed blueshifted line of sight velocity after a Hubble time requires a dynamical mass-to-light ratio M/L > 80. This greatly exceeds the mass-to-light ratio of the stars themselves, which is of order unity in Solar units [42] (the Sun is a fairly average star, so averaged over many stars each Solar mass produces roughly one Solar luminosity).
Rich clusters of galaxies are rare structures containing dozens or even hundreds of bright galaxies. These objects exhibit mass discrepancies in several distinct ways. Measurements of the redshifts of individual cluster members give velocity dispersions in the vicinity of 1,000 km s^{−1}, typically implying dynamical mass-to-light ratios in excess of 100 [24]. The actual mass discrepancy is not this large, as most of the detected baryonic mass in clusters is in a diffuse intracluster gas rather than in the stars in the galaxies (something Zwicky was not aware of back in 1933). This gas is heated to the virial temperature and emits X-rays. Mapping the temperature and emission of this X-ray gas provides another probe of the cluster mass through the equation of hydrostatic equilibrium. In order to hold the gas in the clusters at the observed temperatures, the dark matter must outweigh the gas by a factor of ∼ 8 [175]. Furthermore, some clusters are observed to gravitationally lens background galaxies (Figure 1). Once again, mass above and beyond that observed is required to explain this phenomenon [227]. Thus, three independent methods all imply the need for about the same amount of dark matter in clusters of galaxies.
In addition to the abundant evidence for mass discrepancies in the dynamics of extragalactic systems, there are also strong motivations for dark matter in cosmology. Two observations are particularly important: (i) the small baryonic mass density Ω_{ b } inferred from Big-Bang nucleosynthesis (BBN) (and from the measured Hubble parameter), and (ii) the growth of large scale structure by a factor of ∼ 10^{5} from the surface of last scattering of the cosmic microwave background at redshift z ∼ 1000 until present-day z = 0, implying Ω_{m} > Ω_{ b }. Together, these observations imply not only the need for dark matter, but for some exotic new form of non-baryonic cold dark matter. Indeed, observational estimates of the gravitating mass density of the Universe Ω_{m}, measured, for instance, from peculiar galaxy (or large-scale) velocity fields, have, for several decades, persistently returned values in the range 1/4 < Ω_{m} < 1/3 [116]. While shy of the value needed for a flat Universe, this mass density is well in excess of the baryon density inferred from BBN. The observed abundances of the light isotopes deuterium, helium, and lithium are consistent with having been produced in the first few minutes after the Big Bang if the baryon density is just a few percent of the critical value: Ω_{ b } < 0.05 [480, 107]. Thus, Ω_{ m } > Ω_{ b }. Consequently, we do not just need dark matter, we need the dark matter to be non-baryonic.
Another early Universe constraint is provided by the Cosmic Microwave Background (CMB). The small (micro-Kelvin) amplitude of the temperature fluctuations at the time of baryon-photon decoupling (z ∼ 1000) indicates that the Universe was initially very homogeneous, roughly to one part in 10^{5}. The Universe today (z = 0) is very inhomogeneous, at least on “small” scales of less than ∼ 100 Mpc (∼ 3 × 10^{8} ly), with huge density contrasts between planets, stars, galaxies, clusters, and empty intergalactic space. The only attractive long-range force acting on the entire Universe, that can make such structures, is gravity. In a rich-get-richer, poor-get-poorer process, the small initial overdensities attract more mass and grow into structures like galaxies while underdense regions become less dense, leading to voids. The catch is that gravity is rather weak, so this process takes a long time. If the baryon density from BBN is all we have to work with, we can only obtain a growth factor of ∼ 10^{2} in a Hubble time [424], orders of magnitude short of the observed 10^{5}. The solution is to boost the growth rate with extra invisible mass displaying larger density fluctuations: dark matter. In order not to make the same mark on the CMB that baryons would, this dark matter must not interact with the photons. So, in effect, the density fluctuations in the dark matter can already be very large at the epoch of baryon-photon decoupling, especially if the dark matter is cold (i.e., with effectively zero Jeans length). The baryons fall into the already deep dark matter potential wells only after that, once released from their electromagnetic link to the photon bath. Before decoupling, the fluctuations in the baryon-photon fluid did not grow but were oscillating in the form of acoustic waves, maintaining the same amplitude as when they entered the horizon; actually they were even slightly diffusion-damped.
In principle, at baryon-photon decoupling, CMB fluctuations on smaller angular scales, having entered the horizon earlier, would have been damped with respect to those on larger scales (Silk damping). Nevertheless, the presence of decoupled non-baryonic dark matter would provide a net forcing term countering the damping of the oscillations at recombination, meaning that the second and third acoustic peaks of the CMB could then be of equal amplitude rather than exhibiting a damping tail. The actual observation of a high third peak in the CMB angular power spectrum is another piece of compelling evidence for non-baryonic dark matter (see, e.g., [229]). Both BBN and the CMB thus drive us to consider a form of mass that is non-baryonic and which does not interact electromagnetically. Moreover, in order to form structure (see Section 3.2), the mass must be dynamically cold (i.e., moving much slower than the speed of light when it decouples from the photon bath), and is known as cold dark matter (CDM).
Now, in addition to CDM, modern cosmology also requires something even more mysterious, dubbed dark energy. The fact that the baryon fraction in clusters of galaxies was such that Ω_{ m } was implied to be much smaller than 1 (the value needed for a flat Euclidean Universe favored by inflationary models), as well as tensions between the measured Hubble parameter and independent estimates of the age of the Universe, led Ostriker & Steinhardt [344] to propose in 1995 a “concordance model of cosmology” or ΛCDM model, where a cosmological constant Λ, supposed to represent vacuum energy or dark energy, provided the major contribution to the Universe’s energy density. Three years later, the observations of SNIa [351, 365] indicating late-time acceleration of the Universe’s expansion, led most people to accept this model. This concordance model has since been refined and calibrated through subsequent large-scale observations of the CMB and of the matter power spectrum, to lead to the favored cosmological model prevailing today (see Section 3). However, as we shall see, curious coincidences of scales between the dark matter and dark energy sectors (see Section 4.1) have prompted the question of whether these two sectors are really physically independent, and the existence of dark energy itself has led to a renewed interest in modified gravity theories as a possible alternative to this exotic fluid [100].
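The growth-factor shortfall described above can be checked with a rough calculation. The sketch below is an illustration under stated assumptions, not the calculation of [424]: it uses the Carroll, Press & Turner (1992) fitting formula for the suppression of linear growth relative to an Einstein-de Sitter Universe, neglects radiation, sets Λ = 0, assumes growth starts at z = 1000, and takes Ω_{ b } = 0.046 as the only matter.

```python
def growth_suppression(omega_m, omega_lambda=0.0):
    """Carroll-Press-Turner fitting formula: present-day linear growth
    relative to an Einstein-de Sitter Universe (approximation)."""
    return 2.5 * omega_m / (
        omega_m ** (4.0 / 7.0)
        - omega_lambda
        + (1.0 + omega_m / 2.0) * (1.0 + omega_lambda / 70.0)
    )

def growth_since(z_start, omega_m, omega_lambda=0.0):
    """Approximate linear growth factor from z_start to z = 0.
    Assumes the Universe was effectively Einstein-de Sitter at z_start,
    so that perturbations initially grow in proportion to the scale factor."""
    return growth_suppression(omega_m, omega_lambda) * (1.0 + z_start)

z_decoupling = 1000.0  # approximate redshift of baryon-photon decoupling

baryons_only = growth_since(z_decoupling, omega_m=0.046)
einstein_de_sitter = growth_since(z_decoupling, omega_m=1.0)

print(f"baryons only (Omega_m = 0.046): growth ~ {baryons_only:.0f}")
print(f"Einstein-de Sitter (Omega_m = 1): growth ~ {einstein_de_sitter:.0f}")
```

Under these assumptions the baryon-only case yields a growth factor of order 10^{2}, far short of the required 10^{5}; this is why the dark matter fluctuations need to be already large at decoupling.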
A Brief Overview of the ΛCDM Cosmological Model
General relativity provides a clear and compelling cosmology, the Friedmann-Lemaître-Robertson-Walker (FLRW) model. The expansion of the Universe discovered by Hubble and Slipher found a natural explanation^{Footnote 4} in this context. The picture of a hot Big-Bang cosmology that emerged from this model famously predicted the existence of the 3 degree CMB and the abundances of the light isotopes via BBN.
Within the FLRW framework, we are inexorably driven to infer the existence of both non-baryonic cold dark matter and a non-zero cosmological constant as discussed in Section 2. The resulting concordance ΛCDM model, first proposed in 1995 by Ostriker and Steinhardt [344], is supported by a wealth of observations: the consistency of the Hubble parameter with the ages of the oldest stars [344], the consistency between the dynamical mass density of the Universe, that of baryons from BBN (see also discussion in Section 9.2), and the baryon fraction of clusters [486], as well as the power spectrum of density perturbations [103, 452]. A prediction of the concordance model is that the expansion rate of the Universe should be accelerating; this was confirmed by observations of high redshift Type Ia supernovae [351, 365]. Another successful prediction was the scale of the baryonic acoustic oscillation [134]. Perhaps the most emphatic support for ΛCDM comes from fits to the acoustic power spectrum of temperature fluctuations in the CMB [229].
For a brief review of the basics and successes of the concordance cosmological model we refer the reader to, e.g., [87, 349] and all references therein. We note that, while most of the cosmological probes in the above list are not uniquely fit by the ΛCDM model on their own, when they are taken together they provide a remarkably tight set of constraints. The success of this now favoured cosmological model on large scales is, thus, remarkable indeed, as there was a priori no reason that such a parameterized cosmology could explain all these completely independent data sets with such outstanding consistency.
In this model, the Hubble constant is H_{0} = 70 km s^{−1} Mpc^{−1} (i.e., h = 0.7), the amplitude of density fluctuations within a top-hat sphere of 8h^{−1} Mpc is σ_{8} = 0.8, the optical depth to reionization is τ = 0.08, and the spectral index measuring how fluctuations change with scale is n_{ s } = 0.97. The price we pay for the outstanding success of the model is new physics in the form of a dark sector. This dark sector makes up 95% of the mass-energy content of the Universe in ΛCDM: it is composed separately of a dark energy sector and a cold dark matter sector, which we briefly describe below.
Dark Energy (Λ)
In ΛCDM, dark energy is a nonvanishing vacuum energy represented by the cosmological constant Λ in the field equations of general relativity. Einstein’s cosmological constant is equivalent to vacuum energy with equation of state p/ρ = w = −1. In principle, the equation of state could be merely close to, but not exactly w = −1. In this case, the dark energy could evolve and clump, depending on the value of w and its evolution ẇ. However, to date, there is no compelling observational reason to require any form of dark energy more complex than the simple cosmological constant introduced by Einstein.
The various observational datasets discussed above constrain the ratio of the dark energy density to the critical density to be \({\Omega _\Lambda} = \Lambda/3H_0^2 = 0.73\), where H_{0} is Hubble’s constant and Λ is expressed in s^{−2}. This value, together with the matter density Ω_{ m } (see below), leads to a total Ω = Ω_{Λ} + Ω_{ m } = 1, i.e., a spatially-flat Euclidean geometry in the Robertson-Walker sense that is nicely consistent with the expectations of inflation. It is important to stress that this model relies on the cosmological principle, i.e., that our observational location in the Universe is not special, and on the fact that on large scales, the Universe is isotropic and homogeneous. For possible challenges to these assumptions and their consequences, we refer the reader to, e.g., [83, 487, 488].
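As a worked example of the relation Ω_{Λ} = Λ/3H_{0}^{2}, the value of Λ in s^{−2} follows from the quoted H_{0} = 70 km s^{−1} Mpc^{−1} and Ω_{Λ} = 0.73. The sketch below only performs the unit conversion and the arithmetic; the variable names are ours.

```python
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec

h0_kms_per_mpc = 70.0   # Hubble constant quoted in the text
omega_lambda = 0.73     # dark energy density parameter quoted in the text

# Convert H0 from km/s/Mpc to s^-1
h0_per_second = h0_kms_per_mpc / KM_PER_MPC

# Invert Omega_Lambda = Lambda / (3 H0^2) to obtain Lambda in s^-2
lambda_per_s2 = 3.0 * omega_lambda * h0_per_second ** 2

print(f"H0     = {h0_per_second:.3e} s^-1")
print(f"Lambda = {lambda_per_s2:.3e} s^-2")
```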
Cold Dark Matter (CDM)
In ΛCDM, dark matter is assumed to be made of non-baryonic dissipationless massive particles [48], the “cold dark matter” (CDM). This dark matter outweighs the baryons that participate in BBN by about 5:1. The density of baryons from the CMB is Ω_{ b } = 0.046, broadly consistent with BBN [229]. This is a small fraction of the critical density; with the non-baryonic dark matter the total matter density is Ω_{ m } = Ω_{cdm} + Ω_{ b } = 0.27.
The “cold” in cold dark matter means that CDM moves slowly so that it is non-relativistic when it decouples from photons. This allows it to condense and begin to form structure, while the baryons are still electromagnetically coupled to the photon fluid. After recombination, when protons and electrons first combine to form neutral atoms so that the cross-section for interaction with the photon bath suddenly drops, the baryons can fall into the potential wells already established by the dark matter, leading to a hierarchical scenario of structure formation with the repeated merger of smaller CDM clumps to form ever larger clumps.
Particle candidates for the CDM must be massive, non-baryonic, and immune to electromagnetic interactions. The currently preferred CDM candidates are Weakly Interacting Massive Particles (WIMPs, [46, 47, 48]) that condensed from the thermal bath of the early Universe. These should have masses on the order of about 100 GeV so that (i) the free-streaming length is small enough to create small-scale structures as observed (e.g., dwarf galaxies), and (ii) thermal relics with cross-sections typical for weak nuclear reactions account for the right amount of matter density Ω_{ m } (see, e.g., Eq. 28 of [48]). This last point is known as the WIMP miracle^{Footnote 5}.
For lighter particle candidates (e.g., ordinary neutrinos or light sterile neutrinos), the damping scale becomes too large. For instance, a hot dark matter (HDM) particle candidate with mass of a few to 15 eV would have a free-streaming length of about 100 Mpc, leading to too little power at the small-scale end of the matter power spectrum. The existence of galaxies at redshift z ∼ 6 implies that the coherence length should have been smaller than 100 kpc or so, meaning that even warm dark matter (WDM) particles with masses between 1 and 10 keV are close to being ruled out as well (see, e.g., [348]). Thus, ΛCDM presently remains the state-of-the-art in cosmology, although some of the challenges listed in Section 4 are leading to a slow drift of the standard concordance model from CDM to WDM [252]. This drift brings along its own problems, however, and fails to address most of the current observational challenges summarized in Section 4, which might perhaps point to a more radical alternative to the model.
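The matter budget quoted above can be turned into explicit numbers. The sketch below is illustrative: it takes Ω_{ m } = 0.27, Ω_{ b } = 0.046, and H_{0} = 70 km s^{−1} Mpc^{−1} from the text, and uses the standard definition of the critical density, ρ_{crit} = 3H_{0}^{2}/8πG, which the text uses implicitly but does not write out.

```python
import math

G_SI = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec

h0_per_second = 70.0 / KM_PER_MPC  # H0 = 70 km/s/Mpc in s^-1

omega_m = 0.27   # total matter density parameter (from the text)
omega_b = 0.046  # baryon density parameter (from the text)

# Cold dark matter share and its ratio to the baryons
omega_cdm = omega_m - omega_b
cdm_to_baryon_ratio = omega_cdm / omega_b

# Critical density (standard definition) and physical densities in kg/m^3
rho_crit = 3.0 * h0_per_second ** 2 / (8.0 * math.pi * G_SI)
rho_b = omega_b * rho_crit
rho_cdm = omega_cdm * rho_crit

print(f"Omega_cdm = {omega_cdm:.3f}, CDM/baryon ratio = {cdm_to_baryon_ratio:.2f}")
print(f"rho_crit = {rho_crit:.2e} kg/m^3")
print(f"rho_b    = {rho_b:.2e} kg/m^3, rho_cdm = {rho_cdm:.2e} kg/m^3")
```

The ratio comes out just below five, which is the "about 5:1" quoted in the text.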
Some Challenges for the ΛCDM Model
The great concordance of independent cosmological observables from Gpc to Mpc scales lends a certain air of inevitability to the ΛCDM model. If we accept these observables as sufficient to prove the model, then any discrepancy appears as trivia that will inevitably be explained away. If instead we require a higher standard, such as positive laboratory evidence for the dark sectors, then ΛCDM appears as a yet unproven hypothesis that relies heavily on two potentially fictitious invisible entities. Thus, an important test of ΛCDM as a scientific hypothesis is the existence of dark matter. By this we mean not just unseen mass, but specifically CDM: some novel form of particle with the right microscopic properties and correct cosmic mass density. Searches for WIMPs are now rather mature and not particularly encouraging. Direct detection experiments have as yet no positive detections, and have now excluded [19] the bulk of the parameter space (interaction cross-section and particle mass) where WIMPs were expected to reside. Indirect detection searches, through the observation of γ-rays produced by the self-annihilation^{Footnote 6} of WIMPs in the galactic halo and in nearby satellite galaxies, have similarly returned null results [6, 84, 172] at interestingly restrictive levels. For the most-plausible minimally-supersymmetric models, particle colliders should already have produced evidence for WIMPs [2, 1, 23]. The right model need not be minimal. It is always possible to construct a more complicated model that manages to evade all experimental constraints. Indeed, it is readily possible to imagine dark matter candidates that do not interact at all with the rest of the Universe except through gravity. Though logically possible, such dark matter candidates are profoundly unsatisfactory in that they could not be detected in the laboratory: their hypothesized existence could neither be confirmed nor falsified.
Apart from this current non-detection of CDM candidates, there also exist prominent observational challenges for the ΛCDM model, which might point towards the necessity of an alternative model (or, at the very least, an improved one). These challenges are that (i) some of the parameters of the model appear fine-tuned (Section 4.1), and that (ii) as far as galaxy formation and evolution are concerned (mainly processes happening on kpc scales so that the predictions are more difficult to make because the baryon physics should play a more prominent role), many predictions that have been made were not successful (Section 4.2); (iii) what is more, a number of observations on these galactic scales do exhibit regularities that are fully unexpected in any CDM context without a substantial amount of fine-tuning in terms of baryon feedback (Section 4.3).
Coincidences
What is generally considered as the biggest problem for the ΛCDM model is that it requires a large and still unexplained fine-tuning to reduce by 120 orders of magnitude the theoretical expectation of the vacuum energy to yield the observed cosmological-constant value, and, even more importantly, that it faces a coincidence problem to explain why the dark energy density Ω_{Λ} is precisely of the same order of magnitude as the other cosmological components today^{Footnote 7}. This uncanny coincidence is generally seen as evidence for some yet-to-be-discovered underlying cosmological mechanism ruling the evolution of dark energy (such as quintessence or generalized additional fluid components, see, e.g., [106]). But it could also indicate that the effect attributed to dark energy is rather due to a breakdown of general relativity (GR) on the largest scales [158].
Then, as we shall see in more detail in Section 4.3, another coincidence, which is central to this whole review, is the appearance of a characteristic scale, dubbed a_{0}, in the behavior of the dark matter sector, a scale with units of acceleration. This acceleration scale appears in various seemingly unrelated galactic scaling relations, mostly unpredicted by the ΛCDM model (see Section 4.3). The value of this scale is a_{0} ≃ 10^{−10} m s^{−2}, which yields in natural units^{Footnote 8}, a_{0} ∼ H_{0} (or, more precisely, a_{0} ≈ cH_{0}/2π). It is perhaps even more meaningful [51, 298, 304] to note that, in these same units:
\(a_0^2 \sim \Lambda,\)
where Λ is the currently-favored value of the cosmological constant^{Footnote 9}. Whether these numerical coincidences are physically relevant or just true (insignificant) coincidences remains an open question, closely related to the nature of the dark sector, which we are going to elaborate on in Sections 5–10. But, at this stage, it is in any case striking that the dark matter and dark energy sectors do have such a common scale. This coincidence of scales, together with the coincidence of energy densities at redshift zero, might perhaps be a strong indication that one should cease to consider dark energy as an additional component physically independent from the dark matter sector [7], and/or cease to consider that GR correctly describes gravity on the largest scales and in extremely weak gravitational fields, in order to perhaps address the two above coincidence problems at the same time.
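The numerical coincidences between a_{0}, H_{0}, and Λ discussed above are easy to verify in SI units. The sketch below is a rough check, assuming H_{0} = 70 km s^{−1} Mpc^{−1} and Ω_{Λ} = 0.73 as in Section 3; the last quantity, c√(Λ/3)/2π, is our own analogue of cH_{0}/2π (it follows from Ω_{Λ} = Λ/3H_{0}^{2}) and is included only for comparison.

```python
import math

C = 2.99792458e8        # speed of light, m/s
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec

a0_observed = 1.0e-10   # m/s^2, approximate value quoted in the text

h0_per_second = 70.0 / KM_PER_MPC               # H0 in s^-1
lambda_per_s2 = 3.0 * 0.73 * h0_per_second ** 2  # Lambda in s^-2

# a0 ~ c H0 / (2 pi), restoring the factor of c dropped in natural units
a0_from_h0 = C * h0_per_second / (2.0 * math.pi)

# a0^2 ~ Lambda, i.e. a0 ~ c sqrt(Lambda): an order-of-magnitude statement
c_sqrt_lambda = C * math.sqrt(lambda_per_s2)

# Assumed analogue of c H0 / (2 pi) with H0 replaced by sqrt(Lambda / 3)
a0_from_lambda = C * math.sqrt(lambda_per_s2 / 3.0) / (2.0 * math.pi)

print(f"c H0 / 2pi           = {a0_from_h0:.2e} m/s^2")
print(f"c sqrt(Lambda)       = {c_sqrt_lambda:.2e} m/s^2")
print(f"c sqrt(Lambda/3)/2pi = {a0_from_lambda:.2e} m/s^2")
print(f"observed a0          ~ {a0_observed:.1e} m/s^2")
```

All three combinations fall within roughly an order of magnitude of the observed a_{0}, which is the sense of the "∼" in the relations above.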
Finally, let us note that the existence of the a_{0} scale is actually not the only dark-matter-related coincidence, as there is also, in principle, absolutely no reason why the mechanism leading to the baryon asymmetry (between baryonic matter and antimatter) would simultaneously leave both the baryon and dark matter densities with a similar order of magnitude (Ω_{dm}/Ω_{ b } ≈ 5). If the effects we attribute to dark matter are actually also due to a breakdown of GR on cosmological scales, then such a coincidence might perhaps appear more natural as the baryons would then be the actual source of the effect attributed to the dark matter sector.
Unobserved predictions
Apart from the above puzzling coincidences, the concordance ΛCDM model also has a few more concrete empirical challenges to address, in the sense of having made a few predictions in contradiction with observations (with the caveat in mind that the model itself is not always that predictive on small scales). These include the following nonexhaustive list:

1.
The bulk flow challenge. Peculiar velocities of galaxy clusters are predicted to be on the order of 200 km/s in the ΛCDM model. These can actually be measured by studying the fluctuations in the CMB generated by the scattering of the CMB photons by the hot X-ray-emitting gas inside clusters (the kinematic SZ effect). This yields an observed coherent bulk flow of order 1000 km/s (5 times more than predicted) on scales out to at least 400 Mpc [221]. This bulk flow challenge appears not only in SZ studies but also in galaxy studies [483]. A related problem is the collision velocity larger than 3100 km/s for the merging bullet cluster 1E0657-56 at z = 0.3, much too high to be accounted for by ΛCDM [249, 455]. These observations would seem to indicate that the attractive force between DM particles is enhanced compared to what ΛCDM predicts, and changing CDM into WDM would not solve the problem.

2.
The high-z clusters challenge. Observation of even a single massive cluster at high redshift can falsify ΛCDM [331]. In this respect the existence of the galaxy cluster XMMU J2235.3-2557 [368] with a mass of ∼ 4 × 10^{14} M_{⊙} at z = 1.4, even though not sufficient to rule out the model, is very surprising and could indicate that structure formation is actually taking place earlier and faster than in ΛCDM (see also [420] on the Shapley supercluster and the Sloan Great Wall).

3.
The Local Void challenge. The Local Volume is composed of 562 known galaxies at distances smaller than 8 Mpc from the center of the Local Group, and the region known as the “Local Void” hosts only 3 of them. This is far fewer than the ∼ 20 expected for a typical similar void in ΛCDM [350]. What is more, in the Local Volume, large luminous galaxies are overrepresented by a factor of 6 in the underdense regions, exactly opposite to what is expected from ΛCDM. This could mean that the Local Volume is just a statistical anomaly, but it could also point, in line with the two previous challenges, towards more rapid structure formation, allowing sparse regions to more quickly form large galaxies cleaning their environment, making the galaxies larger and the voids emptier at early times [350].

4.
The missing satellites challenge. It has long been known that the model predicts an overabundance of dark subhalos orbiting Milky-Way-sized galaxies compared to the observed number of satellite galaxies around the Milky Way [329]. This is a different problem from the above-predicted overabundance of small galaxies in voids. It has subsequently been suggested that stellar feedback and heating processes limit baryonic growth, that reionisation prevents low-mass dark halos from forming stars, and that tidal forces from the host halo limit growth of the dark-matter subhalos and lead to their truncation. This important theoretical effort has led recent semi-analytic models to predict a reduced number of ∼ 100 to 600 faint satellites rather than the original thousands. Moreover, during the past 15 years 13 “new” and mostly ultra-faint satellite galaxies have been found in addition to the 11 previously-known classical bright ones. Since these new galaxies have been largely discovered with the Sloan Digital Sky Survey (SDSS), and since this survey covered only one fifth of the sky, it has been argued that the problem was solved. However, there are actually still missing satellites on the low mass and high mass end of the mass function predicted by “ΛCDM+reionisation” semi-analytic models. This is best illustrated in Figure 2 of [239] showing the cumulative distribution for the predicted and observationally-derived masses within the central 300 pc of Milky Way satellites. A lot of low-mass satellites are still missing, and the most massive predicted subhaloes are also incompatible with hosting any of the known Milky Way satellites [73, 75, 74]. This is the modern version of the missing satellites challenge. An obvious but rather discomforting way out would be to simply state that the Milky Way must be a statistical outlier, but this is contradicted by the study of [447] on the abundance of bright satellites around Milky-Way-like galaxies in SDSS.
Another solution would be to change from CDM to WDM [252] (it is actually one of the only listed challenges that such a change would probably immediately solve).

5.
The satellites phase-space correlation challenge. In addition to the above challenge, the distribution of dark subhalos around the Galaxy is also predicted by ΛCDM to be isotropic, or quasi-isotropic. However, the Milky Way satellites are currently observed to be correlated in phase-space: they lie within a seemingly rotation-supported disk [239]. Young halo globular clusters define the same disk, and streams of stars and gas, tracing the orbits of the objects from which they are stripped, preferentially lie in this disk, too [347]. Since SDSS covered only one fifth of the sky, it will be interesting to see whether future surveys such as Pan-STARRS will confirm this state of affairs. Whether or not this phase-space correlation would be unique to the Milky Way should also be carefully checked, the evidence in M31 being currently much less convincing, with a richer and more complex satellite population [289]. But in any case, the current distribution of satellites around the Milky Way is statistically incompatible with the predictions of ΛCDM at a very high level of confidence, even when taking into account the observational bias from SDSS [239]. While this might perhaps have been explained by the infall of a small group of galaxies that would have retained correlated orbits, this solution is ruled out by the fact that no nearby groups are observed to be anywhere near as spatially small as the disk of satellites [290]. Another solution might be that most Milky Way satellites are actually not primordial galaxies but old tidal dwarf galaxies created in an early major merger event, accounting for their presently-correlated phase-space distribution [346].
Note in passing that if only one or two long-lived tidal dwarfs are created in each gas-dissipational galaxy encounter, they could probably account for most of the dwarf galaxy population in the Universe, leaving no room for small CDM subhalos to create galaxies, which would transform the missing satellites challenge into a missing satellites catastrophe [239].

6.
The cusp-core challenge. Another long-standing problem of ΛCDM is the fact that the simulations of the collapse of CDM halos lead to a density distribution as a function of radius, ρ(r), which is well fitted by a smooth function asymptoting to a central cusp with slope d ln ρ/d ln r = −1 in the central parts [126, 332], while observations clearly point towards large constant density cores in the central parts [118, 169, 479]. Even though the latest simulations [333] rather point towards Einasto [133] profiles with d ln ρ/d ln r ∝ − r^{(1/n)} (with n slightly varying with halo mass, and n ∼ 6 for a Milky-Way-sized halo, meaning that the slope is zero only very close to the nucleus [177], and is still ∼ −1 at 200 pc from the center), fitting such profiles to observed galactic kinematical data such as rotation curves [88] leads to values of n that are much smaller than simulated values (meaning that they have much larger cores), which is another way of reassessing the old cusp problem of ΛCDM. Note that a change from CDM to WDM could solve the problem in dwarf galaxies, by leading to the formation of small cores, but certainly not in large galaxies where large cores are needed from observations. Thus, one has to rely on baryon feedback to erase the cusp from all galaxies. But this is not easily done, as the adiabatic cooling of baryons in the center of dark matter halos should lead to an even more concentrated dark matter distribution. A possibility would be that angular momentum transfer from a rotating stellar bar destroys dark-matter cusps: however, significant cusp destruction requires substantially more angular momentum than is realistically available in stellar bars [89, 286]. Note also that not all galaxies are barred (e.g., M33 is not).
The state-of-the-art solution nowadays is to enforce strong supernovae outflows that move large amounts of low-angular-momentum gas from the central parts and that “pull” on the central dark matter concentration to create a core [176], but this is still a highly fine-tuned process, which fails to address the baryon fraction problem (see challenge 10 below).

7.
The angular momentum challenge. As a consequence of the merger history of galaxy disks in a hierarchical formation scenario, as well as of the associated transfer of angular momentum from the baryonic disk to the dark halo, the specific angular momentum of the baryons ends up being much too small in simulated disks, which in turn end up much smaller than the observed ones [4]. Similarly, elliptical systems end up too concentrated as well. Addressing this challenge within the standard paradigm essentially relies on forming disks through late-time quiescent gas accretion from large-scale filaments, with far fewer late-time mergers than presently predicted in ΛCDM.

8.
The pure disk challenge. Related to the previous challenge, large bulgeless thin disk galaxies are extremely difficult to produce in simulations. This is because major mergers, at any time in the galaxy formation process, typically create bulges, so bulgeless galaxies would represent the quiescent tail of a distribution of merger histories for galaxies of the Local Volume. However, these bulgeless disk galaxies represent more than half of large galaxies (with V_{ c } > 150 km/s) in the Local Volume [178, 231]. Solving this problem would rely, e.g., on suppressing central spheroid formation for mergers with mass ratios lower than 30% [228].

9.
The stability challenge. Round CDM halos tend to stabilize very low surface density disks against the formation of bars and spirals, due to a lack of disk self-gravity [291]. The observation [282] of Low Surface Brightness (LSB) disk galaxies with strong bars and spirals is thus challenging in the absence of a significant disk component of dark matter. What is more, in the absence of such a disk DM component, the lack of disk self-gravity prevents the creation of very large razor-thin LSB disks, but these are observed [222, 260]. In the standard context, these observations would tend to point towards an additional disk DM component, either a CDM one linked to in-plane accretion of satellites or a baryonic one in the form of molecular gas.

10.
The missing baryons challenge(s). As mentioned above, constraints from the CMB imply Ω_{ m } = 0.27 and Ω_{ b } = 0.046. However, our inventory of known baryons in the local Universe, summing over all observed stars, gas, etc., comes up short of the total. For example, [42] estimate that the sum of stars and cold gas is only ∼ 5% of Ω_{ b }. While there now seems to be a good chance that many of the missing baryons are in the form of highly ionized gas in the warm-hot intergalactic medium (WHIM), we are still far from being able to give a confident account of where all the baryons reside. Indeed, there could be multiple distinct reservoirs in addition to the WHIM, each comparable to the mass in stars, within the current uncertainties. But there is another missing baryons challenge, namely the halo-by-halo missing baryons. Indeed, each CDM halo can, to a first approximation, be thought of as a microcosm of the whole. As such, one would naively expect each halo to have the same baryon fraction as the whole Universe, f_{ b } = Ω_{ b }/Ω_{ m } = 0.17. On the scale of clusters of galaxies, this is approximately true (but still systematically low), but for individual galaxies, observations depart from this in a systematic way which we have yet to understand, and which has nothing to do with the truncation radius. The ratio of the galaxy-detected baryon fraction over the cosmological one, f_{ d }, is plotted as a function of the potential well of the systems in Figure 2 [284]. There is a clear correlation, less massive objects being much more dark-matter dominated than massive ones. This correlation is a priori not predicted at all by ΛCDM, at least not with the correct shape [273]. This missing baryons challenge is actually closely related to the baryonic Tully-Fisher relation, which we expand on in Section 4.3.1.
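Returning briefly to the cusp-core challenge (item 6 of the list above), the quoted behavior of the Einasto slope can be illustrated numerically. The sketch below assumes the standard normalization of the Einasto profile, in which the logarithmic slope is d ln ρ/d ln r = −2 (r/r_{−2})^{1/n}, with r_{−2} the radius where the slope equals −2. The function name and the value of r_{−2} are illustrative assumptions, not taken from the text.

```python
def einasto_log_slope(r, r_minus2, n):
    """Logarithmic density slope d ln(rho) / d ln(r) of an Einasto profile.

    r and r_minus2 must share the same length unit; n is the Einasto index
    in the convention of the text (slope proportional to -r**(1/n)).
    """
    return -2.0 * (r / r_minus2) ** (1.0 / n)


# ASSUMED scale radius: chosen so that the slope is -1 at 200 pc for n = 6,
# i.e. r_minus2 = 64 * 0.2 kpc = 12.8 kpc (hypothetical, for illustration).
R_MINUS2_KPC = 12.8
N_MILKY_WAY = 6.0

for r_kpc in (0.001, 0.02, 0.2, 2.0, 12.8):
    slope = einasto_log_slope(r_kpc, R_MINUS2_KPC, N_MILKY_WAY)
    print(f"r = {r_kpc:7.3f} kpc  ->  d ln rho / d ln r = {slope:.2f}")
```

Because the slope scales only as r^{1/6} for n = 6, it flattens very slowly toward the center, which is why such a profile remains cusp-like on observable scales. Smaller fitted values of n flatten faster, corresponding to larger cores.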
However, let us note that, while challenges 1 to 3 are not real smoking guns yet for the ΛCDM model, challenges 4 to 10 are concerned with processes happening on kpc scales, for which it is fair to consider that the model is not very predictive because the baryon physics should play a more important role, and this is hard to take into account rigorously. However, it is not sufficient to qualitatively invoke hand-wavy baryon physics to avoid confronting predictions of ΛCDM with observations. It is also mandatory to show that the feedback from the baryons, which is needed to solve the observational problems, is what would quantitatively happen in a physical galaxy. This, presently, is not yet the case for the aforementioned challenges. However, these challenges are “model-dependent problems”, in the sense of being failed predictions of a given model, but would not have appeared a priori surprising without the standard concordance model at hand. This means that subtly changing some parameters of the model (like, e.g., swapping CDM for WDM, making DM more self-interacting, etc.) might help solve at least a few of them. But what is even more challenging is a set of observations that appear surprising independently of any specific dark matter model, as they involve a fine-tuned relation between the distribution of visible and dark matter. These are what we call hereafter “unpredicted observations”.
Unpredicted observations
There are several important examples of systematic relations between the dynamics of galaxies (in theory presumed to be dominated by dark matter) and their baryonic content. These relations are fully empirical, and as such must be explained by any viable theory. As we shall see, they inevitably involve a critical acceleration scale, or equivalently, a critical surface density of baryonic matter.
Baryonic Tully-Fisher relation
One of the strongest correlations in extragalactic astronomy is the Tully-Fisher relation [467]. Originally identified as an empirical relation between a galaxy’s luminosity and its HI linewidth, it has been widely employed as a distance indicator. Though extensively studied for decades, the physical basis of the relation remains unclear.
Luminosity and linewidth are readily accessible observational quantities. The optical luminosity of a galaxy is a proxy for its stellar mass, and the HI linewidth is a proxy for its rotation velocity. The quality of the correlation improves as more accurate indicators of these quantities are employed. For example, resolved rotation curves, where the flat portion of the rotation curve V_{ f } or the maximum peak velocity V_{ p } can be measured, give relations that are tighter than those utilizing only linewidth information [108]. Similarly, the scatter declines as we shift from optical luminosities to those in the near-infrared [475] as the latter are expected to give a more reliable mapping of starlight to stellar mass [42].
It was then realized [322, 157, 283] that a more fundamental relation was that between the total observed baryonic mass and the rotation velocity. In most bright galaxies, the stars harbor the majority of the detected baryonic mass, so luminosity suffices as a proxy for mass. The next-most-important known reservoir of baryons is the neutral atomic hydrogen (HI) of the interstellar medium. As studies have probed down the mass spectrum to lower mass, more slowly rotating systems, a higher preponderance of gas rich galaxies is found. The luminous Tully-Fisher relation breaks down [283, 272], but a tight relation persists if instead of luminosity, the detected baryonic mass M_{ b } = M_{*} + M_{ g } is used [283, 475, 42, 272, 353, 31, 445, 462, 276]. This is the Baryonic Tully-Fisher Relation (BTFR), plotted on Figure 3.
The luminous Tully-Fisher relation extends over about two decades in luminosity. Recent work extending the relation to low mass, typically LSB and gas rich galaxies [31, 445, 462] extends the dynamic range of the BTFR to five decades in baryonic mass. Over this range, the BTFR has remarkably little intrinsic scatter (consistent with zero given the observational errors) and is well described as a power law, or equivalently, as a straight line in log-log space:
\(\log {M_b} = \alpha \log {V_f} - \log \beta\) (2)
with slope α = 4 [272, 445, 276]. This slope is consistent with a constant acceleration scale \(a = V_f^4/(G{M_b})\) such that^{Footnote 10} the normalization constant β = Ga.
The acceleration scale a ≈ 10^{−10} m s^{−2} ∼ Λ^{1/2} (Eq. 1) is thus present in the data. Figure 4 shows the distribution of this acceleration \(V_f^4/{M_b}\), around the best fit line in Figure 3, strongly peaked around ∼ 2 × 10^{−62} in natural units. As we shall see, this acceleration scale arises empirically in a variety of distinct situations involving the mass discrepancy problem.
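To give a feel for the numbers, the relation \(M_b = V_f^4/(G a)\) can be evaluated directly. The following is a minimal sketch, assuming a = 1.2 × 10^{−10} m s^{−2} (a commonly quoted value, consistent with the order of magnitude given above but not fixed by this section). The function name is hypothetical.

```python
G = 6.6743e-11          # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30      # solar mass, kg
A0 = 1.2e-10            # ASSUMED acceleration scale, m/s^2


def btfr_baryonic_mass(v_flat_km_s, a0=A0):
    """Baryonic mass (solar masses) for a flat rotation velocity in km/s,
    using M_b = V_f**4 / (G * a0), i.e. slope alpha = 4 and beta = G * a0."""
    v = v_flat_km_s * 1.0e3              # convert km/s to m/s
    return v**4 / (G * a0) / M_SUN


for v in (50.0, 100.0, 200.0):
    print(f"V_f = {v:5.0f} km/s  ->  M_b ~ {btfr_baryonic_mass(v):.2e} M_sun")
```

With these assumptions, a galaxy with V_{ f } = 200 km/s corresponds to roughly 10^{11} solar masses of baryons, and doubling V_{ f } raises the mass by a factor of 16, as expected for a slope of 4.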
A BTFR of the observed form does not arise naturally in ΛCDM. The naive expectation is \(\alpha = 3\) and \(\beta = 10f_V^3G{H_0}\) [446]^{Footnote 11} where H_{0} is the Hubble constant and f_{ V } is a factor of order unity (currently estimated to be ≈ 1.3 [361]) that relates the observed V_{ f } to the circular velocity of the potential at the virial radius^{Footnote 12}. This modest fudge factor is necessary because ΛCDM does not explicitly predict either axis of the observed BTFR. Rather, there is a relationship between total (baryonic plus dark) mass and rotation velocity at very large radii. This simple scaling fails (dashed line in Figure 3), obliging us to introduce an additional fudge factor f_{ d } [273, 284] that relates the detected baryonic mass to the total mass of baryons available in a halo. This mismatch drives the variation in the detected baryon fraction f_{ d } seen in Figure 2. A constant f_{ d } is excluded by the difference between the observed and predicted slopes; f_{ d } must vary with V_{ f }, or M, or the gravitational potential Φ.
This brings us to the first fine-tuning problem posed by the data. There is essentially zero intrinsic scatter in the BTFR [276], while the detected baryon fraction f_{ d } could, in principle, obtain any value between zero and unity. Somehow galaxies must “know” what the circular velocity of the halo they reside in is so that they can make observable the correct fraction of baryons.
Quantitatively, in the ΛCDM picture, the baryonic mass plotted in the BTFR (Figure 3) is M_{ b } = M_{*} + M_{ g } while the total baryonic mass available in a halo is f_{ b }M_{tot}. The difference between these quantities implies a reservoir of dark baryons in some undetected form, M_{other}. It is commonly speculated that the undetected baryons could be in a hard-to-detect hot, diffuse, ionized phase mixed in with the dark matter halo (and extending to comparable radius), or that the missing baryons have been entirely blown away by winds from supernovae. For the purposes of this argument, it does not matter which form the dark baryons take. All that matters is that a substantial mass of them are required so that [283]
\({f_d} = \frac{{M_b}}{{f_b}{M_{\rm tot}}} = \frac{{M_*} + {M_g}}{{M_*} + {M_g} + {M_{\rm other}}}\) (3)
Since there is negligible intrinsic scatter in the observed BTFR, there must be effectively zero scatter in f_{ d }. By inspection of Eq. 3, it is apparent that small scatter in f_{ d } can only be obtained naturally in the limits M_{*} + M_{ g } ≫ M_{other} so that f_{ d } → 1 or M_{*} + M_{ g } ≪ M_{other} so that f_{ d } → 0. Neither of these limits applies. We require not only an appreciable mass in dark baryons M_{other}, but we need the fractional mass of these missing baryons to vary in lockstep with the observed rotation velocity V_{ f }. Put another way, for any given galaxy, we know not only how many baryons we see, but also how many we do not see: a remarkable feat of non-observation.
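The required lockstep can be made explicit with a rough sketch. It takes the detected baryon fraction to be f_{ d } = M_{ b }/(f_{ b }M_{tot}) and assumes, for illustration only, that (i) the observed baryonic mass follows M_{ b } = V_{ f }^4/(G a_{0}) with a_{0} = 1.2 × 10^{−10} m s^{−2}, (ii) the halo mass follows the naive scaling M_{tot} = V_{ f }^3/(10 f_{ V }^3 G H_{0}) with f_{ V } = 1.3, and (iii) H_{0} = 70 km/s/Mpc. Under these assumptions G cancels and f_{ d } is simply proportional to V_{ f }.

```python
MPC = 3.0857e22             # metres per megaparsec
H0 = 70.0e3 / MPC           # ASSUMED Hubble constant, s^-1 (70 km/s/Mpc)
A0 = 1.2e-10                # ASSUMED acceleration scale, m/s^2
F_B = 0.17                  # cosmic baryon fraction quoted in the text
F_V = 1.3                   # velocity factor quoted in the text


def detected_baryon_fraction(v_flat_km_s):
    """f_d = M_b / (f_b * M_tot) with M_b = V^4 / (G a0) and
    M_tot = V^3 / (10 f_V^3 G H0); G cancels, leaving f_d linear in V."""
    v = v_flat_km_s * 1.0e3                 # convert km/s to m/s
    return 10.0 * F_V**3 * H0 * v / (F_B * A0)


for v in (25.0, 50.0, 100.0, 200.0):
    print(f"V_f = {v:5.0f} km/s  ->  f_d ~ {detected_baryon_fraction(v):.2f}")
```

In this toy calculation the detected fraction falls steadily toward low rotation velocities, which is the sense of the trend described for Figure 2. The point of the sketch is only that any such model must reproduce a one-to-one mapping between f_{ d } and V_{ f } with essentially no scatter.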
Another remarkable fact about the BTFR is that it shows no residuals with variations in the distribution of baryons [517, 443, 109, 271]. Figure 5 shows deviations from the BTFR as a function of the characteristic baryonic surface density of the galaxies, as defined in [271], i.e., \({\Sigma _b} = 0.75{M_b}/R_p^2\) where R_{ p } is the radius at which the rotation curve V_{ b }(r) of baryons peaks. Over several decades in surface density, the BTFR is completely insensitive to variations in the mass distribution of the baryons. This is odd because, a priori, V^{2} ∼ M/R, and thus V^{4} ∼ MΣ. Yet the BTFR is \({M_b} \sim V_f^4\) with no dependence on Σ. This brings us to a second fine-tuning problem. For some time, it was thought [156] that spiral galaxies all had very nearly the same surface brightness (a condition formerly known as “Freeman’s Law”). If this is indeed the case, the observed BTFR naturally follows from the constancy of Σ. However, there do exist many LSB galaxies [264] that violate the constancy of surface brightness implied in Freeman’s Law. Thus, one would expect them to deviate systematically from the Tully-Fisher relation, with lower surface brightness galaxies having lower rotation velocities at a given mass. Yet they do not. Thus, one must fine-tune the mass surface density of the dark matter to precisely make up for that of the baryons [279]. As the surface density of baryons declines, that of the dark matter must increase just so as to fill in the difference (Figure 6 [271]). The relevant quantity is the dynamical surface density enclosed within the radius, where the velocity is measured. The latter matters little along the flat portion of the rotation curve, but the former is the sum of dark and baryonic matter.
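The step from V^{2} ∼ M/R to V^{4} ∼ MΣ used above can be spelled out as a short worked derivation. Numerical factors of order unity are dropped, and Σ is taken to be the characteristic surface density M/R^{2}.

```latex
\begin{align}
  V^2 &\sim \frac{G M}{R}, \qquad \Sigma \sim \frac{M}{R^2}, \\
  V^4 &\sim \frac{G^2 M^2}{R^2} = G^2 M \left(\frac{M}{R^2}\right) \sim G^2 M \Sigma, \\
  M   &\sim \frac{V^4}{G^2 \Sigma}.
\end{align}
```

At fixed mass, a purely Newtonian estimate therefore predicts V^{4} ∝ Σ, so a lower surface density galaxy should rotate more slowly. A relation of the form \({M_b} \sim V_f^4\) with no residual Σ dependence requires either a constant Σ or a compensating contribution from the dark matter.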
One might be able to avoid fine-tuning if all galaxies are dark-matter dominated [109]. In the limit Σ_{ dm } ≫ Σ_{ b }, the dynamics are entirely dark-matter dominated and the distribution of the baryons is irrelevant. There is some systematic uncertainty in the mass-to-light ratios of stellar populations [42], making such an approach a priori tenable. In effect, we return to the interpretation of Σ ∼ constant originally made by [3] in the context of Freeman’s Law, but now we invoke a constant surface density of CDM rather than of baryons. But as we will see, such an interpretation, i.e., that Σ_{ b } ≪ Σ_{ dm } in all disk galaxies, is flatly contradicted by other observations (e.g., Figure 9 and Figure 13).
The Tully-Fisher relation is remarkably persistent. Originally posited for bright spirals, it applies to galaxies that one would naively expect to deviate from it. This includes low-luminosity, gas-dominated irregular galaxies [445, 462, 276], LSB galaxies of all luminosities [517, 443], and even tidal dwarfs formed in the collision of larger galaxies [165]. Such tidal dwarfs may be especially important in this context (see also Section 6.5.4). Galactic collisions should be very effective at segregating dark and baryonic matter. The rotating gas disks of galaxies that provide the fodder for tidal tails and the tidal dwarfs that form within them initially have nearly circular, coplanar orbits. In contrast, the dark-matter particles are on predominantly radial orbits in a quasi-spherical distribution. This difference in phase space leads to tidal tails that themselves contain very little dark matter [72]. When tidal dwarfs form from tidal debris, they should be largely devoid^{Footnote 13} of dark matter. Nevertheless, tidal dwarfs do appear to contain dark matter [72] and obey the BTFR [165].
The critical acceleration scale of Eq. 1 also appears in non-rotating galaxies. Elliptical galaxies are three-dimensional stellar systems supported more by random motions than organized rotation. First of all, in such systems of measured velocity dispersion σ, the typical acceleration σ^{2}/R is also on the order of a_{0} within a factor of a few, where R is the effective radius of the system [401]. Moreover, they obey an analogous relation to the Tully-Fisher one, known as the Faber-Jackson relation (Figure 7). In bulk, the data for these star-dominated galaxies follow the relation σ^{4}/(GM_{*}) ∝ a_{0} (dotted line in Figure 7). This is not strictly analogous to the flat part of the rotation curves of spiral galaxies, the dispersion typically being measured at smaller radii, where the equivalent circular velocity curve is often falling [367, 323], or in a temporary plateau before falling again (see also Section 6.6.1). Indeed, unlike the case in spiral galaxies, where the distribution of stars is irrelevant, it clearly does matter in elliptical galaxies (the Faber-Jackson relation is just one projection of the “fundamental plane” of elliptical galaxies [85]). This is comforting: at small radii in dense stellar systems where the baryonic mass of stars is clearly important, the data behave as Newton predicts.
The acceleration scale a_{0} is clearly imprinted on the data for local galaxies. This is an empirical statement that might not hold at all times, perhaps evolving over cosmic time or evaporating altogether. Substantial efforts have been made to investigate the Tully-Fisher relation to high redshift. To date, there is no persuasive evidence of evolution in the zero point of the BTFR out to z = 0.6 [356, 357] and perhaps even to z = 1 [485]. One must exercise caution in interpreting such results given the difficulty inherent in peering many Gyr back in cosmic time. Nonetheless, it appears that the scale a_{0} remains present in the data and has not obviously changed over the more recent half of the age of the Universe.
The role of surface density
The Freeman limit [156] is the maximum central surface brightness in the distribution of galaxy surface brightnesses. Originally thought to be a universal surface brightness, it has since become clear that instead galaxies exist over a wide range in surface brightness [264]. In the absence of a perverse and fine-tuned anti-correlation between surface brightness and stellar mass-to-light ratio [517], this implies a comparable range in baryonic surface density (Figure 8).
An upper limit to the surface brightness distribution is interesting in the context of disk stability. Recall that dynamically cold, purely Newtonian disks are subject to potentially self-destructive instabilities, one cure being to embed them in the potential wells of spherical dark-matter halos [343]. While the proper criterion for stability is much debated [131, 415], it is clear that the dark matter halo moderates the growth of instabilities and that the ratio of halo to disk self-gravity is a relevant quantity. The more self-gravitating a disk is, the more likely it is to suffer undamped growth of instabilities. But, in principle, galaxies with a baryonic disk and a dark matter halo are totally scalable: if a galaxy model has a certain dynamics, and one multiplies all densities by any (positive) constant (and also scales the velocities appropriately) one gets another galaxy with exactly the same dynamics (with scaled time scales). So if one is stable, so is the other. In turn, the mere fact that there might be an upper limit to Σ_{ b } is a priori surprising, and even more so that there might be a coincidence of this upper limit with the acceleration scale a_{0} identified dynamically.
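The scalability argument can be written out explicitly for Newtonian gravity. In the sketch below, λ is an arbitrary positive constant introduced here for illustration, lengths are held fixed, and primes denote the rescaled model.

```latex
\begin{align}
  \rho' &= \lambda \rho
    \;\Rightarrow\; \nabla^2 \Phi' = 4\pi G \rho' = \lambda \nabla^2 \Phi
    \;\Rightarrow\; \Phi' = \lambda \Phi, \\
  t' &= \lambda^{-1/2}\, t, \qquad v' = \lambda^{1/2}\, v, \\
  \frac{d^2 \mathbf{x}}{dt'^2} &= \lambda \frac{d^2 \mathbf{x}}{dt^2}
    = -\lambda \nabla \Phi = -\nabla \Phi'.
\end{align}
```

Every orbit of the original model is thus an orbit of the rescaled one, traversed faster by a factor λ^{1/2}, while the surface density scales as Σ' = λΣ. Since Poisson's equation is linear in the density, nothing in the Newtonian equations singles out a preferred value of Σ, which is why an observed upper limit is a priori unexpected.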
The scale Σ_{†} = a_{0}/G is clearly present in the data (Figure 8). Selection effects make high-surface-brightness (HSB) galaxies easy to detect and hence discover, but their intrinsic numbers appear to decline exponentially when the central surface density of the stellar disk Σ_{0} > Σ_{†} [264]. It seems natural to associate the dynamical scale a_{0} with the disk stability scale since they are numerically indistinguishable and both arise in the context of the mass discrepancy. However, there is no reason to expect this in ΛCDM, which predicts denser dark matter halos than observed [280, 169, 167, 241, 243, 478, 118]. Such dense dark matter halos could stabilize much higher density disks than are observed to exist. Lacking a clear mechanism to specify this scale, it is introduced into models by hand [115].
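For orientation, the characteristic surface density Σ_{†} = a_{0}/G can be converted to the units customarily used for galaxy disks. This is a minimal sketch, assuming a_{0} = 1.2 × 10^{−10} m s^{−2}. The constant names are illustrative.

```python
G = 6.6743e-11          # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30      # solar mass, kg
PC = 3.0857e16          # parsec, m
A0 = 1.2e-10            # ASSUMED acceleration scale, m/s^2

# Characteristic surface density in SI units (kg per square metre)
sigma_dagger_si = A0 / G

# Same quantity in solar masses per square parsec
sigma_dagger_msun_pc2 = sigma_dagger_si * PC**2 / M_SUN

print(f"Sigma_dagger = {sigma_dagger_si:.2f} kg/m^2")
print(f"Sigma_dagger = {sigma_dagger_msun_pc2:.0f} M_sun/pc^2")
```

With this assumed a_{0}, the scale comes out at a little under 2 kg m^{−2}, or several hundred solar masses per square parsec.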
Poisson’s equation provides a direct relation between the force per unit mass (centripetal acceleration in the case of circular orbits in disk galaxies), the gradient of the potential, and the surface density of gravitating mass. If there is no dark matter, the observed surface density of baryons must correlate perfectly with the dynamical acceleration. If, on the other hand, dark matter dominates the dynamics of a system, as we might infer from Figure 5 [279, 109], then there is no reason to expect a correlation between acceleration and the dynamically-insignificant baryons. Figure 9 shows the dynamical acceleration as a function of baryonic surface density in disk galaxies. The acceleration \({a_p} = V_p^2/{R_p}\) is measured at the radius R_{ p }, where the rotation curve V_{ b }(r) of baryons peaks. Given the systematic variation of rotation curve shape [376, 495], the specific choice of radii is unimportant. Nevertheless, this radius is advocated by [109] since this maximizes the possibility of perceiving the baryonic contribution in the plot of Figure 5. That this contribution is not present leads to the inference that Σ_{ b } ≪ Σ_{ dm } in all disk galaxies [109]. This is directly contradicted by Figure 9, which shows a clear correlation between a_{ p } and Σ_{ b }. The higher the surface density of baryons, the higher the observed acceleration. The slope of the relation is not unity, a_{ p } ∝ Σ_{ b }, as we would expect in the absence of a mass discrepancy, but rather \({a_p} \propto \Sigma _b^{1/2}\). To simultaneously explain Figure 5 and Figure 9, there must be a strong fine-tuning between dark and baryonic surface densities (i.e., Figure 6), a sort of repulsion between them, a repulsion which is however contradicted by the correlations between baryonic and dark matter bumps and wiggles in rotation curves (see Section 4.3.4).
Mass discrepancy-acceleration relation
So far we have discussed total quantities. For the BTFR, we use the total observed mass of a galaxy and its characteristic rotation velocity. Similarly, the dynamical acceleration-baryonic surface density relation uses a single characteristic value for each galaxy. These are not the only ways in which the “magical” acceleration constant a_{0} appears in the data. In general, the mass discrepancy only appears at very low accelerations a < a_{0} and not (much) above a_{0}. Equivalently, the need for dark matter only becomes clear at very low baryonic surface densities Σ < Σ_{†} = a_{0}/G. Indeed, the amplitude of the mass discrepancy in galaxies anti-correlates with acceleration [270].
The role of various possible scales, as well as the effects of different stellar mass-to-light ratio estimators, on the mass discrepancy problem was examined in [270]. The amplitude of the mass discrepancy, as measured by (V/V_{ b })^{2}, the squared ratio of the observed velocity to that predicted by the observed baryons, depends on the choice of estimator for stellar M_{*}/L. However, for any plausible (nonzero) M_{*}/L, the amplitude of the mass discrepancy correlates with acceleration (Figure 10) and baryonic surface density, as originally noted in [382, 266, 406]. It does not correlate with radius and only weakly with orbital frequency^{Footnote 14}.
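The two quantities being compared here are simple to compute at any point of a rotation curve. The sketch below is illustrative only: the function names are hypothetical, and the sample values (an observed velocity of 100 km/s, a baryonic prediction of 50 km/s, a radius of 10 kpc) are made up for the example rather than taken from any data set.

```python
KPC = 3.0857e19         # kiloparsec, m


def mass_discrepancy(v_obs_km_s, v_bar_km_s):
    """Mass discrepancy (V / V_b)**2: squared ratio of the observed rotation
    velocity to the velocity predicted from the observed baryons alone."""
    return (v_obs_km_s / v_bar_km_s) ** 2


def centripetal_acceleration(v_km_s, r_kpc):
    """Centripetal acceleration V**2 / r in m/s^2 for a circular orbit."""
    v = v_km_s * 1.0e3                   # convert km/s to m/s
    return v**2 / (r_kpc * KPC)


# HYPOTHETICAL sample point, for illustration only
discrepancy = mass_discrepancy(100.0, 50.0)
acceleration = centripetal_acceleration(100.0, 10.0)

print(f"mass discrepancy = {discrepancy:.1f}")
print(f"acceleration     = {acceleration:.2e} m/s^2")
```

Plotting the first quantity against the second for every measured point of many rotation curves is what yields a diagram of the kind described for Figure 10.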
There is no reason in the dark matter picture why the mass discrepancy should correlate with any physical scale. Some systems might happen to contain lots of dark matter; others very little. In order to make a prediction with a dark matter model, it is necessary to model the formation of the dark matter halo, the condensation of gas within it, the formation of stars therefrom, and any feedback processes whereby the formation of some stars either enables or suppresses the formation of further stars. This complicated sequence of events is challenging to model. Baryonic “gastrophysics” is particularly difficult, and has thus far precluded the emergence of a clear prediction for galaxy dynamics from ΛCDM.
ΛCDM does make a prediction for the distribution of mass in baryonless dark matter halos: the NFW halo [332, 333]. These are remarkable for being scale free. Small halos have a profile similar to large halos. No feature stands out that marks a unique physical scale as observed. Galaxies do not resemble pure NFW halos [416], even when dark matter dominates the dynamics as in LSB galaxies [241, 243, 118]. The inference in ΛCDM is that gastrophysics, especially the energetic feedback from stellar winds and supernova explosions, plays a critical role in sculpting observed galaxies. This role is not restricted to the minority baryonic constituents; it must also affect the majority dark matter [176]. Simulations incorporating these effects in a quasi-realistic way are extremely expensive computationally, so a comprehensive survey of the plausible parameter space occupied by such models has yet to be made. We have no reason to expect that a particular physical scale will generically emerge as the result of baryonic gastrophysics. Indeed, feedback from star formation is inherently a random process. While it is certainly possible for simple laws to emerge from complicated physics (e.g., the fact that SNIa are standard candles despite the complicated physics involved), the more common situation is for chaos to beget chaos. Therefore, it seems unnatural to imagine feedback processes leading to the orderly behavior that is observed (Figure 10); nor is it obvious how they would implicate any particular physical scale. Indeed, the dark matter halos formed in ΛCDM simulations [332, 333] provide an initial condition with greater scatter than the final observed one [280, 478], so we must imagine that the chaotic processes of feedback not only impart order, but do so in a way that cancels out some of the scatter in the initial conditions.
In any case, and whatever the reason for it, a physical scale is clearly observationally present in the data: a_{0} (Eq. 1). At high accelerations a ≫ a_{0}, there is no indication of the need for dark matter. Below this acceleration, the mass discrepancy appears. It cannot be emphasized enough that the role played by a_{0} in the BTFR and its role as a transition acceleration have strictly no intrinsic link with each other: they are fully independent. There is nothing in ΛCDM that stipulates that these two relations (the existence of a transition acceleration and the BTFR) should exist at all, and even less that these should harbour an identical acceleration scale.
Thus, it is important to realize not only that the relevant dynamical scale is one of acceleration, not size, but also that the mass discrepancy appears only at extremely low accelerations. Just as galaxies are much bigger than the Solar system, so too are the centripetal accelerations experienced by stars orbiting within a galaxy much smaller than those experienced by planets in the Solar system. Many of the precise tests of gravity that have been made in the Solar system do not explore the relevant regime of physical parameter space. This is emphasized in Figure 11, which extends the mass discrepancy-acceleration relation to Solar system scales. Many decades in acceleration separate the Solar system from galaxies. Aside from the possible exception of the Pioneer anomaly, there is no hint of a discrepancy in the Solar system: V = V_{ b }. Even the Pioneer anomaly^{Footnote 15} is well removed from the regime where the mass discrepancy manifests in galaxies, and is itself much too subtle to be perceptible in Figure 11. Indeed, to within a factor of ∼ 2, no system exhibits a mass discrepancy at accelerations a ≫ a_{0}.
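To make the separation in acceleration concrete, the following minimal sketch compares the centripetal acceleration V^{2}/R of the Earth around the Sun with that of a star on a Sun-like orbit in the Milky Way. The orbital values (30 km/s at 1 AU; 220 km/s at 8 kpc) and the adopted a_{0} = 1.2 × 10^{−10} m s^{−2} are round illustrative numbers assumed here, not values taken from the text above.

```python
import math

A0 = 1.2e-10      # m/s^2, assumed value of a_0
AU = 1.496e11     # m, astronomical unit
KPC = 3.0857e19   # m, kiloparsec


def centripetal_acceleration(v_kms, r_m):
    """Return V^2/R in m/s^2 for a circular orbit of speed v_kms (km/s)."""
    v = v_kms * 1.0e3
    return v * v / r_m


# Illustrative (assumed) orbits: Earth around the Sun, a star at 8 kpc.
a_earth = centripetal_acceleration(30.0, AU)
a_star = centripetal_acceleration(220.0, 8.0 * KPC)

# Number of decades separating the two regimes.
decades = math.log10(a_earth / a_star)

print(f"Earth: {a_earth:.2e} m/s^2 = {a_earth / A0:.1e} a_0")
print(f"Star:  {a_star:.2e} m/s^2 = {a_star / A0:.1f} a_0")
print(f"Separation: {decades:.1f} decades")
```

With these assumed inputs the planetary acceleration lies roughly seven decades above a_{0}, while the galactic one is of order a_{0} itself.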
The systematic increase in the amplitude of the mass discrepancy with decreasing acceleration and baryonic surface density has a remarkable implication. Even though the observed velocity is not correctly predicted by the observed baryons, it is predictable from them. Independent of any theory, we can simply fit a function D(GΣ) to describe the variation of the discrepancy (V/V_{ b })^{2} with baryonic surface density [270]. We can then apply it to any new system we encounter to predict V = D^{1/2}V_{ b }. In effect, D boosts the velocity already predicted by the observed baryons. While this is a purely empirical exercise with no underlying theory, it is quite remarkable that the distribution of dark matter required in a galaxy is entirely predictable from the distribution of its luminous mass (see also [167]). In the conventional picture, dark matter outweighs baryonic matter by a factor of five, and more in individual galaxies given the halo-by-halo missing baryon problem (Figure 2), but apparently the baryonic tail wags the dark matter dog. And it does so again through the acceleration scale a_{0}. Indeed, at very low accelerations, the mass discrepancy is precisely defined by the inverse of the square-root of the gravitational acceleration generated by the baryons in units of a_{0}. This actually asymptotically leads to the BTFR.
So, up to now, we have seen five roles of a_{0} in galaxy dynamics. (i) It defines the zero point of the Tully-Fisher relation, (ii) it appears as the characteristic acceleration at the effective radius of spheroidal systems, (iii) it defines the Freeman limit for the maximum surface density of pure disks, (iv) it appears as a transition acceleration above which no dark matter is needed, and below which it appears, and (v) it defines the amplitude of the mass discrepancy in the weak-field regime (this last point is not a fully independent role as it leads to the Tully-Fisher relation). Let us finally note that there is yet another role played by a_{0}, which is that it defines the central surface density of all dark matter halos as being on the order of a_{0}/(2πG) [129, 167, 313].
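For orientation, the two characteristic surface densities mentioned so far, Σ_{†} = a_{0}/G and a_{0}/(2πG), can be converted to the astronomer’s units of solar masses per square parsec. The sketch below is only an arithmetic check; the adopted a_{0} = 1.2 × 10^{−10} m s^{−2} and the unit conversion constants are assumptions of this example.

```python
import math

G = 6.674e-11     # m^3 kg^-1 s^-2
A0 = 1.2e-10      # m/s^2, assumed value of a_0
MSUN = 1.989e30   # kg, solar mass
PC = 3.0857e16    # m, parsec


def to_msun_per_pc2(sigma_si):
    """Convert a surface density from kg/m^2 to Msun/pc^2."""
    return sigma_si * PC ** 2 / MSUN


# Critical surface density Sigma_dagger = a_0 / G.
sigma_dagger = to_msun_per_pc2(A0 / G)

# Characteristic central surface density of halos, a_0 / (2 pi G).
sigma_halo = to_msun_per_pc2(A0 / (2.0 * math.pi * G))

print(f"a_0/G       = {sigma_dagger:.0f} Msun/pc^2")
print(f"a_0/(2 pi G) = {sigma_halo:.0f} Msun/pc^2")
```

Under these assumptions Σ_{†} comes out at several hundred solar masses per square parsec, and the halo value is smaller by the factor 2π.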
Renzo’s rule
The relation between dynamical and baryonic surface densities appears as a global scaling relation in disk galaxies (Figure 9) and as a local correspondence within each galaxy (Figure 10). When all galaxies are plotted together as in Figure 10, this connection appears as a single smooth function D(a). This does not suffice to illustrate that individual galaxies have features in their baryon distribution that are reflected in their dynamics. While the above correlations could be interpreted as a sort of repulsion between dark and baryonic matter, the following rather indicates closer-than-natural attraction.
Figure 12 shows the spiral galaxy NGC 6946. Two multicolor images of the stellar component are given. The optical bands provide a (nearly) true color picture of the galaxy, which is perceptibly redder near the center and becomes progressively more blue further out. This is typical of spiral galaxies and reflects real differences in stellar content: the stars towards the center tend to be older and more dominated by the light of red giants, while those further out are younger on average so the light has a greater fractional contribution from bright-but-short-lived main sequence stars. The near-infrared bands [209] give a more faithful map of stellar mass, and are less affected by dust obscuration. Radio synthesis imaging of the 21 cm emission from the hydrogen spin-flip transition maps the atomic gas in the interstellar medium, which typically extends to rather larger radii than the stars.
Surface density profiles of galaxies are constructed by fitting ellipses to images like those illustrated in Figure 12. The ellipses provide an axisymmetric representation of the variation of surface brightness with radius. This is shown in the top panels of Figure 13 for NGC 6946 (Figure 12) and the nearby, gas rich, LSB galaxy NGC 1560. The K-band light distribution is thought to give the most reliable mapping of observed light to stellar mass [42], and has been used to trace the run of stellar surface density in Figure 13. The sharp feature at the center is a small bulge component visible as the red central region in Figure 12. The bulge contains only 4% of the K-band light. The remainder is the stellar disk; a straight line fit to the data outside the central bulge region gives the parameters of the exponential disk approximation, Σ_{0} and R_{ d }. Similarly, the surface density of atomic gas is traced by the 21 cm emission, with a correction for the cosmic abundance of helium: the detected hydrogen represents 75% of the gas mass believed to be present, with most of the rest being helium, in accordance with BBN.
Mass models (bottom panels of Figure 13) are constructed from the surface density profiles by numerical solution of the Poisson equation [52, 472]. No approximations (like sphericity or an exponential disk) are made at this step. The disks are assumed to be thin, with radial scale length exceeding their vertical scale by 8:1, as is typical of edge-on disks [236]. Consequently, the computed rotation curves (various broken lines in Figure 13) are not smooth, but reflect the variations in the observed surface density profiles of the various components. The sum (in quadrature) leads to the total baryonic rotation curve V_{ b }(r) (the solid lines in Figure 13): this is what would be observed if no dark matter were implicated. Instead, the observed rotation (data points in Figure 13) exceeds that predicted by V_{ b }(r): this is the mass discrepancy.
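The quadrature sum described above can be sketched in a few lines. This is a minimal illustration, not the procedure of [52, 472]: the function names and sample velocities are hypothetical, and we assume that the gas curve was computed from the HI surface density alone, so that the helium correction (hydrogen being 75% of the gas mass) enters as a rescaling of the squared gas velocity, since V^{2} scales linearly with mass for a fixed spatial distribution.

```python
import math

HYDROGEN_FRACTION = 0.75  # detected hydrogen as a fraction of gas mass (from the text)


def baryonic_velocity(v_bulge, v_disk, v_hi):
    """Total baryonic circular velocity at one radius (same units as inputs).

    v_hi is assumed to be the contribution of atomic hydrogen alone; it is
    rescaled to account for helium before the components are summed in quadrature.
    """
    v_gas_sq = v_hi ** 2 / HYDROGEN_FRACTION
    return math.sqrt(v_bulge ** 2 + v_disk ** 2 + v_gas_sq)


def mass_discrepancy(v_obs, v_b):
    """Amplitude of the mass discrepancy, (V / V_b)^2."""
    return (v_obs / v_b) ** 2


# Hypothetical component velocities (km/s) at three radii: (bulge, disk, HI).
components = [(60.0, 80.0, 10.0), (30.0, 120.0, 25.0), (15.0, 100.0, 40.0)]
v_b = [baryonic_velocity(*c) for c in components]

# Hypothetical observed velocities (km/s) at the same radii.
v_obs = [105.0, 140.0, 150.0]
for vb, vo in zip(v_b, v_obs):
    print(f"V_b = {vb:6.1f} km/s, V = {vo:6.1f} km/s, D = {mass_discrepancy(vo, vb):.2f}")
```

A real mass model would obtain each component curve from the Poisson equation for the measured surface density profile; only the final combination step is shown here.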
It is often merely stated that flat rotation curves require dark matter. But there is considerably more information in rotation curve data than asymptotic flatness. For example, it is common that the rotation curve in the inner parts of HSB galaxies like NGC 6946 is well described by the baryons alone. The data are often consistent with a very low density of dark matter at small radii with baryons providing the bulk of the gravitating mass. This condition is referred to as maximum disk [471], and also runs contrary to our inferences of dark matter dominance from Figure 5 [414]. More generally, features in the baryonic rotation curve V_{ b }(r) often correspond to features in the total rotation V_{ c }(r).
Perhaps the most succinct empirical statement of the detailed connection between baryons and dynamics has been given by Renzo Sancisi, and is known as Renzo’s rule [379]: “For any feature in the luminosity profile there is a corresponding feature in the rotation curve.” Both galaxies shown in Figure 13 illustrate this statement. In the inner region of NGC 6946, the small but compact bulge component causes a sharp feature in V_{ b }(r) that declines rapidly before the rotation curve rises again, as mass from the disk begins to contribute. The up-down-up morphology predicted by the observed distribution of the baryons is observed in high resolution observations [54, 114]. A dark matter halo with a monotonically-varying density profile cannot produce such a morphology; the stellar bulge must be the dominant mass component at small radii in this galaxy.
A surprising aspect of Renzo’s rule is that it applies to LSB galaxies as well as those of high surface brightness. That the baryons should have some dynamical impact where their surface density is highest is natural, though there is no reason to demand that they become competitive with dark matter. What is distinctly unnatural is for the baryons to have a perceptible impact where dark matter must clearly dominate. NGC 1560 provides an example where they appear to do just that. The gas distribution in this galaxy shows a substantial kink in its surface density profile [28] (recently confirmed by [163]) that has a distinct impact on V_{ b }(r). This occurs at a radius where V ≫ V_{ b }, so dark matter should be dominant. A spherical dark matter halo with particles on randomly oriented, highly radial orbits cannot support the same sort of structure as seen in the gas disk, and the spherical geometry, unlike a disk geometry, would smear the effect on the local acceleration. And yet the wiggle in the baryonic rotation curve is reflected in the total, as per Renzo’s rule.^{Footnote 16}
One inference that might be made from these observations is that the dark matter is baryonic. This is unacceptable from a cosmological perspective, but it is possible to have a multiplicity of dark matter components. That is, we could have baryonic dark matter in the disks of galaxies in addition to a halo of nonbaryonic cold dark matter. It is often possible to scale up the atomic gas component to fit the total rotation [193]. That implies a component of mass that is traced by the atomic gas (presumably some other dynamically cold gas component) and that outweighs the observed hydrogen by a factor of six to ten [193]. One hypothesis for such a component is very cold molecular gas [352]. It is difficult to exclude such a possibility, though it also appears to be hard to sustain in LSB galaxies [292]. Dynamically, one might expect the extra mass to destabilize the LSB disk. One also returns to a fine-tuning between baryonic surface density and mass-to-light ratio. In order to maintain the balance observed in Figure 5, relatively more dark molecular gas will be required in LSB galaxies so as to maintain a constant surface density of gravitating mass, but given the interactions at hand, this might be at least a bit more promising than explaining it with CDM halos.
As a matter of fact, LSB galaxies play a critical role in testing many of the existing models for dark matter. This happens in part because they were appreciated as an important population of galaxies only after many relevant hypotheses were established, and thus provide good tests of their a priori expectations. Observationally, we infer that LSB disks exhibit large mass discrepancies down to small radii [119]. Conventionally, this means that dark matter completely dominates their dynamics: the surface density of baryons in these systems is never high enough to be relevant. Nevertheless, the observed distribution of baryons suffices to predict the total rotation [279, 120]. Once again, the baryonic tail wags the dark matter dog, with the observations of the minority baryonic component sufficing to predict the distribution of the dominant dark matter. Note that, conversely, nothing is “observable” about the dark matter, in present-day simulations, that predicts the distribution of baryons.
Thus, we see that there are many observations, mostly on galaxy scales, that are unpredicted, and perhaps unpredictable, in the standard dark matter context. They mostly involve a unique relationship between the distribution of baryons and the gravitational field, as well as an acceleration constant a_{0} on the order of the square-root of the cosmological constant, and they represent the most significant challenges to the current ΛCDM model.
Milgrom’s Empirical Law and “Kepler Laws” of Galactic Dynamics
Up to this point in this review, the challenges that we have presented have been purely based on observations, and fully independent of any alternative theoretical framework. However, at this point, it would obviously be a step forward if at least some of these puzzling observations could be summarized and empirically unified in some way, as such a unifying process is largely what physics is concerned with, rather than simply exposing a jigsaw of apparently unrelated empirical observations. And such an empirical unification is actually feasible for many of the unpredicted observations presented in the previous Section 4.3, and goes back to a rather old idea of the Israeli physicist Mordehai Milgrom.
Almost 30 years ago, back in 1983 (and thus before most of the aforementioned observations had been carried out), simply prompted by the question of whether the missing mass problem could perhaps reflect a breakdown of Newtonian dynamics in galaxies, Milgrom [293] devised a formula linking the Newtonian gravitational acceleration g_{ N } to the true gravitational acceleration g in galaxies. Such attempts to rectify the mass discrepancy by gravitational means often begin by noting that galaxies are much larger than the solar system. It is easy to imagine that at some suitably large scale, let’s say on the order of 1 kpc, there is a transition from the usual dynamics applicable in the comparatively-tiny solar system to some more general theory that applies on the scale of galaxies in order to explain the mass discrepancy problem. If so, we would expect the mass discrepancy to manifest itself at a particular length scale in all systems. However, as already noted, there is no universal length scale apparent in the data (Figure 10) [382, 266, 406, 279, 270]. The mass discrepancy appears already at small radii in some galaxies; in others there is no apparent need for dark matter until very large radii. This now observationally excludes all hypotheses that simply alter the force law at a linear length-scale.
Milgrom’s law and the dielectric analogy
Before such precise data were available, Milgrom [293] already noted that other scales were also possible, and that one that is as unique to galaxies as size is acceleration. The typical centripetal acceleration of a star in a galaxy is of order ∼ 10^{−10} m s^{−2}. This is eleven orders of magnitude less than the surface gravity of the Earth. As we have seen in Section 4, this acceleration constant appears “miraculously” in very different scaling relations that should not, in principle, be related to each other^{Footnote 17}. The universal appearance of a_{0} ≃ 10^{−10} m s^{−2} in galactic scaling relations was not at all observationally evident back in 1983. What Milgrom [293] then hypothesized was a modification of Newtonian dynamics below this acceleration constant a_{0}, appropriate to the tiny accelerations encountered in galaxies^{Footnote 18}. This new constant a_{0} would then play a similar role as the Planck constant h in quantum physics or the speed of light c in special relativity. For large acceleration (or force per unit mass), F/m = g ≫ a_{0}, everything would be normal and Newtonian, i.e., g = g_{ N }. Or, put differently, formally taking a_{0} → 0 should make the theory tend to standard physics, just like recovering classical mechanics for h → 0. On the other hand, formally taking a_{0} → ∞ (and G → 0), or equivalently, in the limit of small accelerations g ≪ a_{0}, the modification would apply in the form:
\[g = \sqrt{g_N a_0}, \tag{4}\]
where \(g = |\mathbf{g}|\) is the true gravitational acceleration, and \(g_N = |\mathbf{g}_N|\) the Newtonian one as calculated from the observed distribution of visible matter. Note that this limit follows naturally from the scale-invariance symmetry of the equations of motion under transformations (t, r) → (λt, λr) [315]. This particular modification was only suggested in 1983 by the asymptotic flatness of rotation curves and the slope of the Tully-Fisher relation. It is indeed trivial to see that the desired behavior follows from equation (4). For a test particle in circular motion around a point mass M, equilibrium between the radial component of the force and the centripetal acceleration yields, in the Newtonian regime, \(V_c^2/r = {g_N} = GM/{r^2}\). In the weak-acceleration limit this becomes
\[\frac{V_c^2}{r} = \sqrt{\frac{G M a_0}{r^2}}. \tag{5}\]
The terms involving the radius r cancel, simplifying to
\[V_c^4(r) = V_f^4 = a_0 G M. \tag{6}\]
The circular velocity no longer depends on radius, asymptoting to a constant V_{ f } that depends only on the mass of the central object and fundamental constants. The equation above is the equivalent of the observed baryonic Tully-Fisher relation. It is often wrongly stated that Milgrom’s formula was constructed in an ad hoc way in order to reproduce galaxy rotation curves, while this statement is only true of these two observations: (i) the asymptotic flatness of the rotation curves, and (ii) the slope of the baryonic Tully-Fisher relation (but note that, at the time, it was not clear at all that this slope would hold, nor that the Tully-Fisher relation would correlate with baryonic mass rather than luminosity, and even less clear that it would hold over orders of magnitude in mass). All the other successes of Milgrom’s formula related to the phenomenology of galaxy rotation curves were pure predictions of the formula made before the observational evidence. The predictions that are encapsulated in this simple formula can be thought of as sort of “Kepler-like laws” of galactic dynamics. These various laws only make sense once they are unified within their parent formula, exactly as Kepler’s laws only make sense once they are unified under Newton’s law.
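As a numerical illustration of the asymptotic velocity V_{ f } = (GMa_{0})^{1/4} for a point mass in the weak-acceleration limit, the sketch below evaluates it for a few baryonic masses and inverts it. The value a_{0} = 1.2 × 10^{−10} m s^{−2} and the sample masses are assumptions chosen for this example; the function names are hypothetical.

```python
G = 6.674e-11     # m^3 kg^-1 s^-2
A0 = 1.2e-10      # m/s^2, assumed value of a_0
MSUN = 1.989e30   # kg, solar mass


def flat_velocity_kms(m_baryon_msun):
    """Asymptotic circular velocity V_f = (G M a_0)^(1/4), in km/s."""
    return (G * m_baryon_msun * MSUN * A0) ** 0.25 / 1.0e3


def baryonic_mass_msun(v_flat_kms):
    """Inverse relation M = V_f^4 / (G a_0), in solar masses."""
    return (v_flat_kms * 1.0e3) ** 4 / (G * A0) / MSUN


# Illustrative (assumed) baryonic masses spanning four decades.
for mass in (1.0e7, 1.0e9, 1.0e11):
    print(f"M = {mass:.0e} Msun -> V_f = {flat_velocity_kms(mass):.1f} km/s")
```

Because the relation has slope exactly 4, a factor of 16 in mass corresponds to a factor of 2 in V_{ f }; with the assumed a_{0}, a baryonic mass of 10^{11} M_⊙ gives V_{ f } close to 200 km/s.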
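In order to ensure a smooth transition between the two regimes g ≫ a_{0} and g ≪ a_{0}, Milgrom’s law is written in the following way:
\[\mu\left(\frac{g}{a_0}\right)\mathbf{g} = \mathbf{g}_N, \tag{7}\]
where the interpolating function
\[\mu(x) \to 1 \;\; {\rm for} \;\; x \gg 1 \quad {\rm and} \quad \mu(x) \to x \;\; {\rm for} \;\; x \ll 1. \tag{8}\]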
Written like this, the analogy between Milgrom’s law and Coulomb’s law in a dielectric medium is clear, as noted in [56]. Indeed, inside a dielectric medium, the amplitude of the electric field E generated by an external point charge Q located at a distance r obeys the following equation:
\[\mu E = \frac{Q}{4\pi\epsilon_0 r^2}, \tag{9}\]
where μ is the relative permittivity of the medium, and can depend on E. In the case of a gravitational field generated by a point mass M, it is then clear that Milgrom’s interpolating function plays the role of “gravitational permittivity”. Since it is smaller than 1, it makes the gravitational field stronger than Newtonian (rather than smaller in the case of the electric field in a dielectric medium, where μ > 1). In other words, the gravitational susceptibility coefficient χ (such that μ = 1 + χ) is negative, which is correct for a force law where like masses attract rather than repel [56]. This dielectric analogy has been explicitly used in devising a theory [60] where Milgrom’s law arises from the existence of a “gravitationally polarizable” medium (see Section 7).
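Of course, inverting the above relation, Milgrom’s law can also be written as
\[\mathbf{g} = \nu\left(\frac{g_N}{a_0}\right)\mathbf{g}_N, \tag{10}\]
where
\[\nu(y) \to 1 \;\; {\rm for} \;\; y \gg 1 \quad {\rm and} \quad \nu(y) \to y^{-1/2} \;\; {\rm for} \;\; y \ll 1. \tag{11}\]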
However, as we shall see in Section 6, in order for g to remain a conservative force field, these expressions (Eqs. 7 and 10) cannot be rigorous outside of highly symmetrical situations. Nevertheless, they allow one to make numerous very general predictions for galactic systems, or, in other words, to derive “Kepler-like laws” of galactic dynamics, unified under the banner of Milgrom’s law. As we shall see, many of the observations unpredicted by ΛCDM on galaxy scales naturally ensue from this very simple law. However, even though Milgrom originally devised this as a modification of dynamics, this law is a priori nothing more than an algorithm, which allows one to calculate the distribution of force in an astronomical object from the observed distribution of baryonic matter. Its success would simply mean that the observed gravitational field in galaxies is mimicking a universal force law generated by the baryons alone, meaning that (i) either the force law itself is modified, or that (ii) there exists an intimate connection between the distribution of baryons and dark matter in galaxies.
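Read as an algorithm, Milgrom’s law maps the Newtonian acceleration of the baryons onto the total acceleration. The sketch below illustrates this under two loudly stated assumptions that are not part of the text above: (i) the interpolating function is taken to be μ(x) = x/(1 + x), chosen here only because it satisfies the required limits and inverts in closed form to ν(y) = [1 + (1 + 4/y)^{1/2}]/2, and (ii) the Newtonian acceleration is approximated algebraically as g_{ N } = V_{ b }^{2}/r, which, as just noted, is not rigorous outside highly symmetrical situations. The value of a_{0} and the function names are likewise illustrative.

```python
import math

A0 = 1.2e-10      # m/s^2, assumed value of a_0
KPC = 3.0857e19   # m, kiloparsec


def nu_simple(y):
    """Assumed interpolating function nu(y) = (1 + sqrt(1 + 4/y)) / 2.

    It tends to 1 for y >> 1 and to y**-0.5 for y << 1, as required.
    """
    return 0.5 * (1.0 + math.sqrt(1.0 + 4.0 / y))


def predicted_velocity_kms(v_b_kms, r_kpc):
    """Predict the total circular velocity from the baryonic one.

    Uses the algebraic approximation g_N = V_b^2 / r and g = nu(g_N / a_0) g_N.
    """
    r = r_kpc * KPC
    g_newton = (v_b_kms * 1.0e3) ** 2 / r
    g_total = nu_simple(g_newton / A0) * g_newton
    return math.sqrt(g_total * r) / 1.0e3


# Hypothetical baryonic velocities: high acceleration (inner HSB) versus
# low acceleration (outer disk or LSB galaxy).
print(predicted_velocity_kms(200.0, 0.1))   # barely boosted
print(predicted_velocity_kms(10.0, 100.0))  # strongly boosted
```

The first case stays within about one percent of the baryonic velocity, while the second is boosted several-fold, which is the qualitative behavior of the function D discussed in Section 4.3.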
It was suggested, for instance, in [218] that such a relation might arise naturally in the CDM context, if halos possess a one-parameter density profile that leads to a characteristic acceleration profile that is only weakly dependent upon the mass of the halo. Then, with a fixed collapse factor for the baryonic material, the transition from dominance of dark over baryonic occurs at a universal acceleration, which, by numerical coincidence, is on the order of cH_{0} and thus of a_{0} (see also [411]). While, still today, it remains to be seen whether this scenario would quantitatively hold in numerical simulations, it was noted by Milgrom [306] that this scenario only explained the role of a_{0} as a transition radius between baryon and dark matter dominance in HSB galaxies, precluding altogether the existence of LSB galaxies where dark matter dominates everywhere. The real challenge for ΛCDM is rather to explain all the different roles played by a_{0} in galaxy dynamics, different roles that can all be summarized within the single law proposed by Milgrom, just like Kepler’s laws are unified under Newton’s law. We list these Kepler-like laws of galactic dynamics hereafter, and relate each of them with the unpredicted observations of Section 4, keeping in mind that these were mostly a priori predictions of Milgrom’s law, made before the data were as good as today, not “postdictions” like we are used to in modern cosmology.
Galactic Kepler-like laws of motion

1. Asymptotic flatness of rotation curves. The rotation curves of galaxies are asymptotically flat, even though this flatness is not always attained at the last observed point (see the point below about the shapes of rotation curves as a function of baryonic surface density). What is more, Milgrom’s law can be thought of as including the total acceleration with respect to a preferred frame, which can lead to the prediction of asymptotically-falling rotation curves for a galaxy embedded in a large external gravitational field (see Section 6.3).

2. Ga_{ 0 } defining the zero-point of the baryonic Tully-Fisher relation. The plateau of a rotation curve is V_{ f } = (GMa_{0})^{1/4}. The true Tully-Fisher relation is predicted to be a relation between this asymptotic velocity and baryonic mass, not luminosity. Milgrom’s law yields immediately the slope (precisely 4) and zero-point of this baryonic Tully-Fisher law. The observational baryonic Tully-Fisher relation should thus be consistent with zero scatter around this prediction of Milgrom’s law (the dotted line of Figure 3). And indeed it is. All rotationally-supported systems in the weak acceleration limit should fall on this relation, irrespective of their formation mechanism and history, meaning that completely isolated galaxies or tidal dwarf galaxies formed in interaction events all behave as every other galaxy in this respect.

3. Ga_{ 0 } defining the zero-point of the Faber-Jackson relation. For quasi-isothermal systems [296], such as elliptical galaxies, the bulk velocity dispersion depends only on the total baryonic mass via σ^{4} ∼ GMa_{0}. Indeed, since the equation of hydrostatic equilibrium for an isotropic isothermal system in the weak field regime reads d(σ^{2}ρ)/dr = −ρ(GMa_{0})^{1/2}/r, one has σ^{4} = α^{−2} × GMa_{0} where α = d ln ρ/d ln r. This underlies the Faber-Jackson relation for elliptical galaxies (Figure 7), which is, however, not predicted by Milgrom’s law to be as tight and precise (because it relies, e.g., on isothermality and on the slope of the density distribution) as the BTFR.

4. Mass discrepancy defined by the inverse of the acceleration in units of a_{ 0 }. Or alternatively, defined by the inverse of the square-root of the gravitational acceleration generated by the baryons in units of a_{0}. The mass discrepancy is precisely equal to this in the very-low-acceleration regime, and leads to the baryonic Tully-Fisher relation. In the low-acceleration limit, g_{ N }/g = g/a_{0}, so in the CDM language, inside the virial radius of any system whose virial radius is in the weak acceleration regime (well below a_{0}), the baryon fraction is given by the acceleration in units of a_{0}. If we adopt a rough relation \({M_{500}} \simeq 1.5 \times {10^5}{M_ \odot} \times V_c^3\,{({\rm{km/s}})^{-3}}\), we get that the acceleration at R_{500}, and thus the system baryon fraction predicted by Milgrom’s formula, is M_{ b }/M_{500} = a_{500}/a_{0} ≃ 4 × 10^{−4} × V_{ c } (km/s)^{−1}. Divided by the cosmological baryon fraction, this explains the trend for f_{ d } = M_{ b }/(0.17 M_{500}) with potential \((\Phi = V_c^2)\) in Figure 2, thereby naturally explaining the halo-by-halo missing baryon challenge in galaxies. No baryons are actually missing; rather, we infer their existence because the natural scaling between mass and circular velocity \({M_{500}} \propto V_c^3\) in ΛCDM differs by a factor of V_{ c } from the observed scaling \({M_b} \propto V_c^4\).

5. a_{ 0 } as the characteristic acceleration at the effective radius of isothermal spheres. As a corollary to the Faber-Jackson relation for isothermal spheres, let us note that the baryonic isothermal sphere would not require any dark matter up to the point where the internal gravity falls below a_{0}, and would thus resemble a purely baryonic Newtonian isothermal sphere up to that point. But at larger distances, in the presence of the added force due to Milgrom’s law, the baryonic isothermal sphere would fall [296] as r^{−4}, thereby making the radius at which the gravitational acceleration is a_{0} the effective baryonic radius of the system, and explaining why, at this radius R in quasi-isothermal systems, the typical acceleration σ^{2}/R is almost always observed to be on the order of a_{0}. Of course, this is valid for systems where such a transition radius does exist, but going to very-LSB systems, if the internal gravity is everywhere below a_{0}, one can then have typical accelerations as low as one wishes.

6. a_{ 0 }/G as a critical mean surface density for stability. Disks with mean surface density 〈Σ〉 ≤ Σ_{†} = a_{0}/G have added stability. Most of the disk is then in the weak-acceleration regime, where accelerations scale as \(a \propto \sqrt M\), instead of a ∝ M. Thus, δa/a = (1/2)δM/M instead of δa/a = δM/M, leading to a weaker response to small mass perturbations [299]. This explains the Freeman limit (Figure 8).

7. a_{ 0 } as a transition acceleration. The mass discrepancy in galaxies always appears (transition from baryon dominance to dark matter dominance) when \(V_c^2/R \sim {a_0}\), yielding a clear mass discrepancy-acceleration relation (Figure 10). This, again, is the case for every single rotationally-supported system irrespective of its formation mechanism and history. For HSB galaxies, where there exist two distinct regions where \(V_c^2/R > {a_0}\) in the inner parts and \(V_c^2/R < {a_0}\) in the outer parts, locally measured mass-to-light ratios should show no indication of hidden mass in the inner parts, but rise beyond the radius where \(V_c^2/R \approx {a_0}\) (Figure 14). Note that this is the only role of a_{0} that the scenario of [218] attempted to address (while neglecting, e.g., the existence of LSB galaxies).

8. a_{ 0 }/G as a transition central surface density. The acceleration a_{0} defines the transition from HSB galaxies to LSB galaxies: baryons dominate in the inner parts of galaxies whose central surface density is higher than some critical value on the order of Σ_{†} = a_{0}/G, while in galaxies whose central surface density is much smaller (LSB galaxies), DM dominates everywhere, and the magnitude of the mass discrepancy is given by the inverse of the acceleration in units of a_{0}; see (5). Thus, the mass discrepancy appears at smaller radii and is more severe in galaxies of lower baryonic surface densities (Figure 14). The shapes of rotation curves are predicted to depend on surface density: HSB galaxies are predicted to have rotation curves that rise steeply, then become flat, or even fall somewhat to the not-yet-reached asymptotic flat velocity, while LSB galaxies are supposed to have rotation curves that rise slowly to the asymptotic flat velocity. This is precisely what is observed (Figure 15), and is in accordance [162] with the more complex empirical parametrization of observed rotation curves that has been proposed in [376]. Finally, the total (baryons+DM) acceleration is predicted to decline with the mean baryonic surface density of galaxies, exactly as observed (Figure 16), in the form \(a \propto \Sigma _b^{1/2}\) (see also Figure 9).

9.
a_{0}/(2πG) as the central surface density of dark halos. Provided they are mostly in the Newtonian regime, galaxies are predicted to be embedded in dark halos (whether real or virtual, i.e., “phantom” dark matter) with a central surface density on the order of a_{0}/(2πG), as observed^{Footnote 19}. LSBs should have a halo surface density scaling as the square root of the baryonic surface density, in a much more compressed range than for the HSB ones, explaining the consistency of observed data with a constant central surface density of dark matter [167, 313].

10.
Features in the baryonic distribution imply features in the rotation curve. Because a small variation in g_{ N } will be directly translated into a similar one in g, Renzo’s rule (Section 4.3.4) is explained naturally.
As a conclusion, all the apparently independent roles that the characteristic acceleration a_{0} plays in the unpredicted observations of Section 4.3 (see end of Section 4.3.3 for a summary), as well as Renzo’s rule (Section 4.3.4), have been elegantly unified by the single law proposed by Milgrom [293] in 1983 as a unique scaling relation between the gravitational field generated by observed baryons and the total observed gravitational force in galaxies.
Milgrom’s Law as a Modification of Classical Dynamics: MOND
Thus, it appears that many puzzling observations that are difficult to understand in the ΛCDM context (and/or require an extreme fine-tuning of the DM distribution) are well summarized by a single heuristic law. It would therefore appear natural that this law derives from a universal force law, and reflects a modification of dynamics rather than the addition of massive particles interacting (almost) only gravitationally with baryonic matter^{Footnote 20}. However, blindly applying Eq. 7 to a set of massive bodies directly leads to serious problems [150, 293] such as the non-conservation of momentum. In a two-body configuration, as the implied force is not symmetric in the two masses, Newton’s third law (action and reaction principle) does not hold, so the momentum is not conserved. Consider a translationally invariant isolated system of two such masses m_{1} and m_{2}, small enough to be in the very weak acceleration limit, and placed at rest on the x-axis. The amplitude of the Newtonian force is then F_{N} = Gm_{1}m_{2}/(x_{2} − x_{1})^{2}, and blindly applying Eq. 7 would lead to individual accelerations \(\vert {\bf a}_i \vert = \sqrt{F_N a_0/m_i}\), directed towards each other. This then immediately leads to (taking x_{1} < x_{2})
\[ \frac{d}{dt}\left(m_1 v_1 + m_2 v_2\right) = \sqrt{F_N a_0}\left(\sqrt{m_1} - \sqrt{m_2}\right), \tag{12} \]
meaning that for different masses, the momentum of this isolated system is not conserved. This means that Eq. 7 cannot truly represent a universal force law. If Eq. 7 is to be more than just a heuristic law summarizing how dark matter is arranged in galaxies with respect to baryonic matter, it must then be an approximation (valid only in highly symmetric configurations) of a more general force law deriving from an action and a variational principle. Such theories at the classical level can be classified under the acronym MOND, for Modified Newtonian Dynamics^{Footnote 21}. In this section, we sketch how to devise such theories at the classical level, and list detailed tests of these theories at all astrophysical scales.
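The momentum argument above can be checked numerically. The following is a minimal sketch, not taken from the source: the function names are hypothetical, SI units are assumed, and the value a_{0} ≈ 1.2 × 10^{−10} m s^{−2} is an assumed illustrative value. Each body is given the deep-MOND acceleration \(\sqrt{F_N a_0/m_i}\) directed towards the other body, and the rate of change of the total momentum is evaluated.

```python
import math

G = 6.674e-11   # Newton's constant in SI units
A0 = 1.2e-10    # assumed illustrative value of a_0 in m/s^2

def naive_accelerations(m1, m2, separation):
    """Deep-MOND accelerations obtained by blindly applying Milgrom's law."""
    f_newton = G * m1 * m2 / separation**2
    a1 = math.sqrt(f_newton * A0 / m1)
    a2 = math.sqrt(f_newton * A0 / m2)
    return a1, a2

def momentum_rate(m1, m2, separation):
    """d/dt (m1 v1 + m2 v2) along x, with body 1 to the left of body 2."""
    a1, a2 = naive_accelerations(m1, m2, separation)
    # Body 1 is pulled towards +x, body 2 towards -x.
    return m1 * a1 - m2 * a2

if __name__ == "__main__":
    print(momentum_rate(1.0, 1.0, 1.0e3))  # equal masses
    print(momentum_rate(4.0, 1.0, 1.0e3))  # unequal masses
```

For equal masses the two contributions cancel, while for unequal masses the result is \(\sqrt{F_N a_0}(\sqrt{m_1} - \sqrt{m_2})\), which does not vanish.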
Modified inertia or modified gravity: Nonrelativistic actions
If one wants to modify dynamics in order to reproduce Milgrom’s heuristic law while still benefiting from usual conservation laws such as the conservation of momentum, one can start from the action at the classical level. Clearly such theories are only toy models until they become the weak-field limit of a relativistic theory (see Section 7), but they are useful both as targets for such relativistic theories, and as internally consistent models allowing one to make predictions at the classical level (i.e., in neither the relativistic nor the quantum regime).
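A set of particles of mass m_{i} moving in a gravitational field generated by the matter density distribution ρ = ∑_{i} m_{i}δ(x − x_{i}) and described by the Newtonian potential Φ_{N} has the following action^{Footnote 22}:
\[ S_N = S_{\rm kin} + S_{\rm in} + S_{\rm grav} = \int \frac{\rho v^2}{2}\, d^3x\, dt - \int \rho\, \Phi_N\, d^3x\, dt - \int \frac{\vert\nabla\Phi_N\vert^2}{8\pi G}\, d^3x\, dt. \tag{13} \]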
Varying this action with respect to configuration space coordinates yields the equations of motion d^{2}x/dt^{2} = −∇Φ_{N}, while varying it with respect to the potential leads to the Poisson equation ∇^{2}Φ_{N} = 4πGρ. Modifying the first (kinetic) term is generally referred to as “modified inertia” and modifying the last term as “modified gravity”^{Footnote 23}.
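As a worked step, and assuming the standard form of the Newtonian action with interaction part \(S_{\rm in} = -\int \rho\,\Phi_N\, d^3x\, dt\) and gravitational part \(S_{\rm grav} = -\int \vert\nabla\Phi_N\vert^2/(8\pi G)\, d^3x\, dt\), the variation with respect to the potential can be sketched as follows (boundary terms are assumed to vanish at infinity):

```latex
\begin{aligned}
\delta\left(S_{\rm in} + S_{\rm grav}\right)
  &= -\int \left[\rho\,\delta\Phi_N
     + \frac{\nabla\Phi_N\cdot\nabla\delta\Phi_N}{4\pi G}\right] d^3x\,dt \\
  &= -\int \left[\rho - \frac{\nabla^2\Phi_N}{4\pi G}\right]
     \delta\Phi_N \, d^3x\,dt = 0
  \quad\Longrightarrow\quad \nabla^2\Phi_N = 4\pi G\rho .
\end{aligned}
```

The second line follows from an integration by parts of the gradient term. The same steps, applied to a gravitational Lagrangian that is not quadratic in the gradient of the potential, would give a non-linear generalization of the Poisson equation.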
Modified inertia
The first possibility, modified inertia, has been investigated by Milgrom [300, 321], who constructed modified kinetic actions^{Footnote 24} (the first term S_{kin} in Eq. 13) that are functionals depending on the trajectory of the particle as well as on the acceleration constant a_{0}. By construction, the gravitational potential is then still determined from the Newtonian Poisson equation, but the particle equation of motion becomes, instead of Newton’s second law,
\[ {\bf A}\left[\{{\bf x}(t)\}, a_0\right] = -\nabla\Phi_N, \tag{14} \]
where A is a functional of the whole trajectory {x(t)}, with the dimensions of acceleration. The Newtonian and MOND limits correspond to \([a_0 \rightarrow 0, {\bf A} \rightarrow d^2{\bf x}/dt^2]\) and \([a_0 \rightarrow \infty, {\bf A}[\{{\bf x}(t)\}, a_0] \rightarrow a_0^{-1}{\bf Q}(\{{\bf x}(t)\})]\), where Q has dimensions of acceleration squared.
Milgrom [300] investigated theories of this vein and rigorously showed that they always had to be time-nonlocal (see also Section 7.10) to be Galilean invariant^{Footnote 25}. Interestingly, he also showed that quantities such as energy and momentum had to be redefined, but then obeyed conservation laws: this even leads to a generalized virial relation for bound trajectories, and in turn to an important and robust prediction for circular orbits in an axisymmetric potential, shared by all such theories. Eq. 14 becomes for such trajectories:
\[ \mu\left(\frac{V_c^2}{R\, a_0}\right)\frac{V_c^2}{R} = -\frac{\partial\Phi_N}{\partial R}, \tag{15} \]
where V_{c} and R are the orbital speed and radius, and μ(x) is universal for each theory, and is derived from the expression of the action specialized to circular trajectories. Thus, for circular trajectories, these theories recover exactly the heuristic Milgrom’s law. Interestingly, it is this law that is used to fit galaxy rotation curves, while in the modified gravity framework of MOND (see hereafter), one should actually calculate the exact predictions of the modified Poisson formulations, which can differ slightly from Milgrom’s law. However, for orbits other than circular, it becomes very difficult to make predictions in modified inertia, as the time non-locality can make the anomalous acceleration at any location depend on properties of the whole orbit. For instance, if the accelerations are small on some segments of a trajectory, MOND effects can also be felt on segments where the accelerations are high, and conversely [321]. This can give rise to different effects on bound and unbound orbits, as well as on circular and highly elliptic orbits, meaning that “predictions” of modified inertia in pressure-supported systems could differ significantly from those derived from Milgrom’s law per se. Let us finally note that testing modified inertia on Earth would require one to properly define an inertial reference frame, contrary to what has been done in [5, 179] where the laboratory itself was not an inertial frame. Proper setups for testing modified inertia on Earth have been described, e.g., in [201, 202]: under the circumstances described in these papers, modified inertia would inevitably predict a departure from Newtonian dynamics, even if the exact departure cannot be predicted at present, except for circular motion.
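To illustrate how Milgrom’s law for circular orbits translates into a rotation curve, here is a minimal sketch. It is not the procedure of any cited work: it assumes a point-mass Newtonian field g_N = GM/R^2, the “simple” interpolating function μ(x) = x/(1 + x) inverted as ν(y) = [1 + (1 + 4/y)^{1/2}]/2, and an illustrative value of a_{0} in galactic units. All function and constant names are hypothetical.

```python
import math

G = 4.30091e-6   # Newton's constant in kpc (km/s)^2 / Msun
A0 = 3.7e3       # assumed a_0 in (km/s)^2 / kpc (about 1.2e-10 m/s^2)

def nu_simple(y):
    """Inverse form of the simple interpolating function: g = nu(g_N/a0) g_N."""
    return 0.5 * (1.0 + math.sqrt(1.0 + 4.0 / y))

def circular_speed(mass_msun, radius_kpc):
    """Circular speed in km/s around a point mass under Milgrom's law."""
    g_newton = G * mass_msun / radius_kpc**2
    g_total = nu_simple(g_newton / A0) * g_newton
    return math.sqrt(g_total * radius_kpc)

if __name__ == "__main__":
    for radius in (1.0, 10.0, 100.0, 1000.0):
        print(radius, circular_speed(1.0e10, radius))
```

At small radii the speed follows the Keplerian value (GM/R)^{1/2}, while at large radii it tends to the constant (GMa_{0})^{1/4}, which is the asymptotically flat rotation curve implied by the deep-MOND limit.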
Bekenstein-Milgrom MOND
The idea of modified gravity is, on the one hand, to preserve the particle equation of motion by preserving the kinetic action, but, on the other hand, to change the gravitational action, and thus modify the Poisson equation. In that case, all the usual conservation laws will be preserved by construction.
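A very general way to do so is to write [38]:
\[ S_{\rm grav\, BM} = -\int \frac{a_0^2\, F\left(\vert\nabla\Phi\vert^2/a_0^2\right)}{8\pi G}\, d^3x\, dt, \tag{16} \]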
where F can be any dimensionless function. The Lagrangian being non-quadratic in ∇Φ, this has been dubbed by Bekenstein & Milgrom [38] Aquadratic Lagrangian theory (AQUAL). Varying the action with respect to Φ then leads to a non-linear generalization of the Newtonian Poisson equation^{Footnote 26}:
\[ \nabla\cdot\left[\mu\left(\frac{\vert\nabla\Phi\vert}{a_0}\right)\nabla\Phi\right] = 4\pi G\rho, \tag{17} \]
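where μ(x) = F′(z) and z = x^{2}. In order to recover the μ-function behavior of Milgrom’s law (Eq. 7), i.e., μ(x) → 1 for x ≫ 1 and μ(x) → x for x ≪ 1, one needs to choose:
\[ F(z) \rightarrow z \;\; {\rm for} \;\; z \gg 1 \quad {\rm and} \quad F(z) \rightarrow \frac{2}{3} z^{3/2} \;\; {\rm for} \;\; z \ll 1. \tag{18} \]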
The general solution of the boundary value problem for Eq. 17 leads to the following relation between the acceleration g = −∇Φ and the Newtonian one, g_{N} = −∇Φ_{N}:
\[ \mu\left(\frac{g}{a_0}\right){\bf g} = {\bf g}_N + {\bf S}, \tag{19} \]
where g = |g|, and S is a solenoidal vector field with no net flow across any closed surface (i.e., a curl field S = ∇ × A such that ∇.S = 0). Thus, it is equivalent to Milgrom’s law (Eq. 7) up to a curl field correction, and is precisely equal to Milgrom’s law in highly symmetric one-dimensional systems, such as spherically-symmetric systems or flattened systems for which the isopotentials are locally spherically symmetric. For instance, the Kuzmin disk [52] is an example of a flattened axisymmetric configuration for which Milgrom’s law is precisely valid, as its Newtonian potential \({\Phi _N} = -GM/\sqrt {{R^2} + {{(b + \vert z\vert)}^2}}\) is equivalent on both sides of the disk to that of a point mass above or below the disk respectively.
In vacuum and at very large distances from a body of mass M, the isopotentials always tend to become spherical and the curl field tends to zero, while the gravitational acceleration falls well below a_{0} (a regime known as the “deep-MOND” regime), so that:
\[ \Phi(r) \sim \sqrt{G M a_0}\, \ln (r). \tag{20} \]
An important point, demonstrated by Bekenstein & Milgrom [38], is that a system with a low center-of-mass acceleration, with respect to a larger (more massive) system, sees the motion of its constituents combine to give a MOND motion for the center of mass even if it is made up of constituents whose internal accelerations are above a_{0} (for instance a compact globular cluster moving in an outer galaxy). The center-of-mass acceleration is independent of the internal structure of the system (if the mass of the system is small), namely the Weak Equivalence Principle is satisfied.
In a modified gravity theory, any time-independent system must still satisfy the virial theorem:
\[ 2K + W = 0, \tag{21} \]
where K = M〈v^{2}〉/2 is the total kinetic energy of the system, M = Σ_{i} m_{i} being the total mass of the system, 〈v^{2}〉 the second moment of the velocity distribution, and \(W = -\int \rho\, {\bf x}\cdot\nabla \Phi\, d^3x\) is the “virial”, proportional to the total potential energy. Milgrom [301, 302] showed that, in Bekenstein-Milgrom MOND, the virial is given by:
For a system entirely in the extremely weak field limit (the “deep-MOND” limit x = g/a_{0} ≪ 1), where μ(x) = x and F(z) = (2/3)z^{3/2}, the second term vanishes and we get \(W = -(2/3)\sqrt {G{M^3}{a_0}}\) (see [301] for the specific conditions for this to be valid). In this case, we can get an analytic expression for the two-body force under the approximation that the two bodies are very far apart compared to their internal sizes [301, 509, 511]. Since the kinetic energy K = K_{orb} + K_{int} can be separated into the orbital energy \({K_{{\rm{orb}}}} = {m_1}{m_2}\upsilon _{{\rm{rel}}}^2/(2M)\) and the internal energy of the bodies \({K_{{\rm{int}}}} = \sum (1/3)\sqrt {Gm_i^3{a_0}}\), we get from the scalar virial theorem of a stationary system:
\[ \frac{m_1 m_2}{m_1 + m_2}\,\upsilon_{\rm rel}^2 = \frac{2}{3}\sqrt{G a_0}\left[(m_1 + m_2)^{3/2} - m_1^{3/2} - m_2^{3/2}\right]. \tag{23} \]
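We can then assume an approximately circular velocity such that the two-body force (satisfying the action and reaction principle) can be written analytically in the deep-MOND limit as:
\[ F_{\rm 2body} = \frac{m_1 m_2}{m_1 + m_2}\,\frac{\upsilon_{\rm rel}^2}{r} = \frac{2}{3}\,\frac{\sqrt{G a_0}}{r}\left[(m_1 + m_2)^{3/2} - m_1^{3/2} - m_2^{3/2}\right]. \tag{24} \]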
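The latter equation is not valid for N-body configurations, for which the Bekenstein-Milgrom (BM) modified Poisson equation (Eq. 17) must be solved numerically (apart from highly-symmetric N-body configurations). This equation is a non-linear elliptic partial differential equation. It can be solved numerically using various methods [50, 77, 96, 147, 250, 457]. One of them [77, 457] is to use a multigrid algorithm to solve the discrete form of Eq. 17 (see also Figure 17):
\[ \begin{aligned} 4\pi G \rho_{i,j,k} = {} & \left[(\Phi_{i+1,j,k} - \Phi_{i,j,k})\,\mu_{M_1} - (\Phi_{i,j,k} - \Phi_{i-1,j,k})\,\mu_{L_1}\right]/h^2 \\ & + \left[(\Phi_{i,j+1,k} - \Phi_{i,j,k})\,\mu_{M_2} - (\Phi_{i,j,k} - \Phi_{i,j-1,k})\,\mu_{L_2}\right]/h^2 \\ & + \left[(\Phi_{i,j,k+1} - \Phi_{i,j,k})\,\mu_{M_3} - (\Phi_{i,j,k} - \Phi_{i,j,k-1})\,\mu_{L_3}\right]/h^2, \end{aligned} \tag{25} \]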
where

ρ_{i,j,k} is the density discretized on a grid of step h,

Φ_{i,j,k} is the MOND potential discretized on the same grid of step h,

μ_{M_1} and μ_{L_1} are the values of μ(x) at points M_{1} and L_{1} corresponding to (i + 1/2, j, k) and (i − 1/2, j, k) respectively (Figure 17).
The gradient components (∂/∂x, ∂/∂y, ∂/∂z) entering the argument of μ(x) are approximated, in the case of μ_{M_1}, by \(([\Phi (B) - \Phi (A)]/h, [\Phi (I) + \Phi (H) - \Phi (K) - \Phi (J)]/(4h), [\Phi (C) + \Phi (D) - \Phi (E) - \Phi (F)]/(4h))\) (see Figure 17).
In [457] the Gauss-Seidel relaxation with red and black ordering is used to solve this discretized equation, with the boundary condition for the Dirichlet problem given by Eq. 20 at large radii. Subsequently devising an evolving N-body code for this theory can only be done using particle-mesh techniques rather than the gridless multipole expansion treecode schemes widely used in standard gravity.
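The discretization described above can be sketched as a function that evaluates the discrete operator div[μ(|∇Φ|/a_{0})∇Φ] on the interior of a regular grid. This is only an illustration under stated assumptions, not the multigrid solver of [77, 457]: the function names are hypothetical, the grid is indexed as `phi[i, j, k]`, the simple function μ(x) = x/(1 + x) is used by default, and μ is evaluated at the six half-way points using a direct difference along the link and four-point averaged centered differences in the two transverse directions.

```python
import numpy as np

def mu_simple(x):
    """Simple interpolating function, used here as an assumed default."""
    return x / (1.0 + x)

def bm_operator(phi, h, a0, mu=mu_simple):
    """Discrete div[mu(|grad phi|/a0) grad phi] on interior grid points."""
    inner = (slice(1, -1),) * 3
    result = np.zeros_like(phi[inner])
    for axis in range(3):
        for step in (+1, -1):
            # Neighbour one cell away along this axis (wrap-around values
            # only touch the boundary layer, which is discarded below).
            nb = np.roll(phi, -step, axis=axis)
            grad_sq = ((nb - phi) / h) ** 2
            for other in range(3):
                if other == axis:
                    continue
                up = np.roll(phi, -1, axis=other) + np.roll(nb, -1, axis=other)
                down = np.roll(phi, 1, axis=other) + np.roll(nb, 1, axis=other)
                grad_sq = grad_sq + ((up - down) / (4.0 * h)) ** 2
            mu_half = mu(np.sqrt(grad_sq) / a0)
            result += (mu_half * (nb - phi))[inner] / h**2
    return result

if __name__ == "__main__":
    axis_values = np.arange(6) * 0.5
    X, Y, Z = np.meshgrid(axis_values, axis_values, axis_values, indexing="ij")
    print(bm_operator(2.0 * X, 0.5, 1.0).max())  # uniform field: no source
```

A relaxation scheme would iterate on Φ until this operator matches 4πGρ at every interior point, with the boundary values held fixed. With μ ≡ 1 the operator reduces to the usual seven-point Laplacian.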
Finally, let us note that it could be imagined that MOND, given some of its observational problems (developed in Section 6.6), is incomplete and needs a new scale in addition to a_{0}. There are several ways to implement such an idea, but for instance, Bekenstein [36] proposed in this vein a generalization of the AQUAL formalism by adding a velocity scale s_{0}, in order to allow for effective variations of the acceleration constant as a function of the deepness of the potential, namely:
leading to
where \({a_{0{\rm{eff}}}} = {a_0}{e^{-\Phi/s_0^2}}\). Interestingly, with this “modified MOND”, Gauss’ theorem (or Newton’s second theorem) would no longer be valid in spherical symmetry. A suitable choice of s_{0} (e.g., on the order of 10^{3} km/s; see [36]) could affect the dynamics of galaxy clusters (by boosting the modification with an effectively higher value of a_{0}) compared to the previous MOND equation, while keeping the less massive systems such as galaxies typically unaffected compared to usual MOND. Other (lower) values of s_{0} could allow (modulo a renormalization of a_{0}) for a stronger modification in galaxy clusters as well as a milder modification in subgalactic systems such as globular clusters, which, as we shall soon see, could be interesting from a phenomenological point of view (see Section 6.6). However, the possibility of too strong a modification should be carefully investigated, as well as, in a relativistic version of the theory (see Section 7), the consequences on the dynamics of a scalar field with a similar action.
QUMOND
Another way [319] of modifying gravity in order to reproduce Milgrom’s law is to still keep the “matter action” unchanged S_{kin} + S_{in} = ∫ ρ(v^{2}/2 − Φ)d^{3}x dt, thus ensuring that varying the action of a test particle with respect to the particle degrees of freedom leads to d^{2}x/dt^{2} = −∇Φ, but to invoke an auxiliary acceleration field g_{ N } = −∇Φ_{ N } in the gravitational action instead of invoking an aquadratic Lagrangian in ∇Φ. The addition of such an auxiliary field can of course be done without modifying Newtonian gravity, by writing the Newtonian gravitational action in the following way^{Footnote 27}:
It gives, after variation over g_{ N } (or over Φ_{ N }): g_{ N } = −∇Φ. And after variation of the full action over Φ: −∇.g_{ N } = 4πGρ, i.e., Newtonian gravity. One can then introduce a MONDian modification of gravity by modifying this action in the following way, replacing \({\rm{g}}_N^2\) by a nonlinear function of it and assuming that it derives from an auxiliary potential g_{ N } = −∇Φ_{ N }, so that the new degree of freedom is this new potential:
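Varying the total action with respect to Φ yields: ∇^{2}Φ_{N} = 4πGρ. And varying it with respect to the auxiliary (Newtonian) potential Φ_{N} yields:
\[ \nabla^2\Phi = \nabla\cdot\left[\nu\left(\frac{\vert\nabla\Phi_N\vert}{a_0}\right)\nabla\Phi_N\right], \tag{30} \]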
where ν(y) = Q′(z) and z = y^{2}. Thus, the theory requires one only to solve the Newtonian linear Poisson equation twice, with only one non-linear step in calculating the rhs term of Eq. 30. For this reason, it is called the quasi-linear formulation of MOND (QUMOND). In order to recover the ν-function behavior of Milgrom’s law (Eq. 10), i.e., ν(y) → 1 for y ≫ 1 and ν(y) → y^{−1/2} for y ≪ 1, one needs to choose:
\[ Q(z) \rightarrow z \;\; {\rm for} \;\; z \gg 1 \quad {\rm and} \quad Q(z) \rightarrow \frac{4}{3} z^{3/4} \;\; {\rm for} \;\; z \ll 1. \tag{31} \]
The general solution of the system of partial differential equations is equivalent to Milgrom’s law (Eq. 10) up to a curl field correction, and is precisely equal to Milgrom’s law in highly-symmetric one-dimensional systems. However, this curl-field correction is different from the one of AQUAL. This means that, outside of high symmetry, AQUAL and QUMOND cannot be precisely equivalent. An illustration of this is given in [509]: for a system with all its mass in an elliptical shell (in the sense of a squashed homogeneous spherical shell), the effective density of matter that would source the MOND force field in Newtonian gravity is uniformly zero in the void inside the shell for QUMOND, but non-zero for AQUAL.
The concept of the effective density of matter that would source the MOND force field in Newtonian gravity is extremely useful for an intuitive comprehension of the MOND effect, and/or for interpreting MOND in the dark matter language: indeed, subtracting from this effective density the baryonic density yields what is called the “phantom dark matter” distribution. In AQUAL, it requires deriving the Newtonian Poisson equation after having solved for the MOND one. On the other hand, in QUMOND, knowing the Newtonian potential yields direct access to the phantom dark matter distribution even before knowing the MOND potential. After choosing a ν-function, one defines
\[ \tilde\nu(y) = \nu(y) - 1, \tag{32} \]
and one has, for the phantom dark matter density,
\[ \rho_{\rm ph} = \frac{\nabla\cdot\left(\tilde\nu\,\nabla\Phi_N\right)}{4\pi G}. \tag{33} \]
This \({\tilde \nu}\)-function appears naturally in an alternative formulation of QUMOND where one writes the action as a function of an auxiliary potential Φ_{ph}:
leading to a potential Φ_{ph} obeying a QUMOND equation with \(\tilde \nu (y) = H'({y^2})\) and Φ = Φ_{N} + Φ_{ph}.
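Numerically, for a given Newtonian potential discretized on a grid of step h, the discretized phantom dark matter density is given on grid points (i,j,k) by (see Figure 17 and cf. Eq. 25, see also [11]):
\[ \begin{aligned} \rho_{\rm ph}^{i,j,k} = \frac{1}{4\pi G h^2}\Big[ & (\Phi_N^{i+1,j,k} - \Phi_N^{i,j,k})\,\tilde\nu_{M_1} - (\Phi_N^{i,j,k} - \Phi_N^{i-1,j,k})\,\tilde\nu_{L_1} \\ & + (\Phi_N^{i,j+1,k} - \Phi_N^{i,j,k})\,\tilde\nu_{M_2} - (\Phi_N^{i,j,k} - \Phi_N^{i,j-1,k})\,\tilde\nu_{L_2} \\ & + (\Phi_N^{i,j,k+1} - \Phi_N^{i,j,k})\,\tilde\nu_{M_3} - (\Phi_N^{i,j,k} - \Phi_N^{i,j,k-1})\,\tilde\nu_{L_3} \Big]. \end{aligned} \tag{35} \]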
This means that any N-body technique (e.g., treecodes or fast multipole methods) can be adapted to QUMOND (a grid being necessary as an intermediate step). Once the Newtonian potential (or force) is locally known, the phantom dark matter density can be computed and then represented by weighted particles, whose gravitational attraction can then be computed in any traditional manner. An example is given in Figure 18, where one considers a rather typical baryonic galaxy model with a small bulge and a large disk. Applying Eq. 35 (with the ν-function of Eq. 43) then yields the phantom density [253]. Interestingly, this phantom density is composed of a round “dark halo” and a flattish “dark disk” (see [305] for an extensive discussion of how such a dark disk component comes about; see also [50] and Section 6.5.2 for observational considerations). Let us note that this phantom dark matter density can be slightly separated from the baryonic density distribution in non-spherical situations [226], and that it can be negative [297, 490], contrary to normal dark matter. Finding such a local negative dark matter density could be a way of exhibiting a clear signature of MOND.
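In spherical symmetry the phantom dark matter can be obtained without a grid, which gives a quick intuition for the concept. The sketch below is a simplified illustration, not the grid-based procedure described above: it assumes a point mass, for which Gauss’ theorem applied to the phantom density gives an enclosed phantom mass M_ph(r) = [ν(y) − 1]M with y = GM/(r^2 a_{0}), and it assumes the simple ν-function ν(y) = [1 + (1 + 4/y)^{1/2}]/2. Names, units and the value of a_{0} are hypothetical choices.

```python
import math

G = 4.30091e-6   # Newton's constant in kpc (km/s)^2 / Msun
A0 = 3.7e3       # assumed a_0 in (km/s)^2 / kpc

def nu_tilde_simple(y):
    """nu(y) - 1 for the simple interpolating function."""
    return 0.5 * (math.sqrt(1.0 + 4.0 / y) - 1.0)

def phantom_enclosed_mass(mass_msun, radius_kpc):
    """Phantom mass (Msun) inside radius r around a point mass."""
    y = G * mass_msun / (radius_kpc**2 * A0)
    return nu_tilde_simple(y) * mass_msun

def phantom_density(mass_msun, radius_kpc, dr=1.0e-3):
    """Phantom density (Msun/kpc^3) from a centered difference of M_ph(r)."""
    dm = (phantom_enclosed_mass(mass_msun, radius_kpc + dr)
          - phantom_enclosed_mass(mass_msun, radius_kpc - dr))
    return dm / (2.0 * dr) / (4.0 * math.pi * radius_kpc**2)

if __name__ == "__main__":
    for radius in (1.0, 10.0, 100.0):
        print(radius, phantom_enclosed_mass(1.0e10, radius),
              phantom_density(1.0e10, radius))
```

In the deep-MOND regime the enclosed phantom mass grows linearly with radius, so the phantom density falls as r^{−2}, like an isothermal halo.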
Finally, let us note that, as shown in [319, 509], (i) a system made of high-acceleration constituents, but with a low-acceleration center of mass, moves according to a low-acceleration MOND law, while (ii) the virial of a system is given by
meaning that for a system entirely in the extremely weak field limit where ν(y) = y^{−1/2} and Q(z) = (4/3)z^{3/4}, the second term vanishes and we get \(W = -(2/3)\sqrt {G{M^3}{a_0}}\), precisely as in Bekenstein-Milgrom MOND. This means that, although the curl-field correction is in general different in AQUAL and QUMOND, the two-body force in the deep-MOND limit is the same [509].
The interpolating function
The basis of the MOND paradigm is to reproduce Milgrom’s law, Eq. 7, in highly symmetrical systems, with an interpolating function asymptotically obeying the conditions of Eq. 8, i.e., μ(x) → 1 for x ≫ 1 and μ(x) → x for x ≪ 1. Obviously, in order for the relation between g and g_{N} to be univocally determined, another constraint is that xμ(x) must be a monotonically increasing function of x, or equivalently
\[ \frac{d \ln \mu}{d \ln x} > -1, \tag{37} \]
or equivalently, in terms of the function F of the aquadratic Lagrangian (with μ(x) = F′(z) and z = x^{2}),
\[ F'(z) + 2 z F''(z) > 0. \tag{38} \]
Even though this leaves some freedom for the exact shape of the interpolating function, leading to the various families of functions hereafter, let us insist that it is already extremely surprising, from the dark matter point of view, that the MOND prescriptions for the asymptotic behavior of the interpolating function did predict all the aspects of the dynamics of galaxies listed in Section 5.
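As we have seen in Section 6.1, an alternative formulation of the MOND paradigm relies on Eq. 10, based on an interpolating function
\[ \nu(y) \rightarrow 1 \;\; {\rm for} \;\; y \gg 1 \quad {\rm and} \quad \nu(y) \rightarrow y^{-1/2} \;\; {\rm for} \;\; y \ll 1. \tag{39} \]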
In that case, we also have that yν(y) must be a monotonically increasing function of y.
Finally, as we shall see in detail in Section 7, many MOND relativistic theories boil down to multifield theories where the weak-field limit can be represented by a potential Φ = Σ_{i} ϕ_{i}, where each ϕ_{i} obeys a generalized Poisson equation, the most common case being
\[ \Phi = \Phi_N + \phi, \tag{40} \]
where Φ_{N} obeys the Newtonian Poisson equation and the scalar field ϕ (with dimensions of a potential) plays the role of the phantom dark matter potential and obeys an equation of either the type of Eq. 17 or of Eq. 30. When it obeys a QUMOND type of equation (Eq. 30), the ν-function must be replaced by the \({\tilde \nu}\)-function of Eq. 32. When it obeys a BM-like equation (Eq. 17), the classical interpolating function μ(x) acting on x = |∇Φ|/a_{0} must be replaced by another interpolating function \(\tilde \mu (s)\) acting on s = |∇ϕ|/a_{0}, in order for the total potential Φ to conform to Milgrom’s law^{Footnote 28}. In the absence of a renormalization of the gravitational constant, the two functions are related through [145]
\[ s = x\left[1 - \mu(x)\right] \quad {\rm and} \quad \tilde\mu(s) = \left[\mu^{-1}(x) - 1\right]^{-1}. \tag{41} \]
For x ≪ 1 (the deep-MOND regime), one has s = x(1 − x) ≪ 1 and x ∼ s(1 + s), yielding \(\tilde \mu (s) \sim s\), i.e., although it is generally different, \({\tilde \mu}\) has the same low-gravity asymptotic behavior as μ.
In spherical symmetry, all these different formulations can be made equivalent by choosing equivalent interpolating functions, but the theories will typically differ slightly outside of spherical symmetry (i.e., the curl field will be slightly different). As an example, let us consider a widely-used interpolating function [141, 166, 402, 508] yielding excellent fits in the intermediate to weak gravity regime of galaxies (but not in the strong gravity regime of the Solar system), known as the “simple” μ-function (see Figure 19):
\[ \mu(x) = \frac{x}{1 + x}. \tag{42} \]
This yields y = x^{2}/(1 + x), and thus \(x = (y + \sqrt {{y^2} + 4y})/2\), so that ν = (1 + x)/x yields the “simple” ν-function:
\[ \nu(y) = \frac{1 + (1 + 4y^{-1})^{1/2}}{2}. \tag{43} \]
It also yields s = x[1 − μ(x)] = x/(1 + x) = μ, and hence x = s/(1 − s), yielding for the “simple” \({\tilde \mu}\)-function:
\[ \tilde\mu(s) = \frac{s}{1 - s}. \tag{44} \]
A more general family of \({\tilde \mu}\)-functions is known as the α-family [15], valid for 0 ≤ α ≤ 1 and including the simple function as the α = 1 case^{Footnote 29}:
\[ \tilde\mu_\alpha(s) = \frac{s}{1 - \alpha s}, \tag{45} \]
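corresponding to the following family of μ-functions:
\[ \mu_\alpha(x) = \frac{2x}{1 + (2 - \alpha)x + \left[(1 - \alpha x)^2 + 4x\right]^{1/2}}. \tag{46} \]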
The α = 0 case is sometimes referred to as “Bekenstein’s μ-function” (see Figure 19) as it was used in [33]. The problem here is that all these μ-functions approach 1 quite slowly, with ζ ≤ 1 in their asymptotic expansion for x → ∞, μ(x) ∼ 1 − Ax^{−ζ}. Indeed, since s = x[1 − μ(x)], its asymptotic behavior is s ∼ Ax^{−ζ+1}. So, if ζ > 1, s → 0 for x → ∞ as well as for x → 0, which would imply that \(x(s) = s \tilde \mu (s) + s\) would be a multivalued function, and that the gravity would be ill-defined. This is problematic because even for the extreme case ζ = 1, the anomalous acceleration does not go to zero in the strong gravity regime: there is still a constant anomalous “Pioneer-like” acceleration x[1 − μ(x)] → A, which is observationally excluded^{Footnote 30} from very accurate planetary ephemerides [154]. What is more, these \({\tilde \mu}\)-functions, defined only in the domain 0 < s < α^{−1}, would need very carefully chosen boundary conditions to avoid covering values of s outside of the allowed domain when solving the Poisson equation for the scalar field.
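The correspondence between the μ- and \({\tilde \mu}\)-functions of the α-family can be checked numerically. The sketch below assumes the forms \(\tilde\mu_\alpha(s) = s/(1 - \alpha s)\) and \(\mu_\alpha(x) = 2x/\{1 + (2 - \alpha)x + [(1 - \alpha x)^2 + 4x]^{1/2}\}\), together with the relations s = x[1 − μ(x)] and s\(\tilde\mu\)(s) = xμ(x) that hold without renormalization of G. Function names are hypothetical.

```python
import math

def mu_alpha(x, alpha):
    """mu-function of the alpha-family (alpha = 1 is the simple function)."""
    disc = (1.0 - alpha * x) ** 2 + 4.0 * x
    return 2.0 * x / (1.0 + (2.0 - alpha) * x + math.sqrt(disc))

def mu_tilde_alpha(s, alpha):
    """Scalar-field interpolating function, defined for 0 < s < 1/alpha."""
    return s / (1.0 - alpha * s)

def scalar_field_strength(x, alpha):
    """s = x [1 - mu(x)]: scalar-field gradient in units of a0."""
    return x * (1.0 - mu_alpha(x, alpha))

if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        for x in (0.01, 1.0, 100.0):
            s = scalar_field_strength(x, alpha)
            # Both columns give the Newtonian gravity in units of a0.
            print(alpha, x, s * mu_tilde_alpha(s, alpha), x * mu_alpha(x, alpha))
```

For α = 1 the anomalous acceleration s = x[1 − μ(x)] tends to a constant at large x, which is the constant “Pioneer-like” term discussed above.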
The way out to design \({\tilde \mu}\)-functions corresponding to acceptable μ-functions in the strong gravity regime is to proceed to a renormalization of the gravitational constant [145]: this means that the bare value of G in the Poisson and generalized Poisson equations ruling the bare Newtonian potential ϕ_{N} and the scalar field ϕ in Eq. 40 is different from the gravitational constant measured on Earth, G_{N} (related to the true Newtonian potential Φ_{N}). One can assume that the bare gravitational constant G is related to the measured one through
\[ G_N = \xi\, G, \tag{47} \]
meaning that x = y + s, where \(x = \vert\nabla \Phi\vert/{a_0}\), \(y = \vert\nabla {\phi _N}\vert/{a_0} = \vert\nabla {\Phi _N}\vert/(\xi {a_0})\), and \(s\,\tilde \mu (s) = y\). We then have for Milgrom’s law:
\[ \mu(x) = \xi\left(1 - \frac{s}{x}\right). \tag{48} \]
In order to recover μ(x) → 1 for x → ∞, it is straightforward to show [145] that it suffices that \(\tilde \mu (s) \rightarrow {{\tilde \mu}_0}\) for s → ∞, and that \(\xi = 1 + \tilde \mu _0^{-1}\). Then, if ζ > 1 in the asymptotic expansion μ(x) ∼ 1 − x^{−ζ}, one has \(s \sim {(1 + \tilde \mu _0^{-1})^{-1}}{x^{-\zeta + 1}} + {(1 + {{\tilde \mu}_0})^{-1}}x\). This second linear term allows s to go to infinity for large x and thus x(s) to be single-valued. On the other hand, for the deep-MOND regime, the renormalization of G implies that \(\tilde \mu (s) \rightarrow s/\xi\) for \(s \ll 1\).
We can then use, even in multifield theories, μ-functions quickly asymptoting to 1. For each of these functions, there is a one-parameter family of corresponding \({\tilde \mu}\)-functions (labelled by the parameter \(\tilde \mu (\infty) = {{\tilde \mu}_0}\)), obtained by inserting μ(x) into \(s = x[1 - {\xi ^{-1}}\mu (x)]\) and making sure that the function is increasing and thus invertible. A useful family of such μ-functions asymptoting more quickly towards 1 than the α-family is the n-family:
\[ \mu_n(x) = \frac{x}{(1 + x^n)^{1/n}}. \tag{49} \]
The case n = 1 is again the simple function, while the case n = 2 has been extensively used in rotation curve analysis from the very first analyses [28, 223] to this day [401], and is thus known as the “standard” μ-function (see Figure 19). The corresponding \({\tilde \mu}\)-function for n ≥ 2 has a very peculiar shape of the type shown in Figure 3 of [81] (which might be considered a fine-tuned shape, necessary to account for solar system constraints). On the other hand, the corresponding ν-function family is:
\[ \nu_n(y) = \left[\frac{1 + (1 + 4y^{-n})^{1/2}}{2}\right]^{1/n}. \tag{50} \]
As the simple μ-function (α = 1 or n = 1) fits galaxy rotation curves well (see Section 6.5.1) but is excluded in the solar system (see Section 6.4), it can be useful to define μ-functions that have a gradual transition similar to the simple function in the low to intermediate gravity regime of galaxies, but a more rapid transition towards one than the simple function. Two such families are described in [325] in terms of their ν-function:
and
Finally, yet another family was suggested in [274], obtained by deleting the second term of the γ-family, and retaining the virtues of the n-family in galaxies, but approaching one more quickly in the solar system:
To be complete, it should be noted that other μfunctions considered in the literature include [304, 505] (see also Section 7.10):
and
This simply shows the variety of shapes that the interpolating function of MOND can in principle take^{Footnote 31}. Very precise data for rotation curves, including negligible errors on the distance and on the stellar mass-to-light ratios (or, in that case, purely gaseous galaxies) should allow one to pin down its precise form, at least in the intermediate gravity regime and for “modified inertia” theories (Section 6.1.1) where Milgrom’s law is exact for circular orbits. Nowadays, galaxy data still allow some, but not much, wiggle room: they tend to favor the α = n = 1 simple function [166] or some interpolation between n = 1 and n = 2 [141], while combined data of galaxies and the solar system (see Sections 6.4 and 6.5) rather tend to favor something like the γ = δ = 1 function of Eq. 52 and Eq. 53 (which effectively interpolates between n = 1 and n = 2, see Figure 19), although slightly higher exponents (i.e., γ > 1 or δ > 1) might still be needed in the weak gravity regime in order to pass solar system tests involving the external field from the galaxy [62]. Again, it should be stressed that the most salient aspect of MOND is not its precise interpolating function, but rather its successful predictions on galactic scaling relations and Kepler-like laws of galactic dynamics (Section 5.2), as well as its various beneficial effects on, e.g., disk stability (see Section 6.5), all predicted from its asymptotic form. The very concept of a pre-defined interpolating function should even in principle fully disappear once a more profound parent theory of MOND is discovered (see, e.g., [22]).
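To make the comparison between members of the n-family concrete, the following sketch evaluates μ_n(x) = x/(1 + x^n)^{1/n} together with the inverse form ν_n(y) = {[1 + (1 + 4y^{−n})^{1/2}]/2}^{1/n}, and checks that they describe the same relation between g and g_N (with y = xμ(x) and x = yν(y)). These expressions are taken as the assumed forms of the n-family; the function names are hypothetical.

```python
import math

def mu_n(x, n):
    """n-family mu-function: n = 1 is 'simple', n = 2 is 'standard'."""
    return x / (1.0 + x**n) ** (1.0 / n)

def nu_n(y, n):
    """Inverse form, such that g = nu_n(g_N / a0) * g_N."""
    return (0.5 * (1.0 + math.sqrt(1.0 + 4.0 * y ** (-n)))) ** (1.0 / n)

if __name__ == "__main__":
    for n in (1, 2, 3):
        for x in (0.1, 1.0, 10.0):
            y = x * mu_n(x, n)        # Newtonian gravity in units of a0
            x_back = y * nu_n(y, n)   # should recover the true gravity
            print(n, x, x_back, 1.0 - mu_n(x, n))
```

The last printed column shows how quickly each function approaches 1 at high accelerations: the deviation 1 − μ_n(x) shrinks much faster for n = 2 than for n = 1, which is why higher n is easier to reconcile with solar system constraints.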
To end this section on the interpolating function, let us stress that if the μ-function asymptotes as μ(x) = x for x → 0, then the energy of the gravitational field surrounding a massive body is infinite [38]. What is more, if the \({\tilde \mu}\)-function of relativistic multifield theories asymptotes in the same way to zero before going to negative values for time-evolution dominated systems (see Section 9.1), then a singular surface exists around each galaxy, on which the scalar degree of freedom does not propagate, and can therefore not provide a consistent picture of collapsed matter embedded into a cosmological background. A simple solution [145, 380] consists in assuming a modified asymptotic behavior of the μ-function, namely of the form
\[ \mu(x) \sim \varepsilon_0 + x \;\; {\rm for} \;\; x \ll 1. \tag{56} \]
In that case there is a return to a Newtonian behavior (but with a very strong renormalized gravitational constant G_{N}/ε_{0}) at a very low acceleration scale x ≪ ε_{0}, and rotation curves of galaxies are only approximately flat until the galactocentric radius
\[ R \sim \frac{1}{\varepsilon_0}\sqrt{\frac{GM}{a_0}}. \tag{57} \]
Thus, one must have ε_{0} ≪ 1 to not affect the observed phenomenology in galaxies. Note that the μ-function will never go to zero, even at the center of a system. Conversely, in QUMOND and the like, one can modify the ν-function in the same way:
The external field effect
The above return to a rescaled Newtonian behavior at very large radii and in the central parts of isolated systems, in order to avoid theoretical problems with the interpolating function, would happen anyway, even with the interpolating function going to zero, for any non-isolated system in the universe (and this return to Newtonian behavior could actually happen at much lower radii) because of a very peculiar aspect of MOND: the external field effect, which appeared in its full significance already in the pristine formulation of MOND [293].
In practice, no objects are truly isolated in the Universe and this has wider and more subtle implications in MOND than in Newton-Einstein gravity. In the linear Newtonian dynamics, the internal dynamics of a subsystem (a cluster in a galaxy, or a galaxy in a galaxy cluster for instance) in the field of its mother system decouples. Namely, the internal dynamics is always the same independent of any external field (constant across the subsystem) in which the system is embedded (of course, if the external field varies across the subsystem, it manifests itself as tides). This has subsequently been built in as a fundamental principle of GR: the Strong Equivalence Principle (see Section 7). But MOND has to break this fundamental principle of GR. This is because, as it is an acceleration-based theory, what counts is the total gravitational acceleration with respect to a pre-defined frame (e.g., the CMB frame^{Footnote 32}). Thus, the MOND effects are only observed in systems where the absolute value of the gravity both internal, g, and external, g_{e} (from a host galaxy, or astrophysical system, or large scale structure), is less than a_{0}. If g_{e} < g < a_{0} then we have standard MOND effects. However, if the hierarchy goes as g < a_{0} < g_{e}, then the system is purely Newtonian^{Footnote 33}, and if g < g_{e} < a_{0} then the system is Newtonian with a renormalized gravitational constant. Ultimately, whenever g falls below g_{e} (which always happens at some point) the gravitational attraction falls again as 1/r^{2}. This is most easily illustrated in a thought experiment where one considers MOND effects in one dimension. In Eq. 17, one has −∇Φ = g + g_{e} and −4πGρ = ∇.(g_{N} + g_{Ne}), which in one dimension leads to the following revised Milgrom’s law (Eq. 7) including the external field:
\[ g\, \mu\left(\frac{g + g_e}{a_0}\right) + g_e\left[\mu\left(\frac{g + g_e}{a_0}\right) - \mu\left(\frac{g_e}{a_0}\right)\right] = g_N, \tag{59} \]
such that, when g → 0, we have Newtonian gravity with a renormalized gravitational constant G_{norm} ≈ G/[μ_{e}(1 + L_{e})], where μ_{e} = μ(g_{e}/a_{0}) and L_{e} = (d ln μ/d ln x)_{x = g_{e}/a_{0}}, assuming, as before, that the external field only varies on a much larger scale than the internal system. Similarly, for QUMOND (Eq. 30) in one dimension, one gets the equivalent of Eq. 10:
\[ g = g_N\,\nu\!\left(\frac{g_N+g_{Ne}}{a_0}\right) + g_{Ne}\left[\nu\!\left(\frac{g_N+g_{Ne}}{a_0}\right) - \nu\!\left(\frac{g_{Ne}}{a_0}\right)\right]. \]
When dealing in the future with very extended rotation curves whose last observed point is in the extreme weak-field limit, it could be interesting, as a first-order approximation, to use the latter formulae^{Footnote 34}, adding the external field as an additional parameter of the MOND fit to the external parts of the rotation curve. Of course, this would only be a first-order approximation because it would neglect the three-dimensional nature of the problem and the direction of the external field.
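As a rough illustration of how such a one-dimensional formula could be evaluated in practice, the following Python sketch solves for the internal gravity g by bisection. It rests on two assumptions that are not spelled out in this passage: that the one-dimensional relation reads μ((g + g_e)/a_0)(g + g_e) = g_N + μ(g_e/a_0) g_e, and that the interpolating function is the "simple" one, μ(x) = x/(1 + x). The function names are hypothetical.

```python
import math

A0 = 1.2e-10  # m s^-2, the value of a_0 quoted in this review


def mu_simple(x):
    """Assumed 'simple' interpolating function, mu(x) = x / (1 + x)."""
    return x / (1.0 + x)


def internal_gravity_1d(g_newton, g_ext, mu=mu_simple, a0=A0):
    """Internal gravity g solving the assumed 1D relation
    mu((g + g_e)/a0) * (g + g_e) = g_N + mu(g_e/a0) * g_e  (bisection)."""
    rhs = g_newton + mu(g_ext / a0) * g_ext

    def residual(g):
        total = g + g_ext
        return mu(total / a0) * total - rhs

    lo, hi = 0.0, max(g_newton, a0)
    while residual(hi) < 0.0:   # expand the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(200):        # residual is monotonic in g, so bisection converges
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def g_norm_ratio(g_ext, a0=A0):
    """G_norm / G = 1 / [mu_e (1 + L_e)] for the simple function, where L = 1 / (1 + x)."""
    x_e = g_ext / a0
    return 1.0 / (mu_simple(x_e) * (1.0 + 1.0 / (1.0 + x_e)))
```

For g_N much smaller than g_e, the ratio `internal_gravity_1d(g_N, g_e) / g_N` should approach `g_norm_ratio(g_e)`, which is the renormalized-constant limit described above. In an actual fit, g_e would be treated as one extra free parameter.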
Now, in three dimensions, the problem can be analytically solved only in the extreme case of the completely-external-field-dominated part of the system (where g ≪ g_{e}) by considering the perturbation generated by a body of low mass m inside a uniform external field, assumed along the z-direction, g_{e} = g_{e} 1_{z}. Eq. 17 can then be linearized and solved with the boundary condition that the total field equals the external one at infinity [38] to yield:
\[ \Phi(x,y,z) = -\frac{Gm}{\mu_e\,\tilde{r}}, \]
with
\[ \tilde{r} = r\left[1 + L_e\,\frac{x^2+y^2}{r^2}\right]^{1/2}, \]
squashing the isopotentials along the external field direction. Thus, this is the asymptotic behavior of the gravitational field in any system embedded in a constant external field. Similarly, in QUMOND (Eq. 30), one gets
\[ \Phi(x,y,z) = -\frac{Gm\,\nu_e}{\tilde{r}}, \]
with
\[ \tilde{r} = \frac{r}{1 + L_{Ne}\,(x^2+y^2)/(2r^2)}, \]
where ν_{e} = ν(g_{Ne}/a_{0}) and L_{Ne} = (d ln ν/d ln y)_{y = g_{Ne}/a_{0}}.
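The linearization step for the Bekenstein-Milgrom case can be sketched as follows. This assumes that Eq. 17 has the standard form ∇·[μ(|∇Φ|/a_0)∇Φ] = 4πGρ, and the symbol φ for the internal perturbation is introduced here only for illustration. Writing ∇Φ = g_e 1_z + ∇φ with |∇φ| ≪ g_e, one has to first order

\[ |\nabla\Phi| \simeq g_e + \partial_z\phi, \qquad \mu\!\left(\frac{|\nabla\Phi|}{a_0}\right) \simeq \mu_e\left[1 + L_e\,\frac{\partial_z\phi}{g_e}\right], \]

so that the field equation reduces to

\[ \mu_e\left[\partial_x^2 + \partial_y^2 + (1+L_e)\,\partial_z^2\right]\phi = 4\pi G\rho . \]

Rescaling z → z/(1 + L_e)^{1/2} turns this into an ordinary Poisson equation, which for a point mass m gives

\[ \phi = -\frac{Gm}{\mu_e\left[(1+L_e)(x^2+y^2) + z^2\right]^{1/2}} . \]

The potential is therefore Newtonian in its radial dependence, with a direction-dependent effective gravitational constant.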
For the exact behavior of the MOND gravitational field in the regime where g and g_{e} are of the same order of magnitude, one again resorts to a numerical solver, both for the BM equation case and for the QUMOND case (see Eq. 25 and Eq. 35). For the BM case, one adds the three components of the external field (no longer assumed to be in the z-direction only) in the argument of μ_{M1}, which becomes {[(Φ(B) − Φ(A))/h − g_{ex}]^{2} + [(Φ(I) + Φ(H) − Φ(K) − Φ(J))/(4h) − g_{ey}]^{2} + [(Φ(C) + Φ(D) − Φ(E) − Φ(F))/(4h) − g_{ez}]^{2}}^{1/2}, and similarly for the other M_{i} and L_{i} points on the grid (Figure 17). One also adds the respective component of the external field to the term estimating the force at the M_{i} and L_{i} points in Eq. 25. With M_{1}, for instance, one changes (Φ_{i+1,j,k} − Φ_{i,j,k}) → (Φ_{i+1,j,k} − Φ_{i,j,k} − h g_{ex}) in the first term of Eq. 25. One then solves this discretized equation with the large-radius boundary condition for the Dirichlet problem given by Eq. 61 instead of Eq. 20. Exactly the same is applicable to calculating the phantom dark matter component of QUMOND with Eq. 35, except that now the Newtonian external field is added to the terms of the equation in exactly the same way.
This external field effect (EFE) is a remarkable property of MONDian theories, and because it breaks the strong equivalence principle, it allows us to derive properties of the gravitational field in which a system is embedded from its internal dynamics (and not only from tides). For instance, the return to a Newtonian (Eq. 61 or Eq. 63) instead of a logarithmic (Eq. 20) potential at large radii is what defines the escape speed in MOND. By observationally estimating the escape speed from a system (e.g., the Milky Way escape speed from our local neighborhood; see discussion in Section 6.5.2), one can estimate the amplitude of the external field in which the system is embedded, and by measuring the shape of its isopotential contours at large radii, one can determine the direction of that external field, without resorting to tidal effects. It is also notable that the phantom dark matter has a tendency to become negative in “conoidal” regions perpendicular to the external field direction (see Figure 3 of [490]): with accurate-enough weak-lensing data, detecting these pockets of negative phantom densities could, in principle, be a smoking gun for MOND [490], but such an effect would be extremely sensitive to the detailed distribution of the baryonic matter. A final important remark about the EFE is that it prevents most possible MOND effects in Galactic disk open clusters or in wide binaries, apart from a possible rescaling of the gravitational constant. Indeed, for wide binaries located in the solar neighborhood, the Galactic EFE (coming from the distribution of mass in our Galaxy) is about 1.5 × a_{0}. The corresponding rescaling of the gravitational constant then depends on the choice of the μ-function, but could typically account for up to a 50% increase of the effective gravitational constant. Although this is not, properly speaking, a MOND effect, it could still perhaps imply a systematic offset of mass for very-long-period binaries. However, any effect of the type claimed to be observed by [188] would not be expected in MOND due to the external field effect.
MOND in the solar system
The primary place to test modified gravity theories is, of course, the solar system, where general relativity has, until now, passed all the proposed tests. Detecting a deviation from Einsteinian gravity in our backyard would actually be the holy grail of modified gravity theories, in the same sense as direct detection in the lab is the holy grail of the CDM paradigm. However, MOND anomalies typically manifest themselves only in the weak-gravity regime, several orders of magnitude below the typical gravitational field exerted by the Sun on, e.g., the inner planets. But in the case of modified inertia (Section 6.1.1), the anomalous acceleration at any location depends on properties of the whole orbit (non-locality), so that anomalies may appear in the motion of solar system bodies that are on highly-eccentric trajectories taking them to large distances (e.g., long-period comets or the Pioneer spacecraft), where accelerations are low [314]. Such MOND effects have been proposed as a possible mechanism for generating the Pioneer anomaly [314, 469], without affecting the motions of planets, whose orbits are fully in the high-acceleration regime. On the other hand, in classical, non-relativistic modified gravity theories (Sections 6.1.2 and 6.1.3), small effects could still be observable and would primarily probe two aspects of the theory: (i) the shape of the interpolating function (Section 6.2) in the regime x ≫ 1, and (ii) the external Galactic gravitational field (Section 6.3) acting on the solar system, testing the interpolating function in the regime x ≪ 1.
If, as a first approximation, one considers the solar system as isolated, and the Sun as a point mass, the MOND effect in the inner solar system appears as an anomalous acceleration field in addition to the Newtonian one. In units of a_{0}, the amplitude of the anomalous acceleration is given by x[1 − μ(x)], which can be constrained from the motion of the inner planets, typically their perihelion precession and the (non-)variation of Kepler’s constant [293, 391, 417]. These constraints typically exclude the whole α-family of interpolating functions (Eq. 46) that are natural for multi-field theories such as TeVeS (see Section 6.2 and Section 7) because they yield x[1 − μ(x)] > 1 for x ≫ 1 while it must be smaller than 0.04 at the orbit of Mars [391]^{Footnote 35}. Of course, this does not mean that the μ-function cannot be represented by the α-family in the intermediate gravity regime characterizing galaxies, but it must be modified in the strong gravity regime^{Footnote 36}. Another potential effect of MOND is anomalously strong tidal stresses in the vicinity of saddle points of the Newtonian potential, which might be tested with LISA Pathfinder [37, 49, 255, 464]. The MOND bubble can be quite big and clearly detectable, or the effect could be small and undetectable, depending on the interpolating function [255, 161].
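To get a feel for the numbers involved, the sketch below evaluates x and x[1 − μ(x)] at the orbit of Mars for an isolated Sun. The functional form μ_n(x) = x/(1 + x^n)^{1/n} is an assumption about the n-family mentioned in this section (its n = 1 member being the "simple" function), and the solar gravitational parameter and the semi-major axis of Mars are standard values rather than numbers taken from the text. The function names are hypothetical.

```python
import math

GM_SUN = 1.327e20   # m^3 s^-2, heliocentric gravitational parameter (standard value)
AU = 1.496e11       # m
A0 = 1.2e-10        # m s^-2, value of a_0 quoted in this review


def one_minus_mu_n(x, n):
    """1 - mu_n(x) for the assumed n-family mu_n(x) = x / (1 + x**n)**(1/n)."""
    # mu_n = (1 + x**-n)**(-1/n); expm1/log1p avoid catastrophic cancellation
    # when mu_n is extremely close to 1 (x >> 1).
    return -math.expm1(-math.log1p(x ** (-n)) / n)


def anomalous_amplitude(x, n):
    """Anomalous acceleration in units of a_0, i.e. x * [1 - mu_n(x)]."""
    return x * one_minus_mu_n(x, n)


# Newtonian solar gravity at Mars (semi-major axis 1.524 AU), in units of a_0
x_mars = GM_SUN / (1.524 * AU) ** 2 / A0

for n in (1, 2):
    print(n, anomalous_amplitude(x_mars, n))
```

With these assumptions x is of order 10^7 at Mars, the n = 1 amplitude is of order unity (well above the 0.04 bound), and the n = 2 amplitude is many orders of magnitude below it. This only illustrates the isolated-Sun bound and says nothing about the external-field constraint discussed next.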
The approximation of an isolated solar system being incorrect, it is also important to add the effect of the external field from the Galaxy. Its amplitude is typically on the order of ∼ 1.5 × a_{0}. From there, Milgrom [314] has predicted (both analytically and numerically) a subtle anomaly in the form of a quadrupole field that may be detected in planetary and spacecraft motions (as subsequently confirmed by [62, 185]). This has been used to constrain the form of the interpolating function in the weak acceleration regime characteristic of the external field itself. Constraints have essentially been set on the n-family of μ-functions from the perihelion precession of Saturn [63, 154], namely that one must have n > 8 in order to fit these data^{Footnote 37}.
However, it should be noted that it is slightly inconsistent to compare the classical predictions of MOND with observational constraints obtained by a global fit of solar system orbits using a fully-relativistic first post-Newtonian model. Although the above constraints on classical MOND models are useful guides, proper constraints can only truly be set on the various relativistic theories presented in Section 7, the first-order constraints on these theories coming from their own post-Newtonian parameters [65, 99, 173, 372, 391, 450]. What is more, and this perhaps makes all these tests unnecessary, it has recently been shown that it is possible to cancel any deviation from general relativity at small distances in most of these relativistic theories, independently of the form of the μ-function [22].
MOND in rotationally-supported stellar systems
Rotation curves of disk galaxies
The root and heart of MOND, as modified inertia or modified gravity, is Milgrom’s formula (Eq. 7). Up to some small corrections outside of symmetrical situations, this formula yields (once a_{0} and the form of the transition function μ are chosen) a unique prediction for the total effective gravity as a function of the gravity produced by the visible baryons. It is absolutely remarkable that this formula, devised 30 years ago, has been able to successfully predict an impressive number of galactic scaling relations (the “Kepler-like” laws of Section 5.2, backed by the modern data of Section 4.3) that were very imprecise and/or unobserved at the time, and which are still a puzzle to understand in the ΛCDM framework. What is more, this formula not only predicts global scaling relations successfully: we show in this section that it also predicts the shape and amplitude of galactic rotation curves at all radii with uncanny precision, and this for all disk galaxy Hubble types [168, 402]. Of course, the exact prediction depends on the exact formulation of MOND (as modified inertia or some form or other of modified gravity), but the differences are small compared to observational error bars, and even compared with the differences between various μ-functions.
In order to illustrate this, we plot in Figure 20 the theoretical rotation curve of an HSB exponential disk (see [145] for exact parameters) computed with three different formulations of MOND^{Footnote 38}: Milgrom’s formula (Eq. 7), representative of circular orbits in modified inertia, AQUAL (Eq. 17), and a multi-field theory (Eq. 40) representative of a whole class of relativistic theories (see Sections 7.1 to 7.4), all with the α = n = 1 “simple” μ-function of Eq. 46 and Eq. 49. One can see velocity differences of only a few percent in this case, while, in general, it has been shown that the maximum difference between formulations is on the order of 10% for any type of disk [76]. This justifies using Milgrom’s formula as a proxy for MOND predictions on rotation curves, keeping in mind that, in order to constrain MOND within the modified gravity framework, one should actually calculate predictions of the various modified Poisson formulations of Section 6.1 for each galaxy model, and for each choice of galaxy parameters [18].
The procedure is then the following (see Section 4.3.4 for more detail). One usually assumes that light traces stellar mass (constant mass-to-light ratio, but see the counter-example M33), and one adds to this baryonic density the contribution of observed neutral hydrogen, scaled up to account for the contribution of primordial helium. The Newtonian gravitational force of baryons is then calculated via the Newtonian Poisson equation, and the MOND force is simply obtained via Eq. 7 or Eq. 10. First of all, an interpolating function must be chosen, then one can determine the value of a_{0} by fitting, all at once, a sample of high-quality rotation curves with small distance uncertainties and no obvious non-circular motions. Then, all individual rotation curve fits can be performed with the mass-to-light ratio of the disk as the single free parameter of the fit^{Footnote 39}. It turns out that using the simple interpolating function (α = n = 1, see Eqs. 46 and 49) yields a value of a_{0} = 1.2 × 10^{−10} m s^{−2}, and excellent fits to galaxy rotation curves [166]. However, as already pointed out in Sections 6.3 and 6.4, this interpolating function yields too strong a modification in the solar system, so hereafter we use the γ = δ = 1 interpolating function of Eqs. 52 and 53 (solid blue line on Figure 19), very similar to the simple interpolating function in the intermediate to weak gravity regime.
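The last step of this procedure, going from the Newtonian baryonic curve to the MOND prediction, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: it uses the algebraic form g = ν(g_N/a_0) g_N with the ν-function that corresponds to the simple μ(x) = x/(1 + x), namely ν(y) = 1/2 + (1/4 + 1/y)^{1/2}, rather than the γ = δ = 1 function adopted in the text, and the function names and input conventions are hypothetical.

```python
import math

A0 = 1.2e-10     # m s^-2, value of a_0 quoted in the text
KPC = 3.0857e19  # m
KMS = 1.0e3      # m/s


def nu_simple(y):
    """nu-function paired with the simple mu(x) = x/(1+x): nu(y) = 1/2 + sqrt(1/4 + 1/y)."""
    return 0.5 + math.sqrt(0.25 + 1.0 / y)


def mond_velocity(r_kpc, v_star_kms, v_gas_kms, ml_ratio, a0=A0):
    """Predicted circular speed (km/s) at radius r_kpc.

    v_star_kms: Newtonian stellar contribution computed for a mass-to-light ratio of 1.
    v_gas_kms:  Newtonian gas contribution (already scaled up for helium).
    ml_ratio:   stellar mass-to-light ratio, the single free parameter of the fit.
    """
    v_bar_sq = ml_ratio * (v_star_kms * KMS) ** 2 + (v_gas_kms * KMS) ** 2
    g_newton = v_bar_sq / (r_kpc * KPC)            # Newtonian acceleration of the baryons
    g_mond = nu_simple(g_newton / a0) * g_newton   # algebraic MOND boost
    return math.sqrt(g_mond * r_kpc * KPC) / KMS
```

A fit would then vary `ml_ratio` to minimize the residuals between `mond_velocity` and the observed velocities at each radius. Far from a point-like mass M, the predicted speed tends to the flat value (G M a_0)^{1/4}, which is a convenient sanity check.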
Figure 21 shows two examples of detailed MOND fits to rotation curves of Figure 13. The black line represents the Newtonian contribution of stars and gas and the blue line is the MOND fit, the only free parameter being the stellar mass-to-light ratio^{Footnote 40}. Not only does MOND predict the general trend for LSB and HSB galaxies, it also predicts the observed rotation curves in great detail. This procedure has been carried out for 78 nearby galaxies (all galaxy rotation curves to which the authors have access), and the residuals between the observed and predicted velocities, at every point in all these galaxies (thus about two thousand individual measurements), are plotted in Figure 23. As an illustration of the variety and richness of rotation curves fitted by MOND, as well as of the range of magnitude of the discrepancies covered, we display in Figure 24 fits to rotation curves of extremely massive HSB early-type disk galaxies [402] with V_{f} up to 400 km/s, and in Figure 25 fits to very low mass LSB galaxies [324] with V_{f} down to 15 km/s. In the latter, gas-rich, small galaxies, the detailed fits are insensitive to the exact form of the interpolating function (Section 6.2) and to the stellar mass-to-light ratio [168, 324]. We then display in Figure 26 eight fits for representative galaxies from the latest high-resolution THINGS survey [166, 481], and in Figure 27 six fits of yet other LSB galaxies (as these provide strong tests of MOND and depend less on the exact form of the interpolating function than HSB ones) from [120], updated with high-resolution Hα data [242, 241]. The overall results for the whole sample of 78 nearby galaxies (Figure 23) are globally very impressive, although there are a few outliers among the 2000 measurements. These are but a few trees outlying from a very clear forest. It is actually only as the quality of the data declines [384] that one begins to notice small disparities. These are sometimes attributable to external disturbances that invalidate the assumption of equilibrium [403], to non-circular motions, or to poor observational resolution. For targets that are intrinsically difficult to observe, minor problems become more common [120, 448]. These typically have to do with the challenges inherent in combining disparate astronomical data sets (e.g., rotation curves measured independently at optical and radio wavelengths) and constraining the inclinations. A single individual galaxy that can be considered as a bit problematic is NGC 3198 [68, 166], but this could simply be due to a problem with the potentially too high Cepheid-based distance (reddening problem mentioned in [254]). Indeed, the adopted distance plays an important role in the MOND fitting procedure, as the value of the centripetal acceleration \(V_c^2/R\) depends on the distance through the conversion of the observed angular radius in arcsec into the physical radius R in kpc. Note that other galaxies such as NGC 2841 had historically posed problems to MOND but that these have largely gone away with modern data (see [166] and Figure 26).
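The distance dependence mentioned above is easy to make explicit. In the sketch below, the observed rotation speed and angular radius are fixed and only the adopted distance changes, so the inferred centripetal acceleration scales as 1/D. The numerical values in the usage lines are purely illustrative and do not refer to any particular galaxy, and the function name is hypothetical.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond
MPC = 3.0857e22                      # m
A0 = 1.2e-10                         # m s^-2, value of a_0 quoted in the text


def centripetal_acceleration(v_kms, theta_arcsec, distance_mpc):
    """V^2 / R in m s^-2, with R obtained from the angular radius and the adopted distance."""
    radius_m = theta_arcsec * ARCSEC * distance_mpc * MPC
    return (v_kms * 1.0e3) ** 2 / radius_m


# Illustrative values only: same observed V and angular radius, two adopted distances
a_near = centripetal_acceleration(150.0, 300.0, 10.0)
a_far = centripetal_acceleration(150.0, 300.0, 14.0)
print(a_near / A0, a_far / A0, a_near / a_far)
```

Because the acceleration enters the interpolating function directly, a distance error shifts every point of the curve in units of a_{0}, which is why fits are most informative for galaxies with well-determined distances.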
We finally note that what makes all these rotation curve fits really impressive is that either (i) stellar mass-to-light ratios are unimportant (in the case of gas-rich galaxies), yielding excellent fits with essentially zero free parameters (apart from some wiggle room on the distance), or (ii) stellar mass-to-light ratios are important, and their best-fit value, obtained on purely dynamical grounds assuming MOND, varies with galaxy color as one would expect on purely astrophysical grounds from stellar population synthesis models [42]. There is absolutely nothing built into MOND that would require that redder galaxies should have higher stellar mass-to-light ratios in the B-band, but this is what the rotation curve fits require. This is shown on Figure 28, where the best-fit mass-to-light ratio in the B-band is plotted against B − V color index (left panel), and the same for the K-band (right panel).
The Milky Way
Our own Milky Way galaxy (an HSB galaxy) is a unique laboratory within which present and future surveys will allow us to perform many precision tests of MOND (at a level of precision that might even discriminate between the various versions of MOND described in Section 6.1) that are not feasible with external galaxies. However, concerning the rotation curve, the test is, at present, not the most conclusive, as the outer rotation curve of the Milky Way is paradoxically much less precisely known than that of external galaxies (the forthcoming Gaia mission should allow improvement to this situation, although the rotation curve will not be measured directly). Nevertheless, past studies of the inner rotation curve of the Milky Way [141, 142, 274], measured with the tangent point method, compared to the baryonic content of the inner Galaxy [53, 155], have shown full agreement between the rotation curve and MOND, assuming, as usual, the simple interpolating function (α = n = 1 in Eqs. 46 and 49) or the γ = δ =1 interpolating function (Eqs. 52 and 53). The inverse problem was also tackled, i.e., deriving the surface density of the inner Milky Way disk from its rotation curve (see Figure 29): this exercise [274] led to a derived surface density fully consistent with star count data, and also even reproducing the details of bumps and wiggles in the surface brightness (Renzo’s rule, Section 4.3.4), while being fully consistent with the (somewhat imprecise) constraints on the outer rotation curve of the galaxy [494].
Beyond the rotation curve, and especially with the advent of present and future astrometric and spectroscopic surveys, the Milky Way offers a unique opportunity to test many other predictions of MOND. These include the effect of the “phantom dark disk” (see Figure 18) on vertical velocity dispersions and on the tilt of the stellar velocity ellipsoid, the precise shape of tidal streams around the Galaxy, or the effects of the external gravitational field in which the Milky Way is embedded on fundamental parameters such as the local escape speed. However, all these predictions can vary slightly depending on the exact formulation of MOND (mainly Bekenstein-Milgrom MOND, QUMOND, or multi-field theories, the predictions being anyway difficult to make in modified inertia versions of MOND when non-circular orbits are considered). Most of the predictions made until today and reviewed hereafter have been using the Bekenstein-Milgrom version of MOND (Eq. 17).
Based on the baryonic distribution from, e.g., the Besançon model of the Milky Way [366], one can compute the MOND gravitational field of the Galaxy by solving the BM equation (Eq. 17). This has been done in [490]. Then one can apply the Newtonian Poisson equation to it, in order to recover the density distribution that would have yielded this potential within Newtonian dynamics [50, 140]. In this context, as already shown (Figure 18), MOND predicts a disk of “phantom dark matter”, allowing one to clearly differentiate it from a Newtonian model with a dark halo:

(i)
By measuring the force perpendicular to the galactic plane: at the solar radius, MOND predicts a 60 percent enhancement of the dynamic surface density at 1.1 kpc above the plane compared to the baryonic surface density, a value in agreement with current data (Table 1, see also [339]). The enhancement would become more apparent at large galactocentric radii where the stellar disk mass density becomes negligible.

(ii)
By determining dynamically the scale length of the disk mass density distribution. This scale length is a factor ∼ 1.25 larger than the scale length of the visible stellar disk if Bekenstein-Milgrom MOND applies. Such a test could be applied with existing RAVE data [423], but the accuracy of available proper motions still limits the possibility to explore the gravitational forces too far from the solar neighborhood.

(iii)
By measuring the velocity ellipsoid tilt angle within the meridional galactic plane. This tilt is different within the MOND and Newton+dark halo cases in the inner part of the Galactic disk. The tilt of about 6 degrees at z = 1 kpc at the solar radius is in agreement with the recent determination of 7.3 ± 1.8 degrees obtained by [422]. The difference between MOND and a Newtonian model with a spherical halo becomes significant at z = 2 kpc. Interestingly, recent data [328] on the tilt of the velocity ellipsoid at these heights clearly favor the MOND prediction [50].
Such tests of MOND could be applied with the first release of future Gaia data. To fix ideas on the current local constraints, the predictions of the Besançon MOND model are compared with the relevant observations in Table 1. However, let us note that these predictions are extremely dependent on the baryonic content of the model [53, 155, 366], so that testing MOND at the precision available in the Milky Way heavily relies on star counts, stellar population synthesis, census of the gaseous content (including molecular gas), and inhomogeneities in the baryonic distribution (clusters, gas clouds).
Another test of the predictions of MOND for the gravitational potential of the Milky Way is the thickness of the HI layer as a function of position in the disk (see Section 6.5.3): it has been found [378] that Bekenstein-Milgrom MOND and its phantom disk successfully account for the most recent and accurate flaring of the HI layer beyond 17 kpc from the center, but that it slightly underpredicts the scale height in the region between 10 and 15 kpc. This could indicate that the local stellar surface density in this region should be slightly smaller than usually assumed, in order for MOND to predict a less massive phantom disk and hence a thicker HI layer. Another explanation for this discrepancy would rely on non-gravitational phenomena, namely ordered and small-scale magnetic fields and cosmic rays contributing to support the disk.
Yet another test would be the comparison of the observed Sagittarius stream [198, 248] with the predictions made for a disrupting galaxy satellite in the MOND potential of the Milky Way. Basic comparisons of the stream with the orbit of a point mass have shown agreement at zeroth order [358]. In reality, such an analysis is not straightforward because streams do not delineate orbits, and because of the non-linearity of MOND. However, combining a MOND N-body code with a Bayesian technique [474] in order to efficiently explore the parameter space, it should be possible to rigorously test MOND with such data in the near future, including for external galaxies, which will lead to an exciting battery of new observational tests of MOND.
Finally, a last test of MOND in the Milky Way involves the external field effect of Section 6.3. As explained there, the return to a Newtonian (Eq. 61 or Eq. 63) instead of a logarithmic (Eq. 20) potential at large radii defines the escape speed in MOND. By observationally estimating the escape speed from a system (e.g., the Milky Way escape speed from our local neighborhood), one can estimate the amplitude of the external field in which the system is embedded. With simple analytical arguments, it was found [144] that with an external field of 0.01 a_{0}, the local escape speed at the Sun’s radius was about 550 km/s, exactly as observed (within the observational error range [433]). This was later confirmed by rigorous modeling in the context of Bekenstein-Milgrom MOND and with the Besançon baryonic model of the Milky Way [492]. This value of the external field, 10^{−2} × a_{0}, corresponds to the order of magnitude of the gravitational field exerted by large scale structure, estimated from the acceleration endured by the Local Group during a Hubble time in order to attain a peculiar velocity of 600 km/s.
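The order-of-magnitude estimate in the last sentence can be reproduced with a short calculation. The sketch below assumes a constant acceleration acting over a Hubble time 1/H_0, and the value H_0 = 70 km/s/Mpc is an assumption made here, not a number given in the text. The function name is hypothetical.

```python
A0 = 1.2e-10            # m s^-2, value of a_0 quoted in the text
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec


def mean_acceleration_over_hubble_time(v_pec_kms, h0_kms_mpc=70.0):
    """Constant acceleration (m s^-2) that builds up v_pec over a Hubble time 1/H0."""
    hubble_time_s = KM_PER_MPC / h0_kms_mpc  # 1/H0 in seconds
    return v_pec_kms * 1.0e3 / hubble_time_s


g_lss = mean_acceleration_over_hubble_time(600.0)
print(g_lss / A0)  # external field from large scale structure, in units of a_0
```

Under these assumptions the result is roughly 10^{−2} a_{0}, in line with the value of the external field quoted above.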
Disk stability and interacting galaxies
Many questions in galaxy dynamics require using N-body codes. This is notably necessary for studying the stability of galaxy disks, the formation of bars and spirals, or highly time-varying configurations such as galaxy mergers. As we have seen in Section 6.1.2, the BM modified Poisson equation (Eq. 17) can be solved numerically using various methods [50, 77, 96, 147, 250, 457]. Such a Poisson solver can then be used in particle-mesh N-body codes. More general codes based on QUMOND (Section 6.1.3) are currently under development.
The main results obtained via these simulations are the following (the comparison with observations will be discussed below):

(i)
LSB disks are more unstable regarding bar and spiral instabilities in MOND than in the Newton+spherical halo equivalent case,

(ii)
Bars always tend to appear more quickly in MOND than in the Newton+spherical halo equivalent, and are not slowed down by dynamical friction, leading to fast bars,

(iii)
LSB disks can be both very thin and extended in MOND thanks to the effect of the “phantom disk”, and vertical velocity dispersions level off at 8 km/s, instead of 2 km/s for Newtonian disks,

(iv)
Warps can be created in apparently isolated galaxies from the external field effect of large scale structure in MOND,

(v)
Merging timescales are longer in MOND for interacting galaxies,

(vi)
Reproducing interacting systems such as the Antennae requires relatively fine-tuned initial conditions in MOND, but the resulting galaxy is more extended and thus closer to observations, thanks to the absence of angular momentum transfer to the dark halo.
Concerning the first point (i), Brada & Milgrom [77] investigated the important problem of stability of disk galaxies. They demonstrated that MOND, as anticipated [299], has an effect similar to a dark halo in stabilizing a rotationally-supported disk, thereby explaining the upper limit in surface density seen in the data (Section 4.3.2), and also showing how it damps the growth rate of bar-forming modes in the weak gravitational field regime. In a comparison of MOND disks with the equivalent Newtonian+halo counterpart (with identical rotation curves), they found that, as the surface density of the disk decreases, the growth rate of the bar-forming mode decreases similarly in both cases. However, in the limit of very low surface densities, typical of LSB galaxies, the MOND growth rate stops decreasing, contrary to the Newton+dark halo case (Figure 30). This could provide a solution to the stability challenge of Section 4.2, as observed LSBs do exhibit bars and spirals, which would require an ad hoc dark component within the self-gravitating disk of the Newtonian system. One can also see on this figure that if the surface density is typical of intermediate HSB galaxies, the bar systematically forms more quickly in MOND.
This was confirmed in recent simulations [104, 457], where it was additionally found that (ii) the bar is sustained longer, and is not slowed down by dynamical friction against the dark halo, which leads to fast bars, consistent with the observed fast bars in disk galaxies (measured through the position of resonances). However, when gas inflow and external gas accretion are included, a wider range of pattern speeds is found in MOND, all compatible with observations [458]. Since the bar pattern speed has a tendency to stay constant, the resonances remain at the same positions, and particles are trapped on these orbits more easily than in the Newtonian case, which leads to the formation of rings and pseudo-rings as observed (see Figure 31 and Figure 32). All these results have been shown to be independent of the exact choice of interpolating μ-function [458].
What is more, (iii) LSB disks can be both very thin and extended in MOND thanks to the stabilizing effect of the “phantom disk”, and vertical velocity dispersions level off at 8 km/s, as typically observed [25, 241], instead of 2 km/s for Newtonian disks with Σ = 1 M_{⊙} pc^{−2} (depending on the thickness of the disk). However, the observed value is usually attributed to non-gravitational phenomena. Note that [279] utilized this fact to predict that conventional analyses of LSB disks would infer abnormally high mass-to-light ratios for their stellar populations, a prediction that was subsequently confirmed [159, 371]. But let us also note that this stabilizing effect of the phantom disk, leading to very thin stellar and gaseous layers, could even be too strong in the region between 10 and 15 kpc from the Galactic center in the Milky Way (see Section 6.5.2), and in external galaxies [497], even though, as said, non-gravitational effects such as ordered and small-scale magnetic fields and cosmic rays could significantly contribute to the prediction in these regions.
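The Newtonian figure of about 2 km/s, and its dependence on disk thickness, can be checked against the textbook result for a self-gravitating isothermal sheet with a sech²(z/z_0) density profile, for which σ_z² = π G Σ z_0. This is a standard estimate used here for illustration, not necessarily the calculation behind the number quoted in the text, and the scale height z_0 = 300 pc is an assumed value. The function name is hypothetical.

```python
import math

G_PC = 4.301e-3  # gravitational constant in pc (km/s)^2 / Msun


def sigma_z_isothermal_sheet(surface_density_msun_pc2, scale_height_pc):
    """Vertical velocity dispersion (km/s) of a Newtonian self-gravitating isothermal
    sheet with density proportional to sech^2(z / z0): sigma_z^2 = pi * G * Sigma * z0."""
    return math.sqrt(math.pi * G_PC * surface_density_msun_pc2 * scale_height_pc)


# Sigma = 1 Msun/pc^2 with an assumed scale height of 300 pc
print(sigma_z_isothermal_sheet(1.0, 300.0))
```

With these inputs the dispersion comes out close to 2 km/s, and it scales as the square root of the assumed thickness, which is the sense in which the Newtonian value depends on the thickness of the disk.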
Via these simulations, it has also been shown (iv) that the external field effect of MOND (Section 6.3) offers a mechanism other than the relatively weak effect of tides in inducing and maintaining warps [79]. It was demonstrated that a satellite at the position and with the mass of the Magellanic clouds can produce a warp in the plane of the galaxy with the right amplitude and form [79], and even more importantly, that isolated galaxies could be affected by the external field of large scale structure, inducing a differential precession over the disk, in turn causing a warp [104]. This could provide a new explanation for the puzzle of isolated warped galaxies.
Interactions and mergers of galaxies are (v) very important in the cosmological context of galaxy formation (see also Section 9.2). It has been found [95] from analytical arguments that dynamical friction should be much more efficient in MOND, so that, for instance, bars should slow down more and mergers should occur more quickly. But simulations display exactly the opposite effect, in the sense of bars not slowing down and merger timescales being much longer in MOND [338, 459]. Concerning bars, Nipoti [335] found that they were indeed slowed down more in MOND, as predicted analytically [95], but this is because the simulated bars were unrealistically small compared to observed ones. In reality, the bar takes up a significant fraction of the baryonic mass, and the reservoir of particles to interact with, assumed infinite in the case of the analytic treatment [95], is in reality insufficient to affect the bar pattern speed in MOND. Concerning long merging timescales, an important constraint from this would be that, in a MONDian cosmology, there should perhaps be fewer mergers, but longer ones than in ΛCDM, in order to keep the total observed amount of interacting galaxies unchanged. This is indeed what is expected (see Section 9.2). What is more, the long merging timescales would imply that compact galaxy groups do not evolve statistically over more than a crossing time. In contrast, in the Newtonian+dark halo case, the merging timescale would be about one crossing time because of dynamical friction, such that compact galaxy groups ought to undergo significant merging over a crossing time, contrary to what is observed [239]. Let us also note that, in MOND, many passages in binary galaxies will happen before the final merging, with a starburst triggered at each passage, meaning that the number of observed starbursts as a function of redshift cannot be used as an estimate of the number of mergers [104].
Finally, (vi) at a more detailed level, the Antennae system, the prototype of a major merger, has been shown to be nicely reproducible in MOND [459]. This is illustrated in Figure 33. By contrast, while it is well established that CDM models can produce tidal tails, it turns out to be difficult to simultaneously match the narrow morphology of many observed tidal tails with the rotation curves of the systems from which they come [130]. In MOND, reproducing the Antennae requires relatively fine-tuned initial conditions, but the resulting tidal tails are narrow and the galaxy is more extended and thus closer to observations than with CDM, thanks to the absence of angular momentum transfer to the dark halo (a solution to the angular momentum challenge of Section 4.2).
Tidal dwarf galaxies
As seen in, e.g., Figure 33, left panel, major mergers between spiral galaxies are frequently observed with dwarf galaxies at the extremity of their tidal tails, called Tidal Dwarf Galaxies (TDGs). These young objects are formed through gravitational instabilities within the tidal tails, leading to local collapse of gas and star formation. They are very common in interacting systems: in some cases dozens of such condensations are seen in the tidal tails, a few of them having a mass typical of other dwarf galaxies in the Universe. However, in the ΛCDM model, these objects are difficult to form, and require a very extended dark matter distribution [71]. In MOND simulations [459, 104], the exchange of angular momentum occurs within the disks, whose sizes are inflated. For this reason, it is much easier with MOND to form TDGs in extended tidal tails.
What is more, in the ΛCDM context, these objects are not expected to drag CDM around them, the reason being that they are formed out of the material in the tidal tails, itself made of the dynamically cold, rotating material of the progenitor disk galaxies. In these disks, the local ratio of dark matter to baryons is close to zero. For this reason, the ΛCDM prediction is that these objects should not exhibit a mass discrepancy problem. However, the first ever measurement of the rotation curves of three TDGs in the NGC 5291 ring system (Figure 34) has revealed the presence of dark matter in these three objects [72]. A solution to explain this in the standard picture could then be to resort to dark baryons in the form of cold molecular gas in the disks of the progenitor galaxies. However, it is very surprising that a very different kind of dark matter, in this case baryonic dark matter, would conspire to assemble itself in precisely the right way to put the three TDGs on the baryonic Tully-Fisher relation (see Section 4.3.1), when this baryonic dark matter is not taken into account in the baryonic budget of the BTF. Another possibility, not resorting to baryonic dark matter, would be that, by chance, the three TDGs have been observed precisely edge-on. However, if we simply consider the most natural inclination coming from the geometry of the ring (i = 45°, see [72]), and apply Milgrom’s formula to the visible matter distribution with zero free parameters [165, 309], one gets very reasonable curves (Figure 35). Adjusting the inclinations slightly allows perfect fits to these rotation curves [165], while the external field effect has been shown not to change the result significantly. Therefore, we can conclude that ΛCDM has severe problems with these objects, while MOND does exceedingly well in explaining their observed rotation curves.
However, observations of only three TDGs are, of course, not enough from a statistical point of view for this result to be as robust as needed. Many other TDGs should be observed to randomize the uncertainties, and consolidate (or invalidate) this potentially extremely important result, which could allow one to really discriminate between Milgrom’s law being either a consequence of some fundamental aspect of gravity (or of the nature of dark matter), or simply a mere recipe for how CDM organizes itself inside spiral galaxies. In summary, since the internal dynamics of tidal dwarfs should not be affected by CDM, a statistically-significant sample of TDGs cannot obey Milgrom’s law if that law is only linked to the way CDM assembles itself in galaxies. Thus, observations of the internal dynamics of TDGs should be one of the observational priorities of the coming years in order to settle this debate.
Finally, let us note that it has been suggested [239], as a possible solution to the satellites phase-space correlation problem of Section 4.2, that most dwarf satellites of the Milky Way could have been formed tidally, thereby being old tidal dwarf galaxies. They would then naturally appear in closely related planes, explaining the observed disk-of-satellites. While this scenario would lead to a missing satellites catastrophe in ΛCDM (see Section 4.2), it could actually make sense in a MONDian Universe (see Section 9.2).
MOND in pressure-supported stellar systems
We have already outlined (Section 5.2) how Milgrom’s formula accounts for general scaling relations of pressure-supported systems such as the Faber-Jackson relation (Figure 7 and see [395]), and that isothermal systems have a finite mass in MOND, with the density at large radii falling approximately as r^{−4} [296]. Note also that, in order to match the observed fundamental plane, MOND models must actually deviate somewhat from being strictly isothermal and isotropic: a radial orbit anisotropy in the outer regions is needed [388, 86]. Here we concentrate on slightly more detailed predictions and scaling relations. In general, these detailed predictions are less straightforward to make than in rotationally-supported systems, precisely because of the new degree of freedom introduced by the anisotropy of the velocity distribution, which is very difficult to constrain observationally (as moments of higher order than the velocity dispersion would be needed to constrain it). As we shall see, the successes of MOND are in general a bit less impressive in pressure-supported systems than in rotationally-supported ones, and in some cases even really problematic (e.g., in the case of galaxy clusters, see Section 6.6.4). Whether this is due to the fact that predictions are less straightforward to make, or whether this truly reflects a breakdown of Milgrom’s formula for these objects (or the fact that certain theoretical versions of MOND would explicitly deviate from Milgrom’s formula in pressure-supported systems, see Section 6.1.1) remains unclear.
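To give a feeling for the numbers behind the Faber-Jackson scaling, the following minimal sketch evaluates the standard deep-MOND result for an isolated, isotropic, isothermal sphere, σ^{4} = (4/81) G M a_{0}. The prefactor applies only to this idealized case, the value a_{0} = 1.2 × 10^{−10} m s^{−2} is an assumed value, and the function name and the masses used below are purely illustrative.

```python
# Deep-MOND velocity dispersion of an idealized isolated, isotropic,
# isothermal sphere: sigma^4 = (4/81) * G * M * a0.
# All numerical inputs are illustrative assumptions, not fitted values.

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
A0 = 1.2e-10       # assumed MOND acceleration constant, m s^-2
M_SUN = 1.989e30   # solar mass, kg


def deep_mond_sigma(m_baryon_msun, a0=A0):
    """Return the 1D velocity dispersion in km/s for a baryonic mass in Msun."""
    sigma4 = (4.0 / 81.0) * G * m_baryon_msun * M_SUN * a0
    return sigma4 ** 0.25 / 1.0e3


# Hypothetical baryonic masses spanning dwarfs to giant ellipticals.
for mass in (1.0e5, 1.0e7, 1.0e9, 1.0e11):
    print(f"M = {mass:.0e} Msun -> sigma = {deep_mond_sigma(mass):.1f} km/s")
```

Because σ scales as M^{1/4}, a factor of 16 in baryonic mass changes the predicted dispersion by only a factor of two, which is why small velocity dispersions must be measured very accurately to test the relation.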
Elliptical galaxies
Luminous elliptical galaxies are dense bodies of old stars with very little gas and typically large internal accelerations. The ages of their stellar populations suggest that they formed early and that all the gas has been used to form stars. To form early, one might expect the presence of a massive dark-matter halo, but the study of, e.g., [367] showed that there is actually very little evidence for dark matter within the effective radius, and even within several effective radii, in ellipticals. On the other hand, these are very-HSB objects and would thus not be expected to show a large mass discrepancy within the bright optical object in MOND. And indeed, the results of [367] were shown to be in perfect agreement with MOND predictions, assuming very reasonable anisotropy profiles [323]. On the theoretical side, it was also shown, importantly, that triaxial elliptical galaxies can be reproduced using the Schwarzschild orbit superposition technique [482], and that these models are stable [493]^{Footnote 41}.
Interestingly, some observational studies circumvented the mass-anisotropy degeneracy by constructing non-parametric models of observed elliptical galaxies, from which equivalent circular velocity curves, radial profiles of mass-to-light ratio, and anisotropy profiles, as well as high-order moments, could be computed [171]. Thanks to these studies, it was, e.g., shown [171] that, although not much dark matter is needed, the equivalent circular velocity curves (see also [484] where the rotation curve could be measured directly) tend to become flat at much larger accelerations than in thin exponential disk galaxies. This would seem to contradict the MOND prescription, for which flat circular velocities typically occur well below the acceleration threshold a_{0}, but not at accelerations on the order of a few times a_{0} as in ellipticals. However, as shown in [363], if one assumes the simple interpolating function (α = n = 1 in Eq. 46 and Eq. 49), known to yield excellent fits to spiral galaxy rotation curves (see Section 6.5.1), one finds that MONDian galaxies exhibit a flattening of their circular velocity curve at high accelerations if they can be described by a Jaffe profile [208] in the region where the circular velocity is constant. Since this flattening at high accelerations is not possible for exponential profiles, it is remarkable that such flattenings of circular velocity curves at high accelerations are only observed in elliptical galaxies. What is more, [171], as well as [454], derived from their models scaling relations for the configuration-space and phase-space densities of dark matter in ellipticals, and these DM scaling relations have been shown [363] to be in very good agreement with the MOND predictions for “phantom DM” (Eq. 33) scaling relations. This is displayed in Figure 36.
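The flattening argument can be explored numerically with a short sketch. We assume here that the simple interpolating function is μ(x) = x/(1 + x), for which Milgrom’s formula inverts in closed form to g = g_{N}/2 + (g_{N}^{2}/4 + g_{N} a_{0})^{1/2}. The Jaffe mass and scale radius below are hypothetical, a_{0} ≈ 3.7 × 10^{3} (km/s)^{2} kpc^{−1} (about 1.2 × 10^{−10} m s^{−2}) is an assumed value, and all function names are ours.

```python
import numpy as np

G = 4.30091e-6   # gravitational constant in kpc (km/s)^2 / Msun
A0 = 3.7e3       # assumed a_0 in (km/s)^2 / kpc (~1.2e-10 m/s^2)


def jaffe_g_newton(r, m_tot, r_j):
    """Newtonian acceleration of a Jaffe sphere, using M(<r) = m_tot * r / (r + r_j)."""
    return G * m_tot / (r * (r + r_j))


def mond_g_simple(g_n, a0=A0):
    """Invert mu(g/a0) * g = g_N for the simple function mu(x) = x / (1 + x)."""
    return 0.5 * g_n + np.sqrt(0.25 * g_n ** 2 + g_n * a0)


def circular_velocity_mond(r, m_tot, r_j, a0=A0):
    """MOND circular velocity (km/s) at radius r (kpc), ignoring any external field."""
    return np.sqrt(r * mond_g_simple(jaffe_g_newton(r, m_tot, r_j), a0))


# Hypothetical elliptical: 2e11 Msun of baryons, Jaffe radius 10 kpc.
radii = np.array([1.0, 2.0, 5.0, 10.0, 20.0, 40.0])
g_over_a0 = mond_g_simple(jaffe_g_newton(radii, 2.0e11, 10.0)) / A0
v_c = circular_velocity_mond(radii, 2.0e11, 10.0)
for r_i, x_i, v_i in zip(radii, g_over_a0, v_c):
    print(f"r = {r_i:5.1f} kpc  g/a0 = {x_i:6.2f}  v_c = {v_i:6.1f} km/s")
```

Printing g/a_{0} next to v_{c} makes it easy to see over which range of accelerations the curve stays roughly constant for a given choice of parameters. The sketch deliberately leaves out the external field effect and any comparison with an exponential profile.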
Of course, some of these galaxies reside in clusters, and the external field effect (see Section 6.3) could modify the predictions, but this was shown to be negligible for most of the analyzed sample, because the galaxies are far away from the cluster center [363]. Note that, closer to the center of galaxy clusters, interesting behaviors such as lopsidedness caused by the external field effect could allow new tests of MOND in the near future [491]. However, this would require both modelling the orbit of the galaxy in the cluster, to take into account time variations of the external field, and precisely estimating the external field from the cluster itself, which can be tricky as the whole cluster should be modelled at once due to the non-linearity of MOND [113, 259].
At a more detailed level, precise full line-of-sight velocity dispersion profiles of individual ellipticals, typically measured with tracers such as PNe or globular-cluster populations, have been reproduced by solving the Jeans equation in spherical symmetry:
\[\frac{d\sigma^2}{dr} + \sigma^2\,\frac{(2\beta + \alpha)}{r} = -g(r), \qquad (65)\]
where σ is the radial velocity dispersion, α = d ln ρ/d ln r is the slope of the tracer density ρ, and \(\beta = 1 - (\sigma_\theta^2 + \sigma_\phi^2)/2\sigma^2\) is the velocity anisotropy. Note that on the left-hand side, one uses the density and the velocity dispersion of the tracers only, which can be different from the density producing the gravity on the right-hand side, if a specific population of tracers such as globular clusters is used. When the global kinematics of a galaxy is analyzed, we do expect in MOND that the gravity on the right-hand side of Eq. 65 is generated by the observed mass distribution, so both should be fit simultaneously: Figure 37 (provided by [399]) shows an example. In general, it was found that field galaxies are all fit very naturally by MOND [461, 410] (see also [484]). On the other hand, the MOND modification has been found to slightly underpredict the velocity dispersions in large elliptical galaxies at the very center of galaxy clusters [364], which is just the small-scale equivalent of the problem of MOND in clusters, pointing towards missing baryons (see Section 6.6.4).
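As an illustration of how such a Jeans analysis proceeds, the sketch below integrates the spherical Jeans equation for a constant anisotropy β, using the formal solution ρσ^{2}r^{2β} = ∫_{r}^{∞} ρ g s^{2β} ds. Everything specific in it is an assumption made for the example: a Hernquist sphere serves as both tracer and gravitating mass, the simple interpolating function μ(x) = x/(1 + x) gives the MOND acceleration, a_{0} ≈ 3.7 × 10^{3} (km/s)^{2} kpc^{−1}, and the mass and scale length are hypothetical. It returns the radial dispersion only; the projection along the line of sight needed to compare with data is omitted.

```python
import numpy as np

G = 4.30091e-6   # gravitational constant in kpc (km/s)^2 / Msun
A0 = 3.7e3       # assumed a_0 in (km/s)^2 / kpc (~1.2e-10 m/s^2)


def hernquist_density(r, m_tot, a):
    """Hernquist density (Msun / kpc^3); an illustrative tracer profile."""
    return m_tot * a / (2.0 * np.pi * r * (r + a) ** 3)


def hernquist_g_newton(r, m_tot, a):
    """Newtonian acceleration of a Hernquist sphere, G M(<r) / r^2."""
    return G * m_tot / (r + a) ** 2


def mond_g_simple(g_n, a0=A0):
    """MOND acceleration for the simple function mu(x) = x / (1 + x)."""
    return 0.5 * g_n + np.sqrt(0.25 * g_n ** 2 + g_n * a0)


def jeans_sigma_r(r, rho_tracer, g, beta=0.0):
    """Radial velocity dispersion from the spherical Jeans equation.

    Uses rho * sigma^2 * r^(2 beta) = integral_r^inf rho * g * s^(2 beta) ds,
    evaluated with the trapezoidal rule and truncated at the outer grid edge.
    """
    weight = r ** (2.0 * beta)
    f = rho_tracer * g * weight
    seg = 0.5 * (f[1:] + f[:-1]) * np.diff(r)                    # integral over each cell
    tail = np.concatenate([np.cumsum(seg[::-1])[::-1], [0.0]])   # integral from r_i outward
    return np.sqrt(tail / (rho_tracer * weight))


# Hypothetical galaxy: 1e10 Msun, scale length 1 kpc, isotropic orbits.
r = np.logspace(-2, 5, 4000)
m_tot, a = 1.0e10, 1.0
rho = hernquist_density(r, m_tot, a)
g_newt = hernquist_g_newton(r, m_tot, a)
sigma_newt = jeans_sigma_r(r, rho, g_newt)
sigma_mond = jeans_sigma_r(r, rho, mond_g_simple(g_newt))
```

Changing `beta`, or passing a tracer density different from the one generating `g`, mimics the two situations distinguished above: a global fit of the stellar kinematics, or a separate tracer population such as globular clusters.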
On the other hand, [225] used satellite galaxies of ellipticals to test MOND at distances of several hundred kpc. They used the stacked SDSS satellites to generate a pair of mock galaxy groups with reasonably precise line-of-sight velocity dispersions as a function of radius across the group. When these systems were first analysed [225], it was claimed that MOND was excluded at 10σ, but this was only for models that had constant velocity anisotropy. It was then found [14] that with varying anisotropy profiles similar to those found in simulations of the formation of ellipticals by dissipationless collapse in MOND [337], excellent fits to the line-of-sight velocity dispersions of both mock galaxies could be found. This can be taken as strong evidence that MOND describes the dynamics in the surroundings of relatively isolated ellipticals very well.
Finally, let us note an intriguing possibility in a MONDian universe (see also Section 9.2). While massive ellipticals would form at z ≈ 10 [393] from monolithic dissipationless collapse [337], dwarf ellipticals could be more difficult to form. One way to form them would be that tidal dwarf galaxies are formed and survive more easily in major mergers (see Section 6.5.4), and then evolve into the population of dwarf ellipticals seen today, thereby providing a natural explanation for the observed density-morphology relation [239] (more dwarf ellipticals in denser environments).
Dwarf spheroidal galaxies
Dwarf spheroidal (dSph) satellites of the Milky Way [427, 477] exhibit some of the largest mass discrepancies observed in the universe. In this sense, they are extremely interesting objects in which to test MOND. Observationally, let us note that there are essentially two classes of objects in the galactic stellar halo: globular clusters (see Section 6.6.3) and dSph galaxies. These overlap in baryonic mass, but not in surface brightness, nor in age or uniformity of the stellar populations. The globular clusters are generally composed of old stellar populations, they are HSB objects and mostly exhibit no mass discrepancy problem, as expected for HSB objects in MOND. The dSphs, on the contrary, generally contain slightly younger stellar populations covering a range of ages, they are extreme LSB objects and exhibit, as said before, an extreme mass discrepancy, as generically expected from MOND. So, contrary to the case of ΛCDM where different formation scenarios have to be invoked (see Section 6.6.3), the different mass discrepancies in these objects find a natural explanation in MOND.
At a more detailed level, MOND should also be able to fit the whole velocity dispersion profiles, and not only give the right ballpark prediction. This analysis has recently become possible for the eight “classical” dSphs around the Milky Way [477]. Solving the Jeans equation (Eq. 65), it was found [8] that the four most massive and distant dwarf galaxies (Fornax, Sculptor, Leo I and Leo II) have typical stellar mass-to-light ratios, exactly within the expected range. Assuming equilibrium, two of the other four (smallest and most nearby) dSphs have mass-to-light ratios that are a bit higher than expected (Carina and Ursa Minor), and two have very high ones (Sextans and Draco). For all these dSphs, there is a remarkable correlation between the stellar M/L inferred from MOND and the ages of their stellar populations [189]. Concerning the high inferred stellar M/L, note that it has been shown [78] that a dSph will begin to suffer tidal disruption at distances from the Milky Way that are 4–7 times larger in MOND than in CDM; Sextans and Draco could thus actually be partly tidally disrupted in MOND. And indeed, after subjecting the five dSphs with published data to an interloper removal algorithm [418], it was found that Sextans was probably littered with unbound stars, which inflated the computed M/L, while Draco’s diagram of projected distance versus l.o.s. velocity actually looks as out-of-equilibrium as that of Sextans. Ursa Minor, on the other hand, is the typical example of an out-of-equilibrium system, elongated and showing evidence of tidal tails. In the end, only Carina has a suspiciously high M/L (> 4; see [418]).
What is more, there is a possibility that, in a MONDian Universe, dSphs are not primordial objects but have been tidally formed in a major merger (see Section 9.2 as a solution to the phase-space correlation challenge of Section 4.2). In addition to the MOND effect, it would be possible that these objects never really reach a stable equilibrium [237], and exhibit an artificially high M/L ratio. This is even more true for the recently discovered “ultra-faint” dwarf spheroidals, which are also, due to their extremely low density, very much prone to tidal heating in MOND. Indeed, at face value, if these ultra-faints are equilibrium objects, their velocity dispersions are much too high compared to what MOND predicts, and rule out MOND straightforwardly. This could be due to systematic errors linked with the smallness of the velocity dispersion to measure (one must distinguish between σ ≈ 2 km s^{−1} and σ ≈ 5 km s^{−1}), and/or to high intrinsic stellar M/L ratios related to stochastic effects linked with the small number of stars [186]. However, it was also found [285] that these objects are all close to filling their MONDian tidal radii, and that their stars can complete only a few orbits for every orbit of the satellite itself around the Milky Way (see Figure 38). As Brada & Milgrom [78] have shown, it then comes as no surprise that they are displaying out-of-equilibrium dynamics in MOND (and even more so in the case of a tidal formation scenario [237]).
Star clusters
Star clusters come in two types: open clusters and globular clusters. Most observed open clusters are in the inner parts of the Milky Way disk, and for that reason, the prediction of MOND is that their internal dynamics is Newtonian [293] with, perhaps, a slightly renormalized gravitational constant and slightly squashed isopotentials, due to the external field effect (Section 6.3). Therefore, distinguishing Newtonian dynamics from MOND in these objects would require extreme precision. On the other hand, globular clusters are mostly HSB halo objects (see Section 6.6.2), and are consequently predicted to be Newtonian, and most of those that are fluffy enough to display MONDian behavior are close enough to the Galactic disk to be affected by the external field effect (Section 6.3), and so are Newtonian, too. Interestingly, MOND thus provides a natural explanation for the dichotomy between dwarf spheroidals and globular clusters. In ΛCDM, this dichotomy is instead explained by the formation history [235, 397]: globular clusters are supposedly formed in primordial disk-bound supermassive molecular clouds with a high baryon-to-dark matter ratio, and later become more spheroidal due to subsequent mergers. In MOND, it is, of course, not implied that the two classes of objects necessarily have the same formation history, but the different dynamics are qualitatively explained by MOND itself, not by the different formation scenarios.
However, there exist a few globular clusters (fewer than ∼ 10 out of a total of ∼ 150) that are both fluffy enough to display typical internal accelerations well below a_{0}, and far enough from the galactic plane to be more or less immune from the external field effect [27, 182, 181, 436]. Thus, these should, in principle, display a MONDian mass discrepancy. They include, e.g., Pal 14 and Pal 3, or the large fluffy globular cluster NGC 2419. Pal 3 is interesting, because it indeed tends to display a larger-than-Newtonian global velocity dispersion, broadly in agreement with the MOND prediction (Baumgardt & Kroupa, private communication). However, it is difficult to draw too strong a conclusion from this (e.g., on excluding Newtonian dynamics), since there are not many stars observed, and one or two outliers would be sufficient to make the dispersion grow artificially, while a slightly-higher-than-usual mass-to-light ratio could reconcile Newtonian dynamics with the data. Other clusters such as NGC 1851 and NGC 1904 apparently display the same MONDian behavior [408] (see also [187]). On the other hand, Pal 14 displays exactly the opposite behavior: the measured velocity dispersion is Newtonian [212], but again the number of observed stars is too small to draw a statistically significant conclusion [164], and it is still possible to reconcile the data with MOND assuming a slightly low stellar mass-to-light ratio [437]. Note that if the cluster is on a highly eccentric orbit, the external gravitational field could vary very rapidly both in amplitude and direction, and it is possible that the cluster could take some time to accommodate this, still displaying a Newtonian signature in its kinematics after a sudden decrease of the external field.
NGC 2419 is an interesting case, because it allows not only for a measure of the global velocity dispersion, but also of the detailed velocity dispersion profile [199]. And, again, as in the case of Pal 14 (but contrary to Pal 3), it displays Newtonian behavior. More precisely, it was found, solving the Jeans equation (Eq. 65), that the best MOND fit, although not extremely bad in itself, was 350 times less likely than the best Newtonian fit without DM [199, 200]. However, the stability [336] of this best MOND fit has not been checked in detail. These results are heavily debated, as they rely on the small quoted measurement errors on the surface density, and even a slight rotation of only the outer parts of this system near the plane of the sky (which would not show up in the velocity data) would make a considerable difference in the right direction for MOND [398]. Nevertheless, these observations, together with the results on Pal 14, although not ruling out any theory, are not a resounding success for MOND. They could perhaps indicate that globular clusters are generically on highly eccentric orbits, and out of equilibrium because of this (although the effect would have to be opposite to that prevailing in ultra-faint dwarfs, where the departure from equilibrium would boost the velocity dispersion instead of decreasing it). A stronger view on these results could indicate that MOND as formulated today is an incomplete paradigm (see, e.g., Eq. 27), or that MOND is an effect due to the fundamental nature of the DM fluid in galaxies (see Sections 7.6 and 7.9), which is absent from globular clusters. Concerning NGC 2419, it is perhaps useful to remind oneself that it is very plausibly not a globular cluster. It is part of the Virgo stream and is thus most probably the remaining nucleus of a disrupting satellite galaxy in the halo of the Milky Way, on a generically highly eccentric orbit.
Detailed N-body simulations of such an event, and of the internal dynamics of the remaining nucleus, would thus be the key to confront MOND with observations in this object. All in all, the situation regarding MOND and the internal dynamics of globular clusters remains unclear.
On the other hand, it has been noted that MOND seems to overpredict the Roche lobe volume of globular clusters [499, 500, 512]. Again, the fact that globular clusters could generically be on highly eccentric orbits could come to the rescue here. What is more, it was shown that, in MOND, globular clusters can have a cutoff radius that is unrelated to the tidal radius when they are non-isothermal [397]. In general, the cutoff radii of dwarf spheroidals, which have comparable baryonic masses, are larger than those of the globular clusters, meaning that the dwarf spheroidals may well extend to their tidal radii because of a possibly different formation history than globular clusters.
Finally, a last issue for MOND related to globular clusters [335, 377] is the existence of five such objects surrounding the Fornax dwarf spheroidal galaxy. Indeed, under similar environmental conditions, dynamical friction occurs on significantly shorter timescales in MOND than in standard dynamics [95], which could cause the globular clusters to spiral in and merge within at most 2 Gyr [377]. However, this strongly depends on the orbits of the globular clusters, and, in particular, on their initial radius [10], which can allow the orbits to survive for a Hubble time in MOND.
Galaxy groups and clusters
As pointed out earlier (3rd Kepler-like law of Section 5.2), it is a natural consequence of Milgrom’s law that, at the effective baryonic radius of the system, the typical acceleration σ^{2}/R is always observed to be on the order of a_{0}, thereby naturally explaining the linear relation between size and temperature for galaxy clusters [327, 392]. Moreover, one of the main predictions of Milgrom’s formula is the baryonic Tully-Fisher relation (circular velocity vs. baryonic mass, Figure 3), and its equivalent for isotropic pressure-supported systems, the Faber-Jackson relation (stellar velocity dispersion vs. baryonic mass, Figure 7), both for their slope and normalization. For systems such as galaxy clusters, where the hot intracluster gas is the major baryonic component, this relation can also be translated into a “gas temperature vs. baryonic mass” relation, M_{b} ∝ T^{2}, plotted in Figure 39 as the line log(M_{b}/M_{⊙}) = 2 log(T/keV) + 12.9 (note that this differs slightly from [389], where solar metallicity gas is assumed). Note in this figure that observations are closer to the MOND predicted slope than to the conventional prediction of M ∝ T^{3/2} in ΛCDM, without the need to invoke preheating (a need that may arise as an artifact of the mismatch in slopes).
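The mass-temperature line quoted above is easy to evaluate directly. The short sketch below does so, and also compares how the predicted mass responds to a change in temperature for the two slopes under discussion. The function names and the sample temperatures are illustrative; only the line log(M_{b}/M_{⊙}) = 2 log(T/keV) + 12.9 and the two slopes are taken from the text.

```python
import math


def mond_baryonic_mass(t_kev):
    """Baryonic mass (Msun) on the line log(M_b/Msun) = 2 log(T/keV) + 12.9."""
    return 10.0 ** (2.0 * math.log10(t_kev) + 12.9)


def mass_ratio(temperature_ratio, slope):
    """Factor by which mass changes for a given temperature ratio if M scales as T**slope."""
    return temperature_ratio ** slope


# Hypothetical gas temperatures for a group, a poor cluster and a rich cluster.
for t in (1.0, 3.0, 10.0):
    print(f"T = {t:4.1f} keV -> M_b = {mond_baryonic_mass(t):.2e} Msun")

# Doubling the temperature: slope 2 (MOND) versus slope 3/2 (conventional).
print(mass_ratio(2.0, 2.0), mass_ratio(2.0, 1.5))
```

Over a decade in temperature the two slopes differ by a factor of 10^{1/2} in predicted mass, which is why a sample covering both groups and rich clusters is needed to discriminate between them.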
So, interestingly, the data are still reasonably consistent with the slope predicted by MOND [383], but not with the normalization. There is roughly a factor of two of residual missing mass in these objects [170, 354, 387, 389, 392, 453]. This conclusion, reached from applying the hydrostatic equilibrium equation to the temperature profile of the X-ray emitting gas of these objects, has also been reached for low mass X-ray emitting groups [12]. This is essentially because, contrary to the case of galaxies, there is observationally a need for “Newtonian” missing mass in the central parts^{Footnote 42} of clusters, where the observed acceleration is usually slightly larger than a_{0}, meaning that the MOND prescription is not enough to explain the observed discrepancy between visible and dynamical mass there. For this reason, the residual missing mass in MOND is essentially concentrated in the central parts of clusters, where the ratio of MOND dynamical mass to observed baryonic mass reaches a value of 10, to then only decrease to a value of roughly ∼ 2 in the very outer parts, where almost no residual mass is present. The profile of this residual mass would thus consist of a large constant density core of about 100–200 kpc in size (depending on the size of the group/cluster in question), followed by a sharp cutoff.
The need for this residual missing mass in MOND might be taken in one of the five following ways:

(i) Practical falsification of MOND,
(ii) Evidence for missing baryons in the central parts of clusters,
(iii) Evidence for non-baryonic dark matter (existing or exotic),
(iv) Evidence that MOND is an incomplete paradigm,
(v) Evidence for the effect of additional fields in the parent relativistic theories of MOND, not included in Milgrom’s formula.
If (i) is correct, one still needs to explain the success of MOND on galaxy scales with ΛCDM. Such an explanation has yet to be offered. Thus, tempting as case (i) is, it is worth giving a closer inspection to the four other possibilities.
The second case (ii) would be most in line with the elegant absence of need for any non-baryonic mass in MOND (however, see the “dark fields” invoked in Section 7). It has happened before that most of the baryonic mass was in an unobserved component. From the 1930s, when Zwicky first discovered the missing mass problem in clusters, until the 1980s, it was widely presumed that the stars in the observed galaxies represented the bulk of baryonic mass in clusters. Only after the introduction of MOND (in 1983) did it become widely appreciated that the diffuse X-ray emitting intracluster gas (the ICM) greatly outweighed the stars. That is to say, some of the missing mass problem in clusters was due to optically dark baryons: instead of the enormous mass discrepancies implied by cluster dynamical mass to optical light ratios in excess of 100 [24], the ratio of dark to baryonic mass is only ∼ 8 conventionally [175, 278]. So we should not be too hasty in presuming we now have a complete census of baryons in clusters. Indeed, in the global baryon inventory of the universe, ∼ 30% of the baryons produced during BBN are missing (Figure 40), and presumably reside in some, as yet undetected, (dark) form. It is estimated [160, 421] that the observed baryons in clusters only account for about 4% of those produced during BBN (Figure 40). This is much less than the 30% of baryons that are still missing. Consequently, only a modest fraction of the dark baryons need to reside in clusters to solve the problem of missing mass in the central regions of clusters in MOND. It should be highlighted that this missing mass only appears in MOND for systems with a high abundance of ionised gas and X-ray emission. Indeed, for even smaller galaxy groups, devoid of gas, the MOND predictions for the velocity dispersions of individual galaxies are again perfectly in line with the observations [303, 307].
It is then no stretch of the imagination to surmise that these gas-rich systems, where the residual missing baryons problem exists, have equal quantities of molecular hydrogen or other molecules. Milgrom [310] has, e.g., proposed that the missing mass in MOND could entirely be in the form of cold, dense gas clouds. There is an extensive literature discussing searches for cold gas in the cores of galaxy clusters, but what is usually meant there is quite different from what is meant here, since those searches consisted in trying to find the signature of diffuse cold molecular gas at a temperature of ∼ 30 K. The proposal of Milgrom [310] rather relies on the work of Pfenniger & Combes [352], where dense gas clouds with a temperature of only a few Kelvin (∼ 3 K), of solar-system size, and of a Jupiter mass, were considered to be possible candidates for both galactic and extragalactic dark matter. These clouds would behave in a collisionless way, just like stars. However, since the dark mass considered in the context of MOND cannot be present in galaxies, it is not subject to the galactic constraints on such gas clouds. Note that the total sky covering factor of such clouds in the core of the clusters would be on the order of only 10^{−4}, so that they would only occult a minor fraction of the X-rays emitted by the hot gas (and it would be a rather constant fraction). For the same reason, the chances of a given quasar having light absorbed by them are very small. Still, [310] notes that these clouds could be probed through X-ray flashes coming out of individual collisions between them. Of course, this speculative idea also raises a number of questions, the most serious one being how these clumps form and stabilize, and why they form only in clusters, X-ray emitting groups and some ellipticals at the center of these groups and clusters, but not in individual spiral galaxies.
As noted above, the fact that missing mass in MOND is necessarily associated with an abundance of ionised gas could be a hint at a formation and stabilization process somehow linked with the presence of hot gas and X-ray emission themselves. Then, there is the issue of knowing whether the cloud formation would be prior to or posterior to the cluster formation. We note that a rather late formation mechanism could help increase the metal abundance, solving the problem of small-scale variations of metallicity in clusters when the clouds are destroyed [330]. Milgrom [310] also noted that these clouds could alleviate the cooling flow conundrum, because whatever destroys them (e.g., cloud-cloud collisions and dynamical friction between the clouds and the hot gas) is conducive to heating the core gas, and thus preventing it from cooling too quickly. Such a heating source would not be transient and would be quite isotropic, contrary to AGN heating.
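The baryon budget argument made for case (ii) can be checked with simple arithmetic. The sketch below takes the two percentages quoted above (about 4% of the BBN baryons observed in clusters, about 30% still missing) and assumes, as a rough reading of the “factor of two” residual discrepancy, that the unseen mass in clusters is comparable to the observed baryonic mass. The function name and that last assumption are ours.

```python
def fraction_of_missing_baryons_needed(cluster_fraction=0.04,
                                       missing_fraction=0.30,
                                       mass_factor=2.0):
    """Share of the globally missing baryons that would have to sit in clusters.

    cluster_fraction: observed cluster baryons as a fraction of BBN baryons.
    missing_fraction: globally undetected baryons as a fraction of BBN baryons.
    mass_factor: assumed ratio of MOND dynamical mass to observed baryonic mass.
    """
    extra_in_clusters = (mass_factor - 1.0) * cluster_fraction
    return extra_in_clusters / missing_fraction


share = fraction_of_missing_baryons_needed()
print(f"Clusters would need about {100.0 * share:.0f}% of the missing baryons")
```

Under these assumptions the required share is 0.04/0.30, a bit more than a tenth of the undetected baryons, which is the sense in which only a modest fraction of the dark baryons would be needed.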
Another possibility (iii) would be that this residual missing mass in clusters is in the form of non-baryonic matter. There is one obviously existing form of such matter: neutrinos. If \({m_\nu} \approx \sqrt {\Delta {m^2}}\) [434], then the neutrino mass is too small to be of interest in this context. But there is nothing that prevents it from being larger (note that the “cosmological” constraints from structure formation in the ΛCDM context obviously do not apply in MOND). Actual model-independent experimental limits on the electron neutrino mass from the Mainz/Troitsk experiments, counting the highest energy electrons in the β-decay of Tritium [234], are m_{ν} < 2.2 eV. Interestingly, the KATRIN experiment (the KArlsruhe TRItium Neutrino experiment, under construction) will be able to falsify these 2 eV electron neutrinos at 95% confidence. If the neutrino mass is substantially larger than the mass differences, then all types have about the same mass, and the cosmological density of three left-handed neutrinos and their antiparticles [392] would be
where m_{ν} is the mass of a single neutrino type in eV. If one assumes that clusters of galaxies respect the baryon-neutrino cosmological ratio, and that the MOND missing mass is mostly made of neutrinos as suggested by [389, 392], then the mass of neutrinos must indeed be around 2 eV. Combined with the effect of additional degrees of freedom in relativistic MOND theories (Section 7), it has been shown that the CMB anisotropies could also be reproduced (see Section 9.2 and [430]), while this hot dark matter would obviously free-stream out of spiral galaxies and would thus not perturb the MOND fits of Section 6.5.1. The main limit on the neutrino ability to condense in clusters comes from the Tremaine-Gunn limit [463], stating that the phase-space density must be preserved during collapse. This is a density level half the quantum mechanical degeneracy level in phase-space:
Converting this into configuration space, the maximum density for a cluster of a given temperature, T, is defined for a given mass of one neutrino type as [463]:
Assuming the temperature of the neutrino fluid as being equal (due to violent relaxation) to the mean emission-weighted temperature of the gas, Sanders [389] showed that such 2 eV neutrinos at the limit of experimental detection could indeed account for the bulk of the dynamical mass in his sample of galaxy clusters of T > 4 keV (see also Section 8.3 for gravitational lensing constraints). This has the great advantage of naturally reproducing the proportionality of the electron density in the cores of clusters to T^{3/2}, as observed in [392]. However, looking at the central region of low-temperature X-ray emitting galaxy groups, it was found [12] that the needed central density of missing mass far exceeded this limit, by a factor of several hundred. One would need one neutrino species with m ∼ 10 eV to reach the required densities. One exotic possibility is then the idea of right-handed eV-scale sterile neutrinos [13]: as strange as this sounds, this mass for sterile neutrinos could also provide a good fit to the CMB acoustic peaks (see Section 9.2). This could indeed sound like the strangest and most complicated universe possible, combining true non-baryonic (hot) dark matter with a modification of gravity, but if this is what it takes to simultaneously explain the Kepler-like laws of galactic dynamics and the extragalactic evidence for dark matter, it is useful to remember that there are good reasons for there being more particles than those of the standard model of particle physics, and that there is no reason that general relativity should be valid over a wide range of scales where it has never been tested.
In any case, experiments that can address the existence of such a ∼ 10 eV-scale sterile neutrino would thus be very interesting, as this kind of particle could provide the dark matter candidate only in a modified gravity framework, since such a hot dark matter particle would be unable to form small structures and to provide the dark matter that would be needed in galaxies.
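The numbers quoted in this neutrino discussion can be checked with a short calculation. The sketch below is illustrative only: it assumes the standard relation Ω_ν h² = Σm_ν/(93.14 eV) and a Hubble parameter h = 0.72, neither of which is stated in the text, and it assumes that the Tremaine-Gunn maximum density scales as m_ν⁴ T^{3/2} (the T^{3/2} dependence being the one mentioned above). The function names are hypothetical.

```python
# Rough check of the neutrino numbers discussed above (illustrative sketch).

# Assumed inputs, NOT taken from the text:
EV_PER_OMEGA_H2 = 93.14   # standard relation: Omega_nu * h^2 = sum(m_nu) / 93.14 eV
H_LITTLE = 0.72           # dimensionless Hubble parameter (assumed value)


def omega_nu(m_nu_ev, n_species=3, h=H_LITTLE):
    """Density parameter of n_species neutrino types of equal mass m_nu_ev (in eV)."""
    return n_species * m_nu_ev / (EV_PER_OMEGA_H2 * h**2)


def tremaine_gunn_gain(m_new_ev, m_ref_ev=2.0, t_ratio=1.0):
    """Rise in the maximum density, ASSUMING rho_max scales as m^4 * T^(3/2)."""
    return (m_new_ev / m_ref_ev) ** 4 * t_ratio ** 1.5


print(f"Omega_nu per eV of neutrino mass: {omega_nu(1.0):.3f}")
print(f"Omega_nu for 2 eV neutrinos:      {omega_nu(2.0):.3f}")
print(f"Density gain, 10 eV vs 2 eV:      {tremaine_gunn_gain(10.0):.0f}")
```

Under these assumptions, raising the mass from 2 eV to 10 eV at fixed temperature increases the maximum density by (10/2)^4 = 625, which is consistent with the factor of several hundred needed in X-ray emitting groups.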
Yet another possibility (iv) would be that MOND is incomplete, and that a new scale should be introduced, in order to effectively enhance the value of a_{0} in galaxy clusters, while lowering it to its preferred value in galaxies. There are several ways to implement such an idea. For instance, Bekenstein [36] proposed adding a second scale in order to allow for effective variations of the acceleration constant as a function of the deepness of the potential (Eq. 27). This idea should be investigated more in the future, but it is not clear that such a simple rescaling of a_{0} would account for the exact spatial distribution of the residual missing mass in MOND clusters, especially in cases where it is displaced from the baryonic distribution (see Section 8.3). However, as even Gauss’ theorem would not be valid anymore in spherical symmetry, the high nonlinearity might provide nonintuitive results, and it would thus clearly be worth investigating this suggestion in more detail, as well as developing similar ideas with other additional scales in the future (such as, for instance, the baryonic matter density; see [82, 143] and Section 7.6).
Finally, as we shall see in Section 7, parent relativistic theories of MOND often require additional degrees of freedom in the form of “dark fields”, which can nevertheless be globally subdominant to the baryon density, and thus do not necessarily act precisely as true “dark matter”. Thus, the last possibility (v) is that these fields, which are obviously not included in Milgrom’s formula, are responsible for the cluster missing mass in MOND. Examples of such fields are the vector fields of TeVeS (Section 7.4) and Generalized Einstein-Aether theories (Section 7.7). It has been shown (see Section 9.2) that the growth of the spatial part of the vector perturbation in the course of cosmological evolution can successfully seed the growth of baryonic structures, just as dark matter does. If these seeds persist, it was shown [112] that they could behave in very much the same way as a dark matter halo in relatively unrelaxed galaxy clusters. However, it remains to be seen whether the spatially concentrated distribution of missing mass in MOND would be naturally reproduced in all clusters. In other relativistic versions of MOND (see, e.g., Sections 7.6 and 7.9), the “dark fields” are truly massive and can be thought of as true dark matter (although more complex than simple collisionless dark matter), whose energy density outweighs the baryonic one, and could provide the missing mass in clusters. However, again, it is not obvious that the centrally concentrated distribution of residual missing mass in clusters would be naturally reproduced. All in all, there is no obviously satisfactory explanation for the problem of residual missing mass in the center of galaxy clusters, which remains one of the most serious problems facing MOND.
Relativistic MOND Theories
In Section 6, we have considered the classical theories of MOND and their predictions in a vast number of astrophysical systems. However, as already stated at the beginning of Section 6, these classical theories are only toy-models until they become the weak-field limit of a relativistic theory (with invariant physical laws under differentiable coordinate transformations), i.e., an extension of general relativity (GR) rather than an extension of Newtonian dynamics. Here, we list the various existing relativistic theories boiling down to MOND in the quasi-static weak-field limit. It is useful to restate here that the motivation for developing such theories is not to get rid of dark matter but to explain the Kepler-like laws of galactic dynamics predicted by Milgrom’s law (see Section 5). As we shall see, many of these theories include new fields, so that dark matter is often effectively replaced by “dark fields” (although, contrary to dark matter, their energy density can be subdominant to the baryonic one; note that, even more importantly, in a static configuration these dark fields are fully determined by the baryons, contrary to the traditional dark matter particles, which may, in principle, be present independent of baryons).
These theories are great advances because they enable us to calculate the effects of gravitational lensing and the cosmological evolution of the universe in MOND, which are beyond the capabilities of classical theories. However, as we shall see, many of these relativistic theories still have their limitations, ranging from true theoretical or observational problems to more aesthetic problems, such as the arbitrary introduction of an interpolating function (Section 6.2) or the absence of an understanding of the \(\Lambda \sim a_0^2\) coincidence. What is more, the new fields introduced in these theories have no counterpart yet in microphysics, meaning that these theories are, at best, only effective. So, despite the existing effective relativistic theories presented here, the quest for a more profound relativistic formulation of MOND continues. Excellent reviews of existing theories can also be found in, e.g., [34, 35, 81, 100, 136, 183, 318, 429, 431].
The heart of GR is the equivalence principle(s), in its weak (WEP), Einstein (EEP) and strong (SEP) form. The WEP states the universality of free fall, while the EEP states that one recovers special relativity in the freely falling frame of the WEP. These equivalence principles are obtained by assuming that all known matter fields are universally and minimally coupled to one single metric tensor, the physical metric. It is perfectly fine to keep these principles in MOND, although certain versions can involve another type of (dark) matter not following the same geodesics as the known matter, and thus effectively violating the WEP. Additionally, note that the local Lorentz invariance of special relativity could be spontaneously violated in MOND theories. The SEP, on the other hand, states that all laws of physics, including gravitation itself, are fully independent of velocity and location in spacetime. This is obtained in GR by making the physical metric itself obey the Einstein-Hilbert action. This principle has to be broken in MOND (see also Section 6.3). We now recall how GR connects with Newtonian dynamics in the weak-field limit, which is actually the regime in which the modification must be set in order to account for the MOND phenomenology of the ultra-weak-field limit. The action of GR is written as the sum of the matter action and the Einstein-Hilbert (gravitational) action^{Footnote 43}:
where g denotes the determinant of the metric tensor g_{μν} with (−, +, +, +) signature^{Footnote 44}, and R = R_{μν}g^{μν} is its scalar curvature, R_{μν} being the Ricci tensor (involving second derivatives of the metric). The matter action is a functional of the matter fields, depending on them and their first derivatives. For instance, the matter action of a free point particle S_{pp} reads:
depending on the positions x and on their time-derivatives v^{μ}. Varying the matter action with respect to (w.r.t.) the matter-field degrees of freedom yields the equations of motion, i.e., the geodesic equation in the case of a point particle:
where the proper time τ = s is approximately equal to ct for slowly moving non-relativistic particles, and \(\Gamma _{\alpha \beta}^\mu\) is the Christoffel symbol involving first derivatives of the metric. On the other hand, varying the total action w.r.t. the metric yields Einstein’s field equations:
where T_{μν} is the stress-energy tensor defined as the variation of the Lagrangian density of the matter fields over the metric.
In the static weak-field limit, the metric is written as (up to third-order corrections in 1/c^{3})^{Footnote 45}:
where, in GR,
and Φ_{N} is the Newtonian gravitational potential. From the (0,0) components of the weak-field metric, one gets back Newton’s second law for massive particles \({d^2}{x^i}/d{t^2} = - \Gamma _{00}^i = - \partial {\Phi _N}/\partial {x^i}\) from the geodesic equation (Eq. 71). On the other hand, Einstein’s equations (Eq. 72) give back the Newtonian Poisson equation ∇^{2}Φ_{N} = 4πGρ. Thus, the metric plays the role of the gravitational potential, and the Christoffel symbol plays the role of acceleration. Note, however, that if timelike geodesics are determined by the (0, 0) component of the metric, this is not the case for null geodesics. While the gravitational redshift for light-rays is solely governed by the g_{00} component of the metric too, the deflection of light is, on the other hand, also governed by the g_{ij} components (more specifically by Φ − Ψ in the weak-field limit). This means that, in order for the anomalous effects of any modified gravity theory on lensing and dynamics to correspond to a similar^{Footnote 46} amount of “missing mass” in GR, it is crucial that Ψ ≃ −Φ in Eq. 73.
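The Newtonian limit described above can be illustrated numerically. The following sketch, which is not part of the original derivation, takes the textbook potential inside a uniform-density sphere (an assumed test case, in arbitrary units with G = ρ = R = 1) and checks by finite differences that it satisfies the Poisson equation ∇²Φ_N = 4πGρ, and that the acceleration −∂Φ_N/∂r equals −GM(<r)/r². All names are hypothetical.

```python
import math

G = 1.0      # gravitational constant (arbitrary units, assumed)
RHO = 1.0    # uniform density of the test sphere (assumed)
R = 1.0      # radius of the test sphere (assumed)


def phi_newton(r):
    """Newtonian potential inside a uniform-density sphere (valid for r <= R)."""
    return -2.0 * math.pi * G * RHO * (R**2 - r**2 / 3.0)


def radial_laplacian(f, r, h=1e-3):
    """Finite-difference Laplacian of a spherically symmetric function f(r)."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / h**2
    return d2 + 2.0 * d1 / r


def acceleration(f, r, h=1e-3):
    """Radial acceleration -dPhi/dr (Newton's second law in the weak-field limit)."""
    return -(f(r + h) - f(r - h)) / (2.0 * h)


r = 0.5
enclosed_mass = 4.0 * math.pi * RHO * r**3 / 3.0
print(radial_laplacian(phi_newton, r), 4.0 * math.pi * G * RHO)
print(acceleration(phi_newton, r), -G * enclosed_mass / r**2)
```

Each printed pair should agree to within the finite-difference rounding error.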
Scalar-tensor k-essence
MOND is an acceleration-based modification of gravity in the ultra-weak-field limit, but since the Christoffel symbol, playing the role of acceleration in GR, is not a tensor, it is, in principle, not possible to make a general relativistic theory depend on it. Another natural way to account for the departure from Newtonian gravity in the weak-field limit and to account for the violation of the SEP inherent to the external field effect is to resort to a scalar-tensor theory, as first proposed by [38]. The added scalar field can play the role of an auxiliary potential, and its gradient then has the dimensions of acceleration and can be used to enforce the acceleration-based modification of MOND.
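As a reminder of what “acceleration-based” means in practice, the sketch below solves Milgrom’s law g μ(g/a_0) = g_N for the true acceleration g. It is only an illustration under stated assumptions: the “simple” interpolating function μ(x) = x/(1 + x) is one common choice among many, the value a_0 = 1.2 × 10^{−10} m s^{−2} is the commonly quoted one, and the function name is hypothetical.

```python
import math

A0 = 1.2e-10  # m/s^2, commonly quoted value of Milgrom's constant (assumed here)


def mond_acceleration(g_newton, a0=A0):
    """Solve g * mu(g / a0) = g_N for g, with the ASSUMED mu(x) = x / (1 + x).

    With this mu, the law becomes g**2 - g_N * g - g_N * a0 = 0,
    whose positive root is returned.
    """
    return 0.5 * (g_newton + math.sqrt(g_newton**2 + 4.0 * g_newton * a0))


for ratio in (1e4, 1.0, 1e-4):
    g_n = ratio * A0
    g = mond_acceleration(g_n)
    print(f"g_N/a0 = {ratio:g}: g/g_N = {g / g_n:.4g}, "
          f"g/sqrt(g_N*a0) = {g / math.sqrt(g_n * A0):.4g}")
```

For g_N ≫ a_0 the result tends to the Newtonian value, while for g_N ≪ a_0 it tends to (g_N a_0)^{1/2}, the deep-MOND behavior that the relativistic theories of this section are built to recover.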
The relativistic theory of [38] depends on two fields, an “Einstein metric” \({{\tilde g}_{\mu \nu}}\) and a scalar field ϕ. The physical metric g_{μν} entering the matter action is then given by a conformal transformation of the Einstein metric^{Footnote 47} through an exponential coupling function:
In order to recover the MOND dynamics, the Einstein-Hilbert action (involving the Einstein metric) remains unchanged \(\left({\int {{d^4}x\sqrt { - \tilde g} \tilde R}} \right)\), and the dimensionless scalar field is given a k-essence action, with no potential and a non-linear, aquadratic, kinetic term^{Footnote 48} inspired by the AQUAL action of Eq. 16:
where k is a dimensionless constant, l is a length-scale, \(X = k{l^2}{{\tilde g}^{\mu \nu}}\phi {,_\mu}\phi {,_\nu}\), and f (X) is the “MOND function”. Since the action of the scalar field is similar to that of the potential in the Bekenstein-Milgrom version of classical MOND, this relativistic version is known as the Relativistic Aquadratic Lagrangian theory, RAQUAL.
Varying the action w.r.t. the scalar field yields, in a static configuration, the following modified Poisson’s equation for the scalar field:
and the (0, 0) component of the physical metric is given by \({g_{00}} = - {e^{2({\Phi _N} + {c^2}\phi)/{c^2}}}\), leading us precisely to the situation of Eq. 40 in the weak-field, with Φ = Φ_{N} + c^{2}ϕ, with
and
whose finely tuned relation with the μ-function of Milgrom’s law is extensively described in Section 6.2. We note that the standard choice for X ≪ 1 is f′ (X) ∼ (X/3)^{1/2}, meaning that in order to recover \(\tilde \mu ({s}) = {s}/\xi\) for small s, where ξ = G_{N}/G (see Section 6.2), one must define the length-scale as
It was immediately realized [38] that a k-essence theory such as RAQUAL can exhibit superluminal propagations whenever f″(X) > 0 [80]. Although it does not threaten causality [80], one has to check that the Cauchy problem is still well-posed for the field equations. It has been shown [80, 360] that it requires the otherwise free function f to satisfy the following properties, ∀X:
which is the equivalent of the constraints of Eq. 37 on Milgrom’s μ-function.
However, another problem was immediately realized at an observational level [38, 40]. Because of the conformal transformation of Eq. 75, one has that Ψ ≠ −Φ in the RAQUAL equivalent of Eq. 73. In other words, as it is well-known that gravitational lensing is insensitive to conformal rescalings of the metric, apart from the contribution of the stress-energy of the scalar field to the source of the Einstein metric [40, 81], the “non-Newtonian” effects of the theory respectively on lensing and dynamics do not at all correspond to similar amounts of “missing mass”. This is also considered a generic problem with any local pure metric formulation of MOND [441].
Stratified theory
A solution to the above gravitational lensing problem due to the conformal rescaling of the metric in RAQUAL has been presented in [385]. Inspired by “stratified” theories of gravity [334], Sanders [385] suggested, in addition to the scalar field ϕ of RAQUAL, the use of a non-dynamical timelike vector field U_{μ} = (−1,0,0,0) with unit-norm U^{2} = −1 (in terms of the Einstein metric), in order to enforce a disformal relation between the Einstein and physical metrics:
The second term only affects the g_{00} component, and it then appears immediately that Ψ = −Φ in the weak-field limit (rhs terms of Eq. 73), and the problem of lensing is cured. However, the prescription that a 4-vector points in the time direction is not a covariant one, and the theory should involve strong preferred frame effects, although these can now be fully suppressed, as well as any deviation from GR at small distances, with an appropriate additional “Galileon” term in addition to the asymptotic deep-MOND k-essence term in the action of the scalar field [22] (the other advantage being that the interpolating function then does not have to be inserted by hand). In any case, endowing the vector field with covariant dynamics of its own has been the next logical step in developing relativistic MOND theories.
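The difference between the conformal and disformal couplings can be made concrete with a few lines of arithmetic. The sketch below is a hedged illustration, not the original derivation: it assumes the disformal relation takes the standard TeVeS-type form g_{μν} = e^{−2ϕ}g̃_{μν} − 2 sinh(2ϕ)U_{μ}U_{ν}, sets c = 1 and Φ_N = 0 (flat Einstein metric), and reads the potentials off a diagonal metric with the assumed convention g_{00} = −e^{2Φ}, g_{ii} = e^{2Ψ}. Function names are hypothetical.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)   # flat Einstein metric (diagonal entries), c = 1
U = (-1.0, 0.0, 0.0, 0.0)     # non-dynamical unit timelike vector U_mu


def conformal_metric(phi):
    """RAQUAL-type relation g = exp(2 phi) * g_tilde (diagonal entries)."""
    return [math.exp(2.0 * phi) * ETA[m] for m in range(4)]


def disformal_metric(phi):
    """ASSUMED stratified/TeVeS-type relation
    g = exp(-2 phi) * g_tilde - 2 sinh(2 phi) U U (diagonal entries)."""
    return [math.exp(-2.0 * phi) * ETA[m]
            - 2.0 * math.sinh(2.0 * phi) * U[m] ** 2
            for m in range(4)]


def potentials(g_diag):
    """Read off (Phi, Psi) assuming g_00 = -exp(2 Phi) and g_ii = exp(2 Psi)."""
    return 0.5 * math.log(-g_diag[0]), 0.5 * math.log(g_diag[1])


phi = 0.01
for name, metric in (("conformal", conformal_metric), ("disformal", disformal_metric)):
    Phi, Psi = potentials(metric(phi))
    print(f"{name}: Phi = {Phi:+.4f}, Psi = {Psi:+.4f}, Phi - Psi = {Phi - Psi:+.4f}")
```

In the conformal case the scalar field contributes Ψ = +Φ, so the lensing combination Φ − Ψ receives nothing from it; in the disformal case Ψ = −Φ and the scalar field enhances lensing and dynamics consistently.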
Original Tensor-Vector-Scalar theory
The idea of the Tensor-Vector-Scalar theory of Bekenstein [33], dubbed TeVeS, is to keep the disformal relation of Eq. 83 between the Einstein metric \({{\tilde g}_{\mu \nu}}\) and the physical metric \({g_{\mu \nu}}\) to which matter fields couple, but to replace the above non-dynamical vector field by a dynamical vector field U_{μ} with an action (K being a dimensionless constant):
akin to that of the electromagnetic 4-potential vector field (U_{[μν]} playing the role of the Faraday tensor), but without the coupling term to the 4-current, and with a constraint term forcing the unit norm \({U^\mu}{U_\mu} = {{\tilde g}^{\mu \nu}}{U_\mu}{U_\nu} = - 1\) (λ being a Lagrange multiplier function, to be determined as the equations are solved). The first term in the integrand takes care of approximately aligning U_{μ} with the 4-velocity of matter (when simultaneously solving for (i) the Einstein-like equation of the Einstein metric \({{\tilde g}_{\mu \nu}}\), and for (ii) the vector equation obtained by varying the total action with respect to U_{μ}).
Finally, the k-essence action for the scalar field is kept as in RAQUAL (Eq. 76), but with
Contrary to RAQUAL, this scalar field exhibits no superluminal propagation modes. However, [81] noted that such superluminal propagation might have to be reintroduced in order to avoid excessive Cherenkov radiation and suppression of high-energy cosmic rays (see also [320]).
The static weak-field limit equation for the scalar field is precisely the same as Eq. 77, and the scalar field enters the static weak-field metric Eq. 73 as Φ = −Ψ = ΞΦ_{N} + c^{2}ϕ, meaning that lensing and dynamics are compatible, with Ξ being a factor depending on K and on the cosmological value of the scalar field (see Eq. 58 of [33]). This can be normalized to yield Ξ = 1 at redshift zero. Again, all the relations between the free function and Milgrom’s μ-function can be found in Section 6.2 (see also [145, 431]).
This theory has played a true historical role as a proof of concept that it was possible to construct a fully relativistic theory both enhancing dynamics and lensing in a coherent way and reproducing the MOND phenomenology for static configurations with the dynamical 4-vector pointing in the time direction. However, the question remained whether these static configurations would be stable. What is more, although a classical Hamiltonian^{Footnote 49} unbounded from below in flat spacetime would not necessarily be a concern at the classical level (and even less if the model is only “phenomenological”), it would inevitably become a worry for the existence of a stable quantum vacuum (see however [196]). And indeed, it was shown in [98] that models with such “Maxwellian” vector fields having a TeVeS-like Lagrange multiplier constraint in their action have a corresponding Hamiltonian density that can be made arbitrarily large and negative (see also Section IV.A of [81]). What is more, even at the classical level, it has been shown that spherically symmetric solutions of TeVeS are heavily unstable [412, 413], and that this type of vector field causes caustic singularities [105], in the sense that the integral curves of the vector are timelike geodesics meeting each other when falling into gravitational potential wells. Thus, another form was needed for the action of the TeVeS vector field.
Generalized Tensor-Vector-Scalar theory
The generalization of TeVeS was proposed by Skordis [428]. Inspired by the fact that Einstein-Aether theories [206, 207] also present instabilities when the unit-norm vector field is “Maxwellian” as above, it was simply proposed to use a more general Lagrangian density for the vector field, akin to that of Einstein-Aether theories:
where
for a set of constants c_{1}, c_{2}, c_{3}, c_{4}. Interestingly, spherically symmetric solutions depend only on the combination c_{1} − c_{4}, not on c_{2} and c_{3}, which can, in principle, be chosen to avoid the instabilities of the original TeVeS theory. Of course, the original unstable theory is also included in this generalization through a specific combination of the four c_{i} (see, e.g., [431]).
Thus, this generalized version is the current “working version” of what is now called TeVeS: a tensor-vector-scalar theory with an Einstein-like metric, an Einstein-Aether-like unit-norm vector field, and a k-essence-like scalar field, all related to the physical metric through Eq. 83. It has been extensively studied, both in its original and generalized form. It has for instance been shown that, contrary to many gravity theories with a scalar sector, the theory evidences no cosmological evolution of the Newtonian gravitational constant and only minor evolution of Milgrom’s constant a_{0} [145, 39]. However, the fact that the latter is still put in by hand through the length-scale of the theory l ∼ c^{2}/a_{0}, and has no dynamical connection with the Hubble or cosmological constant, is perhaps a serious conceptual shortcoming, together with the free function put by hand in the action of the scalar field (but see [22] for a possible solution to the latter shortcoming). The relations between this free function and Milgrom’s μ can be found in [145, 431] (see also Section 6.2), the detailed structure of null and timelike geodesics of the theory in [431], the analysis of the parametrized post-Newtonian coefficients (including the preferred-frame parameters quantifying the local breaking of Lorentz invariance) in [173, 372, 391, 450], solutions for black holes and neutron stars in [244, 245, 247, 246, 374, 438, 439], and gravitational waves in [216, 214, 215, 373]. It is important to remember that TeVeS is not equivalent to GR in the strong regime, which is why it can be tested there, e.g., with binary pulsars or with the atomic spectral lines from the surface of stars [122], or other very strong field effects^{Footnote 50}. However, these effects can always generically be suppressed (at the price of introducing a Galileon-type term in the action [22]), and such tests would never test MOND as a paradigm.
It is by testing gravity in the weak field regime that MOND can really be put to the test.
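To give a sense of scale for the length l ∼ c^{2}/a_{0} mentioned above, the following sketch compares it with the Hubble radius c/H_{0}. The numerical inputs are assumptions not given in the text: the commonly quoted a_{0} = 1.2 × 10^{−10} m s^{−2} and an illustrative H_{0} = 70 km s^{−1} Mpc^{−1}.

```python
C = 299_792_458.0        # speed of light, m/s
A0 = 1.2e-10             # m/s^2, commonly quoted value of a_0 (assumed)
H0_KM_S_MPC = 70.0       # Hubble constant in km/s/Mpc (assumed, illustrative)
MPC_IN_M = 3.0857e22     # metres per megaparsec

mond_length = C**2 / A0                       # l ~ c^2 / a_0, in metres
hubble_rate = H0_KM_S_MPC * 1.0e3 / MPC_IN_M  # H_0 in 1/s
hubble_radius = C / hubble_rate               # c / H_0, in metres

print(f"c^2/a_0          = {mond_length:.2e} m")
print(f"c/H_0            = {hubble_radius:.2e} m")
print(f"ratio of the two = {mond_length / hubble_radius:.1f}")
```

With these inputs the two lengths differ only by a factor of order a few, which is the numerical coincidence between a_{0} and cH_{0} that the theory, as it stands, does not explain dynamically.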
Finally, let us note that TeVeS (and its generalization) has been shown to be expressible (in the “matter frame”) only in terms of the physical metric g_{μν} and the vector field U_{μ} [513], the scalar field being eliminated from the equations through the “unit-norm” constraint in terms of the Einstein metric \({{\tilde g}^{\mu \nu}}{U_\mu}{U_\nu} = - 1\), leading to g^{μν}U_{μ}U_{ν} = −e^{−2ϕ}. In this form, TeVeS is sometimes thought of as GR with an additional “dark fluid” described by a vector field [503].
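The relation g^{μν}U_{μ}U_{ν} = −e^{−2ϕ} can be verified numerically for a vector field that is not aligned with the time axis. The sketch below is illustrative only: it assumes the disformal relation has the standard TeVeS-type form g_{μν} = e^{−2ϕ}g̃_{μν} − 2 sinh(2ϕ)U_{μ}U_{ν}, takes a flat Einstein metric with c = 1, and restricts to the (t, x) block with a boosted unit-norm vector. The function name is hypothetical.

```python
import math


def physical_norm_of_U(phi, v):
    """Return g^{mu nu} U_mu U_nu for a boosted unit-norm U on a flat Einstein metric.

    Works in the (t, x) block with c = 1 and ASSUMES the disformal relation
    g_{mu nu} = exp(-2 phi) * eta_{mu nu} - 2 sinh(2 phi) * U_mu U_nu.
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    U = (-gamma, gamma * v)              # satisfies eta^{mu nu} U_mu U_nu = -1
    eta = ((-1.0, 0.0), (0.0, 1.0))
    s = 2.0 * math.sinh(2.0 * phi)
    # Physical metric (covariant components) in the 2x2 block.
    g = [[math.exp(-2.0 * phi) * eta[a][b] - s * U[a] * U[b] for b in range(2)]
         for a in range(2)]
    # Explicit 2x2 inverse gives the contravariant components g^{mu nu}.
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    return sum(g_inv[a][b] * U[a] * U[b] for a in range(2) for b in range(2))


phi, v = 0.3, 0.5
print(physical_norm_of_U(phi, v), -math.exp(-2.0 * phi))
```

The two printed numbers should coincide up to rounding, whatever the boost velocity v, which is the sense in which the scalar field can be traded for the norm of the vector in the matter frame.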
Bi-Scalar-Tensor-Vector theory
In TeVeS [33], the “MOND function” f (X_{TeVeS}) of Eq. 76, where \({X_{{\rm{TeVeS}}}} \sim ({{\tilde g}^{\mu \nu}} - {U^\mu}{U^\nu})\phi {,_\mu}\phi {,_\nu}\), could also be expressed as a potential V of a non-dynamical scalar field q, i.e., a scalar action for TeVeS of the form:
After variation of the action w.r.t. this nondynamical field, one gets qX = −V′(q), and variation w.r.t. ϕ yields the usual BM Poisson equation for ϕ (Eq. 17), with \({q^2} \propto \tilde \mu (\sqrt X)\). Inspired by an older theory (Phase Coupling Gravity [32, 381]) devised in a partially successful attempt to eliminate superluminal propagation from RAQUAL (but plagued with the same gravitational lensing problem as RAQUAL, and with additional instabilities), Sanders [390] proposed to make this field dynamical by adding a kinetic term \({{\tilde g}^{\mu \nu}}q{,_\mu}q{,_\nu}\) in the action, leading to the following very general action for the scalar fields ϕ and q:
In this theory (dubbed BSTV for bi-scalar-tensor-vector theory), the physical metric has the same form as in TeVeS, meaning that ϕ is the matter-coupling scalar field, while q only influences the strength of that coupling. A remarkable achievement of the theory is that the quasi-static field equation for ϕ can be obtained only in a cosmological context, and thereby naturally explains the connection between a_{0} and H_{0} [390]. What is more, oscillations of the q field around its expectation value can be considered as massive dark matter, which allows an explanation of the peaks of the angular power spectrum of the Cosmic Microwave Background [390]. Unfortunately, various instabilities and a Hamiltonian unbounded from below have been evidenced in Section IV.A of [81], thus most likely ruling out this theory, at least in its present form.
Non-minimal scalar-tensor formalism
As a consequence of the inability of RAQUAL (the scalar-tensor k-essence of Section 7.1) to enhance gravitational lensing, all other attempts reviewed so far (Sections 7.2 to 7.5) have been plagued with an aesthetically unpleasant growth of additional fields and free parameters. This has led Bruneton & Esposito-Farèse [81] to consider models with fewer additional fields. They first considered pure metric theories in which matter is not only coupled to the metric but also non-minimally to its curvature (Eqs. 5.1 and 5.2 of [81]). While they showed that such models can indeed reproduce the MOND dynamics, they also concluded that they are generically unstable if locality is to be preserved (but see Section 7.10). They then considered models in which at most one scalar field is added, without any additional vector field, but where this field is coupled non-minimally to matter, in the sense that the matter-coupling depends on the scalar field itself but also on its first derivatives. In other words, the gradient of the scalar field is replacing the dynamical vector field of TeVeS. The simple scalar field action is just the normal action of a massive scalar field:
with \(X = {l^2}{{\tilde g}^{\mu \nu}}\phi {,_\mu}\phi {,_\nu}\) and V(ϕ) = l^{2}m^{2}ϕ^{2}/2. The physical metric g_{μν} is then disformally related to the Einstein metric through (see Eq. 5.11 of [81]):
with the functionals
where Y = (ηa_{0})^{1/2}c^{−1}X^{−1/4}. The free function h(Y) is the “MOND function” playing the role of Milgrom’s μ. An alternative formulation of the model is obtained by separating the matter action into a normal matter action and an “interaction term” between the scalar field, the metric and the matter fields [82]. Considering the massive scalar field as a dark matter fluid, this model can thus be interpreted as a non-standard baryon-dark-matter interaction leading to the MOND behavior. If the scalar mass is small enough, it is a pure MOND theory, but if it is higher, it can lead to a “DM+MOND” behavior, especially noteworthy in regions of high gravity such as the center of galaxy clusters (see Section 6.6.4 and discussions in [82]). Let us note that, while this theory exhibits superluminal propagations outside of matter, it is, in principle, not a problem for causality [80]. It has also been possible to study the behavior of the theory within matter, e.g., within the dilute HI gas inside galaxy disks (an analysis that is mostly too difficult to perform in the other models reviewed so far): this led to a deadly problem, i.e., that the Cauchy problem becomes ill-posed and the solutions to the field equations ill-defined. A possible solution was proposed in [82], namely to make the matter coupling (or, equivalently, the baryon-scalar DM interaction) depend on the local density of matter^{Footnote 51}: this can also lead to an interesting phenomenology, where only gas-rich systems behave according to Milgrom’s law, while others would behave in a CDM way [143]. A lot remains to be studied within this framework.
Generalized Einstein-Aether theories
All theories reviewed so far are best expressed in the “Einstein frame”, and involve a form for the physical metric to which matter couples (a form expressed as a function of the Einstein metric and of the other additional fields). However, the work of [513] has shown that, for instance, TeVeS (Sections 7.3 and 7.4) is expressible as a pure Tensor-Vector theory in the matter frame, and that the physical metric then both satisfies the Einstein-Hilbert action and couples minimally to the matter fields, just like in GR. In fact, the modification of gravity in TeVeS thus only comes from the coupling of the physical metric to the vector field. The idea of Zlosnik et al. [514] was then that a similar, but simpler, modification of gravity could be obtained by devising a simple tensor-vector theory in the matter frame, with no a priori assumption on the geometry of the physical metric. Starting from the extensively studied Einstein-Aether theories [206, 207], with a vector action of the type of Eq. 86, the idea is to make the k-essence free function f (X) (the “MOND function” of Eq. 76) act directly on the vector field rather than on an additional scalar field. This leads to vector k-essence, or Generalized Einstein-Aether (GEA) theories (also called non-canonical Einstein-Aether theories), in which the Einstein-Hilbert and matter actions remain as in GR, but with an additional unit-norm vector field with the following action [431, 514]:
where (see Eq. 87, replacing \({{\tilde g}^{\mu \nu}}\) by \({g^{\mu \nu}}\))
The unit-norm constraint fixes the vector field in terms of the metric, and from there we have that, in the weak-field limit, X_{gea} ∝ −|∇Φ|^{2}, with Φ defined as in Eq. 73. The Einstein equation in the weak-field limit then yields a BM type of Poisson equation (Eq. 17) for the full gravitational potential Φ, with μ = f′ + (1 − f′)/(1 − C/2) and C = c_{1} − c_{4} [431]. In the deep-MOND limit, the usual choice for f is of the type \(f({X_{{\rm{gea}}}}) \propto {( - {X_{{\rm{gea}}}})^{3/2}} + 2{X_{{\rm{gea}}}}/C\), and the length-scale must be fixed as:
Let us note that this weak-field limit of GEA theories is different from that of RAQUAL or TeVeS, where only the scalar field ϕ obeys a BM-like equation governed by an interpolating function \(\tilde \mu ({s})\), and where the total potential is given by Eq. 40.
The remarkable feature of GEA theories allowing for the desired enhancing of gravitational lensing without any a priori assumption on the form of the physical metric is that, writing the metric as in Eq. 73, it can be shown [431] that in the limit X_{gea} → 0 the action of Eq. 93 is only a function of ϒ = Φ + Ψ and is thus invariant under disformal transformations [Φ → Φ + β(r); Ψ → Ψ − β(r)], of the type of Eq. 83. These GEA theories are currently extensively studied, mostly in a cosmological context (see Section 9), but also for their parametrized post-Newtonian coefficients in the solar system [65] or for black hole solutions [451].
Interestingly, it has been shown that these vector field theories (TeVeS, BSTV, GEA) are all part of a broad class of theories studied in [183]. Yet other phenomenologically interesting theories exist among this class, such as, for instance, the VΛ models considered by Zhao & Li [502, 506, 510] with a dynamical norm vector field, whose norm obeys a potential (giving it a mass) and has a non-quadratic kinetic term à la RAQUAL, in order to try reproducing both the MOND phenomenology and the accelerated expansion of the universe, while interpreting the vector field as a fluid of neutrinos with varying mass [504, 505]. This has the advantage of giving a microphysics meaning to the vector field. Such vector fields have also been argued to arise naturally from dimensional reduction of higher-dimensional gravity theories [34, 261], or, more generally, to be necessary from the fact that quantum gravity could need a preferred rest frame [206] in order to protect the theory against instabilities when allowing for higher derivatives to make the theory renormalizable (e.g., in Horava gravity [