
Varying Constants, Gravitation and Cosmology

Abstract

Fundamental constants are a cornerstone of our physical laws. Any constant varying in space and/or time would reflect the existence of an almost massless field that couples to matter. This will induce a violation of the universality of free fall. Thus, it is of utmost importance for our understanding of gravity and of the domain of validity of general relativity to test for their constancy. We detail the relations between the constants, the tests of the local position invariance and of the universality of free fall. We then review the main experimental and observational constraints that have been obtained from atomic clocks, the Oklo phenomenon, solar system observations, meteorite dating, quasar absorption spectra, stellar physics, pulsar timing, the cosmic microwave background and big bang nucleosynthesis. At each step we describe the basics of each system, its dependence with respect to the constants, the known systematic effects and the most recent constraints that have been obtained. We then describe the main theoretical frameworks in which the low-energy constants may actually be varying and we focus on the unification mechanisms and the relations between the variation of different constants. To finish, we discuss the more speculative possibility of understanding their numerical values and the apparent fine-tuning that they confront us with.

Introduction

Fundamental constants appear everywhere in the mathematical laws we use to describe the phenomena of Nature. They seem to contain some truth about the properties of the physical world while their real nature seems to evade us.

The question of the constancy of the constants of physics was probably first addressed by Dirac [155, 156] who expressed, in his “Large Numbers hypothesis”, the opinion that very large (or small) dimensionless universal constants cannot be pure mathematical numbers and must not occur in the basic laws of physics. He suggested, on the basis of this numerological principle, that these large numbers should rather be considered as variable parameters characterizing the state of the universe. Dirac formed five dimensionless ratios, among which \(\delta \equiv {H_0}\hbar/{m_{\rm{p}}}{c^2} \sim 2h \times {10^{- 42}}\) and \(\epsilon \equiv G{\rho _0}/H_0^2 \sim 5{h^{- 2}} \times {10^{- 4}}\), and asked which of these ratios is constant as the universe evolves. Usually, δ varies as the inverse of the cosmic time while ϵ also varies with time if the universe is not described by an Einstein-de Sitter solution (i.e., when a cosmological constant, curvature or radiation are included in the cosmological model). Dirac then noticed that \({\alpha _{\rm{G}}}/\mu {\alpha _{{\rm{EM}}}}\), representing the relative magnitude of electrostatic and gravitational forces between a proton and an electron, was of the same order as \({H_0}{e^2}/{m_{\rm{e}}}{c^2} = \delta {\alpha _{{\rm{EM}}}}\mu\), representing the age of the universe in atomic units, so that his five numbers can be “harmonized” if one assumes that αG and δ vary with time and scale as the inverse of the cosmic time.
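As a rough numerical illustration of Dirac's coincidence, the two numbers can be evaluated with present-day values of the constants. This is only a sketch: the constants are CODATA-like inputs chosen here, μ is assumed to denote the proton-to-electron mass ratio, αG is assumed to be G mp²/ħc, and the value h = 0.7 of the reduced Hubble parameter is an illustrative choice.

```python
import math

# SI values assumed for this sketch (CODATA-like inputs, not taken from the text).
G = 6.67430e-11            # Newton constant, m^3 kg^-1 s^-2
C = 299792458.0            # speed of light, m/s
HBAR = 1.054571817e-34     # reduced Planck constant, J s
M_P = 1.67262192369e-27    # proton mass, kg
M_E = 9.1093837015e-31     # electron mass, kg
ALPHA_EM = 7.2973525693e-3 # fine-structure constant
MPC_IN_M = 3.0856775814913673e22  # one megaparsec in metres


def hubble_rate(h):
    """H0 = 100 h km/s/Mpc converted to s^-1."""
    return 100.0 * h * 1.0e3 / MPC_IN_M


def delta(h):
    """Dirac's ratio delta = H0 * hbar / (m_p c^2)."""
    return hubble_rate(h) * HBAR / (M_P * C**2)


# Assumed definitions: alpha_G = G m_p^2 / (hbar c), mu = m_p / m_e.
alpha_G = G * M_P**2 / (HBAR * C)
mu = M_P / M_E

# Gravitational-to-electrostatic force ratio for a proton-electron pair.
force_ratio = alpha_G / (mu * ALPHA_EM)
# Inverse age of the universe in atomic units, for the illustrative h = 0.7.
age_number = delta(0.7) * ALPHA_EM * mu

print(f"delta(h=1)        = {delta(1.0):.3e}")
print(f"alpha_G/(mu a_EM) = {force_ratio:.3e}")
print(f"delta a_EM mu     = {age_number:.3e}")
```

With these inputs the two numbers differ by a factor of a few tens, which is what "of the same order" means for quantities near 10⁻⁴⁰.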

This argument by Dirac is indeed not a physical theory but it opened many doors in the investigation on physical constants, both on questioning whether they are actually constant and on trying to understand the numerical values we measure.

First, the implementation of Dirac’s phenomenological idea into a field-theory framework was proposed by Jordan [268], who realized that the constants have to become dynamical fields and proposed a theory where both the gravitational and fine-structure constants can vary ([497] provides a summary of some earlier attempts to quantify the cosmological implications of Dirac’s argument). Fierz [195] then realized that in such a case, atomic spectra will be spacetime-dependent so that these theories can be observationally tested. Restricting to the sub-case in which only G can vary led to definition of the class of scalar-tensor theories, which were further explored by Brans and Dicke [67]. This kind of theory was further generalized to obtain various functional dependencies for G in the formalization of scalar-tensor theories of gravitation (see, e.g., [124]).

Second, Dicke [151] pointed out that in fact the density of the universe is determined by its age, this age being related to the time needed to form galaxies, stars, heavy nuclei. … This led him to formulate that the presence of an observer in the universe places constraints on the physical laws that can be observed. In fact, what is meant by observer is the existence of (highly?) organized systems and this principle can be seen as a rephrasing of the question “why is the universe the way it is?” (see [252]). Carter [82, 83], who actually coined the term “anthropic principle” for it, showed that the numerological coincidences found by Dirac can be derived from physical models of stars and the competition between the weakness of gravity with respect to nuclear fusion. Carr and Rees [80] then showed how one can scale up from atomic to cosmological scales only by using combinations of αEM, αG and me/mp.

To summarize, Dirac’s insight was to question whether some numerical coincidences between very large numbers, which cannot themselves be explained by the theory in which they appear, are mere coincidences or whether they reveal the existence of some new physical laws. This gives three main roads of investigation:

  • how do we construct theories in which what were thought to be constants are in fact dynamical fields,

  • how can we constrain, experimentally or observationally, the spacetime dependencies of the constants that appear in our physical laws,

  • how can we explain the values of the fundamental constants and the fine-tuning that seems to exist between their numerical values.

While “varying constants” may seem, at first glance, to be an oxymoron, it has to be considered merely as jargon to be understood as “revealing new degrees of freedom, and their coupling to the known fields of our theory”. The tests on the constancy of the fundamental constants are indeed very important tests of fundamental physics and of the laws of Nature we are currently using. Detecting any such variation will indicate the need for new physical degrees of freedom in our theories, that is new physics.

The necessity of theoretical physics in deriving bounds on their variation is, at least, threefold:

  1. it is necessary to understand and to model the physical systems used to set the constraints. In particular one needs to relate the effective parameters that can be observationally constrained to a set of fundamental constants;

  2. it is necessary to relate and compare different constraints that are obtained at different spacetime positions. This often requires a spacetime dynamics and thus to specify a model as well as a cosmology;

  3. it is necessary to relate the variations of different fundamental constants.

Therefore, we shall start in Section 2 by recalling the link between the constants of physics and the theories in which they appear, as well as with metrology. From a theoretical point of view, the constancy of the fundamental constants is deeply linked with the equivalence principle and general relativity. In Section 2 we will recall this relation and in particular the link with the universality of free fall. We will then summarize the various constraints that exist on such variations, mainly for the fine structure constant and for the gravitational constant in Sections 3 and 4, respectively. We will then turn to the theoretical implications in Section 5 in describing some of the arguments backing up the fact that constants are expected to vary, the main frameworks used in the literature and the various ways proposed to explain why they have the values we observe today. We shall finish by a discussion on their spatial variations in Section 6 and by discussing the possibility to understand their numerical values in Section 7.

Various reviews have been written on this topic. We will refer to the review [500] as FVC and we mention the following later reviews [31, 47, 72, 119, 226, 281, 278, 501, 395, 503, 505] and we refer to [356] for the numerical values of the constants adopted in this review.

Constants and Fundamental Physics

About constants

Our physical theories introduce various structures to describe the phenomena of Nature. They involve various fields, symmetries and constants. These structures are postulated in order to construct a mathematically-consistent description of the known physical phenomena in the most unified and simple way.

We define the fundamental constants of a physical theory as any parameter that cannot be explained by this theory. Indeed, we are often dealing with other constants that in principle can be expressed in terms of these fundamental constants. The existence of these two sets of constants is important and arises from two different considerations. From a theoretical point of view we would like to extract the minimal set of fundamental constants, but often these constants are not measurable. From a more practical point of view, we need to measure constants, or combinations of constants, which allow us to reach the highest accuracy.

Therefore, these fundamental constants are contingent quantities that can only be measured. Such parameters have to be assumed constant in this theoretical framework for two reasons:

  • from a theoretical point of view: the considered framework does not provide any way to compute these parameters, i.e., it does not have any equation of evolution for them since otherwise it would be considered as a dynamical field,

  • from an experimental point of view: these parameters can only be measured. If the theories in which they appear have been validated experimentally, it means that, at the precisions of these experiments, these parameters have indeed been checked to be constant, as required by the necessity of the reproducibility of experimental results.

This means that testing for the constancy of these parameters is a test of the theories in which they appear and allows us to extend our knowledge of their domain of validity. This also explains the definition chosen by Weinberg [526] who stated that they cannot be calculated in terms of other constants “…not just because the calculation is too complicated (as for the viscosity of water) but because we do not know of anything more fundamental”.

This has a series of implications. First, the list of fundamental constants to consider depends on our theories of physics and, thus, on time. Indeed, when introducing new, more unified or more fundamental, theories the number of constants may change so that this list reflects both our knowledge of physics and, more important, our ignorance. Second, it also implies that some of these fundamental constants can become dynamical quantities in a more general theoretical framework so that the tests of the constancy of the fundamental constants are tests of fundamental physics, which can reveal that what was thought to be a fundamental constant is actually a field whose dynamics cannot be neglected. If such fundamental constants are actually dynamical fields it also means that the equations we are using are only approximations of other and more fundamental equations, in an adiabatic limit, and that an equation for the evolution of this new field has to be obtained.

The reflections on the nature of the constants and their role in physics are numerous. We refer to the books [29, 215, 510, 509] as well as [59, 165, 216, 393, 521, 526, 538] for various discussions of this issue that we cannot develop at length here. This paragraph summarizes some of the properties of the fundamental constants that have attracted some attention.

Characterizing the fundamental constants

Physical constants seem to play a central role in our physical theories since, in particular, they determine the magnitudes of the physical processes. Let us sketch briefly some of their properties.

How many fundamental constants should be considered? The set of constants that are conventionally considered as fundamental [213] consists of the electron charge e, the electron mass me, the proton mass mp, the reduced Planck constant ħ, the velocity of light in vacuum c, the Avogadro constant NA, the Boltzmann constant kB, the Newton constant G, and the permittivity and permeability of space, ε0 and μ0. The latter has a fixed value in the SI system of units (μ0 = 4π × 10−7 H m−1), which is implicit in the definition of the Ampere; ε0 is then fixed by the relation ε0μ0 = c−2.

However, it is clear that this cannot correspond to the list of the fundamental constants, as defined earlier as the free parameters of the theoretical framework at hand. To define such a list we must specify this framework. Today, gravitation is described by general relativity, and the three other interactions and the matter fields are described by the standard model of particle physics. It follows that one has to consider 22 unknown constants (i.e., 19 unknown dimensionless parameters): the Newton constant G, 6 Yukawa couplings for the quarks (hu, hd, hc, hs, ht, hb) and 3 for the leptons (he, hμ, hτ), 2 parameters of the Higgs field potential (\(\hat \mu\), λ), 4 parameters for the Cabibbo-Kobayashi-Maskawa matrix (3 angles θij and a phase δCKM), 3 coupling constants for the gauge groups SU(3)c × SU(2)L × U(1)Y (g1, g2, g3 or equivalently g2, g3 and the Weinberg angle θW), and a phase for the QCD vacuum (θQCD), to which one must add the speed of light c and the Planck constant h. See Table 1 for a summary and their numerical values.

Table 1 List of the fundamental constants of our standard model. See Ref. [379] for further details on the measurements.

Again, this list of fundamental constants relies on what we accept as a fundamental theory. Today we have many hints that the standard model of particle physics has to be extended, in particular to include the existence of massive neutrinos. Such an extension will introduce at least seven new constants (3 Yukawa couplings and 4 Maki-Nakagawa-Sakata (MNS) parameters, similar to the CKM parameters). On the other hand, the number of constants can decrease if some unifications between various interactions exist (see Section 5.3.1 for more details) since the various coupling constants may be related to a unique coupling constant αU and an energy scale of unification MU through

$$\alpha _i^{- 1}(E) = \alpha _U^{- 1} + {{{b_i}} \over {2\pi}}\ln {{{M_U}} \over E},$$

where the \(b_i\) are numbers that depend on the explicit model of unification. Note that this would also imply that the variations, if any, of various constants shall be correlated.
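The correlation can be made concrete with a small numerical sketch of the relation above. All numbers here are hypothetical illustrations (a unified coupling αU⁻¹ = 24, a scale MU = 2 × 10¹⁶ GeV, and a set of one-loop coefficients of the kind found in supersymmetric unification models), not values taken from this review. At fixed bi and MU, a change of αU shifts every αi⁻¹ by the same amount, so that Δαi/αi ≈ (αi/αU) ΔαU/αU.

```python
import math


def alpha_inverse(energy, alpha_u_inv, b_i, m_u):
    """One-loop relation: alpha_i^{-1}(E) = alpha_U^{-1} + b_i/(2 pi) * ln(M_U/E)."""
    return alpha_u_inv + b_i / (2.0 * math.pi) * math.log(m_u / energy)


def relative_shift(alpha_i_inv, alpha_u_inv, dln_alpha_u):
    """First-order d(ln alpha_i) induced by d(ln alpha_U) at fixed b_i and M_U."""
    return (alpha_u_inv / alpha_i_inv) * dln_alpha_u


# Hypothetical inputs, chosen only for illustration.
ALPHA_U_INV = 24.0
M_U = 2.0e16          # GeV
E_LOW = 91.19         # GeV, roughly the Z-boson mass
B = {"U(1)": 33.0 / 5.0, "SU(2)": 1.0, "SU(3)": -3.0}

low_energy = {name: alpha_inverse(E_LOW, ALPHA_U_INV, b, M_U) for name, b in B.items()}

DLN_ALPHA_U = 1.0e-6  # assumed fractional variation of the unified coupling
for name, a_inv in low_energy.items():
    shift = relative_shift(a_inv, ALPHA_U_INV, DLN_ALPHA_U)
    print(f"{name}: alpha^-1 = {a_inv:.2f}, d ln(alpha) = {shift:.3e}")
```

In this toy setting the coupling that is strongest at low energy responds most to a change of αU, which is why unification schemes typically predict that a variation of the fine-structure constant comes with a larger variation of the strong sector.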

Relation to other usual constants. These parameters of the standard model are related to various constants that will appear in this review (see Table 2). First, the quartic and quadratic coefficients of the Higgs field potential are related to the Higgs mass and vev, \({m_H} = \sqrt {- {{\hat \mu}^2}/2}\) and \(\upsilon = \sqrt {- {{\hat \mu}^2}/\lambda}\). The latter is related to the Fermi constant \({G_{\rm{F}}} = {({\upsilon ^2}\sqrt 2)^{- 1}}\), which imposes that υ = (246.7 ± 0.2) GeV while the Higgs mass is badly constrained. The masses of the quarks and leptons are related to their Yukawa coupling and the Higgs vev by \(m = h\upsilon/\sqrt 2\). The values of the gauge couplings depend on energy via the renormalization group so that they are given at a chosen energy scale, here the mass of the Z-boson, mZ. g1 and g2 are related by the Weinberg angle as g1 = g2 tan θW. The electromagnetic coupling constant is not g1 since SU(2)L × U(1)Y is broken to U(1)elec so that it is given by

$${g_{{\rm{EM}}}}({m_Z}) = e = {g_2}({m_Z})\sin {\theta _{\rm{W}}}.$$
(1)

Defining the fine-structure constant as \({\alpha _{{\rm{EM}}}} = g_{{\rm{EM}}}^2/\hbar c\), the (usual) zero-energy electromagnetic fine-structure constant, αEM = 1/137.03599911(46), is related to αEM(mZ) = 1/(127.918 ± 0.018) by the renormalization group equations. In particular, it implies that \(\alpha _{{\rm{EM}}}^{- 1} \sim {\alpha ^{- 1}}({m_Z}) + {2 \over {9\pi}}\ln \left({{{m_Z^{20}} \over {m_{\rm{u}}^4m_{\rm{c}}^4{m_{\rm{d}}}{m_{\rm{s}}}{m_{\rm{b}}}m_{\rm{e}}^3m_\mu ^3m_\tau ^3}}} \right)\). We define the QCD energy scale, ΛQCD, as the energy at which the strong coupling constant diverges. Note that it implies that ΛQCD also depends on the Higgs and fermion masses through threshold effects.
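The size of the logarithmic term can be estimated with a short sketch. It is read here as a shift of the inverse coupling, α⁻¹(0) ≈ α⁻¹(mZ) + (2/9π) ln(…), which is an assumption of this example: it is the reading under which a positive logarithm connects 1/127.9 to a weaker low-energy coupling. The fermion masses below are approximate values assumed for illustration; the light-quark masses in particular are poorly defined at low energy, so only rough agreement with 137 should be expected.

```python
import math

M_Z = 91187.6  # Z-boson mass in MeV (assumed input)

# Approximate fermion masses in MeV, assumed for illustration only.
MASSES = {
    "u": 2.2, "c": 1270.0,
    "d": 4.7, "s": 95.0, "b": 4180.0,
    "e": 0.511, "mu": 105.66, "tau": 1776.86,
}

# Powers with which each mass enters the logarithm; they sum to 20,
# matching the power of m_Z in the numerator.
EXPONENTS = {"u": 4, "c": 4, "d": 1, "s": 1, "b": 1, "e": 3, "mu": 3, "tau": 3}


def log_term():
    """ln( m_Z^20 / prod_f m_f^{n_f} ), written as a sum of logarithms of ratios."""
    return sum(n * math.log(M_Z / MASSES[f]) for f, n in EXPONENTS.items())


ALPHA_INV_MZ = 127.918
shift = 2.0 / (9.0 * math.pi) * log_term()
alpha_inv_zero = ALPHA_INV_MZ + shift

print(f"shift of alpha^-1 between m_Z and zero energy: {shift:.2f}")
print(f"leading-log estimate of alpha^-1(0): {alpha_inv_zero:.1f}")
```

The estimate lands within about one percent of 137.036, which is as much as a leading-logarithm formula with such mass inputs can be expected to deliver.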

Table 2 List of some related constants that appear in our discussions. See Ref. [379].

More familiar constants, such as the masses of the proton and the neutron are, as we shall discuss in more detail below (see Section 5.3.2), more difficult to relate to the fundamental parameters because they depend not only on the masses of the quarks but also on the electromagnetic and strong binding energies.

Are some constants more fundamental? As pointed out by Lévy-Leblond [328], not all constants of physics play the same role, and some have a much deeper role than others. Following [328], we can define three classes of fundamental constants, class A being the class of the constants characteristic of a particular system, class B being the class of constants characteristic of a class of physical phenomena, and class C being the class of universal constants. Indeed, the status of a constant can change with time. For instance, the velocity of light was initially a class A constant (describing a property of light), which then became a class B constant when it was realized that it was related to electromagnetic phenomena and, to finish, it ended as a class C constant (it enters special relativity and is related to the notion of causality, whatever the physical phenomena). It has even become a much more fundamental constant since it now enters in the definition of the meter [413] (see Ref. [510] for a more detailed discussion). This has to be contrasted with the proposition of Ref. [538] to distinguish the standard model free parameters as the gauge and gravitational couplings (which are associated to internal and spacetime curvatures) and the other parameters entering the accommodation of inertia in the Higgs sector.

Relation with physical laws. Lévy-Leblond [328] proposed to rank the constants in terms of their universality and he proposed that only three constants be considered to be of class C, namely G, ħ and c. He pointed out two important roles of these constants in the laws of physics. First, they act as concept synthesizers during the process of our understanding of the laws of nature: contradictions between existing theories have often been resolved by introducing new concepts that are more general or more synthetic than older ones. Constants build bridges between quantities that were thought to be incommensurable and thus allow new concepts to emerge. For example, c underpins the synthesis of space and time, while the Planck constant allowed us to relate the concepts of energy and frequency, and the gravitational constant creates a link between matter and spacetime. Second, it follows that these constants are related to the domains of validity of these theories. For instance, as soon as a velocity approaches c, relativistic effects become important and can no longer be neglected. On the other hand, for speeds much below c, Galilean kinematics is sufficient. The Planck constant also acts as a referent, since if the action of a system greatly exceeds the value of that constant, classical mechanics will be appropriate to describe this system. While the places of c (related to the notion of causality) and ħ (related to the quantum) in this list are well argued, the place of G remains debated since it is thought that it will have to be replaced by some mass scale.

Evolution. There are many ways the list of constants can change with our understanding of physics. First, new constants may appear when new systems or new physical laws are discovered; this is, for instance, the case of the charge of the electron or more recently the gauge couplings of the nuclear interactions. A constant can also move from one class to a more universal class. An example is that of the electric charge, initially of class A (characteristic of the electron), which then became class B when it was understood that it characterizes the strength of the electromagnetic interaction. A constant can also disappear from the list, because it is either replaced by more fundamental constants (e.g., the Earth acceleration due to gravity and the proportionality constant entering the Kepler law both disappeared because they were “explained” in terms of the Newton constant and the mass of the Earth or the Sun) or because it can happen that a better understanding of physics teaches us that two hitherto distinct quantities have to be considered as a single phenomenon (e.g., the understanding by Joule that heat and work were two forms of energy led to the fact that the Joule constant, expressing the proportionality between work and heat, lost any physical meaning and became a simple conversion factor between units used in the measurement of heat (calories) and work (Joule)). Nowadays the calorie has fallen into disuse. Indeed, demonstrating that a constant is varying will have direct implications on our list of constants.

In conclusion, the evolution of the number and status of the constants can teach us a lot about the evolution of the ideas and theories in physics since it reflects the birth of new concepts, their evolution and unification with other ones.

Constants and metrology

Since we cannot compute them in the theoretical framework in which they appear, it is a crucial property of the fundamental constants (but in fact of all the constants) that their value can be measured. The relation between constants and metrology is a huge subject, of which we only highlight some selected aspects. For more discussions, see [56, 280, 278].

The introduction of constants in physical laws is also closely related to the existence of systems of units. For instance, Newton’s law states that the gravitational force between two masses is proportional to each mass and inversely proportional to the square of their separation. To transform the proportionality to an equality one requires the use of a quantity with dimension of m3 kg−1 s−2 independent of the separation between the two bodies, of their mass, of their composition (equivalence principle) and of the position (local position invariance). With another system of units the numerical value of this constant could have simply been anything. Indeed, the numerical value of any constant crucially depends on the definition of the system of units.

Measuring constants. The determination of the laboratory value of constants relies mainly on the measurements of lengths, frequencies, times, … (see [414] for a treatise on the measurement of constants and [213] for a recent review). Hence, any question on the variation of constants is linked to the definition of the system of units and to the theory of measurement. The behavior of atomic matter is determined by the value of many constants. As a consequence, if, e.g., the fine-structure constant is spacetime dependent, the comparison between several devices such as clocks and rulers will also be spacetime dependent. This dependence will also differ from one clock to another so that metrology becomes both device and spacetime dependent, a property that will actually be used to construct tests of the constancy of the constants.

Indeed a measurement is always a comparison between two physical systems of the same dimensions. This is thus a relative measurement, which will give as result a pure number. This trivial statement is oversimplifying since in order to compare two similar quantities measured separately, one needs to perform a number of comparisons. In order to reduce the number of comparisons (and in particular to avoid creating every time a chain of comparisons), a certain set of them has been included in the definitions of units. Each unit can then be seen as an abstract physical system, which has to be realized effectively in the laboratory, and to which another physical system is compared. A measurement in terms of these units is usually called an absolute measurement. Most fundamental constants are related to microscopic physics and their numerical values can be obtained either from a pure microscopic comparison (as is, e.g., the case for me/mp) or from a comparison between microscopic and macroscopic values (for instance to deduce the value of the mass of the electron in kilogram). This shows that the choice of the units has an impact on the accuracy of the measurement since the pure microscopic comparisons are in general more accurate than those involving macroscopic physics. This implies that only the variation of dimensionless constants can be measured and, in case such a variation is detected, it is impossible to determine which dimensional constant is varying [183].
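The unit-independence of dimensionless combinations can be checked numerically. The sketch below, under the assumption of CODATA-like input values that are not taken from the text, evaluates the fine-structure constant once in SI units and once in Gaussian CGS units: the numerical values of e, ħ and c all change, while the dimensionless combination does not.

```python
import math

# SI inputs (assumed CODATA-like values).
E_SI = 1.602176634e-19      # elementary charge, C
HBAR_SI = 1.054571817e-34   # J s
C_SI = 299792458.0          # m/s
EPS0_SI = 8.8541878128e-12  # F/m

# Gaussian CGS inputs (assumed values; the charge is in statcoulomb).
E_CGS = 4.80320471e-10      # statC
HBAR_CGS = 1.054571817e-27  # erg s
C_CGS = 2.99792458e10       # cm/s


def alpha_si():
    """alpha = e^2 / (4 pi eps0 hbar c) in SI units."""
    return E_SI**2 / (4.0 * math.pi * EPS0_SI * HBAR_SI * C_SI)


def alpha_gaussian():
    """alpha = e^2 / (hbar c) in Gaussian units, where no eps0 appears."""
    return E_CGS**2 / (HBAR_CGS * C_CGS)


print(f"1/alpha (SI)       = {1.0 / alpha_si():.6f}")
print(f"1/alpha (Gaussian) = {1.0 / alpha_gaussian():.6f}")
```

A claimed variation of e, ħ or c separately would depend on which of these unit conventions is held fixed, whereas a variation of the ratio would show up identically in both.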

It is also important to stress that in order to deduce the value of constants from an experiment, one usually needs to use theories and models. An example [278] is provided by the Rydberg constant. It can easily be expressed in terms of some fundamental constants as \({R_\infty} = \alpha _{{\rm{EM}}}^2{m_{\rm{e}}}c/2h\). It can be measured from, e.g., the triplet 1s − 2s transition in hydrogen, the frequency of which is related to the Rydberg constant and other constants by assuming QED so that the accuracy of R∞ is much lower than that of the measurement of the transition. This could be solved by defining R∞ as 4νH(1s − 2s)/3c but then the relation with more fundamental constants would be more complicated and actually not exactly known. This illustrates the relation between a practical and a fundamental approach and the limitation arising from the fact that we often cannot both exactly calculate and directly measure some quantity. Note also that some theoretical properties are plugged in the determination of the constants.
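The gap between the two possible definitions can be illustrated numerically. The sketch below compares the Rydberg constant computed from αEM, me, c and h with the value that 4νH(1s − 2s)/3c would give. The constants and the hydrogen 1s − 2s frequency are inputs assumed here (CODATA-like values and a published laser-spectroscopy result), not numbers taken from this review.

```python
# Assumed inputs in SI units.
ALPHA_EM = 7.2973525693e-3      # fine-structure constant
M_E = 9.1093837015e-31          # electron mass, kg
M_P = 1.67262192369e-27         # proton mass, kg
C = 299792458.0                 # speed of light, m/s
H = 6.62607015e-34              # Planck constant, J s
NU_1S2S = 2466061413187035.0    # hydrogen 1s-2s frequency, Hz (assumed input)

# Definition in terms of fundamental constants.
rydberg_from_constants = ALPHA_EM**2 * M_E * C / (2.0 * H)

# Operational definition suggested in the text: 4 nu / (3 c).
rydberg_from_transition = 4.0 * NU_1S2S / (3.0 * C)

relative_gap = (rydberg_from_constants - rydberg_from_transition) / rydberg_from_constants

print(f"R_inf from constants : {rydberg_from_constants:.6e} m^-1")
print(f"4 nu / 3c            : {rydberg_from_transition:.6e} m^-1")
print(f"relative difference  : {relative_gap:.3e}")
print(f"m_e / m_p            : {M_E / M_P:.3e}")
```

The two values differ at the level of a few parts in 10⁴, close to me/mp: most of the gap is the finite proton mass (reduced-mass correction), with smaller relativistic and QED contributions on top. These are precisely the theoretical ingredients that must be assumed to go from the measured frequency to R∞.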

As a conclusion, let us recall that (i) in general, the values of the constants are not determined by a direct measurement but by a chain involving both theoretical and experimental steps, (ii) they depend on our theoretical understanding, (iii) the determination of a self-consistent set of values of the fundamental constants results from an adjustment to achieve the best match between theory and a defined set of experiments (which is important because we actually know that the theories are only good approximations and have a domain of validity), (iv) the system of units plays a crucial role in the measurement chain, since for instance in atomic units, the mass of the electron could have been obtained directly from a mass ratio measurement (even more precise!) and (v) fortunately the test of the variability of the constants does not require a priori to have a high-precision value of the considered constants.

System of units. Thus, one needs to define a coherent system of units. This has a long, complex and interesting history that was driven by simplicity and universality but also by increasing stability and accuracy [29, 509].

Originally, the sizes of the human body were mostly used to measure the length of objects (e.g., the foot and the thumb gave feet and inches) and some of these units can seem surprising to us nowadays (e.g., the span was the measure of hand with fingers fully splayed, from the tip of the thumb to the tip of the little finger!). Similarly weights were related to what could be carried in the hand: the pound, the ounce, the dram.… Needless to say, this system had a few disadvantages since each country and region had its own system (for instance in France there were more than 800 different units in use in 1789). The need to define a system of units based on natural standards led to several propositions to define a standard of length (e.g., the mille by Gabriel Mouton in 1670 defined as the length of one angular minute of a great circle on the Earth or the length of the pendulum that oscillates once a second by Jean Picard and Christiaan Huygens). The real change happened during the French Revolution during which the idea of a universal and non anthropocentric system of units arose. In particular, the Assemblée adopted the principle of a uniform system of weights and measures on 8 May 1790 and, in March 1791 a decree (these texts are reprinted in [510]) was voted, stating that a quarter of the terrestrial meridian would be the basis of the definition of the meter (from the Greek metron, as proposed by Borda): a meter would henceforth be one ten millionth part of a quarter of the terrestrial meridian. Similarly the gram was defined as the mass of one cubic centimeter of distilled water (at a precise temperature and pressure) and the second was defined from the property that a mean solar day must last 24 hours.

To make a long story short, this led to the creation of the metric system and then to the signature of La convention du mètre in 1875. Since then, the definition of the units has evolved significantly. First, the definition of the meter was related to more immutable systems than our planet, which, as pointed out by Maxwell in 1870, was an arbitrary and inconstant reference. He then suggested that atoms may be such a universal reference. In 1960, the International Bureau of Weights and Measures (BIPM) established a new definition of the meter as the length equal to 1650763.73 wavelengths, in a vacuum, of the transition line between the levels 2p10 and 5d5 of krypton-86. Similarly the rotation of the Earth was not so stable and it was proposed in 1927 by André Danjon to use the tropical year as a reference, as adopted in 1952. In 1967, the second was also related to an atomic transition, defined as the duration of 9192631770 periods of the transition between the two hyperfine levels of the ground state of caesium-133. To finish, it was decided in 1983 that the meter shall be defined by fixing the value of the speed of light to c = 299792458 m s−1, and we refer to [55] for an up-to-date description of the SI system. Today, the possibility to redefine the kilogram in terms of a fixed value of the Planck constant is under investigation [279].

This summary illustrates that the system of units is a human product and all SI definitions are historically based on non-relativistic classical physics. The changes in the definition were driven by the will to use more stable and more fundamental quantities so that they closely follow the progress of physics. This system has been created for legal use and indeed the choice of units is not restricted to SI.

SI systems and the number of basic units. The International System of Units defines seven basic units: the meter (m), second (s) and kilogram (kg), the Ampere (A), Kelvin (K), mole (mol) and candela (cd), from which one defines secondary units. While needed for pragmatic reasons, this system of units is unnecessarily complicated from the point of view of theoretical physics. In particular, the Kelvin, mole and candela are derived from the four other units since temperature is actually a measure of energy, the candela is expressed in terms of energy flux so that both can be expressed in mechanical units of length [L], mass [M] and time [T]. The mole is merely a unit denoting numbers of particles and has no dimension.

The status of the Ampere is interesting in itself. The discovery of the electric charge [Q] led to the introduction of a new unit, the Coulomb. The Coulomb law describes the force between two charges as being proportional to the product of the two charges and to the inverse of the distance squared. The dimension of the force being known as [MLT−2], this requires the introduction of a new constant ε0 (which is only a conversion factor), with dimensions [Q2M−1L−3T2] in the Coulomb law, and that needs to be measured. Another route could have been followed since the Coulomb law tells us that no new constant is actually needed if one uses [M1/2L3/2T−1] as the dimension of the charge. In this system of units, known as Gaussian units, the numerical value of ε0 is 1. Hence the Coulomb can be expressed in terms of the mechanical units [L], [M] and [T], and so will the Ampere. This reduces the number of conversion factors that need to be experimentally determined, but note that both choices of units assume the validity of the Coulomb law.

Natural units. The previous discussion tends to show that all units can be expressed in terms of the three mechanical units. It follows, as realized by Johnstone Stoney in 1874, that these three basic units can be defined in terms of 3 independent constants. He proposed [27, 267] to use three constants: the Newton constant, the velocity of light and the basic units of electricity, i.e., the electron charge, in order to define, from dimensional analysis a “natural series of physical units” defined as

$$\begin{array}{*{20}{c}} {{t_{\text{S}}} = \sqrt {\frac{{G{e^2}}}{{4\pi {\varepsilon _0}{c^6}}}} \sim 4.59 \times {{10}^{ - 45}}{\text{s}},} \\ {{\ell _{\text{S}}} = \sqrt {\frac{{G{e^2}}}{{4\pi {\varepsilon _0}{c^4}}}} \sim 1.37 \times {{10}^{ - 36}}{\text{m}},} \\ {{m_{\text{S}}} = \sqrt {\frac{{{e^2}}}{{4\pi {\varepsilon _0}G}}} \sim 1.85 \times {{10}^{ - 9}}{\text{kg}},} \end{array}$$

where the ε0 factor has been included because we are using the SI definition of the electric charge. In such a system of units, by construction, the numerical value of G, e and c is 1, i.e., \(c = 1 \times {\ell _{\rm{S}}}\cdot t_{\rm{S}}^{- 1}\) etc.

A similar approach to the definition of the units was independently proposed by Planck [418] on the basis of the two constants a and b entering the Wien law and G, which he reformulated later [419] in terms of c, G and ħ as

$$\begin{array}{*{20}{c}} {{t_{\text{P}}} = \sqrt {\frac{{G\hbar }}{{{c^5}}}} \sim 5.39056 \times {{10}^{ - 44}}{\text{s}},} \\ {{\ell _{\text{P}}} = \sqrt {\frac{{G\hbar }}{{{c^3}}}} \sim 1.61605 \times {{10}^{ - 35}}{\text{m}},} \\ {{m_{\text{P}}} = \sqrt {\frac{{\hbar c}}{G}} \sim 2.17671 \times {{10}^{ - 8}}{\text{kg}}.} \end{array}$$

The two systems are clearly related by the fine-structure constant since \(e^2/4\pi\varepsilon_0 = \alpha_{\rm EM}\hbar c\).
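The Stoney and Planck units can be recomputed from the defining constants as a quick consistency check. The following is a minimal Python sketch; the numerical inputs are recent CODATA-like values chosen for this illustration (an assumption, since the figures quoted above were obtained with older inputs), and the function names are hypothetical.

```python
import math

# SI values assumed for this sketch (CODATA-like; not taken from the text).
G = 6.67430e-11           # Newton constant, m^3 kg^-1 s^-2
c = 299_792_458.0         # speed of light, m/s
hbar = 1.054571817e-34    # reduced Planck constant, J s
e = 1.602176634e-19       # elementary charge, C
eps0 = 8.8541878128e-12   # vacuum permittivity, F/m

ke2 = e**2 / (4 * math.pi * eps0)   # e^2 / (4 pi eps0), in J m


def stoney_units():
    """Return (t_S, l_S, m_S), the units built from G, c and e."""
    l_s = math.sqrt(G * ke2 / c**4)
    return l_s / c, l_s, math.sqrt(ke2 / G)


def planck_units():
    """Return (t_P, l_P, m_P), the units built from G, c and hbar."""
    l_p = math.sqrt(G * hbar / c**3)
    return l_p / c, l_p, math.sqrt(hbar * c / G)


alpha_em = ke2 / (hbar * c)          # fine-structure constant
t_s, l_s, m_s = stoney_units()
t_p, l_p, m_p = planck_units()

print(f"Stoney: t = {t_s:.3e} s, l = {l_s:.3e} m, m = {m_s:.3e} kg")
print(f"Planck: t = {t_p:.3e} s, l = {l_p:.3e} m, m = {m_p:.3e} kg")
# Both ratios below should equal alpha_EM, since e^2/(4 pi eps0) = alpha_EM hbar c.
print((l_s / l_p) ** 2, (m_s / m_p) ** 2, alpha_em)
```

The last line makes the relation between the two systems explicit: lengths, times and masses in Stoney units differ from their Planck counterparts by a factor \(\sqrt{\alpha_{\rm EM}}\).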

Indeed, we can construct many such systems since the choice of the 3 constants is arbitrary. For instance, we can construct a system based on (e, \(m_{\rm e}\), h), which we can call the Bohr units, and which is suited to the study of the atom. The choice may be dictated by the system under study (it is indeed far-fetched to introduce G in the construction of the units when studying atomic physics), so that the system is well adjusted in the sense that the numerical values of the computations are expected to be of order unity in these units.

Such constructions are very useful for theoretical computations but are not adapted to measurement, so that one needs to switch back to SI units. More importantly, this shows that, from a theoretical point of view, one can define the system of units from the laws of nature, which are supposed to be universal and immutable.

Do we actually need 3 natural units? This issue has been debated at length. For instance, Duff, Okun and Veneziano [165] respectively argue for none, three and two (see also [535]). Arguing for no fundamental constant leads one to consider them simply as conversion parameters. Some of them are, like the Boltzmann constant, but some others play a deeper role in the sense that when a physical quantity becomes of the same order as this constant, new phenomena appear; this is the case, e.g., of ħ and c, which are associated respectively with quantum and relativistic effects. Okun [392] considered that only three fundamental constants are necessary, as indicated by the International System of Units. In the framework of quantum field theory + general relativity, it seems that this set of three constants has to be considered and it allows one to classify the physical theories (with the famous cube of physical theories). However, Veneziano [514] argued that in the framework of string theory one requires only two dimensionful fundamental constants, c and the string length \(\lambda_s\). The use of ħ seems unnecessary since it combines with the string tension to give \(\lambda_s\). In the case of the Nambu-Goto action \(S/\hbar = (T/\hbar)\int {{\rm{d}(Area)}} \equiv \lambda _{\mathcal S}^{- 2}\int {\rm{d}(Area)}\) and the Planck constant is just given by \(\lambda _{\mathcal S}^{- 2}\). In this view, ħ has not disappeared but has been promoted to the role of a UV cut-off that removes both the infinities of quantum field theory and singularities of general relativity. This situation is analogous to pure quantum gravity [388] where ħ and G never appear separately but only in the combination \({\ell _{{\rm{Pl}}}} = \sqrt {G\hbar/{c^3}}\) so that only c and \(\ell_{\rm Pl}\) are needed. Volovik [520] made an analogy with quantum liquids to clarify this. There an observer knows both the effective and microscopic physics so that he can judge whether the fundamental constants of the effective theory remain fundamental constants of the microscopic theory. The status of a constant depends on the considered theory (effective or microscopic) and, more interestingly, on the observer measuring them, i.e., on whether this observer belongs to the world of low-energy quasi-particles or to the microscopic world.

Fundamental parameters. Once a set of three independent constants has been chosen as natural units, all other constants are dimensionless quantities. The values of these combinations of constants do not depend on the way they are measured [110, 164, 437], on the definition of the units, etc. It follows that any variation of constants that leaves these numbers unaffected is actually just a redefinition of units.

These dimensionless numbers represent, e.g., mass ratios, the relative strengths of the interactions, etc. Changing their values will indeed have an impact on the intensity of various physical phenomena, so that they encode some properties of our world. They have specific values (e.g., αEM ∼ 1/137, mp/me ∼ 1836, etc.) that we may hope to understand. Are all these numbers completely contingent, or are some (why not all?) of them related by relations arising from some yet unknown and more fundamental theories? In such theories, some of these parameters may actually be dynamical quantities and, thus, vary in space and time. These are our potential varying constants.

The constancy of constants as a test of general relativity

The previous paragraphs have already emphasized why testing for the constancy of the constants is a test of fundamental physics, since it can reveal the need for new physical degrees of freedom in our theory. We now want to stress the relation of this test with other tests of general relativity and with cosmology.

General relativity

The tests of the constancy of fundamental constants take all their importance in the realm of the tests of the equivalence principle [540]. Einstein's general relativity is based on two independent hypotheses, which can conveniently be described by decomposing the action of the theory as S = Sgrav + Smatter.

The equivalence principle has strong implications for the functional form of Smatter. This principle includes three hypotheses:

  • the universality of free fall,

  • the local position invariance,

  • the local Lorentz invariance.

In its weak form (that is, for all interactions but gravity), it is satisfied by any metric theory of gravity, and general relativity is conjectured to satisfy it in its strong form (that is, for all interactions including gravity). We refer to [540] for a detailed description of these principles. The weak equivalence principle can be mathematically implemented by assuming that all matter fields are minimally coupled to a single metric tensor \(g_{\mu\nu}\). This metric defines the lengths and times measured by laboratory clocks and rods so that it can be called the physical metric. This implies that the action for any matter field, ψ say, can be written as

$${S_{{\rm{matter}}}}(\psi, {g_{\mu \nu}}).$$

This metric coupling ensures in particular the validity of the universality of free fall. Since locally, in the neighborhood of the worldline, there always exists a change of coordinates so that the metric takes a Minkowskian form at lowest order, the gravitational field can be locally “effaced” (up to tidal effects). If we identify this neighborhood with a small lab, this means that any physical property that can be measured in this lab must be independent of where and when the experiments are carried out. This is indeed the assumption of local position invariance, which implies that the constants must take the same value independent of the spacetime point where they are measured. Thus, testing the constancy of fundamental constants is a direct test of this principle and therefore of the metric coupling. Interestingly, the tests discussed in this review allow one to extend such checks well beyond solar system scales and even to the early universe, which provides important information on the validity of relativity in cosmology.

As an example, the action of a point-particle reads

$${S_{{\rm{matter}}}} = - \int {mc\sqrt {- {g_{\mu \nu}}({\bf{x}}){v^\mu}{v^\nu}}\, {\rm{d}}t},$$
(2)

with vμ ≡ dxμ/dt. The equation of motion that one derives from this action is the usual geodesic equation

$${a^\mu} \equiv {u^\nu}{\nabla _\nu}{u^\mu} = 0,$$
(3)

where \(u^\mu = {\rm d}x^\mu/c\,{\rm d}\tau\), τ being the proper time; \(\nabla_\mu\) is the covariant derivative associated with the metric \(g_{\mu\nu}\) and \(a^\mu\) is the 4-acceleration. Any metric theory of gravity will enjoy such a matter Lagrangian and the worldline of any test particle shall be a geodesic of the spacetime with metric \(g_{\mu\nu}\), as long as there is no other long range force acting on it (see [190] for a detailed review of motion in alternative theories of gravity).

Note that in the Newtonian limit g00 = −1 − 2Φ N /c2 where Φ N is the Newtonian potential. It follows that, in the slow velocity limit, the geodesic equation reduces to

$${\bf{\dot v}} = {\bf{a}} = - \nabla {\Phi _N} \equiv {{\bf{g}}_N},$$
(4)

hence defining the Newtonian acceleration \({\bf g}_N\). Recall that the proper time of a clock is related to the coordinate time by \({\rm{d}}\tau = \sqrt {- {g_{00}}} {\rm{d}}t\). Thus, if one exchanges electromagnetic signals between two identical clocks in a stationary situation, the apparent difference between the two clock rates will be

$${{{\nu _1}} \over {{\nu _2}}} = 1 + {{{\Phi _N}(2) - {\Phi _N}(1)} \over {{c^2}}},$$

at lowest order. This is the universality of gravitational redshift.
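To give an order of magnitude for this effect, the following Python sketch evaluates the lowest-order formula above for two clocks separated by a height h near the Earth's surface. The Earth parameters, the choice of h and the function names are assumptions made for this illustration only.

```python
C = 299_792_458.0           # speed of light, m/s
GM_EARTH = 3.986004418e14   # geocentric gravitational constant, m^3/s^2 (assumed)
R_EARTH = 6.371e6           # mean Earth radius, m (assumed)


def newtonian_potential(r):
    """Newtonian potential of a point-like Earth at radius r (in m)."""
    return -GM_EARTH / r


def fractional_rate_difference(phi_1, phi_2):
    """Return nu_1/nu_2 - 1 = (Phi_N(2) - Phi_N(1)) / c^2 at lowest order."""
    return (phi_2 - phi_1) / C**2


h = 1000.0  # height of clock 2 above clock 1, in m (hypothetical setup)
shift = fractional_rate_difference(
    newtonian_potential(R_EARTH), newtonian_potential(R_EARTH + h)
)
print(f"nu_1/nu_2 - 1 = {shift:.3e}")  # of order g h / c^2, i.e. ~1e-13 per km
```

The function returns the difference from unity rather than the ratio itself, because adding a number of order \(10^{-13}\) to 1 in double precision would discard most of its significant digits.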

The assumption of a metric coupling is actually well tested in the solar system:

  • First, it implies that all non-gravitational constants are spacetime independent, which has been tested to a very high accuracy in many physical systems and for various fundamental constants; this is the subject of this review.

  • Second, the isotropy has been tested from the constraint on the possible quadrupolar shift of nuclear energy levels [99, 304, 422], showing that different matter fields couple to a unique metric tensor at the \(10^{-27}\) level.

  • Third, the universality of free fall can be tested by comparing the accelerations of two test bodies in an external gravitational field. The parameter η12 defined as

    $${\eta _{12}} \equiv 2{{\vert {{\bf{a}}_1} - {{\bf{a}}_2}\vert} \over {\vert {{\bf{a}}_1} + {{\bf{a}}_2}\vert}},$$
    (5)

    can be constrained experimentally, e.g., in the laboratory by comparing the accelerations of a beryllium and a copper mass in the Earth's gravitational field [4], which gives

    $${\eta _{{\rm{Be,Cu}}}} = {(} - 1.9 \pm 2.5{)} \times {10^{- 12}}.$$
    (6)

    Similarly the comparison of Earth-core-like and Moon-mantle-like bodies gave [23]

    $${\eta _{{\rm{Earth,Moon}}}} = {(}0.1 \pm 2.7 \pm 1.7{)} \times {10^{- 13}},$$
    (7)

    and experiments with a torsion balance using test bodies composed of tellurium and bismuth allowed one to set the constraint [450]

    $${\eta _{{\rm{Te,Bi}}}} = {(}0.3 \pm 1.8{)} \times {10^{- 13}}.$$
    (8)

    The Lunar Laser Ranging (LLR) experiment [543], which compares the relative acceleration of the Earth and Moon in the gravitational field of the Sun, also sets the constraint

    $${\eta _{{\rm{Earth,Moon}}}} = {(} - 1.0 \pm 1.4{)} \times {10^{- 13}}.$$
    (9)

    Note that since the core represents only 1/3 of the mass of the Earth, and since the Earth’s mantle has the same composition as that of the Moon (and thus shall fall in the same way), one loses a factor of three, so that this constraint is actually similar to the one obtained in the lab. Further constraints are summarized in Table 3. The latter constraint also contains some contribution from the gravitational binding energy and thus includes the strong equivalence principle. When the laboratory result of [23] is combined with the LLR results of [542] and [365], one gets constraints on the strong equivalence principle parameter of, respectively,

    $${\eta _{{\rm{SEP}}}} = (3 \pm 6) \times {10^{- 13}}\;\;{\rm{and}}\;\;{\eta _{{\rm{SEP}}}} = (- 4 \pm 5) \times {10^{- 13}}.$$

    Large improvements are expected thanks to the existence of two dedicated space mission projects: Microscope [493] and STEP [355].

  • Fourth, the Einstein effect (or gravitational redshift) has been measured at the \(2 \times 10^{-4}\) level [517].

We can conclude that the hypothesis of metric coupling is extremely well-tested in the solar system.
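The parameter η12 of Equation (5) is straightforward to evaluate from two measured accelerations. The Python sketch below is an illustration only: the function name is hypothetical and the fractional difference between the two accelerations is an invented number, not an experimental result.

```python
import math


def eotvos_parameter(a1, a2):
    """eta_12 = 2 |a1 - a2| / |a1 + a2| for two acceleration 3-vectors."""
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a1, a2)))
    total = math.sqrt(sum((x + y) ** 2 for x, y in zip(a1, a2)))
    return 2.0 * diff / total


g = 9.81                                   # m/s^2, nominal free-fall acceleration
a_body_1 = (0.0, 0.0, -g)
a_body_2 = (0.0, 0.0, -g * (1.0 + 1e-12))  # hypothetical 1e-12 fractional excess
eta = eotvos_parameter(a_body_1, a_body_2)
print(f"eta_12 = {eta:.2e}")               # close to the assumed 1e-12
```

Since η12 is built from the difference of two nearly equal accelerations, a value at the \(10^{-13}\) level quoted above corresponds to a differential acceleration of order \(10^{-12}\;{\rm m\,s^{-2}}\) in the Earth's field.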

Table 3 Summary of the constraints on the violation of the universality of free fall.

The second building block of general relativity is related to the dynamics of the gravitational sector, assumed to be dictated by the Einstein-Hilbert action

$${S_{{\rm{grav}}}} = {{{c^3}} \over {16\pi G}}\int {\sqrt {- {g_\ast}} {R_\ast}{{\rm{d}}^4}x}.$$
(10)

This defines the dynamics of a massless spin-2 field \(g_{\mu \nu}^{\ast}\), called the Einstein metric. General relativity then assumes that both metrics coincide, \({g_{\mu \nu}} = g_{\mu \nu}^{\ast}\) (which is related to the strong equivalence principle), but it is possible to design theories in which this is indeed not the case (see the example of scalar-tensor theories below; Section 5.1.1) so that general relativity is one out of a large family of metric theories.

The variation of the total action with respect to the metric yields the Einstein equations

$${R_{\mu \nu}} - {1 \over 2}R{g_{\mu \nu}} = {{8\pi G} \over {{c^4}}}{T_{\mu \nu}},$$
(11)

where \({T^{\mu \nu}} \equiv \left({2/\sqrt {- g}} \right)\delta {S_{{\rm{matter}}}}/\delta {g_{\mu \nu}}\) is the matter stress-energy tensor. The coefficient 8πG/c4 is determined by the weak-field limit of the theory that should reproduce the Newtonian predictions.

The dynamics of general relativity can be tested in the solar system by using the parameterized post-Newtonian formalism (PPN). It is a general formalism that introduces 10 phenomenological parameters to describe any possible deviation from general relativity at the first post-Newtonian order [540, 541] (see also [60] for a review on higher orders). The formalism assumes that gravity is described by a metric and that it does not involve any characteristic scale. In its simplest form, it reduces to the two Eddington parameters entering the Schwarzschild metric in isotropic coordinates

$${g_{00}} = - 1 + {{2Gm} \over {r{c^2}}} - 2{\beta ^{{\rm{PPN}}}}{\left({{{Gm} \over {r{c^2}}}} \right)^2},\quad {g_{ij}} = \left({1 + 2{\gamma ^{{\rm{PPN}}}}{{Gm} \over {r{c^2}}}} \right){\delta _{ij}}.$$

Indeed, general relativity predicts βPPN = γPPN = 1.
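This prediction can be checked numerically. The sketch below assumes the standard Eddington form of the expansion, written in terms of \(U \equiv Gm/rc^2\) as \(g_{00} = -1 + 2U - 2\beta^{\rm PPN}U^2\) and \(g_{ij} = (1 + 2\gamma^{\rm PPN}U)\delta_{ij}\), and compares it with the exact Schwarzschild metric in isotropic coordinates. The function names and the test value of U are choices made for this illustration.

```python
def g00_ppn(U, beta=1.0):
    """Eddington expansion of g_00 up to second order in U = G m / (r c^2)."""
    return -1.0 + 2.0 * U - 2.0 * beta * U**2


def gij_ppn(U, gamma=1.0):
    """Eddington expansion of the spatial metric factor up to first order in U."""
    return 1.0 + 2.0 * gamma * U


def g00_isotropic(U):
    """Exact Schwarzschild g_00 in isotropic coordinates."""
    return -(((1.0 - U / 2.0) / (1.0 + U / 2.0)) ** 2)


def gij_isotropic(U):
    """Exact Schwarzschild spatial metric factor in isotropic coordinates."""
    return (1.0 + U / 2.0) ** 4


U = 1e-2  # deliberately large so that residuals sit well above rounding errors
print(g00_isotropic(U) - g00_ppn(U))            # residual of order U^3
print(gij_isotropic(U) - gij_ppn(U))            # residual of order U^2
print(g00_isotropic(U) - g00_ppn(U, beta=2.0))  # beta != 1 leaves an O(U^2) mismatch
```

With βPPN = γPPN = 1 the residuals are of the next order in U, whereas any other value of βPPN leaves a mismatch at order U². For reference, U is of order \(2 \times 10^{-6}\) at the surface of the Sun.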

These two phenomenological parameters are constrained (1) by the shift of the Mercury perihelion [457], which implies that \(|2\gamma^{\rm PPN} - \beta^{\rm PPN} - 1| < 3 \times 10^{-3}\), (2) by the Lunar Laser Ranging experiments [543], which imply that \(|4\beta^{\rm PPN} - \gamma^{\rm PPN} - 3| = (4.4 \pm 4.5) \times 10^{-4}\), and (3) by the deflection of electromagnetic signals, which are all controlled by \(\gamma^{\rm PPN}\). For instance, very long baseline interferometry [459] implies that \(|\gamma^{\rm PPN} - 1| = 4 \times 10^{-4}\), while the measurement of the time delay variation to the Cassini spacecraft [53] sets \(\gamma^{\rm PPN} - 1 = (2.1 \pm 2.3) \times 10^{-5}\).

The PPN formalism does not allow one to test finite range effects that could be caused, e.g., by a massive degree of freedom. In that case one expects a Yukawa-type deviation from the Newton potential,

$$V = {{Gm} \over r}\left({1 + \alpha {{\rm{e}}^{- r/\lambda}}} \right),$$

that can be probed by “fifth force” experimental searches. Here λ characterizes the range of the Yukawa deviation and α its strength. The constraints on (λ, α) are summarized in [256], which typically shows that \(\alpha < 10^{-2}\) on scales ranging from the millimeter to the solar system size.
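The role of the two parameters can be made concrete with a few lines of Python. This is a minimal sketch of the potential written above, using its sign convention; the function names and the sample values of α and λ are arbitrary choices for illustration.

```python
import math


def yukawa_potential(r, Gm, alpha, lam):
    """V(r) = (G m / r) (1 + alpha exp(-r / lam)), as in the expression above."""
    return Gm / r * (1.0 + alpha * math.exp(-r / lam))


def fractional_deviation(r, alpha, lam):
    """Relative departure from the Newtonian potential at distance r."""
    return alpha * math.exp(-r / lam)


alpha, lam = 1e-2, 1.0   # hypothetical strength and range (range in arbitrary units)
for r in (0.1, 1.0, 10.0, 50.0):
    print(r, fractional_deviation(r, alpha, lam))
```

The deviation is close to α for r much smaller than λ and is exponentially suppressed for r much larger than λ, which is why a given experiment constrains α only for ranges λ comparable to or larger than the distances it probes.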

General relativity is also tested with pulsars [125, 189] and in the strong field regime [425]. For more details we refer to [129, 495, 540, 541]. Needless to say, any extension of general relativity has to pass these constraints. However, deviations from general relativity can be larger in the past, as we shall see, which makes cosmology an interesting physical system to extend these constraints.

Varying constants and the universality of free fall

As the previous description shows, the constancy of the fundamental constants and the universality of free fall are two pillars of the equivalence principle. Dicke [152] realized that they are actually not independent and that if the coupling constants are spatially dependent then this will induce a violation of the universality of free fall.

The connection lies in the fact that the mass of any composite body, starting, e.g., from nuclei, includes the mass of the elementary particles that constitute it (this means that it will depend on the Yukawa couplings and on the Higgs sector parameters) as well as a contribution, \(E_{\rm binding}/c^2\), arising from the binding energies of the different interactions (i.e., strong, weak and electromagnetic, but also gravitational for massive bodies). Thus, the mass of any body is a complicated function of all the constants, \(m[\alpha_i]\).

It follows that the action for a point particle is no longer given by Equation (2) but by

$${S_{{\rm{matter}}}} = - \int {{m_A}[{\alpha _j}]c\sqrt {- {g_{\mu \nu}}({\bf{x}}){v^\mu}{v^\nu}} {\rm{d}}t,}$$
(12)

where \(\alpha_j\) is a list of constants including αEM but also many others, and where the index A in \(m_A\) recalls that the dependency on these constants is a priori different for bodies of different chemical composition. The variation of this action gives the equation of motion

$${u^\nu}{\nabla _\nu}{u^\mu} = - \left({\sum\limits_i {{{\partial \ln {m_A}} \over {\partial {\alpha _i}}}} {{\partial {\alpha _i}} \over {\partial {x^\beta}}}} \right)({g^{\beta \mu}} + {u^\beta}{u^\mu}).$$
(13)

It follows that a test body will not enjoy a geodesic motion. In the Newtonian limit \(g_{00} = -1 - 2\Phi_N/c^2\), and at first order in v/c, the equation of motion of a test particle reduces to

$${\bf{a}} = {{\bf{g}}_N} + \delta {{\bf{a}}_A},\quad \delta {{\bf{a}}_A} = - {c^2}\sum\limits_i {{f_{A,i}}\left({\nabla {\alpha _i} + {{\dot \alpha}_i}{{{{\bf{v}}_A}} \over {{c^2}}}} \right)}$$
(14)

so that in the slow velocity (and slow variation) limit it reduces to

$$\delta {{\bf{a}}_A} = - {c^2}\sum\limits_i {{f_{A,i}}\nabla {\alpha _i},}$$

where we have introduced the sensitivity of the mass of body A with respect to the variation of the constant \(\alpha_i\) by

$${f_{A,i}} \equiv {{\partial \ln {m_A}} \over {\partial {\alpha _i}}}.$$
(15)

This simple argument shows that if the constants vary in space or time, then there must exist an anomalous acceleration that depends on the chemical composition of the body A.
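The slow-velocity expression for δa_A, combined with the definition (5) of η, can be turned into a small numerical illustration. In the Python sketch below, the sensitivities, the gradient of αEM and all names are hypothetical values chosen only to show how a composition-dependent acceleration translates into a non-zero η; none of them is a measured quantity.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def anomalous_acceleration(sensitivities, gradients):
    """delta a_A = -c^2 sum_i f_{A,i} grad(alpha_i) (slow-velocity limit).

    sensitivities: {constant name: f_{A,i}}
    gradients:     {constant name: (gx, gy, gz)} in 1/m
    """
    acc = [0.0, 0.0, 0.0]
    for name, f in sensitivities.items():
        for k in range(3):
            acc[k] -= C**2 * f * gradients[name][k]
    return tuple(acc)


def eta_from_anomalous(g_n, da_a, da_b):
    """eta_AB = 2 |a_A - a_B| / |a_A + a_B| with a = g_N + delta a.

    The numerator uses delta a_A - delta a_B directly: forming g_N + delta a
    first would lose the tiny anomalous part to floating-point rounding.
    """
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(da_a, da_b)))
    total = math.sqrt(sum((2 * g + x + y) ** 2 for g, x, y in zip(g_n, da_a, da_b)))
    return 2.0 * diff / total


grad_alpha = {"alpha_EM": (0.0, 0.0, 1e-32)}   # hypothetical gradient, 1/m
f_A = {"alpha_EM": 0.1}                        # hypothetical sensitivity of body A
f_B = {"alpha_EM": 0.4}                        # hypothetical sensitivity of body B

da_A = anomalous_acceleration(f_A, grad_alpha)
da_B = anomalous_acceleration(f_B, grad_alpha)
eta_AB = eta_from_anomalous((0.0, 0.0, -9.81), da_A, da_B)
print(da_A, da_B, eta_AB)
```

Only the difference of the sensitivities enters η_AB, which is why two bodies of identical composition give no signal whatever the gradient of the constants.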

This anomalous acceleration is generated by the change in the (electromagnetic, gravitational, …) binding energies [152, 246, 386] but also in the Yukawa couplings and in the Higgs sector parameters so that the α i -dependencies are a priori composition-dependent. As a consequence, any variation of the fundamental constants will entail a violation of the universality of free fall: the total mass of the body being space dependent, an anomalous force appears if energy is to be conserved. The variation of the constants, deviation from general relativity and violation of the weak equivalence principle are in general expected together.

On the other hand, the composition dependence of δa A and thus of η AB can be used to optimize the choice of materials for the experiments testing the equivalence principle [118, 120, 122] but also to distinguish between several models if data from the universality of free fall and atomic clocks are combined [143].

From a theoretical point of view, the computation of \(\eta_{AB}\) requires the determination of the coefficients \(f_{A,i}\). This can be achieved in two steps by first relating the new degrees of freedom of the theory to the variation of the fundamental constants and then relating them to the variation of the masses. As we shall see in Section 5, the first issue is very model dependent while the second is especially difficult, particularly when one wants to understand the effect of the quark mass, since it is related to the intricate structure of QCD and its role in low energy nuclear reactions.

As an example, the mass of a nucleus of charge Z and mass number A can be expressed as

$$m(A,Z) = Z{m_{\rm{p}}} + (A - Z){m_{\rm{n}}} + Z{m_{\rm{e}}} + {E_{\rm{S}}} + {E_{{\rm{EM}}}},$$

where ES and EEM are respectively the strong and electromagnetic contributions to the binding energy. The Bethe-Weizsäcker formula allows one to estimate the latter as

$${E_{{\rm{EM}}}} = 98.25{{Z(Z - 1)} \over {{A^{1/3}}}}{\alpha _{{\rm{EM}}}}\;{\rm{MeV}}.$$
(16)

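If we decompose the proton and neutron masses as [230] \(m_{({\rm p},{\rm n})} = u_3 + b_{({\rm u},{\rm d})}m_{\rm u} + b_{({\rm d},{\rm u})}m_{\rm d} + B_{({\rm p},{\rm n})}\alpha_{\rm EM}\), where \(u_3\) is the pure QCD approximation of the nucleon mass (\(b_{\rm u}\), \(b_{\rm d}\) and \(B_{({\rm n},{\rm p})}/u_3\) being pure numbers), the nuclear mass reduces to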

$$\begin{array}{*{20}{c}} {m(A,Z) = (A{u_3} + {E_{\rm{S}}}) + (Z{b_{\rm{u}}} + N{b_{\rm{d}}}){m_{\rm{u}}} + (Z{b_{\rm{d}}} + N{b_{\rm{u}}}){m_{\rm{d}}}} \\ { + \left( {Z{B_{\rm{p}}} + N{B_{\rm{n}}} + 98.25\frac{{Z(Z - 1)}}{{{A^{1/3}}}}\;{\rm{MeV}}} \right){\alpha _{{\rm{EM}}}},} \end{array}$$
(17)

with N = A − Z, the neutron number. For an atom, one would have to add the contribution of the electrons, \(Zm_{\rm e}\). This form depends on strong, weak and electromagnetic quantities. The numerical coefficients \(B_{({\rm n},{\rm p})}\) are given explicitly by [230]

$${B_{\rm{p}}}{\alpha _{{\rm{EM}}}} = 0.63\;{\rm{MeV}},\quad {B_{\rm{n}}}{\alpha _{{\rm{EM}}}} = - 0.13\;{\rm{MeV}}.$$
(18)

Such estimations were used in the first analyses of the relation between the variation of the constants and the universality of free fall [135, 166], but the dependency on the quark mass is still not well understood and we refer to [120, 122, 157, 159, 208] for some attempts to refine this description.
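As a rough illustration of Equations (16)–(18), the Python sketch below estimates the logarithmic sensitivity \(\partial \ln m/\partial \ln \alpha_{\rm EM} = \alpha_{\rm EM} f_{A,\alpha_{\rm EM}}\) of a nucleus from the αEM-linear term of Equation (17). Two simplifications are assumptions of this sketch and not part of the text: the total mass is approximated by A × 931.494 MeV, and any αEM dependence hidden in ES is ignored. The function names are hypothetical.

```python
ALPHA_EM = 1.0 / 137.036
AMU_MEV = 931.494        # assumed normalisation: m(A,Z) ~ A x 931.494 MeV
BP_ALPHA_MEV = 0.63      # B_p alpha_EM from Eq. (18), in MeV
BN_ALPHA_MEV = -0.13     # B_n alpha_EM from Eq. (18), in MeV


def coulomb_energy_mev(A, Z, alpha=ALPHA_EM):
    """Bethe-Weizsaecker electromagnetic term of Eq. (16), in MeV."""
    return 98.25 * Z * (Z - 1) / A ** (1.0 / 3.0) * alpha


def alpha_log_sensitivity(A, Z):
    """Estimate d ln m / d ln alpha_EM from the alpha_EM-linear term of Eq. (17)."""
    N = A - Z
    em_part = Z * BP_ALPHA_MEV + N * BN_ALPHA_MEV + coulomb_energy_mev(A, Z)
    return em_part / (A * AMU_MEV)


for name, A, Z in (("Be-9", 9, 4), ("Cu-63", 63, 29), ("Bi-209", 209, 83)):
    print(f"{name}: E_EM = {coulomb_energy_mev(A, Z):7.1f} MeV, "
          f"dln m/dln alpha = {alpha_log_sensitivity(A, Z):.2e}")
```

Because the Coulomb term grows roughly as Z²/A^{1/3}, heavier nuclei are more sensitive to αEM, which illustrates why pairs of test bodies with very different Z are useful when searching for a composition-dependent acceleration.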

For macroscopic bodies, the mass also has a negative contribution

$$\Delta m(G) = - {G \over {2{c^2}}}\int {{{\rho (\vec r)\rho (\vec r{\prime})} \over {\vert \vec r - \vec r{\prime}\vert}}} {{\rm{d}}^3}\vec r{{\rm{d}}^3}\vec r{\prime}$$
(19)

from the gravitational binding energy. As a conclusion, from (17) and (19), we expect the mass to depend on all the coupling constants, m(αEM, αW, αS, αG, …). We shall discuss this issue in more detail in Section 5.3.2.

Note that varying coupling constants can also be associated with violations of local Lorentz invariance and CPT symmetry [298, 52, 242].

Relations with cosmology

Most constraints on the time variation of the fundamental constants will not be local but will be related to physical systems at various epochs of the evolution of the universe. It follows that the comparison of different constraints requires a full cosmological model.

Our current cosmological model is known as the ΛCDM model (see [409] for a detailed description, and Table 4 for the typical values of the cosmological parameters). It is important to recall that its construction relies on 4 main hypotheses: (H1) a theory of gravity; (H2) a description of the matter components contained in the Universe and their non-gravitational interactions; (H3) a symmetry hypothesis; and (H4) a hypothesis on the global structure, i.e., the topology, of the Universe. These hypotheses are not on the same footing, since H1 and H2 refer to the physical theories. These two hypotheses are, however, not sufficient to solve the field equations, and we must make an assumption on the symmetries (H3) of the solutions describing our Universe on large scales, while H4 is an assumption on some global properties of these cosmological solutions with the same local geometry. The last two hypotheses are unavoidable because the knowledge of the fundamental theories is not sufficient to construct a cosmological model [504].

Table 4 Main cosmological parameters in the standard ΛCDM model. There are 7 main parameters (because \(\sum\Omega_i = 1\)) to which one can add 6 more to include dark energy, neutrinos and gravity waves. Note that often the spatial curvature is set to \(\Omega_K = 0\). (See, e.g., Refs. [296, 409]).

The ΛCDM model assumes that gravity is described by general relativity (H1), that the Universe contains the fields of the standard model of particle physics plus some dark matter and a cosmological constant, the latter two having no physical explanation at the moment. It also deeply involves the Copernican principle as a symmetry hypothesis (H3), without which the Einstein equations usually cannot be solved, and assumes most often that the spatial sections are simply connected (H4). H2 and H3 imply that the description of the standard matter reduces to a mixture of a pressureless fluid and a radiation perfect fluid. This model is compatible with all astronomical data, which roughly indicate that ΩΛ0 ≃ 0.73, Ωmat0 ≃ 0.27, and ΩK0 ≃ 0. Thus, cosmology roughly imposes that \(\vert {\Lambda _0}\vert \leq H_0^2\), that is \({\ell _\Lambda} \leq H_0^{- 1} \sim {10^{26}}{\rm{m}} \sim {\rm{1}}{{\rm{0}}^{41}}{\rm{Ge}}{{\rm{V}}^{{- 1}}}\).

Hence, the analysis of the cosmological dynamics of the universe and of its large scale structures requires the introduction of a new constant, the cosmological constant, associated with a recent acceleration of the cosmic expansion, that can be introduced by modifying the Einstein-Hilbert action to

$${S_{{\rm{grav}}}} = {{{c^3}} \over {16\pi G}}\int {\sqrt {- g}} (R - 2\Lambda){{\rm{d}}^{\rm{4}}}x.$$

This constant can equivalently be introduced in the matter action. Note, however, that it is disproportionately small compared to the natural scale fixed by the Planck length

$${\rho _{{\Lambda _0}}} \sim {10^{- 120}}M_{{\rm{Pl}}}^4 \sim {10^{- 47}}\;{\rm{Ge}}{{\rm{V}}^{\rm{4}}}.$$
(20)

Classically, this value is no problem but it was pointed out that at the quantum level, the vacuum energy should scale as \(M^4\), where M is some energy scale of high-energy physics. In such a case, there is a discrepancy of 60–120 orders of magnitude between the cosmological conclusions and the theoretical expectation. This is the cosmological constant problem [528].
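The quoted range of 60–120 orders of magnitude follows from comparing M⁴ with the observed value of Equation (20). The short Python sketch below does this arithmetic for two illustrative choices of M; taking the TeV scale and the reduced Planck mass as the two ends of the range is an assumption of this example, as is the function name.

```python
import math

RHO_LAMBDA_GEV4 = 1e-47   # observed vacuum energy density, Eq. (20), in GeV^4


def orders_of_magnitude_discrepancy(m_gev):
    """log10 of the ratio between M^4 and the observed rho_Lambda."""
    return math.log10(m_gev**4 / RHO_LAMBDA_GEV4)


scales = {
    "TeV scale": 1.0e3,                 # GeV (assumed lower end of the range)
    "reduced Planck mass": 2.435e18,    # GeV (assumed upper end of the range)
}
for label, m in scales.items():
    print(f"{label}: {orders_of_magnitude_discrepancy(m):.1f} orders of magnitude")
```

Under these assumptions the two choices give roughly 59 and 120.5, bracketing the range stated above.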

Two approaches to solve this problem have been considered. Either one accepts such a constant and such a fine-tuning and tries to explain it on anthropic grounds. Or, in the same spirit as Dirac, one interprets it as an indication that our set of cosmological hypotheses has to be extended, by either abandoning the Copernican principle [508] or by modifying the local physical laws (either gravity or the matter sector). The ways to introduce such new physical degrees of freedom were classified in [502]. In the latter approach, the tests of the constancy of the fundamental constants are central, since they can reveal the coupling of this new degree of freedom to the standard matter fields. Note, however, that the cosmological data still favor a pure cosmological constant.

Among all the proposals, quintessence involves a scalar field rolling down a runaway potential, hence acting as a fluid with an effective equation of state in the range −1 ≤ w ≤ 1 if the field is minimally coupled. It was proposed that the quintessence field is also the dilaton [229, 434, 499]. The same scalar field then drives the time variation of the cosmological constant and of the gravitational constant, and it also admits tracking solutions [499]. Such models do not solve the cosmological constant problem but only relieve the coincidence problem. One of the underlying motivations to replace the cosmological constant by a scalar field comes from superstring models in which any dimensionful parameter is expressed in terms of the string mass scale and the vacuum expectation value of a scalar field. However, the requirement of slow roll (mandatory to have a negative pressure) and the fact that the quintessence field dominates today imply, if the minimum of the potential is zero, that it is very light, roughly of order \(m \sim 10^{-33}\) eV [81].

Such a light field can lead to observable violations of the universality of free fall if it is non-universally coupled to the matter fields. Carroll [81] considered the effect of the coupling of this very light quintessence field to ordinary matter via a coupling to the electromagnetic field as \(\phi {F^{\mu \nu}}{{\tilde F}_{\mu \nu}}\). Chiba and Kohri [96] also argued that an ultra-light quintessence field induces a time variation of the coupling constant if it is coupled to ordinary matter and studied a coupling of the form \(\Phi F^{\mu\nu}F_{\mu\nu}\), as, e.g., expected from Kaluza-Klein theories (see below). This was generalized to quintessence models with couplings of the form \(Z(\phi)F^{\mu\nu}F_{\mu\nu}\) [11, 112, 162, 315, 314, 347, 404, 531] and then to models of runaway dilaton [133, 132] inspired by string theory (see Section 5.4.1). The evolution of the scalar field drives both the acceleration of the universe at late time and the variation of the constants. As pointed out in [96, 166, 532], such models are extremely constrained by the bound on the universality of free fall (see Section 6.3).

We have two means of investigation:

  • The field driving the time variation of the fundamental constants does not explain the acceleration of the universe (either it does not dominate the matter content today or its equation of state is not negative enough). In such a case, the variation of the constants is disconnected from the dark energy problem. Cosmology allows one to determine the dynamics of this field during the whole history of the universe and thus to compare local constraints and cosmological constraints. An example is given by scalar-tensor theories (see Section 5.1.1) for which one can compare, e.g., primordial nucleosynthesis to local constraints [134]. However, in such a situation one should take into account the effect of the variation of the constants on the astrophysical observations since it can affect local physical processes and bias, e.g., the luminosity of supernovae and indirectly modify the luminosity distance-redshift relation derived from these observations [33, 435].

  • The field driving the time variation of the fundamental constants is also responsible for the acceleration of the universe. It follows that the dynamics of the universe, the level of variation of the constants and the other deviations from general relativity are connected [348], so that the study of the variation of the constants can improve the reconstruction of the equation of state of the dark energy [20, 162, 389, 404].

In conclusion, cosmology seems to require a new constant. It also provides a link between the microphysics and cosmology, as foreseen by Dirac. The tests of fundamental constants can discriminate between various explanations of the acceleration of the universe. When a model is specified, cosmology also allows one to set stronger constraints since it relates observables that cannot be compared otherwise.

Experimental and Observational Constraints

This section focuses on the experimental and observational constraints on the non-gravitational constants, that is, assuming that αG remains constant. We use the convention that \(\Delta\alpha = \alpha - \alpha_0\) for any constant α, so that Δα < 0 refers to a value smaller than today.

The various physical systems that have been considered can be classified in many ways. We can classify them according to their look-back time and more precisely their space-time position relative to our current position. This is summarized in Figure 1. Indeed, higher redshift systems offer the possibility to set constraints on a larger time scale, but this is at the expense of usually involving other parameters such as the cosmological parameters. This is, in particular, the case of the cosmic microwave background or of primordial nucleosynthesis. The systems can also be classified in terms of the physics they involve. For instance, atomic clocks, quasar absorption spectra and the cosmic microwave background require only quantum electrodynamics to draw the primary constraints, while the Oklo phenomenon, meteorite dating and nucleosynthesis require nuclear physics.

Figure 1

Top: Summary of the systems that have been used to probe the constancy of the fundamental constants and their position in a space-time diagram in which the cone represents our past light cone. The shaded areas represent the comoving space probed by different tests. Bottom: The look-back time-redshift relation for the standard ΛCDM model.

For any system, setting constraints goes through several steps. First we have some observable quantities from which we can draw constraints on primary constants, which may not be fundamental constants (e.g., the BBN parameters, the lifetime of β-decayers, …). These primary parameters must then be related to some fundamental constants such as masses and couplings. In a last step, the number of constants can be reduced by relating them in some unification schemes. Each step requires specific modeling and hypotheses and has its own limitations. This is summarized in Table 5.

Table 5 Summary of the systems considered to set constraints on the variation of the fundamental constants. We summarize the observable quantities, the primary constants used to interpret the data and the other hypotheses required for this interpretation. All the quantities appearing in this table are defined in the text.

Atomic clocks

Atomic spectra and constants

The laboratory constraints on the time variation of fundamental constants are obtained by comparing the long-term behavior of several oscillators and rely on frequency measurements. The atomic transitions have various dependencies in the fundamental constants. For instance, for the hydrogen atom, the gross, fine and hyperfine-structures are roughly given by

$$2p - 1s:\nu \propto c{R_\infty},\quad 2{p_{3/2}} - 2{p_{1/2}}:\nu \propto c{R_\infty}\alpha _{{\rm{EM}}}^2,\quad 1s:\nu \propto c{R_\infty}\alpha _{{\rm{EM}}}^2{g_{\rm{p}}}\bar \mu,$$

respectively, where the Rydberg constant sets the dimension, g_p is the proton gyromagnetic factor and \(\bar \mu = {m_{\rm{e}}}/{m_{\rm{p}}}\). In the non-relativistic approximation, the transitions of all atoms have similar dependencies but two effects have to be taken into account. First, the hyperfine structures involve a gyromagnetic factor g_i (related to the nuclear magnetic moment by μ_i = g_i μ_N, with μ_N = eħ/2m_p c), which is different for each nucleus. Second, relativistic corrections (including the Casimir contribution), which depend on each atom and on the type of transition, can be included through a multiplicative function Frel(αEM). It has a strong dependence on the atomic number Z, which can be illustrated by the case of alkali atoms, for which

$${F_{{\rm{rel}}}}({\alpha _{{\rm{EM}}}}) = {[1 - {(Z{\alpha _{{\rm{EM}}}})^2}]^{- 1/2}}{\left[ {1 - {4 \over 3}{{(Z{\alpha _{{\rm{EM}}}})}^2}} \right]^{- 1}} \simeq 1 + {{11} \over 6}{(Z{\alpha _{{\rm{EM}}}})^2}.$$

The development of highly accurate atomic clocks using different transitions in different atoms offers the possibility to test a variation of various combinations of the fundamental constants.
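As a rough numerical illustration of the alkali formula above, the following Python sketch evaluates Frel, its leading-order expansion, and the logarithmic derivative of Frel with respect to αEM by finite differences. The function names and the numerical value adopted for αEM are our own choices. The closed form is only the simple estimate quoted above and is not a substitute for the relativistic many-body calculations used to obtain precise sensitivities.

```python
import math

ALPHA_EM = 7.2973525693e-3  # fine-structure constant (CODATA value, assumed here)


def f_rel(z, alpha=ALPHA_EM):
    """Closed-form relativistic factor for an alkali atom of atomic number z."""
    x = (z * alpha) ** 2
    return (1.0 - x) ** -0.5 / (1.0 - 4.0 * x / 3.0)


def f_rel_leading(z, alpha=ALPHA_EM):
    """Leading-order expansion 1 + (11/6) (Z alpha)^2."""
    return 1.0 + 11.0 / 6.0 * (z * alpha) ** 2


def log_sensitivity(z, alpha=ALPHA_EM, eps=1e-6):
    """Central finite-difference estimate of d ln F_rel / d ln alpha."""
    up = math.log(f_rel(z, alpha * (1.0 + eps)))
    down = math.log(f_rel(z, alpha * (1.0 - eps)))
    return (up - down) / (math.log(1.0 + eps) - math.log(1.0 - eps))


# Compare the exact closed form with its expansion for light and heavy atoms.
for name, z in [("H", 1), ("Rb", 37), ("Cs", 55)]:
    print(name, f_rel(z), f_rel_leading(z), log_sensitivity(z))
```

The expansion is excellent for hydrogen but visibly underestimates the correction for caesium, which is why heavy atoms carry most of the sensitivity to αEM.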

It follows that at the lowest level of description, we can interpret all atomic clock results in terms of the g-factors of each atom, g_i, the electron-to-proton mass ratio \(\bar \mu\) and the fine-structure constant αEM. We shall parameterize the hyperfine and fine-structure frequencies as follows.

The hyperfine frequency in a given electronic state of an alkali-like atom, such as 133Cs, 87Rb, 199Hg+, is

$${\nu _{{\rm{hfs}}}} \simeq {R_\infty}c \times {A_{{\rm{hfs}}}} \times {g_i} \times \alpha _{{\rm{EM}}}^2 \times \bar \mu \times {F_{{\rm{hfs}}}}(\alpha)$$
(21)

where g_i = μ_i/μ_N is the nuclear g-factor, Ahfs is a numerical factor depending on each particular atom and we have set Frel = Fhfs(α). Similarly, the frequency of an electronic transition is well-approximated by

$${\nu _{{\rm{elec}}}} \simeq {R_\infty}c \times {A_{{\rm{elec}}}} \times {F_{{\rm{elec}}}}(Z,\alpha),$$
(22)

where, as above, Aelec is a numerical factor depending on each particular atom and Felec is the function accounting for relativistic effects, spin-orbit couplings and many-body effects. Even though an electronic transition should also include a contribution from the hyperfine interaction, it is generally only a small fraction of the transition energy and thus should not carry any significant sensitivity to a variation of the fundamental constants.

The importance of the relativistic corrections was probably first emphasized in [423] and their computation through relativistic N-body calculations was carried out for many transitions in [170, 174, 175, 198]. They can be characterized by introducing the sensitivity of the relativistic factors to a variation of αEM,

$${\kappa _\alpha} \equiv {{\partial \ln F} \over {\partial \ln {\alpha _{{\rm{EM}}}}}}.$$
(23)

Table 6 summarizes the values of some of them, as computed in [175, 210]. A reliable knowledge of these coefficients at the 1% to 10% level is required to deduce limits on a possible variation of the constants. The interpretation of the spectra in this context relies, from a theoretical point of view, only on quantum electrodynamics (QED), a theory that is well tested experimentally [280], so that we can safely obtain constraints on (αEM, μ, g_i), still keeping in mind that the computation of the sensitivity factors requires numerical N-body simulations.

Table 6 Sensitivity of various transitions on a variation of the fine-structure constant.

From an experimental point of view, comparisons between various combinations of clocks have been performed. It is important to analyze as many species as possible in order to rule out species-dependent systematic effects. Most experiments are based on a frequency comparison to caesium clocks. The hyperfine splitting frequency between the F = 3 and F = 4 levels of its \({}^2S_{1/2}\) ground state at 9.192 GHz has been used for the definition of the second since 1967. One limiting effect, which contributes most of the systematic uncertainty, is the frequency shift due to cold collisions between the atoms. On this particular point, clocks based on the hyperfine frequency of the ground state of rubidium at 6.835 GHz are more favorable.

Experimental constraints

We present the latest results that have been obtained and refer to Section III.B.2 of FCV [500] for earlier studies. They all rely on the development of new atomic clocks, with the primary goal of defining better frequency standards.

  • Rubidium: The comparison of the hyperfine frequencies of rubidium and caesium in their electronic ground state between 1998 and 2003, with an accuracy of order 10⁻¹⁵, leads to the constraint [346]

    $${{\rm{d}} \over {{\rm{d}}t}}\ln \left({{{{\nu _{{\rm{Rb}}}}} \over {{\nu _{{\rm{Cs}}}}}}} \right) = {(}0.2 \pm 7.0{)} \times {10^{- 16}}{\rm{y}}{{\rm{r}}^{-1}}.$$
    (24)

    With one more year of experiment, the constraint dropped to [58]

    $${{\rm{d}} \over {{\rm{d}}t}}\ln \left({{{{\nu _{{\rm{Rb}}}}} \over {{\nu _{{\rm{Cs}}}}}}} \right) = {(} - 0.5 \pm 5.3{)} \times {10^{- 16}}{\rm{y}}{{\rm{r}}^{-1}}.$$
    (25)

    From Equation (21), and using the values of the sensitivities κ_α, we deduce that this comparison constrains

    $${{{\nu _{{\rm{Cs}}}}} \over {{\nu _{{\rm{Rb}}}}}} \propto {{{g_{{\rm{Cs}}}}} \over {{g_{{\rm{Rb}}}}}}\alpha _{{\rm{EM}}}^{0.49}.$$
  • Atomic hydrogen: The 1s–2s transition in atomic hydrogen was compared to the ground state hyperfine splitting of caesium [196] in 1999 and 2003, setting an upper limit on the variation of νH of (−29±57) Hz within 44 months. This can be translated into a relative drift

    $${{\rm{d}} \over {{\rm{d}}t}}\ln \left({{{{\nu _{\rm{H}}}} \over {{\nu _{{\rm{Cs}}}}}}} \right) = (- 32 \pm 63) \times {10^{- 16}}{\rm{y}}{{\rm{r}}^{-1}}.$$
    (26)

    Since the relativistic correction for the atomic hydrogen transition nearly vanishes, we have νH ∼ R∞ so that

    $${{{\nu _{{\rm{Cs}}}}} \over {{\nu _{\rm{H}}}}} \propto {g_{{\rm{Cs}}}}\bar \mu \alpha _{{\rm{EM}}}^{2.83}.$$
  • Mercury: The 199Hg+ \({}^2S_{1/2} - {}^2D_{5/2}\) optical transition has a high sensitivity to αEM (see Table 6) so that it is well suited to test its variation. The frequency of the 199Hg+ electric quadrupole transition at 282 nm was compared to the ground state hyperfine transition of caesium during a two-year period, which led to [57]

    $${{\rm{d}} \over {{\rm{d}}t}}\ln \left({{{{\nu _{{\rm{Hg}}}}} \over {{\nu _{{\rm{Cs}}}}}}} \right) = {(}0.2 \pm 7{)} \times {10^{- 15}}{\rm{y}}{{\rm{r}}^{-1}}.$$
    (27)

    This was improved by a comparison over a 6 year period [214] to get

    $${{\rm{d}} \over {{\rm{d}}t}}\ln \left({{{{\nu _{{\rm{Hg}}}}} \over {{\nu _{{\rm{Cs}}}}}}} \right) = {(}3.7 \pm 3.9{)} \times {10^{- 16}}{\rm{y}}{{\rm{r}}^{-1}}.$$
    (28)

    While νCs is still given by Equation (21), νHg is given by Equation (22). Using the sensitivities of Table 6, we conclude that this comparison tests the stability of

    $${{{\nu _{{\rm{Cs}}}}} \over {{\nu _{{\rm{Hg}}}}}} \propto {g_{{\rm{Cs}}}}\bar \mu \alpha _{{\rm{EM}}}^{6.05}.$$
  • Ytterbium: The \({}^2S_{1/2} - {}^2D_{3/2}\) electric quadrupole transition at 688 THz of 171Yb+ was compared to the ground state hyperfine transition of cesium. The constraint of [408] was updated, after comparison over a six-year period, which led to [407]

    $${{\rm{d}} \over {{\rm{d}}t}}\ln \left({{{{\nu _{{\rm{Yb}}}}} \over {{\nu _{{\rm{Cs}}}}}}} \right) = {(} - 0.78 \pm 1.40{)} \times {10^{- 15}}{\rm{y}}{{\rm{r}}^{-1}}.$$
    (29)

    Proceeding as previously, this tests the stability of

    $${{{\nu _{{\rm{Cs}}}}} \over {{\nu _{{\rm{Yb}}}}}} \propto {g_{{\rm{Cs}}}}\bar \mu \alpha _{{\rm{EM}}}^{1.93}.$$
  • Strontium: The comparison of the \({}^1S_0 - {}^3P_0\) transition in neutral 87Sr with a cesium clock was performed in three independent laboratories. The combination of these three experiments [61] leads to the constraint

    $${{\rm{d}} \over {{\rm{d}}t}}\ln \left({{{{\nu _{{\rm{Sr}}}}} \over {{\nu _{{\rm{Cs}}}}}}} \right) = {(} - 1.0 \pm 1.8{)} \times {10^{- 15}}{\rm{y}}{{\rm{r}}^{-1}}.$$
    (30)

    Proceeding as previously, this tests the stability of

    $${{{\nu _{{\rm{Cs}}}}} \over {{\nu _{{\rm{Sr}}}}}} \propto {g_{{\rm{Cs}}}}\bar \mu \alpha _{{\rm{EM}}}^{2.77}.$$
  • Atomic dysprosium: It was suggested in [175, 174] (see also [173] for a computation of the transition amplitudes of the low states of dysprosium) that the electric dipole (E1) transition between two nearly degenerate opposite-parity states in atomic dysprosium should be highly sensitive to the variation of αEM. It was then demonstrated [384] that a constraint of the order of 10⁻¹⁸/yr can be reached. The frequencies of these transitions in two isotopes of dysprosium were monitored over an 8-month period [100], showing that the frequency variations of the 3.1-MHz transition in 163Dy and the 235-MHz transition in 162Dy are 9.0 ± 6.7 Hz/yr and −0.6 ± 6.5 Hz/yr, respectively. These provide the constraint

    $${{{\dot \alpha}_{\rm{EM}}} \over {\alpha _{\rm{EM}}}} = {(} - 2.7 \pm 2.6{)} \times {10^{- 15}}{\rm{y}}{{\rm{r}}^{-1}},$$
    (31)

    at 1σ level, without any assumptions on the constancy of other fundamental constants.

  • Aluminium and mercury single-ion optical clocks: The comparison of the \({}^1S_0 - {}^3P_0\) transition in 27Al+ and the \({}^2S_{1/2} - {}^2D_{5/2}\) transition in 199Hg+ over a year allowed one to set the constraint [440]

    $${{\rm{d}} \over {{\rm{d}}t}}\ln \left({{{{\nu _{{\rm{Al}}}}} \over {{\nu _{{\rm{Hg}}}}}}} \right) = {(} - 5.3 \pm 7.9{)} \times {10^{- 17}}{\rm{y}}{{\rm{r}}^{-1}}.$$
    (32)

    Proceeding as previously, this tests the stability of

    $${{{\nu _{{\rm{Hg}}}}} \over {{\nu _{{\rm{Al}}}}}} \propto \alpha _{{\rm{EM}}}^{- 3.208},$$

    which directly sets the constraint

    $${{{\dot \alpha}_{\rm{EM}}} \over {\alpha _{\rm{EM}}}} = {(} - 1.6 \pm 2.3{)} \times {10^{- 17}}{\rm{y}}{{\rm{r}}^{-1}},$$
    (33)

    since it depends only on αEM.
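The bookkeeping behind the scalings quoted in this list can be sketched in a few lines of Python. The helper names are hypothetical, and the κ_α values in the dictionary are illustrative numbers inferred from the exponents quoted above (for instance 2 + κ_α = 2.83 for caesium); they are assumptions and not a copy of Table 6. The last function simply rescales a measured drift by the αEM exponent, which is only meaningful when the frequency ratio depends on αEM alone.

```python
# Illustrative sensitivities kappa_alpha, inferred from the exponents quoted
# in the text (assumed values, not a transcription of Table 6).
KAPPA = {"Cs": 0.83, "Rb": 0.34, "Sr": 0.06}


def hyperfine_exponents(species):
    """Exponents of (g_i, mu_bar, alpha_EM) in nu_hfs / (R_inf c), Eq. (21)."""
    return {"g_" + species: 1.0, "mu_bar": 1.0, "alpha": 2.0 + KAPPA[species]}


def optical_exponents(species):
    """Exponents in nu_elec / (R_inf c), Eq. (22): only alpha_EM enters."""
    return {"alpha": KAPPA[species]}


def ratio_exponents(numerator, denominator):
    """Exponents of a frequency ratio; the common factor R_inf c cancels."""
    out = dict(numerator)
    for key, power in denominator.items():
        out[key] = out.get(key, 0.0) - power
    # Drop constants whose exponents cancel and tidy floating-point noise.
    return {k: round(p, 10) for k, p in out.items() if abs(p) > 1e-12}


def alpha_drift(ratio_drift, ratio_sigma, alpha_exponent):
    """Rescale a drift of ln(ratio) into a drift of ln(alpha_EM).

    Only valid when the ratio depends on alpha_EM alone.
    """
    return ratio_drift / alpha_exponent, abs(ratio_sigma / alpha_exponent)


print(ratio_exponents(hyperfine_exponents("Cs"), hyperfine_exponents("Rb")))
print(ratio_exponents(hyperfine_exponents("Cs"), optical_exponents("Sr")))
print(alpha_drift(-5.3e-17, 7.9e-17, 3.208))
```

Applied to the aluminium and mercury comparison, this naive rescaling gives a central value close to that of Equation (33); the published uncertainty may have been obtained with a more careful treatment than this one-line division.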

While the constraint (33) was obtained directly from the clock comparison, the other studies need to be combined to disentangle the contributions of the various constants. As an example, using the bound (33) on αEM, we can extract the following two bounds

$${{\rm{d}} \over {{\rm{d}}t}}\ln \left({{{{g_{{\rm{Cs}}}}} \over {{g_{{\rm{Rb}}}}}}} \right) = {(}0.48 \pm 6.68{)} \times {10^{- 16}}{\rm{y}}{{\rm{r}}^{-1}},\quad {{\rm{d}} \over {{\rm{d}}t}}\ln ({g_{{\rm{Cs}}}}\bar \mu) = {(}4.67 \pm 5.29{)} \times {10^{- 16}}{\rm{y}}{{\rm{r}}^{-1}},$$
(34)

on a time scale of a year. We cannot lift the degeneracies further with this clock comparison, since that would require a constraint on the time variation of μ. All these constraints are summarized in Table 7 and Figure 2.

Figure 2

Evolution of the comparison of different atomic clocks summarized in Table 7.

Table 7 Summary of the constraints obtained from the comparisons of atomic clocks. For each constraint on the relative drift of the frequency of the two clocks, we provide the dependence in the various constants, using the numbers of Table 6. From Ref. [379], which can be consulted for other constants.

A solution is to consider diatomic molecules since, as first pointed out by Thomson [488], molecular lines can provide a test of the variation of μ. The energy difference between two adjacent rotational levels in a diatomic molecule is inversely proportional to Mr², r being the bond length and M the reduced mass, and the vibrational transition of the same molecule has, in first approximation, a \(1/\sqrt M\) dependence. For molecular hydrogen M = mp/2 so that the comparison of an observed vibro-rotational spectrum with a laboratory spectrum gives information on the variation of mp and mn. Comparing pure rotational transitions with electronic transitions gives a measurement of μ. It follows that the frequency of vibro-rotation transitions is, in the Born-Oppenheimer approximation, of the form

$$\nu \simeq {E_I}({c_{{\rm{elec}}}} + {c_{{\rm{vib}}}}\sqrt {\bar \mu} + {c_{{\rm{rot}}}}\bar \mu)$$
(35)

where celec, cvib and crot are some numerical coefficients.
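To make the role of the three terms in Equation (35) explicit, the following minimal Python sketch computes the logarithmic sensitivity of ν to \(\bar \mu\), holding the overall energy scale E_I fixed. The function name and the numerical value adopted for the mass ratio are our own assumptions; the coefficients are free inputs since their values depend on the molecule and transition considered.

```python
MU_BAR = 1.0 / 1836.15267343  # electron-to-proton mass ratio (assumed value)


def mu_sensitivity(c_elec, c_vib, c_rot, mu_bar=MU_BAR):
    """Return d ln(nu) / d ln(mu_bar) for
    nu = E_I * (c_elec + c_vib * sqrt(mu_bar) + c_rot * mu_bar),
    with the prefactor E_I held fixed."""
    vib = c_vib * mu_bar ** 0.5
    rot = c_rot * mu_bar
    return (0.5 * vib + rot) / (c_elec + vib + rot)


# Limiting cases: purely electronic, purely vibrational, purely rotational.
print(mu_sensitivity(1.0, 0.0, 0.0))
print(mu_sensitivity(0.0, 1.0, 0.0))
print(mu_sensitivity(0.0, 0.0, 1.0))
```

The limiting cases recover the familiar statement that electronic, vibrational and rotational transitions scale as \(\bar \mu\) to the powers 0, 1/2 and 1, respectively, so that comparing transitions of different types isolates the mass ratio.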

A vibro-rotational transition in the molecule SF6 was compared to a caesium clock over a two-year period, leading to the constraint [464]

$${{\rm{d}} \over {{\rm{d}}t}}\ln \left({{{{\nu _{{\rm{SF6}}}}} \over {{\nu _{{\rm{Cs}}}}}}} \right) = {(}1.9 \pm 0.12 \pm 2.7{)} \times {10^{- 14}}{\rm{y}}{{\rm{r}}^{-1}},$$
(36)

where the second error takes into account uncontrolled systematics. Now, using again Table 6, we deduce that

$${{{\nu _{{\rm{SF6}}}}} \over {{\nu _{{\rm{Cs}}}}}} \propto {\mu ^{- 1/2}}\alpha _{{\rm{EM}}}^{- 2.83}{({g_{{\rm{Cs}}}}\bar \mu)^{- 1}}.$$

It can be combined with the constraint (26), which has the same dependence on the cesium frequency, to infer that

$${{\dot \mu} \over \mu} = {(} - 3.8 \pm 5.6{)} \times {10^{- 14}}{\rm{y}}{{\rm{r}}^{-1}}.$$
(37)

Combined with Equation (34), we can obtain independent constraints on the time variation of gCs, gRb and μ.

Physical interpretation

The theoretical description must be pushed further if one wants to extract constraints on constants more fundamental than the nuclear magnetic moments. This requires one to use quantum chromodynamics. In particular, it was argued that, within this theoretical framework, one can express the nucleon g-factors in terms of the quark masses and the QCD scale [198]. Under the assumption of a unification of the three non-gravitational interactions (see Section 6.3), the dependence of the magnetic moments on the quark masses was investigated in [210]. The magnetic moments, or equivalently the g-factors, are first related to those of the proton and the neutron to derive a relation of the form

$$g \propto g_{\rm{p}}^{{a_{\rm{p}}}}g_{\rm{n}}^{{a_{\rm{n}}}}.$$

[198, 210] argued that these g-factors mainly depend on the light quark mass mq = ½(mu + md) and on ms, where mu, md and ms are the masses of the up, down and strange quarks, that is, on Xq = mq/ΛQCD and Xs = ms/ΛQCD. Using chiral perturbation theory, it was deduced, assuming ΛQCD constant, that

$${g_{\rm{p}}} \propto X_{\rm{q}}^{- 0.087}X_{\rm{s}}^{- 0.013},\quad {g_{\rm{n}}} \propto X_{\rm{q}}^{- 0.118}X_{\rm{s}}^{0.0013},$$

so that for a hyperfine transition

$${\nu _{{\rm{hfs}}}} \propto \alpha _{{\rm{EM}}}^{2 + {\kappa _\alpha}}X_{\rm{q}}^{{\kappa _{\rm{q}}}}X_{\rm{s}}^{{\kappa _{\rm{s}}}}\bar \mu.$$

Both coefficients can be computed, which makes it possible to draw constraints on the independent time variation of Xq, Xs and Xe.

To simplify, we may assume that Xq ∝ Xs, which is motivated by the Higgs mechanism of mass generation, so that the dependence on the quark masses reduces to κ = ½ (κq + κs). For instance, we have

$${\kappa _{{\rm{Cs}}}} = 0.009,\quad {\kappa _{{\rm{Rb}}}} = - 0.016,\quad {\kappa _{\rm{H}}} = - 0.10.$$

For hyperfine transitions, one further needs to take into account the dependence on μ, which can be described [204] by

$${m_{\rm{p}}} \sim 3{\Lambda _{{\rm{QCD}}}}X_{\rm{q}}^{0.037}X_{\rm{s}}^{0.011},$$

so that the hyperfine frequencies behave as

$${\nu _{{\rm{hfs}}}} \propto \alpha _{{\rm{EM}}}^{2 + {\kappa _\alpha}}X_{\rm{q}}^{\kappa - 0.048}{X_{\rm{e}}},$$

in the approximation Xq ∝ Xs and where Xe = me/ΛQCD. This allows one to get constraints on the independent time variation of Xe, Xq and αEM. These constraints are model-dependent and, as an example, Table III of [210] compares the values of the sensitivity κ when different nuclear effects are considered. For instance, it can vary from 0.127, 0.044 to 0.009 for cesium according to whether one includes only valence nucleon, non-valence non-nucleon or effect of the quark mass on the spin-spin interaction. Thus, it is a very promising framework, which still needs to be developed and the accuracy of which must be quantified in detail.
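The origin of the shift −0.048 in the exponent of Xq can be checked with a short Python sketch. It only combines numbers quoted above, under the stated approximation Xq ∝ Xs; the constant and function names are hypothetical.

```python
# Exponents of X_q and X_s in m_p / Lambda_QCD, as quoted in the text.
MP_EXP_Q = 0.037
MP_EXP_S = 0.011

# Quark-mass sensitivities kappa of the g-factors, as quoted in the text.
KAPPA_QUARK = {"Cs": 0.009, "Rb": -0.016, "H": -0.10}


def mu_bar_quark_exponent():
    """Exponent of X_q in mu_bar = m_e / m_p when X_q is proportional to X_s.

    Since m_e / m_p = X_e * (Lambda_QCD / m_p), the proton-mass exponents
    enter with a minus sign and add up.
    """
    return -(MP_EXP_Q + MP_EXP_S)


def hyperfine_quark_exponent(species):
    """Total exponent of X_q in the hyperfine frequency of the given species."""
    return KAPPA_QUARK[species] + mu_bar_quark_exponent()


for species in KAPPA_QUARK:
    print(species, round(hyperfine_quark_exponent(species), 3))
```

With these inputs the mass-ratio contribution dominates over the g-factor contribution for caesium and rubidium, which illustrates why the quoted model dependence of κ matters for the final constraint.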

Future evolutions

Further progress in the near future is expected mainly through three types of developments:

  • New systems: Many new systems with enhanced sensitivity [171, 200, 202, 205, 421] to some fundamental constants have recently been proposed. Other atomic systems are considered, such as, e.g., the hyperfine transitions in the electronic ground state of cold, trapped, hydrogen-like highly charged ions [44, 199, 448], or ultra-cold atom and molecule systems near the Feshbach resonances [98], where the scattering length is extremely sensitive to μ.

    Concerning diatomic molecules, it was shown that this sensitivity can be enhanced in transitions between narrow close levels of different nature [13, 15]. In such transitions, the fine structure mainly depends on the fine-structure constant, νfs ∼ (ZαEM)²R∞c, while the vibrational levels depend mainly on the electron-to-proton mass ratio and the reduced mass of the molecule, \({\nu _{\rm{V}}} \sim M_r^{- 1/2}{{\bar \mu}^{1/2}}{R_\infty}c\). There could be a cancellation between the two frequencies when ν = νhf − nνv ∼ 0 with n a positive integer. It follows that δν/ν will be proportional to K = νhf/ν so that the sensitivity to αEM and μ can be enhanced for these particular transitions. A similar effect occurs between transitions with hyperfine structures, for which the sensitivity to αEM can reach 600, for instance for 139La32S or silicon monobromide [42], which allows one to constrain \({\alpha _{{\rm{EM}}}}{{\bar \mu}^{- 1/4}}\).

    Nuclear transitions, such as an optical clock based on a very narrow ultraviolet nuclear transition between the ground and first excited states in 229Th, are also under consideration. Using a Walecka model for the nuclear potential, it was concluded [199] that the sensitivity of the transition to the fine-structure constant and quark mass was typically

    $${{\delta \omega} \over \omega} \sim {10^5}\left({4{{\delta {\alpha _{{\rm{EM}}}}} \over {{\alpha _{{\rm{EM}}}}}} + {{\delta {X_{\rm{q}}}} \over {{X_{\rm{q}}}}} - 10{{\delta {X_{\rm{s}}}} \over {{X_{\rm{s}}}}}} \right),$$

    which roughly provides a five orders of magnitude amplification and can lead to a constraint at the level of 10⁻²⁴/yr on the time variation of Xq. Such a method is promising and would offer different sensitivities to systematic effects compared to atomic clocks. However, this sensitivity is not clearly established since different nuclear calculations do not agree [46, 247].

  • Atomic clocks in space (ACES): An improvement of at least an order of magnitude on current constraints can be achieved in space with the PHARAO/ACES project [433, 444] of the European Space Agency. PHARAO (Projet d’Horloge Atomique par Refroidissement d’Atomes en Orbite) combines laser cooling techniques and a microgravity environment in a satellite orbit. It aims at achieving time and frequency transfer with stability better than 10⁻¹⁶.

    The SAGAS (Search for Anomalous Gravitation using Atomic Sensors) project aims at flying highly sensitive optical atomic clocks and cold atom accelerometers on a solar system trajectory on a time scale of 10 years. It could test the constancy of the fine-structure constant along the satellite worldline, which, in particular, can set a constraint on its spatial variation of the order of 10⁻⁹ [433, 547].

  • Theoretical developments: We stress once more that the interpretation of the experiments requires a good theoretical understanding of the systems, but also that the constraints we draw on fundamental constants such as the quark masses are conditional on our theoretical modeling, hence on hypotheses about a unification scheme as well as about nuclear physics. The accuracy and the robustness of these steps need to be determined, e.g., by taking into account the dependence on the nuclear radius [154].

The Oklo phenomenon

A natural nuclear reactor

Oklo is the name of a town in the Gabon republic (West Africa) where an open-pit uranium mine is situated. About 1.8 × 10⁹ yr ago (corresponding to a redshift of ∼ 0.14 with the cosmological concordance model), in one of the rich veins of uranium ore, a natural nuclear reactor went critical, consumed a portion of its fuel and then shut down a few million years later (see, e.g., [509] for more details). This phenomenon was discovered by the French Commissariat à l’Énergie Atomique in 1972 while monitoring for uranium ores [382]. Sixteen natural uranium reactors have been identified. Well-studied reactors include the zone RZ2 (about 60 bore-holes, 1800 kg of 235U fissioned during 8.5 × 10⁵ yr) and zone RZ10 (about 13 bore-holes, 650 kg of 235U fissioned during 1.6 × 10⁵ yr).

The existence of such a natural reactor was predicted by P. Kuroda [303] who showed that under favorable conditions, a spontaneous chain reaction could take place in rich uranium deposits. Indeed, two billion years ago, uranium was naturally enriched (due to the difference of decay rate between 235U and 238U) and 235U represented about 3.68% of the total uranium (compared with 0.72% today and to the 3–5% enrichment used in most commercial reactors). Besides, in Oklo the conditions were favorable: (1) the concentration of neutron absorbers, which prevent the neutrons from being available for the chain fission, was low; (2) water played the role of moderator (the zones RZ2 and RZ10 operated at a depth of several thousand meters, so that the water pressure and temperature were close to those of pressurized water reactors, about 20 MPa and 300°C) and slowed down fast neutrons so that they could interact with other 235U nuclei and (3) the reactor was large enough so that the neutrons did not escape faster than they were produced. It is estimated that the Oklo reactor produced a power of 10 to 50 kW. This explanation is backed up by the substantial depletion of 235U as well as a correlated peculiar distribution of some rare-earth isotopes. These rare-earth isotopes are abundantly produced during the fission of uranium and, in particular, the strong neutron absorbers like \(_{62}^{149}{\rm{Sm,}}\,_{63}^{151}{\rm{Eu,}}\,_{64}^{155}{\rm{Gd}}\) and \(_{64}^{157}{\rm{Gd}}\) are found in very small quantities in the reactor.
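The natural enrichment quoted above follows from the different decay rates of the two uranium isotopes and can be checked with a minimal Python sketch. The half-lives used below are standard values that we assume here (they are not given in the text), the 234U contribution is neglected, and the function name is hypothetical.

```python
HALF_LIFE_U235_YR = 7.04e8    # assumed half-life of 235U in years
HALF_LIFE_U238_YR = 4.468e9   # assumed half-life of 238U in years
U235_FRACTION_TODAY = 0.0072  # present-day 235U fraction quoted in the text


def u235_fraction(years_ago, fraction_today=U235_FRACTION_TODAY):
    """235U fraction of total uranium at a given look-back time.

    Both isotopes are evolved backwards with pure exponential decay;
    234U and any production channel are neglected.
    """
    ratio_today = fraction_today / (1.0 - fraction_today)
    growth = 2.0 ** (years_ago / HALF_LIFE_U235_YR - years_ago / HALF_LIFE_U238_YR)
    ratio_then = ratio_today * growth
    return ratio_then / (1.0 + ratio_then)


for age in (0.0, 1.8e9, 2.0e9):
    print(age, u235_fraction(age))
```

With these inputs the enrichment two billion years ago comes out within the 3–5% range of commercial reactor fuel, in line with the figure quoted in the text.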

From the isotopic abundances of the yields, one can extract information about the nuclear reactions at the time the reactor was operational and reconstruct the reaction rates at that time. One of the key quantities measured is the ratio \(_{62}^{149}{\rm{Sm/}}_{62}^{147}{\rm{Sm}}\) of two light isotopes of samarium, which are not fission products. This ratio, of order 0.9 in normal samarium, is about 0.02 in Oklo ores. This low value is interpreted [465] as the depletion of \(_{62}^{149}{\rm{Sm}}\) by thermal neutrons produced by the fission process, to which it was exposed while the reactor was active. The capture cross section of thermal neutrons by \(_{62}^{149}{\rm{Sm}}\)

$${n+_{\ 62}^{\ 149}{\rm Sm} \rightarrow _{\ 62}^{\ 150}{\rm Sm}+\gamma}$$
(38)

is dominated by a capture resonance of a neutron of energy of about 0.1 eV (E_r = 97.3 meV today). The existence of this resonance is a consequence of a near cancellation between the electromagnetic repulsive force and the strong interaction.

Shlyakhter [465] pointed out that this phenomenon can be used to set a constraint on the time variation of fundamental constants. His argument can be summarized as follows.

  • First, the cross section σ(n, γ) strongly depends on the energy of a resonance at E r = 97.3 meV.

  • Geochemical data allow one to determine the isotopic composition of various elements, such as uranium, neodymium, gadolinium and samarium. Gadolinium and neodymium allow one to determine the fluence (integrated flux over time) of the neutrons, while both gadolinium and samarium are strong neutron absorbers.

  • From these data, one deduces the value of the cross section averaged over the neutron flux, \({{\hat \sigma}_{149}}\). This value depends on hypotheses about the geometry of the reactor zone.

  • The range of allowed value of \({{\hat \sigma}_{149}}\) was translated into a constraint on E r . This step involves an assumption on the form and temperature of the neutron spectrum.

  • E_r was related to some fundamental constants, which involves a model of the nucleus.

In conclusion, we have different steps, which all involve assumptions:

  • Isotopic compositions and geophysical parameters are measured in a given set of bore-holes in each zone. A choice has to be made on the samples to use, in order, e.g., to ensure that they are not contaminated.

  • With hypotheses on the geometry of the reactor and on the spectrum and temperature of the neutron flux, one can deduce the effective value of the cross sections of neutron absorbers (such as samarium and gadolinium). This requires one to solve a network of nuclear reactions describing the fission.

  • One can then infer the value of the resonance energy E r , which again depends on the assumptions on the neutron spectrum.

  • E_r needs to be related to fundamental constants, which involves a model of the nucleus and high-energy physics hypotheses.

We shall now detail the assumptions used in the various analyses that have been performed since the pioneering work of [465].

Constraining the shift of the resonance energy

Cross sections. The cross section of the neutron capture (38) strongly depends on the energy of a resonance at E r = 97.3 meV and is well described by the Breit-Wigner formula

$${\sigma _{(n,\gamma)}}(E) = {{{g_0}\pi} \over 2}{{{\hbar ^2}} \over {{m_{\rm{n}}}E}}{{{\Gamma _{\rm{n}}}{\Gamma _\gamma}} \over {{{(E - {E_r})}^2} + {\Gamma ^2}/4}}$$
(39)

where g0 = (2J + 1)(2s + 1)⁻¹(2I + 1)⁻¹ is a statistical factor, which depends on the spin of the incident neutron s = 1/2, of the target nucleus I, and of the compound nucleus J. For the reaction (38), we have g0 = 9/16. The total width Γ ≡ Γn + Γγ is the sum of the neutron partial width Γn = 0.533 meV (at E_r = 97.3 meV; it scales as \(\sqrt E\) in the center of mass) and of the radiative partial width Γγ = 60.5 meV. \(_{64}^{155}{\rm{Gd}}\) has a resonance at E_r = 26.8 meV with Γn = 0.104 meV, Γγ = 108 meV and g0 = 5/8 while \(_{64}^{157}{\rm{Gd}}\) has a resonance at E_r = 31.4 meV with Γn = 0.470 meV, Γγ = 106 meV and g0 = 5/8.
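The Breit-Wigner formula (39) is easy to evaluate numerically, as in the following Python sketch for the samarium resonance. The resonance parameters are those quoted above; the physical constants, the spins I = 7/2 and J = 4 (chosen because they reproduce g0 = 9/16) and all function names are our own assumptions. Energies are in eV and the result is in barn.

```python
import math

HBAR_C_EV_CM = 1.973269804e-5  # hbar * c in eV cm (assumed value)
M_N_C2_EV = 939.56542052e6     # neutron rest energy in eV (assumed value)
BARN_CM2 = 1.0e-24


def statistical_factor(j, i, s=0.5):
    """g0 = (2J + 1) / ((2s + 1)(2I + 1))."""
    return (2.0 * j + 1.0) / ((2.0 * s + 1.0) * (2.0 * i + 1.0))


def breit_wigner_barn(energy_ev, e_r_ev, gamma_n_r_ev, gamma_g_ev, g0):
    """Single-resonance capture cross section of Eq. (39), in barn.

    gamma_n_r_ev is the neutron width at the resonance energy; it is
    rescaled as sqrt(E) away from the resonance.
    """
    gamma_n = gamma_n_r_ev * math.sqrt(energy_ev / e_r_ev)
    gamma = gamma_n + gamma_g_ev
    # hbar^2 / (m_n E) expressed in cm^2 through (hbar c)^2 / (m_n c^2 E).
    area_cm2 = HBAR_C_EV_CM ** 2 / (M_N_C2_EV * energy_ev)
    lorentz = gamma_n * gamma_g_ev / ((energy_ev - e_r_ev) ** 2 + gamma ** 2 / 4.0)
    return 0.5 * g0 * math.pi * area_cm2 * lorentz / BARN_CM2


G0_SM149 = statistical_factor(j=4, i=3.5)
for e in (0.0253, 0.0973):
    print(e, breit_wigner_barn(e, 0.0973, 0.533e-3, 60.5e-3, G0_SM149))
```

Evaluated at the thermal energy of 25.3 meV, this single resonance already yields a cross section of several tens of kb, which illustrates why samarium-149 is such an efficient neutron absorber and why its abundance is so sensitive to the position of the resonance.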

As explained in Section 3.2.1, this cross section cannot be measured from the Oklo data, which only allow one to measure its value averaged over the neutron flux n(v, T), T being the temperature of the moderator. It is conventionally defined as

$$\hat \sigma = {1 \over {n{v_0}}}\int {{\sigma _{(n,\gamma)}}n(v,T)v{\rm{d}}v,}$$
(40)

where the velocity v0 = 2200 m · s⁻¹ corresponds to an energy E0 = 25.3 meV and \(\upsilon = \sqrt {2E/{m_n}}\), instead of

$$\bar \sigma = {{\int {{\sigma _{(n,\gamma)}}n(v,T)v{\rm{d}}v}} \over {\int {n(v,T)v{\rm{d}}v}}}.$$

When the cross section behaves as σ = σ0v0/v, which is the case for nuclei known as “1/v-absorbers”, \(\hat \sigma = {\sigma _0}\) and does not depend on the temperature, whatever the distribution n(v). In a similar way, the effective neutron flux is defined as

$$\hat \phi = {v_0}\int {n(v,T){\rm{d}}v},$$
(41)

which differs from the true flux

$$\phi = \int {n(v,T)v{\rm{d}}v}.$$

However, since \(\bar \sigma \phi = \hat \sigma \hat \phi\), the reaction rates are not affected by these definitions.
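These definitions can be checked numerically with a short Python sketch. It assumes, for illustration only, that n(v, T) is a Maxwell speed distribution normalized to unit density; the constants, the midpoint integration rule and all function names are our own choices. The sketch verifies that a 1/v-absorber has an effective cross section equal to σ0 and that the two definitions give the same reaction rate.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K (assumed value)
M_N_C2_EV = 939.56542052e6     # neutron rest energy in eV (assumed value)
C_M_PER_S = 299792458.0
V0 = 2200.0                    # reference velocity in m/s


def kinetic_energy_ev(v):
    """Non-relativistic kinetic energy of a neutron of speed v (m/s), in eV."""
    return 0.5 * M_N_C2_EV * (v / C_M_PER_S) ** 2


def maxwell_speed_density(v, temp_k):
    """Maxwell speed distribution normalized to unit total density."""
    a = M_N_C2_EV / (C_M_PER_S ** 2 * K_B_EV_PER_K * temp_k)  # m_n / (k_B T)
    return 4.0 * math.pi * v * v * (a / (2.0 * math.pi)) ** 1.5 * math.exp(-0.5 * a * v * v)


def integrate(func, v_max, steps=20000):
    """Midpoint rule on (0, v_max); never evaluates func at v = 0."""
    h = v_max / steps
    return h * sum(func((k + 0.5) * h) for k in range(steps))


def flux_averages(sigma, temp_k, v_max=3.0e4):
    """Return (sigma_hat, phi_hat, sigma_bar, phi) for a cross section sigma(v)."""
    density = integrate(lambda v: maxwell_speed_density(v, temp_k), v_max)
    phi = integrate(lambda v: maxwell_speed_density(v, temp_k) * v, v_max)
    rate = integrate(lambda v: sigma(v) * maxwell_speed_density(v, temp_k) * v, v_max)
    return rate / (density * V0), V0 * density, rate / phi, phi


sigma_hat, phi_hat, sigma_bar, phi = flux_averages(lambda v: 100.0 * V0 / v, 600.0)
print(sigma_hat, sigma_bar, sigma_hat * phi_hat, sigma_bar * phi)
print(kinetic_energy_ev(V0) * 1e3, kinetic_energy_ev(V0) / K_B_EV_PER_K - 273.15)
```

The last line converts v0 into the corresponding energy in meV and temperature in degrees Celsius, which can be compared with the values of E0 and T0 quoted in this section.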

Extracting the effective cross section from the data. To “measure” the value of \({\hat \sigma}\) from the Oklo data, we need to solve the nuclear reaction network that controls the isotopic composition during the fission.

The samples of the Oklo reactors were exposed [382] to an integrated effective fluence \(\int {\hat \phi {\rm{d}}t}\) of about 10²¹ neutron · cm⁻² = 1 kb⁻¹. It implies that any process with a cross section smaller than 1 kb can safely be neglected in the computation of the abundances. This includes neutron capture by \(_{62}^{144}{\rm{Sm}}\) and \(_{62}^{148}{\rm{Sm}}\), as well as by \(_{64}^{155}{\rm{Gd}}\) and \(_{64}^{157}{\rm{Gd}}\). On the other hand, the fission of \(_{92}^{235}{\rm{U}}\), the capture of neutron by \(_{60}^{143}{\rm{Nd}}\) and by \(_{62}^{149}{\rm{Sm}}\) with respective cross sections σ5 ≃ 0.6 kb, σ143 ∼ 0.3 kb and σ149 ≥ 70 kb are the dominant processes. It follows that the equations of evolution for the number densities N147, N148, N149 and N235 of \(_{62}^{147}{\rm{Sm,}}\,_{62}^{148}{\rm{Sm,}}\,_{62}^{149}{\rm{Sm}}\) and \(_{92}^{235}{\rm{U}}\) take the form

$${{{\rm{d}}{N_{147}}} \over {\hat \phi {\rm{d}}t}} = - {\hat \sigma _{147}}{N_{147}} + {\hat \sigma _{f235}}{y_{147}}{N_{235}}$$
(42)
$${{{\rm{d}}{N_{148}}} \over {\hat \phi {\rm{d}}t}} = {\hat \sigma _{147}}{N_{147}}$$
(43)
$${{{\rm{d}}{N_{149}}} \over {\hat \phi {\rm{d}}t}} = - {\hat \sigma _{149}}{N_{149}} + {\hat \sigma _{f235}}{y_{149}}{N_{235}}$$
(44)
$${{{\rm{d}}{N_{235}}} \over {\hat \phi {\rm{d}}t}} = - {\sigma _5}{N_{235}},$$
(45)

where y_i denotes the yield of the corresponding element in the fission of \(_{92}^{235}{\rm{U}}\) and \({{\hat \sigma}_5}\) is the fission cross section. This system can be integrated under the assumption that the cross sections and the neutron flux are constant, and the result compared with the natural abundances of samarium to extract the value of \({{\hat \sigma}_{149}}\) at the time of the reaction. Here, the system has been closed by introducing a modified absorption cross section [123] \(\sigma _5^{\ast}\) to take into account the fission and the capture as well as the formation from the α-decay of \(_{94}^{239}{\rm{Pu}}\). One can instead extend the system by considering \(_{94}^{239}{\rm{Pu}}\) and \(_{92}^{235}{\rm{U}}\) (see [234]). While most studies focus on samarium, [220] also includes gadolinium even though it is not clear whether it can reliably be measured [123]. They give similar results.
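As an illustration of how such a system can be integrated, the following Python sketch solves Equations (42) to (45) with a fourth-order Runge-Kutta scheme, using the fluence τ = ∫ φ̂ dt (in kb⁻¹) as the independent variable, and compares the result with the closed-form solution for N149. Only the values 0.6 kb and 70 kb are taken from the text; the 147Sm capture cross section, the fission yields, the 235U depletion cross section (set equal to the fission value) and the initial abundances are placeholder assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# Effective cross sections in kb, so that the fluence tau is in kb^-1.
SIGMA_F235 = 0.6   # fission of 235U (value quoted in the text)
SIGMA_5 = 0.6      # 235U depletion in Eq. (45) (assumed equal to fission)
SIGMA_149 = 70.0   # capture on 149Sm (lower bound quoted in the text)
SIGMA_147 = 0.06   # capture on 147Sm (placeholder assumption)
Y147, Y149 = 0.0225, 0.0108  # fission yields (placeholder assumptions)


def rhs(n):
    """Right-hand sides of Eqs. (42)-(45) for n = (N147, N148, N149, N235)."""
    n147, n148, n149, n235 = n
    return (
        -SIGMA_147 * n147 + SIGMA_F235 * Y147 * n235,
        SIGMA_147 * n147,
        -SIGMA_149 * n149 + SIGMA_F235 * Y149 * n235,
        -SIGMA_5 * n235,
    )


def integrate_rk4(n0, tau_end, steps=2000):
    """Classical RK4 integration from tau = 0 to tau_end."""
    h = tau_end / steps
    n = tuple(n0)
    for _ in range(steps):
        k1 = rhs(n)
        k2 = rhs(tuple(x + 0.5 * h * k for x, k in zip(n, k1)))
        k3 = rhs(tuple(x + 0.5 * h * k for x, k in zip(n, k2)))
        k4 = rhs(tuple(x + h * k for x, k in zip(n, k3)))
        n = tuple(x + h * (a + 2 * b + 2 * c + d) / 6.0
                  for x, a, b, c, d in zip(n, k1, k2, k3, k4))
    return n


def n149_analytic(n149_0, n235_0, tau):
    """Closed-form N149(tau) for constant cross sections."""
    source = SIGMA_F235 * Y149 * n235_0
    decay = math.exp(-SIGMA_149 * tau)
    return n149_0 * decay + source * (math.exp(-SIGMA_5 * tau) - decay) / (SIGMA_149 - SIGMA_5)


start = (1.0, 0.5, 0.9, 100.0)  # arbitrary initial abundances (assumption)
final = integrate_rk4(start, tau_end=1.0)
print(final, n149_analytic(start[2], start[3], 1.0))
```

Because the capture cross section of 149Sm is so large, its abundance quickly relaxes to a quasi-equilibrium between production by fission and destruction by capture, so that the final ratio of 149Sm to 147Sm is directly sensitive to \({{\hat \sigma}_{149}}\).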

By comparing the solution of this system with the measured isotopic composition, one can deduce the effective cross section. At this step, the different analyses [465, 415, 123, 220, 305, 416, 234] differ in the choice of the data. The measured values of \({{\hat \sigma}_{149}}\) can be found in these articles. They are given for a given zone (RZ2, RZ10 mainly) with a number that corresponds to the number of the bore-hole and the depth (e.g., in Table 2 of [123], SC39-1383 means that we are dealing with the bore-hole number 39 at a depth of 13.83 m). Recently, another approach [416, 234] was proposed in order to take into account the geometry and details of the reactor. It relies on a full-scale Monte-Carlo simulation and a computer model of the reactor zone RZ2 [416] and both RZ2 and RZ10 [234] and allows one to take into account the spatial distribution of the neutron flux.

Determination of \(E_r\). To convert the constraint on the effective cross section into a constraint on the resonance energy, one needs to specify the neutron spectrum. In the earlier studies [465, 415], a Maxwell distribution,

$${n_{{\rm{th}}}}(v,T) = {\left({{{m_{\rm{n}}} \over {2\pi {k_{\rm{B}}}T}}} \right)^{3/2}}{{\rm{e}}^{- {{{m_{\rm{n}}}{v^2}} \over {2{k_{\rm{B}}}T}}}},$$

was assumed for the neutrons, with a temperature of 20°C, which is probably too small. Then \(v_0\) is the mean velocity at a temperature \({T_0} = {m_{\rm{n}}}v_0^2/2{k_{\rm{B}}} = 20.4^\circ {\rm{C}}\). [123, 220] also assume a Maxwell distribution but let the moderator temperature vary, so that they deduce an effective cross section \(\hat \sigma ({E_r},T)\). They respectively restricted the temperature range to 180°C < T < 700°C and 200°C < T < 400°C, based on geochemical analysis. The advantage of the Maxwell distribution assumption is that it avoids relying on a particular model of the Oklo reactor, since the spectrum is determined solely by the temperature.
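The reference temperature quoted above can be checked directly from \({T_0} = {m_{\rm{n}}}v_0^2/2{k_{\rm{B}}}\). The sketch below assumes \(v_0 = 2200\) m/s, the conventional thermal-neutron reference speed; this value is not stated in the text and is an assumption, as is the function name.

```python
M_N = 1.67492749804e-27   # neutron mass in kg (CODATA value)
K_B = 1.380649e-23        # Boltzmann constant in J/K (exact SI value)

def maxwell_temperature_celsius(v0):
    """Temperature T0 = m_n * v0**2 / (2 * k_B), converted to degrees Celsius."""
    return M_N * v0 ** 2 / (2.0 * K_B) - 273.15

# Assumed reference speed of 2200 m/s (not given in the text).
t0 = maxwell_temperature_celsius(2200.0)
print(f"T0 = {t0:.1f} C")
```

With this input the result agrees with the 20.4°C quoted in the text to within rounding.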

It was then noted [305, 416] that above an energy of several eV, the neutron spectrum shifted to a 1/E tail because of the absorption of neutrons in uranium resonances. Thus, the distribution was adjusted to include an epithermal distribution

$$n(v) = (1 - f){n_{{\rm{th}}}}(v,T) + f{n_{{\rm{epi}}}}(v),$$

with \({n_{{\rm{epi}}}} = v_c^2/{v^2}\) for \(v > v_c\) and vanishing otherwise. \(v_c\) is a cut-off velocity that also needs to be specified. The effective cross section can then be parameterized [234] as

$$\hat \sigma = g(T){\sigma _0} + {r_0}I,$$
(46)

where g(T) is a measure of the departure of σ from the 1/v behavior, I is related to the resonance integral of the cross section and \(r_0\) is the Oklo reactor spectral index, which characterizes the contribution of the epithermal neutrons to the cross section. Among the unknown parameters, the most uncertain is probably the amount of water present at the time of the reaction. [234] chooses to adjust it so that \(r_0\) matches the experimental values.

These hypotheses on the neutron spectrum and on the temperature, as well as the constraint on the shift of the resonance energy, are summarized in Table 8. Many analyses [220, 416, 234] find two branches for \(\Delta E_r = E_r - E_{r0}\), with one (the left branch) indicating a variation of \(E_r\). Note that these two branches disappear when the temperature is higher, since \(\hat \sigma ({E_r},T)\) is more peaked when T decreases, but remain in any analysis at low temperature. This shows the importance of a good determination of the temperature. Note that the analysis of [416] indicates that the curves \(\hat \sigma (T,{E_r})\) lie appreciably lower than for a Maxwell distribution and that [220] argues that the left branch is hardly compatible with the gadolinium data.

Table 8 Summary of the analysis of the Oklo data. The principal assumptions to infer the value of the resonance energy \(E_r\) are the form of the neutron spectrum and its temperature.

From the resonance energy to fundamental constants

The energy of the resonance depends a priori on many constants, since the existence of such a resonance is mainly the consequence of a near cancellation between the electromagnetic repulsive force and the strong interaction. But, since no full analytical understanding of the energy levels of heavy nuclei is available, the role of each constant is difficult to disentangle.

In his first analysis, Shlyakhter [465] stated that for the neutron, the nucleus appears as a potential well with a depth \(V_0 \simeq 50\) MeV. He attributed the change of the resonance energy to a modification of the strong interaction coupling constant and concluded that \(\Delta g_{\rm S}/g_{\rm S} \sim \Delta E_r/V_0\). Then, arguing that the Coulomb force increases the average inter-nuclear distance by about 2.5% for A ∼ 150, he concluded that \(\Delta \alpha_{\rm EM}/\alpha_{\rm EM} \sim 20\,\Delta g_{\rm S}/g_{\rm S}\), leading to \(\vert \dot \alpha_{\rm EM}/\alpha_{\rm EM}\vert < 10^{-17}\,{\rm yr}^{-1}\), which can be translated to

$$\vert \Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}}\vert < 1.8 \times {10^{- 8}}.$$
(47)

The subsequent analyses focused on the fine-structure constant and ignored the strong interaction. Damour and Dyson [123] related the variation of \(E_r\) to the fine-structure constant by taking into account that the radiative capture of the neutron by \(_{62}^{149}{\rm{Sm}}\) corresponds to the existence of an excited quantum state of \(_{62}^{150}{\rm{Sm}}\) (so that \({E_r} = E_{150}^{\ast} - {E_{149}} - {m_n}\)) and by assuming that the nuclear energy is independent of αEM. It follows that the variation of αEM can be related to the difference of the Coulomb binding energy of these two states. The computation of this latter quantity is difficult and must be related to the mean-square radii of the protons in the isotopes of samarium. In particular, this analysis [123] showed that the Bethe-Weizsäcker formula overestimates by about a factor of 2 the αEM-sensitivity of the resonance energy. It follows from this analysis that

$${\alpha _{{\rm{EM}}}}{{\Delta {E_r}} \over {\Delta {\alpha _{{\rm{EM}}}}}} \simeq - 1.1\;{\rm{MeV,}}$$
(48)

which, once combined with the constraint on \(\Delta E_r\), implies

$$- 0.9 \times {10^{- 7}} < \Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} < 1.2 \times {10^{- 7}}$$
(49)

at 2σ level, corresponding to the range \(-6.7 \times 10^{-17}\,{\rm yr}^{-1} < \dot \alpha_{\rm EM}/\alpha_{\rm EM} < 5.0 \times 10^{-17}\,{\rm yr}^{-1}\) if \(\dot \alpha_{\rm EM}\) is assumed constant. This tight constraint arises from the large amplification between the resonance energy (∼ 0.1 eV) and the sensitivity (∼ 1 MeV). A re-analysis of these data, also including the gadolinium data of [220], found the favored result \(\dot \alpha_{\rm EM}/\alpha_{\rm EM} = (-0.2 \pm 0.8) \times 10^{-17}\,{\rm yr}^{-1}\), which corresponds to

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 0.36 \pm 1.44{)} \times {10^{- 8}}$$
(50)

and the other branch (indicating a variation; see Table 8) leads to \(\dot \alpha_{\rm EM}/\alpha_{\rm EM} = (4.9 \pm 0.4) \times 10^{-17}\,{\rm yr}^{-1}\). This non-zero result cannot be eliminated.
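The correspondence between the fractional variations and the mean rates quoted above can be checked by dividing by the time elapsed since the reactor operated. The sketch below assumes a linear drift and an elapsed time of 1.8 × 10^9 yr; that value is an assumption, chosen because it reproduces the quoted numbers, and the function name is hypothetical. Only magnitudes are compared, since the sign depends on the convention adopted for ΔαEM.

```python
OKLO_AGE_YR = 1.8e9  # assumed elapsed time in years (not stated in this section)

def mean_rate(delta_alpha_over_alpha, elapsed_yr=OKLO_AGE_YR):
    """Magnitude of the mean rate |alpha_dot/alpha| in yr^-1 for a linear drift."""
    return abs(delta_alpha_over_alpha) / elapsed_yr

# Fractional variations taken from Eqs. (47), (49) and (50).
cases = {
    "Eq. (47) bound": 1.8e-8,
    "Eq. (49) lower edge": -0.9e-7,
    "Eq. (49) upper edge": 1.2e-7,
    "Eq. (50) central value": -0.36e-8,
    "Eq. (50) uncertainty": 1.44e-8,
}
for label, value in cases.items():
    print(f"{label}: {mean_rate(value):.2e} per yr")
```

Under these assumptions the magnitudes agree with the rates quoted in the text to the precision given there.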

The more recent analyses, based on a modification of the neutron spectrum, lead respectively to [416]

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(}3.85 \pm 5.65{)} \times {10^{- 8}}$$
(51)

and [234]

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 0.65 \pm 1.75{)} \times {10^{- 8}},$$
(52)

at a 95% confidence level, both using the formalism of [123].

Olive et al. [399], inspired by grand unification models, reconsidered the analysis of [123] by letting all gauge and Yukawa couplings vary. Working within the Fermi gas model, they derived the over-riding scale dependence of the terms that determine the binding energy of heavy nuclei. Parameterizing the mass of the hadrons as \(m_i \propto \Lambda_{\rm QCD}(1 + \kappa_i m_{\rm q}/\Lambda_{\rm QCD} + \ldots)\), they estimated that the nuclear Hamiltonian was proportional to \(m_{\rm q}/\Lambda_{\rm QCD}\) at lowest order, which allows one to estimate that the energy of the resonance is related to the quark mass by

$${{\Delta {E_r}} \over {{E_r}}} \sim {(}2.5 - 10{)} \times {10^7}\Delta \ln \left({{{{m_{\rm{q}}}} \over {{\Lambda _{{\rm{QCD}}}}}}} \right).$$
(53)

Using the constraint (48), they first deduced that

$$\left\vert {\Delta \ln \left({{{{m_{\rm{q}}}} \over {{\Lambda _{{\rm{QCD}}}}}}} \right)} \right\vert < (1 - 4) \times {10^{- 8}}.$$

Then, assuming that \({m_{\rm{q}}}/{\Lambda _{{\rm{QCD}}}} \propto \alpha _{{\rm{EM}}}^{50}\) on the basis of grand unification (see Section 6.3 for details), they concluded that

$$\vert \Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}}\vert < (2 - 8) \times {10^{- 10}}.$$
(54)
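The step from the quark-mass bound to Equation (54) is a simple rescaling. The sketch below reads the grand-unification relation as a fractional change of αEM being amplified about 50-fold in \(m_{\rm q}/\Lambda_{\rm QCD}\), which is the reading that reproduces the quoted range; the function name and the default exponent argument are illustrative assumptions.

```python
EXPONENT = 50.0  # assumed: Delta ln(m_q/Lambda_QCD) ~ 50 * Delta ln(alpha_EM)

def alpha_bound_from_quark_mass(quark_mass_bound, exponent=EXPONENT):
    """Bound on |Delta alpha/alpha| implied by a bound on |Delta ln(m_q/Lambda_QCD)|."""
    return quark_mass_bound / exponent

# The range (1 - 4) x 10^-8 quoted for the quark-mass bound.
low = alpha_bound_from_quark_mass(1e-8)
high = alpha_bound_from_quark_mass(4e-8)
print(f"{low:.1e} to {high:.1e}")
```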

Similarly, [207, 467, 212] related the variation of the resonance energy to the quark mass. Their first estimate [207] assumes that it is related to the pion mass, \(m_\pi\), and that the main variation arises from the variation of the radius \(R \sim 5\,{\rm fm} + 1/m_\pi\) of the nuclear potential well of depth \(V_0\), so that

$$\delta {E_r} \sim - 2{V_0}{{\delta R} \over R} \sim 3 \times {10^8}{{\delta {m_\pi}} \over {{m_\pi}}},$$

assuming that \(R \simeq 1.2A^{1/3}r_0\), \(r_0\) being the inter-nucleon distance.

Then, in [467], the nuclear potential was described by a Walecka model, which keeps only the σ (scalar) and ω (vector) exchanges in the effective nuclear force. Their masses were related to the mass \(m_{\rm s}\) of the strange quark to get \({m_\sigma} \propto m_{\rm{s}}^{0.54}\) and \({m_\omega} \propto m_{\rm{s}}^{0.15}\). It follows that the variation of the potential well can be related to the variation of \(m_\sigma\) and \(m_\omega\), and thus to \(m_{\rm q}\), by \(V \propto m_{\rm{q}}^{- 3.5}\). The constraint (48) then implies that

$$\left\vert {\Delta \ln \left({{{{m_{\rm{s}}}} \over {{\Lambda _{{\rm{QCD}}}}}}} \right)} \right\vert < 1.2 \times {10^{- 10}}.$$

By extrapolating from light nuclei, where the N-body calculations can be performed more accurately, it was concluded [208] that the resonance energy scales as \(\Delta E_r \sim 10(\Delta \ln X_{\rm q} - 0.1\,\Delta \ln \alpha_{\rm EM})\), so that the constraints from [416] would imply that \(\Delta \ln ({X_{\rm{q}}}/\alpha _{{\rm{EM}}}^{0.1}) < 7 \times {10^{- 9}}\).

In conclusion, these last results illustrate that a detailed theoretical analysis and quantitative estimates of the nuclear physics (and QCD) aspects of the resonance shift still remain to be carried out. In particular, the interface between the perturbative QCD description and the description in terms of hadrons is not fully understood: we do not know the exact dependence of hadronic masses and coupling constants on \(\Lambda_{\rm QCD}\) and quark masses. The second problem concerns the modeling of nuclear forces in terms of the hadronic parameters.

At present, the Oklo data, while being stringent and consistent with no variation, have to be considered carefully. While a better understanding of nuclear physics is necessary to understand the full constant-dependence, the data themselves require more insight, particularly to understand the existence of the left branch.

Meteorite dating

Long-lived α- or β-decay isotopes may be sensitive probes of the variation of fundamental constants on geological times ranging typically to the age of the solar system, t ∼ (4–5) Gyr, corresponding to a mean redshift of z ∼ 0.43. Interestingly, it can be compared with the shallow universe quasar constraints. This method was initially pointed out by Wilkinson [539] and then revived by Dyson [168]. The main idea is to extract the αEM-dependence of the decay rate and to use geological samples to bound its time variation.

The sensitivity of the decay rate of a nucleus to a change of the fine-structure constant is defined, in a similar way as for atomic clocks [Equation (23)], as

$${s_\alpha} \equiv {{\partial \ln \lambda} \over {\partial \ln {\alpha _{{\rm{EM}}}}}}.$$
(55)

λ is a function of the decay energy Q. When Q is small, mainly due to an accidental cancellation between different contributions to the nuclear binding energy, the sensitivity \(s_\alpha\) may be strongly enhanced. A small variation of the fundamental constants can either stabilize or destabilize certain isotopes so that one can extract bounds on the time variation of their lifetime by comparing laboratory data to geophysical and solar system probes.

Assume some meteorites containing an isotope X that decays into Y are formed at a time \(t_\ast\). It follows that

$${N_X}(t) = {N_{X\ast}}{{\rm{e}}^{- \lambda (t - {t_\ast})}},\quad {N_Y}(t) = {N_{X\ast}}\left[ {1 - {{\rm{e}}^{- \lambda (t - {t_\ast})}}} \right] + {N_{Y\ast}}$$
(56)

if one assumes the decay rate constant. If it is varying then these relations have to be replaced by

$${N_X}(t) = {N_{X\ast}}{{\rm{e}}^{- \int\nolimits_{{t_\ast}}^t {\lambda (t{\prime}){\rm{d}}t{\prime}}}}$$

so that the value of \(N_X\) today can be interpreted with Equation (56) but with an effective decay rate of

$$\bar \lambda = {1 \over {{t_0} - {t_\ast}}}\int\nolimits_{{t_\ast}}^{{t_0}} {\lambda (t{\prime}){\rm{d}}t{\prime}.}$$
(57)
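Equation (57) states that only the time average of the decay rate is accessible. A minimal numerical sketch follows; the linear drift, the rate \(\lambda_0\) and the age used below are made-up illustrative values, and the function names are hypothetical.

```python
import math

def effective_rate(rate, t_start, t_end, steps=10_000):
    """Time average of rate(t) over [t_start, t_end], Eq. (57), by the trapezoidal rule."""
    h = (t_end - t_start) / steps
    total = 0.5 * (rate(t_start) + rate(t_end))
    total += sum(rate(t_start + i * h) for i in range(1, steps))
    return total * h / (t_end - t_start)

LAMBDA_0 = 1.5e-11  # yr^-1, illustrative value only
AGE = 4.5e9         # yr, illustrative value only

def drifting_rate(t):
    """Hypothetical decay rate rising linearly by 2% over the full age."""
    return LAMBDA_0 * (1.0 + 0.02 * t / AGE)

lam_bar = effective_rate(drifting_rate, 0.0, AGE)
# Surviving parent fraction, computed with the effective rate as in Eq. (56).
surviving_fraction = math.exp(-lam_bar * AGE)
print(lam_bar / LAMBDA_0, surviving_fraction)
```

For this linear drift the effective rate is the mean of the initial and final rates, so a sample cannot distinguish it from a constant rate 1% above \(\lambda_0\).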

From a sample of meteorites, we can measure \(\{N_X(t_0), N_Y(t_0)\}\) for each meteorite. These two quantities are related by

$${N_Y}({t_0}) = \left[ {{{\rm{e}}^{\bar \lambda ({t_0} - {t_\ast})}} - 1} \right]{N_X}({t_0}) + {N_{Y\ast}},$$

so that the data should lie on a line (since \(N_{X\ast}\) is a priori different for each meteorite), called an “isochron”, the slope of which determines \(\bar \lambda ({t_0} - {t_\ast})\). It follows that meteorite data only provide an average measure of the decay rate, which complicates the interpretation of the constraints (see [219, 218] for explicit examples). To derive a bound on the variation of the constant we also need a good estimate of \(t_0 - t_\ast\), which can be obtained from the same analysis for an isotope with a small sensitivity \(s_\alpha\), as well as an accurate laboratory measurement of the decay rate.
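The isochron construction can be illustrated with a synthetic, noise-free sample: meteorites sharing the same \(N_{Y\ast}\) but different \(N_{X\ast}\) fall on a line whose slope gives \(\bar \lambda ({t_0} - {t_\ast})\). All numbers in the sketch below are invented for illustration, and the helper `fit_line` is a plain least-squares fit, not the procedure of any cited analysis.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

LAMBDA_BAR_DT = 0.075                    # input value of lambda_bar * (t0 - t_*)
N_Y_INITIAL = 0.12                       # common initial daughter abundance
initial_parents = [0.5, 1.0, 2.0, 3.5]   # N_X* differs from meteorite to meteorite

# Abundances today, following Eq. (56) with the effective rate.
parents_today = [n * math.exp(-LAMBDA_BAR_DT) for n in initial_parents]
daughters_today = [n * (1.0 - math.exp(-LAMBDA_BAR_DT)) + N_Y_INITIAL
                   for n in initial_parents]

slope, intercept = fit_line(parents_today, daughters_today)
recovered = math.log1p(slope)  # because slope = exp(lambda_bar * dt) - 1
print(recovered, intercept)
```

The fit returns the input product \(\bar \lambda ({t_0} - {t_\ast})\) and the common \(N_{Y\ast}\) as intercept; separating \(\bar \lambda\) from the age still requires an independent date, as noted in the text.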

Long lived α-decays

The α-decay rate, λ, of a nucleus \(_Z^A{\rm{X}}\) of charge Z and mass number A,

$${}_{Z + 2}^{A + 4}{\rm{X}} \rightarrow {}_Z^A{\rm{X}} + {}_2^4{\rm{He}},$$
(58)

is governed by the penetration of the Coulomb barrier that can be described by the Gamow theory. It is well approximated by

$$\lambda \simeq \Lambda ({\alpha _{{\rm{EM}}}},v)\exp \left({- 4\pi Z{\alpha _{{\rm{EM}}}}{c \over v}} \right),$$
(59)

where \(v/c = \sqrt {Q/2{m_{\rm{p}}}{c^2}}\) is the escape velocity of the α particle. Λ is a function that varies slowly with αEM and Q. It follows that the sensitivity to the fine-structure constant is

$${s_\alpha} \simeq - 4\pi Z{{{\alpha _{{\rm{EM}}}}} \over {\sqrt {Q/2{m_{\rm{p}}}}}}\left({1 - {1 \over 2}{{{\rm{d}}\ln Q} \over {{\rm{d}}\ln {\alpha _{{\rm{EM}}}}}}} \right).$$
(60)

The decay energy is related to the nuclear binding energies B(A, Z) of the different nuclei by

$$Q = B(A,Z) + {B_\alpha} - B(A + 4,Z + 2)$$

with \(B_\alpha = B(4, 2)\). Physically, an increase of αEM induces an increase in the height of the Coulomb barrier at the nuclear surface while the depth of the nuclear potential well below the top remains the same. It follows that the α-particle escapes with a greater energy but at the same energy below the top of the barrier. Since the barrier becomes thinner at a given energy below its top, the penetrability increases. This computation neglects the effect of a variation of αEM on the size of the nucleus, which can be estimated to dilate by about 1% if αEM increases by 1%.

As a first insight, when focusing on the fine-structure constant, one can estimate \(s_\alpha\) by varying only the Coulomb term of the binding energy. Its order of magnitude can be estimated from the Bethe-Weizsäcker formula

$${E_{{\rm{EM}}}} = 98.25{{Z(Z - 1)} \over {{A^{1/3}}}}{\alpha _{{\rm{EM}}}}\;{\rm{MeV}}.$$
(61)

Table 9 summarizes the most sensitive isotopes, with the sensitivities derived from a semi-empirical analysis for a spherical nucleus [399]. They are in good agreement with the ones derived from Equation (61) (e.g., for \(^{238}\)U, one would obtain \(s_\alpha = 540\) instead of \(s_\alpha = 548\)).
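As an order-of-magnitude illustration, Equations (60) and (61) can be combined numerically. The sketch below varies only the Coulomb term of the binding energies, takes Z as the charge of the daughter nucleus as in Equation (58), and assumes a decay energy Q = 4.27 MeV for \(^{238}\)U; this input and the function names are assumptions, so only agreement at the level of a few percent with the values quoted above should be expected.

```python
import math

ALPHA_EM = 1.0 / 137.035999  # fine-structure constant
M_P = 938.272                # proton mass in MeV

def coulomb_energy(a, z):
    """Coulomb term of Eq. (61), in MeV."""
    return 98.25 * z * (z - 1) / a ** (1.0 / 3.0) * ALPHA_EM

def alpha_decay_sensitivity(a, z, q):
    """s_alpha of Eq. (60) for a daughter nucleus (A, Z) and decay energy q in MeV.

    The Coulomb energy enters each binding energy with a minus sign, so
    dQ/dln(alpha) = E_EM(A+4, Z+2) - E_EM(A, Z) - E_EM(4, 2).
    """
    dq_dlnalpha = (coulomb_energy(a + 4, z + 2)
                   - coulomb_energy(a, z)
                   - coulomb_energy(4, 2))
    gamow = 4.0 * math.pi * z * ALPHA_EM / math.sqrt(q / (2.0 * M_P))
    return -gamow * (1.0 - 0.5 * dq_dlnalpha / q)

# 238U -> 234Th + alpha, with an assumed Q = 4.27 MeV.
s_u238 = alpha_decay_sensitivity(234, 90, 4.27)
print(round(s_u238))
```

The result is positive and of order a few hundred because the Coulomb contribution to Q (tens of MeV) is much larger than Q itself.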

Table 9 Summary of the main nuclei and their physical properties that have been used in α-decay studies.

The sensitivities of all the nuclei of Table 9 are similar, so that the best constraint on the time variation of the fine-structure constant will be given by the nuclei with the smallest Δλ/λ.

Wilkinson [539] considered the most favorable case, that is the decay of \(_{92}^{238}{\rm{U}}\) for which \(s_\alpha = 548\) (see Table 9). By comparing the geological dating of the Earth by different methods, he concluded that the decay constants λ of \(^{238}\)U, \(^{235}\)U and \(^{232}\)Th have not changed by more than a factor 3 or 4 during the last 3–4 × \(10^9\) years, from which it follows that

$$\vert \Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}}\vert < 8 \times {10^{- 3}}.$$
(62)

This constraint was revised by Dyson [168] who claimed that the decay rate has not changed by more than 20% during the past \(2 \times 10^9\) years, which implies

$$\vert \Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}}\vert < 4 \times {10^{- 4}}.$$
(63)
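Both bounds follow from dividing the allowed fractional change of the decay rate by the sensitivity \(s_\alpha\). A short check using the numbers quoted above is sketched here; the function name is hypothetical.

```python
def alpha_bound_from_rate(rate_change, sensitivity):
    """|Delta alpha/alpha| implied by |Delta lambda/lambda| for a given s_alpha."""
    return abs(rate_change / sensitivity)

S_U238 = 548.0  # sensitivity quoted for 238U

# Dyson: the decay rate changed by no more than 20%.
dyson = alpha_bound_from_rate(0.20, S_U238)
# Wilkinson: fractional rate change corresponding to the bound of Eq. (62).
wilkinson_rate_change = 8e-3 * S_U238
print(f"{dyson:.2e}", f"{wilkinson_rate_change:.1f}")
```

The first number rounds to the bound of Equation (63); the second shows that Equation (62) corresponds to a change in λ by a factor of a few, as stated.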

Uranium has a short lifetime so that it cannot be used to set constraints on longer time scales. It is also used to calibrate the age of the meteorites. Therefore, it was suggested [399] to consider \(^{147}\)Sm. Assuming that \(\Delta \lambda_{147}/\lambda_{147}\) is smaller than the fractional uncertainty of \(7.5 \times 10^{-3}\) of its half-life, one obtains

$$\vert \Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}}\vert \lesssim {10^{- 5}}.$$
(64)

As for the Oklo phenomenon, the effect of other constants has not been investigated in depth. It is clear that at lowest order both Q and \(m_{\rm p}\) scale as \(\Lambda_{\rm QCD}\), so that one needs to go beyond such a simple description to determine the dependence on the quark masses. Taking into account the contribution of the quark masses, in the same way as for Equation (53), it was argued that \(\lambda \propto X_{\rm{q}}^{300 - 2000}\), which leads to \(\vert \Delta \ln X_{\rm q}\vert \lesssim 10^{-5}\). In a grand unified framework, that could lead to a constraint of the order of \(\vert \Delta \ln \alpha_{\rm EM}\vert \lesssim 2 \times 10^{-7}\).

Long lived β-decays

Dicke [150] stressed that the comparison of the rubidium-strontium and potassium-argon dating methods to uranium and thorium rates constrains the variation of αEM.

For long-lived β-decay isotopes, for which the decay energy Q is small, we can use a non-relativistic approximation for the decay rate

$$\lambda = {\Lambda _ \pm}{Q^{{p_ \pm}}}$$
(65)

respectively for β-decay and electron capture. \(\Lambda_\pm\) are functions that depend smoothly on αEM and which can thus be considered constant; \(p_+ = \ell + 3\) and \(p_- = 2\ell + 2\), where \(\ell\) is the degree of forbiddenness of the transition. For high-Z nuclei with small decay energy Q, the exponent p becomes \(p = 2 + \sqrt {1 - \alpha _{{\rm{EM}}}^2{Z^2}}\) and is independent of \(\ell\). It follows that the sensitivity to a variation of the fine-structure constant is

$${s_\alpha} = p{{{\rm{d}}\ln Q} \over {{\rm{d}}\ln {\alpha _{{\rm{EM}}}}}}.$$
(66)

The second factor can be estimated exactly as for α-decay. We note that \(\Lambda_\pm\) depends on the Fermi constant and on the mass of the electron as \({\Lambda _ \pm} \propto G_F^2m_{\rm{e}}^5{Q^p}\). This dependence is the same for any β-decay so that it will disappear in the comparison of two dating methods relying on two different β-decay isotopes, in which case only the dependence on the other constants appears, again through the nuclear binding energy. Note, however, that comparing an α- to a β-decay may lead to interesting constraints.

We refer to Section III.A.4 of FVC [500] for earlier constraints derived from rubidium-strontium, potassium-argon and we focus on the rhenium-osmium case,

$${}_{75}^{187}{\rm{Re}} \rightarrow {}_{76}^{187}{\rm{Os}} + {\bar \nu _e} + {e^ -}$$
(67)

first considered by Peebles and Dicke [406]. They noted that the very small value of its decay energy, Q = 2.6 keV, makes it a very sensitive probe of the variation of αEM. In that case p ≃ 2.8 so that \(s_\alpha \simeq -18000\); a change of \(10^{-2}\)% of αEM will induce a change in the decay energy of the order of the keV, that is, of the order of the decay energy itself. Peebles and Dicke [406] did not have a reliable laboratory determination of the decay rate with which to set any constraint. Dyson [167] compared the isotopic analysis of molybdenite ores (\(\lambda_{187} = (1.6 \pm 0.2) \times 10^{-11}\,{\rm yr}^{-1}\)), the isotopic analysis of 14 iron meteorites (\(\lambda_{187} = (1.4 \pm 0.3) \times 10^{-11}\,{\rm yr}^{-1}\)) and laboratory measurements of the decay rate (\(\lambda_{187} = (1.1 \pm 0.1) \times 10^{-11}\,{\rm yr}^{-1}\)). Assuming that the variation of the decay energy comes entirely from the variation of αEM, he concluded that \(\vert \Delta \alpha_{\rm EM}/\alpha_{\rm EM}\vert < 9 \times 10^{-4}\) during the past \(3 \times 10^9\) years. Note that the discrepancy between meteorite and lab data could have been interpreted as a time-variation of αEM, but the laboratory measurements were complicated by many technical issues, so that Dyson only considered a conservative upper limit.

The modeling and the computation of \(s_\alpha\) were improved in [399], following the same lines as for α-decay:

$${{\Delta {\lambda _{187}}} \over {{\lambda _{187}}}} = p{{\Delta Q} \over Q} \simeq p\left({{{20\;{\rm{MeV}}} \over Q}} \right){{\Delta {\alpha _{{\rm{EM}}}}} \over {{\alpha _{{\rm{EM}}}}}} \sim - 2.2 \times {10^4}{{\Delta {\alpha _{{\rm{EM}}}}} \over {{\alpha _{{\rm{EM}}}}}}$$

if one considers only the variation of the Coulomb energy in Q. A similar analysis [147] leads to \(\Delta \ln {\lambda _{187}} \simeq {10^4}\Delta \ln \left[ {\alpha _{{\rm{EM}}}^{- 2.2}X_{\rm{q}}^{- 1.9}{{({X_{\rm{d}}} - {X_{\rm{u}}})}^{0.23}}X_{\rm{e}}^{- 0.058}} \right]\).
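The size of this amplification follows from the three numbers quoted in the text: the exponent p ≃ 2.8, the decay energy Q = 2.6 keV and a Coulomb shift of about 20 MeV. The sketch below evaluates magnitudes only (the sign convention is left aside), and the variable names are illustrative.

```python
P_EXPONENT = 2.8          # exponent p quoted for the rhenium decay
Q_RE187_KEV = 2.6         # decay energy in keV
COULOMB_SHIFT_KEV = 20e3  # |dQ/dln(alpha)| of about 20 MeV, expressed in keV

# Magnitude of Delta ln(lambda) / Delta ln(alpha).
amplification = P_EXPONENT * COULOMB_SHIFT_KEV / Q_RE187_KEV

# Change in the decay energy for a 10^-2 % (i.e. 1e-4) change of alpha.
delta_q_kev = COULOMB_SHIFT_KEV * 1e-4

print(f"{amplification:.3g}", delta_q_kev)
```

The amplification is about 2.2 × 10^4, and a 10^-2 % change of αEM shifts Q by about 2 keV, comparable to Q itself.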

The dramatic improvement in the meteoritic analysis of the Re/Os ratio [468] led to a recent re-analysis of the constraints on the fundamental constants. The slope of the isochron was determined with a precision of 0.5%. However, the Re/Os ratio is inferred from iron meteorites, the age of which is not determined directly. Models of formation of the solar system tend to show that iron meteorites and angrite meteorites form within the same 5 million years. The age of the latter can be estimated from the \(^{207}\)Pb-\(^{206}\)Pb method, which gives 4.558 Gyr [337] so that \(\lambda_{187} = (1.666 \pm 0.009) \times 10^{-11}\,{\rm yr}^{-1}\). Thus, we could adopt [399]

$$\left\vert {{{\Delta {\lambda _{187}}} \over {{\lambda _{187}}}}} \right\vert < 5 \times {10^{- 3}}.$$

However, the meteoritic ages are determined mainly by \(^{238}\)U dating so that effectively we have a constraint on the variation of \(\lambda_{187}/\lambda_{238}\). Fortunately, since the sensitivity of \(^{238}\)U is much smaller than that of rhenium, it is safe to neglect its effect. Using the recent laboratory measurement [333] (\(\lambda_{187} = (1.639 \pm 0.025) \times 10^{-11}\,{\rm yr}^{-1}\)), the variation of the decay rate is not given by the dispersion of the meteoritic measurement, but by comparing to its value today, so that

$${{\Delta {\lambda _{187}}} \over {{\lambda _{187}}}} = - 0.016 \pm 0.016.$$
(68)

The analysis of Ref. [400], following the assumption of [399], deduced that

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = (- 8 \pm 16) \times {10^{- 7}},$$
(69)

at a 95% confidence level, on a typical time scale of 5 Gyr (or equivalently a redshift of order z ∼ 0.2).

As pointed out in [219, 218], these constraints really represent a bound on the average decay rate \({\bar \lambda}\) since the formation of the meteorites. This implies in particular that the redshift at which one should consider this constraint depends on the specific functional dependence λ(t). It was shown that a well-designed time dependence for λ can obviate this limit, due to the time average.

Conclusions

Meteorite data allow one to set constraints on the variation of the fundamental constants that are comparable to the ones set by the Oklo phenomenon. Similar constraints can also be set from spontaneous fission (see Section III.A.3 of FVC [500]), but this process is less well understood and less sensitive than the α- and β-decay processes.

From an experimental point of view, the main difficulty concerns the dating of the meteorites and the interpretation of the effective decay rate.

As long as we only consider αEM, the sensitivities can be computed mainly by considering the contribution of the Coulomb energy to the decay energy, which reduces to its contribution to the nuclear energy. However, as for the Oklo phenomenon, the dependencies on the other constants, \(X_{\rm q}\), \(G_{\rm F}\), μ…, require a nuclear model and remain very model-dependent.

Quasar absorption spectra

Generalities

Quasar (QSO) absorption lines provide a powerful probe of the variation of fundamental constants. Absorption lines in intervening clouds along the line of sight of the QSO give access to the spectra of the atoms present in the cloud, that is, to paleo-spectra. The method was first used by Savedoff [447], who constrained the time variation of the fine-structure constant from the doublet separations seen in galaxy emission spectra. For a general introduction to these observations, we refer to [412, 474, 271].

Indeed, one cannot use a single transition compared to its laboratory value since the expansion of the universe induces a global redshifting of all spectra. In order to track down a variation of the fundamental constants, one should resort to various transitions and look for chromatic effects that cannot be reproduced by the expansion of the universe, which acts achromatically on all wavelengths.

To achieve such a test, one needs to understand the dependencies of different types of transitions, in a similar way as for atomic clock experiments. [175, 169] suggested using the convenient formulation

$$\omega = {\omega _0} + q\left[ {{{\left({{{{\alpha _{{\rm{EM}}}}} \over {\alpha _{{\rm{EM}}}^{(0)}}}} \right)}^2} - 1} \right] + {q_2}\left[ {{{\left({{{{\alpha _{{\rm{EM}}}}} \over {\alpha _{{\rm{EM}}}^{(0)}}}} \right)}^4} - 1} \right],$$
(70)

in order to take into account the dependence of the spectra on the fine-structure constant. ω is the energy in the rest-frame of the cloud, that is at a redshift z, and \(\omega_0\) is the energy measured today in the laboratory. q and \(q_2\) are two coefficients that determine the frequency dependence on a variation of αEM and that arise from the relativistic corrections for the transition under consideration. The coefficient q is typically an order of magnitude larger than \(q_2\), so that the possibility to constrain a variation of the fine-structure constant is mainly determined by q. These coefficients were computed for a large set of transitions, first using a relativistic Hartree-Fock method and then using many-body perturbation theory. We refer to [175, 45, 14] for an extensive discussion of the computational methods and a list of the q-coefficients for various transitions relevant for both quasar spectra and atomic clock experiments. Figure 3 summarizes some of these results. The uncertainties in q are typically smaller than 30 cm\(^{-1}\) for Mg, Si, Al and Zn, but much larger for Cr, Fe and Ni due to their more complicated electronic configurations. The accuracy for \(\omega_0\) from dedicated laboratory measurements now reaches 0.004 cm\(^{-1}\). It is important to stress that the form (70) ensures that errors in the q-coefficients cannot lead to a non-zero detection of \(\Delta \alpha_{\rm EM}\).
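To give a sense of the magnitudes involved, the parameterization (70) can be evaluated for a small fractional change of αEM. The sketch below uses the Si IV λ1393 coefficients quoted in the alkali-doublet discussion of this review (q = 766 and \(q_2\) = 48, in cm\(^{-1}\)); the function name and the chosen value of the variation are illustrative.

```python
def transition_shift(x, q, q2):
    """omega - omega_0 from Eq. (70), with x = Delta alpha_EM / alpha_EM^(0).

    The result has the units of q and q2 (here cm^-1).
    """
    ratio = 1.0 + x
    return q * (ratio ** 2 - 1.0) + q2 * (ratio ** 4 - 1.0)

# Si IV lambda-1393 coefficients in cm^-1.
Q_SI4, Q2_SI4 = 766.0, 48.0

shift = transition_shift(1e-5, Q_SI4, Q2_SI4)
linearized = (2.0 * Q_SI4 + 4.0 * Q2_SI4) * 1e-5  # first-order expansion in x
print(shift, linearized)
```

For a variation of 10^-5 the shift is about 0.017 cm\(^{-1}\), and the linearized form is adequate. The shift vanishes identically at x = 0 whatever the values of q and \(q_2\), which is the property stressed at the end of the paragraph above.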

Figure 3 Summary of the values of some coefficients entering the parameterization (70) and necessary to interpret the QSO absorption spectra data. From [367].

The shift between two lines is easier to measure when the difference between the q-coefficients of the two lines is large, which occurs, e.g., for two levels with large q of opposite sign. Many methods were developed to take this into account. The alkali doublet method (AD) focuses on the fine-structure doublet of alkali atoms. It was then generalized to the many-multiplet method (MM), which uses correlations between various transitions in different atoms. As can be seen in Figure 3, some transitions are almost insensitive to a variation of αEM. This is the case of Mg II, which can be used as an anchor, i.e., a reference point. To obtain strong constraints one can either compare transitions of light atoms with those of heavy atoms (because the αEM dependence of the ground state scales as \(Z^2\)) or compare s–p and d–p transitions in heavy elements (in that case, the relativistic corrections will be of opposite signs). This latter effect increases the sensitivity and strengthens the method against systematic errors. However, the results of this method rely on two assumptions: (i) ionization and chemical homogeneity and (ii) isotopic abundance of Mg II close to the terrestrial value. Even though these are reasonable assumptions, one cannot completely rule out systematic biases that they could induce. The AD method completely avoids the assumption of homogeneity because, by construction, the two lines of the doublet must have the same profile. Indeed the AD method avoids the implicit assumption of the MM method that chemical and ionization inhomogeneities are negligible. Another way to avoid the influence of small spectral shifts due to ionization inhomogeneities within the absorber and due to possible non-zero offsets between different exposures is to rely on different transitions of a single ion in individual exposures. This method has been called the Single Ion Differential Alpha Measurement method (SIDAM).

Most studies are based on optical techniques, due to the profusion of strong UV transitions that are redshifted into the optical band (this includes AD, MM and SIDAM, and it implies that they can be applied only above a given redshift, e.g., Si IV at z > 1.3, Fe II λ1608 at z > 1), or on radio techniques, since radio transitions arise from many different physical effects (hyperfine splitting, in particular the H I 21 cm hyperfine transition, molecular rotation, Lambda-doubling, etc.). In the latter case, the line frequencies and their comparisons yield constraints on different sets of fundamental constants including αEM, \(g_{\rm p}\) and μ. Thus, these techniques are complementary since systematic effects are different in the optical and radio regimes. The radio techniques also offer some advantages: (1) they reach high spectral resolution (< 1 km/s), alleviating in particular problems with line blending, and the use of, e.g., masers allows one to reach a frequency calibration better than roughly 10 m/s; (2) in general, the sensitivity of the line position to a variation of a constant is higher; (3) the isotopic lines are observed separately, while in the optical there is a blend with possible differential saturations (see, e.g., [109] for a discussion).

Let us first emphasize that the shifts in the absorption lines to be detected are extremely small. For instance, a change of αEM of order \(10^{-5}\) corresponds to a shift of at most 20 mÅ for a redshift of z ∼ 2, which would correspond to a shift of order ∼ 0.5 km/s, or to about a third of a pixel at a spectral resolution of \(R \sim 40000\), as achieved with Keck/HIRES or VLT/UVES. As we shall discuss later, there are several sources of uncertainty that hamper the measurement. In particular, the absorption lines have complex profiles (because they result from the propagation of photons through a highly inhomogeneous medium) that are fitted using a combination of Voigt profiles. Each of these components depends on several parameters including the redshift, the column density and the width of the line (Doppler parameter), to which one now needs to add the constants that are assumed to be varying. These parameters are constrained assuming that the profiles are the same for all transitions, which is a non-trivial assumption for transitions from different species (this was one of the driving motivations for using transitions from a single species and for the SIDAM method). More importantly, the fit is usually not unique. This is not a problem when the lines are not saturated, but it can increase the error on αEM by a factor 2 in the case of strongly saturated lines [91].

Alkali doublet method (AD)

The first method used to set constraints on the time variation of the fine-structure constant relies on the splitting of fine-structure doublets, for which

$$\Delta \nu \propto {{\alpha _{{\rm{EM}}}^2{Z^4}{R_\infty}} \over {2{n^3}}}.$$

It follows that the relative separation is proportional to \(\alpha_{\rm EM}^2\), \(\Delta \nu/\bar \nu \propto \alpha _{{\rm{EM}}}^2\), so that the variation of the fine-structure constant at a redshift z can be obtained as

$$\left({{{\Delta {\alpha _{{\rm{EM}}}}} \over {{\alpha _{{\rm{EM}}}}}}} \right)(z) = {{{c_r}} \over 2}\left[ {{{\left({{{\Delta \lambda} \over {\bar \lambda}}} \right)}_z}/{{\left({{{\Delta \lambda} \over {\bar \lambda}}} \right)}_0} - 1} \right],$$

where \(c_r \sim 1\) is a number taking into account the relativistic corrections. This expression is indeed a simplified treatment of the alkali doublet since one should, as for atomic clocks, take into account the relativistic corrections more precisely. Using the formulation (70), one can deduce that

$${c_r} = {{\delta q + \delta {q_2}} \over {\delta q + 2\delta {q_2}}},$$

where δq and \(\delta q_2\) are the differences between the q- and \(q_2\)-coefficients of the two doublet transitions.

Several authors have applied the AD method to doublets of several species such as, e.g., C IV, N V, O VI, Mg II, Al III, Si II, Si IV. We refer to Section III.3 of FVC [500] for a summary of their results (see also [318]) and focus on the three most recent analyses, based on the Si IV doublet. In this particular case, q = 766 (resp. 362) cm\(^{-1}\) and \(q_2 = 48\) (resp. −8) cm\(^{-1}\) for Si IV λ1393 (resp. λ1402) so that \(c_r = 0.8914\). The method is based on a \(\chi^2\) minimization of multiple component Voigt profile fits to the absorption features in the QSO spectra. In general such a profile depends on three parameters, the column density N, the Doppler width b and the redshift. It is now extended to include \(\Delta \alpha_{\rm EM}/\alpha_{\rm EM}\). The fit is carried out by simultaneously varying these parameters for each component.
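The value of \(c_r\) quoted for Si IV follows directly from the expression above and the four coefficients given in the text. The sketch below evaluates it and also encodes the AD estimator; the function names are hypothetical, and the estimator is applied only to a trivial case because no measured doublet separations are given in this section.

```python
def doublet_correction(q_a, q_b, q2_a, q2_b):
    """c_r = (dq + dq2) / (dq + 2 dq2) for the two lines a and b of a doublet."""
    dq = q_a - q_b
    dq2 = q2_a - q2_b
    return (dq + dq2) / (dq + 2.0 * dq2)

def alpha_variation(sep_z, sep_0, c_r):
    """AD estimate (c_r / 2) * [sep_z / sep_0 - 1], where sep is the relative
    doublet separation (Delta lambda / mean lambda) at redshift z and today."""
    return 0.5 * c_r * (sep_z / sep_0 - 1.0)

# Si IV lambda-1393 and lambda-1402 coefficients in cm^-1, from the text.
c_r = doublet_correction(766.0, 362.0, 48.0, -8.0)
print(f"c_r = {c_r:.4f}")
```

The result, 460/516, agrees with the quoted value up to rounding of the last digit.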

  • Murphy et al. [377] analyzed 21 Keck/HIRES Si IV absorption systems toward 8 quasars to obtain the weighted mean of the sample,

    $$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 0.5 \pm 1.3{)} \times {10^{- 5}},\quad 2.33 < z < 3.08,$$
    (71)

    with a mean redshift of z = 2.6. The S/N ratio of these data is in the range 15–40 per pixel and the spectral resolution is R ∼ 34000.

  • Chand et al. [91] analyzed 15 Si IV absorption systems selected from an ESO-UVES sample containing 31 systems (eliminating contaminated, saturated or very broad systems; in particular a lower limit on the column density was fixed so that both lines of the doublets are detected at more than 5σ) to get the weighted mean,

    $$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 0.15 \pm 0.43{)} \times {10^{- 5}},\quad 1.59 < z < 2.92.$$
    (72)

    The improvement of the constraint arises mainly from a better S/N ratio, of order 60–80 per pixel, and resolution R ∼ 45000. Note that combining this result with the previous one (71) in a weighted mean would lead to ΔαEM/αEM = (−0.04 ± 0.56) × 10−5 in the range 1.59 < z < 3.02.

  • The analysis [349] of seven C IV systems and two Si IV systems in the direction of a single quasar, obtained with VLT/UVES (during the science verification), has led to

    $$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 3.09 \pm 8.46{)} \times {10^{- 5}},\quad 1.19 < z < 1.84.$$
    (73)

    This is less constraining than the two previous analyses, mainly because the q-coefficients are smaller for C IV (see [410] for the calibration of the laboratory spectra).

One limitation may arise from the isotopic composition of silicon. Silicon has three naturally occurring isotopes with terrestrial abundances 28Si:29Si:30Si = 92.23:4.68:3.09, so that each absorption line is a composite of absorption lines from the three isotopes. However, it was shown [377] that the effect of these isotopic shifts is negligible in the case of Si IV.
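The relativistic correction factor and the AD estimator above lend themselves to a quick numerical check. The following is a minimal sketch, not the fitting procedure used in the cited analyses (which rely on full Voigt-profile fits): the function names are illustrative, and the doublet separations in the second part are hypothetical numbers chosen only to exercise the formula.

```python
def relativistic_correction(q_a, q2_a, q_b, q2_b):
    """c_r = (dq + dq2) / (dq + 2 dq2), where dq and dq2 are the differences
    of the q- and q2-coefficients between the two lines of the doublet."""
    dq = q_a - q_b
    dq2 = q2_a - q2_b
    return (dq + dq2) / (dq + 2.0 * dq2)


def alkali_doublet_dalpha(rel_sep_z, rel_sep_0, c_r=1.0):
    """Delta(alpha)/alpha = (c_r / 2) * [(dlambda/lambda)_z / (dlambda/lambda)_0 - 1]."""
    return 0.5 * c_r * (rel_sep_z / rel_sep_0 - 1.0)


# Si IV coefficients quoted in the text (cm^-1), for lambda 1393 and lambda 1402.
c_r_si4 = relativistic_correction(766.0, 48.0, 362.0, -8.0)
print(f"c_r(Si IV) = {c_r_si4:.4f}")

# Hypothetical relative separations: laboratory value and a value measured
# in an absorber, differing by 2 parts in 10^5 (illustration only).
rel_sep_lab = 6.45e-3
rel_sep_obs = rel_sep_lab * (1.0 + 2.0e-5)
print(f"Delta alpha/alpha = {alkali_doublet_dalpha(rel_sep_obs, rel_sep_lab, c_r_si4):.2e}")
```

With the Si IV coefficients, the first function returns about 0.891, in agreement with the value quoted above up to rounding.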

Many multiplet method (MM)

A generalization of the AD method, known as the many-multiplet (MM) method, was proposed in [176]. It relies on the combination of transitions from different species. In particular, as can be seen in Figure 3, some transitions are fairly insensitive to a change of the fine-structure constant (e.g., Mg II or Mg I, hence providing good anchors) while others, such as Fe II, are more sensitive. The first implementation [522] of the method was based on a measurement of the shift of the Fe II spectrum (the rest wavelengths of which are very sensitive to αEM) with respect to that of Mg II. This comparison increases the sensitivity compared with methods using only alkali doublets. Two series of analyses were performed during the past ten years and led to contradictory conclusions. The accuracy of the measurements depends on how well the absorption line profiles are modeled.

Keck/HIRES data. The MM-method was first applied in [522] who analyzed one transition of the Mg II doublet and five Fe II transitions from three multiplets. Using 30 absorption systems toward 17 quasars, they obtained

$$\begin{array}{*{20}{c}} {\Delta {\alpha _{\text{EM}}}/{\alpha _{\text{EM}}} = ( - 0.17 \pm 0.39) \times {{10}^{ - 5}},\;\;\;\;\;\;\;\;0.6 < z < 1} \\ {\Delta {\alpha _{\text{EM}}}/{\alpha _{\text{EM}}} = ( - 1.88 \pm 0.53) \times {{10}^{ - 5}},\;\;\;\;\;\;\;\;1 < z < 1.6.} \end{array}$$

This was the first claim that a constant may have varied during the evolution of the universe. It was later confirmed in a re-analysis [376, 524] of the initial sample and by including new optical QSO data to reach 28 absorption systems with redshift z = 0.5–1.8 plus 18 damped Lyman-α absorption systems towards 13 QSO plus 21 Si IV absorption systems toward 13 QSO. The analysis used mainly the multiplets of Ni II, Cr II and Zn II; Mg I, Mg II, Al II, Al III and Fe II were also included. The most recent analysis [369] relies on 128 absorption spectra, later updated [367] to include 143 absorption systems. The most robust estimate is the weighted mean

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 0.57 \pm 0.11{)} \times {10^{- 5}},\quad 0.2 < z < 4.2.$$
(74)

The resolution for most spectra was R ∼ 45000 and the S/N per pixel ranges from 4 to 240, with most spectral regions having S/N ∼ 30. The wavelength scale was calibrated by means of a thorium-argon emission lamp. This calibration is crucial and its quality is discussed in [368, 374] for the Keck/HIRES (see also [236]) as well as [534] for the VLT/UVES measurements.

The low-z (z < 1.8) and high-z samples rely on different ions and transitions with very different αEM-dependencies. At low-z, the Mg transitions are used as anchors against which the large positive shifts in the Fe II transitions can be measured. At high-z, different transitions are fitted (Fe II, S II, Cr II, Ni II, Zn II, Al II, Al III). The two sub-samples respond differently to simple systematic errors due to their different arrangement of q-coefficients in wavelength space. The analysis for each sample gives the weighted mean

$$\begin{array}{*{20}{c}} {\Delta {\alpha _{{\text{EM}}}}/{\alpha _{{\text{EM}}}} = ( - 0.54 \pm 0.12) \times {{10}^{ - 5}},\;\;\;\;\;\;\;\;0.2 < z < 1.8} \\ {\Delta {\alpha _{{\text{EM}}}}/{\alpha _{{\text{EM}}}} = ( - 0.74 \pm 0.17) \times {{10}^{ - 5}},\;\;\;\;\;\;\;\;1.8 < z < 4.2,} \end{array}$$
(75)

with respectively 77 and 66 systems.
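Several results in this section are quoted as weighted means. As an illustration of what such a combination involves, here is a minimal sketch of an inverse-variance weighted mean, assuming independent Gaussian errors. This weighting is an assumption made for the example, and the source does not specify the exact scheme used by each team. The function name is illustrative.

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its 1-sigma uncertainty,
    assuming independent Gaussian errors."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)


# Low-z and high-z Keck/HIRES sub-sample means of Equation (75),
# in units of 10^-5.
values = [-0.54, -0.74]
sigmas = [0.12, 0.17]
mean, err = weighted_mean(values, sigmas)
print(f"combined: ({mean:.2f} +/- {err:.2f}) x 10^-5")
```

Applied to the two sub-sample means of Equation (75), this gives roughly (−0.61 ± 0.10) × 10−5. It need not coincide with the published value (74), which is obtained from the individual absorption systems rather than from the two sub-sample means.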

Hunting systematics. While performing this kind of observation, a number of problems and systematic effects have to be taken into account and controlled. (1) Errors in the determination of the laboratory wavelengths to which the observations are compared. (2) While comparing wavelengths from different atoms one has to take into account that they may be located in different regions of the cloud with different velocities and hence with different Doppler shifts. (3) One has to ensure that no transition is blended with transitions of another system. (4) The differential isotopic saturation has to be controlled. Usually quasar absorption systems are expected to have lower heavy element abundances. The spatial inhomogeneity of these abundances may also play a role. (5) Hyperfine splitting can induce a saturation similar to isotopic abundances. (6) The variation of the velocity of the Earth during the integration of a quasar spectrum can also induce a differential Doppler shift. (7) Atmospheric dispersion across the spectral direction of the spectrograph slit can stretch the spectrum. It was shown that, on average, this can, for low-redshift observations, mimic a negative ΔαEM/αEM, while this is no longer the case for high-redshift observations (hence emphasizing the complementarity of these observations). (8) The presence of a magnetic field will shift the energy levels by the Zeeman effect. (9) Temperature variations during the observation will change the air refractive index in the spectrograph. In particular, flexures in the instrument are dealt with by recording a calibration lamp spectrum before and after the science exposure, and the signal-to-noise and stability of the lamp are crucial. (10) Instrumental effects such as variations of the intrinsic instrument profile have to be controlled.

All these effects have been discussed in detail in [374, 376] to argue that none of them can explain the current detection. This was recently complemented by a study of the calibration, since a distortion of the wavelength scale could lead to a non-zero value of ΔαEM. The quality of the calibration is discussed in [368] and shown to have a negligible effect on the measurements (a similar result has been obtained for the VLT/UVES data [534]).

As we pointed out earlier, one assumption of the method concerns the isotopic abundances of Mg II, which can affect the low-z sample since any change in the isotopic composition will alter the value of the effective rest wavelengths. This isotopic composition is assumed to be close to terrestrial, 24Mg:25Mg:26Mg = 79:10:11. No direct measurement of rMg = (26Mg + 25Mg)/24Mg in QSO absorbers is currently feasible due to the small separation of the isotopic absorption lines. However, it was shown [231], on the basis of molecular absorption lines of MgH, that rMg generally decreases with decreasing metallicity. In standard models it should be near 0 at zero metallicity since type II supernovae are primarily producers of 24Mg. It was also argued that 13C is a tracer of 25Mg and was shown to be low in the case of HE 0515-4414 [321]. However, contrary to this trend, it was found [552] that rMg can reach high values for some giant stars in the globular cluster NGC 6752 with metallicity [Fe/H] ∼ −1.6. This led Ashenfelter et al. [18] to propose a chemical evolution model with a strongly enhanced population of intermediate-mass (2–8 M⊙) stars, which in their asymptotic giant branch phase are the dominant factories for heavy Mg at the low metallicities typical of QSO absorption systems, as a possible explanation of the low-z Keck/HIRES observations without any variation of αEM. It would require that rMg reaches 0.62, compared to the terrestrial value of 0.27 (but then the UVES/VLT constraints would be converted to a detection). Care needs to be taken since the star formation history can be different in each region, even in each absorber, so that one cannot a priori apply the best fit obtained from the Keck data to the UVES/VLT data. However, such a modified nucleosynthetic history will lead to an overproduction of elements such as P, Si, Al, P above current constraints [192], but this latter model is not the same as the one of Ref. [18], which was tuned to avoid these problems.

In conclusion, no compelling evidence for a systematic effect has been raised at the moment.

VLT/UVES data. The previous results, and their importance for fundamental physics, led another team to check this detection using observations from the UVES spectrograph operating on the VLT. In order to avoid as many systematics as possible, and based on numerical simulations, they apply a series of selection criteria [90] on the systems used to constrain the time variation of the fine-structure constant: (1) consider only lines with similar ionization potentials (Mg II, Fe II, Si II and Al II) as they are most likely to originate from similar regions in the cloud; (2) avoid absorption lines contaminated by atmospheric lines; (3) consider only systems with high enough column density to ensure that all the multiplets are detected at more than 5σ; (4) demand that at least one of the anchor lines is not saturated to have a robust measurement of the redshift; (5) reject strongly saturated systems with large velocity spread; (6) keep only systems for which the majority of the components are separated from their neighbors by more than the Doppler shift parameter.

The advantage of this choice is to reject most complex or degenerate systems, which could result in uncontrolled systematic effects. The drawback is that the analysis will be based on fewer systems.

Refs. [90, 470] analyzed the observations of 23 absorption systems, fulfilling the above criteria, in the direction of 18 QSO with a S/N ranging between 50 and 80 per pixel and a resolution R > 44000. They concluded that

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 0.06 \pm 0.06{)} \times {10^{- 5}},\quad 0.4 < z < 2.3,$$

hence giving a 3σ constraint on a variation of αEM.

This analysis was challenged by Murphy, Webb and Flambaum [372, 371, 370]. Using (quoting them) the same reduced data and the same fits to the absorption profiles, they claim to find different individual measurements of ΔαEM/αEM and a weighted mean,

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 0.44 \pm 0.16{)} \times {10^{- 5}},\quad 0.4 < z < 2.3,$$

which differs from the above cited value. The main points that were raised are (1) the fact that some of the uncertainties on ΔαEM/αEM are smaller than a minimum uncertainty that they estimated and (2) the quality of the statistical analysis (in particular on the basis of the χ2 curves). These arguments were answered in [471]. The revision [471] of the VLT/UVES constraint rejects two systems deviating by more than 4σ, which were claimed to dominate the re-analysis [371, 370], and concludes that

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(}0.01 \pm 0.15{)} \times {10^{- 5}},\quad 0.4 < z < 2.3,$$
(76)

emphasizing that the errors are probably larger.

On the basis of the articles [372, 371, 370] and the answer [471], it is difficult (without having worked with the data) to side with one of the parties. This exchange has highlighted some differences in the statistical analysis.

To finish, let us mention that [361] reanalyzed some systems of [90, 470] by means of the SIDAM method (see below) and disagrees with some of the results, claiming a calibration problem. It also claims that the errors quoted in [367] are underestimated by a factor 1.5.

Regressional MM (RMM). The MM method was adapted to use a linear regression method [427]. The idea is to measure the redshift \({z_i}\) deduced from the transition i and plot \({z_i}\) as a function of the sensitivity coefficient. If ΔαEM ≠ 0 then there should exist a linear relation with a slope proportional to ΔαEM/αEM. On a single absorption system (VLT/UVES), on the basis of Fe II transitions, they concluded that

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 0.4 \pm 1.9 \pm {2.7_{{\rm{syst}}}}{)} \times {10^{- 6}},\quad z = 1.15,$$
(77)

compared to ΔαEM/αEM = (0.1 ± 1.7) × 10−6 that is obtained with the standard MM technique on the same data. This is also consistent with the constraint (79) obtained on the same system with the HARPS spectrograph.

Open controversy. At the moment, we have to face a situation in which two teams have performed two independent analyses based on data sets obtained by two instruments on two telescopes. Their conclusions do not agree, since only one of them is claiming a detection of a variation of the fine-structure constant. This discrepancy between the VLT/UVES and Keck/HIRES results is yet to be resolved. In particular, they use data from different telescopes observing different (Southern/Northern) hemispheres.

Ref. [236] provides an analysis of the wavelength accuracy of the Keck/HIRES spectrograph. It finds an absolute uncertainty of Δz ∼ 10−5, corresponding to Δλ ∼ 0.02 Å, with a daily drift of Δz ∼ 5 × 10−6 and a multiday drift of Δz ∼ 2 × 10−5. While the cause of this drift remains unknown, it is argued [236] that this level of systematic uncertainty makes it difficult to use Keck/HIRES to constrain the time variation of αEM, at least for a single system or a small sample: since the distortion pattern pertains to the echelle orders as they are recorded on the CCD, that is, it is similar from exposure to exposure, the effect on ΔαEM/αEM for an ensemble of absorbers at different redshifts would be random, because the transitions fall in different places with respect to the pattern of the distortion. This needs to be confirmed and investigated in more detail. We refer to [373] for a discussion of the Keck wavelength calibration error and [534] for the VLT/UVES, as well as [86] for a discussion of the ThAr calibration.

On the one hand, it is appropriate that one team has reanalyzed the data of the other and challenged its analysis. This should lead to an improvement in the robustness of these results, and a similar reverse analysis would also be appropriate. On the other hand, both teams have done an impressive amount of work in order to understand and quantify all sources of systematics. Both developments, as well as the new techniques that are appearing, should hopefully settle this observational issue. Today, it is unfortunately premature to favor one data set over the other.

A recent data set [523] of 60 quasar spectra (yielding 153 absorption systems) from the VLT was used and split at z = 1.8 to get

$${(\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}})_{{\rm{VLT;}}z{\rm{< 1}}{\rm{.8}}}} = {(} - 0.06 \pm 0.16{)} \times {10^{- 5}},$$

in agreement with the former study [471], while at higher redshift

$${(\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}})_{{\rm{VLT}}\;z{\rm{> 1}}{\rm{.8}}}} = {(} + 0.61 \pm 0.20{)} \times {10^{- 5}}.$$

This higher-redshift sample exhibits a positive variation of αEM, that is, of opposite sign with respect to the previous Keck/HIRES detection [367],

$${(\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}})_{{\rm{Keck;}}\;z{\rm{< 1}}{\rm{.8}}}} = {(} - 0.54 \pm 0.12{)} \times {10^{- 5}},\quad {(\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}})_{{\rm{Keck;}}\;z{\rm{> 1}}{\rm{.8}}}} = {(} - 0.74 \pm 0.17{)} \times {10^{- 5}}.$$

It was pointed out that the Keck/HIRES and VLT/UVES observations can be made consistent if the fine-structure constant is spatially varying [523]. Indeed, one can note that they do not correspond to the same hemisphere and invoke a spatial variation. Ref. [523] concludes that the distribution of αEM is well represented by a spatial dipole, significant at 4.1σ, in the direction right ascension 17.3 ± 0.6 hours and declination −61 ± 9 deg (see also [50, 48]). This emphasizes the difficulty in comparing different data sets and shows that the constraints can easily be combined as long as they are compatible with no variation, but that one must otherwise take a possible spatial variation into account.
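To make the comparison between the two high-redshift samples concrete, the following sketch quantifies their disagreement and computes the geometric factor entering a dipole model. It assumes independent Gaussian errors for the tension, and a pure dipole of the form A cos Θ for the angular dependence. The function names are illustrative, and no dipole amplitude is assumed since none is quoted in the text.

```python
import math


def tension_sigma(v1, s1, v2, s2):
    """Difference between two measurements in units of their combined
    1-sigma uncertainty (independent Gaussian errors assumed)."""
    return abs(v1 - v2) / math.hypot(s1, s2)


def unit_vector(ra_hours, dec_deg):
    """Cartesian unit vector for equatorial coordinates (RA in hours, Dec in degrees)."""
    ra = math.radians(15.0 * ra_hours)  # 1 hour of right ascension = 15 degrees
    dec = math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def cos_angle_to_dipole(ra_hours, dec_deg, dipole_ra_hours=17.3, dipole_dec_deg=-61.0):
    """cos(Theta) between a sightline and the dipole direction quoted in the text."""
    a = unit_vector(ra_hours, dec_deg)
    d = unit_vector(dipole_ra_hours, dipole_dec_deg)
    return sum(x * y for x, y in zip(a, d))


# High-redshift (z > 1.8) means quoted above, in units of 10^-5.
n_sigma = tension_sigma(0.61, 0.20, -0.74, 0.17)
print(f"VLT vs Keck at z > 1.8: {n_sigma:.1f} sigma apart")
```

Under these assumptions the two z > 1.8 means differ by about 5σ, which illustrates why they cannot simply be averaged. In a dipole model the expected signal along a sightline would scale with the value returned by `cos_angle_to_dipole`, which is +1 towards the dipole direction and −1 towards the opposite direction.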

Single ion differential measurement (SIDAM)

This method [320] is an adaptation of the MM method in order to avoid the influence of small spectral shifts due to ionization inhomogeneities within the absorbers as well as to non-zero offsets between different exposures. It was mainly used with Fe II, which provides transitions with positive and negative q-coefficients (see Figure 3). Since it relies on a single ion, it is less sensitive to isotopic abundances, and in particular not sensitive to the one of Mg.

The first analysis relies on the QSO HE 0515-4414 that was used in [427] to get the constraint (77). An independent analysis [361] of the same system gave a weighted mean

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 0.12 \pm 1.79{)} \times {10^{- 6}},\quad z = 1.15,$$
(78)

at 1σ. The same system was studied independently, using the HARPS spectrograph mounted on the 3.6 m telescope at La Silla observatory [92]. The HARPS spectrograph has a higher resolution than UVES, R ∼ 112000. Observations based on Fe II with a S/N of about 30–40 per pixel set the constraint

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(}0.5 \pm 2.4{)} \times {10^{- 6}},\quad z = 1.15.$$
(79)

The second constraint [325, 361] is obtained from an absorption system toward Q 1101-264,

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(}5.66 \pm 2.67{)} \times {10^{- 6}},\quad z = 1.84,$$
(80)

These constraints do not seem to be compatible with the results of the Keck/HIRES based on the MM method. A potential systematic uncertainty that can affect these constraints is the relative shift of the wavelength calibration in the blue and the red arms of UVES, where the distant Fe lines are recorded simultaneously (see, e.g., [359] for a discussion of the systematics of this analysis).

HI-21 cm vs. UV: \(x = \alpha _{{\rm{EM}}}^2{g_p}/\mu\)

The comparison of UV heavy element transitions with the hyperfine HI transition allows one to extract [496]

$$x \equiv \alpha _{{\rm{EM}}}^2{g_{\rm{p}}}/\mu,$$

since the hyperfine transition is proportional to \(\alpha _{{\rm{EM}}}^2{g_p}{\mu ^{- 1}}{R_\infty}\) while optical transitions are simply proportional to \({R_\infty}\). It follows that constraints on the time variation of x can be obtained from high resolution 21 cm spectra compared to UV lines, e.g., of Si II, Fe II and/or Mg II, as first performed in [548] in a z ∼ 0.524 absorber.

Using 9 absorption systems, there was no evidence for any variation of x [494],

$$\Delta x/x = {(} - 0.63 \pm 0.99{)} \times {10^{- 5}},\quad 0.23 < z < 2.35,$$
(81)

This constraint was criticised in [275] on the basis that the systems have multiple components and that the strongest absorption does not necessarily arise in the same component in both types of lines. However, the error analysis of [494] tries to estimate the effect of the assumption that the strongest absorption arises in the same component.

Following [147], we note that the systems lie in two widely-separated ranges and that the two samples have completely different scatter. Therefore it can be split into two samples of respectively 5 and 4 systems to get

$$\Delta x/x = {(}1.02 \pm 1.68{)} \times {10^{- 5}},\quad 0.23 < z < 0.53,$$
(82)
$$\Delta x/x = {(}0.58 \pm 1.94{)} \times {10^{- 5}},\quad 1.7 < z < 2.35.$$
(83)

In such an approach two main difficulties arise: (1) the radio and optical sources must coincide (in the optical, the QSO can be considered pointlike and it must be checked that this is also the case for the radio source), (2) the clouds responsible for the 21 cm and UV absorptions must be localized in the same place. Therefore, the systems must be selected with care; today the number of such systems is small and new ones are actively looked for [411].

The recent detection of 21 cm and molecular hydrogen absorption lines in the same damped Lyman-α system at zabs = 3.174 towards SDSS J1337+3152 constrains [472] the variation x to

$$\Delta x/x = - {(}1.7 \pm 1.7{)} \times {10^{- 6}},\quad z = 3.174.$$
(84)

This system is unique since it allows for 21 cm, H2 and UV observation so that in principle one can measure αEM, x and μ independently. However, as the H2 column density was low, only Werner band absorption lines are seen so that the range of sensitivity coefficients is too narrow to provide a stringent constraint, Δμ/μ < 4 × 10−4. It was also shown that the H2 and 21 cm absorptions are shifted with respect to each other because of the inhomogeneity of the gas, hence emphasizing this limitation. Ref. [411] also mentioned that 4 systems at z = 1.3 set Δx/x = (0.0 ± 1.5) × 10−6 and that another system at z = 3.1 gives Δx/x = (0.2 ± 0.5) × 10−6. Note also that the comparison [274] with CI at z ∼ 1.4–1.6 towards Q0458-020 and Q2337-011 yields Δx/x = (6.8 ± 1.0) × 10−6 over the band of redshift 0 < 〈z〉 ≤ 1.46, but this analysis ignores an important wavelength calibration uncertainty estimated to be of the order of 6.7 × 10−6. It was argued that, using the existing constraints on Δμ/μ, this measurement is inconsistent with claims of a smaller value of αEM from the many-multiplet method, unless fractional changes in \({g_{\rm{p}}}\) are larger than those in αEM and μ.

HI vs. molecular transitions: \(y \equiv {g_{\rm{P}}}\alpha _{{\rm{EM}}}^2\)

The HI 21 cm hyperfine transition frequency is proportional to \({g_{\rm{p}}}{\mu ^{- 1}}\alpha _{{\rm{EM}}}^2{R_\infty}\) (see Section 3.1.1). On the other hand, the rotational transition frequencies of diatomic molecules are inversely proportional to their reduced mass M. As in the example of Equation (35), where we compared an electronic transition to a vibro-rotational transition, the ratio of the hyperfine and rotational frequencies is proportional to

$${{{\nu _{{\rm{hf}}}}} \over {{\nu _{{\rm{rot}}}}}} \propto {g_{\rm{p}}}\alpha _{{\rm{EM}}}^2{M \over {{m_{\rm{p}}}}} \simeq {g_{\rm{p}}}\alpha _{{\rm{EM}}}^2 \equiv y,$$

where the variation of M/mp is usually suppressed by a large factor of the order of the ratio between the proton mass and nucleon binding energy in nuclei, so that we can safely neglect it.

The constraint on the variation of y is directly determined by comparing the redshift as determined from HI and molecular absorption lines,

$${{\Delta y} \over y} = {{{z_{{\rm{mol}}}} - {z_{\rm{H}}}} \over {1 + {z_{{\rm{mol}}}}}}.$$
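The relation above is simple enough to evaluate directly. Here is a minimal sketch, assuming the two redshifts have already been measured. The function name and the numerical redshifts are hypothetical and are not taken from any of the analyses cited here.

```python
def delta_y_over_y(z_mol, z_h):
    """Delta y / y = (z_mol - z_H) / (1 + z_mol), with z_mol the redshift from
    molecular rotational lines and z_H the redshift from the HI 21 cm line."""
    return (z_mol - z_h) / (1.0 + z_mol)


# Hypothetical redshifts differing by 2e-5 (illustration only).
example = delta_y_over_y(z_mol=0.684700, z_h=0.684680)
print(f"Delta y / y = {example:.2e}")
```

In practice the uncertainty is dominated by how well the two redshifts refer to the same gas, so the arithmetic itself is the easy part of the measurement.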

This method was first applied [513] to the CO molecular absorption lines [536] towards PKS 1413+135 to get

$$\Delta y/y = (- 4 \pm 6) \times {10^{- 5}}\quad z = 0.247.$$

The most recent constraint [375] relies on the comparison of the published redshifts of two absorption systems determined both from HI and molecular absorption. The first is a system at z = 0.6847 in the direction of TXS 0218+357 for which the spectra of CO(1-2), 13CO(1-2), C18O(1-2), CO(2-3), HCO+(1-2) and HCN(1-2) are available. They concluded that

$$\Delta y/y = {(} - 0.16 \pm 0.54{)} \times {10^{- 5}}\quad z = 0.6847.$$
(85)

The second system is an absorption system in the direction of PKS 1413+135 for which the molecular lines of CO(1-2), HCO+(1-2) and HCO+(2-3) have been detected. The analysis led to

$$\Delta y/y = {(} - 0.2 \pm 0.44{)} \times {10^{- 5}},\quad z = 0.247.$$
(86)

[78] obtains the constraints ∣Δy/y∣ < 3.4 × 10−5 at z ∼ 0.25 and z ∼ 0.685.

The radio domain has the advantage of heterodyne techniques, with a spectral resolution of \({10^6}\) or more, and of dealing with cold gas and narrow lines. The main systematic is the kinematical bias, i.e., that the different lines do not come exactly from the same material along the line of sight, with the same velocity. To improve this method one needs to find more sources, which may be possible with the radio telescope ALMAFootnote 3.

OH — 18 cm: \(F = {g_{\rm{P}}}{(\alpha _{{\rm{EM}}}^2\mu)^{1.57}}\)

Using transitions originating from a single species, as with SIDAM, allows one to reduce the systematic effects. The 18 cm lines of the OH radical offer such a possibility [95, 272].

The ground state, \(^2{\Pi _{3/2}}\) J = 3/2, of OH is split into two levels by Λ-doubling and each of these doubled levels is further split into two hyperfine-structure states. Thus, it has two “main” lines (ΔF = 0) and two “satellite” lines (ΔF = 1). Since these four lines arise from two different physical processes (Λ-doubling and hyperfine splitting), they enjoy the same Rydberg dependence but different \({g_{\rm{p}}}\) and αEM dependencies. By comparing the four transitions to the HI hyperfine line, one can have access to

$$F \equiv {g_{\rm{p}}}{(\alpha _{{\rm{EM}}}^2\mu)^{1.57}}$$
(87)

and it was also proposed to combine them with HCO+ transitions to lift the degeneracy.

Using the four 18 cm OH lines from the gravitational lens at z ∼ 0.765 toward PMN J0134-0931 and comparing the HI 21 cm and OH absorption redshifts of the different components made it possible to set the constraint [276]

$$\Delta F/F = {(} - 0.44 \pm 0.36 \pm {1.0_{{\rm{syst}}}}{)} \times {10^{- 5}},\quad z = 0.765,$$
(88)

where the second error is due to velocity offsets between OH and HI assuming a velocity dispersion of 3 km/s. A similar analysis [138] in a system in the direction of PKS 1413+135 gave

$$\Delta F/F = {(}0.51 \pm 1.26{)} \times {10^{- 5}},\quad z = 0.2467.$$
(89)

Far infrared fine-structure lines: \({F\prime} = \alpha _{{\rm{EM}}}^2\mu\)

Another combination [300] of constants can be obtained from the comparison of far infrared fine-structure spectra with rotational transitions, which respectively behave as \({R_\infty}\alpha _{{\rm{EM}}}^2\) and \({R_\infty}\bar \mu = {R_\infty}/\mu\), so that they give access to

$$F{\prime} = \alpha _{{\rm{EM}}}^2\mu.$$

A good candidate for the rotational lines is CO since it is the second most abundant molecule in the Universe after H2.

Using the CII fine-structure and CO rotational emission lines from the quasars J1148+5251 and BR 1202-0725, it was concluded that

$$\Delta F{\prime}/F{\prime} = {(}0.1 \pm 1.0{)} \times {10^{- 5}},\quad z = 6.42,$$
(90)
$$\Delta F{\prime}/F{\prime} = {(}1.4 \pm 1.5{)} \times {10^{- 5}},\quad z = 4.69,$$
(91)

which represents the best constraints at high redshift. As usual, when comparing the frequencies of two different species, one must account for random Doppler shifts caused by non-identical spatial distributions of the two species. Several other candidates for microwave and FIR lines with good sensitivities are discussed in [299].
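The combinations x, y, F and F′ introduced so far are all products of powers of αEM, \({g_{\rm{p}}}\) and μ, so a small fractional change of each constant translates linearly into a fractional change of the combination. The sketch below tabulates the exponents read off from the definitions given in the text and evaluates this first-order relation. The dictionary layout and function name are illustrative choices, and the input variations in the example are hypothetical.

```python
# Exponents of (alpha_EM, g_p, mu) in each combination, as defined in the text:
#   x  = alpha^2 g_p / mu          y  = g_p alpha^2
#   F  = g_p (alpha^2 mu)^1.57     F' = alpha^2 mu
EXPONENTS = {
    "x": {"alpha": 2.0, "g_p": 1.0, "mu": -1.0},
    "y": {"alpha": 2.0, "g_p": 1.0, "mu": 0.0},
    "F": {"alpha": 2.0 * 1.57, "g_p": 1.0, "mu": 1.57},
    "F_prime": {"alpha": 2.0, "g_p": 0.0, "mu": 1.0},
}


def fractional_variation(combo, d_alpha=0.0, d_gp=0.0, d_mu=0.0):
    """First-order fractional variation of a power-law combination, given the
    fractional variations of alpha_EM, g_p and mu (logarithmic derivative)."""
    e = EXPONENTS[combo]
    return e["alpha"] * d_alpha + e["g_p"] * d_gp + e["mu"] * d_mu


# Hypothetical input: alpha_EM varies by 1e-6, g_p and mu are held fixed.
for name in EXPONENTS:
    print(name, fractional_variation(name, d_alpha=1.0e-6))
```

This makes explicit why the different observables are complementary: each one constrains a different linear combination of ΔαEM/αEM, Δ\({g_{\rm{p}}}\)/\({g_{\rm{p}}}\) and Δμ/μ, so several are needed to separate the individual variations.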

“Conjugate” satellite OH lines: \(G = {g_{\rm{p}}}{({\alpha _{{\rm{EM}}}}\mu)^{1.85}}\)

The satellite OH 18 cm lines are conjugate so that the two lines have the same shape, but with one line in emission and the other in absorption. This arises due to an inversion of the level populations within the ground state of the OH molecule. This behavior has recently been discovered at cosmological distances and it was shown [95] that a comparison between the sum and difference of satellite line redshifts probes \(G = {g_{\rm{p}}}{({\alpha _{{\rm{EM}}}}\mu)^{1.85}}\).

From the analysis of the two conjugate satellite OH systems at z ∼ 0.247 towards PKS 1413+135 and at z ∼ 0.765 towards PMN J0134-0931, it was concluded [95] that

$$\vert \Delta G/G\vert < 7.6 \times {10^{- 5}}.$$
(92)

It was also applied to a nearby system, Centaurus A, to give ∣ΔG/G∣ < 1.6 × 10−5 at z ∼ 0.0018. A more recent analysis [273] claims tentative evidence (with 2.6σ significance, or at 99.1% confidence) for a smaller value of G,

$$\Delta G/G = {(} - 1.18 \pm 0.46{)} \times {10^{- 5}}$$
(93)

for the system at z ∼ 0.247 towards PKS 1413+135.

One strength of this method is that it guarantees that the satellite lines arise from the same gas, which prevents velocity offsets between the lines. Also, the shapes of the two lines must agree if they arise from the same gas.

Molecular spectra and the electron-to-proton mass ratio

As was pointed out in Section 3.1, molecular lines can provide a test of the variationFootnote 4 [488] of μ since rotational and vibrational transitions are respectively inversely proportional to their reduced mass and its square root [see Equation (35)].

Constraints with H2

H2 is the most abundant molecule in the universe and there were many attempts to use its absorption spectra to put constraints on the time variation of μ despite the fact that H2 is very difficult to detect [387].

As proposed in [512], the sensitivity of a vibro-rotational wavelength to a variation of μ can be parameterized as

$${\lambda _i} = \lambda _i^0(1 + {z_{{\rm{abs}}}})\left({1 + {K_i}{{\Delta \mu} \over \mu}} \right),$$

where \(\lambda _i^0\) is the laboratory wavelength (in the vacuum) and \({\lambda _i}\) is the observed wavelength of the transition i, which arises in a cloud at redshift zabs, so that its wavelength in the rest frame of the cloud is \({\lambda _i}/(1 + {z_{{\rm{abs}}}})\). \({K_i}\) is a sensitivity coefficient analogous to the q-coefficient introduced in Equation (70), but with a different normalization since in that parameterization we would have \({q_i} = \omega _i^0{K_i}/2\),

$${K_i} \equiv {{{\rm{d}}\ln {\lambda _i}} \over {{\rm{d}}\ln \mu}}$$

corresponding to the Lyman and Werner bands of molecular hydrogen. From this expression, one can deduce that the observed redshift measured from the transition i is simply

$${z_i} = {z_{{\rm{abs}}}} + b{K_i},\quad b \equiv - (1 + {z_{{\rm{abs}}}}){{\Delta \mu} \over \mu},$$

which implies in particular that zabs is not the mean of the \({z_i}\) if Δμ ≠ 0. Each \({z_i}\) is affected by the uncertainty of the astronomical measurement of \({\lambda _i}\) and by the error on the laboratory measurement of \(\lambda _i^0\). But if Δμ ≠ 0 there must exist a correlation between \({z_i}\) and \({K_i}\), so that a linear regression of \({z_i}\) (measurement) as a function of \({K_i}\) (computed) allows one to extract (zabs, b) and their statistical significance.
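The regression described above can be sketched in a few lines. What follows is a weighted straight-line fit of \({z_i}\) against \({K_i}\), followed by the conversion of the slope b into Δμ/μ using the definition of b given in the text (sign conventions for μ and Δμ differ between papers, so this sign should be treated as an assumption). The function names, the sensitivity coefficients and the redshift errors are hypothetical, and the data are synthetic, generated from a known input value. Real analyses involve line selection and profile fitting that are not represented here.

```python
import math


def fit_line(x, y, sigma):
    """Weighted least-squares fit of y = a + b x.
    Returns (a, b, sigma_a, sigma_b)."""
    w = [1.0 / s ** 2 for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * sxx - sx * sx
    a = (sxx * sy - sx * sxy) / det
    b = (s0 * sxy - sx * sy) / det
    return a, b, math.sqrt(sxx / det), math.sqrt(s0 / det)


def dmu_over_mu(z_lines, k_coeffs, sigma_z):
    """Fit z_i = z_abs + b K_i and convert b to Delta mu / mu using
    b = -(1 + z_abs) Delta mu / mu (convention of the text)."""
    z_abs, b, _, sig_b = fit_line(k_coeffs, z_lines, sigma_z)
    return z_abs, -b / (1.0 + z_abs), sig_b / (1.0 + z_abs)


# Synthetic, noise-free data generated from known inputs (all values hypothetical).
z_abs_true, dmu_true = 3.0249, 2.0e-5
k = [-0.005, 0.0, 0.01, 0.02, 0.03, 0.045]
b_true = -(1.0 + z_abs_true) * dmu_true
z = [z_abs_true + b_true * ki for ki in k]
sig = [2.0e-6] * len(k)

z_abs_fit, dmu_fit, dmu_err = dmu_over_mu(z, k, sig)
print(f"z_abs = {z_abs_fit:.6f}, Delta mu/mu = {dmu_fit:.2e} +/- {dmu_err:.1e}")
```

Because the synthetic redshifts contain no noise, the fit recovers the input values of zabs and Δμ/μ. The quoted uncertainty then reflects only the assumed per-line redshift errors and the spread of the \({K_i}\), which shows why a wide range of sensitivity coefficients matters for the constraint.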

We refer to Section V.C of FVC [500] for earlier studies and we focus on the latest results. The recent constraints are mainly based on the molecular hydrogen of two damped Lyman-α absorption systems at z = 2.3377 and 3.0249 in the direction of two quasars (Q 1232+082 and Q 0347-382) for which a first analysis of VLT/UVES data showed [262] a slight indication of a variation,

$$\Delta \mu/\mu = {(}5.7 \pm 3.8{)} \times {10^{- 5}}$$

at 1.5σ for the combined analysis. The lines were selected so that they are isolated, unsaturated and unblended. It follows that the analysis relies on 12 lines (out of 50 detected) for the first quasar and 18 (out of 80) for the second, but the two selected spectra had no transition in common. The authors performed their analysis with two laboratory catalogs and got different results. They point out that the errors on the laboratory wavelengths are comparable to those of the astronomical measurements.

It was further improved with an analysis of two absorption systems at z = 2.5947 and z = 3.0249 in the directions of Q 0405-443 and Q 0347-383 observed with the VLT/UVES spectrograph. The data have a resolution R = 53000 and a S/N ratio ranging between 30 and 70. The same selection criteria were applied, leaving respectively 39 (out of 40) and 37 (out of 42) lines for each spectrum and only 7 transitions in common. The combined analysis of the two systems led to [261]

$$\Delta \mu/\mu = {(}1.65 \pm 0.74{)} \times {10^{- 5}}\quad {\rm{or}}\quad \Delta \mu/\mu = {(}3.05 \pm 0.75{)} \times {10^{- 5}},$$

according to the laboratory measurements that were used. The same data were reanalyzed with new and highly accurate measurements of the Lyman bands of H2, which implied a reevaluation of the sensitivity coefficient K i . It leads to the two constraints [431]

$$\Delta \mu/\mu = {(}2.59 \pm 0.88{)} \times {10^{- 5}},\quad z = 2.59,$$
(94)
$$\Delta \mu/\mu = {(}2.06 \pm 0.79{)} \times {10^{- 5}},\quad z= 3.02,$$
(95)

leading to a 3.5σ detection for the weighted mean Δμ/μ = (2.4 ± 0.66) × 10−5. The authors of [431] do not claim a detection and are cautious enough to state that systematics dominate the measurements. The data of the z = 3.02 absorption system were re-analyzed in [529], which claims that they lead to the bound ∣Δμ/μ∣ < 4.9 × 10−5 at a 2σ level, instead of Equation (95). Adding a new set of 6 spectra, it was concluded that Δμ/μ = (15 ± 14) × 10−6 for the weighted fit [530].

These two systems were reanalyzed [289], adding a new system in the direction of Q 0528-250,

$$\Delta \mu/\mu = {(}1.01 \pm 0.62{)} \times {10^{- 5}},\quad z = 2.59,$$
(96)
$$\Delta \mu/\mu = {(}0.82 \pm 0.74{)} \times {10^{- 5}},\quad z = 2.8,$$
(97)
$$\Delta \mu/\mu = {(}0.26 \pm 0.30{)} \times {10^{- 5}},\quad z = 3.02,$$
(98)

respectively with 52, 68 and 64 lines. This gives a weighted mean of (2.6±3.0) × 10−6 at z ∼ 2.81. To compare with the previous data, the analysis of the two quasars in common was performed by using the same lines (this implies adding 3 and removing 16 for Q 0405-443 and adding 4 and removing 35 for Q 0347-383) to get respectively (−1.02 ± 0.89) × 10−5 (z = 2.59) and (−1.2 ± 1.4) × 10−5 (z = 3.02). Both analyses disagree and this latter analysis indicates a systematic shift of Δμ/μ toward 0. A second re-analysis of the same data was performed in [490, 489] using a different analysis method to get

$$\Delta \mu/\mu = (- 7 \pm 8) \times {10^{- 6}}.$$
(99)

Recently discovered molecular transitions at z = 2.059 toward the quasar J2123-0050, observed by the Keck telescope, made it possible to use 86 H2 transitions and 7 HD transitions to conclude [342]

$$\Delta \mu /\mu = (5{.}6 \pm {5.5_{{\rm{stat}}}} \pm {2.7_{{\rm{syst}}}}) \times {10^{- 6}},\quad z = 2{.}059{.}$$
(100)

This method is subject to important systematic errors, among which (1) the sensitivity to the laboratory wavelengths (since the use of two different catalogs yields different results [431]) and (2) the fact that the molecular lines are located in the Lyman-α forest, where they can be strongly blended with intervening HI Lyman-α absorption lines, which requires a careful fitting of the lines [289] since it is hard to find lines that are not contaminated. From an observational point of view, very few damped Lyman-α systems have a measurable amount of H2, so that only a dozen systems are actually known, even though more systems will be obtained soon [411]. To finish, the sensitivity coefficients are usually low, typically of the order of 10−2. Some advantages of using H2 arise from the fact that there are several hundred available H2 lines, so that many lines from the same ground state can be used to eliminate different kinematics between regions of different excitation temperatures. The overlap between Lyman and Werner bands also allows one to reduce the calibration errors.

To conclude, the combination of all the existing observations indicates that μ has been constant at the 10−5 level during the past 11 Gyr, while an improvement by a factor of 10 can be expected in the coming five years.

Other constraints

It was recently proposed [201, 202] that the inversion spectrum of ammonia allows for a better sensitivity to μ. The inversion vibro-rotational mode is described by a double well with the first two levels below the barrier. The tunneling implies that these two levels are split in inversion doublets. It was concluded that the inversion transitions scale as \({\nu _{{\rm{inv}}}} \sim {{\bar \mu}^{4.46}}\), compared with a rotational transition, which scales as \({\nu _{{\rm{rot}}}} \sim \bar \mu\). This implies that the redshifts determined by the two types of transitions are modified according to \(\delta {z_{{\rm{inv}}}} = 4.46(1 + {z_{{\rm{abs}}}})\Delta \mu/\mu\) and \(\delta {z_{{\rm{rot}}}} \sim (1 + {z_{{\rm{abs}}}})\Delta \mu/\mu\) so that

$$\Delta \mu/\mu = 0.289{{{z_{{\rm{inv}}}} - {z_{{\rm{rot}}}}} \over {1 + {z_{{\rm{abs}}}}}}.$$
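As a minimal sketch of how the relation above is used, the following snippet converts a pair of measured redshifts into Δμ/μ. The coefficient 0.289 is 1/(4.46 − 1); the function name and the input values are hypothetical and chosen only for illustration.

```python
def dmu_over_mu_from_ammonia(z_inv, z_rot, z_abs):
    """Delta mu / mu from the offset between inversion and rotational redshifts.

    Uses delta z_inv = 4.46 (1 + z_abs) dmu/mu and delta z_rot = (1 + z_abs) dmu/mu,
    so that dmu/mu = (z_inv - z_rot) / (3.46 (1 + z_abs)), with 1/3.46 = 0.289.
    """
    return (z_inv - z_rot) / ((4.46 - 1.0) * (1.0 + z_abs))

# Hypothetical input: a fractional variation of 1e-6 at z_abs = 0.68466.
z_abs = 0.68466
dmu_in = 1.0e-6
z_inv = z_abs + 4.46 * (1.0 + z_abs) * dmu_in
z_rot = z_abs + 1.00 * (1.0 + z_abs) * dmu_in
print(dmu_over_mu_from_ammonia(z_inv, z_rot, z_abs))
```

In this example a variation of 10−6 corresponds to a redshift offset of only about 6 × 10−6 between the two types of lines, which illustrates the precision required on the line positions.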

Only one quasar absorption system, at z = 0.68466 in the direction of B 0218+357, displaying NH3 is currently known and allows for this test. A first analysis [201] estimated from the published redshift uncertainties that a precision of ∼ 2 × 10−6 on Δμ/μ can be achieved. A detailed measurement [366] of the ammonia inversion transitions by comparison to HCN and HCO+ rotational transitions concluded that

$$\vert \Delta \mu/\mu \vert < 1.8 \times {10^{- 6}},\quad z = 0.685,$$
(101)

at a 2σ level. Recently the analysis of the comparison of NH3 to HC3N spectra was performed toward the gravitational lens system PKS 1830-211 (z ≃ 0.89), which is a much more suitable system, with 10 detected NH3 inversion lines and a forest of rotational transitions. It reached the conclusion that

$$\vert \Delta \mu/\mu \vert < 1.4 \times {10^{- 6}},\quad z = 0.89,$$
(102)

at a 3σ level [250]. From a comparison of the ammonia inversion lines with the NH3 rotational transitions, it was concluded [353]

$$\vert \Delta \mu/\mu \vert < 3.8 \times {10^{- 6}},\quad z = 0.89,$$
(103)

at 95% C.L. One strength of this analysis is to focus on lines arising from only one molecular species but it was mentioned that the frequencies of the inversion lines are about 25 times lower than the rotational ones, which might cause differences in the absorbed background radio continuum.

This method was also applied [323] in the Milky Way, in order to constrain the spatial variation of μ in the galaxy (see Section 6.1.3). Using ammonia emission lines from interstellar molecular clouds (Perseus molecular core, the Pipe nebula and the infrared dark clouds) it was concluded that Δμ/μ = (4–14) × 10−8. This corresponds to a positive velocity offset between the ammonia inversion transition and rotational transitions of other molecules. Two systems being located toward the galactic center while one is in the direction of the anti-center, this may indicate a spatial variation of μ on galactic scales.

New possibilities

The detection of several deuterated molecular hydrogen HD transitions makes it possible to test the variation of μ in the same way as with H2 but in a completely independent way, even though today it has been detected only in 2 places in the universe. The sensitivity coefficients have been published in [263] and HD was first detected by [387].

HD was recently detected [473] together with CO and H2 in a DLA cloud at a redshift of 2.418 toward SDSS1439+11, with 5 lines of HD in 3 components together with several H2 lines in 7 components. This made it possible to set the 3σ limit ∣Δμ/μ∣ < 9 × 10−5 [412].

Even though the small number of lines does not allow one to reach the level of accuracy of H2, it is a very promising system, in particular to obtain independent measurements.

Emission spectra

Similar analyses to constrain the time variation of the fundamental constants were also performed with emission spectra. Very few such estimates have been performed, since this method is less sensitive and harder to extend to sources with high redshift. In particular, emission lines are usually broad as compared to absorption lines and the larger individual errors need to be beaten by large statistics.

The O III doublet analysis [24] from a sample of 165 quasars from SDSS gave the constraint

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = (12 \pm 7) \times {10^{- 5}},\quad 0.16 < z < 0.8.$$
(104)

The method was then extended straightforwardly along the lines of the MM method and applied [238] to the fine-structure transitions in Ne III, Ne V, O III, O I and S II multiplets from a sample of 14 Seyfert 1.5 galaxies to derive the constraint

$$\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = (150 \pm 70) \times {10^{- 5}},\quad 0.035 < z < 0.281.$$
(105)

Conclusion and prospects

This subsection illustrates the diversity of methods and the progress that has been achieved to set robust constraints on the variation of fundamental constants. Many systems are now used, giving access to different combinations of the constants. They exploit a large part of the electromagnetic spectrum, from the far infrared to the ultraviolet and radio bands, and optical and radio techniques have played complementary roles. The most recent and accurate constraints are summarized in Table 10 and Figure 4.

Figure 4

Summary of the direct constraints on αEM obtained from the AD (blue), MM (red) and AD (green) methods (left) and on μ (right) that are summarized in Table 10.

Table 10 Summary of the latest constraints on the variation of fundamental constants obtained from the analysis of quasar absorption spectra. We recall that \(y \equiv {g_{\rm{p}}}\alpha _{{\rm{EM}}}^2,\,F \equiv {g_{\rm{p}}}{(\alpha _{{\rm{EM}}}^2\mu)^{1.57}},\,x \equiv \alpha _{{\rm{EM}}}^2{g_{\rm{p}}}/\mu ,\,F' \equiv \alpha _{{\rm{EM}}}^2\mu\), \(\mu \equiv {m_{\rm{p}}}/{m_{\rm{e}}}\) and \(G \equiv {g_{\rm{p}}}{({\alpha _{{\rm{EM}}}}\mu)^{1.85}}\).

At the moment, only one analysis claims to have detected a variation of the fine structure constant (Keck/HIRES) while the VLT/UVES data point toward no variation of the fine structure constant. This has led to the proposal that αEM may be space dependent and exhibit a dipole, the origin of which is not explained. Needless to say, such a controversy and such hypotheses are healthy since they will help improve the analysis of these data, but it is premature to draw conclusions on the outcome of this debate and the jury is still out. Most of the systematics have been investigated in detail and now seem under control.

Let us see what we can learn about the physics from these measurements. As an example, consider the constraints obtained on μ, y and F in the redshift band 0.6–0.8 (see Table 10). They can be used to extract independent constraints on gp, αEM and μ

$$\Delta \mu/\mu = {(}0 \pm 0.18{)} \times {10^{- 5}},\quad \Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = {(} - 0.27 \pm 2.09{)} \times {10^{- 5}},\quad \Delta {g_{\rm{p}}}/{g_{\rm{p}}} = {(}0.38 \pm 4.73{)} \times {10^{- 5}}.$$
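To illustrate how such independent constraints can be extracted, the sketch below inverts the linear relations that follow from the definitions \(y \equiv g_{\rm p}\alpha_{\rm EM}^2\) and \(F \equiv g_{\rm p}(\alpha_{\rm EM}^2\mu)^{1.57}\). It assumes that the three input measurements are independent with Gaussian errors; the function name and the input numbers are hypothetical and are not the entries of Table 10.

```python
import numpy as np

# Rows: observables (mu, y, F); columns: constants (mu, alpha_EM, g_p).
# d ln y = 2 d ln alpha_EM + d ln g_p
# d ln F = 1.57 (2 d ln alpha_EM + d ln mu) + d ln g_p
M = np.array([[1.00, 0.00, 0.0],
              [0.00, 2.00, 1.0],
              [1.57, 3.14, 1.0]])

def disentangle(d_obs, sigma_obs):
    """Return relative variations of (mu, alpha_EM, g_p) and their 1-sigma errors.

    d_obs and sigma_obs hold the measured relative variations of (mu, y, F)
    and their uncertainties, assumed independent.
    """
    d_obs = np.asarray(d_obs, dtype=float)
    sigma_obs = np.asarray(sigma_obs, dtype=float)
    M_inv = np.linalg.inv(M)
    values = M_inv @ d_obs
    cov = M_inv @ np.diag(sigma_obs ** 2) @ M_inv.T   # linear error propagation
    return values, np.sqrt(np.diag(cov))

# Hypothetical inputs, for illustration only.
values, errors = disentangle([0.0, 1.0e-6, 2.0e-6], [2.0e-6, 3.0e-6, 4.0e-6])
for name, v, e in zip(("mu", "alpha_EM", "g_p"), values, errors):
    print(f"d ln {name} = {v:.2e} +/- {e:.2e}")
```

Because αEM and gp enter both y and F, their inferred errors are strongly correlated and larger than the error on μ, which is the qualitative pattern seen in the constraints quoted above.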

This shows that one can test the compatibility of the constraints obtained from different kinds of systems. Independently of these constraints, we have seen in Section 6.3 that in grand unification theories the variations of the constants are correlated. The former constraints show that if Δln μ = RΔlnαEM then the constraint (101) imposes that ∣RΔlnαEM∣ < 1.8 × 10−6. In general R is expected to be of the order of 30–50. Even if its value is time-dependent, that would mean that ΔlnαEM ∼ (1–5) × 10−7, which is highly incompatible with the constraint (74) obtained by the same team on αEM, but also with the constraints (71) and (72) obtained from the AD method and on which both teams agree. This illustrates how important the whole set of data is, since one will probably be able to constrain the order of magnitude of R in the near future, which would be a very important piece of information for the theoretical investigations.

We have mentioned in the course of this section many possibilities to improve these constraints.

Since the AD method is free of the two main assumptions of the MM method, it seems important to increase the precision of this method as well as any method relying only on one species. This can be achieved by increasing the S/N ratio and spectral resolution of the data used or by increasing the sample size and including new transitions (e.g., cobalt [172, 187]).

The search for a better resolution is being investigated in many directions. With the current resolution of R ∼ 40000, the observed line positions can be determined with an accuracy of \(\sigma_\lambda\) ∼ 1 mÅ. This implies that the accuracy on ΔαEM/αEM is of the order of 10−5 for lines with typical q-coefficients. As we have seen, this limit can be improved to 10−6 when more transitions or systems are used together. Any improvement is related to the possibility to measure line positions more accurately. This can be done by increasing R up to the point at which the narrowest lines in the absorption systems are resolved. The Bohlin formula [62] gives the estimate

$${\sigma _\lambda} \sim \Delta {\lambda _{{\rm{pix}}}}\left({{{\Delta {\lambda _{{\rm{pix}}}}} \over {{W_{{\rm{obs}}}}}}} \right){1 \over {\sqrt {{N_e}}}}\left({{{{M^{3/2}}} \over {\sqrt {12}}}} \right),$$

where Δλpix is the pixel size, Wobs is the observed equivalent width, \(N_e\) is the mean number of photoelectrons at the continuum level and M is the number of pixels covering the line profile. The metal lines have intrinsic widths of a few km/s. Thus, one can expect improvements from higher spectral resolution. Progress concerning the calibration is also expected, using, e.g., laser combs [478]. Let us just mention the ESPRESSO (Echelle Spectrograph for PREcision Super Stable Observation) project [115] on 4 VLT units or the CODEX (COsmic Dynamics EXplorer) on E-ELT projects [360, 357, 507]. They shall provide a resolving power of R = 150000, to be compared to the HARPSFootnote 5 (High Accuracy Radial velocity Planet Searcher) spectrograph (R ∼ 112000), which has been used but operates on a 3.6 m telescope.
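The scaling of the Bohlin estimate with the instrumental parameters can be explored with a few lines of code. This is a direct transcription of the formula above; the function name and the numerical inputs are hypothetical and do not describe any particular spectrograph.

```python
import math

def bohlin_sigma_lambda(pix, w_obs, n_e, m_pix):
    """Line-position uncertainty from the Bohlin estimate.

    pix   : pixel size (same wavelength unit as w_obs)
    w_obs : observed equivalent width
    n_e   : mean number of photoelectrons per pixel at the continuum level
    m_pix : number of pixels covering the line profile
    """
    return pix * (pix / w_obs) / math.sqrt(n_e) * m_pix ** 1.5 / math.sqrt(12.0)

# Hypothetical values in angstroms, for illustration only.
sigma = bohlin_sigma_lambda(pix=0.02, w_obs=0.1, n_e=1.0e4, m_pix=12)
print(f"sigma_lambda ~ {sigma * 1e3:.2f} mA")
```

The estimate improves as the inverse square root of the number of photoelectrons, and a narrower profile sampled by fewer pixels of the same size is located more precisely, which is why resolving the narrowest lines matters.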

The limitation may then lie in the statistics and the calibration, and it would be useful to use more than two QSOs with overlapping spectra to cross-calibrate the line positions. This means that one needs to discover more absorption systems suited for these analyses. Much progress is expected. For instance, the FIR lines are expected to be observed by a new generation of telescopes such as HERSCHELFootnote 6. While the size of the radio sample is still small, surveys are being carried out so that the number of known redshifted OH, HI and HCO+ absorption systems will increase. For instance the future Square Kilometer Array (SKA) will be able to detect relative changes of the order of 10−7 in αEM.

In conclusion, it is clear that these constraints and the understanding of the absorption systems will increase in the coming years.

Stellar constraints

Stars start to accumulate helium produced by the pp-reaction and the CNO cycle in their core. Furthermore, the products of further nuclear reactions of helium with either helium or hydrogen lead to isotopes with A = 5 or A = 8, which are highly unstable. In order to produce elements with A > 7 by fusion of lighter isotopes, the stars need to reach high temperatures and densities. In these conditions, newly produced 12C would almost immediately be fused further to form heavier elements so that one expects only a tiny amount of 12C to be produced, in contradiction with the observed abundances. This led Hoyle [257] to conclude that a then unknown excited state of 12C with an energy close to the 3α-threshold should exist, since such a resonance would increase the probability that 8Be captures an α-particle. It follows that the production of 12C in stars relies on three conditions:

  • the decay lifetime of 8Be, of order 10−16 s, is four orders of magnitude longer than the time for two α particles to scatter, so that a macroscopic amount of beryllium can be produced, which is sufficient to lead to considerable production of carbon,

  • an excited state of 12C lies just above the energy of 8Be+α, which allows for

    $$^4{\rm{He}}{{\rm{+}}^4}{\rm{He}}{\leftrightarrow ^8}{\rm{Be,}}{\quad ^8}{\rm{Be}}{{\rm{+}}^4}{\rm{He}}{\leftrightarrow ^{12}}{\rm{C\ast}}\;{\rightarrow^{12}}{\rm{C + 7}}{\rm{.367}}\;{\rm{MeV,}}$$
  • the energy level of 16O at 7.1197 MeV is non resonant and below the energy of 12C + α, of order 7.1616 MeV, which ensures that most of the carbon synthesized is not destroyed by the capture of an α-particle.

The existence of this resonance, the \(J_l^\pi = 0_2^ +\)-state of 12C, was actually discovered [111] experimentally later, with an energy of 372 ± 4 keV [today, \({E_{0_2^ +}} = 379.47 \pm 0.15\,{\rm{keV}}\)] above the ground state of three α-particles (see Figure 5).

Figure 5

Left: Level scheme of nuclei participating in the 4He(αα, γ)12C reaction. Right: Central abundances at the end of the core He (CHe) burning as a function of \(\delta_{NN}\) for a 60 M⊙ star with Z = 0. From [103].

The variation of any constant that would modify the energy of this resonance would also endanger the stellar nucleosynthesis of carbon, so that the possibility for carbon production has often been used in anthropic arguments. Qualitatively, if \({E_{0_2^ +}}\) is increased then the carbon would be rapidly processed to oxygen since the star would need to be hotter for the triple-α process to start. On the other hand, if \({E_{0_2^ +}}\) is decreased, then all α-particles would produce carbon so that no oxygen would be synthesized. It was estimated [334] that the carbon production in intermediate and massive stars is suppressed if the variation of the energy of the resonance is outside the range \(- 250\,{\rm{keV \lesssim}}\Delta {{\rm{E}}_{0_2^ +}} \lesssim 60\,{\rm{keV}}\), which was further improved [451] to \(- 5\,{\rm{keV \lesssim}}\Delta {{\rm{E}}_{0_2^ +}} \lesssim 50\,{\rm{keV}}\) in order for the C/O ratio to be larger than the error in the standard yields by more than 50%. Indeed, in such an analysis, the energy of the resonance was changed by hand. However, we expect that if \({E_{0_2^ +}}\) is modified due to the variation of a constant, other quantities, such as the resonance of the oxygen, the binding energies and the cross sections, will also be modified in a complex way.

In practice, to draw a constraint on the variation of the fundamental constants from the stellar production of carbon, one needs to go through different steps, any of them involving assumptions,

  1. to determine the effective parameters, e.g., cross sections, which affect the stellar evolution. The simplest choice is to modify only the energy of the resonance but it may not be realistic since all cross sections and binding energies should also be affected. This requires one to use a stellar evolutionary model;

  2. to relate these parameters to nuclear parameters. This involves the whole nuclear physics machinery;

  3. to relate the nuclear parameters to fundamental constants. As for the Oklo phenomenon, this requires linking QCD to nuclear physics.

A first analysis [390, 391, 451] used a model that treats the carbon nucleus by solving the 12-nucleon Schrödinger equation using a three-cluster wavefunction representing the three-body dynamics of the 12C state. The NN interaction was described by the Minnesota model [297, 491] and its strength was modified by multiplying the effective NN-potential by an arbitrary number p. This allows one to express the energy of the Hoyle level relative to the triple alpha threshold, \(\varepsilon \equiv {Q_{\alpha \alpha \alpha}}\), and the gamma width, \({\Gamma _\gamma}\), as functions of the parameter p, the latter being almost unaffected. The modified 3α-reaction rate was then given by

$${r_\alpha} = {3^{3/2}}N_\alpha ^3{\left({{{2\pi {\hbar ^2}} \over {{M_\alpha}{k_{\rm{B}}}T}}} \right)^3}{\Gamma \over \hbar}\exp \left[ {- {{\varepsilon (p)} \over {{k_{\rm{B}}}T}}} \right],$$
(106)

where \({M_\alpha}\) and \({N_\alpha}\) are the mass and number density of the α-particle, and the resonance width is \(\Gamma = {\Gamma _\alpha}{\Gamma _\gamma}/({\Gamma _\alpha} + {\Gamma _\gamma}) \sim {\Gamma _\gamma}\). This was included in a stellar code and run for red giant stars with 1.3, 5 and 20 M⊙ with solar metallicity up to the thermally pulsating asymptotic giant branch (TP-AGB) [390] and for low, intermediate and high mass stars (1.3, 5, 15, 25 M⊙) with solar metallicity, also up to TP-AGB [451], to conclude that outside a window of respectively 0.5% and 4% of the values of the strong and electromagnetic forces, the stellar production of carbon or oxygen will be reduced by a factor 30 to 1000.

In order to compute the resonance energy of 8Be and 12C, a microscopic cluster model was developed [297]. The Hamiltonian of the system is then of the form \(H = \sum \nolimits_i^A T({{\bf{r}}_i}) + \sum \nolimits_{j < i}^A V({{\bf{r}}_{ij}})\), where A is the nucleon number, T the kinetic energy and V the NN interaction potential. In order to implement the variation of the strength of the nuclear interaction with respect to the electromagnetic interaction, it was taken as

$$V({{\bf{r}}_{ij}}) = {V_C}({{\bf{r}}_{ij}}) + (1 + {\delta _{NN}}){V_N}({{\bf{r}}_{ij}}),$$

where \({\delta _{NN}}\) is a dimensionless parameter that describes the change of the nuclear interaction, \(V_N\) being described in [491]. When A > 4 no exact solution can be found, and approximate solutions in which the wave functions of 8Be and 12C are described by clusters of respectively 2 and 3 α-particles are well adapted.

First, \({\delta _{NN}}\) can be related to the deuterium binding energy as

$$\Delta {B_D}/{B_D} = 5.7701 \times {\delta _{NN}},$$
(107)

which, given the discussion in Section 3.8.3, allows one to relate \({\delta _{NN}}\) to fundamental constants, as, e.g., in [104]. Then, the resonance energies of 8Be and 12C scale as

$${E_R}{(^8}{\rm{Be}}) = {(}0.09208 - 12.208 \times {\delta _{NN}}{)}\;{\rm{MeV,}}\quad {E_R}{(^{12}}{\rm{C}}) = {(}0.2877 - 20.412 \times {\delta _{NN}}{)}\;{\rm{MeV}},$$
(108)

so that the energy of the Hoyle level relative to the triple alpha threshold is Q ααα = E R (8Be) + E R (12C).
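The sensitivity of the 3α rate to \(\delta_{NN}\) can be illustrated by combining Equations (107) and (108) with the exponential factor of Equation (106). The sketch below identifies ε with \(Q_{\alpha\alpha\alpha}\) and holds Γ and all prefactors fixed; these simplifications, the function names and the temperature of 10^8 K are assumptions made for illustration, not part of the analyses cited above.

```python
import math

K_B_KEV_PER_K = 8.617333262e-8   # Boltzmann constant in keV/K

def q_triple_alpha_kev(delta_nn):
    """Hoyle level energy above the 3-alpha threshold, in keV, from Eq. (108)."""
    e_be = (0.09208 - 12.208 * delta_nn) * 1.0e3   # E_R(8Be)
    e_c = (0.2877 - 20.412 * delta_nn) * 1.0e3     # E_R(12C)
    return e_be + e_c

def deuterium_binding_shift(delta_nn):
    """Relative change of the deuterium binding energy, from Eq. (107)."""
    return 5.7701 * delta_nn

def rate_ratio(delta_nn, temperature_k):
    """Ratio r(delta_nn)/r(0) from the Boltzmann factor of Eq. (106) only."""
    dq = q_triple_alpha_kev(delta_nn) - q_triple_alpha_kev(0.0)
    return math.exp(-dq / (K_B_KEV_PER_K * temperature_k))

T = 1.0e8   # assumed helium-burning temperature in K
for d in (-5.0e-4, 0.0, 1.5e-3):
    print(f"delta_NN = {d:+.1e}: Q = {q_triple_alpha_kev(d):7.2f} keV, "
          f"dB_D/B_D = {deuterium_binding_shift(d):+.2e}, "
          f"rate ratio = {rate_ratio(d, T):.3g}")
```

Since \(k_{\rm B}T\) is only a few keV at these temperatures, a shift of \(Q_{\alpha\alpha\alpha}\) by a few tens of keV changes the rate by orders of magnitude, which is why per-mille variations of \(\delta_{NN}\) matter.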

This was implemented in [103, 180] for population III stars with typical masses of 15 and 60 M⊙ and zero metallicity, in order to compute the central abundances at the end of the core He (CHe) burning. From Figure 5, one can distinguish 4 regimes: (I) the star ends the CHe burning phase with a core composed of a mixture of 12C and 16O, as in the standard case; (II) if the 3α rate is weaker, 12C is produced more slowly and the reaction 12C(α, γ)16O becomes efficient earlier, so that the star ends the CHe burning phase with a core composed mostly of 16O; (III) for even weaker rates, the 16O is further processed to 20Ne and then 24Mg, so that the star ends the CHe burning phase with a core composed of 24Mg; and (IV) if the 3α rate is stronger, the 12C is produced more rapidly and the star ends the CHe burning phase with a core composed mostly of 12C. Typically this imposes that

$$- 5 \times {10^{- 4}} < {\delta _{NN}} < 1.5 \times {10^{- 3}},\quad - 3 \times {10^{- 3}} < \Delta {B_D}/{B_D} < 9 \times {10^{- 3}},$$
(109)

at a redshift of order z ∼ 15, to ensure the ratio C/O to be of order unity.

To finish, a recent study [3] focuses on the existence of stars themselves, by revisiting the stellar equilibrium when the values of some constants are modified. In some sense, it can be seen as a generalization of the work by Gamow [224] to constrain the Dirac model of a varying gravitational constant by estimating its effect on the lifetime of the Sun. In this semi-analytical stellar structure model, the effect of the fundamental constants was reduced phenomenologically to 3 parameters: G, which enters mainly in the hydrostatic equilibrium, αEM, which enters in the Coulomb barrier penetration through the Gamow energy, and a composite parameter \({\mathcal C}\), which describes globally the modification of the nuclear reaction rates. The underlying idea is to assume that the power generated per unit volume, ε(r), which determines the luminosity of the star, is proportional to the fudge factor \({\mathcal C}\), which would arise from a modification of the nuclear fusion factor, or equivalently of the cross section. Thus, it assumes that all cross sections are affected in a similar way. The parameter space for which stars can form and for which stable nuclear configurations exist was determined, showing that no fine-tuning seems to be required.

This new system is very promising and will provide new information on the fundamental constants at redshifts smaller than z ∼ 15 where no constraints exist at the moment, even though drawing a robust constraint seems to be difficult at the moment. In particular, an underlying limitation arises from the fact that the composition of the interstellar medium is a mixture of ejecta from stars with different masses and it is not clear which type of stars contributes the most to the carbon and oxygen production. Besides, one would need to include rotation and mass loss [181]. As for the Oklo phenomenon, another limitation arises from the complexity of nuclear physics.

Cosmic Microwave Background

The CMB radiation is composed of photons emitted at the time of the recombination of hydrogen and helium when the universe was about 300,000 years old [see, e.g., [409] for details on the physics of the CMB]. This radiation is observed to be a black-body with a temperature T0 = 2.725 K with small anisotropies of order of the μK. The temperature fluctuation in a direction (ϑ, φ) is usually decomposed on a basis of spherical harmonics as

$${{\delta T} \over T}(\vartheta, \varphi) = \sum\limits_\ell {\sum\limits_{m = - \ell}^{m = + \ell} {{a_{\ell m}}{Y_{\ell m}}}} (\vartheta, \varphi).$$
(110)

The angular power spectrum multipole \({C_\ell} = \langle \vert {a_{\ell m}}{\vert ^2}\rangle\) is the coefficient of the decomposition of the angular correlation function on Legendre polynomials. Given a model of structure formation and a set of cosmological parameters, this angular power spectrum can be computed and compared to observational data in order to constrain this set of parameters.

The CMB temperature anisotropies mainly depend on three constants: G, αEM and me.

The gravitational constant enters in the Friedmann equation and in the evolution of the cosmological perturbations. It has mainly three effects [435] that are detailed in Section 4.4.1. αEM and me affect the dynamics of the recombination. Their influence is complex and must be computed numerically. However, we can trace their main effects since they mainly modify the CMB spectrum through the change in the differential optical depth of photons due to the Thomson scattering

$$\dot \tau = {x_{\rm{e}}}{n_{\rm{e}}}c{\sigma _{\rm{T}}},$$
(111)

which enters in the collision term of the Boltzmann equation describing the evolution of the photon distribution function and where xe is the ionization fraction (i.e., the number density of free electrons with respect to their total number density ne).

The first dependence arises from the Thomson scattering cross section given by

$${\sigma _{\rm{T}}} = {{8\pi} \over 3}{{{\hbar ^2}} \over {m_{\rm{e}}^2{c^2}}}\alpha _{{\rm{EM}}}^2$$
(112)

and the scattering by free protons can be neglected since me/mp ∼ 5 × 10−4.
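As a quick numerical check of Equation (112) and of its scaling with the constants, the following sketch evaluates the Thomson cross section in SI units. The constant values are CODATA 2018 recommended values; the function name and the 1% variation used in the example are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
C = 299792458.0             # speed of light, m/s
M_E = 9.1093837015e-31      # electron mass, kg
ALPHA_EM = 7.2973525693e-3  # fine-structure constant

def thomson_cross_section(alpha=ALPHA_EM, m_e=M_E):
    """sigma_T = (8 pi / 3) hbar^2 alpha^2 / (m_e^2 c^2), in m^2."""
    return (8.0 * math.pi / 3.0) * HBAR ** 2 * alpha ** 2 / (m_e ** 2 * C ** 2)

sigma_0 = thomson_cross_section()
print(f"sigma_T = {sigma_0:.4e} m^2")

# Effect of a hypothetical 1% increase of alpha_EM or of m_e.
print(thomson_cross_section(alpha=1.01 * ALPHA_EM) / sigma_0)  # scales as alpha^2
print(thomson_cross_section(m_e=1.01 * M_E) / sigma_0)         # scales as m_e^-2
```

Replacing the electron mass by the proton mass reduces the cross section by the factor \((m_{\rm e}/m_{\rm p})^2\), which is why scattering by free protons is negligible.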

The second, and more subtle, dependence comes from the ionization fraction. Recombination proceeds via 2-photon emission from the 2s level or via the Ly-α photons, which are redshifted out of the resonance line [405], because recombination to the ground state can be neglected since it leads to immediate re-ionization of another hydrogen atom by the emission of a Ly-α photon. Following [405, 338] and taking into account, for the sake of simplicity, only the recombination of hydrogen, the equation of evolution of the ionization fraction takes the form

$${{{\rm{d}}{x_{\rm{e}}}} \over {{\rm{d}}t}} = {\mathcal C}\left[ {\beta (1 - {x_{\rm{e}}})\exp \left({- {{{B_1} - {B_2}} \over {{k_{\rm{B}}}{T_M}}}} \right) - {\mathcal R}{n_{\rm{p}}}x_{\rm{e}}^2} \right],$$
(113)

where \(T_M\) is the matter temperature. At high redshift, \(T_M\) is identical to that of the photons, \(T_\gamma = T_0(1 + z)\), but it evolves according to

$${{{\rm{d}}{T_M}} \over {{\rm{d}}t}} = - {{8{\sigma _{\rm{T}}}{a_R}} \over {3{m_{\rm{e}}}}}T_R^4{{{x_{\rm{e}}}} \over {1 + {x_{\rm{e}}}}}({T_M} - {T_\gamma}) - 2H{T_M}$$
(114)

where the radiation constant \(a_R = 4{\sigma _{{\rm{SB}}}}/c\) with \({\sigma _{{\rm{SB}}}} = {\pi ^2}k_{\rm{B}}^4/(60{c^2}{\hbar ^3})\) the Stefan-Boltzmann constant. In Equation (113), \(B_n = -E_I/n^2\) is the energy of the nth hydrogen atomic level, β is the ionization coefficient, \({\mathcal R}\) the recombination coefficient, \({\mathcal C}\) the correction constant due to the redshift of Ly-α photons and to 2-photon decay and \(n_{\rm p} = n_{\rm e}\) is the number density of protons. β is related to \({\mathcal R}\) by the principle of detailed balance so that

$$\beta = {\mathcal R}{\left({{{2\pi {m_{\rm{e}}}{k_{\rm{B}}}{T_M}} \over {{h^2}}}} \right)^{3/2}}\exp \left({- {{{B_2}} \over {{k_{\rm{B}}}{T_M}}}} \right).$$
(115)

The recombination rate to all other excited levels is

$${\mathcal R} = {{8\pi} \over {{c^2}}}{\left({{{{k_{\rm{B}}}T} \over {2\pi {m_{\rm{e}}}}}} \right)^{3/2}}\sum\limits_{n,l}^{\ast} {(2l + 1){{\rm{e}}^{{B_n}/{k_{\rm{B}}}T}}\int\nolimits_{{B_n}/{k_{\rm{B}}}T}^\infty {{\sigma _{nl}}{{{y^2}{\rm{d}}y} \over {{{\rm{e}}^y} - 1}}}}$$

where σ nl is the ionization cross section for the (n, l) excited level of hydrogen. The star indicates that the sum needs to be regularized and the αEM-, me-dependence of the ionization cross section is complicated to extract. However, it can be shown to behave as \({\sigma _{nl}} \propto \alpha _{{\rm{EM}}}^{- 1}m_{\rm{e}}^{- 2}f(h\nu/{B_1})\). Finally, the factor \({\mathcal C}\) is given by

$${\mathcal C} = {{1 + K{\Lambda _{2s}}(1 - {x_e})} \over {1 + K(\beta + {\Lambda _{2s}})(1 - {x_e})}}$$
(116)

where Λ2s is the rate of decay of the 2s excited level to the ground state via 2 photons; it scales as \({m_{\rm{e}}}\alpha _{{\rm{EM}}}^8\). The constant K is given in terms of the Ly-α photon wavelength \({\lambda _\alpha} = 16\pi \hbar/(3{m_{\rm{e}}}\alpha _{{\rm{EM}}}^2c)\) by \(K = {n_p}\lambda _\alpha ^3/(8\pi H)\) and scales as \(m_{\rm{e}}^{- 3}\alpha _{{\rm{EM}}}^{- 6}\).

In summary, both the temperature of the decoupling and the residual ionization after recombination are modified by a variation of αEM or me. This was first discussed in [36, 277]. The last scattering surface can roughly be determined by the maximum of the visibility function \(g = \dot \tau \exp (- \tau)\), which measures the differential probability for a photon to be scattered at a given redshift. Increasing αEM shifts g to a higher redshift at which the expansion rate is faster so that the temperature and \(x_{\rm e}\) decrease more rapidly, resulting in a narrower g. This induces a shift of the \(C_\ell\) spectrum to higher multipoles and an increase of the values of the \(C_\ell\). The first effect can be understood by the fact that pushing the last scattering surface to a higher redshift leads to a smaller sound horizon at decoupling. The second effect results from a smaller Silk damping.

Most studies have introduced those modifications in the RECFAST code [454], including similar equations for the recombination of helium. Our previous analysis shows that the dependences on the fundamental constants have various origins, since the binding energies \(B_i\) scale as \({m_{\rm{e}}}\alpha _{{\rm{EM}}}^2\), \(\sigma_{\rm T}\) as \(\alpha _{{\rm{EM}}}^2m_{\rm{e}}^{- 2}\), K as \(m_{\rm{e}}^{- 3}\alpha _{{\rm{EM}}}^{- 6}\), the ionisation coefficients β as \(\alpha _{{\rm{EM}}}^3\), the transition frequencies as \({m_{\rm{e}}}\alpha _{{\rm{EM}}}^2\), the Einstein coefficients as \({m_{\rm{e}}}\alpha _{{\rm{EM}}}^5\), the decay rates Λ as \({m_{\rm{e}}}\alpha _{{\rm{EM}}}^8\), and \({\mathcal R}\) has a complicated dependence, which roughly reduces to \(\alpha _{{\rm{EM}}}^{- 1}m_{\rm{e}}^{- 2}\). Note that a change in the fine-structure constant and in the mass of the electron are degenerate according to ΔαEM ≈ 0.39Δme but this degeneracy is broken for multipoles higher than 1500 [36]. In earlier works [244, 277] it was approximated by the scaling \({\mathcal R} \propto \alpha _{{\rm{EM}}}^{2(1 + \xi)}\) with ξ ∼ 0.7.
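The power-law scalings listed above can be collected in a small table to see how each recombination ingredient responds to a given variation of the constants. The sketch below only encodes the exponents quoted in the text; the dictionary keys, the function name and the 1% variations are hypothetical, and the approximate scaling of \({\mathcal R}\) is the rough one given above.

```python
# Exponents (p_me, p_alpha) such that X scales as m_e**p_me * alpha_EM**p_alpha.
SCALINGS = {
    "binding_energy_B": (1, 2),
    "thomson_sigma_T": (-2, 2),
    "K": (-3, -6),
    "ionisation_beta": (0, 3),
    "transition_frequency": (1, 2),
    "einstein_coefficient": (1, 5),
    "two_photon_rate_Lambda": (1, 8),
    "recombination_R_approx": (-2, -1),
}

def fractional_change(name, dalpha=0.0, dme=0.0):
    """Return Delta X / X for relative variations dalpha and dme of alpha_EM and m_e."""
    p_me, p_alpha = SCALINGS[name]
    return (1.0 + dme) ** p_me * (1.0 + dalpha) ** p_alpha - 1.0

# Hypothetical 1% variation of alpha_EM at fixed m_e.
for name in SCALINGS:
    print(f"{name:24s} {fractional_change(name, dalpha=0.01):+.4f}")
```

The steep dependence of the 2s decay rate and of K on αEM shows why the ionization history, and not only the Thomson cross section, must be recomputed when the constants are varied.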

The first studies [244, 277] focused on the sensitivity that can be reached by WMAPFootnote 7 and PlanckFootnote 8. They concluded that they should provide a constraint on αEM at recombination, i.e., at a redshift of about z ∼ 1,000, with a typical precision ∣ΔαEM/αEM∣ ∼ 10−2–10−3.

The first attempt [21] to actually set a constraint was performed on the first release of the data by BOOMERanG and MAXIMA. It concluded that a value of αEM smaller by a few percent in the past was favored but no definite bound was obtained, mainly due to the degeneracies with other cosmological parameters. It was later improved [22] by a joint analysis of BBN and CMB data that assumes that only αEM varies and that included 4 cosmological parameters (Ωmat, Ωb, h, \(n_s\)) assuming a universe with Euclidean spatial sections, leading to −0.09 < ΔαEM < 0.02 at 68% confidence level. A similar analysis [307], describing the dependence on a variation of the fine-structure constant as an effect on recombination, the redshift of which was modeled to scale as z* = 1080[1 + 2ΔαEM/αEM], set the constraint −0.14 < ΔαEM < 0.02, at a 2σ level, assuming a spatially flat cosmological model with adiabatic primordial fluctuations. The effect of re-ionisation was discussed in [350]. These works assume that only αEM is varying, which, as can be seen from Eqs. (110)–(116), amounts to assuming that the electron mass is constant.

With the WMAP first year data, the bound on the variation of αEM was sharpened [438] to −0.05 < ΔαEM/αEM < 0.02, after marginalizing over the remaining cosmological parameters (Ωmath2, Ωbh2, Ωh2, n s , α s , τ) assuming a universe with Euclidean spatial sections. Restricting to a model with a vanishing running of the spectral index (α s ≡ dn s /dlnk = 0), it gives −0.06 < ΔαEM/αEM < 0.01, at a 95% confidence level. In particular it shows that a lower value of αEM makes α s = 0 more compatible with the data. These bounds were obtained without using other cosmological data sets. This constraint was confirmed by the analysis of [259], which got −0.097 < ΔαEM/αEM < 0.034, with the WMAP-1yr data alone and −0.042 < ΔαEM/αEM < 0.026, at a 95% confidence level, when combined with constraints on the Hubble parameter from the HST Hubble Key project.

The analysis of the WMAP-3yr data improved [476] this bound to −0.039 < ΔαEM/αEM < 0.010, at a 95% confidence level, assuming (Ωmat, Ωb, h, n s , zre, A s ) for the cosmological parameters (ΩΛ being derived from the assumption Ω K = 0, as well as τ from the re-ionisation redshift, zre) and using both temperature and polarization data (TT, TE, EE).

The WMAP 5-year data were analyzed, in combination with the 2dF galaxy redshift survey, assuming that both αEM and me can vary and that the universe was spatially Euclidean. Letting 6 cosmological parameters [(Ωmath2, Ωbh2, Θ, τ, n s , A s ), Θ being the ratio between the sound horizon and the angular distance at decoupling] and 2 constants vary, it was concluded [452, 453] that −0.012 < ΔαEM/αEM < 0.018 and −0.068 < Δme/me < 0.044, the bounds fluctuating slightly depending on the choice of the recombination scenario. A similar analysis [381] not including me gave −0.050 < ΔαEM/αEM < 0.042, which can be reduced by taking into account some further prior from the HST data. Including polarisation data from ACBAR, QUAD and BICEP, [352] obtained −0.043 < ΔαEM/αEM < 0.038 at 95% C.L. and −0.013 < ΔαEM/αEM < 0.015 including HST data, also at 95% C.L. Let us also emphasize the work by [351] trying to include the variation of the Newton constant by assuming that ΔαEM/αEM = QΔG/G, Q being a constant, and the investigation of [380] taking into account αEM, me and μ, G being kept fixed. Considering (Ωmat, Ωb, h, n s , τ) for the cosmological parameters they concluded from WMAP-5 data (TT, TE, EE) that −8.28 × 10−3 < ΔαEM/αEM < 1.81 × 10−3 and −0.52 < Δμ/μ < 0.17.

The analysis of [452, 453] was updated [310] to the WMAP-7yr data, including polarisation and SDSS data. It leads to −0.025 < ΔαEM/αEM < −0.003 and 0.009 < Δme/me < 0.079 at a 1σ level.

The main limitation of these analyses lies in the fact that the CMB angular power spectrum depends on the evolution of both the background spacetime and the cosmological perturbations. It follows that it depends on the whole set of cosmological parameters as well as on initial conditions, that is on the shape of the initial power spectrum, so that the results will always be conditional on the model of structure formation. The constraints on αEM or me can then be seen mostly as constraints on a delayed recombination. A strong constraint on the variation of αEM can be obtained from the CMB only if the cosmological parameters are independently known. [438] forecasts that CMB alone can determine αEM to a maximum accuracy of 0.1%.

21 cm

After recombination, the CMB photons are redshifted and their temperature drops as (1 + z). However, the baryons are prevented from cooling adiabatically since the residual amount of free electrons, that can couple the gas to the radiation through Compton scattering, is too small. It follows that the matter decouples thermally from the radiation at a redshift of order z ∼ 200.

The intergalactic hydrogen atoms after recombination are in their ground state, which the hyperfine structure splits into a singlet and a triplet state (1s1/2 with F = 0 and F = 1 respectively, see Section III.B.1 of FVC [500]). It was recently proposed [284] that the observation of the 21 cm emission can provide a test on the fundamental constants. We refer to [221] for a detailed review on 21 cm.

Table 11 Summary of the latest constraints on the variation of fundamental constants obtained from the analysis of cosmological data and more particularly of CMB data. All assume Ω K = 0.

The fraction of atoms in the excited (triplet) state versus the ground (singlet) state is conventionally related by the spin temperature Ts defined by the relation

$${{{n_t}} \over {{n_s}}} = 3\exp \left({- {{{T_\ast}} \over {{T_{\rm{s}}}}}} \right)$$
(117)

where T* ≡ hc/(λ21kB) = 68.2 mK is the temperature corresponding to the 21 cm transition and the factor 3 accounts for the degeneracy of the triplet state (note that this is a very simplified description since the assumption of a unique spin temperature is probably not correct [221]). The population of the two states is determined by two processes, the radiative interaction with CMB photons with a wavelength of λ21 = 21.1 cm (i.e., ν21 = 1420 MHz) and spin-changing atomic collisions. Thus, the evolution of the spin temperature is dictated by [221]

$${{{\rm{d}}{T_{\rm{s}}}} \over {{\rm{d}}t}} = 4{C_{10}}\left({{1 \over {{T_{\rm{s}}}}} - {1 \over {{T_{\rm{g}}}}}} \right)T_{\rm{s}}^2 + (1 + z)H{A_{10}}\left({{1 \over {{T_{\rm{s}}}}} - {1 \over {{T_\gamma}}}} \right){{{T_\gamma}} \over {{T_\ast}}}$$
(118)

The first term corresponds to the collision de-excitation rate from triplet to singlet and the coefficient C10 is decomposed as

$${C_{10}} = \kappa_{10}^{HH}{n_p} + \kappa_{10}^{eH}{x_{\rm{e}}}{n_p}$$

with the respective contribution of H-H and e-H collisions. The second term corresponds to spontaneous transition and A10 is the Einstein coefficient. The equation of evolution for the gas temperature Tg is given by Equation (114) with T M = Tg (we recall that we have neglected the contribution of helium) and the electronic density satisfies Equation (113).
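As a quick numerical check of Eq. (117) and of the quoted value of T*, the following minimal sketch evaluates T* = hc/(λ21 kB) and the triplet-to-singlet ratio. The function names and the sample spin temperatures are illustrative choices made here, not taken from the references.

```python
import math

# Physical constants in SI units
H_PLANCK = 6.62607015e-34   # J s
C_LIGHT = 2.99792458e8      # m / s
K_BOLTZ = 1.380649e-23      # J / K

LAMBDA_21 = 0.211           # m, rest wavelength quoted in the text (21.1 cm)


def t_star(wavelength_m=LAMBDA_21):
    """Temperature equivalent of the transition, T* = h c / (lambda k_B), in kelvin."""
    return H_PLANCK * C_LIGHT / (wavelength_m * K_BOLTZ)


def triplet_to_singlet_ratio(t_spin_kelvin, wavelength_m=LAMBDA_21):
    """Population ratio n_t / n_s = 3 exp(-T* / T_s) of Eq. (117)."""
    return 3.0 * math.exp(-t_star(wavelength_m) / t_spin_kelvin)


if __name__ == "__main__":
    print(f"T* = {1e3 * t_star():.1f} mK")
    # Illustrative spin temperatures in kelvin (assumed values, not from the text)
    for t_s in (1.0, 10.0, 100.0):
        print(t_s, triplet_to_singlet_ratio(t_s))
```

Since T* is only a few tens of millikelvin, the ratio stays very close to the statistical value 3 for any spin temperature of a few kelvin or more, which is why small departures of Ts from the CMB temperature carry the signal.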

It follows [284, 285] that the change in the brightness temperature of the CMB at the corresponding wavelength scales as \({T_{\rm{b}}} \propto {A_{12}}/\nu _{21}^2\), where the Einstein coefficient A12 is defined below. Observationally, we can deduce the brightness temperature from the brightness I ν , that is the energy received in a given direction per unit area, solid angle and time; the brightness temperature is defined as the temperature of the black-body radiation with spectrum I ν . Thus, \({k_{\rm{B}}}{T_{\rm{b}}} = {I_\nu}{c^2}/(2{\nu ^2})\). It has a mean value, \({{\bar T}_{\rm{b}}}({z_{{\rm{obs}}}})\), at various redshifts, where \(1 + {z_{{\rm{obs}}}} = \nu _{21}^{{\rm{today}}}/{\nu _{{\rm{obs}}}}\). Besides, as for the CMB, there will also be fluctuations in Tb due to imprints of the cosmological perturbations on n p and Tg. It follows that we also have access to an angular power spectrum C ℓ (zobs) at various redshifts (see [329] for details on this computation).
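The two observational relations above can be sketched numerically. The snippet assumes the Rayleigh-Jeans form kB Tb = I ν c²/(2ν²) for the brightness temperature and uses the rest frequency of 1420 MHz quoted earlier; the function names are hypothetical.

```python
C_LIGHT = 2.99792458e8   # m / s
K_BOLTZ = 1.380649e-23   # J / K
NU_21_MHZ = 1420.0       # rest-frame frequency of the 21 cm line, as quoted in the text


def observed_frequency_mhz(z_obs):
    """Observed frequency from 1 + z_obs = nu_21 / nu_obs."""
    return NU_21_MHZ / (1.0 + z_obs)


def redshift_from_frequency(nu_obs_mhz):
    """Inverse relation: redshift probed by an observation at nu_obs."""
    return NU_21_MHZ / nu_obs_mhz - 1.0


def brightness_temperature(i_nu, nu_hz):
    """Rayleigh-Jeans brightness temperature in K for a specific intensity
    i_nu given in W m^-2 Hz^-1 sr^-1 at frequency nu_hz (in Hz)."""
    return i_nu * C_LIGHT ** 2 / (2.0 * nu_hz ** 2 * K_BOLTZ)


if __name__ == "__main__":
    for z in (20, 30, 100, 200):
        print(z, observed_frequency_mhz(z), "MHz")
```

The redshift range discussed in this section therefore maps onto observing frequencies of order ten to several tens of MHz, which is why low-frequency arrays are the relevant instruments.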

Both quantities depend on the value of the fundamental constants. Besides the same dependencies as for the CMB, which arise from the Thomson scattering cross section, we have to consider those arising from the collision terms. In natural units, the Einstein coefficient is given by \({A_{12}} = {2 \over 3}\pi {\alpha _{{\rm{EM}}}}\nu _{21}^3m_{\rm{e}}^{- 2} \sim 2.869 \times {10^{- 15}}{{\rm{s}}^{- 1}}\). It follows that it scales as \({A_{10}} \propto g_{\rm{P}}^3{\mu ^3}\alpha _{{\rm{EM}}}^{13}{m_{\rm{e}}}\). The brightness temperature depends on the fundamental constants as \({T_{\rm{b}}} \propto {g_{\rm{P}}}\mu \alpha _{{\rm{EM}}}^5/{m_{\rm{e}}}\). Note that the signal can also be affected by a time variation of the gravitational constant through the expansion history of the universe. [284] (see also [221] for further discussions), focusing only on αEM, showed that this was the dominant effect of a variation of the fundamental constants (the effect on C10 is much more complicated to determine but was argued to be much smaller). It was estimated that a single station telescope like LWAFootnote 9 or LOFARFootnote 10 can lead to a constraint of the order of ΔαEM/αEM ∼ 0.85%, improving to 0.3% for the full LWA. The fundamental challenge for such a measurement is the subtraction of the foreground.
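The scalings quoted in this paragraph can be checked against one another. The short derivation below assumes that the hyperfine frequency scales as \({\nu _{21}} \propto {g_{\rm{P}}}\mu \alpha _{{\rm{EM}}}^4{m_{\rm{e}}}\); this scaling is not written out in the text and is adopted here only because it reproduces the quoted exponents.

$${A_{12}} \propto {\alpha _{{\rm{EM}}}}\nu _{21}^3m_{\rm{e}}^{- 2} \propto {\alpha _{{\rm{EM}}}}{\left({{g_{\rm{P}}}\mu \alpha _{{\rm{EM}}}^4{m_{\rm{e}}}} \right)^3}m_{\rm{e}}^{- 2} = g_{\rm{P}}^3{\mu ^3}\alpha _{{\rm{EM}}}^{13}{m_{\rm{e}}},$$

$${T_{\rm{b}}} \propto {{{A_{12}}} \over {\nu _{21}^2}} \propto {\alpha _{{\rm{EM}}}}{\nu _{21}}m_{\rm{e}}^{- 2} \propto {{{g_{\rm{P}}}\mu \alpha _{{\rm{EM}}}^5} \over {{m_{\rm{e}}}}}.$$

Under this assumption, and with gP, μ and me held fixed, a small variation gives \(\Delta {T_{\rm{b}}}/{T_{\rm{b}}} \simeq 5\,\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}}\).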

The 21 cm absorption signal is available over a band of redshift typically ranging from z ≲ 1000 to z ∼ 20, which is between the CMB observation and the formation of the first stars, that is during the “dark age”. Thus, it offers an interesting possibility to trace the constraints on the evolution of the fundamental constants between the CMB epoch and the quasar absorption spectra.

As for CMB, the knowledge of the cosmological parameters is a limitation since a change of 1% in the baryon density or the Hubble parameter implies a 2% (3% respectively) change in the mean bolometric temperature. The effect on the angular power spectrum has been estimated but still requires an in-depth analysis along the lines of, e.g., [329]. It is motivating since C ℓ (zobs) is expected to depend on the correlators of the fundamental constants, e.g., 〈αEM(x, zobs)αEM(x′, zobs)〉 and thus in principle allows one to study their fluctuations, even though it will also depend on the initial condition, e.g., power spectrum, of the cosmological perturbations.

In conclusion, the 21 cm observation opens an observational window on the fundamental constants at redshifts ranging typically from 30 to 100, but a full in-depth analysis is still required (see [206, 286] for a critical discussion of this probe).

Big bang nucleosynthesis

Overview

The amount of 4He produced during the big bang nucleosynthesis is mainly determined by the neutron to proton ratio at the freeze-out of the weak interactions that interconvert neutrons and protons. The result of Big Bang nucleosynthesis (BBN) thus depends on G, αW, αEM and αS respectively through the expansion rate, the neutron to proton ratio, the neutron-proton mass difference and the nuclear reaction rates, besides the standard parameters such as, e.g., the number of neutrino families.

The standard BBN scenario [117, 409] proceeds in three main steps:

  1.

    for T > 1 MeV (t < 1 s), a first stage during which the neutrons, protons, electrons, positrons and neutrinos are kept in statistical equilibrium by the (rapid) weak interaction

    $$n \leftrightarrow p + {e^ -} + {\bar \nu _e},\quad n + {\nu _e} \leftrightarrow p + {e^ -},\quad n + {e^ +} \leftrightarrow p + {\bar \nu _e}.$$
    (119)

    As long as statistical equilibrium holds, the neutron to proton ratio is

    $$(n/p) = {{\rm{e}}^{- {Q_{{\rm{np}}}}/{k_{\rm{B}}}T}}$$
    (120)

    where Qnp ≡ (mn − mp)c2 = 1.29 MeV. The abundance of the other light elements is given by [409]

    $${Y_A} = {g_A}{\left({{{\zeta (3)} \over {\sqrt \pi}}} \right)^{A - 1}}{2^{(3A - 5)/2}}{A^{5/2}}{\left[ {{{{k_{\rm{B}}}T} \over {{m_{\rm{N}}}{c^2}}}} \right]^{3(A - 1)/2}}{\eta ^{A - 1}}Y_{\rm{p}}^ZY_{\rm{n}}^{A - Z}{{\rm{e}}^{{B_A}/{k_{\rm{B}}}T}},$$
    (121)

    where g A is the number of degrees of freedom of the nucleus \(_Z^A{\rm{X}}\), mN is the nucleon mass, η the baryon-photon ratio and B A ≡ (Zmp + (A − Z)mn − m A )c2 the binding energy.

  2.

    Around T ∼ 0.8 MeV (t ∼ 2 s), the weak interactions freeze out at a temperature Tf determined by the competition between the weak interaction rates and the expansion rate of the universe and thus roughly determined by Γw(Tf) ∼ H(Tf) that is

    $$G_{\rm{F}}^2{({k_{\rm{B}}}{T_{\rm{f}}})^5} \sim \sqrt {G{N_\ast}} {({k_{\rm{B}}}{T_{\rm{f}}})^2}$$
    (122)

    where GF is the Fermi constant and N* the number of relativistic degrees of freedom at Tf. Below Tf, the number of neutrons and protons changes only through the neutron β-decay between Tf and TN ∼ 0.1 MeV, when p + n reactions proceed faster than their inverse dissociation.

  3.

    For 0.05 MeV < T < 0.6 MeV (3 s < t < 6 min), the synthesis of light elements occurs only by two-body reactions. This requires the deuteron to be synthesized (p + n → D) and the photon density must be low enough for the photo-dissociation to be negligible. This happens roughly when

    $${{{n_{\rm{d}}}} \over {{n_\gamma}}} \sim {\eta ^2}\exp ({B_D}/{T_{\rm{N}}}) \sim 1$$
    (123)

    with η ∼ 3 × 10−10. The abundance of 4He by mass, Yp, is then well estimated by

    $${Y_{\rm{p}}} \simeq 2{{{{(n/p)}_{\rm{N}}}} \over {1 + {{(n/p)}_{\rm{N}}}}}$$
    (124)

    with

    $${(n/p)_{\rm{N}}} = {(n/p)_{\rm{f}}}\exp (- {t_{\rm{N}}}/{\tau _{\rm{n}}})$$
    (125)

    with \({t_{\rm{N}}} \propto {G^{- 1/2}}T_{\rm{N}}^{- 2}\) and \(\tau _{\rm{n}}^{- 1} = 1.636\,G_{\rm{F}}^2(1 + 3g_A^2)m_{\rm{e}}^5/(2{\pi ^3})\), with g A ≃ 1.26 being the axial/vector coupling of the nucleon. Assuming that \({B_D} \propto \alpha _{\rm{S}}^2\), this gives a dependence \({t_{\rm{N}}}/{\tau _{\rm{n}}} \propto {G^{- 1/2}}\alpha _{\rm{S}}^2G_{\rm{F}}^2\).

  4.

    The abundances of the light elements, Y i , are then obtained by solving a series of nuclear reactions

    $${\dot Y_i} = J - \Gamma {Y_i},$$

    where J and Γ are time-dependent source and sink terms.
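The chain of estimates in Eqs. (120), (122), (124) and (125) can be put into numbers with a minimal sketch. The freeze-out temperature and neutron lifetime are the values quoted in this section, while the time t_N at which nucleosynthesis starts is a hypothetical input chosen here for illustration; this is a rough estimate and not a substitute for a BBN code.

```python
import math

Q_NP_MEV = 1.29    # neutron-proton mass difference quoted in the text
TAU_N_S = 886.7    # neutron lifetime in seconds, value quoted in this section


def np_ratio_equilibrium(t_mev):
    """Equilibrium neutron-to-proton ratio, Eq. (120), at temperature k_B T in MeV."""
    return math.exp(-Q_NP_MEV / t_mev)


def np_ratio_at_nucleosynthesis(t_freeze_mev, t_n_seconds, tau_n=TAU_N_S):
    """Eq. (125): the frozen-out ratio depleted by free neutron decay until t_N."""
    return np_ratio_equilibrium(t_freeze_mev) * math.exp(-t_n_seconds / tau_n)


def helium_mass_fraction(np_ratio):
    """Eq. (124): Y_p = 2 (n/p) / (1 + (n/p))."""
    return 2.0 * np_ratio / (1.0 + np_ratio)


def delta_ln_t_freeze(dln_g=0.0, dln_gf=0.0, dln_nstar=0.0):
    """Linear response of T_f from Eq. (122), T_f^3 ~ sqrt(G N*) / G_F^2."""
    return dln_g / 6.0 + dln_nstar / 6.0 - 2.0 * dln_gf / 3.0


if __name__ == "__main__":
    # T_f = 0.8 MeV is the value from the text; t_N = 180 s is an assumed input.
    ratio = np_ratio_at_nucleosynthesis(0.8, 180.0)
    print("(n/p)_N =", ratio, " Y_p =", helium_mass_fraction(ratio))
```

With these rounded inputs the estimate lands somewhat above the observed helium-4 mass fraction, which is expected from such a crude treatment; its use is to show how Qnp, Tf, tN and τn enter the result.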

From an observational point of view, the light elements abundances can be computed as a function of η and compared to their observed abundances. Figure 6 summarizes the observational constraints obtained on helium-4, helium-3, deuterium and lithium-7. On the other hand, η can be determined independently from the analysis of the cosmic microwave background anisotropies and the WMAP data [296] have led to the conclusion that

$$\eta = {\eta _{{\rm{WMAP}}}} = {(}6.19 \pm 0.15{)} \times {10^{- 10}}.$$

This number being fixed, all abundances can be computed. At present, there exists a discrepancy between the predicted abundance of lithium-7 based on the WMAP results [108, 107] for η, 7Li/H = (5.14 ± 0.50) × 10−10 and its values measured in metal-poor halo stars in our galaxy [63], 7Li/H = (1.26±0.26) × 10−10, which is a factor of three lower, at least [116] (see also [469]), than the predicted value. No solution to this Lithium-7 problem is known. A back-of-the-envelope estimate shows that we can mimic a lower η parameter, just by modifying the deuterium binding energy, leaving T N unchanged, since from Equation (123), one just needs ΔB D /TN ∼ − ln 9 so that the effective η parameter, assuming no variation of constant, is three times smaller than ηWMAP. This rough rule of thumb explains why the solution of the lithium-7 problem may lie in a possible variation of the fundamental constants (see below for details).

Figure 6

(Left): variation of the light element abundances as a function of η compared to the spectroscopic abundances. The vertical line depicts the constraint obtained on η from the study of the cosmic microwave background data. The lithium-7 problem lies in the fact that ηspectro < ηwmap. From [107]. (Right): dependence of the light element abundances on the independent variation of the BBN parameters, assuming η = ηWMAP. From [105].

Constants everywhere…

In complete generality, the effect of varying constants on the BBN predictions is difficult to model because of the intricate structure of QCD and its role in low energy nuclear reactions. Thus, a solution is to proceed in two steps, first by determining the dependencies of the light element abundances on the BBN parameters and then by relating those parameters to the fundamental constants.

The analysis of the previous Section 3.8.1, that was restricted to the helium-4 case, clearly shows that the abundances will depend on: (1) αG, which will affect the Hubble expansion rate at the time of nucleosynthesis in the same way as extra-relativistic degrees of freedom do, so that it modifies the freeze-out temperature Tf. This is the only gravitational sector parameter. (2) τn, the neutron lifetime dictates the free neutron decay and appears in the normalization of the proton-neutron reaction rates. It is the only weak interaction parameter and it is related to the Fermi constant GF, or equivalently the Higgs vev. (3) αEM, the fine-structure constant. It enters in the Coulomb barriers of the reaction rates through the Gamow factor and in all the binding energies. (4) Qnp, the neutron-proton mass difference enters in the neutron-proton ratio and we also have a dependence in (5) mN and me and (6) the binding energies.

Clearly all these parameters are not independent but their relation is often model-dependent. If we focus on helium-4, its abundance mainly depends on Qnp, Tf and TN (and hence mainly on the neutron lifetime, τn). Early studies (see Section III.C.2 of FVC [500]) generally focused on one of these parameters. For instance, Kolb et al. [295] calculated the dependence of primordial 4He on G, GF and Qnp to deduce that the helium-4 abundance was mostly sensitive to the change in Qnp and that other abundances were less sensitive to the value of Qnp, mainly because 4He has a larger binding energy; its abundance is less sensitive to the weak reaction rate and more to the parameters fixing the value of (n/p). To extract the constraint on the fine-structure constant, they decomposed Qnp as Qnp = αEMQ α + βQ β where the first term represents the electromagnetic contribution and the second part corresponds to all non-electromagnetic contributions. Assuming that Q α and Q β are constant and that the electromagnetic contribution is the dominant part of Q, they deduced that ∣ΔαEM/αEM∣ < 10−2. Campbell and Olive [77] kept track of the changes in Tf and Qnp separately and deduced that \({{\Delta {Y_{\rm{P}}}} \over {{Y_{\rm{P}}}}} \simeq {{\Delta {T_{\rm{f}}}} \over {{T_{\rm{f}}}}} - {{\Delta {Q_{{\rm{np}}}}} \over {{Q_{{\rm{np}}}}}}\) while more recently the analysis [308] focused on αEM and v.

Let us now see how the effects of all these parameters are accounted for in BBN codes.

Bergström et al. [51] started to focus on the αEM-dependence of the thermonuclear rates (see also Ref. [260]). In the non-relativistic limit, the rate is obtained as the thermal average of the product of the cross section, the relative velocity and the number densities. Charged particles must tunnel through a Coulomb barrier to react. Changing αEM modifies these barriers and thus the reaction rates. Separating the Coulomb part, the low-energy cross section can be written as

$$\sigma (E) = {{S(E)} \over E}{{\rm{e}}^{- 2\pi \eta (E)}}$$
(126)

where η(E) arises from the Coulomb barrier and is given in terms of the charges and the reduced mass M r of the two interacting particles as

$$\eta (E) = {\alpha _{{\rm{EM}}}}{Z_1}{Z_2}\sqrt {{{{M_r}{c^2}} \over {2E}}}.$$
(127)

The form factor S(E) has to be extrapolated from experimental nuclear data but its αEM-dependence as well as the one of the reduced mass were neglected. Keeping all other constants fixed, assuming no exotic effects and taking a lifetime of 886.7 s for the neutron, it was deduced that ∣ΔαEM/αEM∣ < 2 × 10−2. This analysis was then extended [385] to take into account the αEM-dependence of the form factor to conclude that

$$\sigma (E) = {{2\pi \eta (E)} \over {{{\rm{e}}^{2\pi \eta (E)}} - 1}} \simeq 2\pi {\alpha _{{\rm{EM}}}}{Z_1}{Z_2}\sqrt {{{{M_r}{c^2}} \over {2E}}} \,{{\rm{e}}^{- 2\pi \eta (E)}}.$$
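To illustrate how strongly the Coulomb barrier responds to αEM, the sketch below evaluates η(E) from Eq. (127) and the suppression factor exp(−2πη) of Eq. (126), keeping S(E) and the reduced mass fixed as in the first analysis described above. The function names and the sample inputs (two singly charged nuclei, a reduced mass of 625 MeV, E = 0.1 MeV) are assumptions made for illustration only.

```python
import math

ALPHA_EM = 1.0 / 137.035999  # present-day fine-structure constant


def sommerfeld_parameter(energy_mev, z1, z2, reduced_mass_mev, alpha=ALPHA_EM):
    """eta(E) = alpha Z1 Z2 sqrt(M_r c^2 / (2E)), as in Eq. (127)."""
    return alpha * z1 * z2 * math.sqrt(reduced_mass_mev / (2.0 * energy_mev))


def penetration_factor(energy_mev, z1, z2, reduced_mass_mev, alpha=ALPHA_EM):
    """Coulomb suppression exp(-2 pi eta) entering Eq. (126)."""
    eta = sommerfeld_parameter(energy_mev, z1, z2, reduced_mass_mev, alpha)
    return math.exp(-2.0 * math.pi * eta)


def cross_section_ratio(dalpha_over_alpha, energy_mev, z1, z2, reduced_mass_mev):
    """sigma(alpha (1 + delta)) / sigma(alpha) at fixed S(E) and fixed reduced mass."""
    shifted = penetration_factor(energy_mev, z1, z2, reduced_mass_mev,
                                 ALPHA_EM * (1.0 + dalpha_over_alpha))
    return shifted / penetration_factor(energy_mev, z1, z2, reduced_mass_mev)


if __name__ == "__main__":
    # Illustrative inputs only (not taken from the references).
    eta = sommerfeld_parameter(0.1, 1, 1, 625.0)
    print("2 pi eta =", 2.0 * math.pi * eta)
    print("sigma ratio for +1% in alpha:", cross_section_ratio(0.01, 0.1, 1, 1, 625.0))
```

At fixed S(E), the logarithm of the ratio is exactly −2πη ΔαEM/αEM, so the sensitivity grows with the charges of the reactants and towards low energies.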

Ref. [385] also took into account (1) the effect that when two charged particles are produced they must escape the Coulomb barrier. This effect is generally weak because the Q i -values (energy release) of the different reactions are generally larger than the Coulomb barrier, with the exception of two cases, 3He(n, p)3H and 7Be(n, p)7Li. The rate of these reactions must be multiplied by a factor (1 + α i ΔαEM/αEM). (2) The radiative captures (photon emitting processes) are proportional to αEM since it is the strength of the coupling of the photon and nuclear currents. All these rates need to be multiplied by (1 + ΔαEM/αEM). (3) The electromagnetic contribution to all masses was taken into account, which modifies the Q i -values as Q i → Q i + q i ΔαEM/αEM. For helium-4 abundance these effects are negligible since the main αEM-dependence arises from Qnp. Equipped with these modifications, it was concluded that \(\Delta {\alpha _{{\rm{EM}}}}/{\alpha _{{\rm{EM}}}} = - 0.007_{- 0.017}^{+ 0.010}\) using only deuterium and helium-4 since the lithium-7 problem was still present.

Then the focus fell on the deuterium binding energy, B D . Flambaum and Shuryak [207, 208, 158, 157] illustrated the sensitivity of the light element abundances to B D . Its value mainly sets the beginning of the nucleosynthesis, that is TN, since the temperature must be low enough for the photo-dissociation of the deuterium to be negligible (this is at the origin of the deuterium bottleneck). The importance of B D is easily understood from the fact that the equilibrium abundance of deuterium and the reaction rate p(n, γ)D depend exponentially on B D and from the fact that the deuterium is in a shallow bound state. Focusing on the TN-dependence, it was concluded [207] that ΔB D /B D < 0.075.

This shows that the situation is more complex and that one cannot reduce the analysis to a single varying parameter. Many studies then tried to determine the sensitivity to the variation of many independent parameters.

The sensitivity of the helium-4 abundance to the variation of 7 parameters was first investigated by Müller et al. [364] considering the dependence on the parameters {X i } ≡ {G, αEM, v, me, τn, Qnp, B D } independently,

$$\Delta \ln {Y_{{\rm{He}}}} = \sum\limits_i {c_i^{(X)}} \Delta \ln {X_i}$$

and assuming ΛQCD fixed (so that the seven parameters are in fact dimensionless quantities). The \(c_i^{(X)}\) are the sensitivities to the BBN parameters, assuming the six others are fixed. It was concluded that \({Y_{{\rm{He}}}} \propto \alpha _{{\rm{EM}}}^{- 0.043}{\upsilon ^{2.4}}m_{\rm{e}}^{0.024}\tau _{\rm{n}}^{0.24}Q_{{\rm{np}}}^{- 1.8}B_D^{0.53}{G^{0.405}}\) for independent variations. They further related (τn, Qnp, B D ) to (αEM, v, me, mN, md − mu), as we shall discuss in the next Section 3.8.3.
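The linear response defined above is straightforward to evaluate. The following minimal sketch encodes the exponents quoted from [364] and sums the contributions of independent fractional variations; the dictionary keys and the function name are naming choices made here.

```python
# Sensitivities c_i of ln Y_He to ln X_i, as quoted in the text from [364]
HELIUM_EXPONENTS = {
    "alpha_EM": -0.043,
    "v": 2.4,
    "m_e": 0.024,
    "tau_n": 0.24,
    "Q_np": -1.8,
    "B_D": 0.53,
    "G": 0.405,
}


def delta_ln_y_he(delta_ln_x):
    """Return Delta ln Y_He = sum_i c_i Delta ln X_i.

    `delta_ln_x` maps parameter names (keys of HELIUM_EXPONENTS) to their
    fractional variations; parameters that are not listed are held fixed.
    """
    return sum(HELIUM_EXPONENTS[name] * value for name, value in delta_ln_x.items())


if __name__ == "__main__":
    # A 1% increase of Q_np alone, then combined with a 1% increase of B_D.
    print(delta_ln_y_he({"Q_np": 0.01}))
    print(delta_ln_y_he({"Q_np": 0.01, "B_D": 0.01}))
```

Because the parameters are treated as independent, partial cancellations between terms are possible, which is one reason why relating them to fewer fundamental constants can tighten the constraints.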

This was generalized by Landau et al. [309] up to lithium-7 considering the parameters {αEM, GF, ΛQCD, Ω b h2}, assuming G constant, where the variation of τn and the variation of the masses were tied to these parameters but the effects on the binding energies were not considered.

Coc et al. [104] considered the effect of a variation of (Qnp, B D , τn, me) on the abundances of the light elements up to lithium-7, neglecting the effect of αEM on the. Their dependence on the independent variation of each of these parameters is depicted on Figure 6. It confirmed the result of [207, 394] that the deuterium binding energy is the most sensitive parameter. From the helium-4 data alone, the bounds

$$- 8.2 \times {10^{- 2}} \lesssim {{\Delta {\tau _{\rm{n}}}} \over {{\tau _{\rm{n}}}}} \lesssim 6 \times {10^{- 2}},\quad - 4 \times {10^{- 2}} \lesssim {{\Delta {Q_{{\rm{np}}}}} \over {{Q_{{\rm{np}}}}}} \lesssim 2.7 \times {10^{- 2}},$$
(128)

and

$$- 7.5 \times {10^{- 2}} \lesssim {{\Delta {B_D}} \over {{B_D}}} \lesssim 6.5 \times {10^{- 2}},$$
(129)

at a 2σ level, were set (assuming ηwmap). The deuterium data set the tighter constraint −4 × 10−2 ≲ Δln B D ≲ 3 × 10−2. Note also on Figure 6 that the lithium-7 abundance can be brought in concordance with the spectroscopic observations provided that B D was smaller during BBN

$$- 7.5 \times {10^{- 2}} \lesssim {{\Delta {B_D}} \over {{B_D}}} \lesssim - 4 \times {10^{- 2}},$$

so that B D may be the most important parameter to resolve the lithium-7 problem. The effect of the quark mass on the binding energies was described in [49]. They then concluded that a variation of Δmq/mq = 0.013 ± 0.002 allows one to reconcile the abundance of lithium-7 and the value of η deduced from WMAP.

This analysis was extended [146] to incorporate the effect of 13 independent BBN parameters including the parameters considered before plus the binding energies of deuterium, tritium, helium-3, helium-4, lithium-6, lithium-7 and beryllium-7. The sensitivity of the light element abundances to the independent variation of these parameters is summarized in Table I of [146]. These BBN parameters were then related to the same 6 “fundamental” parameters used in [364].

All these analyses demonstrate that the effects of the BBN parameters on the light element abundances are now under control. They have been implemented in BBN codes and most results agree with each other, as well as with semi-analytical estimates. As long as these parameters are assumed to vary independently, no constraints sharper than 10−2 can be set. One should also not forget to take into account standard parameters of the BBN computation such as η and the effective number of relativistic particles.

From BBN parameters to fundamental constants

To reduce the number of parameters, we need to relate the BBN parameters to more fundamental ones, keeping in mind that this can usually be done only in a model-dependent way. We shall describe some of the relations that have been used in many studies. They mainly concern Qnp, τn and B D .

At lowest order, all dimensional parameters of QCD, e.g., masses, nuclear energies etc., are to a good approximation simply proportional to some powers of ΛQCD. One needs to go beyond such a description and take the effects of the masses of the quarks into account.

Qnp can be expressed in terms of the masses of the quarks u and d and the fine-structure constant as

$${Q_{{\rm{np}}}} = a{\alpha _{{\rm{EM}}}}{\Lambda _{{\rm{QCD}}}} + ({m_{\rm{d}}} - {m_{\rm{u}}}),$$

where the electromagnetic contribution today is (aαEMΛQCD)0 = −0.76 MeV and therefore the quark mass contribution today is (md − mu) = 2.05 MeV [230] so that

$${{\Delta {Q_{{\rm{np}}}}} \over {{Q_{{\rm{np}}}}}} = - 0.59{{\Delta {\alpha _{{\rm{EM}}}}} \over {{\alpha _{{\rm{EM}}}}}} + 1.59{{\Delta ({m_{\rm{d}}} - {m_{\rm{u}}})} \over {({m_{\rm{d}}} - {m_{\rm{u}}})}}.$$
(130)

All the analyses cited above agree on this dependence.

The neutron lifetime can be well approximated by

$$\tau _{\rm{n}}^{- 1} = {{1 + 3g_A^2} \over {120\pi^{3}}}G_{\rm{F}}^2m_{\rm{e}}^5\left[ {\sqrt {{q^2} - 1} (2{q^4} - 9{q^2} - 8) + 15q\ln \left({q + \sqrt {{q^2} - 1}} \right)} \right],$$

with q ≡ Qnp/me and \({G_F} = 1/\sqrt 2 {\upsilon ^2}\). Using the former expression for Qnp we can express τn in terms of αEM, v and the u, d and electron masses. It follows

$${{\Delta {\tau _{\rm{n}}}} \over {{\tau _{\rm{n}}}}} = 3.86{{\Delta {\alpha _{{\rm{EM}}}}} \over {{\alpha _{{\rm{EM}}}}}} + 4{{\Delta v} \over v} + 1.52{{\Delta {m_{\rm{e}}}} \over {{m_{\rm{e}}}}} - 10.4{{\Delta ({m_{\rm{d}}} - {m_{\rm{u}}})} \over {({m_{\rm{d}}} - {m_{\rm{u}}})}}.$$
(131)

Again, all the analyses cited above agree on this dependence.
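The coefficients of Eqs. (130) and (131) can be cross-checked numerically. The sketch below assumes that the neutron decay rate is proportional to GF² me⁵ f(q), with f(q) = ∫₁^q ε(q − ε)² (ε² − 1)^{1/2} dε the usual dimensionless phase-space integral and GF ∝ v⁻², so that τn ∝ v⁴ me⁻⁵/f(Qnp/me). The integral is evaluated by quadrature rather than from a closed form; the function names and the precise input masses are choices made here.

```python
import math

M_E_MEV = 0.510999      # electron mass
Q_NP_MEV = 1.29333      # neutron-proton mass difference (rounded to 1.29 MeV in the text)
Q_EM_MEV = -0.76        # electromagnetic contribution to Q_np, from the text
Q_QUARK_MEV = 2.05      # (m_d - m_u) contribution to Q_np, from the text


def phase_space(q, steps=2000):
    """f(q) = int_1^q eps (q - eps)^2 sqrt(eps^2 - 1) d(eps).

    The substitution eps = cosh(u) makes the integrand smooth; the integral
    is then evaluated with the composite Simpson rule (steps must be even).
    """
    u_max = math.acosh(q)
    h = u_max / steps

    def integrand(u):
        c, s = math.cosh(u), math.sinh(u)
        return c * (q - c) ** 2 * s * s

    total = integrand(0.0) + integrand(u_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def log_slope(q, eps=1e-4):
    """d ln f / d ln q by a central finite difference."""
    up, down = phase_space(q * (1.0 + eps)), phase_space(q * (1.0 - eps))
    return (math.log(up) - math.log(down)) / (math.log(1.0 + eps) - math.log(1.0 - eps))


def lifetime_sensitivities():
    """Coefficients of Delta ln tau_n, assuming tau_n ~ v^4 m_e^-5 / f(Q_np / m_e)."""
    k = log_slope(Q_NP_MEV / M_E_MEV)
    q_total = Q_EM_MEV + Q_QUARK_MEV
    return {
        "alpha_EM": -k * Q_EM_MEV / q_total,
        "v": 4.0,
        "m_e": -5.0 + k,
        "m_d - m_u": -k * Q_QUARK_MEV / q_total,
    }


if __name__ == "__main__":
    print("f(q) =", phase_space(Q_NP_MEV / M_E_MEV))
    print(lifetime_sensitivities())
```

With these inputs f(q) comes out close to the factor 1.636 quoted earlier for the neutron lifetime, and the sensitivities land close to the coefficients of Eq. (131); small differences in the last digit depend on the rounding of the input masses.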

Let us now turn to the binding energies, and more particularly to B D that, as we have seen, is a crucial parameter. This is one of the best-known quantities in the nuclear domain and it is experimentally measured to a precision better than 10−6 [19]. Two approaches have been followed.

  • Pion mass. A first route is to use the dependence of the binding energy on the pion mass [188, 38], which is related to the u and d quark masses by

    $$m_\pi ^2 = {m_{\rm{q}}}\langle \bar uu + \bar dd\rangle f_\pi ^{- 2} \simeq {m_{\rm{q}}}{\Lambda _{{\rm{QCD}}}},$$

    where mq ≡ ½(mu + md) and assuming that the leading order of \(\left\langle {\bar uu + \bar dd} \right\rangle f_\pi ^{- 2}\) depends only on ΛQCD, f π being the pion decay constant. This dependence was parameterized [553] as

    $${{\Delta {B_D}} \over {{B_D}}} = - r{{\Delta {m_\pi}} \over {{m_\pi}}},$$

    where r is a fitting parameter found to be between 6 [188] and 10 [38]. Prior to this result, the analysis of [207] provides two computations of this dependence, which respectively lead to r = −3 and r = 18 while, following the same lines, [88] got r = 0.082.

    [364], following the computations of [426], adds an electromagnetic contribution −0.0081ΔαEM/αEM so that

    $${{\Delta {B_D}} \over {{B_D}}} = - {r \over 2}{{\Delta {m_{\rm{q}}}} \over {{m_{\rm{q}}}}} - 0.0081{{\Delta {\alpha _{{\rm{EM}}}}} \over {{\alpha _{{\rm{EM}}}}}},$$
    (132)

    but this latter contribution has not been included in other work.

  • Sigma model. In the framework of the Walecka model, where the potential for the nuclear forces keeps only the σ and ω meson exchanges,

    $$V = - {{g_s^2} \over {4\pi r}}\exp (- {m_\sigma}r) + {{g_v^2} \over {4\pi r}}\exp (- {m_\omega}r),$$

    where g s and g v are two coupling constants. Describing σ as a SU(3) singlet state, its mass was related to the mass of the strange quark. In this way one can hope to take into account the effect of the strange quark, both on the nucleon mass and the binding energy. In a second step B D is related to the meson and nucleon mass by

    $${{\Delta {B_D}} \over {{B_D}}} = - 48{{\Delta {m_\sigma}} \over {{m_\sigma}}} + 50{{\Delta {m_\omega}} \over {{m_\omega}}} + 6{{\Delta {m_{\rm{N}}}} \over {{m_{\rm{N}}}}}$$

    so that ΔB D /B D ≃ −17Δms/ms [208]. Unfortunately, a complete treatment of the dependence of all the nuclear quantities on ms has not been performed yet.
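For the pion-mass parametrisation of Eq. (132), the spread in the fitting parameter r translates directly into a spread in the inferred quark-mass variation. The sketch below is purely illustrative arithmetic: it takes an interval on ΔB D /B D , holds αEM fixed and inverts Eq. (132); the function names are hypothetical and the output is not a published constraint.

```python
def delta_ln_bd(dln_mq, dln_alpha=0.0, r=6.0):
    """Eq. (132): Delta B_D / B_D = -(r/2) Delta m_q/m_q - 0.0081 Delta alpha/alpha."""
    return -0.5 * r * dln_mq - 0.0081 * dln_alpha


def mq_interval(bd_low, bd_high, r):
    """Interval on Delta m_q / m_q implied by bd_low < Delta B_D / B_D < bd_high.

    Assumes alpha_EM fixed and r > 0; the minus sign in Eq. (132) swaps
    the roles of the lower and upper bounds.
    """
    return (-2.0 * bd_high / r, -2.0 * bd_low / r)


if __name__ == "__main__":
    # Deuterium-based interval on Delta ln B_D quoted earlier in this section.
    for r in (6.0, 10.0):
        print(r, mq_interval(-4e-2, 3e-2, r))
```

The comparison between r = 6 and r = 10 shows that the uncertainty on r feeds through almost linearly, which is why the nuclear modelling of B D dominates the error budget at this step.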

The case of the binding energies of the other elements has been less studied. [146] follows a route similar to that for B D , relates them to the pion mass and assumes that

$${{\partial {B_i}} \over {\partial {m_\pi}}} = {f_i}({A_i} - 1){{{B_D}} \over {{m_\pi}}}r \simeq - 0.13{f_i}({A_i} - 1),$$

where f i are unknown coefficients assumed to be of order unity and A i is the number of nucleons. No other estimate has been performed. Other nuclear potentials (such as Reid 93 potential, Nijmegen potential, Argonne v18 potential and Bonn potential) have been used in [101] to determine the dependence of B D on v and agree with previous studies.

These analyses allow one to reduce all the BBN parameters to the physical constants (αEM, v, me, md − mu, mq) and G that is not affected by this discussion. This set can be further reduced, since all the masses can be expressed in terms of v as m i = h i v, where h i are Yukawa couplings.

To go further, one needs to make more assumptions, such as grand unification, or relating the Yukawa coupling of the top to v by assuming that the weak scale is determined by dimensional transmutation [104], or assuming that the variation of the constants is induced by a string dilaton [77]. At each step, one gets more stringent constraints, which can reach the 10−4 [146] to 10−5 [104] level but are indeed more model-dependent!

Conclusion

Primordial nucleosynthesis offers a possibility to test almost all fundamental constants of physics at a redshift of z ∼ 10⁸. This makes it very rich but indeed the effect of each constant is more difficult to disentangle. The effect of the BBN parameters has been quantified with precision and they can be constrained typically at a 10−2 level, and in particular it seems that the most sensitive parameter is the deuterium binding energy.

The link with more fundamental parameters is better understood but the dependence of the deuterium binding energy still leaves some uncertainties and a good description of the effect of the strange quark mass is missing.

We have not considered the variation of G in this section. Its effect is disconnected from the other parameters. Let us just stress that assessing the BBN sensitivity to G by just modifying its value may be misleading. In particular G can vary a lot during the electron-positron annihilation so that the BBN constraints can in general not be described by an effective speed-up factor [105, 134].

The Gravitational Constant

The gravitational constant was the first constant whose constancy was questioned [155]. From a theoretical point of view, theories with a varying gravitational constant can be designed to satisfy the equivalence principle in its weak form but not in its strong form [540] (see also Section 5). Most theories of gravity that violate the strong equivalence principle predict that the locally measured gravitational constant may vary with time.

The value of the gravitational constant is \(G = 6.674\,28(67) \times 10^{-11}\ {\rm m^3\,kg^{-1}\,s^{-2}}\), so that its relative standard uncertainty, fixed by CODATA (Footnote 11) in 2006, is 0.01%. Interestingly, the disparity between different experiments led, in 1998, to a temporary increase of this uncertainty to 0.15% [241], which demonstrates the difficulty in measuring the value of this constant. This partly explains why the constraints on the time variation are less stringent than for the other constants.

A variation of the gravitational constant, being a purely gravitational phenomenon, does not affect local physics such as atomic transitions or nuclear physics. In particular, it is equivalent to stating that the masses of all particles vary in the same way, so that their ratios remain constant. Similarly, all absorption lines will be shifted in the same way. It follows that most constraints are obtained from systems in which gravity is non-negligible, such as the motion of the bodies of the Solar system, and astrophysical and cosmological systems. They mostly rely on the comparison of a gravitational time scale, e.g., the period of an orbit, to a non-gravitational time scale. It follows that, in general, the constraints assume that the values of the other constants are fixed. Taking their variation into account would add degeneracies and make the constraints cited below less stringent.

We refer to Section IV of FVC [500] for earlier constraints based, e.g., on the determination of the Earth surface temperature, which roughly scales as \({G^{2.25}}M_ \odot ^{1.75}\) and gives a constraint of the order of ∣ΔG/G∣ < 0.1 [224], or on the estimation of the Earth radius at different geological epochs. We also emphasize that constraints on the variation of G are meant to be constraints on the dimensionless parameter αG.

Solar system constraints

Monitoring the orbits of the various bodies of the solar system offers a possibility to constrain deviations from general relativity, and in particular the time variation of G. This amounts to comparing a gravitational time scale (related to the orbital motion) with an atomic time scale, and it is thus assumed that the variation of atomic constants is negligible over the time of the experiment.

The first constraint arises from the Earth-Moon system. A time variation of G is then related to a variation of the mean motion (\(n = 2\pi/P\)) of the orbit of the Moon around the Earth. A decrease in G would cause both the Lunar mean distance and period to increase. As long as the gravitational binding energy is negligible, one has

$${{\dot P} \over P} = - 2{{\dot G} \over G}.$$
(133)
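As a brief sketch of where the factor of −2 in Equation (133) can come from, consider a circular two-body orbit with constant masses and conserved orbital angular momentum. These simplifying assumptions are introduced here for illustration and are not spelled out in the text. Kepler's third law, combined with angular-momentum conservation, then gives

$$n^2 a^3 = GM, \qquad n a^2 = {\rm const} \quad \Rightarrow \quad {{\dot a} \over a} = - {{\dot G} \over G}, \qquad {{\dot n} \over n} = - 2{{\dot a} \over a} = 2{{\dot G} \over G}, \qquad {{\dot P} \over P} = - {{\dot n} \over n} = - 2{{\dot G} \over G}.$$

Under these assumptions, a decreasing G makes both the orbital radius and the period grow, in line with the statement above.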

Earlier constraints rely on paleontological data and ancient eclipse observations (see Section IV.B.1 of FVC [500]), and none of them is very reliable. A main difficulty arises from tidal dissipation, which also causes the mean distance and orbital period to increase (for tidal changes \(2\dot n/n + 3\dot a/a = 0\)), but not in the same ratio as for \(\dot G\).

The Lunar Laser Ranging (LLR) experiment has measured the relative position of the Moon with respect to the Earth with an accuracy of the order of 1 cm over 3 decades. An early analysis of these data [544], assuming a Brans-Dicke theory of gravitation, gave \(\vert \dot G/G \vert \leq 3 \times 10^{-11}\ {\rm yr}^{-1}\). It was improved [365] by using 20 years of observation to get \(\vert \dot G/G \vert \leq 1.04 \times 10^{-11}\ {\rm yr}^{-1}\), the main uncertainty arising from the Lunar tidal acceleration. With 24 years of data, one reached [542] \(\vert \dot G/G \vert \leq 6 \times 10^{-12}\ {\rm yr}^{-1}\) and, finally, the latest analysis of the Lunar laser ranging experiment [543] tightened the constraint to

$${\left. {{{\dot G} \over G}} \right\vert _0} = (4 \pm 9) \times {10^{- 13}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(134)

Similarly, Shapiro et al. [458] compared radar-echo time delays between Earth, Venus and Mercury with a caesium atomic clock between 1964 and 1969. The data were fitted to the theoretical equation of motion for the bodies in a Schwarzschild spacetime, taking into account the perturbations from the Moon and other planets. They concluded that \(\vert \dot G/G \vert < 4 \times 10^{-10}\ {\rm yr}^{-1}\). The data concerning Venus cannot be used due to imprecision in the determination of the portion of the planet reflecting the radar. This was improved to \(\vert \dot G/G \vert < 1.5 \times 10^{-10}\ {\rm yr}^{-1}\) by including Mariner 9 and Mars orbiter data [429]. The analysis was further extended [457] to give \(\dot G/G = (-2 \pm 10) \times 10^{-12}\ {\rm yr}^{-1}\). The combination of Mariner 10 and Mercury and Venus ranging data gives [12]

$${\left. {{{\dot G} \over G}} \right\vert _0} = {(}0.0 \pm 2.0{)} \times {10^{- 12}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(135)

Reasenberg et al. [430] considered the 14 months of data obtained from the ranging of the Viking spacecraft and deduced, assuming a Brans-Dicke theory, \(\vert \dot G/G \vert < 10^{-12}\ {\rm yr}^{-1}\). Hellings et al. [249], using all available astrometric data and in particular the ranging data from the Viking landers on Mars, deduced that

$${\left. {{{\dot G} \over G}} \right\vert _0} = (2 \pm 4) \times {10^{- 12}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(136)

The major contribution to the uncertainty is due to the modeling of the dynamics of the asteroids on the Earth-Mars range. Hellings et al. [249] also tried to attribute their result to a time variation of the atomic constants. Using the same data but a different modeling of the asteroids, Reasenberg [428] got \(\vert \dot G/G \vert < 3 \times 10^{-11}\ {\rm yr}^{-1}\), which was then improved by Chandler et al. [93] to \(\vert \dot G/G \vert < 10^{-11}\ {\rm yr}^{-1}\).

Pulsar timing

Contrary to the solar system case, the dependence of the gravitational binding energy cannot be neglected when computing the time variation of the period. Here two approaches can be followed: either one sticks to a model (e.g., scalar-tensor gravity) and computes all the effects in this model, or one takes a more phenomenological approach and tries to set some model-independent bounds.

Eardley [177] followed the first route and discussed the effects of a time variation of the gravitational constant on binary pulsars in the framework of the Brans-Dicke theory. In that case, both the dipole gravitational radiation and the variation of G induce a periodic variation in the pulse period. Nordtvedt [386] showed that the orbital period changes as

$${{\dot P} \over P} = - \left[ {2 + {{2({m_1}{c_1} + {m_2}{c_2}) + 3({m_1}{c_2} + {m_2}{c_1})} \over {{m_1} + {m_2}}}} \right]{{\dot G} \over G}$$
(137)

where \(c_i \equiv \delta \ln m_i/\delta \ln G\). He concluded that for the pulsar PSR 1913+16 (\(m_1 \simeq m_2\) and \(c_1 \simeq c_2\)) one gets

$${{\dot P} \over P} = - [2 + 5c]{{\dot G} \over G},$$
(138)

the coefficient c being model dependent. As another application, he estimated that \(c_{\rm Earth} \sim -5 \times 10^{-10}\), \(c_{\rm Moon} \sim -10^{-8}\) and \(c_{\rm Sun} \sim -4 \times 10^{-6}\), justifying the formula used in the solar system.
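The reduction from Equation (137) to Equation (138), and the claim that the solar system sensitivities are negligible, can be checked numerically. The following is a minimal sketch: the function name is hypothetical, the pulsar mass and sensitivity are illustrative placeholders, and the Earth-to-Moon mass ratio of about 81.3 is a standard value not taken from the text.

```python
def period_drift_coefficient(m1, m2, c1, c2):
    """Return k such that Pdot/P = -k * Gdot/G, following Equation (137).

    m1, m2 are the masses (any common unit); c1, c2 are the
    sensitivities c_i = dln(m_i)/dln(G).
    """
    correction = (2.0 * (m1 * c1 + m2 * c2) + 3.0 * (m1 * c2 + m2 * c1)) / (m1 + m2)
    return 2.0 + correction


# Equal masses and equal sensitivities: the coefficient reduces to 2 + 5c.
c = -0.1  # illustrative value only, not a measured sensitivity
k_binary = period_drift_coefficient(1.4, 1.4, c, c)

# Earth-Moon system with the sensitivities quoted in the text:
# the correction to the coefficient 2 of Equation (133) is tiny.
k_earth_moon = period_drift_coefficient(81.3, 1.0, -5e-10, -1e-8)

print(k_binary, k_earth_moon)
```

For weakly self-gravitating bodies the coefficient is indistinguishable from 2, which is why Equation (133) is adequate for the Earth-Moon system but not for neutron stars.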

Damour et al. [127] used the timing data of the binary pulsar PSR 1913+16. They implemented the effect of the time variation of G by considering the effect on \(\dot P/P\). They defined, in a phenomenological way, \(\dot G/G = -0.5\,\delta \dot P/P\), where \(\delta \dot P\) is the part of the orbital period derivative that is not explained otherwise (by gravitational-wave radiation damping). This theory-independent definition has to be contrasted with the theory-dependent result (138) by Nordtvedt [386]. They got

$$\dot G/G = {(}1.0 \pm 2.3{)} \times {10^{- 11}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(139)

Damour and Taylor [137] then reexamined the data of PSR 1913+16 and established the upper bound

$$\dot G/G < {(}1.10 \pm 1.07{)} \times {10^{- 11}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(140)

Kaspi et al. [282] used data from PSR B1913+16 and PSR B1855+09 respectively to get

$$\dot G/G = (4 \pm 5) \times {10^{- 12}}{\rm{y}}{{\rm{r}}^{- 1}}$$
(141)

and

$$\dot G/G = (- 9 \pm 18) \times {10^{- 12}}{\rm{y}}{{\rm{r}}^{- 1}},$$
(142)

the latter case being more “secure” since the orbiting companion is not a neutron star.

All the previous results concern binary pulsars but isolated ones can also be used. Heintzmann and Hillebrandt [248] related the spin-down of the pulsar JP1953 to a time variation of G. The spin-down is a combined effect of electromagnetic losses, emission of gravitational waves, and possible spin-up due to matter accretion. Assuming that the angular momentum is conserved so that I/P = constant, one deduces that

$${\left. {{{\dot P} \over P}} \right\vert _G} = \left({{{{\rm{d}}\ln I} \over {{\rm{d}}\ln G}}} \right){{\dot G} \over G}.$$
(143)

The observational spin-down can be decomposed as

$${\left. {{{\dot P} \over P}} \right\vert _{{\rm{obs}}}} = {\left. {{{\dot P} \over P}} \right\vert _{{\rm{mag}}}} + {\left. {{{\dot P} \over P}} \right\vert _{{\rm{GW}}}} + {\left. {{{\dot P} \over P}} \right\vert _G}.$$
(144)

Since \(\dot P/P\vert_{\rm mag}\) and \(\dot P/P\vert_{\rm GW}\) are positive definite, it follows that \(\dot P/P\vert_{\rm obs} \geq \dot P/P\vert_G\), so that a bound on \(\dot G\) can be inferred if the main pulse period is the period of rotation. Heintzmann and Hillebrandt [248] then modelled the pulsar by a polytropic (\(P \propto \rho^n\)) white dwarf and deduced that \({\rm d}\ln I/{\rm d}\ln G = 2 - 3n/2\), so that \(\vert \dot G/G \vert < 10^{-10}\ {\rm yr}^{-1}\). Mansfield [344] assumed a relativistic degenerate, zero-temperature polytropic star and found that, when \(\dot G < 0\), \(0 \leq -\dot G/G < 6.8 \times 10^{-11}\ {\rm yr}^{-1}\) at the 2σ level. He also noted that a positive \(\dot G\) induces a spin-up counteracting the electromagnetic spin-down, which can provide another bound if an independent estimate of the pulsar magnetic field can be obtained. Goldman [233], following Eardley [177], used the scaling relations \(N \propto G^{-3/2}\) and \(M \propto G^{-5/2}\) to deduce that \(2\,{\rm d}\ln I/{\rm d}\ln G = -5 + 3\,{\rm d}\ln I/{\rm d}\ln N\). He used the data from the pulsar PSR 0655+64 to deduce that the rate of decrease of G satisfies

$$0 \leq - \dot G/G < 5.5 \times {10^{- 11}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(145)
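The logic of these spin-down bounds can be sketched in a few lines. Since the magnetic and gravitational-wave terms in Equation (144) are positive, the observed spin-down bounds the G-induced term of Equation (143) from above. The function names and the numerical inputs below are hypothetical and purely illustrative. They are not the values used in the cited analyses.

```python
def polytrope_sensitivity(n):
    """dlnI/dlnG = 2 - 3n/2 for a polytrope P ~ rho**n, as quoted in the text."""
    return 2.0 - 1.5 * n


def max_decrease_rate_of_G(pdot_over_p_obs, dlnI_dlnG):
    """Upper bound on -Gdot/G from Pdot/P|obs >= (dlnI/dlnG) * Gdot/G.

    Only meaningful when dlnI/dlnG < 0, in which case a decreasing G
    spins the star down and cannot exceed the observed spin-down.
    """
    if dlnI_dlnG >= 0:
        raise ValueError("this bound on a decreasing G requires dlnI/dlnG < 0")
    return pdot_over_p_obs / (-dlnI_dlnG)


# Illustrative numbers: n = 5/3 and an observed Pdot/P of 1e-10 per year.
s = polytrope_sensitivity(5.0 / 3.0)      # -0.5
bound = max_decrease_rate_of_G(1e-10, s)  # bound on -Gdot/G, per year
print(s, bound)
```

The bound is one-sided by construction: it limits how fast G can decrease, which is why the results above are quoted in the form \(0 \leq -\dot G/G < \ldots\).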

The analysis [516] of 10 years of high-precision timing data on the millisecond pulsar PSR J0437-4715 improved the constraint to

$$\vert \dot G/G\vert < 2.3 \times {10^{- 11}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(146)

Recently, it was argued [266, 432] that a variation of G would induce a departure of the neutron star matter from β-equilibrium, due to the changing hydrostatic equilibrium. This would force non-equilibrium β-processes to occur, which release energy that is invested partly in neutrino emission and partly in heating the stellar interior. Eventually, the star arrives at a stationary state in which the temperature remains nearly constant, as the forcing through the change of G is balanced by the ongoing reactions. Comparing this prediction with the surface temperature of the nearest millisecond pulsar, PSR J0437-4715, inferred from ultraviolet observations, two upper limits on the variation were obtained: \(\vert \dot G/G \vert < 2 \times 10^{-10}\ {\rm yr}^{-1}\) if direct Urca reactions operating in the neutron star core are allowed, and \(\vert \dot G/G \vert < 4 \times 10^{-12}\ {\rm yr}^{-1}\) considering only modified Urca reactions. This was extended in [302] in order to take into account the correlation between the surface temperatures and the radii of some old neutron stars to get \(\vert \dot G/G \vert < 2.1 \times 10^{-11}\ {\rm yr}^{-1}\).

Stellar constraints

Early works (see Section IV.C of FVC [500]) studied the Solar evolution in the presence of a time-varying gravitational constant, concluding that under the Dirac hypothesis, the original nuclear resources of the Sun would have been burned by now. This results from the fact that an increase of the gravitational constant is equivalent to an increase of the star density (because of the Poisson equation).

The idea of using stellar evolution to constrain the possible value of G was originally proposed by Teller [487], who stressed that the evolution of a star is strongly dependent on G. The luminosity of a main sequence star can be expressed as a function of Newton’s gravitational constant and its mass by using homology relations [224, 487]. In the particular case where the opacity is dominated by free-free transitions, Gamow [224] found that the luminosity of the star is given approximately by \(L \propto G^{7.8}M^{5.5}\). In the case of the Sun, this would mean that for higher values of G, the burning of hydrogen is more efficient and the star evolves more rapidly, so that one needs to increase the initial hydrogen content to obtain the presently observed Sun. In a numerical test of the previous expression, Degl’Innocenti et al. [140] found that low-mass stars evolving from the Zero Age Main Sequence to the red giant branch satisfy \(L \propto G^{5.6}M^{4.7}\), which agrees with the numerical results to within 10%, following the idea that Thomson scattering contributes significantly to the opacity inside such stars. Indeed, in the case of the opacity being dominated by pure Thomson scattering, the luminosity of the star is given by \(L \propto G^{4}M^{3}\). It follows from the previous analysis that the evolution of the star on the main sequence is highly sensitive to the value of G.
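To get a feeling for how steep these homology scalings are, the sketch below evaluates the luminosity change produced by a small change of G at fixed mass for the three exponent pairs quoted above (7.8 and 5.5, 5.6 and 4.7, 4 and 3). The function name, the dictionary labels and the 1% shift in G are illustrative assumptions.

```python
# (exponent of G, exponent of M) in L ~ G**a * M**b, as quoted in the text
SCALINGS = {
    "free-free opacity": (7.8, 5.5),
    "numerical fit, low-mass stars": (5.6, 4.7),
    "pure Thomson scattering": (4.0, 3.0),
}


def luminosity_ratio(g_ratio, m_ratio=1.0, g_exp=7.8, m_exp=5.5):
    """Return L'/L for G -> g_ratio * G and M -> m_ratio * M."""
    return g_ratio ** g_exp * m_ratio ** m_exp


# Effect of a hypothetical 1% increase of G at fixed mass.
for label, (a, b) in SCALINGS.items():
    ratio = luminosity_ratio(1.01, 1.0, a, b)
    print(f"{label}: L'/L = {ratio:.4f}")
```

Even in the mildest case, a percent-level change of G changes the luminosity by several percent, which is the origin of the sensitivity of stellar lifetimes to G.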

The driving idea behind the stellar constraints is that a secular variation of G leads to a variation of the gravitational interaction. This would affect the hydrostatic equilibrium of the star and in particular its pressure profile. In the case of non-degenerate stars, the temperature, being the only control parameter, will adjust to compensate the modification of the intensity of gravity. It will then affect the nuclear reaction rates, which are very sensitive to the temperature, and thus the nuclear time scales associated with the various processes. It follows that the main stages of stellar evolution, and in particular the lifetimes of the various stars, will be modified. As we shall see, basically two types of methods have been used: the first, in which one relates the variation of G to some physical characteristic of a star (luminosity, effective temperature, radius), and the second, in which only a statistical measurement of the change of G can be inferred. Indeed, the first class of methods is more reliable and robust but is usually restricted to nearby stars. Note also that these methods usually require a precise distance determination of the star, which may depend on G.

Ages of globular clusters

The first application of this idea has been performed with globular clusters. Their ages, determined for instance from the luminosity of the main-sequence turn-off, have to be compatible with the estimation of the age of the galaxy. This gives the constraint [140]

$$\dot G/G = {(} - 1.4 \pm 2.1{)} \times {10^{- 11}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(147)

The effect of a possible time dependence of G on luminosity has been studied in the case of globular cluster H-R diagrams but has not yielded any stronger constraints than those relying on celestial mechanics.

Solar and stellar seismology

A side effect of the change of luminosity is a change in the depth of the convection zone, so that the inner edge of the convecting zone changes its location. This induces a modification of the vibration modes of the star, particularly of the acoustic waves, i.e., p-modes [141].

Helioseismology. These waves are observed for our star, the Sun, and helioseismology allows one to determine the sound speed in the core of the Sun and, together with an equation of state, the central densities and abundances of helium and hydrogen. Demarque et al. [141] considered an ansatz in which \(G \propto t^{-\gamma}\) and showed that \(\vert \gamma \vert < 0.1\) over the last \(4.5 \times 10^{9}\) years, which corresponds to \(\vert \dot G/G \vert < 2 \times 10^{-11}\ {\rm yr}^{-1}\). Guenther et al. [240] also showed that g-modes could provide much tighter constraints, but these modes are up to now very difficult to observe. Nevertheless, they concluded, using the claim of detection by Hill and Gu [251], that \(\vert \dot G/G \vert < 4.5 \times 10^{-12}\ {\rm yr}^{-1}\). Guenther et al. [239] then compared the p-mode spectra predicted by different theories with varying gravitational constant to the observed spectrum obtained by a network of six telescopes and deduced that

$$\vert \dot G/G\vert < 1.6 \times {10^{- 12}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(148)
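The conversion between a bound on a power-law index and a bound on the present rate of change, used for the Demarque et al. result quoted above, is a one-line calculation. The sketch below assumes the convention \(G \propto t^{-\gamma}\), so that \(\dot G/G = -\gamma/t\). The sign convention and the function name are assumptions made here, and the bound on the magnitude does not depend on the sign.

```python
def gdot_over_g_from_power_law(index, time_yr):
    """Gdot/G in yr^-1 for G ~ t**(-index), evaluated at time `time_yr`."""
    return -index / time_yr


# |gamma| < 0.1 over the last 4.5e9 years, as quoted in the text.
rate = abs(gdot_over_g_from_power_law(0.1, 4.5e9))
print(f"|Gdot/G| < {rate:.1e} per year")
```

This reproduces the order of magnitude \(2 \times 10^{-11}\ {\rm yr}^{-1}\) stated in the text.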

The standard solar model depends on few parameters and G plays an important role since stellar evolution is dictated by the balance between gravitation and other interactions. Astronomical observations determine \(GM_\odot\) with an accuracy better than \(10^{-7}\), and a variation of G with \(GM_\odot\) fixed induces a change of the pressure \((P \sim GM_ \odot ^2/R_ \odot ^4)\) and density \((\rho \sim {M_ \odot}/R_ \odot ^3)\). The experimental uncertainties in G between different experiments have important implications for helioseismology. In particular, the uncertainties for the standard solar model lead to a range in the value of the sound speed in the nuclear region that is as much as 0.15% higher than the inverted helioseismic sound speed [335]. While a lower value of G is preferred for the standard model, any definite prediction is masked by the uncertainties in the solar models available in the literature. Ricci and Villante [436] studied the effect of a variation of G on the density and pressure profile of the Sun and concluded that present data cannot constrain G better than \(10^{-2}\)%. It was also shown [335] that the information provided by the neutrino experiments is quite significant because it constitutes an independent test of G complementary to the one provided by helioseismology.

White dwarfs. The observation of the period of non-radial pulsations of white dwarfs allows one to set similar constraints. White dwarfs represent the final stage of stellar evolution for stars with a mass smaller than about \(10\,M_\odot\). Their structure is supported against gravitational collapse by the pressure of degenerate electrons. It was discovered that some white dwarfs are variable stars and in fact non-radial pulsators. This opens the way to using seismological techniques to investigate their internal properties. In particular, their non-radial oscillations are mostly determined by the Brunt-Väisälä frequency

$${N^2} = g{{{\rm{d}}\ln \left( {{P^{1/{\Gamma _1}}}/\rho } \right)} \over {{\rm{d}}r}}$$

where g is the gravitational acceleration, \(\Gamma_1\) the first adiabatic exponent and P and ρ the pressure and density (see, e.g., [283] for a white dwarf model taking into account a varying G). A variation of G induces a modification of the degree of degeneracy of the white dwarf, hence of the frequency N as well as of the cooling rate of the star, even though the latter is thought to be negligible at the luminosities where white dwarfs are pulsationally unstable [54]. Using the observation of G117-B15A, which has been monitored for 20 years, it was concluded [43] that

$$- 2.5 \times {10^{- 10}}{\rm{y}}{{\rm{r}}^{- 1}} < \dot G/G < 4.0 \times {10^{- 11}}{\rm{y}}{{\rm{r}}^{- 1}},$$
(149)

at a 2σ-level. The same observations were reanalyzed in [54] to obtain

$$\vert \dot G/G\vert < 4.1 \times {10^{- 11}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(150)

Late stages of stellar evolution and supernovae

A variation of G can influence the white dwarf cooling and the light curves of Type Ia supernovae.

García-Berro et al. [225] considered the effect of a variation of the gravitational constant on the cooling of white dwarfs and on their luminosity function. As first pointed out by Vila [518], the energy of white dwarfs, when they are cool enough, is entirely of gravitational and thermal origin so that a variation of G will induce a modification of their energy balance and thus of their luminosity. Restricting to cold white dwarfs with luminosity smaller than ten solar luminosity, the luminosity can be related to the star binding energy B and gravitational energy, Egrav, as

$$L = - {{{\rm{d}}B} \over {{\rm{d}}t}} + {{\dot G} \over G}{E_{{\rm{grav}}}},$$
(151)

which simply results from the hydrostatic equilibrium. Again, the variation of the gravitational constant intervenes via the Poisson equation and the gravitational potential. The cooling process is accelerated if Ġ/G < 0, which then induces a shift in the position of the cut-off in the luminosity function. García-Berro et al. [225] concluded that

$$0 \leq - \dot G/G < (1 \pm 1) \times {10^{- 11}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(152)

The result depends on the details of the cooling theory, on whether the C/O white dwarf is stratified or not, and on the hypothesis made about the age of the galactic disk. For instance, with no stratification of the C/O binary mixture, one would require \(\dot G/G = -(2.5 \pm 0.5) \times 10^{-11}\ {\rm yr}^{-1}\) if the solar neighborhood has an age of 8 Gyr (i.e., one would require a variation of G to explain the data). In the case of the standard hypothesis of an age of 11 Gyr, one obtains that \(0 \leq -\dot G/G < 3 \times 10^{-11}\ {\rm yr}^{-1}\).

The late stages of stellar evolution are governed by the Chandrasekhar mass \({(\hbar c/G)^{3/2}}m_{\rm{n}}^{- 2}\) mainly determined by the balance between the Fermi pressure of a degenerate electron gas and gravity.

Simple analytical models of the light curves of Type Ia supernovae predict that the peak of luminosity is proportional to the mass of nickel synthesized. In a good approximation, it is a fixed fraction of the Chandrasekhar mass. In models allowing for a varying G, this would induce a modification of the luminosity distance-redshift relation [227, 232, 435]. However, it was shown that this effect is small. Note that it will be degenerate with the cosmological parameters. In particular, the Hubble diagram is sensitive to the whole history of G(t) between the highest redshift observed and today so that one needs to rely on a better defined model, such as, e.g., scalar-tensor theory [435] (the effect of the Fermi constant was also considered in [194]).

In the case of Type II supernovae, the Chandrasekhar mass also governs the late evolutionary stages of massive stars, including the formation of neutron stars. Assuming that the mean neutron star mass is given by the Chandrasekhar mass, one expects that \(\dot G/G = -2\dot M_{\rm NS}/3M_{\rm NS}\). Thorsett [492] used the observations of five neutron star binaries for which five Keplerian parameters can be determined (the binary period \(P_b\), the projection of the orbital semi-major axis \(a_1 \sin i\), the eccentricity e, the time and longitude of the periastron \(T_0\) and ω) as well as the relativistic advance of the angle of the periastron \({\dot \omega}\). Assuming that the neutron star masses vary slowly as \({M_{{\rm{NS}}}} = M_{{\rm{NS}}}^{(0)} - {{\dot M}_{{\rm{NS}}}}{t_{{\rm{NS}}}}\), that their age is determined by the rate at which \(P_b\) is increasing (so that \(t_{\rm NS} \simeq 2P_b/\dot P_b\)) and that the mass follows a normal distribution, Thorsett [492] deduced that, at 2σ,

$$\dot G/G = {(} - 0.6 \pm 4.2{)} \times {10^{- 12}}{\rm{y}}{{\rm{r}}^{- 1}}.$$
(153)
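The relation \(\dot G/G = -2\dot M_{\rm NS}/3M_{\rm NS}\) follows directly from the \(G^{-3/2}\) scaling of the Chandrasekhar mass scale \((\hbar c/G)^{3/2} m_{\rm n}^{-2}\). The sketch below evaluates that scale and checks the scaling numerically. The values of ħ, c, the neutron mass and the solar mass are standard reference values supplied here, not taken from the text, and the function names are hypothetical. Note that this is only the mass scale: the actual Chandrasekhar limit carries an additional numerical prefactor and a composition dependence that are not modelled.

```python
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G0 = 6.67428e-11         # m^3 kg^-1 s^-2, the 2006 value quoted in the text
M_NEUTRON = 1.67492750e-27  # kg
M_SUN = 1.98841e30       # kg


def chandrasekhar_scale(G=G0):
    """Mass scale (hbar c / G)**1.5 / m_n**2 in kg (no numerical prefactor)."""
    return (HBAR * C / G) ** 1.5 / M_NEUTRON ** 2


def gdot_over_g_from_mass_drift(mdot_over_m):
    """Gdot/G implied by a drift of the mean neutron star mass, M ~ G**-1.5."""
    return -(2.0 / 3.0) * mdot_over_m


scale_in_msun = chandrasekhar_scale() / M_SUN
ratio = chandrasekhar_scale(1.01 * G0) / chandrasekhar_scale(G0)
print(scale_in_msun, ratio)
```

A 1% increase of G lowers the mass scale by about 1.5%, which is the content of the factor 2/3 in the relation used by Thorsett.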

New developments

It has recently been proposed that, since a variation of G modifies the binding energy of a binary, it should affect the gravitational wave luminosity, hence leading to corrections in the chirping frequency [554]. For instance, it was estimated that a LISA observation of an equal-mass inspiral event with total redshifted mass of \(10^5\,M_\odot\) for three years should be able to measure \(\dot G/G\) at the time of merger to better than \(10^{-11}\ {\rm yr}^{-1}\). This method paves the way to constructing constraints over a large band of redshifts as well as in different directions in the sky, which would be invaluable for many models.

More speculative is the idea [25] that a variation of G can lead a neutron star to enter the region where strange or hybrid stars are the true ground state. This would be associated with gamma-ray bursts, which are claimed to be able to reach the level of \(10^{-17}\ {\rm yr}^{-1}\) on the time variation of G.

Cosmological constraints

Cosmological observations are more difficult to use in order to set constraints on the time variation of G. In particular, they require some idea of the whole history of G as a function of time and, since the variation of G reflects an extension of general relativity, they also require modifying all the equations describing the evolution (of the universe and of the large scale structure) in a consistent way. We refer to [504, 502, 506] for a discussion of the use of cosmological data to constrain deviations from general relativity.

Cosmic microwave background

A time-dependent gravitational constant will have mainly three effects on the CMB angular power spectrum (see [435] for discussions in the framework of scalar-tensor gravity in which G is considered as a field):

  1.

    The variation of G modifies the Friedmann equation and therefore the age of the Universe (and, hence, the sound horizon). For instance, if G is larger at earlier time, the age of the Universe is smaller at recombination, so that the peak structure is shifted towards higher angular scales.

  2.

    The amplitude of the Silk damping is modified. At small scales, viscosity and heat conduction in the photon-baryon fluid produce a damping of the photon perturbations. The damping scale is determined by the photon diffusion length at recombination, and therefore depends on the size of the horizon at this epoch, and hence, depends on any variation of the Newton constant throughout the history of the Universe.

  3.

    The thickness of the last scattering surface is modified. In the same vein, the duration of recombination is modified by a variation of the Newton constant as the expansion rate is different. It is well known that CMB anisotropies are affected on small scales because the last scattering “surface” has a finite thickness. The net effect is to introduce an extra, roughly exponential, damping term, with the cutoff length being determined by the thickness of the last scattering surface. When translating redshift into time (or length), one has to use the Friedmann equations, which are affected by a variation of the Newton constant. The relevant quantity to consider is the visibility function g. In the limit of an infinitely thin last scattering surface, the optical depth τ goes from ∞ to 0 at the recombination epoch. For standard cosmology, it drops from a large value to a much smaller one, and hence, the visibility function still exhibits a peak, but it is much broader.

In full generality, the effect of a variation of G on the CMB temperature anisotropies depends on many factors: (1) the modification of the background equations and the evolution of the universe, (2) the modification of the perturbation equations, (3) whether the scalar field inducing the time variation of G is negligible or not compared to the other matter components, and (4) the time profile of G, which has to be determined consistently with the other equations of evolution. This explains why it is very difficult to state a definitive constraint. For instance, in the case of scalar-tensor theories (see below), one has two arbitrary functions that dictate the variation of G. As can be seen, e.g., from [435, 378], the profiles and effects on the CMB can be very different and difficult to compare. Indeed, the effects described above are also degenerate with a variation of the cosmological parameters.

In the case of Brans-Dicke theory, one just has a single constant parameter \(\omega_{\rm BD}\) characterizing the deviation from general relativity and the time variation of G. Thus, it is easier to compare the different constraints. Chen and Kamionkowski [94] showed that CMB experiments such as WMAP will be able to constrain these theories for \(\omega_{\rm BD} < 100\) if all parameters are to be determined by the same CMB experiment, \(\omega_{\rm BD} < 500\) if all parameters are fixed but the CMB normalization, and \(\omega_{\rm BD} < 800\) if one uses the polarization. For the Planck mission these numbers are, respectively, 800, 2500 and 3200. Ref. [2] concluded from the analysis of WMAP, ACBAR, VSA and CBI, and galaxy power spectrum data from 2dF, that \(\omega_{\rm BD} > 120\), in agreement with the former analysis of [378]. An analysis [549] indicates that the ‘WMAP-5yr data’ and the ‘all CMB data’ both favor a slightly non-zero (positive) \(\dot G/G\), but with the addition of the SDSS power spectrum data the best-fit value is back to zero, concluding that \(-0.083 < \Delta G/G < 0.095\) between recombination and today, which corresponds to \(-1.75 \times 10^{-12}\ {\rm yr}^{-1} < \dot G/G < 1.05 \times 10^{-12}\ {\rm yr}^{-1}\).

From a more phenomenological perspective, some works modeled the variation of G with time in a purely ad hoc way, for instance [89] by assuming a linear evolution with time or a step function.

BBN

As explained in detail in Section 3.8.1, changing the value of the gravitational constant affects the freeze-out temperature \(T_{\rm f}\). A larger value of G corresponds to a higher expansion rate. This rate is determined by the combination Gρ and, in the standard case, the Friedmann equations imply that \(G\rho t^2\) is constant. The density ρ is determined by the number \(N_*\) of relativistic particles at the time of nucleosynthesis, so that nucleosynthesis allows one to put a bound on the number of neutrinos \(N_\nu\). Equivalently, assuming the number of neutrinos to be three leads to the conclusion that G has not varied by more than 20% since nucleosynthesis. But allowing for a change in both G and \(N_\nu\) allows for a wider range of variation. Contrary to the fine structure constant, the role of G is less involved.

The effect of a varying G can be described, in its simplest but still useful form, by introducing a speed-up factor, \(\xi = H/H_{\rm GR}\), that arises from the modification of the value of the gravitational constant during BBN. Other approaches considered the full dynamics of the problem but restricted themselves to the particular class of Jordan-Fierz-Brans-Dicke theory [1, 16, 26, 84, 102, 128, 441, 551] (Casas et al. [84] concluded from the study of helium and deuterium that \(\omega_{\rm BD} > 380\) when \(N_\nu = 3\) and \(\omega_{\rm BD} > 50\) when \(N_\nu = 2\)), of a massless dilaton with a quadratic coupling [105, 106, 134, 446], or to a general massless dilaton [455]. It should be noted that a combined analysis of BBN and CMB data was investigated in [113, 292]. The former considered G constant during BBN while the latter focused on a non-minimal quadratic coupling and a runaway potential. It was concluded that BBN, in conjunction with the WMAP determination of η, requires ΔG/G to be smaller than 20%. However, we stress that the dynamics of the field can modify the CMB results (see Section 4.4.1 above), so that one needs to be careful when inferring \(\Omega_{\rm b}\) from WMAP unless the scalar-tensor theory has converged close to general relativity at the time of decoupling.

In early studies, Barrow [26] assumed that \(G \propto t^{-n}\) and obtained from the helium abundances that \(-5.9 \times 10^{-3} < n < 7 \times 10^{-3}\), which implies that \(\vert \dot G/G \vert < (2 \pm 9.3)\,h \times 10^{-12}\ {\rm yr}^{-1}\), assuming a flat universe. This corresponds in terms of the Brans-Dicke parameter to \(\omega_{\rm BD} > 25\). Yang et al. [551] included the deuterium and lithium to improve the constraint to \(n < 5 \times 10^{-3}\), which corresponds to \(\omega_{\rm BD} > 50\). It was further improved by Rothman and Matzner [441] to \(\vert n \vert < 3 \times 10^{-3}\), implying \(\vert \dot G/G \vert < 1.7 \times 10^{-13}\ {\rm yr}^{-1}\). Accetta et al. [1] studied the dependence of the abundances of D, \(^3\)He, \(^4\)He and \(^7\)Li upon the variation of G and concluded that \(-0.3 < \Delta G/G < 0.4\), which roughly corresponds to \(\vert \dot G/G \vert < 9 \times 10^{-13}\ {\rm yr}^{-1}\). All these investigations assumed that the other constants are kept fixed and that physics is unchanged. Kolb et al. [295] assumed a correlated variation of G, \(\alpha_{\rm EM}\) and \(G_{\rm F}\) and got a bound on the variation of the radius of the extra dimensions.

Although the uncertainty in the helium-4 abundance has been argued to be significantly larger than what was assumed in the past [401], interesting bounds can still be derived [117]. In particular, translating the bound on extra relativistic degrees of freedom (\(-0.6 < \delta N_\nu < 0.82\)) into a constraint on the speed-up factor (\(0.949 < \xi < 1.062\)), it was concluded [117], since \(\Delta G/G = \xi^2 - 1 = 7\delta N_\nu/43\), that

$$- 0.10 < {{\Delta G} \over G} < 0.13.$$
(154)
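The arithmetic behind Equation (154) can be reproduced with a short sketch. The relation \(\Delta G/G = \xi^2 - 1 = 7\delta N_\nu/43\) is taken from the text. Reading the factor 7/43 as the ratio of the 7/4 contributed by one neutrino species to the standard 43/4 effective relativistic degrees of freedom is the usual counting, stated here as background rather than quoted from the source. The function names are hypothetical.

```python
import math

G_STAR_STANDARD = 43.0 / 4.0  # relativistic degrees of freedom with three neutrinos
G_STAR_PER_NEUTRINO = 7.0 / 4.0


def delta_g_from_extra_neutrinos(delta_n_nu):
    """Delta G / G equivalent to delta_n_nu extra neutrino species (7 dN / 43)."""
    return G_STAR_PER_NEUTRINO * delta_n_nu / G_STAR_STANDARD


def delta_g_from_speedup(xi):
    """Delta G / G = xi**2 - 1 for a speed-up factor xi = H / H_GR."""
    return xi ** 2 - 1.0


def speedup_from_extra_neutrinos(delta_n_nu):
    """Speed-up factor corresponding to delta_n_nu extra neutrino species."""
    return math.sqrt(1.0 + delta_g_from_extra_neutrinos(delta_n_nu))


for dn in (-0.6, 0.82):
    print(dn, round(delta_g_from_extra_neutrinos(dn), 2),
          round(speedup_from_extra_neutrinos(dn), 3))
```

Both routes, starting from the quoted range of \(\delta N_\nu\) or from the quoted range of ξ, give the interval of Equation (154) once rounded to two decimal places.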

The relation between the speed-up factor, or an extra number of relativistic degrees of freedom, with a variation of G is only approximate since it assumes that the variation of G affects only the Friedmann equation by a renormalization of G. This is indeed accurate only when the scalar field is slow-rolling. For instance [105], the speed-up factor is given (with the notations of Section 5.1.1) by

$$\xi = {{A({\varphi _\ast})} \over {{A_0}}}{{1 + \alpha ({\varphi _\ast})\varphi _\ast ^{\prime}} \over {\sqrt {1 - \varphi _\ast ^{\prime 2}/3}}}{1 \over {\sqrt {1 + \alpha _0^2}}}$$

so that

$${\xi ^2} = {G \over {{G_0}}}{{{{(1 + \alpha ({\varphi _\ast})\varphi _\ast ^{\prime})}^2}} \over {(1 + {\alpha ^2}({\varphi _\ast}))(1 - \varphi _\ast ^{\prime 2}/3)}},$$
(155)

so that ΔG/G0 = ξ2 − 1 only if α ≪ 1 (small deviation from general relativity) and \(\varphi _\ast^{\prime} \ll 1\) (slowly rolling dilaton). The BBN in scalar-tensor theories was investigated [105, 134] in the case of a two-parameter family involving a non-linear scalar field-matter coupling function. These studies concluded that even in the cases where before BBN the scalar-tensor theory was far from general relativity, BBN enables one to set quite tight constraints on the observable deviations from general relativity today. In particular, neglecting the cosmological constant, BBN imposes \(\alpha _0^2 < {10^{- 6.5}}{\beta ^{- 1}}{({\Omega _{{\rm{mat}}}}{h^2}/0.15)^{- 3/2}}\) when β > 0.5 (with the definitions introduced below Equation (164)).
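To see how large the correction to the naive relation can be, Equation (155) can be evaluated directly. This is a minimal sketch: the function name and the sample values of α and φ*′ are illustrative assumptions, not values taken from [105].

```python
def xi_squared(g_ratio, alpha, dphi):
    """Eq. (155): xi^2 for G/G0 = g_ratio, coupling alpha and phi_*' = dphi."""
    numerator = (1.0 + alpha * dphi) ** 2
    denominator = (1.0 + alpha**2) * (1.0 - dphi**2 / 3.0)
    return g_ratio * numerator / denominator


# In the limit alpha -> 0 and phi_*' -> 0 one recovers xi^2 = G/G0.
print(xi_squared(1.1, 0.0, 0.0))

# With G = G0 but a coupling and field velocity of 0.1 (illustrative values),
# xi^2 departs from G/G0 at the percent level.
print(xi_squared(1.0, 0.1, 0.1))
```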

Theories With Varying Constants

As explained in the introduction, Dirac postulated that G varies as the inverse of the cosmic time. Such a hypothesis is indeed not a theory since the evolution of G with time is postulated and not derived from an equation of evolution (Footnote 12) consistent with the other field equations, which have to take into account that G is no longer a constant (in particular, in a Lagrangian formulation one needs to take into account that G is no longer constant when varying the action).

The first implementation of Dirac’s phenomenological idea into a field-theory framework (i.e., modifying Einstein’s gravity and incorporating non-gravitational forces and matter) was proposed by Jordan [268]. He realized that the constants have to become dynamical fields and proposed the action

$$S = \int {\sqrt {- g} {{\rm{d}}^{\rm{4}}}{\bf{x}}{\phi ^\eta}\left[ {R - \xi {{\left({{{\nabla \phi} \over \phi}} \right)}^2} - {\phi \over 2}{F^2}} \right],}$$
(156)

η and ξ being two parameters. It follows that both G and the fine-structure constant have been promoted to the status of a dynamical field.

Fierz [195] realized that with such a Lagrangian, atomic spectra will be space-time-dependent, and he proposed to fix η to the value −1 to prevent such a space-time dependence. This led to the definition of a one-parameter (ξ) class of scalar-tensor theories in which only G is assumed to be a dynamical field. This was then further explored by Brans and Dicke [67] (with the change of notation ξ → ω). In this Jordan-Fierz-Brans-Dicke theory the gravitational constant is replaced by a scalar field, which can vary both in space and time. It follows that, for cosmological solutions, \(G \propto t^{-n}\) with \(n^{-1} = 2 + 3\omega_{\rm BD}/2\). Thus, Einstein’s gravity is recovered when ωBD → ∞. This kind of theory was further generalized to obtain various functional dependencies for G in the formalisation of scalar-tensor theories of gravitation (see, e.g., Damour and Esposito-Farèse [124] or Will [540]).

Introducing new fields: generalities

The example of scalar-tensor theories

Let us start by recalling how the standard general relativistic framework can be extended to make G dynamical, using the example of scalar-tensor theories, in which gravity is mediated not only by a massless spin-2 graviton but also by a spin-0 scalar field that couples universally to matter fields (this ensures the universality of free fall). In the Jordan frame, the action of the theory takes the form

$$S = \int {{{{{\rm{d}}^{\rm{4}}}x} \over {16\pi {G_\ast}}}\sqrt {- g} [F(\varphi)R - {g^{\mu \nu}}Z(\varphi){\varphi _{,\mu}}{\varphi _{,\nu}} - 2U(\varphi)] + {S_{{\rm{matter}}}}[\psi;{g_{\mu \nu}}]}$$
(157)

where G* is the bare gravitational constant. This action involves three arbitrary functions (F, Z and U) but only two are physical since there is still the possibility to redefine the scalar field. F needs to be positive to ensure that the graviton carries positive energy. Smatter is the action of the matter fields that are coupled minimally to the metric g μν . In the Jordan frame, the matter is universally coupled to the metric so that the length and time as measured by laboratory apparatus are defined in this frame.

The variation of this action gives the following field equations

$$\begin{array}{*{20}{c}} {F(\varphi )\left( {{R_{\mu \nu }} - \frac{1}{2}{g_{\mu \nu }}R} \right) = 8\pi {G_*}{T_{\mu \nu }} + Z(\varphi )\left[ {{\partial _\mu }\varphi {\partial _\nu }\varphi - \frac{1}{2}{g_{\mu \nu }}{{({\partial _\alpha }\varphi )}^2}} \right]} \\ { + {\nabla _\mu }{\partial _\nu }F(\varphi ) - {g_{\mu \nu }}\square F(\varphi ) - {g_{\mu \nu }}U(\varphi )\;,} \end{array}$$
(158)
$$2Z(\varphi)\;\square\varphi = - {{dF} \over {d\varphi}}R - {{dZ} \over {d\varphi}}{({\partial _\alpha}\varphi)^2} + 2{{dU} \over {d\varphi}},$$
(159)
$${\nabla _\mu}T_\nu ^\mu = 0,$$
(160)

where \(T \equiv T_\mu ^\mu\) is the trace of the matter energy-momentum tensor \({T^{\mu \nu}} \equiv (2/\sqrt {- g}) \times \delta {S_m}/\delta {g_{\mu \nu}}\). As expected [183], we have an equation that reduces to the standard Einstein equation when φ is constant and a new equation to describe the dynamics of the new degree of freedom, while the conservation equation of the matter fields is unchanged, as expected from the weak equivalence principle.

It is useful to define an Einstein frame action through a conformal transformation of the metric \(g_{\mu \nu}^{\ast} = F(\varphi){g_{\mu \nu}}\). In the following all quantities labeled by a star (*) will refer to the Einstein frame. Defining the field φ* and the two functions A(φ*) and V(φ*) (see, e.g., [191]) by

$${\left({{{{\rm{d}}{\varphi _\ast}} \over {{\rm{d}}\varphi}}} \right)^2} = {3 \over 4}{\left({{{{\rm{d}}\ln F(\varphi)} \over {{\rm{d}}\varphi}}} \right)^2} + {1 \over {2F(\varphi)}},\quad A({\varphi _\ast}) = {F^{- 1/2}}(\varphi),\quad 2V({\varphi _\ast}) = U(\varphi){F^{- 2}}(\varphi),$$

the action (157) reads as

$$S = {1 \over {16\pi {G_\ast}}}\int {{{\rm{d}}^4}x\sqrt {- {g_\ast}} [{R_\ast} - 2g_\ast ^{\mu \nu}{\partial _\mu}{\varphi _\ast}{\partial _\nu}{\varphi _\ast} - 4V] + {S_{{\rm{matter}}}}[{A^2}g_{\mu \nu}^{\ast};\psi ].}$$
(161)

The kinetic terms have been diagonalized so that the spin-2 and spin-0 degrees of freedom of the theory are perturbations of \(g_{\mu \nu}^{\ast}\) and φ* respectively. In this frame the field equations are given by

$$R_{\mu \nu}^{\ast} - {1 \over 2}{R^{\ast}}g_{\mu \nu}^{\ast} = 8\pi {G_\ast}T_{\mu \nu}^{\ast} + 2{\partial _\mu}{\varphi _\ast}{\partial _\nu}{\varphi _\ast} - g_{\mu \nu}^{\ast}(g_\ast ^{\alpha \beta}{\partial _\alpha}{\varphi _\ast}{\partial _\beta}{\varphi _\ast}) - 2V(\varphi)g_{\mu \nu}^{\ast},$$
(162)
$${\square_\ast}{\varphi _\ast} = - 4\pi {G_\ast}\alpha ({\varphi _\ast}){T_\ast} + dV(\varphi)/d{\varphi _\ast},$$
(163)
$$\nabla _\mu ^{\ast}T_{\ast \nu}^\mu = \alpha ({\varphi _\ast}){T_\ast}{\partial _\nu}{\varphi _\ast},$$
(164)

with α ≡ dlnA/dφ* and β ≡ dα/dφ*. In this version, the Einstein equations are not modified, but since the theory can now be seen as a theory in which all the masses are varying in the same way, there is a source term in the conservation equation. This shows that the same theory can be interpreted as a varying G theory or a universally varying mass theory, but remember that whatever its form the important parameter is the dimensionless quantity Gm2/ħc.

The action (157) defines an effective gravitational constant Geff = G*/F = G*A2. This constant does not correspond to the gravitational constant effectively measured in a Cavendish experiment. The Newton constant measured in this experiment is

$${G_{{\rm{cav}}}} = {G_\ast}A_0^2(1 + \alpha _0^2) = {{{G_\ast}} \over F}\left({1 + {{F_\phi ^2} \over {2F + 3F_\phi ^2}}} \right)$$
(165)

where the first term, \({G_ \ast}A_0^2\) corresponds to the exchange of a graviton while the second term \({G_ \ast}A_0^2\alpha _0^2\) is related to the long range scalar force, a subscript 0 referring to the quantity evaluated today. The gravitational constant depends on the scalar field and is thus dynamical.
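The equality of the two forms of Equation (165) can be checked numerically from the definitions of φ* and A(φ*) given above. The sketch below assumes Z = 1, as in the definition of dφ*/dφ, and the sample choice F(φ) = φ is an illustrative Brans-Dicke-like example rather than a model singled out by the text.

```python
def alpha_squared(F, dF):
    """alpha^2 = (dlnA/dphi_*)^2 for given F(phi) and dF/dphi, with A = F^(-1/2)."""
    dlnA_dphi = -0.5 * dF / F
    dphistar_dphi_sq = 0.75 * (dF / F) ** 2 + 1.0 / (2.0 * F)
    return dlnA_dphi**2 / dphistar_dphi_sq


def g_cav_einstein(F, dF, g_star=1.0):
    """Einstein-frame form G_* A^2 (1 + alpha^2)."""
    return g_star / F * (1.0 + alpha_squared(F, dF))


def g_cav_jordan(F, dF, g_star=1.0):
    """Jordan-frame form (G_*/F) (1 + F_phi^2 / (2F + 3 F_phi^2))."""
    return g_star / F * (1.0 + dF**2 / (2.0 * F + 3.0 * dF**2))


# Illustrative choice F(phi) = phi, so dF/dphi = 1, evaluated at phi = 2.
print(g_cav_einstein(2.0, 1.0), g_cav_jordan(2.0, 1.0))
```

Both expressions agree because α2 = Fφ2/(2F + 3Fφ2) follows from the definitions of A and φ*.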

This illustrates the main features that will appear in any such models: (i) new dynamical fields appear (here a scalar field), (ii) some constant will depend on the value of this scalar field (here G is a function of the scalar field). It follows that the Einstein equations will be modified and that there will exist a new equation dictating the propagation of the new degree of freedom.

In this particular example, the coupling of the scalar field is universal so that no violation of the universality of free fall is expected. The deviation from general relativity can be quantified in terms of the post-Newtonian parameters, which can be expressed in terms of the values of α and β today as

$${\gamma ^{{\rm{PPN}}}} - 1 = - {{2\alpha _0^2} \over {1 + \alpha _0^2}},\quad {\beta ^{{\rm{PPN}}}} - 1 = {1 \over 2}{{{\beta _0}\alpha _0^2} \over {{{(1 + \alpha _0^2)}^2}}}.$$
(166)

These expressions are valid only if the field is light on the solar system scales. If this is not the case then these conclusions may be changed [287]. The solar system constraints imply α0 to be very small, typically \(\alpha _0^2 < {10^{- 5}}\) while β0 can still be large. Binary pulsar observations [125, 189] impose that β0 > −4.5. The time variation of G is then related to α0, β0 and the time variation of the scalar field today

$${{{{\dot G}_{{\rm{cav}}}}} \over {{G_{{\rm{cav}}}}}} = 2{\alpha _0}\left({1 + {{{\beta _0}} \over {1 + \alpha _0^2}}} \right){\dot \varphi _{\ast 0}}.$$
(167)
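Equations (166) and (167) are simple enough to evaluate directly. The following sketch uses illustrative function names, and the sample values of α0, β0 and the field velocity are assumptions chosen only to show orders of magnitude.

```python
def ppn_parameters(alpha0, beta0):
    """Return (gamma_PPN - 1, beta_PPN - 1) from Eq. (166)."""
    a2 = alpha0**2
    gamma_minus_1 = -2.0 * a2 / (1.0 + a2)
    beta_minus_1 = 0.5 * beta0 * a2 / (1.0 + a2) ** 2
    return gamma_minus_1, beta_minus_1


def gdot_over_g(alpha0, beta0, dphi_dt):
    """Eq. (167): relative time variation of G_cav for a field velocity dphi_dt."""
    return 2.0 * alpha0 * (1.0 + beta0 / (1.0 + alpha0**2)) * dphi_dt


# alpha0^2 = 1e-5 saturates the solar-system bound quoted in the text;
# beta0 = 1 and dphi_dt = 1e-12 per year are illustrative assumptions.
alpha0 = 1e-5 ** 0.5
print(ppn_parameters(alpha0, 1.0))
print(gdot_over_g(alpha0, 1.0, 1e-12))
```

All three quantities vanish together when α0 → 0.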

This example shows that the variation of the constant and the deviation from general relativity quantified in terms of the PPN parameters are of the same magnitude, because they are all driven by the same new scalar field.

The example of scalar-tensor theories is also very illustrative to show how deviation from general relativity can be fairly large in the early universe while still being compatible with solar system constraints. It relies on the attraction mechanism toward general relativity [130, 131].

Consider the simplest model of a massless dilaton (V(φ*) = 0) with quadratic coupling \((\ln \,A = a = {1 \over 2}\beta \varphi _\ast^2)\). Note that the linear case corresponds to a Brans-Dicke theory with a fixed deviation from general relativity. It follows that α0 = βφ0* and β0 = β. As long as V = 0, the Klein-Gordon equation can be rewritten in terms of the variable p = ln a, where a here denotes the cosmological scale factor, as

$${2 \over {3 - \varphi _\ast^{\prime 2}}}\varphi _\ast^{\prime\prime} + (1 - w)\varphi _\ast^{\prime} = - \alpha (\varphi _{\ast})(1 - 3w).$$
(168)

As emphasized in [130], this is the equation of motion of a point particle with a velocity dependent inertial mass, \(m({\varphi _\ast}) = 2/(3 - \varphi _\ast^{\prime 2})\), evolving in a potential a(φ*)(1 − 3w), whose gradient α(φ*)(1 − 3w) appears on the right-hand side, and subject to a damping force, \(- (1 - w)\varphi _\ast^{\prime}\). During the cosmological evolution the field is driven toward the minimum of the coupling function. If β > 0, it drives φ* toward 0, that is α → 0, so that the scalar-tensor theory becomes closer and closer to general relativity. When β < 0, the theory is driven away from general relativity and is likely to be incompatible with local tests unless φ* was initially arbitrarily close to 0.

It follows that the deviation from general relativity remains constant during the radiation era (up to threshold effects in the early universe [108, 134] and quantum effects [85]) and the theory is then attracted toward general relativity during the matter era. Note that it implies that postulating a linear or inverse variation of G with cosmic time is actually not realistic in this class of models. Since the theory is fully defined, one can easily compute various cosmological observables (late time dynamics [348], CMB anisotropy [435], weak lensing [449], BBN [105, 106, 134]) in a consistent way and confront them with data.
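The attraction mechanism can be illustrated by integrating Equation (168) numerically for the quadratic coupling α = βφ*. This is a minimal sketch using a hand-written fourth-order Runge-Kutta step. The initial conditions, the values of β and the integration ranges are illustrative assumptions, and a constant equation of state w is assumed in each era.

```python
def evolve_phi(beta, w, phi0, dphi0=0.0, p_end=10.0, steps=10000):
    """Integrate Eq. (168) in p = ln a with alpha(phi) = beta * phi.

    Returns (phi, dphi/dp) at p = p_end.
    """
    def accel(phi, dphi):
        # phi'' isolated from Eq. (168).
        return 0.5 * (3.0 - dphi**2) * (
            -(1.0 - w) * dphi - beta * phi * (1.0 - 3.0 * w))

    h = p_end / steps
    phi, dphi = phi0, dphi0
    for _ in range(steps):
        k1p, k1v = dphi, accel(phi, dphi)
        k2p = dphi + 0.5 * h * k1v
        k2v = accel(phi + 0.5 * h * k1p, k2p)
        k3p = dphi + 0.5 * h * k2v
        k3v = accel(phi + 0.5 * h * k2p, k3p)
        k4p = dphi + h * k3v
        k4v = accel(phi + h * k3p, k4p)
        phi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        dphi += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return phi, dphi


# Matter era (w = 0), beta > 0: the field is attracted toward phi = 0.
print(evolve_phi(beta=1.0, w=0.0, phi0=0.1))
# Radiation era (w = 1/3): the source term vanishes and a field at rest stays put.
print(evolve_phi(beta=1.0, w=1.0 / 3.0, phi0=0.1))
# Matter era, beta < 0: the field is driven away from phi = 0.
print(evolve_phi(beta=-1.0, w=0.0, phi0=0.1, p_end=2.0))
```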

Making other constants dynamical

Given this example, it seems a priori simple to cook up a theory that will describe a varying fine-structure constant by coupling a scalar field to the electromagnetic Faraday tensor as

$$S = \int {\left[ {{R \over {16\pi G}} - 2{{({\partial _\mu}\phi)}^2} - {1 \over 4}B(\phi)F_{\mu \nu}^2} \right]} \sqrt {- g} {{\rm{d}}^{\rm{4}}}x$$
(169)

so that the fine-structure constant will evolve according to \(\alpha = B^{-1}\). However, such a simple implementation may have dramatic implications. In particular, the contribution of the electromagnetic binding energy to the mass of any nucleus can be estimated by the Bethe-Weizsäcker formula so that

$${m_{(A,Z)}}(\phi) \supset 98.25\alpha (\phi){{Z(Z - 1)} \over {{A^{1/3}}}}\;{\rm{MeV}}.$$

This implies that the sensitivity of the mass to a variation of the scalar field is expected to be of the order of

$${f_{(A,Z)}} = {\partial _\phi}{m_{(A,Z)}}(\phi) \sim {10^{- 2}}{{Z(Z - 1)} \over {{A^{4/3}}}}\alpha^{\prime}(\phi).$$
(170)

It follows that the violation of the universality of free fall is expected to be of the order of \({\eta _{12}} \sim {10^{- 9}}X({A_1},{Z_1};{A_2},{Z_2})({\partial _\phi}\ln B)_0^2\). Since the factor X(A1, Z1; A2, Z2) typically ranges as \({\mathcal O}(0.1 - 10)\), we deduce that (∂ϕ ln B)0 has to be very small for the solar system constraints to be satisfied. It follows that today the scalar field has to be very close to the minimum of the coupling function ln B. This led to the idea of the least coupling mechanism [135, 136] discussed in Section 5.4.1. This estimate is indeed very simplistic because this example only takes into account the effect of the electromagnetic binding energy (see Section 6.3).
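To get a feeling for the size of the electromagnetic term, the Bethe-Weizsäcker estimate quoted above can be evaluated for a given nucleus. This is a rough sketch: the function names are illustrative, the present-day value of α is an assumed input, and only the Coulomb term is included, so the numbers should not be read as full nuclear masses.

```python
ALPHA_EM = 1.0 / 137.036  # assumed present-day fine-structure constant


def coulomb_mass_term_mev(mass_number, charge, alpha=ALPHA_EM):
    """Electromagnetic contribution 98.25 * alpha * Z(Z-1) / A^(1/3), in MeV."""
    return 98.25 * alpha * charge * (charge - 1) / mass_number ** (1.0 / 3.0)


def coulomb_mass_shift_mev(mass_number, charge, dalpha_over_alpha):
    """Mass change for a fractional change of alpha (the term is linear in alpha)."""
    return coulomb_mass_term_mev(mass_number, charge) * dalpha_over_alpha


# Illustrative nuclei: beryllium-9 and iron-56.
for a, z in ((9, 4), (56, 26)):
    print(a, z, coulomb_mass_term_mev(a, z), coulomb_mass_shift_mev(a, z, 1e-6))
```

Because the term scales as Z(Z − 1)/A1/3, it differs from one nucleus to another, which is the origin of the composition dependence of free fall discussed above.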

Let us also note that such a simple coupling cannot be eliminated by a conformal rescaling \({g_{\mu \nu}} = {A^2}(\phi)g_{\mu \nu}^{\ast}\) since

$$\int {B(\phi){g^{\mu \rho}}{g^{\nu \sigma}}{F_{\mu \nu}}{F_{\rho \sigma}}\sqrt {- g} \,{{\rm{d}}^{\rm{4}}}x} \rightarrow \int {B(\phi){A^{D - 4}}(\phi)g_\ast ^{\mu \rho}g_\ast ^{\nu \sigma}{F_{\mu \nu}}{F_{\rho \sigma}}\sqrt {- {g_\ast}} \,{{\rm{d}}^{\rm{4}}}x}$$

so that the action is invariant in D = 4 dimensions.

This example shows that we cannot couple a field blindly to, e.g., the Faraday tensor to make the fine-structure constant dynamical and that some mechanism for reconciling this variation with local constraints, and in particular the universality of free fall, will be needed.

High-energy theories and varying constants

Kaluza-Klein

Such coupling terms naturally appear when compactifying a higher-dimensional theory. As an example, let us recall the compactification of a 5-dimensional Einstein-Hilbert action ([409], chapter 13)

$$S = {1 \over {12{\pi ^2}{G_5}}}\int {\bar R\sqrt {- \bar g} {{\rm{d}}^{\rm{5}}}x.}$$

Decomposing the 5-dimensional metric \({{\bar g}_{AB}}\) as

$${\bar g_{AB}} = \left( {\begin{array}{*{20}{c}} {{g_{\mu \nu }} + \frac{{{A_\mu }{A_\nu }}}{{{M^2}}}{\phi ^2}} & {\frac{{{A_\mu }}}{M}{\phi ^2}} \\ {\frac{{{A_\nu }}}{M}{\phi ^2}} & {{\phi ^2}} \end{array}} \right),$$

where M is a mass scale, we obtain

$$S = {1 \over {16\pi {G_\ast}}}\int {\left({R - {{{\phi ^2}} \over {4{M^2}}}{F^2}} \right)\phi \sqrt {- g} {{\rm{d}}^{\rm{4}}}x,}$$
(171)

where the 4-dimensional gravitational constant is G* = 3πG5/4 ∫ dy. The scalar field couples explicitly to the kinetic term of the vector field and cannot be eliminated by a redefinition of the metric: again, this is the well-known conformal invariance of electromagnetism in four dimensions. Such a term induces a variation of the fine-structure constant as well as a violation of the universality of free-fall. Such dependencies of the masses and couplings are generic for higher-dimensional theories and in particular string theory. It is actually one of the definitive predictions for string theory that there exists a dilaton, that couples directly to matter [484] and whose vacuum expectation value determines the string coupling constants [546].

In the models by Kaluza [269] and Klein [291] the 5-dimensional spacetime was compactified assuming one spatial extra-dimension S1 of radius RKK. It follows that any field χ(xμ, y) can be Fourier transformed along the compact dimension (with coordinate y), so that, from a 4-dimensional point of view, it gives rise to a tower of fields χ(n)(xμ) of mass \(m_n = n/R_{KK}\). At energies small compared to \(R_{KK}^{- 1}\) only the y-independent part of the field remains and the physics looks 4-dimensional.
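The energy scale below which the physics looks 4-dimensional can be made concrete by converting a compactification radius into Kaluza-Klein masses. The sketch takes the tower to be m_n = n/R_KK in natural units, consistent with the scale R_KK−1 mentioned above. The function name and the sample radius are illustrative assumptions.

```python
HBAR_C_EV_M = 1.973269804e-7  # hbar * c in eV * m


def kk_mass_ev(n, radius_m):
    """Mass of the n-th Kaluza-Klein mode, m_n = n * hbar * c / R, in eV."""
    return n * HBAR_C_EV_M / radius_m


# Illustrative radius of 1e-18 m: the first modes lie at a few hundred GeV.
for n in range(4):
    print(n, kk_mass_ev(n, 1e-18) / 1e9, "GeV")
```

The n = 0 mode is massless, and it is the only one that survives at energies well below ħc/RKK.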

Assuming that the action (171) corresponds to the Jordan frame action, as the coupling ϕR may suggest, it follows that the gravitational constant and the Yang-Mills coupling associated with the vector field Aμ must scale as

$$G \propto {\phi ^{- 1}},\quad g_{YM}^{- 2} \propto {\phi ^2}/G \propto {\phi ^3}.$$
(172)

Note that the scaling of G with ϕ (or time) is not the one of the gravitational constant that would be measured in a Cavendish experiment since Eq. (165) tells us that \({G_{{\rm{cav}}}} \propto {G_\ast}{\phi ^{- 1}}\left({1 + {1 \over {2\phi + 3}}} \right)\).

This can be generalized to the case of D extra-dimensions [114] to

$$G \propto {\phi ^{- D}},\quad {\alpha _i}({m_{{\rm{KK}}}}) = {K_i}(D)G{\phi ^{- 2}}$$
(173)

where the constants K i depend only on the dimension and topology of the compact space [525] so that the only fundamental constant of the theory is the mass scale M4+D entering the 4 + D-dimensional theory. A theory on \({{\mathcal M}_4} \times {{\mathcal M}_D}\) where \({{\mathcal M}_D}\) is a D-dimensional compact space generates a low-energy quantum field theory of the Yang-Mills type related to the isometries of \({{\mathcal M}_D}\) [for instance [545] showed that for D = 7, it can accommodate the Yang-Mills group SU(3) × SU(2) × U(1)]. The two main problems of these theories are that one cannot construct chiral fermions in four dimensions by compactification on a smooth manifold with such a procedure and that gauge theories in five dimensions or more are not renormalizable.

In such a framework the variation of the gauge couplings and of the gravitational constant arises from the variation of the size of the extra dimensions so that one can derive stronger constraints than by assuming independent variations, but at the expense of being more model-dependent. Let us mention the works by Marciano [345] and Wu and Wang [550] in which the structure constants at lower energy are obtained by the renormalization group, and the work by Veneziano [515] for a toy model in D ≥ 4 dimensions, endowed with an invariant UV cut-off Λ, and containing a large number of non-self-interacting matter species.

Ref. [295] used the variation (173) to constrain the time variation of the radius of the extra dimensions during primordial nucleosynthesis to conclude that ∣ΔRKK/RKK∣ < 1%. Ref. [28] took into account the effects of the variation of \({\alpha _{\rm{S}}} \propto R_{KK}^{- 2}\) and deduced from the helium-4 abundance that ∣ΔRKK/RKK∣ < 0.7% and ∣ΔRKK/RKK∣ < 1.1% respectively for D = 2 and D = 7 Kaluza-Klein theory and that ∣ΔRKK/RKK∣ < 3.4 × 10−10 from the Oklo data. An analysis of most cosmological data (BBN, CMB, quasars, etc.) assuming that the extra dimension scales as \(R_0(1 + \Delta t^{-3/4})\) and \(R_0[1 + \Delta](1 - \cos \omega (t - t_0))\) concluded that Δ has to be smaller than 10−16 and 10−8 respectively [311], while [330] assumes that gauge fields and matter fields can propagate in the bulk, that is in the extra dimensions. Ref. [336] evaluated the effect of such a coupled variation of G and the structure constants on distant supernova data, concluding that a variation similar to the one reported in [524] would make the distant supernovae brighter.

String theory

There exist five anomaly-free, supersymmetric perturbative string theories respectively known as type I, type IIA, type IIB, SO(32) heterotic and E8 × E8 heterotic theories (see, e.g., [420]). One of the definitive predictions of these theories is the existence of a scalar field, the dilaton, that couples directly to matter [484] and whose vacuum expectation value determines the string coupling constant [546]. There are two other excitations that are common to all perturbative string theories, a rank two symmetric tensor (the graviton) g μν and a rank two antisymmetric tensor B μν . The field content then differs from one theory to another. It follows that the 4-dimensional couplings are determined in terms of a string scale and various dynamical fields (dilaton, volume of compact space, …). When the dilaton is massless, we expect three effects: (i) an admixture of a scalar component inducing deviations from general relativity in gravitational effects, (ii) a variation of the couplings and (iii) a violation of the weak equivalence principle. Our purpose is to show how the 4-dimensional couplings are related to the string mass scale, to the dilaton and the structure of the extra dimensions, mainly on the example of heterotic theories.

To be more specific, let us consider an example. The two heterotic theories originate from the fact that left- and right-moving modes of a closed string are independent. This reduces the number of supersymmetries to N = 1 and the quantization of the left-moving modes imposes that the gauge group is either SO(32) or E8 × E8 depending on the fermionic boundary conditions. The effective tree-level action is (see, e.g., Ref. [237])

$${S_H} = \int {{{\rm{d}}^{10}}{\bf{x}}\sqrt {- {g_{10}}} {{\rm{e}}^{- 2\Phi}}\left[ {M_H^8\{{R_{10}} + 4\square\Phi - 4{{(\nabla \Phi)}^2}\} - {{M_H^6} \over 4}{F_{AB}}{F^{AB}} + \ldots} \right].}$$
(174)

When compactified on a 6-dimensional Calabi-Yau space, the effective 4-dimensional action takes the form

$${S_H} = \int {{\rm{d}}^4}{\bf{x}}\sqrt {- {g_4}} \phi \left[ {M_H^8\left\{{{R_4} + {{\left({{{\nabla \phi} \over \phi}} \right)}^2} - {1 \over 6}{{\left({{{\nabla {V_6}} \over {{V_6}}}} \right)}^2}} \right\} - {{M_H^6} \over 4}{F^2}} \right] + \ldots$$
(175)

where \(\phi \equiv {V_6}{{\rm{e}}^{- 2\Phi}}\) couples identically to the Einstein and Yang-Mills terms. It follows that

$$M_4^2 = M_H^8\phi, \quad g_{{\rm{YM}}}^{- 2} = M_H^6\phi$$
(176)

at tree-level. Note that to reach this conclusion, one has to assume that the matter fields (in the ‘dots’ of Equation (175)) are minimally coupled to g4 (see, e.g., [340]).

The strongly coupled SO(32) heterotic string theory is equivalent to the weakly coupled type I string theory. Type I superstring admits open strings, the boundary conditions of which divide the number of supersymmetries by two. It follows that the tree-level effective bosonic action is N = 1, D = 10 supergravity, which takes the form, in the string frame,

$${S_I} = \int {{{\rm{d}}^{10}}} {\bf{x}}\sqrt {- {g_{10}}} M_I^6{{\rm{e}}^{-\Phi}}\left[ {{{\rm{e}}^{- \Phi}}M_I^2{R_{10}} - {{{F^2}} \over 4} + \ldots} \right]$$
(177)

where the dots contain terms describing the dynamics of the dilaton, fermions and other form fields. At variance with (174), the field Φ couples differently to the gravitational and Yang-Mills terms because the graviton and Yang-Mills fields are respectively excitations of closed and open strings. It follows that M I can be lowered even to the weak scale by simply having expΦ small enough. Type I theories require D9-branes for consistency. When V6 is small, one can use T-duality (to render V6 large, which allows one to use a quantum field theory approach) and turn the D9-brane into a D3-brane so that

$${S_I} = \int {{{\rm{d}}^{10}}} {\bf{x}}\sqrt {- {g_{10}}} {{\rm{e}}^{- 2\Phi}}M_I^8{R_{10}} - \int {{{\rm{d}}^4}} {\bf{x}}\sqrt {- {g_4}} {{\rm{e}}^{- \Phi}}{1 \over 4}{F^2} + \ldots$$
(178)

where the second term describes the Yang-Mills fields localized on the D3-brane. It follows that

$$M_4^2 = {{\rm{e}}^{- 2\Phi}}{V_6}M_I^8,\quad g_{{\rm{YM}}}^{- 2} = {{\rm{e}}^{- \Phi}}$$
(179)

at tree-level. If one compactifies the D9-brane on a 6-dimensional orbifold instead of a 6-torus, and if the brane is localized at an orbifold fixed point, then gauge fields couple to fields M i living only at these orbifold fixed points with a (calculable) tree-level coupling c i so that

$$M_4^2 = {{\rm{e}}^{- 2\Phi}}{V_6}M_I^8,\quad g_{{\rm{YM}}}^{- 2} = {{\rm{e}}^{- \Phi}} + {c_i}{M_i}.$$
(180)

The coupling c i to the fields M i is a priori non-universal. At strong coupling, the 10-dimensional E8 × E8 heterotic theory becomes M-theory on R10 × S1/Z2 [255]. The gravitational field propagates in the 11-dimensional space while the gauge fields are localized on two 10-dimensional branes.

At one-loop, one can derive the couplings by including Kaluza-Klein excitations to get [163]

$$g_{{\rm{YM}}}^{- 2} = M_H^6\phi - {{{b_a}} \over 2}{(R{M_H})^2} + \ldots$$
(181)

when the volume is large compared to the mass scale and in that case the coupling is no longer universal. Otherwise, one would get a more complicated function. Obviously, the 4-dimensional effective gravitational and Yang-Mills couplings depend on the considered superstring theory and on the compactification scheme, but in any case they depend on the dilaton.

As an example, [340] considered the (N = 1, D = 10)-supergravity model derived from the heterotic superstring theory in the low energy limit and assumed that the 10-dimensional spacetime is compactified on a 6-torus of radius R(xμ) so that the effective 4-dimensional theory described by (175) is of the Brans-Dicke type with ω = −1. Assuming that ϕ has a mass μ, and couples to the matter fluid in the universe as \({S_{{\rm{matter}}}} = \int {{{\rm{d}}^{10}}{\rm{x}}\sqrt {- {g_{10}}} \exp (- 2\Phi){{\mathcal L}_{{\rm{matter}}}}({g_{10}})}\), the reduced 4-dimensional matter action is

$${S_{{\rm{matter}}}} = \int {{{\rm{d}}^4}{\bf{x}}\sqrt {- g} \phi {{\mathcal L}_{{\rm{matter}}}}(g).}$$
(182)

The cosmological evolution of ϕ and R can then be computed to deduce that α̇EM/αEM ≃ 10−10 (μ/1 eV)−2 yr−1. Another work considered the same model but assumed that supersymmetry is broken by non-perturbative effects such as gaugino condensation. In this model, and contrary to [340], ϕ is stabilized and the variation of the constants arises mainly from the variation of R in a runaway potential.

Ref. [290] considers a D3-brane probe in the context of the AdS/CFT correspondence at finite temperature and provides the predictions for the running electric and magnetic effective couplings, beyond perturbation theory. This allows one to construct a varying speed of light model.

To conclude, superstring theories offer a natural theoretical framework to discuss the value of the fundamental constants since they become expectation values of some fields. This is a first step towards their understanding but yet, no complete and satisfactory mechanism for the stabilization of the extra dimensions and dilaton is known.

It has paved the way for various models that we detail in Section 5.4.

Relations between constants

There are different possibilities to relate the variations of different constants. First, in quantum field theory, we have to take into account the running of coupling constants with energy and the possibilities of grand unification to relate them. It will also give a link between the QCD scale, the coupling constants and the mass of the fundamental particles (i.e., the Yukawa couplings and the Higgs vev). Second, one can compute the binding energies and the masses of the proton, neutron and different nuclei in terms of the gauge couplings and the quark masses. This step involves QCD and nuclear physics. Third, one can express the gyromagnetic factors in terms of the quark masses. This is particularly important to interpret the constraints from the atomic clocks and the QSO spectra. This allows one to set stronger constraints on the varying parameters at the expense of a model-dependence.

Implication of gauge coupling unification

The first theoretical implication of high-energy physics arises from the unification of the non-gravitational interactions. In these unification schemes, the three standard model coupling constants derive from one unified coupling constant.

In quantum field theory, the calculation of scattering processes includes higher order corrections to the coupling constants related to loop corrections that introduce some integrals over internal 4-momenta. Depending on the theory, these integrals may be either finite or diverging as the logarithm or power law of a UV cut-off. In a class of theories, called renormalizable, among which is the standard model of particle physics, the physical quantities calculated at any order do not depend on the choice of the cut-off scale. But the result may depend on ln E/m where E is the typical energy scale of the process. It follows that the values of the coupling constants of the standard model depend on the energy at which they are measured (or of the process in which they are involved). This running arises from the screening due to the existence of virtual particles, which are polarized by the presence of a charge. The renormalization group allows one to compute the dependence of a coupling constant as a function of the energy E as

$${{{\rm{d}}{g_i}(E)} \over {{\rm{d}}\ln E}} = {\beta _i}(E),$$

where the beta functions, β i , depend on the gauge group and on the matter content of the theory and may be expanded in powers of g i . For the SU(2) and U(1) gauge couplings of the standard model, they are given by

$${\beta _2}({g_2}) = - {{g_2^3} \over {4{\pi ^2}}}\left({{{11} \over 6} - {{{n_g}} \over 3}} \right),\quad {\beta _1}({g_1}) = + {{g_1^3} \over {4{\pi ^2}}}{{5{n_g}} \over 9}$$

where n g is the number of generations for the fermions. We recall that the fine-structure constant is defined in the limit of zero momentum transfer so that cosmological variations of αEM are independent of the issue of the renormalization group dependence. For the SU(3) sector, with fundamental Dirac fermion representations,

$${\beta _3}({g_3}) = - {{g_3^3} \over {4{\pi ^2}}}\left({{{11} \over 4} - {{{n_f}} \over 6}} \right),$$

n f being the number of quark flavors with mass smaller than E. The negative sign implies that (1) at large momentum transfer the coupling decreases and loop corrections become less and less significant: QCD is said to be asymptotically free; (2) integrating the renormalization group equation for α3 gives

$${\alpha _3}(E) = {{6\pi} \over {(33 - 2{n_f})\ln (E/{\Lambda _c})}}$$

so that it diverges as the energy scale approaches Λ c from above; we call this scale ΛQCD. It characterizes all QCD properties and in particular the masses of the hadrons are expected to be proportional to ΛQCD up to corrections of order mq/ΛQCD.
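The closed-form running of α3 can be cross-checked against a direct numerical integration of the renormalization group equation with the β3 given above. Integrating dg3/dlnE = β3 with α3 = g32/4π yields the coefficient 33 − 2nf in the denominator, which is what the sketch below uses. The function names, the value of Λc and the energy range are illustrative assumptions.

```python
import math


def beta3(g3, n_f):
    """One-loop SU(3) beta function: -(g^3 / 4 pi^2) (11/4 - n_f/6)."""
    return -(g3**3) / (4 * math.pi**2) * (11 / 4 - n_f / 6)


def alpha3_closed_form(energy, lam, n_f):
    """One-loop solution of the RGE that diverges at energy = lam."""
    return 6 * math.pi / ((33 - 2 * n_f) * math.log(energy / lam))


def alpha3_numeric(e_start, e_end, lam, n_f, steps=20000):
    """RK4 integration of dg/dlnE = beta3(g) from e_start to e_end."""
    g = math.sqrt(4 * math.pi * alpha3_closed_form(e_start, lam, n_f))
    h = math.log(e_end / e_start) / steps
    for _ in range(steps):
        k1 = beta3(g, n_f)
        k2 = beta3(g + 0.5 * h * k1, n_f)
        k3 = beta3(g + 0.5 * h * k2, n_f)
        k4 = beta3(g + h * k3, n_f)
        g += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return g**2 / (4 * math.pi)


# Illustrative values: Lambda_c = 0.2 GeV, five flavors, running from 10 GeV to 1 TeV.
print(alpha3_closed_form(1000.0, 0.2, 5), alpha3_numeric(10.0, 1000.0, 0.2, 5))
```

The coupling decreases with energy, as expected from asymptotic freedom, and grows without bound as E approaches Λc from above.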

It was noticed quite early that these relations imply that the weaker gauge coupling becomes stronger at high energy, while the strong coupling becomes weaker, so that one can think that the three non-gravitational interactions may have a single common coupling strength above a given energy. This is the driving idea of Grand Unified Theories (GUT) in which one introduces a mechanism of symmetry-breaking from a higher symmetry group, such as, e.g., SO(10) or SU(5), at high energies. It has two important consequences for our present considerations. First there may exist algebraic relations between the Yukawa couplings of the standard model. Second, the structure constants of the standard model unify at an energy scale M U

$${\alpha _1}({M_U}) = {\alpha _2}({M_U}) = {\alpha _3}({M_U}) \equiv {\alpha _U}({M_U}).$$
(183)

We note that the electroweak mixing angle, i.e., the parameter \(\sin^2\theta\), can also be a time-dependent parameter, but only for E ≠ M U since at E = M U , it is fixed by the symmetry to have the value sin2 θ = 3/8, from which we deduce that

$$\alpha _{{\rm{EM}}}^{-1}({M_Z}) = {5 \over 3}\alpha _{\rm{1}}^{- 1}({M_Z}) + \alpha _{{2}}^{- 1}({M_Z}).$$
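The link between this relation and the value 3/8 of the mixing angle can be verified with exact rational arithmetic. The sketch assumes the standard tree-level identification sin2 θ = αEM/α2, which is not spelled out in the text. The function names and the common value 25 used for the unified inverse coupling are illustrative.

```python
from fractions import Fraction


def inv_alpha_em(inv_alpha1, inv_alpha2):
    """alpha_EM^-1 = (5/3) alpha_1^-1 + alpha_2^-1."""
    return Fraction(5, 3) * inv_alpha1 + inv_alpha2


def sin2_theta(inv_alpha1, inv_alpha2):
    """sin^2(theta) = alpha_EM / alpha_2 (assumed tree-level definition)."""
    return inv_alpha2 / inv_alpha_em(inv_alpha1, inv_alpha2)


# At unification alpha_1 = alpha_2, whatever the common value (25 is illustrative).
print(sin2_theta(Fraction(25), Fraction(25)))
```

Any common value of the two inverse couplings gives the same ratio, so the result depends only on the factor 5/3 fixed by the unified group.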

It follows from the renormalization group relations that

$$\alpha _i^{- 1}(E) = \alpha _i^{- 1}({M_U}) - {{{b_i}} \over {2\pi}}\ln {E \over {{M_U}}},$$
(184)

where the beta-function coefficients are given by b i = (41/10, −19/6, −7) for the standard model (or below the SUSY scale ΛSUSY) and by b i = (33/5, 1, −3) for N = 1 supersymmetric theory, the negative sign of b3 reflecting asymptotic freedom in this convention. Given a field decoupling at mth, one has

$$\alpha _i^{- 1}({E_ -}) = \alpha _i^{- 1}({E_ +}) - {{b_i^{(-)}} \over {2\pi}}\ln {{{E_ -}} \over {{E_ +}}} - {{b_i^{(th)}} \over {2\pi}}\ln {{{m_{{\rm{th}}}}} \over {{E_ +}}}$$

where \(b_i^{({\rm{th}})} = b_i^{(+)} - b_i^{(-)}\) with \(b_i^{(\pm)}\) the beta-function coefficients respectively above and below the mass threshold, with tree-level matching at mth. In the case of multiple thresholds, one must sum the different contributions. The existence of these thresholds implies that the running of α3 is complicated since it depends on the masses of the heavy quarks and, in the case of supersymmetry, of the colored superpartners. For non-supersymmetric theories, the low-energy expression of the QCD scale is

$${\Lambda _{{\rm{QCD}}}} = E{\left({{{{m_{\rm{c}}}{m_{\rm{b}}}{m_{\rm{t}}}} \over {{E^3}}}} \right)^{2/27}}\exp \left({- {{2\pi} \over {9{\alpha _3}(E)}}} \right)$$
(185)

for E > mt. This implies that the variation of Yukawa couplings, gauge couplings, Higgs vev and ΛQCD/MP are correlated. A second set of relations arises in models in which the weak scale is determined by dimensional transmutation [184, 185]. In such cases, the Higgs vev is related to the Yukawa constant of the top quark by [77]

$$v = {M_{\rm{P}}}\exp \left({- {{8{\pi ^2}c} \over {h_{\rm{t}}^2}}} \right),$$
(186)

where c is a constant of order unity. This would imply that δ ln v = S δ ln h with S ∼ 160 [104].
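The size of S can be checked by differentiating Eq. (186) with respect to ln h. The sketch below is illustrative only: it assumes c = 1 and a top Yukawa coupling of 1 (the text only states that c is of order unity), works in units where the Planck mass is 1, and all function names are hypothetical.

```python
import math

def higgs_vev(h_t, c=1.0, M_P=1.0):
    """Eq. (186): v = M_P * exp(-8 pi^2 c / h_t^2)."""
    return M_P * math.exp(-8.0 * math.pi**2 * c / h_t**2)

def S_analytic(h_t, c=1.0):
    """d ln v / d ln h_t obtained by differentiating Eq. (186) by hand."""
    return 16.0 * math.pi**2 * c / h_t**2

def S_numeric(h_t, c=1.0, eps=1e-5):
    """Central finite difference of ln v with respect to ln h_t."""
    up = math.log(higgs_vev(h_t * math.exp(eps), c))
    down = math.log(higgs_vev(h_t * math.exp(-eps), c))
    return (up - down) / (2.0 * eps)

# Assumed illustrative inputs: c = 1, h_t = 1.
S_an = S_analytic(1.0)
S_num = S_numeric(1.0)
print(f"S (analytic) = {S_an:.2f}, S (finite difference) = {S_num:.2f}")
```

With these inputs S = 16π² ≈ 158, in line with the quoted S ∼ 160. The strong sensitivity is a direct consequence of the exponential in Eq. (186).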

The first consequences of this unification were investigated in Refs. [77, 74, 75, 135, 136, 185, 313] where the variation of the 3 coupling constants was reduced to the one of α U and M U /M P . It was concluded that, setting

$$R \equiv \delta \ln {\Lambda _{{\rm{QCD}}}}/\delta \ln {\alpha _{{\rm{EM}}}},$$
(187)

R ∼ 34 with a stated accuracy of about 20% [312, 313] (assuming only α U can vary), R ∼ 40.82 in the string dilaton model assuming Grand Unification [135, 136] (see Section 5.4.1), R = 38 ± 6 [74] and then R = 46 [75, 76], the difference arising from the quark masses and their associated thresholds. However, these results implicitly assume that the electroweak symmetry breaking and supersymmetry breaking mechanisms, as well as the fermion mass generation, are not affected by the variation of the unified coupling. It was also mentioned in [75] that R can reach −235 in unification based on SU(5) and SO(10). The large value of R arises from the exponential dependence of ΛQCD on α3. In the limit in which the quark masses are set to zero, the proton mass, as well as all other hadronic masses, is proportional to ΛQCD, i.e., \({m_{\rm{p}}} \propto {\Lambda _{{\rm{QCD}}}}(1 + {\mathcal O}({m_{\rm{q}}}/{\Lambda _{{\rm{QCD}}}}))\). [313] further related the Higgs vev to αEM by d ln v/d ln αEM ≡ κ and estimated that κ ∼ 70 so that, assuming that the variation of the Yukawa couplings is negligible, it could be concluded that

$$\delta \ln {m \over {{\Lambda _{{\rm{QCD}}}}}} \sim 35\delta \ln {\alpha _{{\rm{EM}}}},$$

for the quark and electron masses. This would also imply that the variations of μ and αEM are correlated, still in a very model-dependent way. Typically, one can conclude [104] that

$${{\delta \mu} \over \mu} = - 0.8R{{\delta {\alpha _{{\rm{EM}}}}} \over {{\alpha _{{\rm{EM}}}}}} + 0.6(S + 1){{\delta h} \over h},$$

with S ∼ 160. The running of α U can be extrapolated to the Planck mass, MP. Assuming α U (MP) fixed and letting M U /MP vary, it was concluded [153] that R = 2π(b U + 3)/[9αEM(8b U /3 − 12)], where b U is the beta-function coefficient describing the running of α U . This shows that a variation of αEM and μ can open a window on GUTs. A similar analysis [142], assuming that electroweak symmetry breaking was triggered by nonperturbative effects in such a way that v and α U are related, concludes that δμ/μ = (13 ± 7)δαEM/αEM in a theory with soft SUSY breaking and δμ/μ = (−4 ± 5) δαEM/αEM otherwise.
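The exponential dependence of ΛQCD on α3 that drives the large values of R can be made explicit with a short numerical sketch of Eq. (185), written here with the dimensionless ratio m_c m_b m_t/E³. Everything numerical below is an assumption chosen for illustration: the quark masses, the six-flavour scale `LAMBDA6`, and the value 1/128 for αEM near M_Z. The estimate of R is a rough one-loop argument (only α U varies, M U fixed, mass thresholds held constant), not the calculation of the cited references.

```python
import math

# Illustrative quark masses in GeV (assumed inputs, not taken from the text).
M_C, M_B, M_T = 1.27, 4.18, 173.0

def lambda_qcd(E, alpha3):
    """Eq. (185) for E > m_t, with the dimensionless ratio m_c m_b m_t / E^3."""
    ratio = M_C * M_B * M_T / E**3
    return E * ratio ** (2.0 / 27.0) * math.exp(-2.0 * math.pi / (9.0 * alpha3))

def alpha3_above_top(E, lambda6):
    """One-loop six-flavour running: 1/alpha3 = (7 / 2 pi) ln(E / lambda6)."""
    return 2.0 * math.pi / (7.0 * math.log(E / lambda6))

def sensitivity(alpha3):
    """d ln Lambda_QCD / d ln alpha3 at fixed E and fixed quark masses."""
    return 2.0 * math.pi / (9.0 * alpha3)

def R_estimate(alpha_em):
    """If every 1/alpha_i shifts by the same amount and
    1/alpha_EM = (5/3)/alpha_1 + 1/alpha_2, then
    d ln alpha3 / d ln alpha_EM = 3 alpha3 / (8 alpha_EM),
    so that R = pi / (12 alpha_EM)."""
    return math.pi / (12.0 * alpha_em)

LAMBDA6 = 0.09  # GeV, assumed six-flavour scale

# Eq. (185) should not depend on the scale E at which it is evaluated.
lam_200 = lambda_qcd(200.0, alpha3_above_top(200.0, LAMBDA6))
lam_1000 = lambda_qcd(1000.0, alpha3_above_top(1000.0, LAMBDA6))

a3 = alpha3_above_top(200.0, LAMBDA6)
R = R_estimate(1.0 / 128.0)
print(f"Lambda_QCD from E = 200 GeV: {lam_200:.4f} GeV, from E = 1 TeV: {lam_1000:.4f} GeV")
print(f"d ln Lambda / d ln alpha3 = {sensitivity(a3):.2f}, rough R = {R:.1f}")
```

Under these assumptions the estimate gives R ≈ 33.5, in the same range as the values quoted above. The differences between the published values reflect the treatment of thresholds and of the other varying parameters, which this sketch ignores.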

From a phenomenological point of view, [147], making an assumption of proportionality with fixed “unification coefficients”, assumes that the variations of the constants at a given redshift z depend on a unique evolution factor ℓ(z) and that the variation of all the constants can be derived from those of the unification mass scale (in Planck units), M U , the unified gauge coupling α U , the Higgs vev, v, and, in the case of supersymmetric theories, the soft supersymmetry breaking mass, \({\tilde m}\). Introducing the coefficients d i by

$$\Delta \ln {{{M_U}} \over {{M_{\rm{P}}}}} = {d_M}\ell, \quad \Delta \ln {\alpha _U} = {d_U}\ell, \quad \Delta \ln {v \over {{M_U}}} = {d_H}\ell, \quad \Delta \ln {{\tilde m} \over {{M_{\rm{P}}}}} = {d_S}\ell,$$

(d S = 0 for non-supersymmetric theories) and assuming that the masses of the standard model fermions all vary with v so that the Yukawa couplings are assumed constant, it was shown that the variations of all constants can be related to (d M , d U , d H , d S ) and ℓ(z), using the renormalization group equations (neglecting the effects induced by the variation of α U on the RG running of fermion masses). This decomposition is a good approximation provided that the time variation is slow, which is actually backed up by the existing constraints, and homogeneous in space (so that it may not be applied as such in case a chameleon mechanism is at work [69]).

This allowed six classes of scenarios to be defined: (1) varying gravitational constant (d H = d S = d U = 0) in which only M U /M P or equivalently \(G\Lambda _{{\rm{QCD}}}^2\) is varying; (2) varying unified coupling (d U = 1, d H = d S = d M = 0); (3) varying Fermi scale defined by (d H = 1, d U = d S = d M = 0) in which one has d ln μ/d ln αEM = −325; (4) varying Fermi scale and SUSY-breaking scale (d S = d H = 1, d U = d M = 0) and for which d ln μ/d ln αEM = −21.5; (5) varying unified coupling and Fermi scale \(({d_U} = 1,{d_H} = \tilde \gamma {d_U},{d_S} = {d_M} = 0)\) and for which \({\rm{d}}\ln \mu/{\rm{d}}\ln {\alpha _{{\rm{EM}}}} = (23.2 - 0.65\tilde \gamma)/(0.865 + 0.02\tilde \gamma)\); and (6) varying unified coupling and Fermi scale with SUSY \(({d_U} = 1,{d_S} \simeq {d_H} = \tilde \gamma {d_U},{d_M} = 0)\) and for which \({\rm{d}}\ln \mu/{\rm{d}}\ln {\alpha _{{\rm{EM}}}} = (14 - 0.28\tilde \gamma)/(0.52 + 0.013\tilde \gamma)\).

Each scenario can be compared to the existing constraints to get sharper bounds on them [146, 147, 149, 364]. This emphasizes that the correlated variation between different constants (here μ and αEM) depends strongly on the theoretical hypotheses that are made.
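To see how strongly the correlation between μ and αEM depends on the assumed scenario, the two γ̃-dependent expressions quoted above for classes (5) and (6) can be tabulated. This is a minimal sketch; the function names and the sample values of γ̃ are arbitrary choices made for illustration.

```python
def dlnmu_dlnalpha_case5(gamma):
    """Class (5): varying unified coupling and Fermi scale, no SUSY."""
    return (23.2 - 0.65 * gamma) / (0.865 + 0.02 * gamma)

def dlnmu_dlnalpha_case6(gamma):
    """Class (6): varying unified coupling and Fermi scale, with SUSY."""
    return (14.0 - 0.28 * gamma) / (0.52 + 0.013 * gamma)

# Arbitrary sample values of the proportionality factor gamma-tilde.
table = {
    gamma: (dlnmu_dlnalpha_case5(gamma), dlnmu_dlnalpha_case6(gamma))
    for gamma in (0.0, 20.0, 40.0, 60.0)
}

for gamma, (case5, case6) in table.items():
    print(f"gamma = {gamma:5.1f}: class (5) -> {case5:7.2f}, class (6) -> {case6:7.2f}")
```

The ratio changes sign at γ̃ = 23.2/0.65 in class (5) and at γ̃ = 14/0.28 in class (6), so even the sign of the correlation is not fixed without specifying the scenario.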

Masses and binding energies

The previous Section 5.3.1 described the unification of the gauge couplings. When we consider “composite” systems such as the proton, the neutron, nuclei or even planets and stars, we need to compute their mass, which requires determining their binding energy. As we have already seen, the electromagnetic binding energy induces a direct dependence on αEM and can be evaluated using, e.g., the Bethe-Weizsäcker formula (61). The dependence of the masses on the quark masses, via nuclear interactions, and the determination of the nuclear binding energy are especially difficult to estimate.

In the chiral limit of QCD in which all quark masses are negligible compared to ΛQCD all dimensionful quantities scale as some power of ΛQCD. For instance, concerning the nucleon mass, mN = cΛQCD with c ∼ 3.9 being computed from lattice QCD. This predicts a mass of order 860 MeV, smaller than the observed value of 940 MeV. The nucleon mass can be computed in chiral perturbation theory and expressed in terms of the pion mass as [316] \({m_{\rm{N}}} = {a_0} + {a_2}m_\pi ^2 + {a_4}m_\pi ^4 + {a_6}m_\pi ^6 + {\sigma _{N\pi}} + {\sigma _{\Delta \pi}} + {\sigma _{{\rm{tad}}}}\) (where all coefficients of this expansion are defined in [316]), which can be used to show [204] that the nucleon mass is scaling as

$${m_{\rm{N}}} \propto {\Lambda _{{\rm{QCD}}}}X_{\rm{q}}^{0.037}X_{\rm{s}}^{0.011}.$$
(188)

(Note, however, that such a notation is dangerous since it would imply that mN vanishes in the chiral limit, but it is a compact way to give δmN/δXq etc.). It was further extended [208] by using a sigma model to infer that \({m_{\rm{N}}} \propto {\Lambda _{{\rm{QCD}}}}X_{\rm{q}}^{0.045}X_{\rm{s}}^{0.19}\). These two examples explicitly show the strong dependence on nuclear modeling.

To go further and determine the sensitivity of the mass of a nucleus to the various constants,
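The model dependence can be quantified by propagating the same small variation through both scalings. In the sketch below the exponents are the ones quoted in the text; the dictionary keys, the function name and the input variation of 10⁻⁵ in the strange-quark parameter are illustrative assumptions.

```python
# Exponents (X_q, X_s) of the nucleon mass scaling m_N ~ Lambda_QCD * X_q^a * X_s^b.
MODELS = {
    "chiral_perturbation_theory": (0.037, 0.011),  # Eq. (188)
    "sigma_model": (0.045, 0.19),                  # extension quoted from [208]
}

def dln_mN(dln_lambda, dln_Xq, dln_Xs, model):
    """Relative variation of m_N for small relative variations of its arguments."""
    a_q, a_s = MODELS[model]
    return dln_lambda + a_q * dln_Xq + a_s * dln_Xs

# Assumed input: only X_s varies, by one part in 10^5.
shift_chiral = dln_mN(0.0, 0.0, 1e-5, "chiral_perturbation_theory")
shift_sigma = dln_mN(0.0, 0.0, 1e-5, "sigma_model")
print(f"chiral: {shift_chiral:.2e}, sigma model: {shift_sigma:.2e}, "
      f"ratio: {shift_sigma / shift_chiral:.1f}")
```

For a variation confined to the strange-quark sector the two models differ by more than an order of magnitude, whereas a variation of ΛQCD alone gives the same answer in both.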

$$m(A,Z) = Z{m_{\rm{p}}} + (A - Z){m_{\rm{n}}} + Z{m_{\rm{e}}} + {E_{\rm{S}}} + {E_{{\rm{EM}}}}$$

one should determine the strong binding energy [see related discussion below Eq. (17)] as a function of the atomic number Z and the mass number A.

The case of the deuterium binding energy B D has been discussed in different ways (see Section 3.8.3). Many models have been created. A first route relies on the use of the dependence of B D on the pion mass [188, 38, 426, 553], which can then be related to mu, md and ΛQCD. A second avenue is to use a sigma model in the framework of the Walecka model [456] in which the potential for the nuclear forces keeps only the σ, ρ and ω meson exchanges [208]. We also emphasize that the deuterium is only produced during BBN, as it is too weakly bound to survive in the regions of stars where nuclear processes take place. The fact that we do observe deuterium today sets a non-trivial constraint on the constants by imposing that the deuterium remains stable from BBN time to today. Since it is weakly bound, it is also more sensitive to a variation of the nuclear force compared to the electromagnetic force. This was used in [145] to constrain the variation of the nuclear strength in a sigma-model.

For larger nuclei, the situation is more complicated since there is no simple modeling. For large mass number A, the strong binding energy can be approximated by the liquid drop model

$${{{E_{\rm{S}}}} \over A} = {a_V} - {{{a_S}} \over {{A^{1/3}}}} - {a_A}{{{{(A - 2Z)}^2}} \over {{A^2}}} + {a_P}{{{{(- 1)}^A} + {{(- 1)}^Z}} \over {{A^{3/2}}}}$$
(189)

with (a V , a S , a A , a P ) = (15.7, 17.8, 23.7, 11.2) MeV [439]. It has also been suggested [129] that the nuclear binding energy can be expressed as

$${E_{\rm{S}}} \simeq A{a_3} + {A^{2/3}}{b_3}\quad {\rm{with}}\quad {a_3} = a_3^{{\rm{chiral}}\;{\rm{limit}}} + m_\pi ^2{{\partial {a_3}} \over {\partial m_\pi ^2}}.$$
(190)

In the chiral limit, a3 has a non-vanishing limit to which we need to add a contribution scaling like \(m_\pi ^2 \propto {\Lambda _{{\rm{QCD}}}}{m_{\rm{q}}}\). [129] also pointed out that the delicate balance between attractive and repulsive nuclear interactions [456] implies that the binding energy of nuclei is expected to depend strongly on the quark masses [159]. Recently, a fitting formula derived from effective field theory and based on the semi-empirical formula derived in [222] was proposed [120] as

$${{{E_{\rm{S}}}} \over A} = - \left({120 - {{97} \over {{A^{1/3}}}}} \right){\eta _S} + \left({67 - {{57} \over {{A^{1/3}}}}} \right){\eta _V} + \ldots$$
(191)

where η S and η V are the strength of respectively the scalar (attractive) and vector (repulsive) nuclear contact interactions normalized to their actual value. These two parameters need to be related to the QCD parameters [159]. We also refer to [211] for the study of the dependence of the binding of light (A ≤ 8) nuclei on possible variations of hadronic masses, including meson, nucleon, and nucleon-resonance masses.
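As a numerical illustration of the liquid drop expression (189) and of the cancellation encoded in Eq. (191), the sketch below evaluates both for a single nucleus. The choice of iron-56 (A = 56, Z = 26) and the function names are assumptions made for the example; the coefficients are those quoted in the text, and only the two terms of Eq. (191) that are written out explicitly are used.

```python
def strong_binding_per_nucleon(A, Z, coeffs=(15.7, 17.8, 23.7, 11.2)):
    """Liquid drop model, Eq. (189); coefficients (a_V, a_S, a_A, a_P) in MeV."""
    a_V, a_S, a_A, a_P = coeffs
    pairing = ((-1) ** A + (-1) ** Z) / A ** 1.5
    return (a_V
            - a_S / A ** (1.0 / 3.0)
            - a_A * (A - 2 * Z) ** 2 / A ** 2
            + a_P * pairing)

def contact_coefficients(A):
    """Coefficients (in MeV) multiplying eta_S and eta_V in Eq. (191)."""
    scalar = -(120.0 - 97.0 / A ** (1.0 / 3.0))
    vector = 67.0 - 57.0 / A ** (1.0 / 3.0)
    return scalar, vector

# Assumed example nucleus: iron-56.
e_s = strong_binding_per_nucleon(56, 26)
scalar, vector = contact_coefficients(56)
print(f"E_S/A (liquid drop) = {e_s:.2f} MeV")
print(f"coefficient of eta_S = {scalar:.1f} MeV, coefficient of eta_V = {vector:.1f} MeV")
```

The coefficients of η S and η V are each several times larger in magnitude than the strong binding energy per nucleon, so a small fractional change in either contact interaction shifts the binding energy by a much larger fraction.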

These expressions allow one to compute the sensitivity coefficients that enter in the decomposition of the mass [see Equation (201)]. They also emphasize one of the most difficult issues in the investigation of varying constants: the intricate structure of QCD and its role in low-energy nuclear physics, which is central to determining the masses of nuclei and the binding energies, quantities that are particularly important for BBN, the universality of free fall and stellar physics.

Gyromagnetic factors

The constraints arising from the comparison of atomic clocks (see Section 3.1) involve the fine-structure constant αEM, the proton-to-electron mass ratio μ and various gyromagnetic factors. It is important to relate these factors to fundamental constants.

The proton and neutron gyromagnetic factors are respectively given by gp = 5.586 and gn = −3.826 and are expected to depend on Xq = mq/ΛQCD [197]. In the chiral limit in which mu = md = 0, the nucleon magnetic moments remain finite so that one could have thought that the finite quark mass effects should be small. However, these effects are enhanced by π-meson loop corrections, which are proportional to \({m_\pi} \propto \sqrt {{m_{\rm{q}}}{\Lambda _{{\rm{QCD}}}}}\). Following [316], this dependence can be described by the approximate formula

$$g({m_\pi}) = {{g(0)} \over {1 + a{m_\pi} + bm_\pi ^2}}.$$

The coefficients are given by a = (1.37, 1.85)/GeV and b = (0.452, 0.271)/GeV2 respectively for the proton and neutron. This led [197] to \({g_{\rm{p}}} \propto m_\pi ^{- 0.174} \propto X_{\rm{q}}^{- 0.087}\) and \({g_{\rm{n}}} \propto m_\pi ^{- 0.213} \propto X_{\rm{q}}^{- 0.107}\). This was further extended in [204] to take into account the dependence on the strange quark mass ms to obtain

$${g_{\rm{p}}} \propto X_{\rm{q}}^{- 0.087}X_{\rm{s}}^{- 0.013},\quad {g_{\rm{n}}} \propto X_{\rm{q}}^{- 0.118}X_{\rm{s}}^{0.0013}.$$
(192)

All these expressions assume ΛQCD constant in their derivations.
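The exponents quoted from [197] follow from the logarithmic derivative of the approximate formula for g(mπ). The sketch below reproduces them; it assumes a pion mass of 0.1396 GeV (the value used in the original derivation is not stated in the text), and the function and variable names are hypothetical.

```python
M_PI = 0.1396  # GeV, assumed pion mass

# (a in 1/GeV, b in 1/GeV^2) as quoted in the text.
COEFFS = {"proton": (1.37, 0.452), "neutron": (1.85, 0.271)}

def dlng_dlnmpi(a, b, m_pi=M_PI):
    """Logarithmic derivative of g(m_pi) = g(0) / (1 + a m_pi + b m_pi^2)."""
    numerator = a * m_pi + 2.0 * b * m_pi**2
    denominator = 1.0 + a * m_pi + b * m_pi**2
    return -numerator / denominator

exponents = {}
for nucleon, (a, b) in COEFFS.items():
    k_pi = dlng_dlnmpi(a, b)
    # m_pi scales as sqrt(m_q * Lambda_QCD), so the X_q exponent is half of k_pi.
    exponents[nucleon] = (k_pi, 0.5 * k_pi)
    print(f"{nucleon}: g ~ m_pi^{k_pi:.3f} ~ X_q^{0.5 * k_pi:.3f}")
```

The results agree with the quoted exponents to the precision given in the text; small differences in the last digit depend on the pion mass adopted.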

This allows one to express the results of atomic clocks (see Section 3.1.3) in terms of αEM, Xq, Xs and Xe. Similarly, for the constants constrained by QSO observation, we have (see Table 10)

$$\begin{array}{*{20}{c}} {x \propto \alpha _{EM}^2X_q^{ - 0.087}X_s^{ - 0.013},\;\;\;\;\;\;\;\;\;\;} \\ {y \propto \alpha _{EM}^2X_q^{ - 0.124}X_s^{ - 0.024}{X_e},\;\;\;\;\;\;} \\ {\bar \mu \propto X_q^{ - 0.037}X_s^{ - 0.011}{X_e},\;\;\;\;\;\;\;\;\;\;\;} \\ {F \propto \alpha _{EM}^{3.14}X_q^{ - 0.0289}X_s^{0.0043}X_e^{ - 1.57},} \\ {F' \propto \alpha _{EM}^2X_q^{ - 0.037}X_s^{0.011}X_e^{ - 1},\;\;\;\;} \\ {G \propto \alpha _{EM}^{1.85}X_q^{ - 0.0186}X_s^{0.0073}X_e^{ - 1.85},} \end{array}$$
(193)

once the scaling of the nucleon mass as \({m_{\rm{N}}} \propto {\Lambda _{{\rm{QCD}}}}X_{\rm{q}}^{0.037}X_{\rm{s}}^{0.011}\) (see Section 5.3.2) is taken into account. This shows that the seven observable quantities that are constrained by current QSO observations can be reduced to only 4 parameters.

Models with varying constants

The models that can be constructed are numerous and cannot all be reviewed here. Thus, we focus on the string dilaton model in Section 5.4.1 and then discuss the chameleon mechanism in Section 5.4.2 and the Bekenstein framework in Section 5.4.3.

String dilaton and Runaway dilaton models

Damour and Polyakov [135, 136] argued that the effective action for the massless modes taking into account the full string loop expansion should be of the form

$$\begin{array}{*{20}{c}} {S = \int {{{\text{d}}^{\text{4}}}{\mathbf{x}}\sqrt { - \hat g} \left[ {M_s^2\left\{ {{B_g}(\Phi )\hat R + 4{B_\Phi }(\Phi )\left[ {\hat \square \Phi - {{(\hat \nabla \Phi )}^2}} \right]} \right\} - {B_F}(\Phi )\frac{k}{4}{{\hat F}^2}} \right.} } \\ {\quad \quad \left. { - {B_\psi }(\Phi )\bar {\hat \psi} {\hat {\not D}}\hat \psi + \ldots } \right]\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;} \end{array}$$
(194)

in the string frame, M s being the string mass scale. The functions B i are not known but can be expanded (from the genus expansion of string theory) in the limit Φ → −∞ as

$${B_i}(\Phi) = {{\rm{e}}^{- 2\Phi}} + c_0^{(i)} + c_1^{(i)}{{\rm{e}}^{2\Phi}} + c_2^{(i)}{{\rm{e}}^{4\Phi}} + \ldots$$
(195)

where the first term is the tree level term. It follows that these functions can exhibit a local maximum. After a conformal transformation \(({g_{\mu \nu}} = C{B_g}{{\tilde g}_{\mu \nu}},\psi = {(C{B_g})^{- 3/4}}B_\psi ^{1/2}\tilde \psi)\) the action in Einstein frame takes the form

$$S = \int {{{{{\rm{d}}^4}{\bf{x}}} \over {16\pi G}}\sqrt {- g} \left[ {R - 2{{(\nabla \phi)}^2} - {k \over 4}{B_F}(\phi){F^2} - \bar \psi {\not D}\psi + \ldots} \right]}$$
(196)

where the field ϕ is defined as

$$\phi \equiv \int {\left[ {{3 \over 4}{{\left({{{B_g{\prime}} \over {{B_g}}}} \right)}^2} + 2{{B_\Phi {\prime}} \over {{B_\Phi}}} + 2{{B_\Phi {\prime}} \over {{B_g}}}} \right]{\rm{d}}\Phi.}$$

It follows that the Yang-Mills coupling behaves as \(g_{{\rm{YM}}}^{- 2} = k{B_F}(\phi)\). This also implies that the QCD mass scale is given by

$${\Lambda _{{\rm{QCD}}}} \sim {M_s}{(C{B_g})^{- 1/2}}{{\rm{e}}^{- 8{\pi ^2}k{B_F}/b}}$$
(197)

where b depends on the matter content. It follows that the mass of any hadron, proportional to ΛQCD in first approximation, depends on the dilaton, m A (B g , B F ,…).

If, as allowed by the ansatz (195), m A (ϕ) has a minimum ϕ m , then the scalar field will be driven toward this minimum during the cosmological evolution. However, if the various coupling functions have different minima, then the minima of m A (ϕ) will depend on the particle A. To avoid violation of the equivalence principle at an unacceptable level, it is thus necessary to assume that all the minima coincide at ϕ = ϕ m , which can be implemented by setting B i = B. This can be realized by assuming that ϕ m is a special point in field space; for instance, it could be associated with the fixed point of a Z2 symmetry of the T- or S-duality [129].

Expanding ln B around its maximum ϕ m as ln B ∝ −κ(ϕ − ϕ m )2/2, Damour and Polyakov [135, 136] constrained the set of parameters (κ, ϕ0 − ϕ m ) using the different observational bounds. This toy model allows one to address the unsolved problem of the dilaton stabilization, to study all the experimental bounds together and to relate them in a quantitative manner (e.g., by deriving a link between equivalence-principle violations and time-variation of αEM). This model was compared to astrophysical data in [306] to conclude that ∣Δϕ∣ < 3.4 κ 10−6.

An important feature of this model lies in the fact that at lowest order the masses of all nuclei are proportional to ΛQCD so that, at this level of approximation, the coupling is universal, the theory reduces to a scalar-tensor theory and there is no violation of the universality of free fall. It follows that the deviations from general relativity are characterized by the PPN parameters

$${\gamma ^{{\rm{PPN}}}} - 1 \simeq - 2f_A^2 = - 2\beta _s^2{\kappa ^2}\Delta \phi _0^2,\quad {\beta ^{{\rm{PPN}}}} - 1 \simeq {1 \over 2}f_A^2{{{\rm{d}}{f_A}} \over {{\rm{d}}\phi}} = {1 \over 2}\beta _s^3{\kappa ^3}\Delta \phi _0^2$$

with

$${f_A} = {{\partial \ln {\Lambda _{{\rm{QCD}}}}(\phi)} \over {\partial \phi}} = - \left[ {\ln {{{M_s}} \over {{m_A}}} + {1 \over 2}} \right]{{{\rm{d}}\ln B} \over {{\rm{d}}\phi}} \equiv - {\beta _s}{{{\rm{d}}\ln B} \over {{\rm{d}}\phi}} = {\beta _s}\kappa \Delta {\phi _0}$$
(198)

with Δϕ0 = ϕ0 − ϕ m and β s ∼ 40 [135]. The variation of the gravitational constant is, from Equation (167), simply

$${{\dot G} \over G} = 2{f_A}{\dot \phi _0} = - 2\left[ {\ln {{{M_s}} \over {{m_A}}} + {1 \over 2}} \right]{{{\rm{d}}\ln B} \over {{\rm{d}}\phi}}{\dot \phi _0}.$$

The value of \({{\dot \phi}_0} = {H_0}\phi _0{\prime}\) is obtained from the Klein-Gordon equation (168) and is typically given by \(\phi _0{\prime} = - Z{\beta _s}\kappa \Delta {\phi _0}\), where Z is a number that depends on the equation of state of the fluid dominating the matter content of the universe in the last e-fold and on the cosmological parameters, so that

$${\left. {{{\dot G} \over G}} \right\vert _0} = 2{f_A}{\dot \phi _0} = - 2Z{H_0}\beta _s^2{\kappa ^2}\Delta \phi _0^2.$$
(199)

The factor Z is model-dependent and another way to estimate \(\phi _0{\prime}\) is to use the Friedmann equations, which imply that \({{\dot \phi}_0} = {H_0}\sqrt {1 + {q_0} - {3 \over 2}{\Omega _{{{\rm{m0}}}}}}\) where q0 is the deceleration parameter.

When one considers the effect of the quark masses and binding energies, various composition-dependent effects appear. First, the fine-structure constant scales as αEM ∝ B−1 so that

$${\left. {{{\dot \alpha} \over \alpha}} \right\vert _0} = \kappa \Delta {\phi _0}{\dot \phi _0} = - Z{H_0}{\beta _s}{\kappa ^2}\Delta \phi _0^2.$$
(200)

The second effect is, as pointed out earlier, a violation of the universality of free fall. In full generality, we expect that

$${m_A}(\phi) = N{\Lambda _{{\rm{QCD}}}}(\phi)\left[ {1 + \sum\limits_{\rm{q}} {\epsilon _A^q{{{m_{\rm{q}}}} \over {{\Lambda _{{\rm{QCD}}}}}}} + \epsilon _A^{{\rm{EM}}}{\alpha _{{\rm{EM}}}}} \right].$$
(201)

Using an expansion of the form (17), it was concluded that

$${\eta _{AB}} = {\kappa ^2}{({\phi _0} - {\phi _m})^2}\left[ {{C_B}\Delta \left({{B \over M}} \right) + {C_D}\Delta \left({{D \over M}} \right) + {C_E}\Delta \left({{E \over M}} \right)} \right]$$
(202)

with B = N + Z, D = N − Z and E = Z(Z − 1)/(N + Z)1/3 and where the values of the parameters C i are model-dependent.

It follows from this model that:

  • The PPN parameters, the time variation of α and G today and the violation of the universality of free fall all scale as \(\Delta \phi _0^2\).

  • The field is driven toward ϕ m during the cosmological evolution, a point at which the scalar field decouples from the matter field. The mechanism is usually called the least coupling principle.

  • Once the dynamics for the scalar field is solved, Δϕ0 can be related to Δϕ i at the end of inflation. Interestingly, this quantity can be expressed in terms of amplitude of the density contrast at the end of inflation, that is to the energy scale of inflation.

  • The numerical estimations [135] indicate that η U,H ∼ −5.4 × 10−5(γPPN − 1) showing that in such a class of models, the constraint on η ∼ 10−13 implies 1 − γPPN ∼ 2 × 10−9, which is a better constraint than the one obtained directly.
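The common scaling with the square of Δϕ0 listed above can be made concrete with a short numerical sketch of Eqs. (198)–(200). The inputs κ = 1, Δϕ0 = 10⁻⁵ and Z = 1 are arbitrary illustrative choices, rates are expressed in units of H0, β s = 40 is the value quoted in the text, and the function name is hypothetical.

```python
def dilaton_predictions(kappa, dphi0, Z=1.0, beta_s=40.0, H0=1.0):
    """Observables of the least-coupling dilaton model, Eqs. (198)-(200)."""
    f_A = beta_s * kappa * dphi0                      # Eq. (198)
    return {
        "gamma_PPN_minus_1": -2.0 * f_A**2,
        "beta_PPN_minus_1": 0.5 * f_A**2 * beta_s * kappa,
        "Gdot_over_G": -2.0 * Z * H0 * f_A**2,        # Eq. (199)
        "alphadot_over_alpha": -Z * H0 * beta_s * kappa**2 * dphi0**2,  # Eq. (200)
    }

# Assumed illustrative inputs.
base = dilaton_predictions(kappa=1.0, dphi0=1e-5)
doubled = dilaton_predictions(kappa=1.0, dphi0=2e-5)

for name, value in base.items():
    print(f"{name:>20s} = {value: .3e}  (x{doubled[name] / value:.1f} when dphi0 doubles)")

# Ratio of the two time variations: independent of kappa, dphi0 and Z.
ratio = base["alphadot_over_alpha"] / base["Gdot_over_G"]

# Bound on 1 - gamma_PPN implied by eta ~ 1e-13 and eta ~ -5.4e-5 (gamma_PPN - 1).
one_minus_gamma = 1e-13 / 5.4e-5
print(f"(alpha_dot/alpha)/(G_dot/G) = {ratio:.4f}, 1 - gamma_PPN ~ {one_minus_gamma:.1e}")
```

Every observable is multiplied by four when Δϕ0 doubles, and the ratio of the time variations of α and G reduces to 1/(2β s ), so in this model a bound on any one of these quantities translates directly into bounds on the others.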

This model was extended [