1 Introduction

For many years, Einstein’s general theory of relativity (GR), which was finalised in 1915 (Einstein 1915), could only be tested in the weak-field slow-motion regime of the Solar System,Footnote 1 and testing GR and its alternatives beyond their first post-Newtonian (PN) approximation was way out of reach. Then the discovery of the first binary pulsar by Russell Hulse and Joseph Taylor in the summer of 1974 (Hulse and Taylor 1975) provided the physics community with a completely new testbed for investigating our understanding of gravity, space and time. With the discovery and continued observation of the Hulse–Taylor pulsar, aspects of the gravitational interaction of strongly self-gravitating bodies—specifically, two neutron stars (NSs)—could be investigated for the first time. Furthermore, at a quite early stage it was clear that the Hulse–Taylor pulsar provides a unique opportunity to test for the existence of gravitational waves (GWs) emitted by accelerated masses, which was confirmed with high precision in the following decade (Taylor and Weisberg 1982, 1989).

This was important for the opening of the GW window on the Universe: not only did it greatly facilitate the construction of ground-based GW observatories, but it also motivated the current attempts at detecting very low frequency GWs via pulsar timing arrays (Perera et al. 2019), a detection that at the time of writing seems close (Antoniadis et al. 2023; Agazie et al. 2023; Reardon et al. 2023). This highlights the double use of pulsar timing, both for precise tests of gravity theories and for detecting low-frequency GWs from other sources.Footnote 2

Meanwhile, many other radio pulsar systems were, and are being, found that can be used for testing gravitational physics and our understanding of space and time. Depending on their orbital properties and the characteristics of their companions, these pulsars allow the study of different aspects of relativistic gravity and the derivation of the tightest constraints on some alternatives to GR; as an example, pulsar experiments with a millisecond pulsar (MSP) in a triple stellar system have provided the largest lower limit (around 150,000) for the Brans–Dicke parameter \(\omega _\text{BD}\) (see Voisin et al. 2020 and Sect. 5.1). Besides tests of specific gravity theories, there are pulsars that allow even for quite generic constraints on potential deviations from GR in the quasi-stationary strong-field regime and the radiative aspects of gravity. So far, GR has passed all the pulsar tests with flying colours! These experiments play a crucial complementary role to other modern gravity tests, such as those conducted by GW observatories (Abbott et al. 2016, 2021a, b) or the Event Horizon Telescope (Event Horizon Telescope Collaboration 2019b; Event Horizon Telescope Collaboration 2022b), as well as modern Solar System tests and tests on cosmological scales (Berti et al. 2015).

Despite the growth in the number of systems that we can use, and the numbers and types of experiments, this work stands on the foundations laid after Jocelyn Bell noticed the first pulsar on her chart, and later when PSR B1913+16 was found by Joe Taylor and collaborators. These foundations include the experimental techniques that were developed for the precise timing of pulsars and binary pulsars, but also the many theoretical developments triggered by the discovery of “the” binary pulsar, which provide a solid conceptual foundation for our experiments. It is important to recognize, 50 years after the discovery of PSR B1913+16, our great debt to those who opened the path for us.

In this review, we provide a summary of the tests of gravity theories with radio pulsars. This is not meant to be exhaustive, and covers only a bare minimum of necessary historical material, focusing instead on the results of greater significance for the study of gravity. In Sect. 2, we provide a very brief overview of pulsar radio emission, how it is detected and its data processed for timing purposes, and some information on the pulsars as astrophysical objects, especially the formation and evolution of binary pulsars. This is important for understanding the systems themselves, but also why some systems are better laboratories than others; the reader familiar with these topics can skip this section. In Sect. 3, we present an outline of the pulsar timing technique and describe the relativistic effects that have been seen in binary pulsars, mostly via this timing technique. We give special emphasis to the “post-Keplerian” (PK) parameters (which quantify these relativistic effects) and their interpretation in GR, which is the basis of most of the tests discussed in this review. In Sect. 4, we discuss the main tests of GR with binary pulsars. In Sect. 5, we discuss pulsar tests of our general understanding of gravity and gravitational symmetries, and a search for possible deviations from GR, with a particular focus on phenomena that are predicted by alternative theories of gravity. The non-detection of these phenomena represents strong constraints on such alternative theories. As an illustrative example, we will present pulsar tests of Damour–Esposito–Farèse (DEF) gravity in some more detail. Finally, in Sect. 6, we summarize the results and discuss future prospects.

As a final comment, it is important to emphasise that the vast majority of the work discussed here appeared in the years since the last Living Review in Relativity on this topic was published (Stairs 2003).

2 Pulsars: the neutron stars and their radio emission

2.1 Radio pulsars

Jocelyn Bell’s discovery of radio pulsars in 1967 (Hewish et al. 1968) was a complete surprise. The first radio pulsar, then known as CP 1919 (now known as PSR B1919+21), showed an extraordinarily stable pulsation period of \(1.3372795 \pm 0.0000020\) s. This stable periodicity is the feature that makes pulsars uniquely useful for a wide range of applications in astrophysics and, as discussed below, fundamental physics.

That the signals, first detected at a radio frequency of 81.5 MHz, were of astrophysical origin was soon firmly established. The discovery itself happened because Jocelyn Bell noticed that the signal reappeared, like the distant stars, 4 min earlier every day within the beam of the Cambridge radio telescope. Additional evidence that the signals were of interstellar origin was provided by the detection of dispersion: the radio pulsations at lower radio frequencies arrive with a delay relative to the same pulsations at higher frequencies, as expected from a signal traveling through diluted ionised gas for distances of hundreds of parsecs (see Sect. 2.3.3). Final proof was provided by the fact that the observed periodicity varied precisely with the Doppler shift caused by the projection of Earth’s velocity along the direction to the pulsar (Hewish et al. 1968).

Figure 1 shows a well-known pulse train from this pulsar, recorded later at a radio frequency of 318 MHz (Craft 1970). This shows clearly that the individual pulses are nearly random. However, and most importantly, adding a large number of such pulses in phase, we arrive at a stable pulse profile, which is characteristic of each pulsar.

Fig. 1

Sequence of radio pulses from PSR B1919+21 observed in a radio intensity time series recorded at a radio frequency of 318 MHz with the Arecibo 305-m radio telescope. The figure shows how the intensity of the radio signal (vertical axis) changes as a function of spin phase (horizontal axis, increasing towards the right, only a short window of about 10% of a full rotation is shown). Successive pulses are displayed with increasing vertical offsets for visibility. Note the bar, indicating a time interval of 20 ms, which shows the high time resolution of the signal. From Craft (1970)

2.2 Neutron stars

The earlier interpretations assigned the signal to the smallest sources known to exist at the time, white dwarf (WD) stars (see e.g., discussion in Hoyle and Narlikar 1968). However, the discovery of the Vela and Crab pulsars (Large et al. 1968; Staelin and Reifenstein III 1968), and in particular the discovery of the very short periodicity of the latter pulsar, 33.091 ms (Comella et al. 1969), which is slowing down with time (Richards and Comella 1969), forced the acceptance of the model that assigned the periodic pulsating signals to the rotation of a NS (Gold 1968; Pacini 1968).

This was an extraordinary breakthrough for two main reasons:

  1.

    It established beyond doubt the existence of NSs: that there is a dense, stable remnant considerably more compact than WDs had been uncertain since 1934, when they were first proposed by Baade and Zwicky (1934).Footnote 3 The first detailed calculations of NS structures (Tolman 1939; Oppenheimer and Volkoff 1939) resulted in estimates of the maximum NS mass between 0.7 and a few solar masses. With diminutive radii of \({\mathcal{O}}\) (12 km) and masses similar to that of the Sun, their central densities are higher than that of atomic nuclei. Owing to their compactness, the calculation of NS structures requires a fully general-relativistic hydrostatic equilibrium equation, together with detailed information on the macroscopic properties (especially pressure) of nuclear matter at these high densities, its equation of state (EoS) (Tolman 1939; Oppenheimer and Volkoff 1939). Even today, the unknown state of matter at these densities implies that there are still significant uncertainties in its EoS, which result in \({\mathcal{O}}\) (20%) uncertainties in the maximum mass, radius and moment of inertia (MoI) of NSs, making this a very active topic of research with implications not only for many branches of astrophysics (Özel and Freire 2016) but also for the understanding of the strong nuclear force (Lattimer 2021).

  2.

    The association of the Vela and Crab pulsars with well-known supernova (SN) remnants (the Crab nebula is associated with a historical SN observed in the constellation Taurus in 1054 AD, with detailed records from China and Japan, Stephenson and Green 2002) implied that some types of core-collapse SNe mark the birth of a NS (Janka 2012; Burrows and Vartanyan 2021). Thus, NSs represent the end products of the evolution of some types of massive stars, with the collapse of their cores (and the associated neutrino burst) powering the SN explosion, as had been suggested since the 1930s (Baade and Zwicky 1934; Zwicky 1938).

2.3 Characteristics of the radio emission

Despite all the previous work on NSs, the discovery of pulsars was surprising because no one predicted their radio emission. This is still the case today: 56 years after the discovery of PSR B1919+21, the radio emission from pulsars remains poorly understood (but see Philippov et al. 2020). However, this does not hinder in the least the use of pulsars for the experiments described below.

In the radio sky, pulsars appear as faint, point-like radio sources. This faint emission is generally observed at frequencies of a fraction to a few GHz, i.e., decimetric to metric wavelengths. It is a broadband phenomenon, without any recognisable spectral features, with a maximum of emission around a couple hundred MHz and a steep decrease in power at higher frequencies, which means that the vast majority of pulsars become undetectable at frequencies above a few GHz. However, about 10% of known radio pulsars have associated gamma-ray emission (Smith et al. 2023).

One of the most prominent characteristics of the radio emission is its high degree of polarisation, with percentages of both linear and circular polarisations that are much higher than in other radio sources. This points to another fundamental characteristic of this emission, its coherence: the effective temperatures required for an object to produce the observed radio emission via thermal blackbody emission (i.e., its emission temperature) are of the order of \(10^{25}\,\hbox{K}\) for most pulsars, which is far too high for any thermal process. For additional details, see Lyne and Graham-Smith (2012), Lorimer and Kramer (2012).

2.3.1 The lighthouse model

Despite the lack of a good model for the radio emission, a geometric model was firmly established soon after the discovery of PSR B1919+21. The pulsar is a magnetised NS, and the radio emission is highly anisotropic, emerging mostly from regions close to open magnetic field lines in the vicinity of the magnetic poles. Because the magnetic axis is generally misaligned with the spin axis, distant observers only detect radio emission for the time (within each rotation of the NS) that some part of the emission beam is pointing to the Earth. The overall effect is similar to the apparent pulsations of a lighthouse. This hypothesis is confirmed by the observation that the position angle (PA) of the linear polarisation is seen to change with the spin phase in a way that is consistent with the changing orientation of the magnetic field lines directly along the line of sight to the observer (Radhakrishnan and Cooke 1969). From this change of the PA with spin phase the rotation geometry can be determined, especially the angle between the magnetic field and the spin axis and the angle between the spin axis and the line of sight (for details, see Lorimer and Kramer 2012).

2.3.2 Time domain signals

Figure 1 illustrates a fundamental aspect of most pulsar observations: that they are time-domain observations, where a 1-dimensional time signal from a restricted location in the sky is recorded with high time resolution, instead of a 2-dimensional spatial signal as when an image of the sky is being produced.

This fundamental difference implies that while imaging observations of many radio sources require the use of radio interferometers, like the Very Large Array (VLA) near Socorro, New Mexico (USA), the MeerKAT 64-dish array near Carnarvon, South Africa (Jonas and MeerKAT team 2016), and the upgraded Giant Metrewave Radio telescope (uGMRT) near Narayangaon, India (Gupta et al. 2017), or scans with single dish telescopes, sometimes using multiple receivers in the focal plane of the telescope (“multi-pixel” receivers), the use of such imaging techniques is not required for most pulsar observations, which can also be made by single radio dishes with single-pixel receivers—the poorer imaging resolution is in this case completely irrelevant.

Given the faintness of pulsar emission, what matters above all for pulsar observations is the sensitivity of the radio telescope. Currently, the most sensitive radio telescope in the world for pulsar observations is a single dish, the Five hundred meter Aperture Spherical Telescope (FAST), in Guizhou province, China (Nan and Li 2013), a design inspired by (and the first to surpass the sensitivity of) the 305-m William E. Gordon radio telescope near Arecibo, Puerto Rico, USA, which is no longer in operation. Other telescopes used extensively for pulsar searches are the largest fully steerable radio telescopes in the world, the 100-m Green Bank radio telescope in Green Bank, West Virginia, the 100-m Effelsberg radio telescope near Effelsberg, Germany, the 75-m telescope at Jodrell Bank near Manchester, UK and the 64-m Murriyang radio telescope near Parkes, Australia. Although not fully steerable, the Nançay radio telescope in Nançay, France is also extensively used for pulsar observations.

When observing radio pulsars (and many other types of sources), these radio telescopes use high sensitivity, broadband single-pixel receivers. These produce two voltage streams, one for each radio polarisation; these should be proportional to the electric field (the amplitude) of the incoming wave. In modern observing systems, these voltages are sampled and digitised at rates of at least twice the bandwidth being registered, which typically implies a few times \(10^9\) samples per second (e.g., Jonas and MeerKAT team 2016). These two voltage streams—and the local time references—provide all the information in pulsar observations. From here on we describe how these data are processed and reduced.

From these two digitised voltage streams, a digital spectrometer (also called in radio instrumentation a “filterbank”) produces spectra, either using fast Fourier transforms (FFTs) or, for better channelisation, polyphase filterbank techniques. The number of frequency channels in the spectra varies according to the needs of different observations: a few hundred to a few thousand for pulsar observations, tens of thousands for observations of radio spectral lines.

In pulsar observations, at least one spectrum is recorded, for the total intensity of the signal (\({\mathcal{I}}\)), obtained from the addition of the squares of the voltages from each of the two polarisations. This process is called detection: \({\mathcal{I}}\) should be proportional to the power of the incoming radio waves.

However, four spectra can be computed and recorded at the same time, one for each of the Stokes parameters: in addition to \({\mathcal{I}}\) these include \({\mathcal{Q}}\), \({\mathcal{U}}\) and \({\mathcal{V}}\) (Lorimer and Kramer 2012), thus preserving the polarisation characteristics of the signal. This is important for pulsars, since as mentioned above they are highly polarised sources. The Stokes parameters are the components of the 4-dimensional Stokes vector. The rationale for describing the polarisation characteristics of the signal using Stokes vectors is that they can, like any other vectors, be added as necessary. This means that we can add all polarisation measurements occurring at a particular spin phase of the pulsar to derive a high S/N measurement of the average polarisation of the radio emission at that phase. The correct derivation of the Stokes parameters requires careful calibration of the two voltage streams.
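As an illustration of the detection step and of why Stokes vectors can simply be averaged, the following minimal sketch computes the four Stokes parameters from one pair of complex voltage samples. It assumes linearly polarised feeds (here called `x` and `y`) and one particular sign convention for \({\mathcal{V}}\); both are assumptions made for this example, since real receivers may use circular feeds and different conventions, and the calibration step mentioned above is omitted entirely.

```python
def stokes_from_linear(x: complex, y: complex):
    """Stokes (I, Q, U, V) from complex voltages of two linear feeds.

    Assumed convention (not taken from the text): linear X/Y feeds,
    U = 2 Re(x y*), V = 2 Im(x y*). Other conventions flip signs.
    """
    px = abs(x) ** 2          # power in the first polarisation
    py = abs(y) ** 2          # power in the second polarisation
    cross = x * y.conjugate() # cross-correlation of the two streams
    return (px + py, px - py, 2.0 * cross.real, 2.0 * cross.imag)


def average_stokes(vectors):
    """Component-wise mean of a list of Stokes 4-vectors."""
    n = len(vectors)
    return tuple(sum(v[k] for v in vectors) / n for k in range(4))
```

The first component is the total intensity described above, the sum of the squared voltages. Averaging two fully polarised samples with opposite \({\mathcal{V}}\) yields a vector with zero circular polarisation, which is the sense in which Stokes vectors "add" at a given spin phase.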

The limiting time resolution within each spectral channel is the inverse of its bandwidth; for pulsar observations the latter is of the order of MHz, thus the time resolution is typically of the order of microseconds (\(\upmu \hbox{s}\)). Consecutive samples are integrated in all channels, and the resulting spectra are recorded, typically once every few tens of \(\upmu \hbox{s}\); this is generally done in order to reduce the data rates. In any case, this sampling interval is much shorter than for most other astronomical observations.

2.3.3 Dispersion and dedispersion

Since pulsars have no narrow spectral features, one might wonder about the need for obtaining spectral information. This is primarily because of the phenomenon of dispersion, and secondarily for the purposes of the rejection of radio frequency interference, which tends to appear in a limited number of spectral channels.

Dispersion happens because the group velocity of radio waves in a cold, diluted plasma is smaller than the speed of light in vacuum, c. If the radio frequency F is much higher than the plasma frequency anywhere along the line of sight, then the accumulated delay for the wave arriving at the Earth is given by the expression (in cgs units):

$$\begin{aligned} \Delta t_{\rm dis}(F) = \frac{e^2}{2 \pi m_{\rm e} c} \frac{1}{F^2} \, \int _{\rm pulsar}^{\rm Earth} n_{\rm e}(l) \, dl, \end{aligned}$$
(1)

where F is the radio frequency, e and \(m_{\rm e}\) are the charge and mass of the electron, and \(n_{\rm e}\) is the density of free electrons of the interstellar medium (ISM) at a distance l along the propagation path. If l is specified in parsecs and \(n_{\rm e}\) in \(\hbox{cm}^{-3}\), then the integral—the electron column density between the pulsar and the Earth—is expressed in \(\hbox{cm}^{-3}\,\hbox{pc}\) and is known as the dispersion measure (DM). Specifying F in GHz, we obtain:

$$\begin{aligned} \Delta t_{\rm{dis}}(F) = 4.1488 \, \frac{\hbox{DM}}{F^2}\,\hbox{ms}. \end{aligned}$$
(2)

If the dispersive delays are not subtracted, the pulsar signal will be smeared in time, and the pulsar signal lost (Lorimer and Kramer 2012).
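The numerical prefactor in Eq. (2) follows directly from the cgs constants in Eq. (1). The short sketch below recomputes it and evaluates the delay for a given DM; the constant values are rounded and hard-coded, and the function names are ours, introduced only for illustration.

```python
import math

# Rounded cgs constants, hard-coded for this sketch.
E_ESU = 4.8032047e-10    # electron charge [statC]
M_E_G = 9.1093837e-28    # electron mass [g]
C_CM_S = 2.99792458e10   # speed of light [cm/s]
PC_CM = 3.0856776e18     # one parsec [cm]


def dispersion_constant_ms() -> float:
    """Prefactor of Eq. (2) in ms, for DM in cm^-3 pc and F in GHz."""
    k_cgs = E_ESU**2 / (2.0 * math.pi * M_E_G * C_CM_S)  # [cm^2 / s]
    # Convert DM from cm^-3 pc to cm^-2 and F from GHz to Hz.
    k_seconds = k_cgs * PC_CM / (1.0e9) ** 2
    return k_seconds * 1.0e3  # seconds -> milliseconds


def dispersion_delay_ms(dm: float, freq_ghz: float) -> float:
    """Delay of Eq. (2) relative to a signal of infinite frequency."""
    return dispersion_constant_ms() * dm / freq_ghz**2


def delay_between_ms(dm: float, f_low_ghz: float, f_high_ghz: float) -> float:
    """Extra delay at the lower frequency relative to the higher one."""
    return dispersion_delay_ms(dm, f_low_ghz) - dispersion_delay_ms(dm, f_high_ghz)
```

For instance, with a DM of \(10\,\hbox{cm}^{-3}\,\hbox{pc}\), a signal at 1 GHz lags one at 2 GHz by roughly 31 ms, longer than the spin period of many recycled pulsars, which is why the uncorrected signal is smeared out.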

This subtraction can be done in two ways, both requiring the use of channelisation by a spectrometer. The simpler one is to move the detected signal of each spectral channel at frequency F forward by \(\Delta t_{\rm dis}(F)\); this is known as incoherent dedispersion. The advantage is the simplicity and small amount of computing effort required, while the disadvantage is that the dispersive smearing within each channel is not removed. The method used for most modern timing observations is to remove the effect coherently, before detection: after the two voltage streams are Fourier transformed, a rotation proportional to \(\Delta t_{\rm dis}(F)\) is applied to the complex number in the Fourier spectrum at frequency F. Fourier transforming this signal back to the time domain gives two voltage streams that are apparently unaffected by the ISM. From these voltage streams, the Stokes parameters are computed as described above. This process is known as coherent dedispersion (Hankins 1971). The advantage is that it eliminates the dispersive smearing within each channel, while the disadvantage is the large computational power required and the need for precise a priori knowledge of the DM.

After dedispersion, all spectral channels can be added in frequency, producing a data set recording the variation of the Stokes parameters with time—a time series—which is the final product of the dedispersion procedure.
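The incoherent variant described above can be sketched in a few lines. The example assumes detected (total intensity) samples stored as one list per channel, uses the prefactor of Eq. (2), rounds each delay to the nearest sample, and wraps shifted samples around the end of the array; these choices, and the function name, are simplifications made for illustration and do not represent any particular software package.

```python
def dedisperse_incoherent(channels, freqs_ghz, dm, dt_ms, k_ms=4.1488):
    """Shift each channel by its dispersive delay and sum over frequency.

    channels[i][j] is the detected power in channel i at sample j,
    freqs_ghz[i] is the centre frequency of channel i, dt_ms is the
    sampling interval. Delays are measured relative to the highest
    frequency. Circular wrapping at the array edge is a simplification.
    """
    f_ref = max(freqs_ghz)
    n = len(channels[0])
    series = [0.0] * n
    for chan, f in zip(channels, freqs_ghz):
        delay_ms = k_ms * dm * (1.0 / f**2 - 1.0 / f_ref**2)
        shift = int(round(delay_ms / dt_ms))  # delay in whole samples
        for j in range(n):
            # Move the channel forward in time by `shift` samples.
            series[j] += chan[(j + shift) % n]
    return series
```

Because the shift is applied per channel, any smearing inside a channel remains, which is the limitation of incoherent dedispersion noted above.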

2.3.4 Individual pulses and the average pulse

The lighthouse model implies that by keeping track of the radio pulsations, we can in principle measure precisely how the number of rotations of the NS N changes with the proper time T of the reference frame of the pulsar (see Sect. 3.1)Footnote 4. To leading order, the relation is given by:

$$\begin{aligned} N(T) = \frac{\phi (T) - \phi _0}{2 \pi } = \nu (T - T_0) + \frac{1}{2} {\dot{\nu }}(T - T_0)^2 , \end{aligned}$$
(3)

where \(\phi (T)\) and \(\phi _0\) are the spin phases at time T and a reference time \(T_0\) (these phases are not limited to the interval between 0 and \(2\pi \), they increase continuously with time), \(\nu \) is the spin frequency of the NS and \({\dot{\nu }}\) is its time derivative, both measured at \(T_0\). For spin-powered pulsars, the rotation slows down with time (as mentioned above for the Crab pulsar), implying that \({\dot{\nu }}\) is negative.
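Eq. (3) translates directly into code. The sketch below evaluates the rotation count and the corresponding fractional spin phase; the function names and the numerical values in the usage note are illustrative and do not correspond to any real pulsar.

```python
def rotation_count(t, t0, nu, nu_dot):
    """Number of rotations N(T) since T0, following Eq. (3).

    nu is the spin frequency [Hz] and nu_dot its time derivative
    [Hz/s], both evaluated at the reference time t0 [s].
    """
    dt = t - t0
    return nu * dt + 0.5 * nu_dot * dt**2


def fractional_phase(t, t0, nu, nu_dot):
    """Spin phase within the current rotation, in the range [0, 1)."""
    return rotation_count(t, t0, nu, nu_dot) % 1.0
```

With a hypothetical \(\nu = 2\) Hz and \({\dot{\nu }} = -0.1\,\hbox{Hz s}^{-1}\) (an unrealistically large spin-down, chosen only to make the effect visible), the count after 10 s is 15 rotations rather than the 20 expected for a constant spin frequency.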

In the intensity time series of the first pulsar (Fig. 1 and Fig. 1 of Hewish et al. 1968) it was already apparent that there is considerable variation not only in the strength, but also in the particular shape of each pulse. This means that, by observing individual pulses, it is difficult to measure N(T). However, if in Fig. 1 we add all individual pulses on top of each other—a process known as folding—we recover a stable average pulse profile. Thus, folding is vitally important, not only to increase the S/N of the signal, but also because it makes timing possible.

To fold a time series properly, one needs a good a priori estimate of the spin frequency of the pulsar as seen at the receiving radio telescope at the local time \(\tau \)Footnote 5 when the observation is occurring, \(\tau _{\rm obs}\). This local spin frequency \(\nu _{\rm obs}\) changes constantly because of the constantly changing conversion factor between T and \(\tau \) (see Sect. 3). If this is not taken into account, the phase of the radio emission will appear to drift with time. The prediction of \(\nu _{\rm obs}\) is made by a timing programme using the best available pre-existing timing model for a pulsar, its ephemeris, and the best description of the motion of the radio telescope relative to the Solar System barycentre (SSB).
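A minimal sketch of folding is given below. It assumes that the apparent spin period is constant over the observation, which, as just explained, is only an approximation: in practice the phase of each sample is predicted from the ephemeris. The function name and arguments are ours.

```python
def fold(samples, dt, period, n_bins):
    """Fold a dedispersed time series into an average pulse profile.

    samples[j] is the intensity at time j * dt; period is the assumed
    (constant) apparent spin period in the same units as dt. Each
    sample is added to the phase bin it falls into, and every bin is
    normalised by the number of samples it received.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for j, value in enumerate(samples):
        phase = (j * dt / period) % 1.0      # spin phase in [0, 1)
        b = int(phase * n_bins) % n_bins     # phase bin index
        sums[b] += value
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

If the assumed period is slightly wrong, the pulse lands in a slowly changing bin from one rotation to the next, and the folded profile is broadened; this is the phase drift mentioned above.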

For most pulsars, there are significant deviations from the simple spin-down given by Eq. (3), which are generally categorised as either “timing noise” or “glitches” (Lorimer and Kramer 2012). However, for some types of pulsars—especially the recycled pulsars, described in the next section—the rotation can be described by Eq. (3) to a very good approximation. This means that they have an extremely stable rotation, making them ideal tools for the types of experiments described in this review.

2.3.5 The time of arrival

This stable profile allows new measurements of \(N(\tau )\) from the data of the observation. This could be done by measuring N at specific values of \(\tau \), but instead the convention used in pulsar astronomy is to measure \(\tau \) at specific values of N, in particular its integer values. This is how it is done: a high S/N version of the pulse profile is used as a template, with its zero phase representing the reference longitude on the NS. Like longitudes on Earth, this has an arbitrary element to it; it can be set, for instance, at the peak of radio emission. However, once this convention is established, it is very important to use it consistently for the same pulsar, or at least for measurements taken with the same instrument and frequency. This template is then correlated with individual pulse profiles obtained, in each observation, after dedispersion and folding of its data to derive the so-called time of arrival (ToA), \(\tau _i\) (for details, see e.g., Taylor 1992).
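The template-matching step can be illustrated with a simple time-domain cross-correlation. This is a deliberately simplified stand-in: it locates the shift only to the nearest phase bin, whereas timing software generally determines it to a small fraction of a bin (see Taylor 1992 for the standard approach). The function names and the way the ToA is referred to the start of the folded observation are assumptions made for this example.

```python
def best_shift_bins(template, profile):
    """Circular lag (in bins) that best aligns the template with the profile."""
    n = len(template)
    best_lag, best_cc = 0, float("-inf")
    for lag in range(n):
        # Cross-correlation of the template with the profile at this lag.
        cc = sum(template[i] * profile[(i + lag) % n] for i in range(n))
        if cc > best_cc:
            best_lag, best_cc = lag, cc
    return best_lag


def toa_from_shift(t_start, period, lag_bins, n_bins):
    """Time at which the template's zero phase arrives after t_start."""
    return t_start + period * lag_bins / n_bins
```

For a profile that is a copy of the template delayed by three of eight bins, `best_shift_bins` returns 3, and the ToA is the start time plus three eighths of the folding period.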

Although the ToA uncertainties can be very small for some pulsars (e.g., \(0.1\,\upmu \hbox{s}\) for the timing of PSR J1909−3744, Liu et al. 2020), such precision is very rare. The limitations stem from several of the aforementioned characteristics of the pulsar radio signal. The faint signal implies that the ToA precision of most recycled pulsars is limited by the S/N of the detections. Furthermore, small DM variations introduce additional frequency-dependent noise into the timing. Another problem, which dominates in the few cases when the S/N of the detections is very high, is pulse jitter. This is caused by the randomness of the individual pulses, which can, in some cases, take a long time to average into “the” average profile of the pulsar. Furthermore, the lack of sharp features in the profile can further limit the timing precision. Finally, long-term deviations from Eq. (3)—the timing noise and glitches—can also degrade the timing of some recycled pulsars.

2.4 Pulsar evolution and binary pulsars

The vast majority of known pulsars are found in our Galaxy and its retinue of globular clusters: the most distant pulsars known to us are located in two satellite galaxies of the Milky Way, the LMC and SMC, located about 50 and 60 kiloparsecs (kpc) away from Earth, respectively. Because of their intrinsic faintness, the sensitivity of the observing instruments remains the main limitation in the discovery of pulsars: the known pulsar population (3473 pulsars at the time of writing, Manchester et al. 2005) is thought to represent only a few percent of the likely population of active pulsars in our own Galaxy, and a tiny fraction of its \(\sim 10^9\) NSs.

2.4.1 Normal and recycled pulsars

In Fig. 2, we plot the spin period (\(P = \nu ^{-1}\), horizontal axis) and the spin period derivative (\({\dot{P}} = - {\dot{\nu }} \nu ^{-2}\), vertical axis) for about 3000 rotation-powered pulsars (black crosses) for which these parameters have been measured (Manchester et al. 2005). From these two parameters, the age, magnetic field and energetics of the pulsars can be inferred (dotted lines) using a variety of assumptions, like in this case that the spindown is caused by the emission of a low-frequency electromagnetic (EM) wave, as one would expect from a rotating magnetic dipole, and that the dipole is orthogonal to the spin axis (Lorimer and Kramer 2012). For the rate of rotational energy loss,

$$\begin{aligned} {\dot{E}}_{\rm{rot}} = -4 \pi ^2 I \frac{{\dot{P}}}{P^3}, \end{aligned}$$
(4)

a MoI (I) of \(10^{45}\,\hbox{g cm}^{2}\) is generally assumed (Lorimer and Kramer 2012).
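As a worked illustration, the sketch below converts \(\nu \) and \({\dot{\nu }}\) into P and \({\dot{P}}\) using the relations given above and evaluates Eq. (4) with the assumed MoI. The function names are ours, and the input values in the usage note are round numbers, not measurements of a specific pulsar.

```python
import math

MOI_CGS = 1.0e45  # assumed moment of inertia [g cm^2]


def period_and_derivative(nu, nu_dot):
    """Return (P, P_dot) from the spin frequency and its derivative."""
    return 1.0 / nu, -nu_dot / nu**2


def e_dot_rot(p, p_dot, moi=MOI_CGS):
    """Rate of change of rotational energy, Eq. (4), in erg/s.

    Negative for a pulsar that is spinning down (p_dot > 0).
    """
    return -4.0 * math.pi**2 * moi * p_dot / p**3
```

For \(P = 1\) s and \({\dot{P}} = 10^{-15}\,\hbox{s s}^{-1}\), values typical of the normal pulsars, this gives a loss of about \(4 \times 10^{31}\,\hbox{erg s}^{-1}\).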

Fig. 2

Period-(intrinsic) period derivative diagram for radio pulsars, taken from Manchester et al. (2005). Double NS systems are highlighted by red and all other binary pulsars by blue circles. The pulsar in the stellar triple system is marked by a magenta triangle. The rest, i.e., the isolated pulsars, are marked with black crosses. Dashed lines indicate constant characteristic age \(\tau _{\rm{c}}\) and surface magnetic field strength \(B_{\rm{S}}\) (labeled accordingly)

Clearly, there are two main groups of pulsars in this diagram: the “normal” pulsars form the central, more numerous group, with \(0.1\,< \, P \, < 5\) s and \(10^{-17}\,< \,{\dot{P}} \, < \, 10^{-11}\,\hbox{s s}^{-1}\). In the lower left, with values of P and \({\dot{P}}\) (and magnetic fields) three orders of magnitude smaller, are the “recycled” pulsars. Unlike the normal pulsars, they were spun up (and their magnetic fields degraded) by accretion of mass from a stellar companion after they formed. This is indicated by the fact that about 80% of these pulsars are in binaries (indicated by the filled circles), while among the normal pulsars this is less than 1%.

2.4.2 Binary pulsars

The two main types of binary pulsars are indicated in Fig. 2 by the coloured dots: red for pulsars with NS companions and blue for pulsars with WD (and other types of) companions.

How do these systems form? This is important for understanding why some systems are more useful than others for GR tests. A detailed account of their evolution is given by Tauris and van den Heuvel (2023), and in what follows we present a very summarised outline. For simplicity, we start with a binary system consisting of two main sequence (MS) stars. This is a common occurrence, especially for massive stars, the majority of which form in binary and multiple systems.

As the more massive star (the primary) evolves, it eventually explodes as a SN, forming a normal pulsar. The system will then very likely disrupt owing to the large kickFootnote 6 and mass loss associated with this SN. We know this not only from the aforementioned fact that \(>99\%\) of normal pulsars are not in binary systems, but also from the observation that the few surviving normal pulsar–massive MS star systems (e.g., Johnston et al. 1992; Lyne et al. 2015) have very high orbital eccentricities (generally \(e \sim 0.9\)), indicating a near disruption.

Eventually, the pulsar might cease emitting radio waves. The secondary will also evolve and become a giant star. Its large size will cause several effects, the first being tidal circularisation of the orbit. When the secondary fills its Roche lobe, the transfer of matter to the NS starts, and the system becomes an X-ray binary. The transfer of matter can become unstable, in which case the system might go through a common envelope phase. At this stage, the NS is slowly spun up, and its spin will become aligned with the orbital angular momentum. It is thought that during this process the magnetic flux density at the surface of the NS is ablated, becoming much smaller.

What happens next depends on the mass of the secondary:

  • If the secondary is light, then it will slowly evolve into a WD star; in this case the system will retain both the very low orbital eccentricity and the alignment between the NS spin and orbital angular momentum acquired during the recycling phase.

  • If it is massive enough, it will also go through a SN explosion; the system then either disrupts (a likely occurrence) or forms a double NS system (DNS, for a detailed discussion, see Tauris et al. 2017). In most cases we observe only the recycled pulsar because their NS companions are only observable as normal pulsars for a short amount of time. However, in one case (the PSR J0737−3039A/B system), we see both NSs as pulsars: as predicted by the laws of stellar evolution, one of them (the first-formed NS) is a recycled pulsar (PSR J0737−3039A, with \(P = 22.7\,\hbox{ms}\), Burgay et al. 2003) and the second-formed NS (PSR J0737−3039B, with \(P=2.77\,\hbox{s}\)) is a normal pulsar (Lyne et al. 2004).

After the demise of the companion, the primary is now spinning fast and emitting radio waves again—this is why it is known as a recycled pulsar.

Fully recycled pulsars (the MSPs) appear overwhelmingly in the lower left corner of Fig. 2, with spin periods of a few ms. Most of them have low-mass companions. The pulsars in DNSs have spin periods that are one order of magnitude larger. The reason is the time they spend as X-ray binaries: this is much shorter for the NSs with high-mass companions, because the latter evolve much faster.

The high risk of disruption makes the DNSs relatively rare (\(<30\) known, representing less than 10% of the binary pulsar population and less than 1% of the total pulsar population). These systems are necessarily eccentric because of the mass loss and the kick associated with the second SN.Footnote 7 Because of this kick, the orbital plane after the SN might become very different from what it was before the explosion, causing a misalignment between the spin of the recycled pulsar and post-SN orbital angular momentum.

The energy loss due to GW damping in compact DNSs, observed for the first time in PSR B1913+16 (see Sect. 4.1), has an inevitable consequence: every year the pulsar and companion come about 3.5 m closer, and in 300 million years (a short time compared to the age of the Universe) this system will coalesce in a NS–NS merger, producing extreme amounts of GW emission. This discovery meant that future ground-based observatories would have a secure source of GWs. This was repeatedly confirmed by the discovery of DNSs with even smaller coalescence times in our Galaxy (Burgay et al. 2003; Cameron et al. 2018; Stovall et al. 2018). The observation by Advanced LIGO and Virgo of precisely such a NS–NS merger event in August 2017, GW170817 (Abbott et al. 2017a), and its EM counterpart (Abbott et al. 2017b) in the galaxy NGC 4993 not only confirmed beyond doubt the connection between (a certain class of) gamma-ray bursts and the collision of two NSs, but also represented the fulfilment of the promise brought by the discovery of PSR B1913+16.

3 Timing and relativistic effects in binary pulsars

Most (but not all) of the applications discussed below rely on a technique where pulsars excel, timing. This technique yields highly precise measurements of many astrophysical parameters, and is more precise (and therefore powerful) for recycled, fast-spinning pulsars.

In this technique, we use the ToAs to infer fundamental characteristics of the pulsar. This is done using a timing programme: by far, the most commonly used programmes are tempo (Nice et al. 2015), tempo2 (Hobbs et al. 2006; Edwards et al. 2006) and “PINT is not tempo” (pint, Luo et al. 2021). These programmes use the pulsar ephemeris and Eq. (3) to calculate predictions for ToAs (\(T_{i,p}\)) that are within less than half a spin period from the observed ToAs \(T_i\). The programme determines the optimal timing parameters by minimising the sum of the squares of the normalised residuals:

$$\begin{aligned} \chi ^2 = \sum _{i=1}^{n} \left( \frac{T_i - T_{i, p}}{\sigma _i} \right) ^2, \end{aligned}$$
(5)

where n is the number of ToAs and \(\sigma _i\) is the measurement uncertainty of \(T_i\). If all residuals are small compared to the spin period and if the normalised residuals have a normal distribution around zero, this is evidence that the ephemeris is precise enough to determine, without ambiguity, the number of rotations between any two ToAs, i.e., it is a phase-coherent ephemeris. Without unambiguous knowledge of these rotation numbers, no reliable ephemerides can be derived. In a good timing model, the value of \(\chi ^2\) is similar to the number of degrees of freedom in the fit, which is the number of ToAs minus the number of parameters being fitted.
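As a minimal sketch of Eq. (5) and of the comparison with the number of degrees of freedom, the following Python fragment evaluates \(\chi ^2\) for a handful of ToAs. The ToAs, predictions and uncertainties are hypothetical numbers chosen only for illustration, and the function names are ours; this is not how tempo, tempo2 or pint are implemented.

```python
def chi_squared(observed_toas, predicted_toas, uncertainties):
    """Sum of the squared normalised residuals, as in Eq. (5)."""
    return sum(((t - tp) / sigma) ** 2
               for t, tp, sigma in zip(observed_toas, predicted_toas, uncertainties))


def reduced_chi_squared(chi2, n_toas, n_fitted_parameters):
    """chi^2 divided by the number of degrees of freedom of the fit."""
    return chi2 / (n_toas - n_fitted_parameters)


# Hypothetical ToAs, model predictions and 1-sigma uncertainties (all in seconds).
toas = [0.0000012, 10.0000005, 20.0000021, 30.0000008, 40.0000030]
predicted = [0.0000000, 10.0000015, 20.0000010, 30.0000020, 40.0000020]
sigmas = [1.0e-6] * 5

chi2 = chi_squared(toas, predicted, sigmas)
# Assume two fitted parameters (e.g. the spin frequency and its derivative).
chi2_red = reduced_chi_squared(chi2, len(toas), 2)
print(chi2, chi2_red)
```

A reduced \(\chi ^2\) close to 1 is what the text describes as a good timing model; values much larger than 1 suggest unmodelled effects or underestimated ToA uncertainties.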

As can be deduced from Eq. (3), any small error in \(\nu \) will produce a linearly increasing residual, and a small error in \({\dot{\nu }}\) will produce a quadratically increasing residual. Thus, by adjusting \(\nu \) and \({\dot{\nu }}\), the timing programme can make any such trends in the residuals disappear.

3.1 Timing isolated pulsars

As discussed above, Eq. (3) is calculated assuming a time (T) in the reference frame of the pulsar. However, as discussed below, we do not know the line-of-sight velocity of the pulsar, so T is not directly accessible to us. Nonetheless, a suitably Doppler-shifted version of this time can be calculated as (Taylor 1994; Edwards et al. 2006):

$$\begin{aligned} T' = \underbrace{\tau _{\rm obs} - \Delta _{{\rm E}, \odot }}_{t_{\rm obs}} - \, t_0 - \Delta _{{\rm R}, \odot } - \Delta _{{\rm S}, \odot } - \Delta t_{\rm dis}(F), \end{aligned}$$
(6)

where \(t_{\rm obs}\) is the coordinate time (in TCB) of the ToA, \(t_0\) is a reference epoch (in TCB), \(\Delta t_{\rm dis}(F)\) is given by Eq. (2) and the multiple \(\Delta _\odot \) terms correspond to three different delays:

  1. \(\Delta _{{\rm E}, \odot }\), the Einstein delay, represents the time part of the full four-dimensional transformation between the SSB coordinate time \(t_{\rm obs}\) and the proper time of the observer \(\tau _{\rm obs}\), along the observer’s world-line. To leading order, it is a result of the special-relativistic time dilation due to the motion of the observer in the SSB and the gravitational redshift caused by the masses in the Solar System (Soffel et al. 2003; Edwards et al. 2006). (Note that there are different sign conventions for this term in the literature.) This delay is independent of the direction to the source.

  2. \(\Delta _{{\rm R}, \odot }\), the “Rømer” delay, corresponds (in the Newtonian limit) to the geometric light-propagation delay. It is the projection, along the direction of the incoming wave from the pulsar (unit vector \(\hat{{\textbf{n}}}\)), of the position vector of the receiving radio telescope in the frame of the SSB at time \(t_{\rm obs}\). This position vector is the sum of two (coordinate-position) vectors: the first is the position vector of the radio telescope relative to the Earth’s centre; to calculate it, accurate coordinates for the radio telescope are necessary, together with accurate information on the orientation of the Earth. The second, the position vector of the Earth’s centre relative to the SSB, is calculated using a Solar System ephemeris. Finally, a direction to the pulsar (Right Ascension \(\alpha \) and Declination \(\delta \), unit vector \(\hat{{\textbf{K}}}_0 = -\hat{{\textbf{n}}}\)) must be assumed in order for the timing programme to calculate the projection correctly.

  3. \(\Delta _{{\rm S}, \odot }\), the Shapiro delay (Shapiro 1964), is a relativistic light propagation delay caused by the curvature of the Solar System’s spacetime. This has been measured extensively with the aid of planetary radar and space probes moving in the Solar System (e.g., Bertotti et al. 2003). Like \(\Delta _{{\rm R}, \odot }\) it is also direction-dependent and can be calculated precisely once \(\alpha \), \(\delta \) are specified.

Small errors in \(\alpha \), \(\delta \) will produce a nearly sinusoidal 1-year trend in the residuals, which indicates an incorrect subtraction of Earth’s motion. By adjusting \(\alpha \) and \(\delta \), the timing programme can make such trends disappear. When a growing sinusoid in the residuals is observed, then additional linear variations of these terms (\({\dot{\alpha }}\), \({\dot{\delta }}\)) can be fitted, from which we can derive the proper motion of the pulsar (\(\mu _{\alpha } = {\dot{\alpha }} \cos \delta \), \(\mu _{\delta } = {\dot{\delta }}\)). Finally, if a pulsar is close enough to the Solar System and its timing precision is good enough, then the parallax (\(\varpi = 1/d[\rm{au}]\), where d is the distance to the pulsar measured in astronomical units) can also be measured from the timing. These “astrometric” parameters of the pulsar (\(\alpha ,\delta ,\mu _{\alpha },\mu _{\delta },\varpi \)) can also be measured with very long baseline interferometry (VLBI, see examples in Deller et al. 2013; Ding et al. 2021, 2023; Kramer et al. 2021).
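The annual trend caused by a position error can be illustrated with a toy calculation. The sketch below assumes a circular, 1-au Earth orbit in the ecliptic plane centred on the SSB, ignores the telescope's position on the Earth, and works in ecliptic coordinates; all of these are simplifying assumptions of ours, and the pulsar direction and the 1 mas offset are hypothetical. A real timing programme uses a Solar System ephemeris instead.

```python
import math

AU_LIGHT_S = 149597870700.0 / 299792458.0   # light travel time across 1 au (s)
YEAR_D = 365.25                             # orbital period of the toy Earth (days)
MAS = math.radians(1.0 / 3.6e6)             # one milliarcsecond in radians


def unit_vector(lon, lat):
    """Unit vector towards ecliptic longitude lon and latitude lat (radians)."""
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def earth_position(t_days):
    """Toy Earth position in light-seconds: circular orbit in the ecliptic plane."""
    phase = 2.0 * math.pi * t_days / YEAR_D
    return (AU_LIGHT_S * math.cos(phase), AU_LIGHT_S * math.sin(phase), 0.0)


def projected_delay(t_days, lon, lat):
    """Projection of the Earth's position onto the assumed pulsar direction (s)."""
    r = earth_position(t_days)
    k = unit_vector(lon, lat)
    return sum(ri * ki for ri, ki in zip(r, k))


# Hypothetical pulsar direction and a 1 mas error in the assumed longitude.
lon, lat = math.radians(60.0), math.radians(30.0)
offset = 1.0 * MAS

days = range(0, 731, 5)   # two years, sampled every 5 days
residuals = [projected_delay(t, lon + offset, lat) - projected_delay(t, lon, lat)
             for t in days]
amplitude = max(abs(r) for r in residuals)
print(amplitude)          # amplitude of the annual sinusoid, in seconds
```

In this toy setup the amplitude is close to \((1\,\rm{au}/c)\cos \beta \,\delta \lambda \) for ecliptic latitude \(\beta \) and longitude error \(\delta \lambda \), i.e. of order microseconds for a milliarcsecond error, which is why precise timing is so sensitive to the assumed position.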

3.2 Timing binary pulsars

In Fig. 3, we depict the more complex (and more interesting) situation that occurs when the pulsar itself is in a binary system. In this case, Eq. (6) becomes (where we ignore the constant factor between T and \(T'\)):

$$\begin{aligned} T = \tau _{\rm obs} - \Delta _{{\rm E},\odot } - t_0 - \Delta _{{\rm R},\odot } - \Delta _{{\rm S},\odot } - \Delta t_{\rm dis}(F) - \Delta _{\rm R} - \Delta _{\rm S} - \Delta _{\rm E} - \Delta _{\rm A}, \end{aligned}$$
(7)

where the four last terms are, respectively, the Rømer, Shapiro, Einstein and aberration delays of the binary. These delays are described in detail in the following section. We now make a few general considerations about them:

  • Any quantities with units of mass, length or time (for instance, the delays \(\Delta _{\rm{ R}}\), \(\Delta _{\rm{ S}}\), \(\Delta _{\rm{ E}}\), \(\Delta _{\rm{A}}\), the spin period of the pulsar, \(P_{\rm{int}}\)) measured in the reference frame of its centre of mass (CM) are related to the same quantity measured at the SSB (\(P_{\rm{obs}}\)) by \(P_{\rm{int}} = D P_{\rm{obs}}\), where D is the special-relativistic Doppler factor (Damour and Deruelle 1986). To leading order, \(P_{\rm{obs}}\) is given by (Damour and Taylor 1992):

    $$\begin{aligned} P_{\rm{obs}} = D^{-1} P_{\rm{int}} \simeq \left( 1 + \frac{{\textbf{v}}_{\rm{CM}} \cdot \hat{{\textbf{K}}}_0}{c}\right) P_{\rm{int}}, \end{aligned}$$
    (8)

    where \({\textbf{v}}_{\rm{CM}}\) is the velocity of the CM of the binary relative to the SSB (its “systemic” velocity). However, because \(P_{\rm{int}}\) is unknown, we cannot calculate the systemic radial velocity \(v_{r, \rm{CM}} = {\textbf{v}}_{\rm{CM}} \cdot \hat{{\textbf{K}}}_0\) from \(P_{\rm{obs}}\).Footnote 8

    For the tests described below the lack of knowledge of \(v_{r, \rm{CM}}\) is not a problem, precisely because all physical quantities re-scale coherently with D (Damour and Deruelle 1986; Damour and Taylor 1992). However, a relative acceleration between the pulsar system and the SSB leads to (apparent) secular changes in the dimensional parameters of the binary pulsar system, as discussed in detail in Sect. 3.5.1.

  • Pulsar timing or Doppler measurements only provide information for the motion along the line of sight. For most binary pulsars, the transverse component of the orbital motion is not measurable by any other means like VLBIFootnote 9—only the transverse component of the systemic velocity can be measured with VLBI because its effects increase linearly with time.

  • The measurement of the binary-related delays in Eq. (7) allows a much more precise measurement of the pulsar’s orbital motion than is possible using Doppler shift techniques. Although Doppler measurements are possible for binary pulsars—and indeed, quite useful for the initial measurement of its orbital parameters—their precision is limited by the length of the individual observations. Once a phase-coherent ephemeris is obtained, the precision of the measurement of the orbital motion increases by factors of thousands to millions (larger factors for wider orbits, where the Doppler shifts are smaller and the delays are larger)—while using the same radio data. This is especially relevant for the detection of small relativistic effects, and is the main reason why binary pulsars can be excellent laboratories for testing gravity theories.

Fig. 3 Spacetime perspective of timing observations of a binary pulsar. Depicted are the world lines of the pulsar (purple) and the observer (blue). A photon is emitted by the pulsar at the pulsar’s proper time \(\tau _{\rm psr}\) (which is proportional to T in Eq. 7) and arrives at the observer at the observer’s proper time \(\tau _{\rm obs}\). For simplicity, we have assumed that the barycentres of the binary and the Solar System are at rest with respect to each other. \(\Sigma _t\) depicts a hyper-surface of constant coordinate time t (corresponding to TCB)

3.3 Binary models

At the time of the discovery of the Hulse–Taylor pulsar no timing models for binary pulsars were available, for the simple reason that no binary pulsars had been found before. After this discovery, the development of such models became vital in order to materialise the precision of the measurement of the orbital motion promised by the discovery of a binary pulsar.

One of the first models to appear was developed by Blandford and Teukolsky (1976), henceforth BT76. This model describes the observed orbital motion as a Keplerian motion with small relativistic perturbations. This “quasi-Newtonian” model was used in the first phase-coherent ephemeris for PSR B1913+16 (Taylor et al. 1976). In their Table 1, we see that some orbital parameters derived from the timing analysis are already two orders of magnitude more precise than the same parameters estimated from the Doppler analysis; for the spin period the improvement is 6 orders of magnitude.

Table 1 List of Keplerian and PK parameters of the DD86 model

A more complete timing formula was published in Damour and Deruelle (1986), henceforth DD86. This is based on their elegant analytical, quasi-Keplerian solution to the first PN equations of motion of a binary system in harmonic coordinates (Damour and Deruelle 1985).

In both models, the orbital motion along the radial direction is described by five Keplerian parameters. In addition, there are a few PK parameters that quantify relatively small relativistic deviations from a Keplerian orbit. All are listed in Table 1. Some notes on these parameters:

  • There are two additional Keplerian parameters that we generally cannot determine with measurements of the radial motion alone. The first is the orbital inclination i (but see Sect. 3.4). Since

    $$\begin{aligned} x = \frac{a_{\rm{p}}}{c}\,\sin i, \end{aligned}$$
    (9)

    the lack of knowledge of i and \(\sin i\) implies that the semi-major axis of the pulsar’s orbit, \(a_{\rm{p}}\), is also unknown. However, changes in the orbital plane w.r.t. the plane of the sky (which is perpendicular to the line of sight and passes through the CM of the system) can cause a change in i, which is detectable as a change in x (\({\dot{x}}\), see Sects. 3.5.2 and 3.5.3).

  • The second unknown Keplerian parameter is the PA of the ascending node, \(\Omega \). The ascending node is the point in the orbit where the pulsar, while moving away from us, crosses the plane of the sky. A change in \(\Omega \) represents a rotation of the orbital plane of the system around the line of sight, which for most cases leaves the timing practically unchanged.

  • Since \(\omega \) is the angle periastron–CM–ascending node, it is (like i) generally affected by a change of the orbital plane w.r.t. the plane of the sky. Moreover, the change of the position of the nodes caused by the proper motion causes a change of the ascending node (and of \(\omega \)) that depends on \(\Omega \) (Sect. 3.5.2).

  • Unlike \(P_{\rm{b}}\), x and e, the parameters \(\omega \), i, \(\Omega \) and \(T_0\) have no astrophysical meaning, merely specifying the orientation of the system and a particular occurrence of the time of periastron as seen by the observer.

A major advantage of these models is that they describe the orbital motion for, at least, any fully conservative, boost-invariant gravity theories that have a Newtonian limit. This means that the PK parameters can be understood as phenomenological parameters which can be measured in a theory-independent manner, providing a parameterised PK (PPK) framework (see Damour and Deruelle 1986; Damour and Taylor 1992 for more details).

We now discuss the DD86 timing formula. For each time T, there are two auxiliary parameters that must be calculated. The first one is the eccentric anomaly, u. To relate u to T, we use an analogue of Kepler’s equation, which must be solved numerically:

$$\begin{aligned} u - e \sin u = 2\pi \left[ \left( {\frac{T-T_0}{P_{\rm{b}}}} \right) - {\frac{\dot{P}_{\rm{b}}}{2}} \left( {\frac{T-T_0}{P_{\rm{b}}}} \right) ^2 \right] . \end{aligned}$$
(10)

We now calculate \(\omega \) at time T. To do this, one needs a second auxiliary parameter, the true anomaly, \(A_{e}(u)\); this is the angle pulsar–CM–periastron:

$$\begin{aligned} A_e(u) = 2 \arctan \left[ \left( {\frac{1+e}{1-e}}\right) ^{1/2} \tan {\frac{u}{2}} \right] , \end{aligned}$$
(11)

from which \(\omega \) can be calculated directly via a proportionality constant k:

$$\begin{aligned} \omega = \omega _0 + k A_e(u), \end{aligned}$$
(12)

where \(\omega _0\) is the reference value of \(\omega \) at time \(T_0\). The time-averaged periastron advance \(\langle {\dot{\omega }}\rangle \) is thus \(2 \pi k\) radians for each orbital period \(P_{\rm{b}}\):

$$\begin{aligned} \langle {\dot{\omega }}\rangle = \frac{2 \pi }{P_{\rm{b}}} \, k. \end{aligned}$$
(13)

Equation (12) is the main difference between the DD86 and the BT76 orbital models: in the latter, \(\omega \) changes linearly with time, like x and e:

$$\begin{aligned} \omega = \omega _0 + {\dot{\omega }}(T - T_0). \end{aligned}$$
(14)

Equation (12) describes the observed evolution of \(\omega \) in binary pulsars significantly better than Eq. (14) (Damour and Taylor 1992); for this reason the use of the DD86 model is to be preferred. The meaning of \({\dot{\omega }}\) in the latter model, and in the discussions below, is really as a scaled version of k according to Eq. (13).

With u and \(A_e(u)\), plus the Keplerian and PK parameters, we can calculate the time delays associated with the binary orbit in Eq. (7).
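The chain of Eqs. (10)–(13) can be sketched numerically as follows. This is a minimal illustration, not the implementation of any timing package: the Newton–Raphson solver, its starting guess and tolerance are our choices, and the true anomaly is written in an algebraically equivalent form of Eq. (11) that stays continuous across orbits, so that Eq. (12) accumulates \(2\pi k\) per orbit. The orbital parameters in the example are hypothetical.

```python
import math


def eccentric_anomaly(T, T0, Pb, e, Pb_dot=0.0, tol=1e-12, max_iter=50):
    """Solve Eq. (10) for u by Newton-Raphson iteration (times in seconds)."""
    orbits = (T - T0) / Pb
    rhs = 2.0 * math.pi * (orbits - 0.5 * Pb_dot * orbits ** 2)
    u = rhs + e * math.sin(rhs)               # starting guess
    for _ in range(max_iter):
        step = (u - e * math.sin(u) - rhs) / (1.0 - e * math.cos(u))
        u -= step
        if abs(step) < tol * max(1.0, abs(u)):
            break
    return u


def true_anomaly(u, e):
    """A_e(u) of Eq. (11), in a form that is continuous in u over many orbits."""
    beta = e / (1.0 + math.sqrt(1.0 - e * e))
    return u + 2.0 * math.atan2(beta * math.sin(u), 1.0 - beta * math.cos(u))


def periastron_longitude(u, e, omega0, k):
    """Eq. (12): omega = omega_0 + k A_e(u)."""
    return omega0 + k * true_anomaly(u, e)


def k_from_mean_omega_dot(omega_dot, Pb):
    """Eq. (13) inverted: k = <omega_dot> Pb / (2 pi), omega_dot in rad/s."""
    return omega_dot * Pb / (2.0 * math.pi)


# Hypothetical binary: Pb in seconds, omega_dot of 4 deg/yr, 10.3 orbits after T0.
Pb, e, omega0 = 27906.98, 0.6171, math.radians(290.0)
k = k_from_mean_omega_dot(math.radians(4.0) / (365.25 * 86400.0), Pb)
u = eccentric_anomaly(10.3 * Pb, 0.0, Pb, e, Pb_dot=-2.4e-12)
print(u, true_anomaly(u, e), periastron_longitude(u, e, omega0, k))
```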

3.3.1 Rømer delay

In the DD86 model, the Rømer delay is given by the following equation:

$$\begin{aligned} \Delta _{\rm{R}} = x \sin \omega \big [\cos u -e(1+\delta _r)\big ] + x \cos \omega \big [1-e^2(1+\delta _{\theta })^2\big ]^{1/2} \sin u. \end{aligned}$$
(15)

In the BT76 model, neither \(\delta _r\) nor \(\delta _\theta \) are taken into account. This is the second main difference between the BT76 and DD86 models.
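A direct transcription of Eq. (15) makes the difference between the two models explicit: setting \(\delta _r = \delta _\theta = 0\) gives the BT76-like form. The function name and the numerical values below are illustrative assumptions, not values from the text.

```python
import math


def roemer_delay(u, x, e, omega, delta_r=0.0, delta_theta=0.0):
    """Binary Roemer delay of Eq. (15); x in seconds, angles in radians.

    With delta_r = delta_theta = 0 the expression has no PK deformation terms.
    """
    radial = math.cos(u) - e * (1.0 + delta_r)
    tangential = math.sqrt(1.0 - e * e * (1.0 + delta_theta) ** 2) * math.sin(u)
    return x * math.sin(omega) * radial + x * math.cos(omega) * tangential


# Hypothetical eccentric binary; delta_r and delta_theta of order 1e-6.
x, e, omega = 2.34, 0.617, math.radians(290.0)
u = 1.0
with_deformations = roemer_delay(u, x, e, omega, delta_r=4e-6, delta_theta=5e-6)
without_deformations = roemer_delay(u, x, e, omega)
print(with_deformations - without_deformations)   # seconds
```

For these assumed values the difference is of order microseconds, which indicates why \(\delta _r\) and \(\delta _\theta \) matter only at very high timing precision.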

3.3.2 Shapiro delay

In a binary pulsar, the Shapiro delay is caused solely by the gravitational field of the companion. To leading order it is given by:

$$\begin{aligned} \Delta _{\rm{S}} = -2 \, r \ln \left\{ 1-e\cos u - s \left[ \sin \omega (\cos u - e) + (1-e^2)^{1/2} \cos \omega \sin u\right] \right\} , \end{aligned}$$
(16)

which is calculated assuming a static configuration (Blandford and Teukolsky 1976; Damour and Deruelle 1986). The next-to-leading order term caused by the motion of the companion during the propagation of the signal (Kopeikin and Schäfer 1999) has to be taken into account in the case of the Double Pulsar (Kramer et al. 2021).

Freire and Wex (2010) introduced an “orthometric” parameterisation for the Shapiro delay based on its Fourier expansion. The orthometric amplitude \(h_3\) and the orthometric ratio \(\varsigma \) replace r and s via the relations

$$\begin{aligned} \varsigma = \frac{s}{1 + \sqrt{1 - s^2}}, \quad h_3 = r\,\varsigma ^3. \end{aligned}$$
(17)

This parameterisation is particularly advantageous for systems where i is far from edge-on. In such cases the Shapiro delay is much smaller and r and s are strongly correlated, while \(h_3\) and \(\varsigma \) are not.
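Equations (16) and (17) can be sketched as below. The inverse map from \((h_3, \varsigma )\) back to \((r, s)\) is not given in the text; it follows algebraically from Eq. (17), since \(\varsigma = s/(1+\sqrt{1-s^2})\) implies \(s = 2\varsigma /(1+\varsigma ^2)\) for \(i \le 90^\circ \). The function names and the example companion mass and inclination are assumptions made for illustration.

```python
import math

T_SUN = 4.925490947641e-6   # nominal Solar mass parameter in time units (s)


def shapiro_delay(u, r, s, e, omega):
    """Leading-order binary Shapiro delay of Eq. (16), in the units of r."""
    geometry = (math.sin(omega) * (math.cos(u) - e)
                + math.sqrt(1.0 - e * e) * math.cos(omega) * math.sin(u))
    return -2.0 * r * math.log(1.0 - e * math.cos(u) - s * geometry)


def orthometric_from_rs(r, s):
    """Eq. (17): returns (h3, varsigma)."""
    varsigma = s / (1.0 + math.sqrt(1.0 - s * s))
    return r * varsigma ** 3, varsigma


def rs_from_orthometric(h3, varsigma):
    """Algebraic inverse of Eq. (17): returns (r, s)."""
    return h3 / varsigma ** 3, 2.0 * varsigma / (1.0 + varsigma ** 2)


# Hypothetical circular, nearly edge-on binary with a 1.25 Solar-mass companion.
r, s = T_SUN * 1.25, 0.9999
h3, varsigma = orthometric_from_rs(r, s)
# For e = 0 and omega = 0 the delay peaks at u = pi/2 (superior conjunction).
peak = shapiro_delay(math.pi / 2.0, r, s, 0.0, 0.0)
print(h3, varsigma, peak)
```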

3.3.3 Einstein delay

Unlike in the Solar System, the Einstein delay represents only the variable part of the relativistic time dilation experienced by the pulsar—to leading order this can be understood as the result of the second order Doppler effect due to the motion of the pulsar in the centre-of-mass frame and the gravitational redshift caused by the companion—as seen by distant observers. It is given by:

$$\begin{aligned} \Delta _{\rm{E}} = \gamma \sin u, \end{aligned}$$
(18)

where we see that, unlike \(\Delta _{\rm{R}}\) and \(\Delta _{\rm{S}}\), this term is independent of the orientation of the binary relative to the observers (\(\omega \), i, \(\Omega \)). Note that this effect can only be measured in a timing baseline where \(\omega \) changes significantly. If that is not the case, then the effect of \(\gamma \) is re-absorbed in a small change from x and \(\omega \) to \(x'\) and \(\omega '\) (e.g., Wex et al. 1998):

$$\begin{aligned} x'&= x + \frac{\gamma }{\sqrt{1 - e^2}} \, \cos \omega , \end{aligned}$$
(19)
$$\begin{aligned} \omega '&= \omega - \frac{\gamma /x}{\sqrt{1 - e^2}} \, \sin \omega , \end{aligned}$$
(20)

where we ignored \(\delta _{\theta }\), which is generally of order \(10^{-6}\). \(x'\) and \(\omega '\) are the parameters in DD86 ephemerides of eccentric binary pulsars for which \(\gamma \) is neither measured nor assumed.
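The re-absorption described by Eqs. (19) and (20) can be checked numerically: at fixed \(\omega \), a Rømer delay evaluated with \((x', \omega ')\) should mimic the sum of the Rømer and Einstein delays evaluated with \((x, \omega )\), up to terms of second order in \(\gamma /x\). The sketch below uses Eq. (15) with \(\delta _r = \delta _\theta = 0\) and hypothetical parameter values; the function names are ours.

```python
import math


def roemer_delay(u, x, e, omega):
    """Eq. (15) with delta_r = delta_theta = 0."""
    return (x * math.sin(omega) * (math.cos(u) - e)
            + x * math.cos(omega) * math.sqrt(1.0 - e * e) * math.sin(u))


def einstein_delay(u, gamma):
    """Eq. (18)."""
    return gamma * math.sin(u)


def absorbed_parameters(x, omega, e, gamma):
    """Eqs. (19) and (20): the x' and omega' that absorb an unmodelled gamma."""
    g = gamma / math.sqrt(1.0 - e * e)
    return x + g * math.cos(omega), omega - (g / x) * math.sin(omega)


# Hypothetical values: x and gamma in seconds, omega held fixed.
x, e, omega, gamma = 2.34, 0.617, math.radians(40.0), 4.3e-3
x_prime, omega_prime = absorbed_parameters(x, omega, e, gamma)

us = [2.0 * math.pi * j / 200 for j in range(200)]
max_mismatch = max(abs(roemer_delay(u, x, e, omega) + einstein_delay(u, gamma)
                       - roemer_delay(u, x_prime, e, omega_prime))
                   for u in us)
print(max_mismatch, gamma)   # the mismatch is far smaller than gamma itself
```

The small remaining mismatch illustrates why \(\gamma \) is only separable from x and \(\omega \) once \(\omega \) has changed significantly over the timing baseline.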

3.3.4 Aberration delay

As the pulsar moves in its orbit, one needs to account for aberration, i.e., the velocity-dependent transformation between the co-moving frame of the pulsar and the centre-of-mass frame of the binary for the propagation of the radio signal (Smarr and Blandford 1976; Damour and Deruelle 1986). This causes, at any given time, small changes to the direction of the emission beam. For the emission to be directed to Earth, more (or less) time, called the aberration delay, must be allowed for the rotation of the pulsar to compensate for these small changes. This delay would be zero if the pulsar were an actual pulsating, non-rotating source. In most cases, this effect is very small and is absorbed by re-definitions of x, e, \(\delta _{\theta }\) and \(\delta _r\) (Damour and Deruelle 1986; Damour and Taylor 1992). However, if the spin-orientation of the pulsar changes due to geodetic precession (Sect. 3.5.3), then aberration becomes time-dependent and needs to be properly accounted for (Damour and Taylor 1992).

In the case of the Double Pulsar, additional changes to the direction of emission of the beam of PSR J0737−3039A originate from light deflection near PSR J0737−3039B when the latter is passing near the line of sight; its effect on the timing cannot be absorbed by a re-definition of the Keplerian or PK parameters. The observed magnitude of the effect (Kramer et al. 2021) matches the detailed GR calculations presented there and in the references therein if PSR J0737−3039A is rotating in the same sense as the orbital motion (see details in Sect. 4.2).

3.3.5 Gravitational radiation and the orbital decay

For the more compact double-degenerate binaries, the variation of the orbital period (\({\dot{P}}_{\rm{b}}\)) used in Eq. (10) is dominated by the orbital decay caused by GW damping, but has other contributions (Sect. 3.5.1). While the orbit shrinks due to the loss of energy through GWs (an effect that has not yet been measured directly), the orbital period shortens according to Kepler’s third law (see Eq. 21 below). As a result, one gets a quadratic effect in the evolution of the orbital phase (see Eq. 10) which leads to a prominent effect in the ToAs that builds up quickly with time.

3.4 Keplerian and post-Keplerian parameters in GR

3.4.1 The mass ratio and the mass function

If we denote the semi-major axis of the relative orbit (the “separation” of the binary, a) in time units (\({\bar{a}} \equiv a/c\)), we can write Kepler’s third law (to Newtonian order) as:

$$\begin{aligned} \left( \frac{P_{\rm{b}}}{2 \pi } \right) ^2 = \frac{{\bar{a}}^3}{(T_{\odot }M)}, \end{aligned}$$
(21)

where \(T_{\odot }\equiv (\mathcal{G}\mathcal{M})_{\odot }^{\rm{N}}/c^3 = 4.925490947641\dots \,\upmu \hbox{s}\) is an exact quantity, the nominal Solar mass parameterFootnote 10 in time units and M is the (dimensionless) total mass of the system expressed in units of the nominal Solar mass, which we denote by \({\rm{M}}_{\odot }\) in this review (see Prša et al. 2016 for details, who in their Table 2 suggest the symbol \({\mathcal{M}}_\odot ^{\rm N}\)). The reason for this is that the precision of the total mass in mass units (like grams),

$$\begin{aligned} m = (T_{\odot }M) \, \frac{c^3}{G}, \end{aligned}$$
(22)

is limited by the comparably low precision of Newton’s gravitational constant G, while M is tied to the precision of the measurements of \(P_{\rm{b}}\) and \({\bar{a}}\) in Eq. (21), which can be much higher.Footnote 11 From Eq. (21), we obtain

$$\begin{aligned} {\bar{a}} = \left( \frac{P_{\rm{b}}}{2 \pi } \right) ^{2/3} (T_{\odot }M)^{1/3}, \end{aligned}$$
(23)

where we see that if M is known, \({\bar{a}}\) can be estimated independently of the orbital inclination. Furthermore,

$$\begin{aligned} \beta _{\rm{O}} \equiv \left( \frac{P_{\rm{b}}}{2 \pi } \right) ^{-1/3} (T_{\odot }M)^{1/3} \end{aligned}$$
(24)

is the relative orbital velocity, divided by c, for a circular binary of separation a. This quantifies how relativistic the orbit is and will appear implicitly in many equations below.
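Equations (23) and (24) are simple to evaluate. The sketch below uses rounded, illustrative inputs loosely resembling a compact DNS such as PSR B1913+16 (an orbital period of about 7.75 h and a total mass of about 2.83 nominal Solar masses); these numbers are assumptions for the example rather than values quoted in this section.

```python
import math

T_SUN = 4.925490947641e-6   # nominal Solar mass parameter in time units (s)
C = 299792458.0             # speed of light (m/s)


def separation_in_time_units(Pb, M):
    """Eq. (23): semi-major axis of the relative orbit divided by c (s)."""
    return (Pb / (2.0 * math.pi)) ** (2.0 / 3.0) * (T_SUN * M) ** (1.0 / 3.0)


def beta_orbital(Pb, M):
    """Eq. (24): relative orbital velocity over c for a circular orbit."""
    return (Pb / (2.0 * math.pi)) ** (-1.0 / 3.0) * (T_SUN * M) ** (1.0 / 3.0)


Pb, M = 27906.98, 2.828     # assumed: Pb in seconds, M in nominal Solar masses
a_bar = separation_in_time_units(Pb, M)
beta_o = beta_orbital(Pb, M)
print(a_bar, a_bar * C, beta_o)   # seconds, metres, dimensionless
```

For these inputs the separation is a few light-seconds and \(\beta _{\rm{O}}\) is of order \(10^{-3}\), which gives a feeling for how relativistic such orbits are.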

Since \(a_{\rm{p}}\) is the semi-major axis of the pulsar’s orbit, we obtain for the companion \(a_{\rm{c}} = a - a_{\rm{p}}\). In this case, the ratio of the mass of the pulsar and the mass of the companion is given by

$$\begin{aligned} R \equiv \frac{M_{\rm{p}}}{M_{\rm{c}}} = \frac{a_{\rm{c}}}{a_{\rm{p}}} = \frac{x_{\rm{c}}}{x}, \end{aligned}$$
(25)

where \(x_{\rm{c}} = (a_{\rm{c}}/c) \sin i\). Since \(M = M_{\rm{p}} + M_{\rm{c}}\), we obtain \(a = a_{\rm{p}} (M / M_{\rm{c}})\). From this expression and Eqs. (9) and (21), one derives the (Newtonian) mass function equation:

$$\begin{aligned} f(M_{\rm{p}}, M_{\rm{c}}, i) = \frac{(M_{\rm{c}} \sin i)^3}{(M_{\rm{p}} + M_{\rm{c}})^2} = \frac{x^3}{T_{\odot }} \left( \frac{P_{\rm{b}}}{2 \pi } \right) ^{-2} , \end{aligned}$$
(26)

which has on the right all the relevant information available to an observer that cannot measure the additional mass constraints described below. Although in this case we have a single equation for three unknowns, if one assumes a value for \(M_{\rm{p}}\) and \(i = 90^\circ \) we can immediately derive a minimum value for \(M_{\rm{c}}\) (for that \(M_{\rm{p}}\)) by solving Eq. (26) numerically. This is the only available mass constraint for \(\sim 80 \%\) of binary pulsars.
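The minimum companion mass can be obtained from Eq. (26) with a simple root finder. The sketch below uses bisection, which works because the left-hand side of Eq. (26) increases monotonically with \(M_{\rm{c}}\); the bracket, the tolerance and the example values of x, \(P_{\rm{b}}\) and \(M_{\rm{p}}\) are our assumptions.

```python
import math

T_SUN = 4.925490947641e-6   # nominal Solar mass parameter in time units (s)


def mass_function(x, Pb):
    """Right-hand side of Eq. (26), in nominal Solar masses (x and Pb in s)."""
    return x ** 3 / T_SUN * (Pb / (2.0 * math.pi)) ** -2


def companion_mass(f, Mp, sin_i=1.0, lo=1e-6, hi=1e3, tol=1e-12):
    """Solve (Mc sin i)^3 / (Mp + Mc)^2 = f for Mc by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (mid * sin_i) ** 3 / (Mp + mid) ** 2 > f:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Assumed inputs: x = 2.3418 s, Pb = 27906.98 s, and a guessed pulsar mass of 1.4.
f = mass_function(2.3418, 27906.98)
mc_min = companion_mass(f, 1.4)                 # i = 90 deg gives the minimum
mc_60 = companion_mass(f, 1.4, sin_i=math.sin(math.radians(60.0)))
print(f, mc_min, mc_60)
```

Lower inclinations require larger companion masses for the same mass function, which is why the edge-on case yields the minimum.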

To measure the component masses, two additional mass constraints are necessary. Besides measuring PK parameters (see next section), there are two mass constraints of relevance for the systems discussed below (Sects. 4.2, 4.4.1, 4.4.2):

  1. Measurement of R: If the companion is sufficiently bright in the optical band, then phase-resolved spectroscopy can yield many measurements of \(v_r\). These are the sum of \(v_{r, \rm{CM}}\) and the orbital component of \(v_r\), which is generally parameterised by optical spectroscopists using \(P_{\rm{b}}\), \(T_0\), e, \(\omega _c = \omega + \pi \) and the orbital velocity amplitude, \(K_{\rm{c}}\). The latter relates to \(x_{\rm{c}}\) via:

    $$\begin{aligned} x_{\rm{c}} = \frac{K_{\rm{c}}}{c} \frac{P_{\rm{b}}}{2\pi } \sqrt{1 - e^2}. \end{aligned}$$
    (27)

    In the case of the Double Pulsar, \(x_c\) is available directly from the radio timing of PSR J0737−3039B.

  2. Measurement of \(M_{\rm{c}}\): For systems with optically bright WD companions, the width of the Balmer lines allows the determination of the surface gravity. This can then be combined with the mass-radius relation for WDs to determine the WD mass and the radius (see e.g., Antoniadis et al. 2012, 2013; Liu et al. 2020).

These constraints are very rarely available, and in the case of \(M_{\rm{c}}\) they depend on several assumptions and detailed modelling. Including the first PN corrections of the Damour and Deruelle solution, the simple relation \(a = a_{\rm{p}} + a_{\rm{c}}\) and in particular Eq. (25) still hold, not only in GR but for any Lorentz-invariant gravity theory (Damour and Deruelle 1985; see also the detailed discussion in Damour 2009). The spectroscopic determination of \(M_{\rm{c}}\), in the case of a WD companion, is performed in the well-tested Newtonian limit. For this reason, these constraints are of particular importance for tests of alternatives to GR (see Sect. 5.5).

Finally, it should be noted that the Newtonian expression (26) has so far been sufficient for all binary pulsars with the exception of the Double Pulsar, for which first PN corrections to Eq. (26) are necessary (more in Sect. 4.2).

3.4.2 Post-Keplerian parameters

In GR, to leading order, the PK parameters depend simply on the well-measured, astrophysically meaningful Keplerian parameters (\(P_{\rm{b}}\), e and in one case x) and the masses \(M_{\rm{p}}\) and \(M_{\rm{c}}\); they are given by (Damour and Deruelle 1986; Taylor and Weisberg 1989; Damour and Taylor 1992):

$$\begin{aligned} {\dot{\omega }}&= 3 \left( \frac{P_{\rm{b}}}{2\pi }\right) ^{-5/3} (T_{\odot }M)^{2/3} \, (1-e^2)^{-1}, \end{aligned}$$
(28)
$$\begin{aligned} \gamma&= \left( \frac{P_{\rm{b}}}{2\pi }\right) ^{1/3} (T_{\odot }M)^{2/3}\,X_{\rm{c}} (1 + X_{\rm{c}}) \, e, \end{aligned}$$
(29)
$$\begin{aligned} \dot{P}_{\rm{b}}&= -\frac{192\pi }{5} \left( \frac{P_{\rm{b}}}{2\pi }\right) ^{-5/3} (T_{\odot }M)^{5/3}\, X_{\rm{p}} X_{\rm{c}}\, \frac{\left( 1 + \frac{73}{24} e^2 + \frac{37}{96} e^4 \right) }{(1-e^2)^{7/2}}, \end{aligned}$$
(30)
$$\begin{aligned} r&= (T_{\odot }M) X_{\rm{c}} = T_{\odot }M_{\rm{c}}, \end{aligned}$$
(31)
$$\begin{aligned} s&= x \left( \frac{P_{\rm{b}}}{2\pi }\right) ^{-2/3} (T_{\odot }M)^{-1/3}\,X_{\rm{c}}^{-1}, \end{aligned}$$
(32)
$$\begin{aligned} \delta _\theta&= \left( \frac{P_{\rm{b}}}{2\pi }\right) ^{-2/3} (T_{\odot }M)^{2/3}\,\left( \frac{7}{2}\,X_{\rm{p}}^2 + 6\,X_{\rm{p}}\,X_{\rm{c}} + 2\,X_{\rm{c}}^2\right) , \end{aligned}$$
(33)
$$\begin{aligned} \delta _r&= \left( \frac{P_{\rm{b}}}{2\pi }\right) ^{-2/3} (T_{\odot }M)^{2/3}\,\left( 3\,X_{\rm{p}}^2 + 6\,X_{\rm p}\,X_{\rm c} + 2\,X_{\rm c}^2\right) , \end{aligned}$$
(34)

where \(s\equiv \sin i\), \(X_{\rm p} = M_{\rm p}/M\) and \(X_{\rm c} = M_{\rm c}/M\). Equation (17) can then be used to derive the corresponding values of \(h_3\) and \(\varsigma \).
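Equations (28)–(34) translate directly into code. The following sketch evaluates them for rounded, illustrative inputs loosely resembling PSR B1913+16; the numerical values of \(P_{\rm{b}}\), e, x and the masses are assumptions made for this example, and the function and key names are ours.

```python
import math

T_SUN = 4.925490947641e-6   # nominal Solar mass parameter in time units (s)


def pk_parameters_gr(Pb, e, x, Mp, Mc):
    """Leading-order PK parameters in GR, Eqs. (28)-(34).

    Pb and x in seconds, masses in nominal Solar masses.
    omega_dot is returned in rad/s, gamma and r in seconds.
    """
    M = Mp + Mc
    Xp, Xc = Mp / M, Mc / M
    n_inv = Pb / (2.0 * math.pi)
    TM = T_SUN * M
    ecc_factor = ((1.0 + 73.0 / 24.0 * e ** 2 + 37.0 / 96.0 * e ** 4)
                  / (1.0 - e * e) ** 3.5)
    scale = n_inv ** (-2.0 / 3.0) * TM ** (2.0 / 3.0)
    return {
        "omega_dot": 3.0 * n_inv ** (-5.0 / 3.0) * TM ** (2.0 / 3.0) / (1.0 - e * e),
        "gamma": n_inv ** (1.0 / 3.0) * TM ** (2.0 / 3.0) * Xc * (1.0 + Xc) * e,
        "Pb_dot": (-192.0 * math.pi / 5.0 * n_inv ** (-5.0 / 3.0)
                   * TM ** (5.0 / 3.0) * Xp * Xc * ecc_factor),
        "r": T_SUN * Mc,
        "s": x * n_inv ** (-2.0 / 3.0) * TM ** (-1.0 / 3.0) / Xc,
        "delta_theta": scale * (3.5 * Xp ** 2 + 6.0 * Xp * Xc + 2.0 * Xc ** 2),
        "delta_r": scale * (3.0 * Xp ** 2 + 6.0 * Xp * Xc + 2.0 * Xc ** 2),
    }


# Assumed, rounded inputs for an eccentric DNS.
pk = pk_parameters_gr(Pb=27906.98, e=0.6171, x=2.3418, Mp=1.438, Mc=1.390)
omega_dot_deg_yr = math.degrees(pk["omega_dot"]) * 365.25 * 86400.0
print(omega_dot_deg_yr, pk["gamma"], pk["Pb_dot"], pk["s"])
```

For these inputs the periastron advance is a few degrees per year, \(\gamma \) is a few milliseconds and \({\dot{P}}_{\rm{b}}\) is of order \(10^{-12}\), the orders of magnitude relevant for compact eccentric DNSs.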

This “PK simplicity” is a consequence of a fundamental property of GR, the effacement of the internal structure of a self-gravitating mass (Damour 1987; Will 2018). Consequently, the motion of bodies in an N-body system is independent of their internal structure.Footnote 12 The PK simplicity has the following implications:

  • The measurement of two PK parameters leads to a determination of both masses. If neither of these PK parameters is s or \(\varsigma \) (which yields \(\sin i\) directly), we can then use Eq. (32) (a re-write of Eq. 26) to determine \(\sin i\) from the masses.

  • By measuring additional PK parameters, we can test the validity of GR through the self-consistency of these equations. This is the main method used in Sect. 4.

  • This also means that additional NS properties, like their radius, tidal deformability or moment of inertia I, cannot be measured with this technique, at least with only the leading order terms in the PK equations above. This would imply that the only way pulsar timing can constrain the EoS is via the measurement of large NS masses (see e.g., Fonseca et al. 2021). Measurements of other NS bulk parameters must therefore come from other types of measurements: radii come from X-ray observations, in some cases aided by radio measurements (Özel and Freire 2016; Miller et al. 2021; Vinciguerra et al. 2024) and tidal deformabilities come from the observation of GWs from a DNS merger (Abbott et al. 2018).
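The first two implications can be sketched as a short calculation: Eq. (28) is inverted for the total mass, Eq. (29) then gives the companion mass, and the \({\dot{P}}_{\rm{b}}\) predicted by Eq. (30) is compared with a third, independent measurement. The "observed" values below are rounded, illustrative inputs loosely resembling PSR B1913+16 and are assumptions of this example (in particular, the observed \({\dot{P}}_{\rm{b}}\) is taken to be already corrected for the effects of Sect. 3.5.1); the function names are ours.

```python
import math

T_SUN = 4.925490947641e-6                             # s
DEG_PER_YR = math.radians(1.0) / (365.25 * 86400.0)   # rad/s per (deg/yr)


def total_mass_from_omega_dot(omega_dot, Pb, e):
    """Invert Eq. (28) for M (omega_dot in rad/s, Pb in s)."""
    n_inv = Pb / (2.0 * math.pi)
    return (omega_dot * (1.0 - e * e) / 3.0) ** 1.5 * n_inv ** 2.5 / T_SUN


def companion_mass_from_gamma(gamma, Pb, e, M):
    """Invert Eq. (29): solve Xc (1 + Xc) = q for Xc, then return Mc = Xc M."""
    n_inv = Pb / (2.0 * math.pi)
    q = gamma / (n_inv ** (1.0 / 3.0) * (T_SUN * M) ** (2.0 / 3.0) * e)
    return 0.5 * (math.sqrt(1.0 + 4.0 * q) - 1.0) * M


def pb_dot_gr(Pb, e, Mp, Mc):
    """Eq. (30)."""
    M = Mp + Mc
    n_inv = Pb / (2.0 * math.pi)
    ecc_factor = ((1.0 + 73.0 / 24.0 * e ** 2 + 37.0 / 96.0 * e ** 4)
                  / (1.0 - e * e) ** 3.5)
    return (-192.0 * math.pi / 5.0 * n_inv ** (-5.0 / 3.0)
            * (T_SUN * M) ** (5.0 / 3.0) * (Mp / M) * (Mc / M) * ecc_factor)


# Assumed "measurements" (illustrative, rounded).
Pb, e = 27906.98, 0.6171
omega_dot_obs = 4.2266 * DEG_PER_YR
gamma_obs = 4.307e-3
pb_dot_obs = -2.40e-12

M = total_mass_from_omega_dot(omega_dot_obs, Pb, e)
Mc = companion_mass_from_gamma(gamma_obs, Pb, e, M)
Mp = M - Mc
ratio = pb_dot_obs / pb_dot_gr(Pb, e, Mp, Mc)
print(Mp, Mc, ratio)   # a ratio consistent with 1 means GR passes this test
```

In a real analysis the comparison is made within the measurement uncertainties of all three PK parameters, which are not modelled in this sketch.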

However, for the Double Pulsar (Sect. 4.2) higher order contributions to Eq. (28) become relevant for the description of the timing data, specifically the second PN correction \({{\dot{\omega }}}_{\rm 2pN}\) as well as the Lense–Thirring (LT) contribution \({{\dot{\omega }}}_{\rm LT}\), which results from the coupling between the orbital motion and the rotation of pulsar A (relativistic spin-orbit coupling). More generally, with relativistic spin-orbit coupling, the spins of the two binary components enter the equations of motion, which leads to an additional spin (magnitude and orientation) and thus \(I_{\rm p}\) and \(I_{\rm c}\) dependence of some PK parameters (cf. discussion in Sect. 3.5.3).

The PK simplicity is captured by the DDGR timing model, a version of the DD86 model that assumes the validity of GR (and thus of Eqs. 28–34): this model describes the timing using only the measurable Keplerian parameters, M and \(M_{\rm c}\) (Damour and Deruelle 1986; Taylor and Weisberg 1989). This theory-dependent model is very useful for the many applications in which GR is assumed to be the correct theory of gravity.

Other theories of gravity, such as those with one or more scalar fields in addition to a tensor field, have different mass dependencies for the PK parameters. In addition, there is (unlike in the case of GR) also a dependence on the internal structure of the bodies, meaning the PK parameters depend on the EoS of the pulsar and the EoS of the companion, which, for instance, is very different for a WD companion. Some specific examples of such theories will be discussed in Sect. 5.5 below.

3.5 “Contaminations” of the post-Keplerian parameters

In this sub-section, we do not aim at an exhaustive listing of terms that can affect the measurement of the PK parameters; we give only a short discussion of the effects relevant for the systems described in Sect. 4.

3.5.1 Variation of the orbital period

Re-writing Eq. (8) for \(P_{\rm b}\), and then differentiating it, we obtain:

$$\begin{aligned} \left( \frac{{\dot{P}}_{\rm b}}{P_{\rm b}} \right) _{\rm int} = \left( \frac{{\dot{P}}_{\rm b}}{P_{\rm b}} \right) _{\rm obs} - \frac{{\textbf{a}}_{\rm CM} \cdot \hat{{\textbf{K}}}_0}{c} - \frac{\mu ^2 d}{c}, \end{aligned}$$
(35)

where \({\textbf{a}}_{\rm CM}\) is the (small) acceleration of the CM of the pulsar binary relative to the SSB (caused by the gravitational field of the Milky Way) and \(\mu = \sqrt{\mu _{\alpha }^2 + \mu _\delta ^2}\) is the magnitude of the proper motion. The same relations apply to any other derivative of a time-like quantity of the binary system, like \({\dot{P}}\) and \({\dot{x}}\).

The term on the left is what we generally want to measure in most binary pulsars: the intrinsic variation of the orbital period, caused by orbital decay due to GW damping. Additional intrinsic terms are possible if the system is, for instance, losing mass, but for most DNSs the mass loss term given by Eq. (4) (with \({\dot{m}}_{\rm p} = {\dot{E}}_{\rm rot}/c^2\)) (Damour and Taylor 1991) can be safely ignored, except in the case of the extreme precision achieved for the Double Pulsar (Kramer et al. 2021).

The precision of the first term on the right generally improves quickly with additional observations (Sect. 3.3.5).

The second term is generally dominated by the difference of Galactic accelerations of the SSB and the pulsar (Damour and Taylor 1991); this can be estimated from the position of the pulsar in the Galaxy (which requires an accurate distance) and a model of the Galactic potential (e.g., McMillan 2017). For many pulsars, additional contributions come from the gravitational fields of globular clusters and in a few cases of nearby stars.

The third term is known as the Shklovskii effect (Shklovskii 1970); this again requires knowledge of d and \(\mu \) (the latter is generally easier to measure from the timing, especially for pulsars with good timing precision and stability).

Thus, lack of precise knowledge of the second and third terms on the right (the “extrinsic” terms) limits the precision of the measurement of \({\dot{P}}_{\rm{ b, int}}\). The situation can be improved with precise measurements of d and more accurate Galactic potential models.
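As a rough illustration of how the extrinsic terms of Eq. (35) might be evaluated, the following Python sketch computes the Shklovskii term and subtracts both extrinsic terms from an observed \({\dot{P}}_{\rm b}\). The function names, the unit choices (mas yr\(^{-1}\), kpc, SI) and the example values are assumptions made for illustration; the line-of-sight acceleration has to be supplied from a Galactic potential model, which is not implemented here.

```python
import math

C_SI = 299_792_458.0                 # speed of light, m/s
KPC_M = 3.0856775814913673e19        # one kiloparsec in metres
# Conversion of a proper motion from mas/yr to rad/s (Julian year).
MAS_YR_TO_RAD_S = math.radians(1.0 / 3.6e6) / (365.25 * 86400.0)


def shklovskii_rate(mu_mas_yr: float, d_kpc: float) -> float:
    """Third term on the right of Eq. (35), mu^2 d / c, in 1/s."""
    mu = mu_mas_yr * MAS_YR_TO_RAD_S
    return mu * mu * d_kpc * KPC_M / C_SI


def intrinsic_pbdot(pbdot_obs: float, pb_s: float, a_los: float,
                    mu_mas_yr: float, d_kpc: float) -> float:
    """Eq. (35) solved for the intrinsic orbital period derivative.

    a_los is the line-of-sight CM acceleration (a_CM . K_0) in m/s^2,
    taken here as a user-supplied number (assumption of this sketch).
    """
    extrinsic = a_los / C_SI + shklovskii_rate(mu_mas_yr, d_kpc)
    return pbdot_obs - pb_s * extrinsic


# Hypothetical pulsar: mu = 1 mas/yr at d = 1 kpc.
example_rate = shklovskii_rate(1.0, 1.0)
print(f"Shklovskii term: {example_rate:.3e} 1/s")
```

For \(\mu = 1\,\hbox{mas yr}^{-1}\) and \(d = 1\) kpc the Shklovskii term is about \(2.43 \times 10^{-21}\,\hbox{s}^{-1}\); it grows linearly with d and quadratically with \(\mu \), which is why an imprecise distance propagates directly into \({\dot{P}}_{\rm{b, int}}\).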

3.5.2 Other effects of the proper motion

Differentiating Eq. (9) in time and then dividing the result by that equation, we obtain, in the absence of a change in \(a_{\rm p}\) (for instance by GW damping):

$$\begin{aligned} \frac{{\dot{x}}}{x} = \cot i \, \frac{\text{d}i}{\text{d}t}, \end{aligned}$$
(36)

where the variation of the inclination \(\text{d}i/\text{d}t\) can be due to a change in our viewing angle of the binary (this section) or an intrinsic change of the orbital plane (Sect. 3.5.3). For nearly edge-on systems (\(i \approx 90^{\circ }\)), \(\cot {i}\ll 1\) and \(\text{d}i/\text{d}t\) is much more difficult to measure.

The proper motion of the system causes a change in our viewing angle of the binary, which can change i as seen from Earth. This was first calculated by Arzoumanian et al. (1996) and, in greater detail, by Kopeikin (1996). The latter’s expressions can be re-written as:

$$\begin{aligned} \left( \frac{\text{d}i}{ \text{d} t}\right) _\mu = \mu \, \sin (\Theta _{\mu } - \Omega ), \end{aligned}$$
(37)

where \(\Theta _{\mu }\) is the PA of the proper motion. Like any change of the viewing angle, the proper motion also produces a change in the line of nodes, and therefore a change in \(\omega \):

$$\begin{aligned} {\dot{\omega }}_{\mu } = \frac{\mu }{ \sin i} \, \cos (\Theta _{\mu } - \Omega ). \end{aligned}$$
(38)

These equations are valid in any right-handed coordinate system, like that described in Damour and Taylor (1992) (where PAs are measured clockwise starting from East through North, where a positive spin along the line of sight points away from us) or the “observer’s convention” (PAs measured anti-clockwise from North through East, where a positive spin along the line of sight points towards us) that is implemented in most timing programs (Edwards et al. 2006).

Normally, a measurement of the Shapiro delay and \(\text{d}i/\text{d}t\) allows four “degenerate” solutions for i and \(\Omega \). However, the orbital motion of the Earth also changes the viewing angle of the binary pulsar, leading to yearly variations of x and \(\omega \), known as the “annual orbital parallax” (Kopeikin 1995). These are measurable in the timing of some nearby binary pulsars, especially wider systems with excellent timing precision (for early examples, see van Straten et al. 2001; Splaver et al. 2005; Zhu et al. 2019; Stovall et al. 2019, see also Guo et al. 2021); these measurements allow the determination of i and \(\Omega \). The \({\dot{\omega }}_\mu \) term can introduce significant corrections to the total mass derived from \({\dot{\omega }}\) (Stovall et al. 2019).

One way of taking all these effects into account in a self-consistent way is to introduce the orbital orientation (i, \(\Omega \)) and all its effects into the timing model, which are then fit like all other parameters, as done in the “T2” orbital model implemented in tempo2 (Edwards et al. 2006).
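Eqs. (36)–(38) are simple enough to evaluate directly. The Python sketch below does so for a given orbital orientation; the function name and the convention of passing angles in degrees are assumptions of this example, and the returned rates carry whatever time unit is used for \(\mu \).

```python
import math


def proper_motion_effects(mu: float, incl_deg: float,
                          theta_mu_deg: float, omega_asc_deg: float):
    """Secular effects of the proper motion on i, x and omega.

    mu            : proper motion magnitude (angle per unit time, radians)
    incl_deg      : orbital inclination i
    theta_mu_deg  : position angle of the proper motion
    omega_asc_deg : position angle of the ascending node (Omega)
    Returns (di/dt, xdot/x, omdot_mu) in the time unit of mu.
    """
    incl = math.radians(incl_deg)
    dphi = math.radians(theta_mu_deg - omega_asc_deg)
    di_dt = mu * math.sin(dphi)                      # Eq. (37)
    xdot_over_x = di_dt / math.tan(incl)             # Eq. (36)
    omdot_mu = mu / math.sin(incl) * math.cos(dphi)  # Eq. (38)
    return di_dt, xdot_over_x, omdot_mu


# Hypothetical geometry: proper motion perpendicular to the line of nodes.
print(proper_motion_effects(1.0, 60.0, 90.0, 0.0))
```

In this hypothetical geometry the full proper motion goes into \(\text{d}i/\text{d}t\) and none into \({\dot{\omega }}_\mu \); rotating the orbit by \(90^\circ \) on the sky swaps the two roles.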

3.5.3 Relativistic spin-orbit coupling

A fundamental property of a curved spacetime is that the spin (proper angular momentum) of a freely-falling rotating body changes its direction with respect to a distant observer. This is certainly the case for the spin of pulsars moving in an orbit with a companion. The geodetic precession is the leading order contribution to such a spin precession which results from the relativistic spin-orbit coupling (Barker and O’Connell 1975). As a consequence, the spin of the pulsar precesses around the conserved total angular momentum of the system at a rate \(\Omega ^{\rm geod}_{\rm p}\) that depends, like the PK parameters in Eqs. (28)–(34), simply on the masses and the Keplerian parameters (Barker and O’Connell 1975; Börner et al. 1975; Damour 1978)Footnote 13:

$$\begin{aligned} \Omega ^{\rm geod}_{\rm p}&= \frac{1}{2} \left( \frac{P_{\rm b}}{2 \pi } \right) ^{-5/3} (T_{\odot }M)^{2/3} (1-e^2)^{-1} \, \left( 3 + X_{\rm p}\right) X_{\rm c} \nonumber \\&= \frac{1}{6} (3 + X_{\rm p}) X_{\rm c} \, {\dot{\omega }}. \end{aligned}$$
(39)

This “simplicity” means that a measurement of \(\Omega ^{\rm geod}_{\rm p}\) can be used as a test of GR. If \(M_{\rm c} = M_{\rm p}\), then \(X_{\rm c} = X_{\rm p} = 1/2\) and \(\Omega ^{\rm geod}_{\rm p} = \frac{7}{24}\,{\dot{\omega }} \simeq 0.29\,{\dot{\omega }}\).
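The second line of Eq. (39) lends itself to a quick numerical check. The following Python sketch evaluates it; the function name is an assumption of this example, and the input numbers are the rounded values quoted for the Double Pulsar in Sect. 4.2.

```python
def geodetic_rate(omdot: float, m_body: float, m_other: float) -> float:
    """Second line of Eq. (39).

    Returns the geodetic precession rate of the body with mass m_body,
    in the same units as omdot. Only the mass ratio matters, so the
    masses can be given in any common unit.
    """
    total = m_body + m_other
    x_p = m_body / total
    x_c = m_other / total
    return (3.0 + x_p) * x_c * omdot / 6.0


# Equal masses: the factor multiplying omdot is 7/24.
equal_mass_factor = geodetic_rate(1.0, 1.0, 1.0)

# Pulsar B of the Double Pulsar, using the rounded values of Sect. 4.2
# (omdot = 16.9 deg/yr, M_A = 1.338, M_B = 1.249).
rate_b = geodetic_rate(16.9, 1.249, 1.338)
print(f"factor = {equal_mass_factor:.4f}, Omega_B = {rate_b:.3f} deg/yr")
print(f"precession period of B = {360.0 / rate_b:.0f} yr")
```

With these rounded inputs the sketch gives a rate for pulsar B of about \(5.07^\circ \,\hbox{yr}^{-1}\), in line with the GR prediction quoted in Sect. 4.2.4.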

It is detectable mostly by the fact that, as the spin axis of the pulsar precesses, different parts of the pulsar’s emission beam become visible from Earth, thus causing a long-term change in the pulse profile. In some cases, the pulsar might even become completely undetectable from Earth, as in the case of pulsar B in the Double Pulsar system (Perera et al. 2010). It is generally difficult to convert the observed changes in the pulse profiles into a quantitative measurement of \(\Omega ^{\rm geod}_{\rm p}\), but several techniques can help, particularly if polarimetric measurements are available, since these can tell us about the changing angle between the spin axis and the line of sight. We will give various examples in Sect. 4 where geodetic precession has been not only observed but also quantified in binary pulsars, providing additional GR tests.

Given that the total angular momentum must be conserved (up to 2PN approximation), a change in the direction of the spin of a pulsar (and that of a rotating companion) must correspond to a change in the direction of the orbital angular momentum—the LT precession of the orbital plane. This causes an intrinsic change in i that is, when averaged over one orbit, given in GR by Damour and Schäfer (1988):

$$\begin{aligned} \left\langle \frac{\text{d}i}{\text{d}t}\right\rangle ^{\rm LT} = \frac{1}{2} \left( \frac{P_{\rm b}}{2 \pi } \right) ^{-2} (T_{\odot }M) \, (1-e^2)^{-3/2} \, \sum _{j = \rm{p,c}} (3 + X_j)X_j \, (\varvec{\chi }_j \cdot \hat{{\textbf{I}}}), \end{aligned}$$
(40)

where \(\hat{{\textbf{I}}}\) is a unit vector pointing from the CM of the system to the ascending node of the pulsar orbit and the dimensionless spin of body j is given by:

$$\begin{aligned} \varvec{\chi }_j \equiv \frac{c}{G m_j^2} \, {\textbf{S}}_j = \frac{G}{c^5}(T_{\odot }M_j)^{-2} \, {\textbf{S}}_j, \end{aligned}$$
(41)

where \({\textbf{S}}_j\) is its spin angular momentum (“spin”). If its spin frequency (\(\nu _j\)) is known, we can relate it to its spin magnitude \(S_j \equiv |{\textbf{S}}_j|\) via:

$$\begin{aligned} S_j = 2 \pi \nu _j \, I_j, \end{aligned}$$
(42)

where \(I_{\rm j}\) is, by definition, its MoI. For NSs and BHs, \(\chi \equiv |\varvec{\chi } |\) varies between 0 and the maximum value for an astrophysical BH, 0.998 (Lo and Lin 2011; Thorne 1974). For the fastest-spinning NSs, we obtain, for reasonable values of I, \(\chi \lesssim 0.5\).

In DNSs, where we can find misaligned spins and measure \(\left\langle \text{d}i / \text{d}t \right\rangle ^{\rm{LT}}\), the recycled pulsar is found to be in a range \(\chi \sim 0.01 \cdots 0.03\) (Stovall et al. 2018). This is generally much larger than the spin of the non-recycled pulsar, which means that only one spin in the system matters in Eq. (40). In the long term, i will then change periodically with a period of \(2 \pi / \Omega ^{\rm geod}_{\rm p}\). This is quite long for all DNSs discovered to date: the shortest, PSR J1946+2052 (Stovall et al. 2018), has a geodetic precession cycle of about 48 years (assuming equal masses).

The spin-orbit coupling also has an effect on \({\dot{\omega }}\): the LT contribution to the observed advance of periastron. The detailed expressions are given in Damour and Schäfer (1988); they simplify considerably in a DNS where only the spin of the recycled pulsar matters. In the case of the Double Pulsar, discussed in Sect. 4.2.2, the spin of the recycled pulsar (A) is nearly aligned with the orbital angular momentum (Ferdman et al. 2013), further simplifying the effect on \(\omega \), which becomes a “secular” contribution given by Kramer et al. (2021), Hu et al. (2022):

$$\begin{aligned} \langle {\dot{\omega }}\rangle ^{\rm LT} = -\left( \frac{P_{\rm b}}{2 \pi } \right) ^{-2} (T_{\odot }M) \, (1-e^2)^{-3/2} \, (3 + X_{\rm A}) X_{\rm A} \, \chi _{\rm A}, \end{aligned}$$
(43)

where the “A” subscripts indicate parameters of pulsar A.

Neither \(\langle \text{d}i/\text{d}t\rangle ^{\rm LT}\) nor \(\langle {\dot{\omega }}\rangle ^{\rm LT}\) due to a NS spin has been measured precisely for any binary pulsar,Footnote 14 owing to the fact that the effects are next-to-leading order and therefore very small. Nevertheless, the current upper limit on \(\langle {\dot{\omega }}\rangle ^{\rm LT}\) in the Double Pulsar system already introduces an independent and useful upper limit on the MoI for J0737−3039A (\(I_{\rm{A}} < 3.0 \times 10^{45}\,\hbox{g cm}^2\), 90% C.L., Kramer et al. 2021); improving this measurement remains an important goal because a precise estimate of I and \(M_{\rm p}\) for the same pulsar would represent a powerful constraint on the EoS (Lattimer and Schutz 2005; Hu et al. 2020; Hu and Freire 2024).
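To give a feeling for the size of the LT term, the Python sketch below chains Eqs. (41)–(43) for pulsar A of the Double Pulsar. The rounded orbital and spin parameters are those quoted in Sect. 4.2; the MoI value of \(1.3 \times 10^{45}\,\hbox{g cm}^2\) is a purely hypothetical choice below the upper limit quoted above, and the function names are assumptions of this example.

```python
import math

T_SUN = 4.925490947e-6      # G M_sun / c^3 in seconds
G_SI = 6.67430e-11          # m^3 kg^-1 s^-2
C_SI = 299_792_458.0        # m/s
RAD_S_TO_DEG_YR = math.degrees(1.0) * 365.25 * 86400.0


def dimensionless_spin(moi_si: float, spin_period_s: float,
                       mass_msun: float) -> float:
    """Eqs. (41) and (42): chi from the MoI (kg m^2), spin period and mass."""
    spin = 2.0 * math.pi / spin_period_s * moi_si            # Eq. (42)
    return G_SI / C_SI**5 * spin / (T_SUN * mass_msun) ** 2  # Eq. (41)


def omdot_lt_aligned(pb_s: float, ecc: float, m_a: float, m_b: float,
                     chi_a: float) -> float:
    """Eq. (43) in rad/s, for a spin aligned with the orbital momentum."""
    total = m_a + m_b
    x_a = m_a / total
    n_orb = 2.0 * math.pi / pb_s
    return (-n_orb**2 * T_SUN * total * (1.0 - ecc**2) ** -1.5
            * (3.0 + x_a) * x_a * chi_a)


# Hypothetical MoI: 1.3e45 g cm^2 = 1.3e38 kg m^2 (an assumption).
MOI_A_ASSUMED = 1.3e38
chi_a = dimensionless_spin(MOI_A_ASSUMED, 23e-3, 1.338)
omdot_lt = omdot_lt_aligned(2.5 * 3600.0, 0.089, 1.338, 1.249,
                            chi_a) * RAD_S_TO_DEG_YR
print(f"chi_A = {chi_a:.4f}, omdot_LT = {omdot_lt:.2e} deg/yr")
```

With these rounded inputs, \(\chi _{\rm A}\) comes out at roughly 0.02 and the LT contribution at a few times \(10^{-4}\,\hbox{deg yr}^{-1}\) with a negative sign. Because Eq. (43) is linear in \(\chi _{\rm A}\) and hence in \(I_{\rm A}\), a measured LT term translates directly into a MoI.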

3.5.4 Measurability of relativistic effects

The reason why only \(\sim \)20% of binary pulsars have one or more measured PK parametersFootnote 15 is that this requires several conditions which are often not fulfilled. A fundamental issue for all measurements, discussed in Sect. 2.3.5, is the intrinsic faintness of most pulsars and the resulting limits on the timing precision, which highlights the importance of sensitive radio telescopes for these experiments.

The measurement of \({\dot{\omega }}\) and \(\gamma \) requires a reasonably eccentric orbit; the measurement of \(\gamma \) generally requires, in addition, a large change in \(\omega \) during the timing baseline, otherwise, as discussed in Sect. 3.3.3, it is not separable from the Keplerian parameters (but see discussion in Ridolfi et al. 2019). The fractional uncertainty on both parameters decreases as \(\Delta T^{-3/2}\), where \(\Delta T\) is the timing baseline (see Table II of Damour and Taylor 1992), but in the case of \(\gamma \), the uncertainty decreases more slowly after a significant fraction of a precession cycle is completed.

The measurement of the Shapiro delay parameters (r, s or \(h_3\), \(\varsigma \)) is strongly favoured by nearly edge-on orbits and large companion masses. Because the Shapiro delay is a periodic effect, the fractional uncertainty of these parameters decreases slowly with time (as \(\Delta T^{-1/2}\)), so their measurement depends, more than other parameters, on timing precision. Measurements of this effect are also helped by an eccentric, precessing orbit, which helps separate it effectively from the Rømer delay; a large rate of precession might result in a faster improvement of the measurements, especially as the superior conjunction moves through periastron (see Sect. 4.1).

The measurement of \({\dot{P}}_{\rm b}\) requires relatively short orbital periods, because the contribution to \({\dot{P}}_{\rm b}\) from GW damping is proportional to \(P_{\rm b}^{-5/3}\), while the kinematic contribution described in Sect. 3.5.1 scales with \(P_{\rm b}\). The detection of \(\delta _\theta \) requires, as in the case of \(\gamma \), a large value of e, a large sweep in \(\omega \) and, more than even \({\dot{P}}_{\rm b}\), a short orbital period (see Table II of Damour and Taylor 1992); for \({\dot{P}}_{\rm b}\) and \(\delta _\theta \), the fractional uncertainty decreases as \(\Delta T^{-5/2}\), but as in the case of \(\gamma \), the uncertainty of \(\delta _\theta \) decreases more slowly once a significant fraction of a precession cycle is completed. In the case of the Double Pulsar, one of the two systems where there is a hint of a detection of \(\delta _\theta \) (Sects. 4.1, 4.2), this parameter is strongly correlated with \(\gamma \) (Kramer et al. 2021).
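The different power laws quoted above can be compared with a few lines of Python. This is a minimal sketch that assumes regular observations of constant timing precision and ignores the slower improvement of \(\gamma \) and \(\delta _\theta \) once a significant fraction of a precession cycle is covered; the dictionary keys and the function name are assumptions of this example.

```python
# Exponents of the fractional-uncertainty scaling with timing baseline,
# sigma ~ Delta T ** exponent, as quoted in this sub-section.
SCALING_EXPONENT = {
    "omdot": -1.5,
    "gamma": -1.5,
    "shapiro": -0.5,
    "pbdot": -2.5,
    "delta_theta": -2.5,
}


def uncertainty_ratio(parameter: str, baseline_factor: float) -> float:
    """Factor by which the fractional uncertainty changes when the
    timing baseline is multiplied by baseline_factor."""
    return baseline_factor ** SCALING_EXPONENT[parameter]


for name in SCALING_EXPONENT:
    improvement = 1.0 / uncertainty_ratio(name, 2.0)
    print(f"{name:12s} improves by a factor {improvement:.2f} "
          "when the baseline doubles")
```

Doubling the baseline thus improves \({\dot{P}}_{\rm b}\) by a factor of about 5.7 and \({\dot{\omega }}\) by about 2.8, but the Shapiro delay parameters by only about 1.4, which illustrates why the latter depend so strongly on timing precision.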

The measurement of the geodetic precession requires a pulsar spin that is sufficiently misaligned with the orbital angular momentum. Generally, this is only seen in DNS systems (see Sect. 2.4.2).Footnote 16 The measurement of \(\left\langle \text{d}i / \text{d}t \right\rangle ^{\rm LT}\) requires, in addition, that a misaligned NS has a sufficiently large spin. The measurement of \(\langle {\dot{\omega }}\rangle ^{\rm LT}\) does not require such a misalignment, but instead extremely precise mass measurements from other PK parameters (to calculate and subtract the non-LT contributions to \({{\dot{\omega }}}\) with sufficient precision, see Sect. 4.2.2).

Because of their high spin frequency, MSPs have the best timing precision; however, their nearly circular and generally wider orbits severely restrict the number of cases where \({\dot{\omega }}\) and especially \(\gamma \) can be measured. Furthermore, their (generally) low companion masses restrict the number of Shapiro delay measurements. The alignment of the spin with the orbital angular momentum (Sect. 2.4.2) means that geodetic precession cannot be measured in these systems. It is for these reasons that, despite their much smaller numbers, DNS systems have been so important for tests of gravity theories.

4 Strong-field GR tests with pulsars

To test GR in the presence of strong gravitational fields, i.e., in spacetimes that deviate (at least in some regions) significantly from Minkowski spacetime, binary pulsars provide some of the best and most precise experiments. Here, pulsars with compact companions, such as another NS or a WD, are of particular interest. In such binaries, the separation between the two bodies is still large compared to their size. This means that we can study the gravitational dynamics of two essentially point-like masses, without complicating contributions like tidal interaction. Pulsar-WD systems are of particular interest for testing alternative theories of gravity, due to their high asymmetry in gravitational self-energy.

In this section we will introduce pulsar tests of GR, before we then discuss tests of specific deviations from GR and alternative gravity theories in Sect. 5. A particularly clear way of illustrating these different GR tests is the mass–mass diagram: because of the PK simplicity (see Sect. 3.4), each measurement of a PK parameter leads to a constraint in the \(m_{\rm p}-m_{\rm c}\) parameter space. Hence, if all PK parameters agree on a common region in the \(m_{\rm p}-m_{\rm c}\) plane, GR has passed the test.
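The logic of intersecting two constraint curves can be sketched in a few lines of Python. The example intersects a line of constant total mass (the shape of the \({\dot{\omega }}\) constraint at leading order) with a line of constant mass ratio, using the rounded Double Pulsar values quoted in Sect. 4.2. Since the total mass is taken here as the sum of the two quoted masses, this is a consistency check of the quoted numbers rather than an independent mass determination; the function name is an assumption of this example.

```python
def masses_from_total_and_ratio(total: float, ratio: float):
    """Intersect the line m_p + m_c = total with the line m_p / m_c = ratio.

    Returns (m_p, m_c) in the units of `total`.
    """
    m_c = total / (1.0 + ratio)
    return ratio * m_c, m_c


# Rounded values from Sect. 4.2: M_A = 1.338, M_B = 1.249, R = 1.0714.
total_mass = 1.338 + 1.249
m_a, m_b = masses_from_total_and_ratio(total_mass, 1.0714)
print(f"M_A = {m_a:.3f}, M_B = {m_b:.3f}")
```

Any third constraint curve can then be tested by checking whether it passes through the resulting point within its uncertainty, which is what the mass–mass diagrams below show graphically.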

4.1 The original system: PSR B1913+16

It is certainly fair to say that the whole field of experimental gravity with radio pulsars started with the discovery of pulsar B1913+16 in July 1974 by Russell Hulse and Joseph Taylor (Hulse and Taylor 1975; Hulse 1994; Taylor 1994) using the Arecibo telescope. PSR B1913+16, now called the ‘Hulse–Taylor pulsar’, is a 59 ms pulsar in an eccentric (\(e=0.62\)) 7.75-h orbit with an unseen compact companion. Based on the deduced masses (assuming GR), with \(M_{\rm p} = 1.44\) and \(M_{\rm c} = 1.39\), and considering evolutionary scenarios (as discussed in Sect. 2.4), it soon became evident that the companion is almost certainly also a NS. Soon after the discovery a significant advance of periastron (\({{\dot{\omega }}}\)) of about \(4.2^\circ \) per year was detected (Taylor et al. 1976), and by 1978 the time dilation \(\gamma \) and a decay in the orbital period \({\dot{P}}_{\rm b}\) were measured (Taylor and McCulloch 1980; Taylor et al. 1979).
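How \({{\dot{\omega }}}\) alone fixes the total mass in GR can be illustrated numerically. Comparing the two lines of Eq. (39) gives \({\dot{\omega }} = 3\,(P_{\rm b}/2\pi )^{-5/3}(T_{\odot }M)^{2/3}(1-e^2)^{-1}\), which the Python sketch below inverts for M. The inputs are the rounded values quoted above, so the result is only good to a few per cent; the function name is an assumption of this example.

```python
import math

T_SUN = 4.925490947e-6   # G M_sun / c^3 in seconds


def total_mass_from_omdot(omdot_deg_yr: float, pb_s: float,
                          ecc: float) -> float:
    """Invert the leading-order GR expression for omdot.

    Returns the total mass M in solar masses.
    """
    omdot = math.radians(omdot_deg_yr) / (365.25 * 86400.0)  # rad/s
    return ((omdot * (1.0 - ecc**2) / 3.0) ** 1.5
            * (pb_s / (2.0 * math.pi)) ** 2.5 / T_SUN)


# Rounded Hulse-Taylor values: 4.2 deg/yr, 7.75-h orbit, e = 0.62.
m_total = total_mass_from_omdot(4.2, 7.75 * 3600.0, 0.62)
print(f"M = {m_total:.2f} solar masses")
```

The result, close to 2.8, is consistent with the sum of the two masses quoted above once the rounding of the inputs is taken into account. M scales as \({\dot{\omega }}^{3/2} P_{\rm b}^{5/2}\), so small rounding errors in the inputs are amplified.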

The agreement of those three PK parameters with GR confirmed, for the first time, the existence of GWs as predicted by GR. More generally, it confirmed the validity of GR for the gravitational interaction between two strongly self-gravitating masses. Regular observations of this system over the ensuing decades have led to a steady improvement of that \({{\dot{\omega }}}\)-\(\gamma \)-\({\dot{P}}_{\rm b}\) test (mixture of quasi-stationary and radiative aspects) (Taylor and Weisberg 1982, 1989; Damour and Taylor 1991; Weisberg et al. 2010), and there is now even a detection of the Shapiro delay in that system (Weisberg and Huang 2016) (see also Fig. 4). Despite the increasing timing baseline and precision, the testing of the GW emission by the Hulse–Taylor system has stagnated at around 0.3% (95% C.L.) for some time. The reason for this is the imprecise knowledge of the system’s distance, which limits our ability to correct for the extrinsic contributions to the observed \({\dot{P}}_{\rm b}\) (cf. Sect. 3.5.1) (Damour and Taylor 1991; Weisberg et al. 2010; Weisberg and Huang 2016; Deller et al. 2018).

Fig. 4

Mass–mass diagram for the PSR B1913+16 system based on GR, using the PK parameters \({{\dot{\omega }}}\), \(\gamma \), intrinsic \({\dot{P}}_{\rm b}\), and the orthometric Shapiro parameters \(\varsigma \) and \(h_3\). All curves agree on a small common region in the mass–mass plane. Parameter values have been taken from Weisberg and Huang (2016). In their Fig. 4 they display r and s for the Shapiro delay. As is obvious from the comparison of the two figures, the \(h_3\)-test is much more stringent than the r-test, which is one of the key advantages of the orthometric parameterisation of Freire and Wex (2010). In this and all other mass–mass diagrams, the width of each curve represents the one-sigma (68.3% C.L.) uncertainty in the corresponding PK parameter. In addition, Weisberg and Huang (2016) report a \(\sim \)2-\(\sigma \) measurement of the PK parameter \(\delta _\theta \), which is not shown here

Soon after the discovery of the Hulse–Taylor pulsar, it was pointed out that a pulsar in a binary is subject to geodetic precession (see Sect. 3.5.3) and that the Hulse–Taylor pulsar is generally expected to show changes in its emission and polarisation geometry with a precession rate of about \(1.2^\circ \,\rm{yr}^{-1}\) (Damour and Ruffini 1974; Barker and O’Connell 1975; Dass and Radhakrishnan 1975; Börner et al. 1976; Damour and Taylor 1992). Such changes have indeed been observed, as the pulsar’s rotational axis is misaligned (\(\sim 21^\circ \), Graikou 2019) with respect to the orbital angular momentum (a result of the SN kick; see Sect. 2.4) (Weisberg et al. 1989; Kramer 1998; Weisberg and Taylor 2002). However, to date these observations could not be converted into a quantitative test of geodetic precession. Other systems turned out to be more useful for such a test, as we discuss below.

Finally, the discovery of the Hulse–Taylor pulsar was of great importance for the development of ground-based GW detectors, as explained in detail in Sect. 2.4.2.

4.2 The double pulsar system: PSR J0737−3039A/B

In 2003, a truly remarkable binary pulsar system was discovered in a Parkes survey of the Galactic anti-centre, the Double Pulsar PSR J0737−3039A/B, which to date is the only known binary system consisting of two active radio pulsars that are detectable from Earth (Burgay et al. 2003; Lyne et al. 2004). Pulsar A is a mildly recycled 23 ms pulsar which is in a mildly eccentric (\(e = 0.089\)) 2.5-h orbit with a non-recycled 2.8 s pulsar B. The system is significantly more relativistic than the Hulse–Taylor pulsar (Sect. 4.1) showing, e.g., an advance of periastron (\({{\dot{\omega }}}\)) of 16.9 degrees per year. In addition, the system is seen nearly edge on, which leads to a very prominent Shapiro delay of about \(130\,\upmu \hbox{s}\) in the timing of pulsar A near its superior conjunction (Lyne et al. 2004; Kramer et al. 2006, 2021). The masses are somewhat smaller than in the Hulse–Taylor pulsar, amounting to \(M_{\rm{A}} = 1.338\) and \(M_{\rm{B}} = 1.249\), when assuming GR (Kramer et al. 2006, 2021).

From timing pulsars A and B one obtains the projected semi-major axes of both of the pulsar orbits, \(x_{\rm{A}}\) and \(x_{\rm{B}}\); from this one directly obtains the mass ratio \(R = M_{\rm{A}}/M_{\rm{B}} = 1.0714 \pm 0.0011\) (Lyne et al. 2004; Kramer et al. 2006). The uncertainty in R is dominated by the uncertainty in \(x_{\rm{B}}\). Therefore, any improvement is currently limited by the fact that due to geodetic precession (\(5.07^\circ \,\hbox{yr}^{-1}\)) pulsar B’s emission moved out of our line of sight in 2008 (Perera et al. 2010). More model-based approaches, in terms of the emission geometry of pulsar B (which is strongly affected by the wind of pulsar A), can lead to a further reduction in the uncertainty of R (see e.g., Noutsos et al. 2020).

Fortunately, the (integrated) pulse profile of the fast rotating pulsar A has remained stable over the past years, due to the fact that A’s spin is closely aligned with the orbital angular momentum (Ferdman et al. 2013). As a consequence, all its timing parameters have improved significantly since its discovery in 2003, allowing for the measurement of a total of six(!) PK parameters, some of them known to many significant digits, and all of them in perfect agreement with GR (see Kramer et al. 2021 and Fig. 5). Furthermore, next-to-leading order (NLO) contributions of the advance of periastron, the Shapiro and the aberration delay had to be taken into account, in order to obtain a correct modelling and interpretation of these effects. This will be explained in more detail below. In the following we will have a closer look at four different predictions by GR tested in the Double Pulsar: GW damping, periastron advance, signal propagation, and geodetic precession.

Fig. 5

Mass–mass diagram for the Double Pulsar based on GR. The PK parameters \({{\dot{\omega }}}\), \({\dot{P}}_{\rm b}\), \(\gamma \), r, and s are from timing observations of pulsar A. The mass ratio R comes from the observed projected semi-major axes of A and B. The rate of geodetic precession of B (\(\Omega _{\rm B}^{\rm geod}\)) is the result of modelling the eclipses of A near superior conjunction. Timing of pulsar A also allows for a one-sigma detection of the PK parameter \(\delta _\theta \), which is not shown here. All curves agree on a small region of masses at the centre of the plot, meaning that GR provides a consistent description of the Double Pulsar system. The figure is based on the parameters and the analysis in Kramer et al. (2021). The values for \(\Omega _{\rm B}^{\rm geod}\) are taken from Lower et al. (2024)

4.2.1 Gravitational wave damping

Like the Hulse–Taylor pulsar (Sect. 4.1), the Double Pulsar shows a significant decrease in the orbital period due to the emission of GWs (PK parameter \({\dot{P}}_{\rm b}\)). As for the Hulse–Taylor pulsar, also here the observed \({\dot{P}}_{\rm b}\) is “contaminated” by external effects (cf. Sect. 3.5.1). Fortunately, the Double Pulsar is about ten times closer to us, which allows for a direct parallax measurement of its distance: \(735 \pm 60\) pc (Kramer et al. 2021). As a result, the extrinsic terms in Eq. (35) can be determined with sufficient precision so that they do not limit the GW test, unlike in the case of the Hulse–Taylor pulsar (Deller et al. 2009; Kramer et al. 2021). The intrinsic change of the orbital period obtained in this way (\({\dot{P}}_{\rm b}^{\rm{int}} = -1.247752(79) \times 10^{-12}\)) is in perfect agreement with the prediction by GR (Kramer et al. 2021):

$$\begin{aligned} {\dot{P}}_{\rm{b}}^{\rm{int}} / {\dot{P}}_{\rm{b}}^{\rm{GR}} = 0.999963(63), \end{aligned}$$
(44)

where the GR prediction \({\dot{P}}_{\rm b}^{\rm{GR}}\) is based on the masses \(M_{\rm A}\) and \(M_{\rm B}\), calculated from the PK parameters \({{\dot{\omega }}}\) and s.Footnote 17 Fig. 6 shows a comparison of the observed and the calculated shift in the time of periastron passage, as a result of the accelerated evolution in orbital phase due to the emission of GWs.Footnote 18 The result (44) is by far the most precise test of GR’s quadrupole formula for the leading-order emission of GWs by accelerated masses. Moreover it confirms the validity of the quadrupole formula in the presence of two strongly self-gravitating NSs, for which GR predicts the effacement of their internal structure (Damour 1987). The corresponding merger time of the system is just 86 Myr.
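The order of magnitude of the GR prediction can be reproduced with the standard leading-order quadrupole-formula expression for \({\dot{P}}_{\rm b}\), which is written out in the Python sketch below as recalled from the general literature rather than quoted from this section. The inputs are the rounded Double Pulsar values of Sect. 4.2, so agreement with the measured value is expected only at the level of a few per cent; the function name is an assumption of this example.

```python
import math

T_SUN = 4.925490947e-6   # G M_sun / c^3 in seconds


def pbdot_gr(pb_s: float, ecc: float, m_p: float, m_c: float) -> float:
    """Leading-order (quadrupole) GR prediction for the orbital decay.

    Masses in solar masses; the result is dimensionless (s/s).
    """
    n_t = 2.0 * math.pi / pb_s * T_SUN
    e2 = ecc * ecc
    # Eccentricity enhancement factor of the quadrupole formula.
    f_e = ((1.0 + 73.0 / 24.0 * e2 + 37.0 / 96.0 * e2 * e2)
           / (1.0 - e2) ** 3.5)
    return (-192.0 * math.pi / 5.0 * n_t ** (5.0 / 3.0) * f_e
            * m_p * m_c / (m_p + m_c) ** (1.0 / 3.0))


# Rounded inputs from Sect. 4.2: 2.5-h orbit, e = 0.089.
pbdot_pred = pbdot_gr(2.5 * 3600.0, 0.089, 1.338, 1.249)
ratio = -1.247752e-12 / pbdot_pred
print(f"pbdot_GR = {pbdot_pred:.3e}, measured/predicted = {ratio:.3f}")
```

The few-per-cent offset of this ratio from unity reflects only the rounding of \(P_{\rm b}\) and e in the inputs (the prediction scales as \(P_{\rm b}^{-5/3}\)); the precision of Eq. (44) requires the full timing solution.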

Fig. 6

Cumulative shift in the time of periastron passage (“GW damping parabola”) for the Double Pulsar. The red curve is the prediction by GR when using the pulsar masses calculated from the PK parameters \({{\dot{\omega }}}\) and s. The figure is based on the data and the analysis in Kramer et al. (2021)

4.2.2 Periastron advance

The advance of periastron (PK parameter \({{\dot{\omega }}}\)) is the most precisely measured PK parameter for the Double Pulsar and therefore, in principle, should give by far the narrowest constraint curve in the mass–mass plane of Fig. 5 (see also the inset in Fig. 13 of Kramer et al. 2021). At a fractional precision of about \(8 \times 10^{-7}\) for the observed \({{\dot{\omega }}}\), terms of the second PN (2PN) order (Damour and Schäfer 1988) must be considered to obtain the correct mass constraints, since the 2PN correction \({{\dot{\omega }}}_{\rm 2pN} = +4.39 \times 10^{-4}\,\hbox{deg yr}^{-1}\) is about 35 times larger than the measurement uncertainty of this PK parameter (Kramer et al. 2021). Furthermore, the LT contribution to \({{\dot{\omega }}}\) from the coupling between the orbital motion and the spin of pulsar A is comparable in magnitude to the 2PN correction (with a negative sign, since the spin of pulsar A is aligned with the orbital angular momentum) (Damour and Schäfer 1988; Königsdörffer and Gopakumar 2006; Iorio 2009; Hu et al. 2020).Footnote 19 While the 2PN contribution is completely determined for given masses (and Keplerian parameters \(P_{\rm b}\) and e), the determination of the LT contribution also requires knowledge of the MoI \(I_{\rm A}\) (see Eq. 43). However, the calculation of \(I_{\rm A}\) comes with an uncertainty related to our imperfect knowledge of the EoS of NS matter at supranuclear densities (see e.g., Lattimer and Prakash 2001). Based on the latest multimessenger constraints for the radius of a NS, Kramer et al. (2021) give a LT contribution of \({{\dot{\omega }}}_{\rm LT} = -4.83^{+0.29}_{-0.35} \times 10^{-4}\,\hbox{deg yr}^{-1}\). Although the uncertainty in this contribution is about a factor of three larger than the measurement uncertainty for \({{\dot{\omega }}}\), it is still not the limiting factor in the GW test outlined in Sect. 4.2.1 above, where the masses are estimated from \({{\dot{\omega }}}\) and s. In turn, in the near future, an improved precision for the intrinsic \({\dot{P}}_{\rm{b}}\) in particular (in combination with the Shapiro shape measurement) should allow interesting constraints to be placed on the EoS for NSs (Lattimer and Schutz 2005; Kramer and Wex 2009; Kehl et al. 2018; Hu et al. 2020). The current uncertainty for the intrinsic \({\dot{P}}_{\rm{b}}\) results in constraints on I which are not competitive with current constraints from other observations (see Kramer et al. 2021 and references therein).

4.2.3 Signal propagation

As it turns out, currently the Double Pulsar is the most edge-on binary pulsar observed. The orbital plane is tilted by only \(0.64^\circ \pm 0.03^\circ \) with respect to the line of sight (Hu et al. 2022). At superior conjunction, the pulsar signal, on its way to the observer on Earth, comes within \(\sim 10\,000\) km of the companion NS, resulting in a strong Shapiro delay of \(\sim 130\,\upmu \hbox{s}\) (see Eq. 16). In particular the timing of pulsar A resulted in a precise determination of the Shapiro parameters r and s, requiring PN corrections to the mass function Eq. (32) for a consistent conversion into constraints on the masses (Kramer et al. 2021; Hu et al. 2022). In addition, it became necessary to account for the motion of the “lens” (pulsar B) while the radio signal is propagating across the binary system (retardation effect, Kopeikin and Schäfer 1999; Rafikov and Lai 2006a) in order to properly model the Shapiro delay near conjunction (Kramer et al. 2021; Hu et al. 2022) (see also Sect. 3.3.2).

In addition to the Shapiro propagation delay, the radio signal of pulsar A is also deflected by the curved spacetime of pulsar B. The deflection near superior conjunction reaches up to \(\sim 0.03^{\circ }\). This deflection adds a higher order correction to the aberration of flat spacetime (Doroshenko and Kopeikin 1995; Rafikov and Lai 2006b, a) which is well measured in the Double Pulsar (Kramer et al. 2021) (see also Sect. 3.3.4). Based on precise timing with the MeerKAT telescope, Hu et al. (2022) give the so far best test of this ‘longitudinal/rotational deflection delay’ with a precision of 15%. While this uncertainty is certainly large compared to deflection experiments in the gravitational field of the Sun (Will 2018), it is the first such experiment in the gravitational field of a NS, i.e., a strongly self-gravitating material body. In this context, it should be noted that the Double Pulsar tests described in this subsection probe the propagation of photons in a spacetime curvature that is orders of magnitude larger than in any other photon propagation experiment, even when compared to the shadow of the supermassive black hole (BH) Sgr \(\hbox{A}^*\) (Event Horizon Telescope Collaboration 2022a) imaged with the Event Horizon Telescope (see Fig. 7 in Wex and Kramer 2020).

4.2.4 Geodetic precession

The nearly edge-on orientation even leads to short intermittent eclipses of pulsar A by the plasma-filled magnetosphere of pulsar B during A’s superior conjunction (Lyne et al. 2004). While this reduces the number of ToAs and consequently the timing precision near conjunction, modelling how the eclipse pattern changes over time has been used to measure the geodetic spin precession (cf. Sect. 3.5.3) of pulsar B with moderate precision: \(4.77_{-0.65}^{+0.66}{}^\circ \,{\rm{yr}}^{-1}\) (Breton et al. 2008), \(5.16_{-0.34}^{+0.32}{}^{\circ }\, {\rm{yr}}^{-1}\) (Lower et al. 2024). These values agree well with the prediction by GR: \(5.074^{\circ }\,{\rm{yr}}^{-1}\). Compared to the Solar System, where geodetic precession of a gyroscope has been tested with high precision (Everitt et al. 2011), such a test with a radio pulsar is different in two ways: it tests geodetic precession in a binary system with nearly equal masses, and more importantly, it tests geodetic precession with a strongly self-gravitating “gyroscope”.

4.3 Other relativistic double neutron-star systems

Apart from the Hulse–Taylor pulsar and the Double Pulsar, there are several additional DNS binary pulsars which have at least three measured PK parameters.Footnote 20 In the following we discuss those systems which so far have played an important role in testing GR and/or alternative gravity theories.

4.3.1 PSR B1534+12

PSR B1534+12, discovered in 1990 also with the Arecibo telescope (Wolszczan 1991), was the second DNS system suitable for testing GR and alternative gravity theories (Taylor et al. 1992). It is a 37.9-ms radio pulsar in an eccentric (\(e=0.27\)) 10.1-h orbit. Like for the Hulse–Taylor pulsar, the three PK parameters \({{\dot{\omega }}}\), \(\gamma \) and \({\dot{P}}_{\rm b}\) are measured in this system. Moreover, compared to the Hulse–Taylor pulsar system, where \(i = 47.2^\circ \) (\(s = 0.734\)), this system is seen much more edge-on with an orbital inclination \(i = 77.2^\circ \) (\(s = 0.975\)). As a result, the timing observations show a very prominent Shapiro delay allowing for the measurement of two more PK parameters, the Shapiro shape s and the Shapiro range r (Stairs et al. 1998, 2002; Fonseca et al. 2014). PSR B1534+12 thus provided the first precision test of the Shapiro delay in the gravitational field of a strongly self-gravitating mass.

For a long time, the \({\dot{P}}_{\rm b}\) test was limited by the large (systematic) uncertainty in the estimation of the distance to PSR B1534+12, which was mainly based on models for the Galactic distribution of free electrons. As a result, this led to large uncertainties in the correction for the Shklovskii contribution to the observed \({\dot{P}}_{\rm b}\) (cf. Sect. 3.5.1). Fairly recently, a reliable distance measurement was obtained with the help of VLBI (\(d = 0.94^{+0.07}_{-0.06}\,\rm{kpc}\)) leading to a robust 4% (95% C.L.) radiative test with this system (Ding et al. 2021).
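To illustrate why the distance matters here, the following sketch evaluates the Shklovskii term in its usual form, \({\dot{P}}_{\rm b}^{\rm Shk} = P_{\rm b}\,\mu ^2 d/c\). The proper motion of 25 mas yr\(^{-1}\) is an assumed, purely illustrative value that is not taken from the text; the orbital period and the VLBI distance are those quoted above, and the function name is hypothetical.

```python
import math

C_M_PER_S = 299_792_458.0            # speed of light
PC_M = 3.0856775814913673e16         # one parsec in metres
MAS_TO_RAD = math.radians(1.0 / 3.6e6)  # one milliarcsecond in radians
YEAR_S = 365.25 * 86400.0            # Julian year in seconds


def shklovskii_pbdot(p_b_s, mu_mas_per_yr, d_kpc):
    """Apparent orbital-period derivative P_b * mu^2 * d / c (dimensionless, s/s)."""
    mu = mu_mas_per_yr * MAS_TO_RAD / YEAR_S  # proper motion in rad/s
    d = d_kpc * 1.0e3 * PC_M                  # distance in metres
    return p_b_s * mu**2 * d / C_M_PER_S


P_B_S = 10.1 * 3600.0   # orbital period from the text
MU_ASSUMED = 25.0       # mas/yr, illustrative assumption only

# Central VLBI distance and its quoted lower and upper bounds.
shk = {d: shklovskii_pbdot(P_B_S, MU_ASSUMED, d) for d in (0.88, 0.94, 1.01)}
for d_kpc, value in shk.items():
    print(f"d = {d_kpc:.2f} kpc -> Shklovskii term = {value:.2e}")
```

Because the term is linear in \(d\), a fractional distance error maps one-to-one onto the correction, which is why a model-based distance with a large systematic uncertainty limited the radiative test.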

As with the Hulse–Taylor pulsar, the kick of the second SN explosion in this system also led to a misalignment between the pulsar’s rotational axis and the orbital angular momentum (\(\sim 30^\circ \)). Although the precession rate predicted by GR (\(\Omega _{\rm{p}}^{\rm{geod}} = 0.514^\circ \,{\rm{yr}}^{-1}\)) is significantly smaller than the one for the Hulse–Taylor pulsar, the high S/N ratio obtained for the (integrated) pulse profile and polarisation of PSR B1534+12 not only allowed for a detection of geodetic precession at an early stage but also for the first time provided a quantitative test of the precession rate of a pulsar (Stairs et al. 2004), with the latest value given in Fonseca et al. (2014): \(0.59_{-0.08}^{+0.12}{}^\circ \,{\rm{yr}}^{-1}\).
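The GR precession rate quoted above can be reproduced approximately with the standard expression for the geodetic precession of a pulsar in a binary, \(\Omega _{\rm p}^{\rm geod} = T_\odot ^{2/3}\,(2\pi /P_{\rm b})^{5/3}\,m_{\rm c}(4m_{\rm p} + 3m_{\rm c})/[2(m_{\rm p}+m_{\rm c})^{4/3}(1-e^2)]\). In the sketch below the orbital period and eccentricity are the rounded values from the text, whereas the two masses (\(1.33\) and \(1.35\,{\rm{M}}_\odot \)) are assumed values that are not quoted in this subsection; the function name is illustrative.

```python
import math

T_SUN_S = 4.925490947e-6   # G*M_sun/c^3 in seconds
YEAR_S = 365.25 * 86400.0  # Julian year in seconds


def geodetic_precession_deg_per_yr(p_b_s, ecc, m_p, m_c):
    """GR geodetic precession rate of the pulsar spin (masses in solar units)."""
    n = 2.0 * math.pi / p_b_s          # orbital angular frequency in rad/s
    m_tot = m_p + m_c
    rate = (
        n
        * (T_SUN_S * m_tot * n) ** (2.0 / 3.0)
        * m_c * (4.0 * m_p + 3.0 * m_c)
        / (2.0 * m_tot**2)
        / (1.0 - ecc**2)
    )                                  # rad/s
    return math.degrees(rate * YEAR_S)


# P_b and e from the text; masses are assumptions for illustration.
omega_geod = geodetic_precession_deg_per_yr(
    p_b_s=10.1 * 3600.0, ecc=0.27, m_p=1.33, m_c=1.35
)
print(f"GR geodetic precession rate: {omega_geod:.3f} deg/yr")
```

With these rounded inputs the result lies close to the \(0.514^\circ \,{\rm{yr}}^{-1}\) given above; the residual difference reflects the rounding of \(P_{\rm b}\), \(e\) and the assumed masses.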

Figure 7 shows the result of the test of GR with the six PK parameters (5 quasi-stationary, 1 radiative) of PSR B1534+12.

Fig. 7

Mass–mass diagram for the PSR B1534+12 system based on GR. PSR B1534+12 provides constraints from six different PK parameters, including the geodetic precession of the pulsar (\(\Omega _{\rm{p}}^{\rm{geod}}\)). Data are taken from Fonseca et al. (2014), Ding et al. (2021)

4.3.2 PSR J1906+0746

Binary pulsar J1906+0746, discovered with the Arecibo telescope, has a relatively slow spin period (144 ms) and a comparably large (inferred) surface magnetic field (Lorimer et al. 2006). Therefore it is most likely a non-recycled young pulsar in a mildly eccentric (\(e = 0.085\)) 4.0-h orbit with a NS companion.Footnote 21 Timing observations allowed the determination of three PK parameters (\({{\dot{\omega }}}\), \(\gamma \), \({\dot{P}}_{\rm b}\)) with good precision, allowing for a 5% radiative test (van Leeuwen et al. 2015). This alone does not make this system particularly interesting for gravity tests. However, the pulsar’s spin axis has a large misalignment of \(104(9)^\circ \) with respect to the orbital angular momentum and shows a very characteristic polarisation pattern which changed over time due to geodetic precession. Crucially, these changes were observed not only for the main pulse but also for the interpulse; the observation of both magnetic poles results in unusually robust estimates of the geometry of the magnetic field and the direction of the spin axis. By modelling these polarisation changes, the precession rate of the pulsar spin could be determined to be \(2.17(11)^\circ \,{\rm{yr}}^{-1}\), which is in perfect agreement with the GR value: \(2.234(14)^\circ \,{\rm{yr}}^{-1}\). This is so far the best test of the geodetic precession of a spinning NS (cf. Sects. 4.2.4 and 4.3.1). Figure 8 shows the GR mass–mass diagram for PSR J1906+0746.

Fig. 8

Mass–mass diagram for the PSR J1906+0746 system based on GR. The three PK parameters \({{\dot{\omega }}}\), \(\gamma \) and \({\dot{P}}_{\rm{b}}\) were measured in timing observations (van Leeuwen et al. 2015). The geodetic precession rate \(\Omega _{\rm{p}}^{\rm{geod}}\) and orbital inclination i were obtained from modelling changes in the pulsar polarisation due to the spin-precession of the pulsar (Desvignes et al. 2019)

4.3.3 PSR J1757−1854

PSR J1757−1854 is currently (in certain aspects) the most relativistic binary pulsar with which GR has been tested. It is a 21.5-ms pulsar in a highly eccentric (\(e = 0.606\)) DNS system (\(M_{\rm{p}} = 1.34\,{\rm{M}}_{\odot }\), \(M_{\rm{c}} = 1.39\,{\rm{M}}_{\odot }\)) (Cameron et al. 2018). In a sense, it is a more relativistic version of the Hulse–Taylor pulsar. Although its orbital period of 4.4 h is clearly larger than that of the Double Pulsar, it is the high eccentricity that leads to a significantly stronger decrease in the orbital period due to GW damping (\({\dot{P}}_{\rm{b}} = -5.3 \times 10^{-12}\)) and a correspondingly shorter merger time of 76 Myr. At periastron the two NSs have a separation of just \(0.75\,\rm{R}_\odot \), leading to a relative velocity of 1060 km/s, the largest for known binary pulsars.
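The role of the eccentricity can be made explicit with the standard quadrupole (Peters) formula for the orbital period decay, \({\dot{P}}_{\rm b} = -\frac{192\pi }{5}\,(T_\odot n_{\rm b})^{5/3}\,f(e)\,m_{\rm p}m_{\rm c}/(m_{\rm p}+m_{\rm c})^{1/3}\) with \(n_{\rm b} = 2\pi /P_{\rm b}\) and \(f(e) = (1 + \frac{73}{24}e^2 + \frac{37}{96}e^4)/(1-e^2)^{7/2}\). The sketch below is a minimal illustration using the rounded values quoted above, with the masses interpreted in units of \({\rm{M}}_\odot \); the function names are not taken from any timing package.

```python
import math

T_SUN_S = 4.925490947e-6  # G*M_sun/c^3 in seconds


def eccentricity_enhancement(ecc):
    """Peters enhancement factor f(e) of the quadrupole GW luminosity."""
    e2 = ecc**2
    return (1.0 + 73.0 / 24.0 * e2 + 37.0 / 96.0 * e2**2) / (1.0 - e2) ** 3.5


def gr_pbdot(p_b_s, ecc, m_p, m_c):
    """Quadrupole-formula orbital period derivative (masses in solar units)."""
    n = 2.0 * math.pi / p_b_s
    return (
        -192.0 * math.pi / 5.0
        * (T_SUN_S * n) ** (5.0 / 3.0)
        * eccentricity_enhancement(ecc)
        * m_p * m_c / (m_p + m_c) ** (1.0 / 3.0)
    )


P_B_S = 4.4 * 3600.0
pbdot_j1757 = gr_pbdot(P_B_S, ecc=0.606, m_p=1.34, m_c=1.39)
pbdot_circular = gr_pbdot(P_B_S, ecc=0.0, m_p=1.34, m_c=1.39)
print(f"f(e)            = {eccentricity_enhancement(0.606):.1f}")
print(f"Pb-dot (e=0.606) = {pbdot_j1757:.2e}")
print(f"Pb-dot (e=0)     = {pbdot_circular:.2e}")
```

For these inputs the eccentric orbit loses orbital period roughly an order of magnitude faster than a circular orbit with the same period and masses, which is the effect described in the text.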

Fig. 9

Mass–mass diagram for the PSR J1757−1854 system based on GR. Parameter values are taken from Cameron et al. (2023)

So far, there are five PK parameters measured for PSR J1757−1854, including the two orthometric parameters of the Shapiro delay (see Fig. 9). All parameters agree on a common region in the GR mass–mass plane, meaning GR has also passed this test. However, there is a large systematic uncertainty in the intrinsic \({\dot{P}}_{\rm b}\). The reason is the unknown distance to the system, and therefore a large uncertainty in the extrinsic contributions to the observed \({\dot{P}}_{\rm b}\) (Sect. 3.5.1, see also Fig. 13 in Cameron et al. 2023).Footnote 22 If one assumes GR and a model for the Galactic gravitational potential, then the \({\dot{P}}_{\rm b}\) can be used to estimate the distance of the system to about 8–13 kpc.

While it is still unclear how to obtain better constraints on the distance without assuming GR, a promising additional GR test for the near future—due to the large eccentricity—is the relativistic deformation of the orbit \(\delta _\theta \) (Cameron et al. 2023), which is only barely significant in the Hulse–Taylor pulsar (Sect. 4.1) and the Double Pulsar (Sect. 4.2). In principle, PSR J1757−1854 could also be an excellent pulsar binary system for testing the LT precession of the orbital plane (PK parameter \({\dot{x}}\); see Eq. 40). However, the analysis of the pulse structure and the polarisation of PSR J1757−1854 leads to the conclusion that the spin is oriented such that \({\dot{x}}_{\rm LT}\) is too small to be measurable in the near future.

4.3.4 PSR J1913+1102

PSR J1913+1102 is a 27-ms pulsar which is in a mildly eccentric (\(e=0.09\)) 5.0-h orbit with a NS companion (Lazarus et al. 2016). Regular timing observations since its discovery in 2012, in particular with the 305-m Arecibo telescope, allowed the measurement of three PK parameters: \({{\dot{\omega }}} = 5.6501(7)\,\hbox{deg yr}^{-1}\), \(\gamma = 0.471(15)\,\hbox{ms}\), and \({\dot{P}}_{\rm{b}} = -0.480(30) \times 10^{-12}\). Assuming GR, one obtains a pulsar mass of \(m_{\rm{p}} = 1.62(3)\,{\rm{M}}_{\odot }\) and a companion mass of \(m_{\rm{c}} = 1.27(3)\,{\rm{M}}_{\odot }\) (Ferdman et al. 2020). With a mass ratio of \(R \equiv m_{\rm{p}}/m_{\rm{c}} = 1.28(4)\), this is the most asymmetric DNS system reported so far that shows significant GW damping, i.e., a significant (intrinsic) \({\dot{P}}_{\rm{b}}\). The observed GW damping agrees with GR, providing a 6% test (see Fig. 10). This by itself cannot compete with most of the other GW tests presented so far. However, the comparably large asymmetry in the NS masses, and the correspondingly large asymmetry in the (fractional) gravitational binding energy makes this DNS system interesting for tests of dipolar GW emission, predicted by many alternatives to GR (see Sect. 5.5).

Fig. 10

Mass–mass diagram for the PSR J1913+1102 system based on GR. For that pulsar three PK parameters have been measured: advance of periastron (\({{\dot{\omega }}}\), black), time dilation (\(\gamma \), purple), and change of the orbital period due to GW damping (\({\dot{P}}_{\rm{b}}\), blue). They agree on a common mass–mass region, meaning that GR has passed this test with an asymmetric DNS system (cf. Fig. 1 in Ferdman et al. 2020)

4.4 Relativistic pulsar-white dwarf systems

Currently, about 400 binary pulsars are known, more than half of which have a WD as a companion (Manchester et al. 2005). Several of these pulsar-WD systems have orbital periods of less than one day. Many of the most precise “pulsar clocks” are found in such systems (see discussion in Sect. 2.4). As a result of the mass transfer, a circularisation of the orbit took place, leading to very low eccentricities. As a consequence, often neither the advance of periastron (PK parameter \({{\dot{\omega }}}\)) nor the Einstein delay (PK parameter \(\gamma \)) has been measured in these systems. In some cases the system is seen sufficiently edge-on, so that a significant Shapiro delay is present in the timing data. For quite a few pulsar-WD systems, the (intrinsic) change in the orbital period (PK parameter \({\dot{P}}_{\rm b}\)) is the only measurable relativistic effect. If mass estimates can then be obtained through alternative channels, e.g., high-resolution spectroscopy observations of the WD companion, such systems can still provide valuable tests of GR. Pulsar-WD binaries are of particular interest for the study of alternatives to GR, since such systems exhibit a large asymmetry in the (fractional) gravitational binding energy, which in many alternative gravity theories leads to the prediction of strong dipolar GWs that do not occur in GR (see Sects. 5.2 and 5.5). In the following, we will give a list of pulsar-WD systems that so far have been of particular importance in gravity tests, most notably for constraining alternatives to GR. For all of these systems, the masses can be determined (either from timing alone, or with the help of optical observations) and at least one (additional) PK parameter has been measured, so that the system is over-constrained.

4.4.1 PSR J1738+0333

PSR J1738+0333 is a fully recycled pulsar with a rotational period of 5.9 ms and an optically bright low-mass Helium-core WD as a companion (Jacoby 2005; Jacoby et al. 2007). The two stars orbit each other in about 8.5 h in a nearly circular orbit (\(e < 4 \times 10^{-7}\)). Regular timing observations since 2003, in particular with the 305-m William E. Gordon Arecibo radio telescope, eventually allowed the determination of a significant change in the (intrinsic) orbital period (Freire et al. 2012). With the latest distance measurement from VLBI, needed to correct for the extrinsic \({\dot{P}}_{\rm{b}}\) contributions, one finds \({\dot{P}}_{\rm{b}}^{\rm{int}} = -26.1 \pm 3.1~\hbox{fs s}^{-1}\) (Ding et al. 2023). This change in the orbital period due to GW damping is the only PK parameter known so far in this system. Fortunately, masses can be obtained independently from optical observations of the WD companion. High-resolution spectroscopy combined with WD models leads to a mass ratio of \(R = 8.1 \pm 0.2\) and a WD mass of \(m_{\rm c} = 0.181_{-0.005}^{+0.007}\,{\rm{M}}_{\odot }\), which converts into \(m_{\rm p} = 1.47_{-0.06}^{+0.07}\,{\rm{M}}_{\odot }\) (Antoniadis et al. 2012). With pulsar and companion masses at hand, Eq. (30) can be used to determine the GR value for the change in orbital period (\(-27.7_{-1.9}^{+1.5}~\hbox{fs s}^{-1}\)) (Freire et al. 2012), which agrees well with the observed value within the uncertainties (see also Fig. 11). The precision of this test is orders of magnitude weaker than the GW test with the Double Pulsar (see Sect. 4.2.1). Nevertheless, due to the high asymmetry in the compactness between the pulsar and the companion WD, this limit converts into tight constraints on dipolar radiation as predicted by alternatives to GR (see Sects. 5.2 and 5.5 for details).
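The arithmetic behind this test is simple enough to spell out. The sketch below converts the mass ratio and WD mass into a pulsar mass and then expresses the difference between the observed and predicted \({\dot{P}}_{\rm b}\) in units of the combined uncertainty. Treating the asymmetric error bar by picking the side that faces the observation, and adding the errors in quadrature, are simplifying assumptions made here for illustration; the published analyses are more careful.

```python
import math


def pulsar_mass_from_ratio(mass_ratio, m_c):
    """Pulsar mass from R = m_p / m_c and the companion mass (solar units)."""
    return mass_ratio * m_c


def tension_in_sigma(obs, obs_err, pred, pred_err_minus, pred_err_plus):
    """Observed-minus-predicted difference in units of the combined 1-sigma error.

    The prediction's error bar facing the observed value is used, and the
    two uncertainties are added in quadrature (a simplification).
    """
    pred_err = pred_err_plus if obs > pred else pred_err_minus
    return abs(obs - pred) / math.hypot(obs_err, pred_err)


# Values quoted in the text (Pb-dot in fs/s, masses in solar masses).
m_p = pulsar_mass_from_ratio(mass_ratio=8.1, m_c=0.181)
tension = tension_in_sigma(
    obs=-26.1, obs_err=3.1, pred=-27.7, pred_err_minus=1.9, pred_err_plus=1.5
)
print(f"m_p from R * m_c: {m_p:.2f} M_sun")
print(f"observed vs GR Pb-dot: {tension:.2f} sigma apart")
```

Under these assumptions the observed and predicted values differ by well under one standard deviation, consistent with the statement that they agree within the uncertainties.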

Fig. 11

Mass–mass diagram for PSR J1738+0333, assuming GR. Bands denote one-sigma ranges in the parameters. The mass ratio \(R = M_{\rm{p}}/M_{\rm{c}}\) (black) and the WD mass \(M_{\rm{c}}\) (red) are obtained from optical observations and WD models. The blue band is a result of Eq. (30) with \({\dot{P}}_{\rm{b}} = {\dot{P}}_{\rm{b}}^{\rm{int}}\). All three bands agree on a common region in the mass–mass plane, meaning GR has passed this pulsar-WD test

The GW test is not the only gravity test with PSR J1738+0333. It has also been used, for instance, in (generic) tests of preferred frame effects in the gravitational interaction, as discussed in Sect. 5.4.1.

4.4.2 PSR J1909−3744

PSR J1909−3744 is a fully recycled pulsar with a rotational period of 2.9 ms (Jacoby et al. 2003). Like PSR J1738+0333, it has an optically bright low-mass Helium-core WD as a companion that allows for high-resolution spectroscopy (Jacoby et al. 2003). With its orbital period of 1.53 days it is considerably less relativistic than PSR J1738+0333. For that reason, and despite its exquisite timing precision (one of the most precise timers known, Perera et al. 2019), there is to date no significant value for the intrinsic change in the orbital period. The limitation comes mainly from the uncertainties in the correction for the Shklovskii contribution to \({\dot{P}}_{\rm b}\) (Liu et al. 2020). Still, this system could be used in a GW test, providing important constraints on the strong-field scalarisation of NSs, as we discuss in Sect. 5.5.

Many years of high-precision timing of PSR J1909−3744 allowed a precise measurement of two PK parameters. Due to its near edge-on orientation (\(i = 86.4^{\circ }\) or \(93.6^{\circ }\)) with respect to the line of sight, precise measurements of the shape (s) and the range (r) of the Shapiro delay were possible (see Liu et al. 2020 and references therein). In combination with the mass ratio (from a combination of optical and radio observation) one can do a PK parameter test (see Fig. 12). That test is not of particular interest for GR and its alternatives. More importantly, the precise determination of the WD mass via the Shapiro delay makes it possible to test different WD models (Antoniadis 2013), which in turn is important for gravity tests in other systems (see e.g., Sect. 4.4.1).

Fig. 12

Mass–mass diagram for PSR J1909−3744, assuming GR. Bands denote one-sigma ranges in the parameters. The mass ratio \(R=M_{\rm{p}}/M_{\rm{c}}\) (black) is obtained by combining pulsar timing with radial velocity measurements of the WD by optical spectroscopy. The other two curves are from the detection of the Shapiro delay in the timing data: Shapiro range in red and Shapiro shape in blue. All three bands agree on a common region in the mass–mass plane, meaning GR has passed this pulsar-WD test. Assuming GR, the Shapiro delay gives precise masses for pulsar (\(m_{\rm{p}} = 1.492(14)\,{\rm{M}}_{\odot }\)) and companion (\(m_{\rm{c}} = 0.209(1)\,{\rm{M}}_{\odot }\)). See Liu et al. (2020) for details

PSR J1909−3744 has been used in (generic) tests of gravitational preferred-frame effects and is currently providing the most stringent limits on \({\hat{\alpha }}_1\) (see Sect. 5.4.1).

Last but not least, due to its high timing precision, PSR J1909−3744 is of prime importance for all PTAs and their efforts to detect nano-Hz GWs (Perera et al. 2019).

4.4.3 PSR J2222−0137

PSR J2222−0137 is a mildly recycled binary pulsar with a spin period of 32.8 ms and an orbital period of 2.45 days (Boyles et al. 2013). The low orbital eccentricity (\(e = 3.8 \times 10^{-4}\)) indicates that the companion is a massive WD. Despite this low eccentricity, the relativistic advance of periastron could be measured with high precision: \({{\dot{\omega }}} = 0.09605(48)\,\hbox{deg yr}^{-1}\) (Guo et al. 2021). Besides \({{\dot{\omega }}}\), the system gives access to two more PK parameters from the measurement of the Shapiro delay. In combination, this 3-PK-parameter test leads to a \(\sim 1\%\) confirmation of GR (see Fig. 13). The corresponding GR masses for pulsar and companion are \(1.831(10)\,{\rm{M}}_{\odot }\) and \(1.319(4)\,{\rm{M}}_{\odot }\) respectively (Guo et al. 2021).
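In GR the advance of periastron fixes the total mass of the system, since \({{\dot{\omega }}} = 3\,n_{\rm b}\,(T_\odot M n_{\rm b})^{2/3}/(1-e^2)\) with \(n_{\rm b} = 2\pi /P_{\rm b}\) and \(M = m_{\rm p} + m_{\rm c}\). The following sketch inverts this standard relation for the rounded values quoted above; it is meant as an order-of-magnitude check, not as a replacement for the full timing analysis, and the function name is illustrative.

```python
import math

T_SUN_S = 4.925490947e-6   # G*M_sun/c^3 in seconds
YEAR_S = 365.25 * 86400.0  # Julian year in seconds


def total_mass_from_omegadot(omdot_deg_per_yr, p_b_s, ecc):
    """Total mass (solar units) implied by the GR periastron advance."""
    n = 2.0 * math.pi / p_b_s                       # rad/s
    omdot = math.radians(omdot_deg_per_yr) / YEAR_S  # rad/s
    x = omdot * (1.0 - ecc**2) / (3.0 * n)           # equals (T_sun * M * n)^(2/3)
    return x**1.5 / (T_SUN_S * n)


m_total = total_mass_from_omegadot(
    omdot_deg_per_yr=0.09605, p_b_s=2.45 * 86400.0, ecc=3.8e-4
)
print(f"total mass from omega-dot: {m_total:.2f} M_sun")
print(f"sum of quoted GR masses:   {1.831 + 1.319:.2f} M_sun")
```

Because \(M \propto P_{\rm b}^{5/2}\) at fixed \({{\dot{\omega }}}\), the rounding of the orbital period to 2.45 days limits this check to about the percent level.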

Fig. 13

Mass–mass diagram for PSR J2222−0137, assuming GR. Bands denote one-sigma ranges in the parameters. Two parameters are associated with secular changes in the orbit: advance of periastron (\({{\dot{\omega }}}\), black) and change of orbital period due to GW damping (\({\dot{P}}_{\rm b}\), blue). The parameters \(h_3\) (red) and \(\varsigma \) (orange) are part of the orthometric parametrisation of the Shapiro delay (Freire and Wex 2010), replacing the PK parameters r and s according to Eq. (17). See Guo et al. (2021) for details

The large mass of the pulsar makes this system particularly interesting for certain non-linear deviations from GR in the strong-gravity regime of NSs. In particular, this system plays a key role in closing the mass gap of spontaneous scalarisation in DEF gravity (see Sect. 5.5 for details). Two additional aspects are of particular importance for this test. Firstly, the timing precision and the large ratio \(x/P_{\rm{b}}\) allow a particularly precise measurement of \({\dot{P}}_{\rm{b}}\). Secondly, VLBI observations by Deller et al. (2013) obtained the most precise VLBI distance for any pulsar (\(268 \pm 1\) pc) as well as precise values for the proper motion of the system. As a result, precise corrections for the extrinsic contributions to \({\dot{P}}_{\rm{b}}\) were possible, leading to a two-sigma significant measurement of the GW damping (\({\dot{P}}_{\rm{b}} = -0.0143(76) \times 10^{-12}\,\hbox{s s}^{-1}\)), although the orbital period is much larger than for any other binary pulsar that currently allows the verification of GW damping.

4.4.4 Other pulsar-white dwarf systems

There are a number of short orbital period pulsar-WD systems that have been used for gravity tests in the past but became less important in recent years (which in some cases may change again in the future). There are various reasons for this. In many cases, the precision of gravity tests with these systems has now simply been surpassed by other systems. The most important examples are PSR J0348+0432 (Antoniadis et al. 2013), PSR J1012+5307 (Lazaridis et al. 2009; Ding et al. 2020), and PSR J1141−6545 (Bhat et al. 2008). In particular for the last one, an eccentric pulsar-WD system in a 4.7-h orbit, we expect updated gravity tests in the near future that will provide the best constraints within specific parameter ranges of scalar–tensor theories (some preliminary results can already be found in Venkatraman Krishnan 2019).Footnote 23

A completely different group of pulsar-WD systems relevant for gravity tests are those with very wide orbits. In general, these systems do not allow the test of an effect predicted by GR, like the pulsar-WD systems above (i.e., one cannot draw three curves in a mass–mass plane), but due to their orbital properties they provide important constraints on specific deviations from GR, such as a time dependent gravitational constant G or a violation of the universality of free fall (see Sect. 5.5 below for more details). A particularly noteworthy pulsar in this context is the 4.6-ms pulsar J1713+0747. It is in a 68-day low-eccentricity (\(e = 7.5 \times 10^{-5}\)) orbit with a \(0.3\,{\rm{M}}_\odot \) WD. Mass measurements in this system were possible due to the detection of a Shapiro delay, the only relativistic effect measured in this system. The latest timing results for this pulsar can be found in Zhu et al. (2019).

A truly unique pulsar-WD system is that of the 2.7-ms pulsar J0337+1715 (Ransom et al. 2014). This pulsar is a member of a hierarchical triple, where the \(1.44\,{\rm{M}}_\odot \) pulsar and a \(0.20\,{\rm{M}}_\odot \) WD orbit each other in 1.63 days. This inner binary is in a 327-day orbit with an outer WD of \(0.41\,{\rm{M}}_\odot \). The inner and outer orbits have low eccentricities of \(7\times 10^{-4}\) and 0.035, respectively. The only relativistic effect observed in this system so far is the (varying) special relativistic time dilation caused by the (epicyclic) motion of the pulsar in the inertial frame of the triple system. Nevertheless, as we will see in Sects. 5.1 and 5.5.1, this system provides some of the tightest limits on deviations from GR (generic and theory specific). For details see also Archibald et al. (2018) and Voisin et al. (2020).

5 Pulsar experiments and strong field deviations from GR

The fact that GR has passed all the tests presented in the previous chapter has important consequences in two ways. On the one hand, GR can be used as a tool in binary pulsars to determine precise masses of NSs (Özel and Freire 2016; Fonseca et al. 2021), to get pulsar distances (see e.g., PSR B1534+12 in Sect. 4.3), or to confirm the fast rotation of a companion WD where, besides classical spin-orbit coupling, the LT effect contributes to the precession of the orbital plane (Venkatraman Krishnan et al. 2020), just to name a few.

On the other hand, the excellent agreement of pulsar experiments with GR also means tight constraints on deviations from GR in the presence of strongly self-gravitating bodies, i.e., deviations in the orbital motion, GW emission, photon propagation, etc. In the following we will highlight some of such tests conducted with radio pulsars. Our list is, however, far from complete. We distinguish between generic tests of phenomena expected on the basis of general theoretical assumptions on how deviations from GR might affect pulsar systems, and tests of specific theories of gravity.

For generic tests in the weak-field regime, the most important framework is the parametrised PN (PPN) framework, where 10 parameters quantify deviations from GR at the first PN level (Will 1993, 2014). These PPN parameters assume different values in different alternatives to GR, depending also on whether they are conservative theories of gravity or not (see Table 2). It is important for pulsar experiments that these PPN parameters become body-dependent in a system with strongly self-gravitating bodies, which we will explain in more detail below.

Table 2 List of generic parameters subject to gravity tests (including the ten PPN parameters; first group); their physical meaning and predictions for them in GR and fully conservative and semi-conservative gravity theories (under “c/sc”; the parameters \(\alpha _1\) and \(\alpha _2\) are 0 in fully conservative theories)

If the symmetric spacetime metric \(g_{\mu \nu }\) is the only gravitational field in a four-dimensional spacetime manifold then, under some plausible assumptions, GR (including a cosmological constant) emerges as the unique theory of gravity. This is the result of Lovelock’s uniqueness theorem (Lovelock 1972). To get around Lovelock’s theorem, one has to relax one (or more) of the assumptions that go into it (see e.g., Fig. 1 in Berti et al. 2015). Arguably the most popular assumption is the existence of additional (generally dynamical) gravitational fields, for instance one or more scalar fields. If non-gravitational fields do not couple directly to these additional gravitational fields (only to \(g_{\mu \nu }\); universal coupling), the theory is a metric theory of gravity and by construction fulfills the EEP (see e.g., Will 2018). Gravitational experiments, on the other hand, are generally expected to deviate from GR due to the presence of additional gravitational fields. From a heuristic perspective (and with some degree of approximation), certain aspects of such deviations have been summarised in the “strong equivalence principle (SEP)”, an extension of the EEP to local gravitational experiments:

  • Extension of the universality of free fall (UFF) to self-gravitating bodies in an external gravitational field.

  • Absence of preferred-location effects, i.e., any local (including gravitational) experiment is independent of where and when it is performed.

  • Absence of preferred-frame effects, i.e., any local (including gravitational) experiment is independent of motion of the (freely falling) local reference frame.

See Will (2014) for a detailed discussion. It is plausible that GR is the only viable metric theory of gravity that embodies SEP completely (Will 2014, 2018).Footnote 24

In contrast to GR, in alternative theories featuring auxiliary gravitational fields (\(\psi _a\)), the structure of a self-gravitating body generally depends on its external gravitational environment. Therefore, the masses of the bodies that enter the action of an N-body system depend on the boundary values \(\psi _a^{(0)}\). As a consequence, unlike in GR, the dynamics of an N-body system depends on the actual structure of the individual bodies, which leads to a violation of the SEP.Footnote 25 An important set of (body-dependent) parameters are the sensitivities, which describe the dependence of certain properties of a body on the boundary values \(\psi _a^{(0)}\). For instance, the sensitivities of the mass are defined by

$$\begin{aligned} s_A^{(a)} \equiv \left( \frac{\partial \ln m_A}{\partial \ln \psi _a^{(0)}}\right) _{N_b}, \end{aligned}$$
(45)

whereby the number of baryons \(N_b\) is kept constant. Sensitivities enter directly into the dynamics of a system of gravitationally interacting bodies and in this sense quantify the violation of the SEP. At the Newtonian level of the equations of motion, sensitivities enter the effective gravitational constant \(G_{AB}\) for the interaction between two bodies, and therefore lead to a violation of the UFF. They further modify the (weak-field) PPN parameters of Table 2 in a body-dependent way; for instance \(\gamma _{\rm PPN}\) gets replaced by \(\gamma _{AB}\). Furthermore, sensitivities also enter aspects of GW damping, which occurs beyond the first PN order. Sensitivities are theory dependent and depend on the structure of the body, hence its EoS. For weakly self-gravitating bodies the sensitivities are small. For instance, in mono-scalar-tensor theories \(s_A\) is of the order of the fractional binding energy of body A (see e.g., Will 2018), which is \(\lesssim 10^{-6}\) for bodies of the Solar System. Depending on the details of the theory, the sensitivities for a strongly self-gravitating body such as a NS are generally of the order of 0.1, but can also be much larger than that. Therefore, precision pulsar experiments are ideal for searching for deviations from GR caused by the strong internal fields of NSs. Many more details on the above can be found, e.g., in Will (1993, 2018).
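The orders of magnitude quoted for the fractional binding energy can be illustrated with a crude Newtonian estimate for a uniform-density sphere, \(|E^{\rm grav}|/(mc^2) = \frac{3}{5}\,Gm/(Rc^2)\). This is only a sketch: real bodies are centrally condensed, a NS requires a relativistic treatment with a specific EoS, and the NS mass and radius used below (\(1.4\,{\rm{M}}_\odot \), 12 km) are assumed typical values rather than numbers from the text.

```python
G_SI = 6.67430e-11       # gravitational constant in m^3 kg^-1 s^-2
C_SI = 299_792_458.0     # speed of light in m/s
M_SUN_KG = 1.98847e30    # solar mass in kg


def fractional_binding_energy(mass_kg, radius_m):
    """Uniform-sphere Newtonian estimate of |E_grav| / (m c^2)."""
    return 0.6 * G_SI * mass_kg / (radius_m * C_SI**2)


# (mass in kg, radius in m); the neutron-star entry uses assumed typical values.
bodies = {
    "Earth": (5.972e24, 6.371e6),
    "Sun": (M_SUN_KG, 6.957e8),
    "neutron star (assumed 1.4 M_sun, 12 km)": (1.4 * M_SUN_KG, 1.2e4),
}

binding = {name: fractional_binding_energy(m, r) for name, (m, r) in bodies.items()}
for name, value in binding.items():
    print(f"{name:42s} {value:.1e}")
```

The estimate separates Solar System bodies from NSs by roughly five or more orders of magnitude, which is why sensitivities that are negligible in Solar System tests can become significant in pulsar experiments.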

In the following, we will present various theory agnostic and theory specific gravity tests based on radio-pulsar observations. Our theory agnostic discussions will mainly focus on the PPN framework and its extensions to strongly self-gravitating bodies via (body-dependent) effective PPN parameters. Table 3 gives a comparison of experimental constraints on the parameters listed in Table 2 from Solar System and pulsar experiments. However, there are other generic frameworks to discuss and compare gravity experiments, for instance the Standard-Model Extension (SME; Kostelecký 2004), where also pulsar experiments have provided interesting constraints on some of the parameters of this effective field theory (Shao 2014; Shao and Bailey 2019; Dong et al. 2024). Another popular framework, in particular in the context of GW observations, is the parametrised post-Einsteinian (ppE) framework. Pulsar experiments in the context of the ppE framework have been discussed, e.g., in Yunes and Hughes (2010), Nair and Yunes (2020).

Table 3 Comparison of Solar System and binary pulsar tests for various parameters

5.1 Strong-field Nordtvedt effect

GR is based on two basic postulates, namely a) the postulate of a universal coupling between matter and gravity (where the Minkowski metric \(\eta _{\mu \nu }\) in the laws of special relativity gets replaced by a curved spacetime metric \(g_{\mu \nu }(x^\alpha )\)), and b) Einstein’s field equations that define the dynamics of \(g_{\mu \nu }(x^\alpha )\) (see e.g., Will 1993; Damour 2009, 2012). It follows from a) that a small (sufficiently idealised) test body with negligible gravitational binding energy follows a geodesic in the curved spacetime \(g_{\mu \nu }\), independent of its mass and composition (weak equivalence principle (WEP); see e.g., Ehlers and Geroch 2004; Damour and Lilley 2008; Steinhoff and Puetzfeld 2010; Di Casola et al. 2014 and references therein). From a Newtonian point of view, this can be understood as an equivalence between inertial and (passive) gravitational mass (see e.g., Will 1993; Di Casola et al. 2015). Over the course of history, this UFF for test bodies has been verified with ever greater precision and has currently reached a precision of \(\sim 10^{-15}\) (Touboul et al. 2022). As discussed above in the introduction to this section, for GR this UFF extends to self-gravitating bodies, including strongly self-gravitating bodies like NSs and BHs.Footnote 26

The postulate of universal coupling is not a unique feature of GR. As we have discussed above, it is a basic postulate for all metric theories of gravity. These theories of gravity differ from each other by the second postulate, i.e., the field equations. It is these field equations that are responsible for a violation of the SEP by alternatives to GR in general and specifically for a violation of the UFF by the presence of gravitational binding energy (breakdown of the gravitational WEP (GWEP)). As mentioned above, to leading order in the equations of motion, instead of the Newtonian gravitational constant \(G_{\rm N}\), we have an effective gravitational constant \(G_{AB}\) that depends on the structure of the bodies. To give an example, for the class of Bergmann–Wagoner scalar-tensor theories of gravity this effective gravitational constant is given by

$$\begin{aligned} G_{AB} = G_{\rm N} \left[ 1 - 2\zeta (s_A + s_B - 2s_A s_B) \right] \quad (A \ne B), \end{aligned}$$
(46)

where \(\zeta \) is a theory dependent constant (constrained to \(\lesssim 10^{-5}\) in the Solar System (Bertotti et al. 2003; Will 2014)) and \(s_A\) is the sensitivity defined in Eq. (45), which is body dependent and can assume very large values for NSs (see e.g., Will 2018).Footnote 27 Equation (46) shows that interpreting a violation of the UFF in terms of body-specific ‘gravitational masses’ is no longer meaningful in a strong-field context (see also the discussion in Di Casola et al. 2015).
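To get a feeling for the size of the body-dependent correction in Eq. (46), the following minimal sketch evaluates \(G_{AB}/G_{\rm N}\) for two pairs of bodies. The numerical values of \(\zeta \) and of the sensitivities are hypothetical and chosen only for illustration; they are not taken from any specific theory or EoS.

```python
def effective_g_ratio(zeta: float, s_a: float, s_b: float) -> float:
    """Return G_AB / G_N from Eq. (46) for two distinct bodies A and B."""
    return 1.0 - 2.0 * zeta * (s_a + s_b - 2.0 * s_a * s_b)


# Hypothetical inputs, chosen only for illustration.
zeta = 1e-5   # of the order of the Solar System bound quoted in the text
s_ns = 0.2    # assumed sensitivity of a neutron star
s_wd = 1e-4   # assumed sensitivity of a white dwarf

ratio_ns_wd = effective_g_ratio(zeta, s_ns, s_wd)
ratio_wd_wd = effective_g_ratio(zeta, s_wd, s_wd)
print(f"NS-WD: 1 - G_AB/G_N = {1.0 - ratio_ns_wd:.3e}")
print(f"WD-WD: 1 - G_AB/G_N = {1.0 - ratio_wd_wd:.3e}")
```

In this toy setting the correction for the NS–WD pair is dominated by the NS sensitivity, which is why a single "gravitational mass" per body cannot capture the effect.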

In case of a violation of the UFF, a binary system falling freely in the external gravitational field of a third body will experience a characteristic polarisation of the orbit (a gravitational analogue to the Stark effect in atoms exposed to an external electric field). The reason for this is the different (external) acceleration (g) of the two components, A and B, of the binary system in the direction of the third body C:

$$\begin{aligned} \Delta _{\rm AB} \equiv \frac{g_{\rm A} - g_{\rm B}}{\frac{1}{2}(g_{\rm A} + g_{\rm B})} \simeq \frac{G_{\rm AC} - G_{\rm BC}}{G}. \end{aligned}$$
(47)

It was Kenneth Nordtvedt who first discovered such an effect in an alternative to GR, i.e., Jordan–Fierz–Brans–Dicke (JFBD) gravity, and suggested to test this with the help of Lunar Laser Ranging (LLR) (Nordtvedt 1968). In the weak field of the Solar System one can approximate for the Earth(E)–Moon(M) system: \(\Delta _{\rm EM} \simeq \eta _{\rm N}\left( \frac{E_{\rm E}^{\rm grav}}{m_{\rm E}c^2} - \frac{E_{\rm M}^{\rm grav}}{m_{\rm M}c^2}\right) \), where \(\eta _{\rm N}\) is the Nordtvedt parameter, a combination of various PPN parameters (see e.g., Will 2018 for details).

In the meantime LLR has put tight constraints on the Nordtvedt parameter: \(|\eta _{\rm N}|\lesssim 7 \times 10^{-5}\) (Biskupek et al. 2021). However, Damour and Schaefer (1991) pointed out that such a test in the weak-field regime of the Solar System is unable to constrain any higher order/strong-field contributions to the Nordtvedt effect that might become significant in the presence of strongly self-gravitating masses. Furthermore, they proposed a test of such a strong-field Nordtvedt effect with the help of pulsar-WD systems falling in the gravitational field of the Milky Way (Damour–Schäfer test). If the pulsar were to experience a different acceleration in the external gravitational field than the WD, the eccentricity of the binary pulsar system would change in a characteristic way over time (see Fig. 14). By combining several suitable pulsar-WD systems, this kind of test has led to a limit of \(|\Delta _{\rm p0}|\lesssim 5 \times 10^{-3}\) (95% C.L., Stairs et al. 2005; Gonzalez et al. 2011), where the index 0 indicates that this is with respect to the (well tested) acceleration of a weakly self-gravitating mass, i.e., the WD. The best such limit from a pulsar-WD system comes from PSR J1713+0747 (cf. Sect. 4.4.4): \(|\Delta _{\rm p0}|\lesssim 2 \times 10^{-3}\) (95% C.L., Zhu et al. 2019).

Fig. 14

“Polarisation” of a nearly circular binary orbit under the influence of a forcing vector \({\textbf{g}}\), showing the relation between the forced eccentricity, \({\textbf{e}}_{\rm{F}}\), the eccentricity evolving under the general-relativistic advance of periastron, \({\textbf{e}}_{\rm{R}}(t)\), and the angle \(\theta = \theta _0 + {{\dot{\omega }}}\,t\). The actually observed eccentricity is given by \({\textbf{e}}_{\rm{obs}}(t) = \textbf{e}_{\rm{F}} + \textbf{e}_{\rm{R}}(t)\). After Wex (1997)
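The vector sum described in the caption can be sketched numerically. The snippet below is only an illustration: it assumes that \(\theta \) is measured from the x-axis of an arbitrary Cartesian frame in the orbital plane, and all numbers are hypothetical. It shows how a near-perfect cancellation of \({\textbf{e}}_{\rm{F}}\) and \({\textbf{e}}_{\rm{R}}\) at one epoch is undone as \({\textbf{e}}_{\rm{R}}\) rotates.

```python
import math


def observed_eccentricity(e_f, e_r_mag, theta0, omega_dot, t):
    """Return the vector e_obs(t) = e_F + e_R(t) in the orbital plane.

    e_f       : (x, y) components of the fixed, forced eccentricity
    e_r_mag   : constant length of the rotating eccentricity e_R
    theta0    : angle of e_R at t = 0 (rad), measured from the x-axis
    omega_dot : periastron advance rate (rad per unit time)
    t         : time, in the unit implied by omega_dot
    """
    theta = theta0 + omega_dot * t
    return (e_f[0] + e_r_mag * math.cos(theta),
            e_f[1] + e_r_mag * math.sin(theta))


# Hypothetical numbers: the two vectors cancel at t = 0.
e_f = (1e-6, 0.0)
e_r_mag = 1e-6
theta0 = math.pi
omega_dot = 2.0 * math.pi / 100.0  # one full turn of e_R in 100 time units

for t in (0.0, 25.0, 50.0):
    ex, ey = observed_eccentricity(e_f, e_r_mag, theta0, omega_dot, t)
    print(f"t = {t:5.1f}  |e_obs| = {math.hypot(ex, ey):.3e}")
```

After half a turn of \({\textbf{e}}_{\rm{R}}\) the observed eccentricity reaches \(|{\textbf{e}}_{\rm{F}}| + |{\textbf{e}}_{\rm{R}}|\), so a long observing span limits how much forced eccentricity can remain hidden.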

Compared to the Earth–Moon system in the Sun’s gravitational field, pulsar-WD systems as described above have a decisive disadvantage in testing the Nordtvedt effect: their external acceleration in the Galactic gravitational field is about seven orders of magnitude smaller. In this sense, the discovery of PSR J0337+1715 as a member of a hierarchical triple system with two WD companions (cf. Sect. 4.4.4) was a decisive turning point. The inner pulsar-WD binary accelerates in the gravitational field of the external WD at about \(0.17\,\hbox{cm s}^{-2}\), which is comparable to the external acceleration of the Earth–Moon system (\(\sim 0.6\,\hbox{cm s}^{-2}\)). Consequently, after about 6 years of timing observations of PSR J0337+1715 the above limits from pulsar-WD binaries could be improved by three orders of magnitude: \(|\Delta _{\rm{p0}}|< 2.6 \times 10^{-6}\) (95% C.L.) (Archibald et al. 2018). In an independent approach using a different set of timing data, Voisin et al. (2020) were able to slightly improve that limit to \(\Delta _{\rm{p0}} = (0.5 \pm 1.8) \times 10^{-6}\) (95% C.L.). Due to the large fractional binding energy of a NS, this limit implies the tightest limit for some alternative theories of gravity, including JFBD gravity, which originally motivated Kenneth Nordtvedt to suggest this type of test with LLR (see Sect. 5.5 for more details). For JFBD gravity, for instance, PSR J0337+1715 gives a conservative 95% confidence lower limit for the Brans–Dicke parameter of \(\omega _{\rm{BD}} \gtrsim 150,000\) (GR is obtained for \(\omega _{\rm{BD}} \rightarrow \infty \)), which is considerably stronger than the Solar System limit of 40,000 obtained from Cassini (Bertotti et al. 2003; Will 2018) (see Fig. 15).

Fig. 15

95% confidence lower limit on the Brans–Dicke parameter \(\omega _{\rm{BD}}\) from the pulsar in a stellar triple system, for 64 different NS EoSs. The x-axis shows the radius of a \(1.4\,{\rm{M}}_{\odot }\) NS predicted by the corresponding EoS. Stiffer EoSs give weaker limits (GR is obtained for \(\omega _{\rm{BD}} \rightarrow \infty \)). The red band gives the 95% credible limit for \(R_{1.4}\) given as the more conservative range in Koehn et al. (2024), hence we have the conservative lower limit of \(\omega _{\rm{BD}} \gtrsim 150,000\) (see Voisin et al. 2020 for more details). The 64 EoSs were taken from Read et al. (2009), Kumar and Landry (2019), assuming the criterion of a maximum mass of more than \(1.92\,{\rm{M}}_{\odot }\) (99% confidence lower limit for PSR J0740+6620; Fonseca et al. 2021)

Finally, it should be noted that the triple system test has not made the tests with pulsar-WD systems that fall in the gravitational field of the Milky Way completely obsolete. There are still aspects of the UFF that cannot be tested with PSR J0337+1715. For instance, the limit from PSR J1713+0747 can be interpreted as a test of the UFF towards dark matter in our Galaxy, which is of particular interest due to the pulsar’s neutron-rich composition and the significant amount of gravitational binding energy of the pulsar (see e.g., Shao et al. 2018).

5.2 Dipolar gravitational radiation

GR predicts that the lowest multipole for the generation of GWs is the mass quadrupole. The absence of any time-varying mass monopole and mass dipole is closely connected to the fulfillment of the SEP and the corresponding effacement of the internal structure of bodies (Damour 1987; Will 2018). Alternative theories of gravity that violate the SEP are generally expected to predict gravitational radiation at lower multipole moments, linked to additional gravitational fields (e.g., a scalar field). The corresponding (specific) gravitational “charge” of a body depends on the body’s internal structure. In a binary pulsar system with \(M_{\rm p} \ne M_{\rm c}\) this leads, most notably, to a time-varying gravitational dipole moment (Will 1993; Gérard and Wiaux 2002).Footnote 28 As a consequence, there is an additional damping of the orbital motion by dipolar GWs, leading to a contribution in the change of the orbital period, which in terms of the sensitivities can be written as (Will 1993)

$$\begin{aligned} {\dot{P}}_{\rm{b}}^{\rm{Dipole}} = -2\pi \left( \frac{P_{\rm{b}}}{2\pi }\right) ^{-1} (T_{\odot }M) X_{\rm{p}} X_{\rm{c}} \, \frac{1 + e^2/2}{(1 - e^2)^{5/2}} \, \kappa _{\rm{D}}(s_{\rm{p}}-s_{\rm{c}})^2 + {\mathcal{O}}(c^{-5},s^3), \end{aligned}$$
(48)

where \(\kappa _{\rm{D}}\) is a theory-dependent constant. The constant G is Newton’s gravitational constant as obtained in a Cavendish-type experiment. While in GR the damping of the binary orbit enters the equations of motion at the 2.5 PN order (\({\mathcal{O}}(c^{-5})\)), the dipolar radiation, as can be seen from Eq. (48) (see also e.g., Mirshekari and Will 2013), already enters at the 1.5 PN order, i.e., \({\mathcal{O}}(c^{-3})\). For \(\kappa _{\rm D}(s_{\rm p} - s_{\rm c})^2 \sim 1\) that means a change in the orbital period which is about six orders of magnitude larger than the GR prediction for binary pulsars. Consequently, binary pulsar experiments are generally very sensitive to any presence of dipolar radiation, in particular if there is a significant asymmetry in the sensitivities between pulsar and companion.
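As a rough numerical illustration of the leading term of Eq. (48), the sketch below evaluates \({\dot{P}}_{\rm{b}}^{\rm{Dipole}}\) for a set of hypothetical system parameters. It assumes that \(X_{\rm p}\) and \(X_{\rm c}\) are the mass fractions \(M_{\rm p}/M\) and \(M_{\rm c}/M\), that M is given in solar masses, and that \(T_{\odot }\) is the solar mass expressed in seconds; the higher-order terms of Eq. (48) are dropped.

```python
import math

T_SUN = 4.925490947e-6  # solar mass in time units, G*M_sun/c^3 (seconds)


def pbdot_dipole(p_b, m_total, x_p, ecc, kappa_d, s_p, s_c):
    """Leading dipolar term of Eq. (48), returned as a dimensionless rate (s/s).

    p_b     : orbital period in seconds
    m_total : total mass M in solar masses
    x_p     : assumed pulsar mass fraction M_p / M (companion gets 1 - x_p)
    ecc     : orbital eccentricity
    kappa_d : theory-dependent constant
    s_p/s_c : sensitivities of pulsar and companion
    """
    x_c = 1.0 - x_p
    n_b = 2.0 * math.pi / p_b  # this is (P_b / 2 pi)^-1
    f_e = (1.0 + 0.5 * ecc**2) / (1.0 - ecc**2) ** 2.5
    return (-2.0 * math.pi * n_b * T_SUN * m_total * x_p * x_c
            * f_e * kappa_d * (s_p - s_c) ** 2)


# Hypothetical pulsar-WD system, for illustration only.
value = pbdot_dipole(p_b=0.35 * 86400.0, m_total=1.65, x_p=0.89,
                     ecc=0.0, kappa_d=1.0, s_p=2e-3, s_c=0.0)
print(f"Pb_dot (dipole) = {value:.2e}")
```

The term vanishes for equal sensitivities and grows with the square of their difference, which is the reason why asymmetric systems such as pulsar-WD binaries are the preferred targets.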

If there is a gravitational dipole moment, one would generally expect it to be particularly large for pulsars with a WD companion, where, in comparison to the pulsar, the WD can be considered as a body with weak self-gravity having \(s_A \lesssim 10^{-3}\). For that reason, pulsar-WD systems are particularly interesting for constraining alternatives to GR (see Sect. 5.5 for a detailed discussion). From those pulsar-WD systems that allow a (mostly) theory independent determination of the masses, one can even derive quite generic limits on dipolar radiation. One such pulsar is PSR J1738+0333 which has a bright WD as companion that shows very prominent Balmer lines (see Sect. 4.4.1 for more details on that system). High resolution spectroscopy gives access to the mass ratio R and the mass of the WD (\(M_{\rm c}\)) (Antoniadis et al. 2012). The agreement of the (intrinsic) \({\dot{P}}_{\rm b}\) of the PSR J1738+0333 system with GR can then directly be converted into the limit \(|\kappa _{\rm D}^{1/2}s_{\rm p}|\lesssim 2 \times 10^{-3}\) (95% confidence), for a NS of about \(1.47\,{\rm{M}}_{\odot }\) (Freire et al. 2012).

Finally, theories that violate the SEP and predict the existence of dipolar radiation generally also are expected to predict a temporal variation of the gravitational constant G. A change in G, however, also leads to a change in the orbital period. To disentangle this effect from dipolar radiation, one needs to combine (at least two) different binary pulsars with different orbital periods (see Sect. 5.3.1 for more details).

5.3 Preferred-location effects

The SEP states that the outcome of any local experiment, including gravitational experiments with self-gravitating bodies, is independent of where and when in the universe it is performed. A violation of this local position invariance leads to a location and/or time dependence of gravitational phenomena, for instance a gravitational constant that evolves over time, with the expansion of the universe, or depends on the spatial location in a gravitating system. Scalar–tensor theories of gravity generally show this kind of preferred-location effect, for instance a change in Newton’s gravitational constant (as measured in a Cavendish experiment) due to the cosmological evolution of the (background) scalar field.

5.3.1 Time variation of Newton’s gravitational constant

That Newton’s gravitational constant is not in fact a constant of nature, but decreases in value over time, was already proposed by Paul Dirac as part of his large numbers hypothesis (Dirac 1937). Efforts to put Dirac’s heuristic reasoning on a field-theoretical footing, together with the motivation to implement (certain aspects of) Mach’s principle, eventually led to the development of scalar–tensor theories, with the JFBD theory of gravity (often just called Brans–Dicke gravity) as their best-known representative (Jordan 1955; Fierz 1956; Brans and Dicke 1961; Brans 2005).

Quite generally, theories that violate the SEP by allowing for preferred location effects are expected to permit Newton’s constant, G, to vary over time while the universe expands. Purely heuristically, temporal variations in G are expected to occur on the timescale of the age of the Universe, such that \(\dot{G}/G \sim H_0 \sim 0.7\times 10^{-10}\) \(\hbox{yr}^{-1}\), where \(H_0\) is the Hubble constant. Three different pulsar-derived tests can be applied to these predictions, as a SEP-violating time-variable G would be expected to alter the properties of NSs and WDs, and to affect binary orbits.

The effects on the orbital period of a binary system of a varying G were first considered in Damour et al. (1988), who expected:

$$\begin{aligned} \left( \frac{\dot{P}_{\rm b}}{P_{\rm b}}\right) _{\dot{G}} = -2\,\frac{\dot{G}}{G}. \end{aligned}$$
(49)

Applying this equation to the limit on the deviation from GR of the \(\dot{P}_{\rm b}\) for PSR B1913+16, they found a value of \(\dot{G}/G = (1.0 \pm 2.3)\times 10^{-11}\) \(\hbox{yr}^{-1}\). With the latest results from the Double Pulsar (see in particular Sect. 4.2.1) one obtains \(\dot{G}/G = (-0.8 \pm 1.4)\times 10^{-13}\) \(\hbox{yr}^{-1}\). Applying Eq. (49) to the binary pulsar J1713+0747, which has a WD companion, gives \(\dot{G}/G = (-0.8 \pm 2.4)\times 10^{-13}\) \(\hbox{yr}^{-1}\). While such an approach to obtain limits for \(\dot{G}\) from binary pulsars provides a first-order estimate, the actual numbers have to be taken with a grain of salt for several reasons.

Nordtvedt (1990) pointed out that for a strongly self-gravitating body a change in G leads to a significant change in the mass of the body, which in turn adds significant changes to the orbital period of a binary pulsar. Taking this effect into account leads to a corrected expression for Eq. (49)

$$\begin{aligned} \left( \frac{\dot{P}_{\rm b}}{P_{\rm b}}\right) _{\dot{G}} = -\left[ 2 - (X_{\rm p}c_{\rm p} + X_{\rm c}c_{\rm c}) - \frac{3}{2} (X_{\rm p}c_{\rm c} + X_{\rm c}c_{\rm p}) \right] \frac{\dot{G}}{G}. \end{aligned}$$
(50)

The compactness \(c_A \equiv -2\partial \ln m_A/\partial \ln G \approx -2 E_A^{\rm grav}/(m_Ac^2)\) is a measure for the change of a body’s mass due to a change of the (local) gravitational constant.Footnote 29 It is closely related to the body’s sensitivity \(s_A\) and one often finds \(c_A \approx 2s_A\) (see e.g., Damour and Esposito-Farèse 1992 for multi-scalar-tensor theories). The corrections by the compactnesses of pulsar and companion in Eq. (50) generally weaken the limits which are obtained directly from Eq. (49). In fact, as already pointed out in Nordtvedt (1990), depending on the masses, the EoS of NS matter, and the underlying theory of gravity, for a pulsar with a NS companion the expression in square brackets in Eq. (50) may easily be small compared to the factor 2 in Eq. (49). For that reason, pulsars with WD companions provide a somewhat more reliable generic test of \({\dot{G}}\), since there \(c_{\rm c} \simeq 0\) and therefore

$$\begin{aligned} \left( \frac{\dot{P}_{\rm b}}{P_{\rm b}}\right) _{\dot{G}} = -\left[ 2 -\left( 1 + \frac{1}{2} X_{\rm c}\right) c_{\rm p} \right] \frac{\dot{G}}{G}. \end{aligned}$$
(51)

For \(c_{\rm p}\) typically of order 0.2, limits are still comparable to what one obtains from Eq. (49).
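The reduction from Eq. (50) to Eq. (51) uses only \(c_{\rm c} \simeq 0\) and \(X_{\rm p} + X_{\rm c} = 1\). The following sketch checks this numerically and evaluates the square-bracket factor for a hypothetical pulsar-WD system; the mass fractions and compactness values are assumptions made for illustration.

```python
def gdot_factor(x_p, c_p, c_c):
    """Square-bracket factor of Eq. (50): (Pb_dot/Pb) = -factor * Gdot/G."""
    x_c = 1.0 - x_p  # assumes X_p + X_c = 1
    return 2.0 - (x_p * c_p + x_c * c_c) - 1.5 * (x_p * c_c + x_c * c_p)


def gdot_factor_wd(x_c, c_p):
    """Square-bracket factor of Eq. (51), valid for a companion with c_c ~ 0."""
    return 2.0 - (1.0 + 0.5 * x_c) * c_p


# Hypothetical pulsar-WD system: X_p = 0.85, c_p = 0.2, c_c = 0.
full = gdot_factor(x_p=0.85, c_p=0.2, c_c=0.0)
reduced = gdot_factor_wd(x_c=0.15, c_p=0.2)
print(full, reduced)
```

For these inputs both expressions give a factor of about 1.8, i.e., close to the factor 2 of Eq. (49).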

The approach discussed so far comes with the assumption that a limit on the deviation of the (system intrinsic) \({\dot{P}}_{\rm b}\) from its GR expectation can directly be converted into a limit for \({\dot{P}}_{\rm b}^{{\dot{G}}}\). This assumption is unjustified insofar as a theory that predicts a \({\dot{G}}\) is quite generally also predicting a \({\dot{P}}_{\rm b}\) contribution from dipolar radiation (see Sect. 5.2) that cannot be separated from \({\dot{P}}_{\rm b}^{{\dot{G}}}\) in a single system.Footnote 30 Lazaridis et al. (2009) have therefore suggested a combined test with two suitable pulsar-WD systems that have a sufficiently large difference in their orbital periods, in order to break that covariance. Combining the long orbital period binary pulsar J1713+0747 with different short orbital period pulsar-WD systems eventually led to (Lazaridis et al. 2009; Zhu et al. 2019; Ding et al. 2020)

$$\begin{aligned} |{\dot{G}}/G|\lesssim 10^{-12} \, \hbox{yr}^{-1} = 0.014\,H_0 \quad (95\%\,\hbox{C.L.}). \end{aligned}$$
(52)
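The conversion between \(\hbox{yr}^{-1}\) and units of \(H_0\) in Eq. (52) can be reproduced with a few lines. The value \(H_0 = 70\,\hbox{km s}^{-1}\,\hbox{Mpc}^{-1}\) used below is an assumption made for this sketch; it is consistent with the \(0.7\times 10^{-10}\,\hbox{yr}^{-1}\) quoted above.

```python
KM_PER_MPC = 3.0856775814913673e19   # kilometres in one megaparsec
SECONDS_PER_YEAR = 365.25 * 86400.0  # Julian year in seconds


def hubble_per_year(h0_km_s_mpc):
    """Convert a Hubble constant from km/s/Mpc to 1/yr."""
    return h0_km_s_mpc / KM_PER_MPC * SECONDS_PER_YEAR


h0 = hubble_per_year(70.0)  # assumed H_0 = 70 km/s/Mpc
ratio = 1e-12 / h0          # limit of Eq. (52) in units of H_0
print(f"H_0 = {h0:.3e} per yr, limit = {ratio:.3f} H_0")
```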

It must be kept in mind that the exact value for that limit depends on the values of \(c_{\rm{p}}\) (respectively \(s_{\rm{p}} \approx c_{\rm{p}}/2\)) for the different NS masses that enter into Eqs. (48) and (51), for which we can only give approximate numbers in generic tests, provided certain additional assumptions hold (see e.g., Freire et al. 2012 for a discussion).

When compared to limits from Solar System experiments (e.g., Genova et al. 2018; Hofmann and Müller 2018; Biskupek et al. 2021), the pulsar limit (52) is more than an order of magnitude weaker. However, binary pulsar timing tests \({\dot{G}}\) in a very different gravity regime, and is therefore complementary to tests in weak-field environments like the Solar System. Two examples illustrate this. First, in theories like Barker’s constant-G theory (Barker 1978) there would still be a change to the NS mass if the background scalar field \(\varphi \) changes with the expansion of the universe, and hence a corresponding change in the orbital period of a binary pulsar (\({\dot{P}}_{\rm b} \propto {{\dot{\varphi }}}\)). Secondly, in the above equations we have ignored that in the interaction between pulsar and companion there is actually an effective gravitational constant which has a body-dependent contribution. Nordtvedt (1993) has written this effective gravitational constant as \(G^{\rm{eff}}(t) = G(t) K_{\rm{pc}}(t)\), where \(K_{\rm{pc}}(t)\) depends on the structure of the two bodies in the binary system (\(K_{12}(t) \simeq 1\) for two weakly self-gravitating masses \(m_1\) and \(m_2\)). As shown in Wex (2014), depending on the details of the gravity theory, i.e., the details of \(K_{\rm{pc}}(t)\), the strong field of a NS can lead to a very significant amplification of \({\dot{G}}^{\rm{eff}}\) so that \(|{\dot{G}}^{\rm{eff}}|\gg |{\dot{G}}|\).

5.3.2 Spatial variation of Newton’s gravitational constant

In addition to a temporal change in the locally measured gravitational constant \(G_{\rm loc}\), beyond GR there can generally also be a spatially varying gravitational constant, which depends on the position relative to (external) masses. Even for theories that in the weak field only deviate in \(\gamma _{\rm PPN}\) and/or \(\beta _{\rm PPN}\), \(G_{\rm loc}\) is generally expected to depend on the distance r to the external mass m:

$$\begin{aligned} \frac{G_{\rm loc}(r)}{G_0} \simeq 1 - (4\beta _{\rm PPN} - \gamma _{\rm PPN} -3) \frac{G_0 m}{c^2 r} = 1 - \eta _{\rm N} \,\frac{G_0 m}{c^2 r} \end{aligned}$$
(53)

where \(G_0\) corresponds to the local gravitational constant at spatial infinity (see e.g., Will 2018 for an expression that contains additional PPN parameters). For a strongly self-gravitating body, \(\eta _{\rm N}\) in Eq. (53) gets replaced by a body dependent parameter, which in the literature is often denoted by \(\eta ^*\).

As a consequence of a distance-dependent \(G_{\rm loc}\), a pulsar moving in an eccentric orbit around a strongly self-gravitating companion will experience a periodic variation of its moment of inertia (MoI) \(I_{\rm p}\) according to

$$\begin{aligned} \frac{\Delta I_{\rm p}}{I_{\rm p}} \simeq -\kappa _{\rm p} \, \frac{\Delta G_{\rm loc}(r)}{G_0} \simeq \kappa _{\rm p} \, \eta _{\rm c}^*\frac{G_0 m_{\rm c}}{c^2 r} \end{aligned}$$
(54)

which in turn leads to a variation in the pulsar’s intrinsic rotation according to \(\Delta \nu /\nu _0 = -\Delta I_{\rm p}/I_{\rm p}\) (Eardley 1975; Will 2018), due to the conservation of angular momentum. The (body-dependent) quantity \(\kappa _{\rm p} \equiv -(\partial \ln I_{\rm p}/\partial \ln G)_{N_b}\) is the sensitivity of the MoI of the pulsar, describing how the MoI of a NS (with a specific mass) changes in response to a variation in the local gravitational constant; the minus sign in the definition is the one required by the first relation in Eq. (54). The variation in the (observed) spin-frequency \(\nu \) caused by such an effect would become apparent as a modification of the PK parameter \(\gamma \) of the Einstein delay (cf. Eq. 18 and e.g., Will 2018). Consequently, radio pulsars in eccentric DNS systems are ideal to test for a change of the local gravitational constant in the vicinity of a strongly self-gravitating body. The wealth of PK parameters measured in the Double Pulsar (see Sect. 4.2) even allows a generic \(\sim 10^{-3}\) constraint on such a strong-field violation of the SEP (Kramer and Wex 2009; Kramer et al. 2021). Apart from that, this effect also plays an important role when constraining specific alternatives to GR, like scalar–tensor or TeVeS-like theories (see e.g., Damour and Esposito-Farèse 1996; Kramer et al. 2021).

In the following we discuss constraints on preferred location effects related to a non-vanishing Whitehead parameter \(\xi \). Such a deviation from GR is closely connected to a spatial anisotropy of the local gravitational constant.

5.3.3 Limits on \({\hat{\xi }}\)

A non-vanishing Whitehead parameter \(\xi \) leads to an anisotropy in the gravitational interaction of localised systems, induced by the mass distribution of our Galaxy. Such an anisotropy would lead to characteristic (non-GR) signatures in the dynamics of self-gravitating systems. While in general the best pulsar constraints on deviations from GR are obtained from binary systems or the pulsar J0337+1715 as part of a hierarchical triple system (Sect. 5.1), by far the best limits for (the strong-field generalisation of) \(\xi \) come from fast spinning solitary MSPs. A violation of local position invariance related to a non-vanishing \(\xi \) would lead to the precession of a spinning self-gravitating body around the direction to the Galactic centre (as the CM of the Milky Way) (Nordtvedt 1987). The rate of precession is given by (Nordtvedt 1987; Shao and Wex 2013)

$$\begin{aligned} \Omega _\xi ^{\rm{prec}} = \xi \, \frac{2\pi }{P} \left( \frac{v_{\rm{G}}}{c}\right) ^2\cos \vartheta _{\rm{G}}, \end{aligned}$$
(55)

where P denotes the rotational period of the body, \(v_{\rm G}\) the rotational velocity of the Galaxy at the location of the pulsar (as a measure for the local Galactic potential \(U_{\rm G}\)), and \(\vartheta _{\rm G}\) the angle between the direction to the Galactic centre and the spin of the body.Footnote 31 Nordtvedt (1987) used the alignment of the Sun with the planetary orbits to set a constraint on \(\xi \) of the order of a few times \(10^{-7}\). In the same publication, Nordtvedt already hinted at the potential of using fast rotating solitary pulsars to constrain \(\xi \). Following this, Shao and Wex (2013) have used 15 years of continuous observations of the two solitary MSPs B1937+21 (\(P = 1.56\) ms) and J1744-1134 (\(P = 4.07\) ms) with the 100-m Effelsberg radio telescope to infer a limit of

$$\begin{aligned} |{\hat{\xi }}|< 3.9 \times 10^{-9} \quad (95\%\,\hbox{confidence}), \end{aligned}$$
(56)

where the hat indicates that pulsars are testing a strong-field generalisation of the Whitehead parameter \(\xi \). Both of the pulsars are near the Galactic plane. The decisive factor for obtaining the limit (56) was the stability of the pulse profiles over such a long period of time, which gave no indication of any precession of the spins of these two pulsars.
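To illustrate why stable pulse profiles are so constraining, the sketch below evaluates Eq. (55) for \({\hat{\xi }}\) at the level of the limit (56) and accumulates the precession over 15 years. The Galactic rotational velocity of \(220\,\hbox{km s}^{-1}\) and the most favourable geometry \(\cos \vartheta _{\rm G} = 1\) are assumptions of this sketch; the published limit rests on a full statistical analysis with unknown spin orientation, which this toy calculation does not reproduce.

```python
import math

C_KM_S = 299792.458                  # speed of light in km/s
SECONDS_PER_YEAR = 365.25 * 86400.0  # Julian year in seconds


def xi_precession_rate(xi, spin_period_s, v_gal_km_s, theta_g_rad):
    """Precession rate of Eq. (55) in rad/s."""
    return (xi * 2.0 * math.pi / spin_period_s
            * (v_gal_km_s / C_KM_S) ** 2 * math.cos(theta_g_rad))


# P = 1.56 ms as for PSR B1937+21; v_G and theta_G are assumed values.
rate = xi_precession_rate(3.9e-9, 1.56e-3, 220.0, 0.0)
angle_deg = math.degrees(rate * 15.0 * SECONDS_PER_YEAR)
print(f"rate = {rate:.2e} rad/s, accumulated angle = {angle_deg:.2f} deg")
```

Under these assumptions the spin axis would move by a fraction of a degree over the observing span, a change that can alter the observed pulse profile of a narrow-beamed MSP.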

A non-vanishing \(\xi \) also leads to a precession of the orbital angular momentum of a binary pulsar (in Eq. (55) the rotational period P gets replaced by the orbital period \(P_{\rm{b}}\)). Shao et al. (2015) have used the binary pulsars J1012+5307 and J1738+0333 (both with a WD companion) to derive a (orbital dynamics related) limit of \(|{\hat{\xi }}|< 3.1\times 10^{-4}\) (95% confidence) which, however, is five orders of magnitude weaker than that from solitary pulsars.

The above limit on \({\hat{\xi }}\) can straightforwardly be converted into a limit on the anisotropy of the gravitational constant. If the local position invariance is violated, for a self-gravitating system falling freely in the gravitational potential of the Galaxy \(U_{\rm G}\) one could have a directional dependence in the local gravitational constant. For a system with mass m, radius R and MoI I one then finds a variation in the gravitational constant G of (Will 1993)

$$\begin{aligned} \frac{\Delta G}{G} \simeq \xi \left( 1 - \frac{3I}{mR^2}\right) U_{\rm{G}} \cos ^2\vartheta _{\rm{loc}}, \end{aligned}$$
(57)

where \(\vartheta _{\rm{loc}}\) denotes the angle between the direction to the Galactic centre and the direction to the location where G is being measured, as seen from the CM of the self-gravitating system. Such an anisotropy is generally expected to cause a precession of a rotating (self-gravitating) body like a pulsar. In case of a NS, one typically has \(I/mR^2 \sim 0.4\). Using this number, from their limit on \(\xi \) (Eq. 56) Shao and Wex (2013) then obtained

$$\begin{aligned} \left|\frac{\Delta G}{G} \right|< 4 \times 10^{-16} \quad (95\%\,\hbox{confidence}), \end{aligned}$$
(58)

which is the tightest limit so far on an anisotropy of G.
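The order of magnitude of the limit (58) can be recovered from Eqs. (56) and (57). The sketch below assumes that \(U_{\rm G}\) is expressed in units of \(c^2\) and approximates it by \((v_{\rm G}/c)^2\) with an assumed \(v_{\rm G} = 220\,\hbox{km s}^{-1}\); both choices are illustrative and not taken from the original analysis.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def delta_g_over_g(xi, i_over_mr2, u_gal_over_c2, theta_loc_rad):
    """Anisotropy of G according to Eq. (57), with U_G in units of c^2."""
    return (xi * (1.0 - 3.0 * i_over_mr2) * u_gal_over_c2
            * math.cos(theta_loc_rad) ** 2)


u_gal = (220.0 / C_KM_S) ** 2  # assumed proxy for U_G / c^2
amplitude = abs(delta_g_over_g(3.9e-9, 0.4, u_gal, 0.0))
print(f"|Delta G / G| < {amplitude:.1e}")
```

With \(I/mR^2 \sim 0.4\) the bracket in Eq. (57) has a magnitude of 0.2, so the limit on \(\Delta G/G\) is about \(0.2\,U_{\rm G}/c^2\) times the limit on \({\hat{\xi }}\).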

5.4 Preferred-frame effects and violation of the conservation of momentum

So far, we have dealt with deviations from GR that can already occur in fully conservative theories of gravity. In the following we will summarise pulsar tests for PPN parameters that are linked to the violation of the Lorentz invariance of the gravitational interaction (preferred frame effects), leading to a violation of the conservation of angular momentum, as well as tests of parameters that result in a violation of the conservation of total momentum.

5.4.1 Limits on \({\hat{\alpha }}_1\)

A non-vanishing \({\hat{\alpha }}_1\) implies that the (uniform) motion of a binary pulsar system with respect to a “universal” reference frame (defined by the rest frame of the Cosmic Microwave Background (CMB)) will affect its orbital evolution. Similar to the violation of the UFF (see Sect. 5.1), the time evolution of the observed eccentricity will depend on both a vector \({\textbf{e}}_{\rm R}\) of constant length that rotates in the orbital plane with angular velocity \({{\dot{\omega }}}\) and a fixed vector \({\textbf{e}}_{\rm F}\) as a result of the \({\hat{\alpha }}_1\)-induced polarisation of the orbit. The “forced eccentricity” \({\textbf{e}}_{\rm F}\) lies in the orbital plane and is perpendicular to \({\textbf{w}}\), the velocity of the binary system with respect to the preferred frame. The magnitude of \({\textbf{e}}_{\rm F}\) (for \(e \ll 1\)) is written as (Damour and Esposito-Farèse 1992; Bell et al. 1996):

$$\begin{aligned} |{\textbf{e}}_{\rm F} |\simeq \frac{|{\hat{\alpha }}_1|}{12} \, \left( \frac{P_{\rm{b}}}{2\pi }\right) \, |X_{\rm{p}} - X_{\rm{c}}|\, \frac{w_{\perp }}{a}, \end{aligned}$$
(59)

where \(w_{\perp }\) denotes the length of the projection of the system’s velocity \({\textbf{w}}\) onto the orbital plane, and a the semimajor axis of the relative motion. In the above equation we have neglected a factor \((\frac{2}{3}{\hat{\gamma }} - \frac{1}{3}{\hat{\beta }} + \frac{2}{3} + \frac{1}{3}{\hat{\alpha }}_1 X_{\rm p} X_{\rm c})^{-1}\), which is justified by the fact that the effective (strong-field) Eddington parameters \({\hat{\gamma }}\) and \({\hat{\beta }}\) are already sufficiently constrained (to their GR value, i.e., 1) by other pulsar experiments, and the \(\hat{\alpha }_1\) term is small compared to one.
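A minimal numerical sketch of Eq. (59) follows. The semimajor axis is obtained here from Kepler’s third law with \(T_{\odot }\) as the solar mass in seconds, which is an assumption of this sketch (Eq. (59) itself takes a as given), and all system parameters are hypothetical.

```python
import math

T_SUN = 4.925490947e-6  # solar mass in time units, G*M_sun/c^3 (seconds)
C_KM_S = 299792.458     # speed of light in km/s


def semimajor_axis_km(p_b, m_total):
    """Relative semimajor axis from Kepler's third law (assumed helper).

    p_b in seconds, m_total in solar masses.
    """
    n_b = 2.0 * math.pi / p_b
    return C_KM_S * (T_SUN * m_total * n_b) ** (1.0 / 3.0) / n_b


def forced_eccentricity_alpha1(alpha1, p_b, x_p, w_perp_km_s, a_km):
    """Magnitude of e_F from Eq. (59); X_c = 1 - X_p is assumed."""
    x_c = 1.0 - x_p
    return (abs(alpha1) / 12.0 * (p_b / (2.0 * math.pi))
            * abs(x_p - x_c) * w_perp_km_s / a_km)


# Hypothetical small-eccentricity pulsar-WD system.
p_b = 0.35 * 86400.0
a_km = semimajor_axis_km(p_b, m_total=1.65)
e_f = forced_eccentricity_alpha1(alpha1=4e-5, p_b=p_b, x_p=0.89,
                                 w_perp_km_s=300.0, a_km=a_km)
print(f"a = {a_km:.3e} km, |e_F| = {e_f:.2e}")
```

The forced eccentricity scales linearly with \({\hat{\alpha }}_1\) and with the mass asymmetry \(|X_{\rm p} - X_{\rm c}|\), and it vanishes for equal masses.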

There are various binary pulsars with very small eccentricities (\(e \lesssim 10^{-6}\)). However, in principle a large \({\textbf{e}}_{\rm F}\) could be hidden by an equally large \({\textbf{e}}_{\rm R}\), since the observed eccentricity is the vector sum of the two. On the other hand, such a fortunate cancellation of \({\textbf{e}}_{\rm F}\) and \({\textbf{e}}_{\rm R}\) will eventually break down, as \({\textbf{e}}_{\rm R}\) rotates with respect to \({\textbf{e}}_{\rm F}\) at a rate of \({{\dot{\omega }}}\) (see Fig. 14). For that reason, small-eccentricity binary pulsar systems with a short orbital period, a large difference in the two masses, and a long observing time span should be the ideal systems for such a test.

Currently there are two systems that best fit the above criteria: PSRs J1738+0333 and J1909−3744. More details on these binary pulsars are given in Sects. 4.4.1 and 4.4.2, respectively. For both systems, high resolution spectroscopy observations of their WD companions gave access to their systemic radial velocities. Combined with the proper motion from timing, this allowed the determination of their 3D-velocity with respect to the Solar System and, furthermore, the determination of \({\textbf{w}}\). For PSR J1738+0333, 10 years of observation with the 305-m William E. Gordon Arecibo radio telescope led to constraints of (Shao and Wex 2012)

$$\begin{aligned} {\hat{\alpha }}_1 = -0.4^{+3.7}_{-3.1} \times 10^{-5} \quad (95\%\,\hbox{confidence}). \end{aligned}$$
(60)

Using 15 years of observations of PSR J1909−3744 with the Nançay Radio Telescope, Liu et al. (2020) were able to further improve this limit to

$$\begin{aligned} |{\hat{\alpha }}_1|< 2.1 \times 10^{-5} \quad (95\%\,\hbox{confidence}). \end{aligned}$$
(61)

The above limits are both better in magnitude than the weak-field results from LLR, and additionally also incorporate strong field effects related to the strong spacetime curvature inside and near the pulsars.

5.4.2 Limits on \({\hat{\alpha }}_2\)

As with the pulsar test of \({\hat{\xi }}\) in Sect. 5.3.3, by far the best constraints on (the strong-field generalisation of) \(\alpha _2\) also come from solitary MSPs. Nordtvedt (1987) has shown that a non-zero \(\alpha _2\) causes the spin of a self-gravitating body moving with velocity \({\textbf{w}}\) (relative to a preferred frame) to precess about the direction of \({\textbf{w}}\), with an angular velocity

$$\begin{aligned} \Omega _{\alpha _2}^{\rm{prec}} = -\alpha _2 \, \frac{\pi }{P} \left( \frac{|\textbf{w}|}{c}\right) ^2\cos \vartheta _w, \end{aligned}$$
(62)

where P denotes the rotational period of the body and \(\vartheta _w\) the angle between \({\textbf{w}}\) and the spin of the body.Footnote 32 While Nordtvedt (1987) used the alignment of the Sun with the planetary orbits to set the tightest constraint on \(\alpha _2\), he already identified pulsars as possible probes for an \(\alpha _2\)-related violation of Lorentz invariance. In a rough estimate he derived an early pulsar limit of the order of a few times \(10^{-6}\) from the (then) recently discovered MSP B1937+21. Shao et al. (2013) have used 15 years of continuous observations of the two MSPs B1937+21 (\(P = 1.56\) ms) and J1744-1134 (\(P = 4.07\) ms) with the 100-m Effelsberg radio telescope to infer a limit of

$$\begin{aligned} |{\hat{\alpha }}_2|< 1.6\times 10^{-9} \quad (95\%\,\hbox{confidence}). \end{aligned}$$
(63)

Like for the \(\xi \)-test in Sect. 5.3.3, the decisive factor for this analysis was again the stability of the pulse profiles over such a long period of time, which gave no indication of any precession of the spins of these two pulsars.

A non-vanishing \(\alpha _2\) also leads to a precession of the orbital angular momentum of a binary pulsar (for a (nearly) circular orbit, the rotational period P in Eq. (62) gets replaced by the orbital period \(P_{\rm b}\)). In Shao and Wex (2012) the binary pulsars J1012+5307 and J1738+0333 were used to derive a (orbital dynamics related) limit of \(|{\hat{\alpha }}_2|< 1.8\times 10^{-4}\) (95% confidence) which, however, is five orders of magnitude weaker than that from solitary pulsars.

5.4.3 Limits on \({\hat{\alpha }}_3\)

A non-vanishing \({\hat{\alpha }}_3\) implies both a violation of local Lorentz invariance and non-conservation of momentum in the gravitational sector. More specifically, it results in the self-acceleration of a rotating, gravitationally bound body that moves with respect to a preferred frame of reference. Within the first PN weak-field slow-motion approximation of the PPN formalism one finds (Nordtvedt and Will 1972; Will 1993):

$$\begin{aligned} {\textbf{a}}_A^{\rm{self}} = -\frac{\alpha _3}{3} \, \frac{E_A^{\rm grav}}{m_A\,c^2} \, \textbf{w} \times {\varvec{\Omega }}, \end{aligned}$$
(64)

where \({\varvec{\Omega }}\) is the rotational velocity vector of the body. As one can see from the above equation, the self-acceleration is perpendicular to the body’s spin and its velocity with respect to the preferred frame, i.e., \({\textbf{w}}\). For a strongly self-gravitating body, one needs to replace the fractional gravitational binding energy by the sensitivity (i.e., \(E_A^{\rm grav}/(m_A c^2) \rightarrow -s_A\)) and the PPN parameter by its strong-field equivalent (i.e., \(\alpha _3 \rightarrow {\hat{\alpha }}_3\)).

In the case of a binary system consisting of two spinning bodies, both components experience a self-acceleration according to Eq. (64). As a result, there is an acceleration of the CM of the whole binary system and a modification of the relative orbital motion of the two bodies (Bell and Damour 1996). As it turns out, the second contribution is the key to constraining \({\hat{\alpha }}_3\) with the help of binary pulsars. In systems consisting of a MSP and a WD, both the sensitivity and rotational velocity of the pulsar completely dominate those of the WD, and the self-acceleration of the WD can therefore be completely neglected. Furthermore, as a result of the recycling process, the spin of the pulsar is expected to be parallel to the orbital angular momentum of the system (see e.g., Bhattacharya and van den Heuvel 1991). In sum, in a small-eccentricity pulsar-WD system the eccentricity vector \({\textbf{e}}\) will experience a time evolution that is analogous to the one in Sects. 5.1 and 5.4.1, with a forced eccentricity given by (Bell and Damour 1996):

$$\begin{aligned} |{\textbf{e}}_{\rm F}|\simeq \frac{|{\hat{\alpha }}_3|}{3} \,s_{\rm p} \left( \frac{P_{\rm b}}{2\pi }\right) ^2 \frac{\pi \,\nu }{(T_{\odot }M)} \, \left|\frac{{\textbf{w}}}{c}\right|\sin \vartheta _w, \end{aligned}$$
(65)

where \(\vartheta _w\) is the (generally unknown) angle between \({\textbf{w}}\) and \({\varvec{\Omega }}\), and \(\nu = |\mathbf{\Omega }|/(2\pi )\) is the spin frequency of the pulsar.
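As a rough numerical illustration of Eq. (65), the following Python sketch evaluates the forced eccentricity and inverts it for the value of \(|{\hat{\alpha }}_3|\) that corresponds to a given eccentricity scale. The function names and all input values are hypothetical and chosen for illustration only; they do not reproduce any published analysis. The constant `T_SUN` is the usual \(T_{\odot } = G M_{\odot }/c^3\), and masses are assumed to be given in solar masses.

```python
import math

T_SUN = 4.925490947e-6   # G*M_sun/c^3 in seconds
C_KM_S = 299792.458      # speed of light in km/s


def forced_eccentricity(alpha3_hat, s_p, pb_s, nu_hz, m_tot_msun,
                        w_km_s, theta_w_rad):
    """Magnitude of the forced eccentricity |e_F| following Eq. (65)."""
    return (abs(alpha3_hat) / 3.0 * s_p
            * (pb_s / (2.0 * math.pi)) ** 2
            * math.pi * nu_hz / (T_SUN * m_tot_msun)
            * abs(w_km_s) / C_KM_S
            * math.sin(theta_w_rad))


def alpha3_for_eccentricity(e_scale, s_p, pb_s, nu_hz, m_tot_msun,
                            w_km_s, theta_w_rad):
    """|alpha3_hat| for which |e_F| equals e_scale (Eq. (65) inverted)."""
    e_per_unit_alpha3 = forced_eccentricity(
        1.0, s_p, pb_s, nu_hz, m_tot_msun, w_km_s, theta_w_rad)
    return e_scale / e_per_unit_alpha3


# Hypothetical wide-orbit MSP-WD system (illustrative values only):
# sensitivity 0.15, P_b = 10 d, spin frequency 300 Hz, total mass 1.7 M_sun,
# |w| = 300 km/s, theta_w = 90 deg, eccentricity scale 1e-6.
example_alpha3 = alpha3_for_eccentricity(
    1e-6, 0.15, 10.0 * 86400.0, 300.0, 1.7, 300.0, math.pi / 2.0)
```

The sketch ignores the unknown orientation \(\vartheta _w\) and the time evolution of \({\textbf{e}}\), which an actual analysis has to treat probabilistically or via a direct constraint on \(\dot{\textbf{e}}\).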

The figure of merit for systems used to test \({\hat{\alpha }}_3\) is \(\nu P_{\rm{b}}^2/e\), which favours fast-spinning pulsars in wide orbits with low eccentricities. Based on probabilistic considerations, Gonzalez et al. (2011) have used an ensemble of suitable binary pulsar systems to obtain a 95% confidence limit of \(|{\hat{\alpha }}_3|< 5.5\times 10^{-20}\). A similar limit could be obtained by using just a single binary system. Based on a combined data set from the North American Nanohertz Observatory for Gravitational Waves (NANOGrav) and the European Pulsar Timing Array (EPTA) for binary pulsar PSR J1713+0747, Zhu et al. (2019) obtained a 95% confidence interval of

$$\begin{aligned} -3 \times 10^{-20}< {\hat{\alpha }}_3 < 4 \times 10^{-20}. \end{aligned}$$
(66)

This limit results from a direct constraint on a temporal variation in the eccentricity vector of the system, i.e., on \(\dot{\textbf{e}}\). For the PSR J1713+0747 system, all parameters involved in the evaluation of \(\dot{\textbf{e}}\) are measurable, except for the radial velocity of the pulsar binary with respect to the Solar System, \(v_r\), which enters the calculation of \({\textbf{w}}\). For that reason, \(v_r\) is kept as a free parameter and chosen such that it gives the most conservative limits for \({\hat{\alpha }}_3\).

5.4.4 Limits on \({\hat{\zeta }}_2\)

Another PPN parameter that is related to a violation of the conservation of momentum is \(\zeta _2\). It will join \(\alpha _3\) in accelerating the CM of a binary pulsar system (Will 1992, 1993):

$$\begin{aligned} {\textbf{a}}_{\rm{CM}} = ({\hat{\alpha }}_3 + {\hat{\zeta }}_2) \left( \frac{2\pi }{P_{\rm{b}}}\right) ^2 (T_{\odot }M) \, X_{\rm{p}}X_{\rm{c}}(X_{\rm{p}} - X_{\rm{c}}) \, \frac{e\,c}{2(1-e^2)^{3/2}} \, {\hat{\textbf{n}}}_{\rm{peri}}, \end{aligned}$$
(67)

where \(\hat{\textbf{n}}_{\rm peri}\) is a unit vector from the CM of the system to the periastron of the pulsar orbit. Again, the hat indicates that in the presence of NSs we have an effective strong-field generalisation of the PPN parameter. If not perpendicular to the line of sight, the acceleration \({\textbf{a}}_{\rm CM}\) produces an extrinsic contribution to a binary pulsar’s \({{\dot{\nu }}}\) as it changes the radial velocity \(v_r\) of the binary system and therefore the corresponding Doppler effect. In general, this contribution would not be separable from the spin-down \({{\dot{\nu }}}\) intrinsic to the pulsar. However, in relativistic binary pulsar systems, like PSR B1913+16 or the Double Pulsar, the large \({{\dot{\omega }}}\) has led to a significant change of \(\omega \) since the pulsar’s discovery: for PSR B1913+16, it has advanced by more than \(200^{\circ }\), and for the Double Pulsar by nearly \(360^\circ \). In such cases, the projection of \({\textbf{a}}_{\rm CM}\) onto the line of sight would have changed considerably over the observing time span, producing apparent higher order derivatives of the pulse frequency, i.e., \(\ddot{\nu }\), \(\dddot{\nu }\), etc. For instance, \(\ddot{\nu }\) is given by (Will 1992):

$$\begin{aligned} \frac{\ddot{\nu }}{\nu } = ({\hat{\alpha }}_3+{\hat{\zeta }}_2) \left( \frac{2\pi }{P_{\rm b}}\right) ^2 (T_{\odot }M) \, X_{\rm p}X_{\rm c}(X_{\rm p} - X_{\rm c}) \, \frac{e}{2(1-e^2)^{3/2}}\, \sin i\cos \omega \,{{\dot{\omega }}}. \end{aligned}$$
(68)

With the extremely tight constraints on \({\hat{\alpha }}_3\) (see Sect. 5.4.3), Eq. (68) can be used to directly set a limit on \({\hat{\zeta }}_2\). A corresponding equation for \(\dddot{\nu }\) can be used to derive additional constraints on \({\hat{\zeta }}_2\), which for some of the relativistic binary pulsars turned out to be even more constraining (Miao et al. 2020). A combination of four carefully selected short-orbital-period DNS systems (including the Hulse–Taylor pulsar) by Miao et al. (2020) led to the best limit so far for a \(\zeta _2\)-related violation of the conservation of momentum:

$$\begin{aligned} |{\hat{\zeta }}_2|< 1.3 \times 10^{-5} \quad (95\%\,\hbox{confidence}). \end{aligned}$$
(69)

The Double Pulsar is not included in the above result, since the analysis was done before the publication of Kramer et al. (2021).

With a limit such as (69), which results from a combination of different pulsars with different masses, one must always bear in mind that the underlying assumption is that the body-dependent parameter \({\hat{\zeta }}_2\) has only a weak dependence on the mass of a NS.

5.5 Alternative gravity theories

The excellent agreement of pulsar experiments with GR and the tight generic constraints on deviations from GR in the presence of strongly self-gravitating masses, as discussed above, consequently also means tight constraints on numerous specific alternative theories of gravity. In this section we mention a few examples, with a particular focus on mono-scalar-tensor theories of gravity. In view of the large number of alternatives to GR, however, this overview must inevitably remain very incomplete.

5.5.1 Damour–Esposito–Farèse gravity

A particularly well studied alternative to GR, at least in the context of pulsar experiments, is the mono-scalar-tensor theory of Damour and Esposito-Farèse with a quadratic coupling function (called DEF gravity in this review; see Damour and Esposito-Farèse 1992, 1993, 1996 for details). This two-parameter class of gravity theories shows various effects that quite generally illustrate how gravity could deviate from GR, in particular in the presence of strongly self-gravitating NSs. For this reason, DEF gravity is particularly suitable for a theory-space approach to interpret tests of GR with pulsars (Damour 2009). JFBD gravity (Jordan 1955; Fierz 1956; Brans and Dicke 1961), which for a long time was the most important competitor to GR, represents a one-parameter sub-class of DEF gravity.

Besides a spacetime metric \(g_{\mu \nu }\), DEF gravity contains a mass-less scalar field \(\varphi \), with asymptotic value \(\varphi _0\) at spatial infinity. The field equations of DEF gravity can be derived from the Einstein frame action

$$\begin{aligned} {\mathcal{S}} = \frac{c^4}{16\pi G_*} \int \big (R[g_{\mu \nu }] - 2g^{\mu \nu }\partial _\mu \varphi \partial _\nu \varphi \big )\,\sqrt{-g}\,d^4x + {\mathcal{S}}_{\rm mat}[\Psi _{\rm mat};{\tilde{g}}_{\mu \nu }], \end{aligned}$$
(70)

where \(G_*\) is the bare gravitational constant and R the curvature scalar.Footnote 33 All matter fields \(\Psi _{\rm mat}\) couple universally to the physical (Jordan) metric \({\tilde{g}}_{\mu \nu } \equiv g_{\mu \nu } \exp [2\alpha _0(\varphi - \varphi _0) + \beta _0(\varphi - \varphi _0)^2]\), and hence DEF gravity is a metric theory of gravity and fulfills the EEP. Furthermore, DEF gravity is a fully conservative gravity theory where only \(\gamma _{\rm PPN}\) and \(\beta _{\rm PPN}\) differ from their GR values:

$$\begin{aligned} \gamma _{\rm PPN} = 1 - \frac{2\alpha _0^2}{1 + \alpha _0^2}, \quad \beta _{\rm PPN} = 1 + \frac{\beta _0\alpha _0^2}{2(1 + \alpha _0^2)^2}. \end{aligned}$$
(71)

The two parameters of DEF gravity, \(\alpha _0\) and \(\beta _0\), define the two-dimensional parameter space of this class of alternatives to GR, which contains JFBD gravity (\(\beta _0 = 0\)) and GR (\(\alpha _0 = \beta _0 = 0\)). The Newtonian gravitational constant, as measured in a Cavendish-type experiment, is related to the bare gravitational constant \(G_*\) by \(G = G_*(1 + \alpha _0^2)\). There is a tight limit for \(|\alpha _0|\) of the order of a few times \(10^{-3}\) from Solar-System experiments (through tests on \(\gamma _{\rm PPN}\)), while \(\beta _0\) remains unconstrained in such weak-field experiments (Bertotti et al. 2003; Damour 2009; Fienga and Minazzoli 2024).
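The weak-field relations above can be checked with a few lines of Python. This is only a sketch with hypothetical function names; it uses Eq. (71) together with the relation \(\omega _{\rm BD} = (\alpha _0^{-2} - 3)/2\) quoted in the caption of Fig. 17.

```python
import math


def def_ppn_parameters(alpha0, beta0):
    """Weak-field PPN parameters (gamma_PPN, beta_PPN) from Eq. (71)."""
    a2 = alpha0 ** 2
    gamma_ppn = 1.0 - 2.0 * a2 / (1.0 + a2)
    beta_ppn = 1.0 + beta0 * a2 / (2.0 * (1.0 + a2) ** 2)
    return gamma_ppn, beta_ppn


def alpha0_from_gamma(one_minus_gamma):
    """|alpha0| corresponding to a given value of 1 - gamma_PPN."""
    return math.sqrt(one_minus_gamma / (2.0 - one_minus_gamma))


def omega_bd(alpha0):
    """Brans-Dicke parameter along the JFBD line (beta0 = 0)."""
    return (alpha0 ** -2 - 3.0) / 2.0


def newtonian_g_over_bare_g(alpha0):
    """Ratio G / G_* = 1 + alpha0^2."""
    return 1.0 + alpha0 ** 2
```

For \(1 - \gamma _{\rm PPN} = 10^{-5}\), `alpha0_from_gamma` returns \(|\alpha _0|\simeq 0.00224\), the value used for Fig. 16. Along the JFBD line the two functions are consistent with the familiar relation \(1 - \gamma _{\rm PPN} = 1/(2 + \omega _{\rm BD})\).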

The quantities (“gravitational form factors”) of a body with mass \(m_A\) and MoI \(I_A\) that enter the PK parameters are

$$\begin{aligned} \alpha _A\equiv \frac{\partial \ln m_A}{\partial \varphi _0}, \quad \beta _A \equiv \frac{\partial \alpha _A}{\partial \varphi _0}, \quad {\mathcal{K}}_A \equiv -\frac{\partial \ln I_A}{\partial \varphi _0}, \end{aligned}$$
(72)

where the number of baryons is kept fixed when taking the partial derivatives.Footnote 34 The quantity \(\alpha _A\) is the effective scalar coupling of the body and gives its specific scalar charge. For weakly self-gravitating masses \(\alpha _A\) approaches \(\alpha _0\). In the parameter space where spontaneous scalarisation does occur for NSs (\(\beta _0 \lesssim -4.5\)), \(\alpha _A\) can be of order unity even if \(\alpha _0 = 0\). Hence, in the strong gravitational fields of NSs DEF gravity can deviate significantly from GR, even if it is very close (or even identical) to GR in the weak-field regime. The other two gravitational form factors, i.e., \(\beta _A\) and \({\mathcal{K}}_A\), can show a similarly extreme, non-linear behaviour in the presence of strongly self-gravitating (material) bodies. For BHs, where there is a no-hair theorem, they are identical to zero (Damour and Esposito-Farèse 1992).

In an N-body system, the effective gravitational constant for the interaction of two bodies is body dependent, and is given by

$$\begin{aligned} {\hat{G}}_{AB} = G_*(1 + \alpha _A\alpha _B) = G\left( \frac{1 + \alpha _A\alpha _B}{1 + \alpha _0^2}\right) , \quad \hbox{with}\quad A\ne B. \end{aligned}$$
(73)

Likewise, the PPN parameters become body dependent:

$$\begin{aligned} {\hat{\gamma }}_{AB}= & {} 1 - \frac{2\alpha _A\alpha _B}{1 + \alpha _A\alpha _B}, \quad \hbox{with}\quad A\ne B, \end{aligned}$$
(74)
$$\begin{aligned} {\hat{\beta }}^A_{BC}= & {} 1 + \frac{\beta _A\alpha _B\alpha _C}{2(1 + \alpha _A\alpha _B)(1 + \alpha _A\alpha _C)}, \quad \hbox{with}\quad A\ne B, A\ne C. \end{aligned}$$
(75)

These PPN parameters can differ significantly from GR as well as from their weak field counterparts in Eq. (71). Figure 16 illustrates this for a specific case.

Fig. 16

Strong-field Eddington parameter \({\hat{\gamma }}_{AB}\) in (quadratic) DEF gravity for the Double Pulsar system. The plot shows the difference between GR and DEF, i.e., \(1-{\hat{\gamma }}_{AB}\), as a function of \(\beta _0\), for the interaction between the two NSs (red solid) and the interaction between pulsar B and a photon (red dashed). We have assumed \(1 - \gamma _{\rm PPN} = 10^{-5}\) (i.e., \(|\alpha _0|\simeq 0.00224\); blue horizontal line), which agrees well with current Solar-System experiments. To calculate the structure-dependent \(\alpha _{\rm A}\) and \(\alpha _{\rm B}\) we used the NS EoS ENG (see Lattimer and Prakash 2001)

In a binary pulsar system, the PK parameters get modified by the gravitational form factors. For the quasi-stationary effects at the 1PN level one finds

$$\begin{aligned} k&= \frac{{\hat{\beta }}_{\rm O}^2}{1-e^2} \left( \frac{3 - \alpha _{\rm p}\alpha _{\rm c}}{1 + \alpha _{\rm p}\alpha _{\rm c}} - \frac{X_{\rm p}\alpha _{\rm p}^2\beta _{\rm c} + X_{\rm c}\alpha _{\rm c}^2\beta _{\rm p}}{2(1 + \alpha _{\rm p}\alpha _{\rm c})^2} \right) , \end{aligned}$$
(76)
$$\begin{aligned} \gamma&= \frac{P_{\rm b}}{2\pi } \,{\hat{\beta }}_{\rm O}^2 \left( \frac{1 + {\mathcal{K}}_{\rm p}\alpha _{\rm c}}{1 + \alpha _{\rm p}\alpha _{\rm c}} + X_{\rm c}\right) X_{\rm c} \,e, \end{aligned}$$
(77)
$$\begin{aligned} s&= x\left( \frac{P_{\rm b}}{2\pi }\right) ^{-1} {\hat{\beta }}_{\rm O}^{-1} X_{\rm c}^{-1}, \end{aligned}$$
(78)
$$\begin{aligned} r&= \frac{G_*m_{\rm c}}{c^3}, \end{aligned}$$
(79)

where \({\hat{\beta }}_{\rm O} \equiv [2\pi {\hat{G}}_{\rm pc}(m_{\rm p} + m_{\rm c})/P_{\rm b}]^{1/3}/c\). Depending on the parameters \(\alpha _0\) and \(\beta _0\), for a given binary pulsar system some of these PK parameters can differ significantly from their GR values (Damour and Esposito-Farèse 1996). Above we have omitted PK parameters that so far have not played any role in constraining the DEF gravity parameter space, for instance, the rate of geodetic precession and the relativistic deformations of the orbit \(\delta _\theta \) and \(\delta _r\).
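The following Python sketch evaluates Eqs. (76)–(79) for given form factors. It is an illustration under stated assumptions rather than a timing-model implementation: masses are taken in solar masses, \(x\) is the projected semi-major axis in light-seconds, the Newtonian \(G\) in \(T_{\odot }\) is related to \(G_*\) via \(G = G_*(1 + \alpha _0^2)\), and the function name and argument names are hypothetical.

```python
import math

T_SUN = 4.925490947e-6  # G*M_sun/c^3 in seconds (Newtonian G)


def def_pk_parameters(pb_s, e, x_s, m_p, m_c, alpha0=0.0,
                      alpha_p=0.0, alpha_c=0.0,
                      beta_p=0.0, beta_c=0.0, k_p=0.0):
    """Quasi-stationary 1PN PK parameters k, gamma, s, r in DEF gravity."""
    m_tot = m_p + m_c
    x_p, x_c = m_p / m_tot, m_c / m_tot
    n_b = 2.0 * math.pi / pb_s
    ap_ac = alpha_p * alpha_c
    # G_pc * M / c^3, with G_pc = G (1 + alpha_p alpha_c) / (1 + alpha0^2)
    gm_c3 = T_SUN * m_tot * (1.0 + ap_ac) / (1.0 + alpha0 ** 2)
    beta_o = (n_b * gm_c3) ** (1.0 / 3.0)
    k = beta_o ** 2 / (1.0 - e * e) * (
        (3.0 - ap_ac) / (1.0 + ap_ac)
        - (x_p * alpha_p ** 2 * beta_c + x_c * alpha_c ** 2 * beta_p)
        / (2.0 * (1.0 + ap_ac) ** 2))
    gamma = (beta_o ** 2 / n_b
             * ((1.0 + k_p * alpha_c) / (1.0 + ap_ac) + x_c) * x_c * e)
    s = x_s * n_b / (beta_o * x_c)
    r = T_SUN * m_c / (1.0 + alpha0 ** 2)   # G_* m_c / c^3
    return {"k": k, "gamma": gamma, "s": s, "r": r, "beta_o": beta_o}


# GR limit (all scalar quantities zero) for illustrative, roughly
# Double-Pulsar-like input values; these numbers are placeholders.
gr = def_pk_parameters(pb_s=8834.5, e=0.0878, x_s=1.415, m_p=1.338, m_c=1.249)
```

With all scalar quantities set to zero the function returns the familiar GR expressions, e.g. \(k = 3{\hat{\beta }}_{\rm O}^2/(1-e^2)\) and \(r = T_{\odot } m_{\rm c}\); the advance of periastron then follows as \({{\dot{\omega }}} = 2\pi k/P_{\rm b}\).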

Concerning radiative aspects of gravity, in particular GW damping, one finds modifications already at the 1.5PN order in the orbital dynamics due to (scalar) dipolar GWs (see also Sect. 5.2 above). The corresponding change in the orbital period reads

$$\begin{aligned} {\dot{P}}_{\rm{b}}^{\rm{dipole}} = -2\pi \, {\hat{\beta }}_{\rm{O}}^3 \, X_{\rm{p}} X_{\rm{c}} \, \frac{1 + e^2/2}{(1 - e^2)^{5/2}} \, \frac{(\alpha _{\rm{p}} - \alpha _{\rm{c}})^2}{1 + \alpha _{\rm{p}}\alpha _{\rm{c}}}+{\mathcal{O}}({\hat{\beta }}_{\rm{O}}^5). \end{aligned}$$
(80)

Since for \(|\alpha _{\rm{p}} - \alpha _{\rm{c}}|\sim 1\) dipolar GW damping would be many orders of magnitude (\(\sim (c/v)^2\)) stronger than the GR GW damping, any confirmation of GR’s quadrupole formula in the GW emission puts extremely tight constraints on \(|\alpha _{\rm{p}} - \alpha _{\rm{c}}|\). However, only for binary systems that are sufficiently asymmetric in terms of the compactness of the two bodies does this convert into similarly stringent limits on DEF gravity. For that reason, pulsar-WD systems (see Sect. 4.4) are of particular interest here.
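The \((c/v)^2\) enhancement can be made explicit by comparing the leading term of Eq. (80) with the GR quadrupole prediction. The sketch below uses the standard Peters quadrupole formula for the GR part, which is not reproduced in this section, and hypothetical function names; masses are in solar masses and the higher-order terms of Eq. (80) are dropped.

```python
import math

T_SUN = 4.925490947e-6  # G*M_sun/c^3 in seconds (Newtonian G)


def pbdot_dipole(pb_s, e, m_p, m_c, alpha_p, alpha_c, alpha0=0.0):
    """Leading (1.5PN) dipolar contribution to the orbital decay, Eq. (80)."""
    m_tot = m_p + m_c
    x_p, x_c = m_p / m_tot, m_c / m_tot
    ap_ac = alpha_p * alpha_c
    gm_c3 = T_SUN * m_tot * (1.0 + ap_ac) / (1.0 + alpha0 ** 2)
    beta_o = (2.0 * math.pi / pb_s * gm_c3) ** (1.0 / 3.0)
    return (-2.0 * math.pi * beta_o ** 3 * x_p * x_c
            * (1.0 + e * e / 2.0) / (1.0 - e * e) ** 2.5
            * (alpha_p - alpha_c) ** 2 / (1.0 + ap_ac))


def pbdot_quadrupole_gr(pb_s, e, m_p, m_c):
    """GR quadrupole-formula orbital decay (Peters formula)."""
    m_tot = m_p + m_c
    x_p, x_c = m_p / m_tot, m_c / m_tot
    beta_o = (2.0 * math.pi / pb_s * T_SUN * m_tot) ** (1.0 / 3.0)
    f_e = ((1.0 + 73.0 / 24.0 * e ** 2 + 37.0 / 96.0 * e ** 4)
           / (1.0 - e * e) ** 3.5)
    return -192.0 * math.pi / 5.0 * beta_o ** 5 * x_p * x_c * f_e


# Hypothetical circular pulsar-WD system with a mildly scalarised pulsar.
ratio = (pbdot_dipole(8834.5, 0.0, 1.4, 0.3, 0.01, 0.0)
         / pbdot_quadrupole_gr(8834.5, 0.0, 1.4, 0.3))
```

For a circular orbit and a weakly coupled companion the ratio is \((5/96)\,(\alpha _{\rm p} - \alpha _{\rm c})^2/{\hat{\beta }}_{\rm O}^2\), which shows directly why even small differences in the scalar couplings are strongly constrained.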

Apart from the dipole contribution, there are also monopole and quadrupole contributions related to the scalar field, which further enhance the orbital decay. However, both of them enter at the 2.5PN level, i.e., \({\mathcal{O}}({\hat{\beta }}_{\rm{O}}^5)\), and are usually subdominant to \({\dot{P}}_{\rm{b}}^{\rm{dipole}}\). Detailed expressions can be found in Damour and Esposito-Farèse (1992).

As for GR (previous sections), one can use different binary pulsar systems and their observed PK parameters to test the parameter space of DEF gravity. However, for DEF gravity there is no effacement of the internal structure of the bodies, which is a consequence of the violation of the SEP. Consequently, one has to assume an EoS for NS matter and for every pair \((\alpha _0,\beta _0)\) one needs to integrate the structure equations for slowly rotating NSs (see Damour and Esposito-Farèse 1996), certainly for the pulsar but also for its companion if the latter is also a NS. For given central pressures one obtains the masses and the gravitational form factors for pulsar and companion, which then can be used to calculate the PK parameters (Eqs. 76–80). In this way, the PK parameters become (rather complicated) functions of the Keplerian parameters and the a priori unknown masses of the binary system. A point \((\alpha _0,\beta _0)\) in the DEF gravity plane passes the test (for an assumed EoS) if there is a pair of masses, i.e., central pressures, where the corresponding PK parameters agree with the observations (see Damour and Esposito-Farèse 1996, 1998 for details of this approach).Footnote 35 The procedure is somewhat simplified if the companion is a weakly self-gravitating body, since in this case one can assume \(\alpha _{\rm c} \simeq \alpha _0\) and \(\beta _{\rm c} \simeq \beta _0\).

For the pulsar in the stellar triple system (see Sects. 4.4.4 and 5.1) the situation is different. There we have no PK parameters to test DEF gravity, but we can directly test the effective gravitational constant of Eq. (73). A difference in the effective gravitational constant in the interaction between the pulsar and the outer WD and between the inner and the outer WDs is a strong effect that already enters at the Newtonian level in the equations of motion. To leading order the fractional difference \(\Delta \) in the accelerations towards the outer WD reads (cf. Damour 2009)

$$\begin{aligned} \Delta \simeq \alpha _0 \, (\alpha _{\rm p} - \alpha _0), \end{aligned}$$
(81)

where for the weakly self-gravitating WDs \(\alpha _A \simeq \alpha _0\) has been assumed. From the equation above, it can be seen that for very small \(\alpha _0\) (or \(\alpha _0 = 0\)), where we could still have a scalarised pulsar (spontaneous scalarisation), the triple-system pulsar test does not give useful constraints on \(\alpha _{\rm p}\).
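The loss of constraining power for small \(\alpha _0\) follows directly from Eq. (81), as the short sketch below illustrates. The function names and numerical values are hypothetical and serve only to show the scaling.

```python
import math


def sep_delta(alpha0, alpha_p):
    """Fractional acceleration difference Delta from Eq. (81)."""
    return alpha0 * (alpha_p - alpha0)


def alpha_p_offset_bound(delta_max, alpha0):
    """Upper bound on |alpha_p - alpha0| implied by |Delta| < delta_max."""
    if alpha0 == 0.0:
        return math.inf   # Delta vanishes identically: no constraint
    return delta_max / abs(alpha0)


# The same (hypothetical) limit on Delta becomes less informative
# about the pulsar's scalar coupling as alpha0 decreases.
bounds = [alpha_p_offset_bound(2e-6, a0) for a0 in (1e-3, 1e-5, 0.0)]
```

In this linearised picture the bound on \(|\alpha _{\rm p} - \alpha _0|\) scales as \(1/|\alpha _0|\), so a spontaneously scalarised pulsar with \(\alpha _0 = 0\) would go unnoticed by this particular test.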

Figure 17 shows constraints in the DEF-gravity parameter space obtained from different Solar System and pulsar experiments. Concerning pulsar limits, one has to keep in mind that the gravitational form factors of Eq. (72) depend on the structure of the NS and are therefore EoS dependent. Consequently, the pulsar limits in Fig. 17 change to some extent if a different EoS is chosen to solve the NS structure equations. To obtain robust limits, one needs to follow an EoS-agnostic approach like in Voisin et al. (2020), where a point in the DEF gravity plane is only excluded if it is excluded for a whole range (from soft to stiff) of viable EoSs. In Fig. 17 we have chosen only one (rather stiff) EoS to illustrate qualitatively the pulsar limits. In the highly non-linear regime of DEF gravity, the EoS dependence is particularly strong, and it requires a range of pulsars with a suitable distribution of masses in order to constrain DEF gravity (Shao et al. 2017; Zhao et al. 2022).

Fig. 17

Constraints on the DEF gravity parameter space from different experiments (95% confidence): Shapiro delay with the Cassini spacecraft (Bertotti et al. 2003), dipolar radiation (J1738+0333, J2222-0137; Sects. 4.4.1 and 4.4.3), Nordtvedt effect (LLR, Biskupek et al. 2021; triple system pulsar, Sect. 5.1), and the Double Pulsar (Sect. 4.2). Areas above a curve are excluded by the corresponding experiment (see Damour and Esposito-Farèse 1996, 1998 for details). Pulsar curves are computed with a comparably stiff EoS (MPA1 in Lattimer and Prakash 2001), which for most of the parameter space gives conservative limits. GR corresponds to \(\alpha _0 = \beta _0 = 0\), and JFBD theory is along the vertical \(\beta _0 = 0\) line with Brans–Dicke parameter \(\omega _{\rm BD} = (\alpha _0^{-2} - 3)/2\)

5.5.2 Various other alternatives to GR

While DEF gravity is arguably the best studied class of alternatives to GR in the context of pulsar experiments, there are many other theories that have been significantly constrained or even ruled out using pulsar timing. A complete list is beyond the scope of this review, so in this subsection we will only give a few particularly informative examples. The biggest challenge in confronting an alternative theory of gravity with pulsar observations is to calculate the “gravitational form factors” for NSs, which goes far beyond a linear approximation, as it requires the full non-linearity of the theory.

While in the previous section it was assumed that the potential of the scalar field \(V(\varphi )\) only plays a role on cosmological scales, if at all, this assumption can certainly be relaxed, for example to have a massive scalar field with a Compton wavelength comparable to the length scales relevant in the pulsar experiment. Pulsar timing results have been used to exclude certain parts of the parameter space of such massive scalar–tensor theories (see e.g., Alsing et al. 2012; Ramazanoǧlu and Pretorius 2016; Yazadjiev et al. 2016; Seymour and Yagi 2020a, b).

Binary pulsar observations have also been used to constrain or even exclude some MOND-like gravity theories, i.e., relativistic theories that have modified Newtonian dynamics as their non-relativistic limits and are an attempt to avoid the need for dark matter in the Universe, at least on certain scales. The most prominent example is Bekenstein’s tensor–vector–scalar (TeVeS) theory (Bekenstein 2004), which is practically excluded by the Double Pulsar (Freire et al. 2012; Kramer et al. 2021).Footnote 36 The parameter space of a natural extension of Bekenstein’s TeVeS with a quadratic coupling of matter to the scalar field, which by design satisfies Solar System tests, has been constrained with binary pulsars in Freire et al. (2012).

Other examples of the application of pulsar observations for testing alternative gravity theories are tests of Mendes–Ortiz (MO) gravity (Mendes and Ortiz 2016; Anderson et al. 2019), Einstein-Aether and khronometric gravity (Yagi et al. 2014; Gupta et al. 2021), scalar-Gauss-Bonnet gravity (Danchev et al. 2022; Yordanov et al. 2024), and the cubic Galileon model (Shao et al. 2020), just to name a few.

On a final note, there are various alternatives to GR that naturally or by design pass all pulsar experiments the same way as GR does, because they either invoke screening mechanisms that make them indistinguishable in their so-called “strong-field regime” (which even includes the Solar System) or because they are sufficiently short range in their modifications to GR, so that effects do not show up in the orbital dynamics of binary pulsars.Footnote 37 Theories that predict deviations only in the context of BHs also have not been tested in binary pulsar experiments, simply because no binary pulsar with a BH companion was available until now, with the first strong candidate having been published only this year (Barr et al. 2024).

6 Conclusions and future prospects

6.1 Summary

As the first half century since the discovery of the first binary pulsar comes to a close, it is important to reflect on what has been achieved in terms of tests of gravity theories. This includes the first tests of gravity theories with compact, strongly self-gravitating objects and the first detection of GWs from the orbital decay of the first binary pulsar. These represent qualitatively new tests in comparison with all previous tests in the Solar System.

However, it is also important to realise that the most precise tests of gravity theories based on the timing of binary (and triple) systems have been published since the last Living Review in Relativity on this topic (Stairs 2003):

  • The measurement of the orbital decay in the Double Pulsar published in 2021 (Kramer et al. 2021) improved the precision of tests of the radiative properties of gravity—especially the leading order quadrupolar term predicted by GR—by a factor of 25 over the best previous test. The results agree with GR within the relative 1-\(\sigma \) uncertainty of \(6.3 \times 10^{-5}\).

  • The same system allowed several other independent, high-precision tests of GR as well, including the first pulsar tests of terms past the leading order.

  • These include the most precise pulsar tests of the Shapiro delay, carried out in a spacetime with a curvature that is six orders of magnitude larger than the curvature probed with the Cassini-spacecraft test in the Solar System. More generally, it is the photon propagation test with the highest spacetime curvature, exceeding the images of the supermassive BHs \({\hbox{M87}}^{*}\) (Event Horizon Telescope Collaboration 2019a) and Sgr \({\hbox{A}}^{*}\) (Event Horizon Telescope Collaboration 2022) by more than nine and three orders of magnitude, respectively.

  • The MSP in a triple system, PSR J0337+1715, has allowed an improvement in our test of the UFF for NSs by three orders of magnitude (Archibald et al. 2018; Voisin et al. 2020). This test of the SEP provides some of the tightest constraints for many alternatives to GR, including JFBD gravity and a large part of the DEF-gravity parameter space.

  • The latter parameter space was also constrained by tight limits on the possibility of dipolar GW emission in a set of pulsar-WD systems with a wide range of pulsar masses.

  • Regarding other gravity theories, pulsar tests have not only provided important limits on other types of scalar–tensor theories, scalar-Gauss-Bonnet gravity, Einstein-Aether (a tensor–vector theory which violates Lorentz invariance in the gravitational sector), etc., but also entirely ruled out others, like Bekenstein’s TeVeS and some of its variations.

Impressively, GR still passes all these precise and diverse tests. More generally, these experiments test some fundamental aspects and symmetries of gravitation and spacetime:

  • The verification of the UFF for NSs via the non-detection of the Nordtvedt effect and of dipolar GW emission.

  • The stringent limits on UFF violation and on preferred-location and preferred-frame effects for the gravitational interaction further support the SEP. This is of fundamental importance, particularly in view of the conjecture that GR is the sole valid gravity theory that fully embodies the SEP.

  • Some of these radiative experiments are stringent probes into the nature of GWs, showing that they are, to leading order, quadrupolar as predicted by GR. These GW pulsar tests nicely complement GW tests obtained from merger observations with ground-based GW observatories.

  • These radiative experiments have also excluded some strong-field highly nonlinear deviations from GR, like the phenomenon of spontaneous scalarisation predicted by DEF gravity.

  • Finally, pulsars have also provided tight constraints for parameters of generic frameworks, like strong-field generalizations of PPN parameters and parameters of the gravitational sector in the SME. Some of them are directly related to the aforementioned limits on violations of symmetries associated with the SEP, like the UFF, local position and local Lorentz invariance.

This flurry of recent results shows that gravity experiments using radio pulsars are thriving. They, and many results from the Solar System, EHT, LIGO/Virgo/KAGRA, etc., demonstrate the continued interest in precision gravity experiments.

6.2 Prospects

The prospects for improvements in the precision of these tests for the near future appear to be excellent. First, the mere continuation of some timing experiments will greatly improve many of the tests done with these systems. As an example, 2 years of additional data on PSR J0337+1715 allowed a (preliminary) doubling of the precision of the test of the UFF with this system (Voisin et al. 2022). Furthermore, simulations showed that continued timing of the Double Pulsar might constrain the MoI of PSR J0737−3037A to within 10% by 2030 (Hu et al. 2020), apart from significantly improving the precision in the measurement of the orbital decay. Although such a determination of the MoI assumes GR to provide the correct description of the needed PK parameters (and will eventually help to constrain the EoS), interpreted as a LT test it still allows one to probe for significant short-range deviations from GR that only affect pulsar A locally (see discussion in Hu et al. 2020).

A significantly improved radiative test and a LT test with the Double Pulsar rely on good independent constraints on the EoS of dense matter. These are provided by the measurement of large NS masses (Fonseca et al. 2021), the NICER constraints on the radius and mass of NSs (e.g., Miller et al. 2021; Vinciguerra et al. 2024) and measurements of the NS tidal deformability by ground-based GW detectors (Abbott et al. 2018) (see discussion in Sect. 4.2.2); the latter are especially valuable as they can be translated directly to constraints on the MoI via the I-Love-Q relation (Yagi and Yunes 2013). Improving these EoS constraints will not only lead to more precise radiative and LT tests with the Double Pulsar, but will also lead to more precise constraints on alternative theories of gravity, which as discussed in Sect. 5.5 are still subject to some uncertainties on this account.

The prospects for improvement of the pulsar gravity tests are further brightened by the higher sensitivity of telescopes like FAST, MeerKAT and in the future the SKA. FAST and MeerKAT are already improving the timing precision on all known pulsars, allowing new and more precise measurements of PK parameters. As an example, just 3 years of MeerKAT data yielded a photon propagation test in the Double Pulsar (Hu et al. 2022) that is a factor of two better than the previous one based on 16 years of data from 6 different telescopes (Kramer et al. 2021), which is significantly better than predicted by even the most optimistic simulations.

More importantly, it is clear that the pulsar field in general has been driven, from the start, by the discovery of new, better “laboratories”. The rate of pulsar discoveries has recently increased significantly, with 1000 new pulsars having been found by FAST and MeerKAT already.Footnote 38 Furthermore, the rate of discovery of recycled pulsars, and especially of recycled pulsars in very compact orbits, is increasing even faster, because of the much improved time and spectral resolution of the search data and the much improved computing capabilities and search algorithms.

All this will very likely lead to the discovery not only of more extreme versions of the currently known systems, which will allow new leaps in the precision for the types of tests described above, but also of completely new types of systems, such as pulsar–BH binaries, of which the first one might have already been found (Barr et al. 2024). Such systems will allow gravity tests that were until now beyond the testing power of pulsar timing (Wex and Kopeikin 1999; Liu et al. 2014; Seymour and Yagi 2018). In particular, the discovery of a pulsar in a relativistic orbit around the supermassive BH at the centre of our Galaxy would allow unprecedented tests of BH physics, especially in combination with tests from other observations in this extreme gravity environment (see e.g., Psaltis et al. 2016 and references therein).

The prospect of detecting very compact binary pulsars, especially DNSs (or pulsar-BH systems), is very alluring for tests of gravity theories. A general reason is the attainable significance of the radiative test in the presence of contaminants, which improves as \(P_{\rm b}^{-8/3}\). Such systems would also allow the measurement of the full precession cycle of relativistic spin-orbit coupling on reasonable timescales: for instance, a DNS with an orbital period of 30 min would have a geodetic precession period of about 5 years, which would then be measured precisely from the repeating changes in the pulse profile of the system. In fact, such a test of geodetic precession with repeating emission patterns could already be possible in the near future with double pulsar B, which is expected to precess back into our line of sight within the next few years (Breton et al. 2008; Perera et al. 2010; Lower et al. 2024).
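The quoted precession period for a 30-min DNS can be reproduced with the standard GR expression for the geodetic precession rate, \(\Omega ^{\rm geod} = (2\pi /P_{\rm b})^{5/3}\, T_{\odot }^{2/3}\, m_{\rm c}(4m_{\rm p} + 3m_{\rm c})/[2(m_{\rm p}+m_{\rm c})^{4/3}(1-e^2)]\), which is not reproduced in this section. The sketch below assumes GR, masses in solar masses, and two hypothetical NSs of \(1.35\,M_{\odot }\) in a circular orbit; the function name is likewise hypothetical.

```python
import math

T_SUN = 4.925490947e-6       # G*M_sun/c^3 in seconds
YEAR_S = 365.25 * 86400.0    # Julian year in seconds


def geodetic_precession_period_yr(pb_s, m_p, m_c, e=0.0):
    """Period of geodetic spin precession in GR, in years."""
    n_b = 2.0 * math.pi / pb_s
    m_tot = m_p + m_c
    rate = (n_b ** (5.0 / 3.0) * T_SUN ** (2.0 / 3.0)
            * m_c * (4.0 * m_p + 3.0 * m_c)
            / (2.0 * m_tot ** (4.0 / 3.0))
            / (1.0 - e * e))               # precession rate in rad/s
    return 2.0 * math.pi / rate / YEAR_S


# Hypothetical DNS: P_b = 30 min, two 1.35 M_sun NSs, circular orbit.
period_30min = geodetic_precession_period_yr(1800.0, 1.35, 1.35)
```

For these assumed masses the result is close to 5 years, and since the period scales as \(P_{\rm b}^{5/3}\), it lengthens quickly for wider systems.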

Additionally, very compact binary pulsars will be detectable at good S/N by the Laser Interferometer Space Antenna (LISA) mission if they are not too distant from Earth (Thrane et al. 2020). This mission will also find, independently, the most compact NS–NS, NS–WD or NS–BH systems of our Galaxy (Lau et al. 2020). Perhaps some of these NSs will be detectable as pulsars in targeted radio surveys. In either case, binary pulsar experiments would become “multi-messenger” experiments, allowing entirely new tests of gravity theories (Thrane et al. 2020; Miao et al. 2021).