The Astrophysics of Nanohertz Gravitational Waves

Pulsar timing array (PTA) collaborations in North America, Australia, and Europe, have been exploiting the exquisite timing precision of millisecond pulsars over decades of observations to search for correlated timing deviations induced by gravitational waves (GWs). The discovery space of this nanohertz band is potentially rich with populations of inspiraling supermassive black-holes binaries, decaying cosmic string networks, relic post-inflation GWs, and even non-GW imprints of axionic dark matter. This article aims to provide an understanding of the exciting open science questions in cosmology, galaxy evolution, and fundamental physics that will be addressed by the detection and study of GWs at nanohertz frequencies. The focus of the article is on providing an understanding of the mechanisms by which PTAs can address specific questions in these fields, and to outline some of the subtleties and difficulties in each case. The material included is weighted most heavily towards the questions which we expect will be answered in the coming months to decades with PTAs; however, we have made efforts to include most currently anticipated applications of nanohertz GWs.


Introduction
The first direct detection of Gravitational Waves (GWs) in 2015 started a new era in astrophysics, in which we can now use gravity itself as a unique messenger from the cosmos (Abbott et al. 2016a. The subsequent detection of a neutron star merger with associated electromagnetic counterparts in 2017 marked the beginning of multi-messenger astronomy . The advanced Laser Interferometer Gravitational-wave Observatory (aLIGO) that made those detections is sensitive to sources that emit gravitational radiation between about ten and a few thousand Hz. In the coming decades, we will open up additional bands in the GW spectrum, which will allow us to probe entirely new astrophysical sources and physics.
The detection of nanohertz GWs by Pulsar Timing Arrays (PTAs) is expected to be the next major milestone in GW astrophysics. In the future, PTAs and groundbased laser interferometry experiments will be complemented by space-based laser interferometers (Amaro-Seoane et al. 2017) and observations of primordial GWs, imprinted in the polarization of the cosmic microwave background (e. g. BICEP2/Keck Collaboration et al. 2015), providing comprehensive access to the GW Universe. Current PTA efforts are spearheaded by a number of groups worldwide, including the North American Nanohertz Observatory for Gravitational Waves (NANOGrav, Arzoumanian et al. 2018a), the European Pulsar Timing Array (EPTA, Babak et al. 2016;Desvignes et al. 2016;Lentati et al. 2015), and the Parkes Pulsar Timing Array (PPTA, Hobbs 2013;Lasky et al. 2016;Shannon et al. 2015). The individual groups are also the constituents of an international collaboration, known as the International Pulsar Timing Array (IPTA, Hobbs et al. 2010;Verbiest et al. 2016).
This paper presents a comprehensive background in astrophysical theory that can be addressed observationally by PTAs, and thus the science that will be extracted from the detection of GWs at nanohertz frequencies. Generally, we limit this paper to astrophysics related to GW detection in the PTA band, and in §7 to gravitational effects on PTAs not due to the pulsars or their companions. Throughout the present work, the "PTA band" refers to GW frequencies of approximately 1 -1000 nHz.
We do not aim to cover the rich (non-GW-related) astrophysics accessible by pulsar timing. The prolific ancillary science from a PTA as a whole includes, but it not limited to: neutron star population dynamics (Cordes & Chernoff 1998;Matthews et al. 2016), the formation histories of compact objects (Bassa et al. 2016, Kaplan et al. 2016, and the characterization of the ionized interstellar medium (Jones et al. 2017, Lam et al. 2016, including plasma lensing events (Lam et al. 2018), tests of general relativity (Kramer et al. 2006;Taylor & Weisberg 1982;Zhu et al. 2015) and the physics of nuclear matter (Demorest et al. 2010).
For readers seeking a brief summary, each section is led by an outline of the most salient overview points from that section. The layout of the remainder of this paper is as follows: §2 provides a backdrop of concepts in pulsar timing that are relevant to the understanding of this review. In §3, we discuss PTA applications to supermassive black hole binaries; in §4, we consider cosmic strings; in §5, we assess whether nanohertz GWs can present unique tests of General Relativity; in §6.1 we consider topics in cosmology, and in §7 we consider other (potentially more exotic) possibilities. In §8 and §9, we describe potential synergistic science in multi-band GW studies (in particular with the Laser Interferometer Space Antenna, LISA), and in multi-messenger studies (in particular with electromagnetic observations of binary supermassive black holes), respectively. In §10, we summarize the current and the expected near-future developments in this field.

Pulsar Timing in Brief
Here we introduce critical concepts for understanding how PTAs can access their target science.
Timing residuals: Evidence of GWs can be seen by the influence they have on the arrival time of pulsar signals at Earth. The measured vs. predicted arrival time of pulses, as a function of time, are referred to as the timing residuals.
Pulsar vs. Earth term: A propagating GW will pass both the pulsar and Earth, affecting their local space-time at different times. Pulsar timing can detect a GW's passage through an individual pulsar ("pulsar term"), and can detect a wave's passage through Earth ("Earth term") as a signal correlated between pulsars.
Correlation analysis: While we can detect the Earth or pulsar term in individual pulsars, a GW can only be confidently detected by observing the correlated influence of the GW on multiple pulsars, demonstrating a characteristic quadrupolar signature.

GW signals:
• Continuous waves from orbiting binary black holes.
• GW bursts from single-encounter SMBH pairs and cosmic strings.
• Bursts with memory, singular, rapid, and permanent step changes in space time that can accompany binary SMBH coalescence and cosmic strings. • GW background, the combined sum from all sources of GW emission.

Pulsar Timing and Timing Residuals
We time a pulsar by building a "timing model", which is a mathematical description of anything we know about that will affect the arrival times of its pulses at the Earth (for details on how precision pulse arrival times are measured, see e. g. Lorimer & Kramer 2012). Effects we know about and model include (but are not limited to) the time-dependent position of Earth in the Solar System, the natural slowing of a pulsar's period due to rotational energy loss, and any orbital motion of the pulsar, if it is in a binary. The parameters of the timing model are iteratively refined to minimize the root-mean-square (RMS) of the "timing residuals", which are the difference between the observed pulse arrival time and the arrival time expected based on the timing model.
As a GW moves between the Earth and a pulsar, it alters the local space-time, and thus changes the effective path length light must travel. By this process, the pulses will arrive slightly earlier or later than expected. GWs and any other processes influencing pulse arrival times that are unaccounted for in the timing model will manifest as structure in the pulsar's timing residuals. Since a pulsar's timing model is modified over time to remove as much structure as possible from the timing residuals (forming so-called "post-fit" timing residuals), some of the residual structure induced by a GW will be effectively "absorbed" by the timing model.
Simulated post-fit residuals influenced by a variety of GW signals, that PTAs are poised to detect, are illustrated in Fig 1. Residuals for three different pulsars are shown to demonstrate how the GW signal can vary from pulsar to pulsar. As can be easily seen, PTAs are sensitive to effects that last on time scales from ∼weeks, which is the approximate cadence of pulsar observations, to decades, which is how long PTA Figure 1: Each panel shows pulsar timing residuals for three pulsars (black triangles, red stars, blue circles) simulated with weekly observing cadence and 1 ns of white noise in their arrival times. The pulsar-to-pulsar variations demonstrate how the quadrupolar signature of GWs will manifest as correlated timing residuals in distinct pulsars. Note that 1 ns is not a noise level yet achieved for any pulsar, however here it allows us to demonstrate each observable signal type with a high signal-to-noise ratio. Panels are: (a) Continuous waves from an equal-mass 10 9 M SMBHB at redshift z = 0.01. The distortion from a perfect sinusoid is caused by self-interference from the pulsar term (Sec. 2.2). In this case, the pulsar term has a lower frequency because we are seeing the effects on the pulsar from an earlier phase in the SMBHB's inspiral evolution. This interferes with the Earth term, which takes a direct path from the source to Earth and therefore is a view of a more advanced stage of evolution. (b)A GW background with h c = 10 −15 and α = −2/3; (c) A memory event of h = 5×10 −15 , whose wavefront passes the Earth on day 1500. (d) A burst source with an arbitrary waveform.
experiments have been running. We note that the scale of these GW effects is realistic given the properties of SMBHBs, but the noise level is optimistically small by a factor of 20 or more depending on the pulsar. Since realistic signals will not have such a high signal-to-noise ratio, PTAs time dozens of pulsars to mitigate signal-to-noise limitations in individual pulsars, and search for correlations in their timing residuals.

The Pulsar Term, the Earth Term, and Correlation Analysis
A GW passing through the galaxy perturbs the local space-time at the Earth and at the pulsar, but at different times. Pulsar timing can detect a GW's passage through an individual pulsar (pulsar term) and also through the Earth (Earth term). The Earth term signal is correlated between different pulsars, while the pulsar term is not. § § Note, a more precise remark is that the Earth and pulsar terms are not correlated as long as two pulsars are separated by many gravitational wavelengths, that is to say that f L 1, where f is the GW frequency and L is distance to the pulsar. This assumption is called the short-wavelength approximation (e. g. ).
Nanohertz GW sources of interest are thought to be tens to hundreds of megaparsecs away and perhaps further, e.g. Mingarelli et al. (2017); Simon et al. (2014) -it is thus well-justified to approximate the GW as a plane wave. With this simplifying approximation, we can consider the influence of a GW on the observed pulse arrival time as a Doppler shift between the reference frame of the pulsar we're observing, and the solar system barycenter (from which we are observing). The Doppler-shifted pulsation frequency as measured by an observer at the quasi-inertial solar system barycenter is given by f obs = (1 + z)f emit . The observed redshift varies with time, depending on the time-varying influence of the GW on the local spacetime of the pulsar and the solar system barycenter. More specifically, at time t: where h ij is the space-time perturbation (typically referred to as the "strain"), and gives the metric perturbation describing the GW in transverse-traceless gauge. With the solar system barycenter as a reference position, the parameterp is a vector pointing to the pulsar position,Ω is a vector along the direction in which the GW propagates, t l = (l/c)(1 +Ω ·p), and l is the distance between the pulsar and solar system barycenter. The timing perturbation to pulse arrival times is the integral of the redshift over time (Anholm et al. 2009;Detweiler 1979), which reduces to the difference between the Earth term (evaluated at time t) and the pulsar term (evaluated at time t − t l ). Note the pulsar term, if observed, always depicts an earlier time in the evolution of a GW signal. This is because the Earth term samples GWs arriving to Earth directly from the source, while the pulsar term is associated with a longer path length, encompassing the GW's trip to the pulsar from the source and then the traversal of light from the pulsar to Earth. This may be an important effect in studying SMBHB evolution, as discussed in more detail in Section 3.2.6. Figure 1 illustrates simulated post-fit residuals for four types of GW signals. For each signal, the residuals of three separate pulsars are shown. Figures 1a and 1b, representing continuous GWs from an individual SMBHB and a stochastic GW background from an ensemble of SMBHBs, respectively, correspond to long-duration signals, for which both the Earth and pulsar terms are active simultaneously. In these cases, the pulsar term interferes with the Earth term and lessens the extent to which the residuals between different pulsars are correlated or anti-correlated. In Figure 1a, we have modeled the inspiral phase of a SMBHB, which over the course of its evolution changes its frequency and phase. Because each pulsar has a different position relative to Earth and the GW source, the pulsar terms probe different stages of the SMBHB orbit and the pulsar terms interfere with the Earth-term signal in slightly different ways. For burst-like signals, Figures 1c and 1d, the Earth term can be active, while the pulsar term is quiescent or vice versa. If the Earth term is active, but all pulsar terms are not, the timing perturbation from the GW will be fully correlated or anticorrelated across all pulsars in a PTA.
It is important to note that a GW can only be confidently detected by PTAs if the correlated influence of the GW on multiple pulsars is observed. Earth-based clock errors will influence all pulsar timing residuals the same way (monopolar signature, e. g. Hobbs et al. 2012), and errors in our solar system models will influence ecliptic pulsars more severely (dipolar signature, e. g. Champion et al. 2010). GWs are expected to have a quadrupolar signature, the directional correlations of which are expected to depend on the nature of gravity and the polarization of the GWs. Therefore, the relative positions of a GW source and two pulsars will dictate how the residuals of those two pulsars are correlated. How these correlations appear as a function of the angle between pulsars depends on the nature of gravity and the polarization of a GW, and is commonly shown for pulsar timing data as the "Hellings and Downs" curve (Hellings & Downs 1983). Section 5 discusses correlation analysis and the Hellings and Downs curve in more detail, and describes how various models of gravity will dictate the shape of the correlations observed.

Types of Gravitational-wave Signal
Depending on the origin of the GW signal, the induced residuals might appear as deterministic signals or a stochastic background. Here we simply aim to set up a reference point of what signal modes we expect to detect with PTAs. In the remainder of the document, we further discuss what information can be extracted about the Universe, depending on the type of the detected GW signal.
Cyclic signals (continuous waves; Fig. 1a) can arise from objects in an actively orbiting binary system. Bursts (Fig. 1d) represent rapid but temporally finite accelerations, e.g., during the pericenter passage in a highly eccentric or unbound orbit of two SMBHs (e. g. Finn & Lommen 2010). These classes can be detected by PTAs as long as their characteristic timescale is between weeks and a few decades.
Bursts with memory ( Fig. 1c) represent a rapid and permanent deformation in spacetime. A burst with memory (BWM) event is generally expected to occur on timescales less than 1 day (e. g. Favata 2010). Because the duration of memory's ramp-up time is relatively brief compared to PTA observing cadence, it is commonly modelled as an instantaneous step function in strain. PTAs cannot typically detect the memory event as it occurs, but they can see the resulting difference between the pre-and post-event spacetimes; the BWM creates a sudden, long-term change in the apparent period of a pulsar. This leaves a low-frequency ramp-like signature in the pulsar timing residuals Madison et al. 2014;van Haasteren & Levin 2010). In Figure 1c, the memory event at day 1500 causes a characteristic "ω" shape to be seen in the timing residuals, indicating a difference in the spacetime before and after the event. The residual shape here is influenced by our fit to the period and period derivative of each pulsar, which is required in the pulsar timing model, e.g., see Madison et al. (2014) and Section 3.1.3 for more details on this effect.
Finally, all of nature's deterministic signals can contribute en masse to a stochastic GW background (Fig. 1b). The strain of the background is frequencydependent, described by the characteristic-strain spectrum, h c (f ). This is calculated by integrating the squared GW strain signal over the entire emitting population (Phinney 2001). For most GW sources in the nanohertz to microhertz frequency regime, the predicted characteristic strain spectrum can be simplified as a single powerlaw: or in terms of the energy density of GWs, Ω GW ∝ f 2(1+α) . In this way, the background can be characterized with an amplitude A, and a single spectral index α. The amplitude A is commonly defined at a frequency f yr = 1 year −1 ∼ 32 nHz. As discussed in future sections, the details of physical processes in galaxies, cosmic strings, and inflation can potentially make the spectrum more complex than a single power law. Nevertheless, as the PTA sensitivity to a GW background increases, they will be able to detect the amplitude and spectral shape of the background. In Figure 1b, the background appears as a red noise process, because the index α is negative, leading to greater variations at longer timescales/lower frequencies. For a good introductory Figure 2: Here we outline the approximate number of pulsars and timing precision required to access various science topics based on current predictions for the strength and nature of each signal. The upper and lower versions represent the approximate requirements for a 10-and 25-year timing array, respectively. In reality, PTAs have pulsars with varying durations and sensitivities. The black curves represent the pulsars in the upcoming NANOGrav 12.5-year data release and expectations for a future timing array (e. g. Vigeland & Siemens 2016) that are timed to a precision lower than or equal to the indicated RMS timing precision. The 12.5-year data release contains approximately 70 pulsars, however the timescale over which each pulsar has been timed ranges from ∼1 year to 20 years. The location and shape of the SMBHB regions reflect the sensitivity scaling relations of Siemens et al. (2013) assuming a detection with a signal-to-noise ratio of at least five, and a SMBHB background of h c 1 × 10 −15 , which is just below the most recent limit placed by NANOGrav on this background source of GWs. The longer-duration PTA requires less precision and fewer pulsars for a detection because the signal-to-noise ratio scales with total observing time .
review on methods to detect the stochastic background, we refer readers to Romano & Cornish (2017). Figure 2 provides a bird's-eye view of the PTA sensitivities required to successfully breach each area of science that we describe in the remainder of this document. Regardless of the emission source, PTAs will grow in sensitivity by adding pulsars to the array, by decreasing the average RMS residuals, and simply by timing pulsars for a longer duration. Thus, in Fig. 2 we show how the requirements change as a function of these parameters. As seen in the top ("now") panel, PTAs have already breached the expectations for the background of GWs from SMBHBs formed in galaxy mergers, and are now setting increasingly stringent limits on galaxy/black hole coevolution. In the coming years to decade, we expect this to become first a detection of the background, and then become an exploration of the physics of discrete SMBHB systems. Deeper explorations of gravity, dark matter, and other effects should be soon to follow thereafter.

The Population and Evolution of Supermassive Black Hole Binaries
Supermassive black hole binaries (SMBHBs) are expected to be the brightest sources of nanohertz GWs. They form during major mergers between two galaxies, each of which contain their own SMBH.

Stochastic Background
• In the simplest model, an ensemble of SMBHBs should produce a stochastic GW background with a characteristic strain ) at a GW frequency of 1 year −1 . • The GW background spectrum might be detected with a shallower slope at frequencies 10 nHz, if SMBHB evolution is accelerated due to strong interactions with their environments (stars and gas). • Galaxy merger rates, SMBH-host co-evolution, dynamical relaxation timescales, and whether SMBHBs might stall at wide separations, can all affect the amplitude scaling of this background. • PTA constraints or measurement of the background's amplitude or spectral shape can give information on all of the above astrophysical uncertainties for the ensemble SMBHB population. • A detection of the SMBHB background would confirm the consensus view that SMBHs reside in most or all massive galaxies. • Constraints on background anisotropy may indicate local binary clustering or large-scale cosmic features.

Discrete Continuous-wave Sources
• Massive or nearby systems may be individually resolvable from the background. Detections will provide the most direct probe of the earlyinspiral stage of a SMBHB merger, and can provide measurements of the binary's position, phase, and an entangled estimate of chirp mass and luminosity distance (M/D L ). • If detected, the pulsar term can permit M and D L to be disentangled.
Evolution of the waveform over PTA experimental durations is unlikely for SMBHBs, however this would also disentangle the M/D L term.

Bursts With Memory
• PTA measurement of a burst with memory can provide the date, approximate sky location, and reduced mass over co-moving distance of a SMBHB coalescence.
It is now broadly accepted that SMBHs in a mass range around 10 6 -10 10 M reside at the centers of most galaxies (Kormendy & Richstone 1995;Magorrian et al. 1998), with several scaling relations between the SMBH and galactic-bulge properties (e.g. Ferrarese & Merritt 2000;Gebhardt et al. 2000) indicating a potential co-evolution between the two. In the standard paradigm of structure formation, galaxies and SMBHs grow through a continuous process of gas and dark matter accretion, interspersed with major and minor mergers. Major galaxy mergers form binary SMBHs, and these are currently the primary target for PTAs. In this section, we lay out a detailed picture of what is not known about the SMBHB population, how those unknowns influence GW emission from this population, and what problems PTAs can solve in this area of study.
In Figure 3, we summarize the lifecycle of binary SMBHs. SMBHB formation begins with a merger between two massive galaxies, each containing their own SMBH.
Through the processes of dynamical friction and mass segregation, the SMBHs will sink to the center of the merger remnant through interactions with the galactic gas, stars and dark matter. Eventually, they will form a gravitationally bound SMBHB (Begelman et al. 1980). Through continued interaction with the environment, the binary orbit will tighten, and GW emission will increasingly dominate their evolution.
Any detection of GWs in the nanohertz regime, either from the GW background or from individual SMBHBs, will be by itself a great scientific accomplishment. Beyond that first detection, however, there are a variety of scientific goals that can be attained from detections of the various types of GW signals. The subsections below discuss these in turn, first setting up GW emission from SMBHB systems, and then detailing the influence of environmental interactions. Each section describes a different aspect of galaxy evolution that PTAs can access.
Throughout this document we refer to SMBHB parameters using the following conventions: SMBH masses m 1 and m 2 have a mass ratio q = m 2 /m 1 defined such that 0 ≥ q ≥ 1. The total mass is M = m 1 + m 2 and the reduced mass is µ = m 1 m 2 /(m 1 + m 2 ). Chirp mass is defined as M ≡ (m 1 m 2 ) 3/5 /(m 1 + m 2 ) 1/5 . The binary inclination, eccentricity, and semi-major axis are given by the symbols i, e, and a, respectively. The parameter D is the radial comoving distance to the binary system. Other specific parameters will be defined in-line where relevant.

GW Emission from Supermassive Black Hole Binaries
3.1.1. PTAs and the Binary Lifecycle As shown in Fig. 3, SMBHBs can emit discrete PTA-detectable GWs in two phases of their lifecycle. PTAs can detect continuous waves during SMBHBs' active inspiral phase, up to a few days before coalescence. PTAs can also detect the moment of coalescence of a SMBHB by detecting its related burst with memory (covered in more detail in §3.1.3). As noted earlier ( §2.2), the "pulsar term" of a binary contains information about an earlier stage of binary evolution. For a sufficiently strong GW signal, the pulsar term can be measured in several pulsars, and thus multiple snapshots of the evolutionary progression of the binary can be detected simultaneously (Mingarelli et al. 2012).
Because binary inspiral is slower at larger separations, the number density is much higher for discrete systems at earlier stages of inspiral (that is, at low GW frequency). At these stages, the binary may still be interacting closely with its environment. Here, we review deterministic and stochastic GW emission from binary SMBHs, and in the next sub-sections we develop from this into how environments can influence the GW signals-and how PTAs can uniquely explore these physical processes.

Continuous Waves: Binary Inspiral
A PTA detection of the correlated signal from a continuous-wave SMBHB (Fig. 1a) will produce constraints on a system's orbital parameters (Arzoumanian et al. 2014;Babak & Sesana 2012;Babak et al. 2016;Ellis 2013;Lee et al. 2011a;Zhu et al. 2014), in much the same way as ground-based instruments can with stellar-mass binaries. PTAs may have potentially poorer parameter estimation; this is because PTAs will typically observe an early portion of the binary inspiral, and only have a glimpse of this phase over the ∼1-2 decade observational timespans of PTAs. During this time, we are unlikely to observe frequency evolution of the binary that creates the information-rich "chirping" signal seen by ground-based laser interferometers , which can allow GW experiments to derive distances to a binary and a detailed model of the system's evolution. However, if the system is of sufficiently high 2), although we may on rare occasion detect "GW memory" from a binary's coalescence (Favata 2010;Sec. 3.1.3). Millions of such binaries will contribute to a stochastic GW background, also detectable by PTAs ( §3.1.4). A major unknown in both binary evolution theory and GW prediction is the means by which a binary progresses from ∼10 pc separations down to ∼0.1 pc, after which the binary can coalesce efficiently due to GWs (e. g. Begelman et al. 1980). If it cannot reach sub-parsec separations, a binary may "stall" indefinitely; such occurrences en masse can cause a drastic reduction in the ensemble GWs from this population. Alternately, if the binary interacts excessively with the environment within 0.1 pc orbital separations, the expected strength and spectrum of the expected GWs will change. Image credits: Galaxies, Hubble/STSci; 4C37.11, Rodriguez et al. (2006); Simulation visuals, C. Henze/NASA; Circumbinary accretion disk, C. Cuadra. mass, has initially high eccentricity, or is detected in a late stage of inspiral evolution, then chirping within the observational window may be detectable (Lee et al. 2011b;.
As a binary evolves and accelerates in its orbit, it has a greater chance at decoupling from the environment. Somewhere below separations of ∼1 pc, this may occur and the binary can be considered as an isolated physical system. In this case, the dissipation of orbital energy will depend only on the constituent SMBH masses, the orbital semi-major axis, and the binary's eccentricity. As explored in the sections below, the timing of this decoupling has a distinct effect on the detectable GW signals from SMBHB systems. Here, we lay out pure orbital evolution due to gravitational radiation.
The leading order equations for GW-driven orbital evolution are (Peters 1964;Peters & Mathews 1963): (1 + 73 24 e 2 + 37 96 e 4 ) (1 − e 2 ) 7/2 , where the derivatives of the orbital separation and eccentricity are averaged over an orbital period. One should note from Equation 3 that GW emission always causes the eccentricity to decrease, i.e. the binary will become more circular as it inspirals toward coalescence. For purely circular systems, the GW emission frequency will be twice that of the orbital frequency, and will evolve as df /dt ∝ f 11/3 . An important concept here is that of residence times; because the binary's inspiral evolves more rapidly fast at smaller separations, it spends less time residing in a highfrequency state once its inspiral is GW-dominated (and accordingly, it spends less time residing in a state of small-separation). Thus, we would naturally expect there to be fewer binary systems emitting at high GW frequencies, and many more binary systems emitting at low GW frequencies. As you will see in the next section, this becomes a critical point in assessing the shape of the GW background.
For a population of eccentric binaries, the GW emission will be distributed across a spectrum of harmonics of each binary's orbital frequency. At higher eccentricities, the frequency of peak emitted GW power shifts to higher and higher harmonics. This peak will coincide approximately with the pericenter frequency (Kocsis et al. 2012), such that: where Here, f is the (Keplerian, observed, Earth-reference-frame) GW frequency, and z is source redshift. The parameter n describes the harmonic of the orbital frequency at which the GWs are emitted; for a circular system, n = 2. Eccentric systems emit at the orbital frequency itself as well as at higher harmonics (e. g. Wen 2003). Binary eccentricity and the Keplerian orbital frequency co-evolve in the following mass-independent way (Peters 1964;Peters & Mathews 1963;; where e 0 is the eccentricity at some reference epoch, and, Hence a binary with (for example) e 0 = 0.95, when its orbital frequency is 1 nHz, will have e ≈ 0.3 by the time its orbital frequency has evolved to 100 nHz. Finally, the strain at which GWs are emitted is often quoted as the "RMS strain" averaged over orbital orientations: This equation and those above highlight the fact that continuous-wave detection by PTAs will enable a measurement of the system's orbital frequency and eccentricity. However, the strain amplitude is scaled by the degenerate parameters M 5/3 /D; therefore, chirp mass and source distance cannot be directly measured unless there is orbital frequency evolution observed over the course of the PTA observations, or unless the host galaxy of the continuous-wave source is identified (Sec. 9). Some loose constraints on the mass and distance of the continuous wave might be inferred simply based on the fact that statistically, we expect the first few continuous-wave detections to be of the heaviest, relatively equal-mass systems, at low to moderate redshifts (z 1), as demonstrated originally in Sesana et al. (2009). It is worth noting here that, until now in this section, we have ignored the pulsar term ( §2.2). Because the Earth term is correlated between different pulsars, it will always be discovered at a higher S/N than the pulsar term. If the pulsar term can be measured in multiple pulsars, however, we can map multiple phases of the binary's inspiral history. We can exploit this information through a technique known as temporal aperture synthesis to disentangle chirp mass and distance, as well as improve the precision of other parameters (Corbin & Cornish 2010;Ellis 2013;Ellis et al. 2012;Taylor et al. 2014;Zhu et al. 2016). Likewise, if we have many pulsars and excellent timing precision then we can potentially place constraints on BH spin terms in the waveform (Mingarelli et al. 2012). This is discussed further in Section 3.2.6.

Memory: Binary Coalescence
SMBHBs are one of the two leading sources that we expect to produce detectable GW memory events (the other potential source, as noted later in this paper, is cosmic string loops). In the case of SMBHBs, the inspiral and even the coalescence produce the oscillatory waveform that we see as continuous waves. However, the stress tensor after the binary's coalescence will differ from its mean before the coalescence; this is apparent in the "BURST" panel in Figure 3, where the waveform is offset from zero after the SMBHB coalescence's ring-down period. That offset, which grows over the entire past history of the binary's evolution, but most precipitously during the coalescence, is the non-oscillatory term we call memory (Braginskii & Thorne 1987;Christodoulou 1991;Favata 2010;Thorne 1992). All SMBHBs are expected to produce a GW memory signal. SMBHBs may produce both linear and non-linear memory, where the former is related to the SMBHB motion in the final moments of coalescence, whereas the non-linear signal is produced by the GWs themselves (see the discussion in Favata 2011). The memory strain from a coalescing binary depends on the binary parameters and the black hole spin. For a circular binary, the order of the strain can reasonably be approximated as and the cross-hand polarization term h × goes to zero (Madison et al. 2014).
Note that due to its non-oscillatory nature, this strain deformation is permanent. As previously noted, this leads to a sudden observed change in the observed period of pulsars. After this event, timing residuals will begin to deviate from zero with a linear upwards or downwards trend. We would observe that ramp as such, if we knew the intrinsic spin period and spin-down rate of pulsars (instead, we measure these from timing data). After fitting the timing residuals for the pulsar's period and period derivative, one finds the signature shown in Figure 1c, with the sharp peak of the curve representing the moment of coalescence. Based on the time and the amplitude of this signature in the residuals, PTAs can measure the date of coalescence, and obtain a covariant measurement of the SMBHB's reduced mass and co-moving distance. If the signature is detected in the Earth-term (i.e. correlated between multiple pulsars), a position of the memory event can be loosely inferred, likely to a few thousand square degrees depending on the number of pulsars in the array and how well they are timed.
Predictive simulations have found that PTAs are highly unlikely to detect GW memory from SMBHB mergers due to the extreme rarity of bright events (Cordes & Jenet 2012;Ravi et al. 2015;Seto 2009). Nonetheless, Cutler et al. (2014) find memory to be increasingly important for investigating phenomena at high redshift and discuss its potential for discovering unexpected phenomena. Techniques to search for memory in PTAs have been developed (e. g. Madison et al. 2014;van Haasteren & Levin 2010), and applied to place limits with PTAs Wang et al. 2015).  , highlighting the effects of variations in particular binary model parameters on the resulting GW spectrum. The dashed black line is the spectrum using only the population mass distribution and assuming GW-driven evolution, and the grey-shaded region represents the uncertainty in the overall distribution of SMBHB in the universe. The cyan (orange) line is the GW background from a particular realization of a SMBHB population using a high (low) eccentricity model. The time sampling corresponds to a PTA with duration of 20 yr and a cadence of 0.05 yr. The NANOGrav 11yr detection sensitivity and GWB upper-limits (Arzoumanian et al. 2018b) are illustrated with a grey dotted-line and triangle, respectively. Note: the shaded regions are schematic.

The Stochastic Binary Gravitational-wave Background
The superposition of GWs from a population of SMBHBs will produce a stochastic background. The GW background has greater power at lower frequencies (Fig. 1b); we typically visualize this as a plot of characteristic strain, h c (f ). Figure 4 demonstrates the effect on the GW spectrum of variations in assumptions about the SMBHB population. This connection between PTA characterization of the background and SMBHB evolution and galaxy dynamical evolution is the focus of the following subsections. Here, we outline how many discrete continuous-wave sources can contribute to a GW background.
The calculation reveals the cosmological and astrophysical factors that can influence the spectral amplitude and shape of the GW background (Rajagopal & Romani 1995;Sesana et al. 2004;Wyithe & Loeb 2003). In particular, we typically calculate the characteristic strain spectrum from an astrophysical population of inspiraling compact binaries (e.g. that shown in Fig. 4) as where the contributing factors are: (i) d 4 N/dzdM 1 dqdt r is the comoving occupation function of binaries per redshift, primary mass, and mass-ratio interval, where t r measures time in the binary's rest frame. (Uncertainties in this quantity are illustrated as the grey shaded-region in Fig 4.) (ii) The expression within {· · ·} describes the distribution of GW strain over harmonics, n, of the binary orbital frequency when the system is eccentric. As previously noted, a circular system will emit at twice the orbital frequency, while eccentric systems emit at the orbital frequency itself as well as higher harmonics. The function g(n, e) is a distribution function whose form is given in Peters & Mathews (1963). Thus, non-zero eccentricity in a binary redistributes the power of the GW spectrum, as shown by green-shaded regions in Fig. 4.
(iii) dt r /d ln f K,r describes the time each binary spends emitting in a particular logarithmic frequency interval, the "residence time", where f K,r is the Keplerian orbital frequency in the binary's rest frame. This is largely controlled by the impact of the direct SMBHB environment, as shown by red-and blue-shaded regions in Fig. 4. The effects of environment are explored in much greater detail in Section 3.2 below.
(iv) h(f K,r ) is the orientation-averaged GW strain amplitude of a single binary as given by Eq. 8. Note that in Fig. 4, sharp peaks in the orange and cyan lines indicate contributions from single-sources that may be loud enough to be "resolved" from the background.
In the simple case of a population of circular binaries whose orbital evolution is driven entirely by the emission of GWs, the form of h c (f ) is easily deduced. In this case, g(n = 2, e = 0) = 1 and g(n = 2, e = 0) = 0 such that f = 2f K,r /(1 + z), and the residence time dt r /d ln f K,r ∝ f −8/3 is given by the quadrupole radiation formula (Peters 1964). Collecting terms in frequency gives h c (f ) ∝ f −2/3 , as per Equation 2 (this single power-law spectrum is shown as the black, dashed line in Fig. 4).
The h c ∝ f −2/3 power-law GW background spectrum assumes a continuous distribution of circular SMBHBs evolving purely due to GW emission over an infinite range of frequencies. As the residence time decreases with frequency, the probability that a binary still exists (i. e. has not coalesced) also decreases at higher frequenciesthat is, there are far fewer binary systems with a high-frequency orbit. At f 10 nHz, the Poisson noise in the number of binaries contributing significantly to a given frequency bin becomes important, and realistic GW spectra become 'jagged' (e.g. orange and cyan lines in Fig 4). At even higher frequencies, the probability of a given frequency bin containing any binaries approaches zero, and the spectrum steepens sharply relative to the power-law estimate in response (e.g. Sesana et al. 2008). ¶ At the same time, with fewer sources contributing substantially to the GW spectrum at higher frequencies, the chance of finding a discrete system that outshines It is important to note that most PTA information on GW backgrounds is derived from the lowest few accessible frequencies in the datasets (the most sensitive point is typically at frequencies ∼ 2/T ). However, published upper limits often quote a standardized constraint on Aα within a fiducial singlepower-law model, for instance in the SMBH binary case by projecting the limit to a frequency of f = 1 yr −1 using α = 2/3. If there are spectral turn-overs as described herein, such projections are non-physical, even if practical to use for comparison of two PTAs.
¶ The power-law average effectively includes the contribution from fractional binary systems.
the combined background becomes larger; in this case, we say that the binary can be "resolved" from the background as a continuous-wave source. Binaries with non-zero eccentricity emit GW radiation over a spectrum of harmonics of the orbital frequency, rather than just the 2nd harmonic as in the circular case. For large eccentricities, this can significantly shift energy from lower frequencies to higher ones, and change the numbers of binaries contributing energy in a given observed frequency bin. This can thus substantially change the shape of the spectrum (i.e. green shaded region in Fig 4), decreasing h c at low frequencies ( 10 −8 Hz) and increasing it at higher frequencies ( 10 −8 Hz) (Enoki et al. 2007;Huerta et al. 2015;Ravi et al. 2014;Taylor et al. 2017).
Here again it is worth explicitly tying these ideas back to the effect of binary residence times on the GW spectrum; while the strain of an individual SMBHB rises with frequency (Eq. 8), the number of binary systems contributing to the background falls with frequency, leading to the generally downward-sloped GW spectrum at high frequencies. The turn-over seen at low frequencies for the case of eccentric binaries and strong environmental influence (green, blue, and red curves) can likewise be interpreted in part as due to these effects decreasing the residence time of the binaries at those frequencies: the systems are pushed to evolve much faster through that phase than a circular, purely-GW-driven binary would, therefore fewer systems contribute.
The "turn-over frequency" that is seen at low GW frequencies for an eccentric and/or environmentally influenced population, as well as the shape of the spectrum before and after that turn-over in the spectrum, is rich with information about nuclear environments, binary eccentricities, and the influence of gas on binary evolution, as will be explored more fully in Section 3.2.

Gravitational-Wave Background Anisotropy
The incoherent superposition of GWs from the cosmic merger history of SMBHBs creates a GW background, as we have discussed. However, some of these SMBHBs may be nearby, but not quite resolvable as continuous waves ( §3.1.2). This can induce departures from isotropy in the GW background (e.g. Mingarelli et al. 2017). Moreover, it may be possible for a galaxy cluster to host more than one inspiralling SMBHB, and thus this sky region may present excess stochastic GW power. Indeed, many physical processes can induce GW background anisotropy, which can be characterized and detected using methods developed in Conneely et al. Importantly, GW background anisotropy will influence the shape of the observed Hellings and Downs curve ( §2.2; §5), leading to different correlation functions between pulsar residuals than would be observed for an isotropic background (Mingarelli & Sidery 2014;Mingarelli et al. 2013). The current limit on GW background anisotropy is ∼ 40% of the isotropic component ). Indeed, the expected level of anisotropy due to Poisson noise in the GW background is expected to be ∼ 20% of the monopole signal (Mingarelli et al. 2013), in agreement with current astrophysical predictions based on nearby galaxies (Mingarelli et al. 2017). These estimates, however, assume a specific M • − M bulge relation (McConnell & Ma 2013) for the prediction of anisotropy levels.
The detection of the isotropic GW background is expected to follow after a GW background detection , with the details depending on the underlying astrophysics of SMBH binaries. Given that current anisotropy estimates are of the order of 10% of the isotropic background, we can use scaling laws from   Siemens et al. (2013) to estimate that it will take another ∼ 10 years to detect anisotropy in the GW background.

The influence of binary environments on the expected GW signals
We now discuss in detail factors which influence both the number and waveforms of continuous-wave sources, and the amplitude and shape of the characteristic strain spectrum for the gravitational wave background from SMBHBs. PTAs will ultimately measure at least both continuous waves and the amplitude and spectral shape of the GW background at various frequencies. Thus, these measurements have the potential to explore the factors discussed below. The influence of several of the factors below, such as binary inspiral rate and their average eccentricity in early evolutionary phases, have covariant effects on the expected GW signals (Fig 4). Thus, the measurements of PTAs for some of the effects discussed below can be enhanced by constraints on these factors from other areas of astrophysics, e. g. through electromagnetic observation of the SMBHB population and through numerical simulations (Section 9).
This section is laid out as follows. Section 3.2.1 discusses how PTAs can directly measure the SMBH mass function via the influence of this parameter on the GW background. Except for the SMBH mass, the strain of the GW background at various frequencies depends on the residence time of the binaries, which in turn depends on the physical mechanisms that drive SMBHBs to coalescence. As illustrated in Figure 3, binaries may be influenced by several external effects, particularly in the phase immediately preceding continuous-wave GW emission in the PTA band. These effects are discussed in Sections 3.2.2-3.2.4. Finally, Sections 3.2.5 and 3.2.6 review how PTAs might access information about the eccentricity of binary systems and the spin of individual SMBHs in a binary.

The Mass Function of Supermassive Black Holes
The GW amplitude depends strongly on the masses of SMBHB components: h ∝ M 5/3 (Eq. 8). Because of that, errors in the assumed SMBH mass distribution may lead to significant errors in GW background level estimates. Unfortunately, there is a tendency for different SMBH mass-estimation techniques (stellar kinematics, gas kinematics, reverberation mapping, AGN emission lines) to systematically disagree with each other, with stellar kinematics usually giving the highest values. For example, the claims of a ∼ 2 × 10 10 M SMBH in NGC 1277 (van den Bosch et al. 2012) were subsequently found to be too large by factors of 3-5 (Emsellem 2013). Such a bias is unsurprising: beyond the Local Group, only a few galaxies show evidence for a central increase in the RMS stellar velocities (see Figure 2.5 in Merritt (2013)) expected in the presence of a SMBH. That implies the SMBH sphere of influence is unresolved and we can only measure an upper limit on its mass. Other methods have their own biases, e.g. different AGN emission lines give different SMBH mass estimates (Shen et al. 2008). These biases were highlighted in the "bias-corrected" SMBH mass-host galaxy relations of Shankar et al. (2016); PTA limits on the SMBH background have also supported the possibility of biased SMBH masses at the upper (> 10 9 M ) end of the relation, demonstrating that M • − M bulge relations must be constrained to below certain values, otherwise we should have already detected the GW background Simon & Burke-Spolaor (2016).
Unlike galaxy masses, only a handful of SMBH masses are directly measured. Therefore, when constructing a SMBHB population we have to rely on various SMBHgalaxy scaling relations, such as the relation between SMBH mass and galactic bulge mass: M BH = βM bulge . While these types of relations have been thoroughly studied, there may be systematic biases in the SMBH populations they measure (e.g. Shen et al. 2008). Unsurprisingly, different studies give different values of β (the most widely used one is β ≈ 0.003); that issue is analyzed in detail in the Section IIC of . Given the observed mass and mass ratio distribution of merging galaxy pairs, β is the only free parameter defining the SMBHB mass distribution.  came up with the following analytical approximation for the GWB strain spectrum (assuming zero eccentricity and triaxial galaxies): As one can see, lower SMBH masses not only lower the GWB amplitude, but also increase the turnover frequency, since lighter SMBHBs enter the GW-dominated regime at higher orbital frequencies.

Dynamical friction
When their host galaxies merge together, the resident SMBHs sink to the center of the resulting galactic remnant through dynamical friction (Antonini & Merritt 2012;Merritt & Milosavljević 2005). This is the consequence of many weak and long-range gravitational scattering events within the surrounding stellar, gas, and dark matter distributions, creating a drag that causes the SMBHs to decelerate and transfer energy to the ambient media (Chandrasekhar 1943). For a SMBH sinking towards the center of a spherical galaxy, the inspiral timescale is on the order of 10 Gyr (Binney & Tremaine 1987): where R e and σ are the galaxy's effective radius and velocity dispersion, M • is the SMBH's mass and ln Λ is the Coulomb logarithm. + However, the braking of the individual SMBHs cannot be modeled with just the black-hole mass, as they will initially be surrounded by their constituent galaxies. After the galaxies begin to interact, each SMBH is carried by its parent galaxy through the early stages of merger as the galaxies are stripped and mixed into one. Eventually, one (in the case of a high mass-ratio merger) or both SMBH, along with a dense core of stars and gas within the SMBH influence radius, will be left to inspiral on its own. Given a more realistic model of tidal stripping, the resultant timescale can often be shorter than 1 Gyr (Dosopoulou & Antonini 2017;. By extracting energy from the SMBHs on the kiloparsec separation scale, dynamical friction is a critical initial step towards binary hardening and coalescence. For systems with extreme mass ratios ( 10 −2 ) or very low total masses, dynamical friction may not be effective at forming a bound binary from the two SMBH within a Hubble time. In this case the pair might become "stalled" at larger separations, with one of the two SMBH left to wander the galaxy at ∼ kpc separations (Dvorkin & Barausse 2017;Kulier et al. 2015;Mcwilliams et al. 2014;Yu 2002). It is possible that a non-negligible fraction of galaxies may have such wandering SMBH, some of which may be observable as offset-AGN, discussed in §9.

Stellar loss-cone scattering
At parsec separations, dynamical friction becomes an inefficient means of further binary hardening. At this stage, the dominant hardening mechanism results from individual 3-body scattering events between stars in the galactic core and the SMBH binary (Begelman et al. 1980). Stars slingshot off the binary which can extract orbital energy from the system (Mikkola & Valtonen 1992;Quinlan 1996). The ejection of stars by the binary leads to hardening of the semi-major axis, and eccentricity evolution usually parametrized as (Quinlan 1996): where H is a dimensionless hardening rate, and K is a dimensionless eccentricity growth rate. Both of these parameters can be computed from numerical scattering experiments (e.g. Sesana et al. 2006). However, only stars in centrophilic orbits with very low angular momentum have trajectories which bring them deep enough into the galactic center to interact with the binary. The region of stellar-orbit phase space that is occupied by these types of stars is known as the "loss cone" (LC) (Frank & Rees 1976). Stars which extract energy from the binary in a scattering event tend to be ejected from the core, depleting the LC. In general, the steady-state scattering rate of stars is expected to be relatively low as stars are resupplied to the LC at larger radii where relaxation from star-star scatterings is slow. Like with dynamical friction at larger scales, binaries can also stall here, at parsec scales, due to inefficacy of the LC, which is typically known as the "final parsec problem" (Milosavljević & Merritt 2002. Generally, binaries that do not reach sub-parsec separations will be unable to merge via GW emission within a Hubble time (Dvorkin & Barausse 2017;Mcwilliams et al. 2014). Some simulations suggest, however, that even in the case of a depleted, steady-state LC, the most massive SMBHB, which dominate in GW energy production and also tend to carry the largest stellar masses, may still be able to reach the GW-dominated regime and eventually coalesce .
Various mechanisms have been explored to see whether the LC can be efficiently refilled or populated to ensure continuous hardening of the binary down to milliparsec separations. In general, any form of bulge morphological triaxiality will ensure a continually-refilled LC that can mitigate the final-parsec problem (Khan et al. 2013;Vasiliev et al. 2014Vasiliev et al. , 2015Vasiliev & Merritt 2013). Isolated galaxies often exhibit triaxiality, and given that the SMBH binaries of interest are the result of galactic mergers, triaxiality and general asymmetries can be expected as a natural post-merger by-product. Also, post-merger galaxies often harbor large, dense molecular clouds that can be channeled into the galactic center, acting as a perturber for the stellar distribution that will refill the LC (Young & Scoville 1991), or even directly hardening the binary (Goicovic et al. 2017). Finally, since binary coalescence times can be of the order of Gyrs, while galaxies can undergo numerous merger events over cosmic time (e.g. Rodriguez-Gomez et al. 2015), subsequent mergers can lead to the formation of hierarchical SMBH systems (Amaro-Seoane et al. 2010;Bonetti et al. 2018;Ryu et al. 2018). In this scenario, a third SMBH can not only stir the stellar distribution for LC refilling, but may also act as a perturber through the Kozai-Lidov mechanism (Kozai 1962;Lidov 1962) wherein orbital inclination can be exchanged for eccentricity (Amaro-Seoane et al. 2010;Blaes et al. 2002;Makino & Ebisuzaki 1994). Not only could a third SMBH more effectively refill the LC, but it could also increase the SMBH binary eccentricity which speeds up GW-inspiral (see §3.1.2).
Even in the absence of a third perturbing SMBH, binary eccentricity can be enhanced through stellar LC scattering. This has been observed in many numerical scattering experiments (Quinlan 1996;Sesana et al. 2006), where the general trend appears to be that equal-mass binaries (most relevant for PTAs) with very low initial eccentricity will maintain this or become slightly more eccentric. For binaries with moderate to large eccentricity (or simply with extreme mass-ratios at any initial eccentricity), the eccentricity can grow significantly such that high values are maintained even into the PTA band (Roedig & Sesana 2012;Sesana 2010).
The rotation of the stellar distribution (when the stars have nonzero total angular momentum) can impact the evolution of the binary's orbital elements. In particular, a stellar distribution that is co-rotating with the binary will tend to circularize its orbit. But if the stellar distribution is counter-rotating with respect to the binary, then interactions with stars in individual scattering events are more efficient at extracting angular momentum from the binary, enhancing the eccentricity to potentially quite high values (e > 0.9) (Mirza et al. 2017;Sesana et al. 2011). However, the binary's orbital inclination (with respect to the stellar rotation axis) also changes: it is always decreasing so that in the end, the initially counter-rotating binaries tend to become co-rotating with the stellar environment (Gualandris et al. 2012;. In most cases, this joint evolution of eccentricity and inclination leads to the eccentricities at PTA orbital frequencies being much lower than we would expect from a non-rotating stellar environment model . Interaction of a binary with its surrounding galactic stellar distribution will lead to attenuation of the characteristic strain spectrum of GWs at low frequencies (i.e. blue shaded region in Fig. 4). This can be separated into two distinct effects: (1) the direct coupling leads to an accelerated binary evolution, such that the amount of time spent by each binary at low frequencies is reduced; (2) extraction of angular momentum by stellar slingshots can excite eccentricity, which leads to faster GW-driven inspiral, and (again) lower residence time at low frequencies. Assuming an isothermal density profile for the stellar population * , we can re-arrange Equation 15 to deduce the orbital frequency evolution, and hence the evolution of the emitted GW frequency, such that df /dt ∝ f 1/3 . Inserting this into Equation 10 and collecting terms in frequency gives h c (f ) ∝ f , which is markedly different from the fiducial ∝ f −2/3 circular GW-driven behavior. The excitation of binary eccentricity by interaction with stars will further attenuate the strain spectrum at low frequencies, leading to an even sharper turnover (e.g. Taylor et al. 2017).

Viscous circumbinary disk interaction
At centiparsec to milliparsec separations, viscous dissipation of angular momentum to a gaseous circumbinary disk may play an important role in hardening the binary (Begelman et al. 1980;Kocsis & Sesana 2011). This influence will depend on the details of the dissipative physics of the disk, however the simple case of a binary exerting torques on a coplanar prograde disk has a self-consistent non-stationary analytic solution (Ivanov et al. 1999). These studies have assumed a geometricallythin optically-thick disk whose viscosity is proportional to the sum of thermal and radiation pressure (the so-called α-disk, Shakura & Sunyaev 1973). The binary torque will dominate over the viscous torque in the disk, leading to the formation of a cavity in the gas distribution and the accumulation of material at the outer edge of this cavity (i.e. Type II migration). The excitation of a spiral density wave in the disk torques the binary, and leads to hardening through the following semi-major axis evolution (Haiman et al. 2009;Ivanov et al. 1999): whereṁ 1 is the mass accretion rate onto the primary BH, and a 0 is the semi-major axis, at which the disk mass enclosed is equal to the mass of the secondary BH, given by (Ivanov et al. 1999) where α is a disk viscosity parameter,Ṁ E = 4πGm 1 /cκ T is the Eddington accretion rate of the primary BH (κ T is the Thompson opacity coefficient), and r g = 2Gm 1 /c 2 is the Schwarzchild radius of the primary BH.
The disk-binary dynamics may be much more complicated, for example, the disk may be composed of several physically-distinct regions (Shapiro & Teukolsky 1986), discriminated by the dominant pressure (thermal or radiation) and opacity contributions (Thompson or free-free). Additionally, high-density disks (equivalently: high-accretion rates) may provide rapid hardening, but may also be unstable due to self-gravity. Furthermore, while the system will initially pass through "diskdominated" Type II migration (where the secondary BH is carried by the disk like a cork floating in a water drain), it will generally transition to "planet-dominated" migration (where the secondary BH is dynamically dominant over the disk) which can be significantly slower.
Eccentricity growth may be significant during this disk-coupled phase (Armitage & Natarajan 2005;Cuadra et al. 2009), although there are no generalized prescriptions of the form of Equation 16. The growth of eccentricity is driven by outer Lindblad resonant interaction of the binary with gas in the disk at large distances (Goldreich & Sari 2003). However, Roedig et al. (2011) found that binaries with high initial eccentricity will experience a reduction in eccentricity, leading to the discovery of a limiting eccentricity for disk-binary interactions that falls in the interval e crit ∈ [0.6, 0.8]. The emerging picture is that in a low-eccentricity orbit, the density wake excited by the secondary BH will lag behind it at apoapsis, causing deceleration and increasing eccentricity. Whereas in a high-eccentricity orbit, the density wake instead advances ahead of the secondary BH, causing a net acceleration and reduction in eccentricity. All previously mentioned studies considered prograde disks, but if a retrograde disk forms around the binary then the eccentricity can grow rapidly, leading to significantly diminished GW-inspiral time (Schnittman & Krolik 2015).
Coupling of a viscous circumbinary disk with a SMBH binary, like in the stellar LC scattering scenario, will lead to attenuation of the characteristic strain spectrum of GWs through both direct coupling, and excitation of eccentricity (e.g. red shaded region in Fig. 4). Focusing on direct coupling of a circular binary to its surrounding disks, Kocsis & Sesana (2011) studied scaling relations for the strain spectrum from different disk-binary scenarios, varying from h c (f ) ∝ f −1/6 for secondary-dominated type-II migration in a radiation-dominated α-disk, to h c (f ) ∝ f 1/2 for the Ivanov et al. (1999)-model in Equation 17. Across all models, the characteristic strain spectrum can be flattened or even increasing due to disk coupling (very different from the fiducial ∝ f −2/3 circular GW-driven behavior), and spectral attenuation is further enhanced through disk excitation of binary eccentricity. When the disk-binary models are applied to populations of SMBHB, the overall GWB spectra tend to more closely resemble the canonical −2/3 power-law, because each disk-regime only applies to a fairly narrow frequency range for a given binary mass Kocsis & Sesana 2011).

Eccentricity
The influence of an initial eccentricity on SMBH evolution, without the influence of external driving factors as in the previous subsection, was shown in Equations 6 and 7 There have been several studies of the influence of non-zero binary eccentricity on the characteristic strain spectrum of nanohertz GWs (Enoki et al. 2007;Huerta et al. 2015;Ravi et al. 2014;Taylor et al. 2017). The exact shape and amplitude of the spectrum will depend on the detailed interplay of direct environmental couplings with eccentricity, and, in the case of the latter, the distribution of eccentricities at binary formation (Ravi et al. 2014;Ryu et al. 2018). Broadly speaking, (1) eccentricity increases the GW luminosity of the binary, meaning it evolves faster and thus spends less time emitting in each frequency resolution bin; (2) eccentricity distributes the strain preferentially toward higher harmonics of the orbital frequency. These effects lead to the strain being diminished at lower observed GW frequencies, but also somewhat enhanced at higher frequencies-i.e. the spectrum can exhibit a turnover to a positive slope at low frequencies, but then a small "bump" enhancement at the turnover transition.
In simulated SMBHB populations that include eccentricity, non-zero eccentricities tend to reduce the mean occupation number at lower frequencies, thus making the stochastic background appear to have a flatter (or inverted) spectrum than the standard α = 2/3. However, these eccentricities also work to redistribute the power to higher frequencies, where in a circular population the background would be otherwise dominated by small numbers of binaries. .

Measuring the Spin of Supermassive Black Hole Binaries
When GWs transit our Galaxy, they perturb pulsar signals both at the Earth and at the pulsar, i.e. they affect both the "Earth term" and the "pulsar term"; as a reminder, the pulsar term shows a GW signal that is delayed in time by an amount proportional to the light travel time between the Earth and the pulsar ( §2.2). That is, if a source (such as a SMBHB) is evolving, the pulsar term encodes information about earlier phases in the SMBH evolution. We can use this information to our advantage: when a continuous GW signal is detected, one can look for the perturbation caused by the same source in the pulsars, but thousands of years ago. In this sense, these pulsar terms can be used as "gravity echoes", akin to well-known light echoes from nova explosions.
These pulsar terms can in this way be potentially used to map the evolution of a SMBHB system over many thousands of years: each pulsar term is a snapshot of the binary during a different point in its evolution ( Figure 6; (Mingarelli et al. 2012)). The phase evolution of the SMBHB can thus be measured. However, in order to do this phase matching, we require that 2πf L < 1, where f is the GW frequency and L is the distance to the pulsar. Therefore, it is in principal necessary to measure the pulsar distances to e.g. ∼ 1.5 pc for a GW signal at 1 nHz.
Many pulsar distances are poorly constrained, since most are estimated via the dispersion measure of the pulse (e. g. Cordes & Lazio 2002;Yao et al. 2017). However, a number of nearby, well-timed pulsars have accurate position measurements based on parallax measured by very long baseline interferometry (e. g. Deller et al. 2018). For instance, the well-timed millisecond pular PSR J0437-4715 has a distance measurement of 156.3 ± 1.3 pc (Deller et al. 2008), and is thus suitable for this measurement. Pulsar timing can also be used to obtain a parallax measurement to the pulsar, e.g. Lyne & Graham-Smith (2012), but with relatively large errors, and optical surveys such as Gaia (Gaia Collaboration et al. 2018) can be used to measure parallaxes to some pulsars' white dwarf companions (Jennings et al. 2018) though again with limited accuracy due to the low brightness of the white dwarfs. Thus, while current pulsar distances are generally not suitable for an extensive study of this effect, future instruments such as the Square Kilometre Array (e. g. Smits et al. 2011) or Next-generation Very Large Array hold tremendous promise for enabling precise pulsar distance measurements, which will also enable more tests of fundamental physics.  Figure 6: Gravitational waves spanning thousands of years in a binary's evolutionary cycle can be detected from a continuous GW source by using the pulsar term. As an example, we have drawn a few pulsars with line-of-sight path length differences to the Earth. These relative time delays between the pulsar terms can be used to probe the evolution and the dynamics of a SMBHB systems over these many thousands of years. Right: a major galaxy merger leads to the creation of a SMBHB, emitting nanohertz GWs. Left: the pulsar term from each pulsar probes a different part of the SMBHB evolution, since they are all at different distances from the Earth. The blue sinusoid is a cartoon of the GW waveform, and shows that the pulsar terms can be coherently concatenated to probe the binary's evolution, allowing one to measure e.g. the spin of the SMBHB (Mingarelli et al. 2012).

Cosmic Strings and Cosmic Superstrings
Cosmic strings and superstrings produce GWs in the nanohertz band. Cosmic strings are topological defects that can form during phase transitions in the early Universe, while cosmic superstrings are fundamental strings stretched to cosmological scales due to the expansion of the Universe. In a cosmological setting, and for the most simple superstring models, both cosmic string and superstring networks evolve in the same way. Cosmic (super)strings can exchange partners when they meet and produce loops when they self-intersect. These loops then oscillate and lose energy to GWs generating a stochastic background, with a power-law spectrum having a spectral index close to that of SMBHBs (Blanco-Pillado et al. 2018;Maggiore 2000). Detection of such a background, or GWs from an individual cosmic (super)string loop, would therefore provide a unique window into high-energy physics. Currently, pulsartiming experiments are producing the most constraining bounds on the energy scale and other model parameters of cosmic strings and superstrings.
Topological defects are a generic prediction of grand unified theories. As the universe expands and cools, the symmetries of the grand unified theory are broken down into the Standard Model in one or more stages called phase transitions. At each of these phase transitions, topological defects generically form, with the type of defect depending on what symmetry is being broken. Cosmic strings are a one-dimensional (or line-like) type of topological defect that can form in the early universe during one (or more) of these phase transitions. The other common types of topological defects Figure 7: A depiction of the production of a cosmic string loop from a self-intersecting string. The loop oscillates and produces gravitational waves slowly decaying away. This process allows the string network to reach the scaling regime.
are monopoles and domain walls. Both of these are ruled out, however, because they lead to cosmological disasters (e.g. the monopole problem), and much of the attention of the theoretical cosmology community has focused on strings as the only viable candidate that could lead to potential observational signatures. The most simple symmetry breaking that leads to the formation of cosmic strings occurs in the Abelian Higgs model, where the symmetry group U (1) breaks U (1) → 1 at some temperature, or energy scale, T s . They are characterized by their mass per unit length µ, which in natural units is determined by the temperature at which the phase transition takes place, µ ∼ T 2 s . The tension of a string is normally given in terms of the dimensionless parameter Gµ/c 2 which is just the ratio of the string energy scale to the Planck scale squared. It is worth pointing out that analogous processes abound in condensed matter systems such as superfluid helium, Bose-Einstein condensates, superconductors, and liquid crystals, which can also lead to the production of topological defects.
Phase transitions in the early universe can therefore lead to the formation of a network of cosmic strings. Analytic work and numerical simulations show that the network quickly evolves toward an attractor called the scaling solution. In this regime, the statistical properties of the system-such as the correlation lengths of long strings and the size of loops-scale with the cosmic time, and the energy density of the string network becomes a small constant fraction of the radiation or matter density. This attractor solution is possible due to reconnections: when two strings meet they exchange partners ("intercommute"), and when a string self-intersects it chops off a loop (see Fig. 7). The loops produced by the network oscillate, generate gravitational waves, and shrink, gradually decaying away. This process removes energy from the string network, converting it to gravitational waves, and providing the very signal we seek to detect. The way the scaling solution works is as follows: if the density of strings in the network becomes large, then strings will meet more often and reconnect, producing extra loops which then decay gravitationally, removing the surplus string from the network. If, on the other hand, the density of strings becomes too low, strings will not meet often enough to produce loops, and their density will start to grow. In this way the cosmic string network finds a stable equilibrium density and a Hubble volume of the Universe with a string network statistically always looks like that shown in Fig. 8, stretched by the cosmic time.
Superstrings refer to the fundamental strings of string theory that in some models can be stretched to cosmological scales by the expansion of the Universe. Fortunately, much of what we have learned about the evolution of cosmic string networks can be applied to the evolution of cosmic superstrings. Aside from the possibility of forming more than one type of string, the most significant difference is that cosmic superstrings don't always reconnect when they meet. This occurs for two reasons: (i) because these string theory models require more than 3 dimensions, and though strings may appear to meet in 3 dimensions they can miss each other in the extra dimensions, and (ii) because superstrings, being the fundamental strings of string theory, interact with a probability proportional to the string coupling squared. The net effect is to lower the reconnection probability from p = 1 for cosmic strings to p ≤ 1 for cosmic superstrings. A network of cosmic superstrings still enters the scaling regime, albeit at a density higher by a factor inversely proportional to the reconnection probability (because strings need to interact all that more often to produce loops that gravitationally radiate away). The smaller reconnection probability of superstrings therefore actually increases the chances of finding observational signatures because the equilibrium string density of the scaling solution is higher. Figure 9 (Blanco-Pillado et al. 2018) shows the areas of the cosmic (super)string Gµ/c 2 -p parameter space excluded by present and potential future experiments, including LIGO and the three leading PTA experiments (PPTA, NANOGrav, and EPTA) as of 2017, as well as the planned LISA space mission. As the reconnection probability decreases, the density of strings in the scaling regime increases, and thus experiments are sensitive to lower and lower string tensions. Following our previous discussion, the limits at p = 1 represent the limits specifically for cosmic strings.
In the 1990s, cosmic microwave background data ruled out cosmic (super)strings as the primary source of density perturbations, placing constraints on the string tension at the level of Gµ/c 2 10 −7 , relegating them to at most a secondary role in structure formation. However, cosmic (super)strings are still potential candidates for the generation of a host of interesting astrophysical phenomena: including ultra high energy cosmic rays, fast radio bursts, gamma ray bursts, and of course gravitational radiation. Clearly, any positive detection of an observational signature of cosmic (super)strings would have profound consequences for theoretical physics. PTAs are currently the most sensitive experiment for the detection of cosmic (super)strings and will remain so for more than a decade and a half. Correspondingly, the most constraining upper limits on the energy scale of cosmic (super)strings come from PTA analyses. As of the writing of this paper, the current best limit on the string tension (for p = 1) comes from the NANOGrav experiment, and is Gµ/c 2 < 5.3(2) × 10 −11 (Arzoumanian et al. 2018b). Figure 10 shows the stochastic background spectrum produced by cosmic strings in terms of the dimensionless density parameter Ω versus frequency for dimensionless string tensions Gµ/c 2 in the range 10 −23 -10 −9 for p = 1. Overlaid are current and future experimental constraints from PTAs, ground-based GW detectors, and spaced-based detectors. PTA sensitivity will not be superseded until the LISA mission which is scheduled for launch in 2034.

The Nature of Gravity
Understanding gravity-and testing general relativity as a theory of gravityare leading pursuits of most GW detectors, and PTAs are no exception to this.
Polarization modes: Upon a high-significance detection of gravitational radiation, PTAs will be able to assess its polarization; this is done by correlating the residuals of pairs of pulsars, and constructing an angular correlation function based on the angle between the pair. This can be measured regardless of the type of radiation (continuous, memory, background, etc.). By characterizing the shape of the correlation function, the polarization mode of the signal can be measured. Various classes of gravity models, including General Relativity, will be supported or ruled out. PTAs will explore different targets than ground-and space-based interferometry, permitting competitive and independent constraints on theories of gravity.
Graviton mass and gravity group velocity: Variations in the graviton mass will cause slight variations to the correlation function which may be detected if PTAs acquire sufficient measurement precision. Graviton mass can also be constrained or measured by the temporal offset of any co-detected electromagnetic counterpart to a single GW source. If the graviton mass is minimal (m g 10 −22 eV), which it appears to be based on recent LIGO measurements, PTAs will instead produce precise limits on the group velocity of GWs.
For context, we finish this section by summarizing three examples of theories that PTAs can test, which make several predictions that differ from general relativity.
General relativity has been an exceptionally successful physical theory, and continues to stand up to over 100 years of tests of its accuracy . This includes the discovery of what had been the remaining unobserved prediction of Einstein for general relativity: the detection of GWs from two black holes by LIGO (Abbott et al. 2016c). Over the years there have been many other theories of gravity put forward. Some of these are based on physically aesthetic motivations, for instance a change in symmetry or adding a generalization, such as the scalar field in Brans-Dicke theory (Brans & Dicke 1961). Others, especially in recent years, strive to explain an observed phenomenon that is unexplained by current physical theories, such as dark matter or dark energy (Akrami et al. 2013). Whatever the nature of the theory, it must pass all of the observational tests that have made general relativity such a successful theory. Myriad new tests of gravity have become possible with GW observations e.g. Chatziioannou et al. 2012;Eardley et al. 1973b;Yunes & Pretorius 2009;Yunes & Siemens 2013)

How Many Gravitational Wave Polarization States Exist?
General relativity predicts the existence of GWs which travel at the speed of light, are transverse, and have two polarizations, the standard plus and cross. Other theories of gravity may predict the existence of GWs with different properties.
For any theory in which a metric encodes the dynamics of spacetime, there are six distinct GW polarizations possible (Eardley et al. 1973b). A linear GW is a small deviation from flat spacetime with a metric given by g µν = η µν + h µν , where η µν is the flat-space Minkowski metric and |h µν | 1. Since the metric is a symmetric 4×4 matrix, h µν has ten independent components: using our freedom to choose the coordinate system (i.e., gauge freedom), we can remove four of these components, leaving only six. These components can be grouped into the way each transforms under rotations, giving us two scalar components, two vector components, and two tensor components (Eardley et al. 1973a,b). The tensor components are those most commonly pursued, and are commonly referred to as the plus and cross polarizations.
There has recently been considerable interest in using observations of gravitational waves to look for evidence of these non-Einsteinian polarizations. For example, Isi et al. (2015) determined that the signal-to-noise of match-filtered searches for gravitational waves with aLIGO can be greatly impacted if they consist of non-Einsteinian polarization modes. Analysis of the detection of GW170814 (the first GW event detected by the two LIGO observatories and Virgo) favored tensor polarization modes over a pure vector or scalar modes by a Bayes-factor of more than 200 and 1000, respectively (Abbott et al. 2017b;Isi & Weinstein 2017).
We will work in a coordinate system in which h µ0 = 0, which is called the synchronous gauge. If we consider a plane wave traveling in the z-direction then a generic GW is described by six polarization tensors (Eardley et al. 1973a,b): where we have normalized these polarization tensors to satisfy e P ij e ij P = δ P,P . Three of these polarization tensors (+/×, and b) are transverse; the other scalar mode and the two vector components are longitudinal, as shown in Fig. 11.
PTAs are sensitive to the polarization of GWs of any sufficiently bright source, including single sources, the stochastic background, etc (Chatziioannou et al. 2012). For illustration, let us imagine a single, high-intensity plane GW traveling along the positive z-axis as we observe a PTA with pulsar pairs scattered across the sky, with any two pulsars separated by some arbitrary sky angle θ. For our single wave with various polarization modes induced, we show the predicted modulation of the observed pulse phase at various sky positions in Fig. 12. Here we can see the quadrupolar response to the usual plus and cross polarization, the dipolar response to the vector polarizations, and the monopolar response to the scalar polarizations.
For each pulsar pair, we can measure the correlated response of their timing residuals, and plot this response as a function of a pairs' angular separation, θ. The shape of this angular correlation function of pulse residuals, C(θ), will be different depending on the type of polarization in the GW being observed. In Fig. 13, we show the correlation of residuals for a collection of pulsar pairs separated by an angle θ (i.e., the Hellings and Downs curve, first formulated by Hellings & Downs (1983)). We can see that each of the different polarization states generates a distinct correlation structure which can, given sufficient PTA sensitivity, be measured . As discussed in Lee et al. (2008), biweekly observations for five years with RMS timing accuracy of 100 ns can detect non-Einsteinian polarization (i.e. other than the + and × modes) with 60 pulsars for the longitudinal scalar mode and the vector modes; 40  Lee et al. (2008) found that a PTA can discriminate between the tensor and non-tensor modes with a PTA with bi-weekly observations of 500 pulsars for five years with an RMS timing accuracy of 100 ns.
pulsars for the transverse scalar mode. To discriminate non-Einsteinian modes from Einsteinian modes, the PTA needs to monitor 40 pulsars for the transverse scalar mode, 100 pulsars for the longitudinal scalar mode, and 500 pulsars for the vector modes. As such, currently the detection of such modes is believed to be at least in the intermediate future; currently, up to ∼10 pulsars have the required residual RMS levels. However, this will change rapidly with the timing programs beginning on next-generation radio facilities.
As an example, Ref. ) established the first PTA upper limits on non-Einsteinian polarizations. Currently these limits are derived from the autocorrelation of the residuals of each pulsar with itself. This analysis assumed a gravitational wave background produced by a large collection of unresolved binaries and took into account the fact that dipole and quadrupole radiation will be emitted at different rates from a binary system. Given that the scalar longitudinal mode produces the largest autocorrelation (see Fig. 13) it has the most stringent PTA upper limit.

Characterizing the Dispersion Relation of Gravitational Waves
Theories which predict these novel polarization states generically also predict nonstandard GW dispersion relations which can result in a number of effects measurable by PTAs. We can model a range of modified dispersion relations using the parameterization in geometric units (Blas et al. 2016) where m g is the mass of the graviton and c g gives a group velocity different from the In the presence of a graviton mass, the Newtonian potential has a length-dependent suppression from which m g can be constrained using a variety of observations. The most constraining limit comes from weak lensing observations which yield m g < 6.9 × 10 −32 eV and can be translated into a frequency f g = m g c 2 /h > 1.7 × 10 −17 Hz (Choudhury et al. 2004). A massive graviton changes the functional form of the gravitational potential from a shape of ∼ 1/r to ∼ e −kmgr /r; this is the socalled Yukawa potential. This change in the potential leads to a change in density fluctuations of matter, which in turn affects the power spectrum of fluctuations measured in weak gravitational lensing surveys. It should be noted that these constraints are subject to model-dependent uncertainties in the exact distribution of dark matter, and so should be used with caution. Other, less model-dependent, constraints from the dynamics of objects in the solar system yield a limit m g < 4.4 × 10 −22 eV or f g > 10 −7 Hz (Talmadge et al. 1988;Will 1998).
With a non-zero graviton mass, GWs acquire a frequency-dependent phase velocity leading to an additional phase term in the GW signal detected by LIGO (Will 1998). Again, fixing c g = 1 constraints from GW170104 provide the best dynamical constraint to the graviton mass and yields m g < 7.7 × 10 −23 eV or f g > 1.8 × 10 −8 Hz .
Pulsars can also be sensitive to the graviton mass, and given that the best dynamical limit to the graviton mass lies in the middle of the PTA frequency band it is unsurprising that PTAs may complement the limits set by LIGO. A massive graviton would be detectable through slight variations in the Hellings and Downs curve, and an example of this is shown in Fig. 14. As discussed in , a PTA with bi-weekly observations of 60 pulsars for 5 years with a RMS timing accuracy of 100 ns, massless gravitons can be distinguished from gravitons heavier than m g = 3 × 10 −22 eV with 90% confidence level. A 10-year observation with the same RMS timing accuracy and cadence but with 300 pulsars would probe graviton masses down to m g = 3 × 10 −23 eV.
In addition to using the correlated residuals between pulsars in response to a stochastic GW background, if a PTA detects a single SMBH binary merger which has a time-tagged electromagnetic counterpart, a comparison between the arrival time of the two signals allows for a complementary probe of the graviton mass. We discuss this possibility further in Sec. 9.3.7.
Pulsar timing can also place a limit on the group velocity of GWs. Assuming that the graviton mass m g 10 −22 eV so that f g year −1 , in the PTA band the gravitational dispersion relation becomes ω 2 c 2 g k 2 . If c g < 1, the pulse train from a pulsar travels faster than the group velocity of GWs, thus "surfing" on the wave. These interactions can lead to phase changes in the waves that significantly amplify the observed timing residuals. Assuming a reasonable model for the SMBH binary background and taking the timing residuals for PSR B1937+21, Baskaran et al. (2008) find that 1 − c g < 10 −2 . This limit will improve significantly as PTAs become more sensitive.
In other GW bands, the GW group velocity can be estimated through measurements of the propagation time between the two LIGO detectors. An analysis of GW150914 allowed the first direct constraint on c g giving c g < 1.7 (Blas et al. 2016). An analysis of the neutron star binary merger GW170817 ) with its electromagnetic counterpart placed the most restrictive constraint on the relative propagation speed of gravitational waves and electromagnetic waves of −3 × 10 −15 < c g − 1 < 7 × 10 −16 (Abbott et al. 2017b).

Examples of Gravity Theories That Predict Additional Polarizations and modified dispersion relations
There are several gravity theories which predict additional polarizations and modified GW dispersion relations. Here we will briefly describe three of them: Einstein Aether (Jacobson & Mattingly 2001), massive gravity (de Rham et al. 2011), and f (R) gravity (Carroll et al. 2004). We note that these three theories have been chosen for having a complete and consistent theoretical description and that each face some challenges when confronted with observations. See Ref. (Isi & Stein 2018) for a detailed discussion of how to probe alternative gravity theories with gravitational wave measurements. We also note that the binary neutron star merger observed by LIGO Abbott et al. (2017a) places strong constraints on the properties of alternative theories of gravity (Baker et al. 2017;Sakstein & Jain 2017).

Einstein Aether
Einstein Aether posits the existence of a Lorentz symmetry-violating, dynamical, timelike vector field. This is in addition to the gravitational metric, a 2-tensor field. This arrangement preserves three-dimensional rotational symmetry while generating a preferred rest-frame.
The dynamical time-like vector field introduces a preferred frame arising from local physics leading to a generally covariant theory. Einstein Aether is an example of the more general vector-tensor theories  and can be thought of as a simplified model of spontaneous Lorentz symmetry breaking which may occur in string theory (Kostelecky & Samuel 1989).
Considering a theory that includes all covariant couplings which is quadratic in all derivatives the theory can be specified by the value of four new constants, {C 1 , C 2 , C 3 , C 4 } (Jacobson & Mattingly 2004). In the limit where these constants vanish this theory is indistinguishable from general relativity. As shown in Jacobson & Mattingly (2004), there are five GW polarizations in this theory. Transforming to a gauge where h µ0 = 0 and taking the C i → 0 limit the theory predicts: plus/cross tensor polarizations traveling at the speed of light; two longitudinal vector modes traveling at a speed possibly different from the speed of light; and a linear combination of the two scalar modes where v 0 is the linear perturbation to the time-component of the Aether field, c s is the speed of the scalar GWs, and e ij is as given in Equations 19-21.

Massive gravity
Building a well-behaved model of gravity in which the graviton has a non-zero mass has been an issue for decades; however, recent attempts at understanding the nature of dark energy have reinvigorated the conversation. In such models, a massive graviton might introduce a scale that could explain the observed long-range behavior of the gravitational field. Since the early 70s it has been known that adding a mass in the absence of non-linear interactions gives rise to an observationally ruled-out discontinuity, the dDVZ discontinuity (van Dam & Veltman 1970;Zakharov 1970). The addition of non-linear interactions can cure this, but at the cost of a degree of freedom. This is known as the Boulware-Deser ghost (Boulware & Deser 1972), proof that only five propagating degrees of freedom can exist in any massive, interacting extension of general relativity. Only very recently was there a solution to this issue; the imposition of a symmetry, the Gallileon symmetry, (de Rham et al. 2011;Hassan & Rosen 2012b), restricts the form of the non-linear interactions and projects out the Boulware-Deser ghost. Later it was shown that bimetric theories can be derived from these new massive gravity theories, and so the theories have been connected (Hassan & Rosen 2012a).
Just as in the case of Einstein-Aether theory the stable propagating mode is a superposition of the 6 degrees of freedom identified in Sec. 5.1. Transforming the metric far away from the source (where the non-linear terms in the field equations vanish) into synchronous gauge, massive gravity generates the standard plus/cross tensor polarizations which travel at the speed of light and a scalar wave which travels at a speed c s with a polarization which is a linear combination of the breathing and longitudinal modes.

f (R) gravity
Another commonly considered alternative theory of gravity imagines a gravitational Lagrangian which is a function of the Ricci curvature scalar, f (R)-where f (R) = R gives the Lagrangian for GR. The original motivation for f (R) gravity was to provide an alternative gravity theory that could account for the observed accelerated expansion of the universe (Carroll et al. 2004). In its original formulation the gravitational Lagrangian was modified to include a term inversely proportional to the Ricci scalar: R+µ/R. As the universe expands the homogeneous value of the Ricci scalar decreases until, at a late enough time, this new term becomes dynamically important. Although it was shown that this specific modification was at odds with measurements of the deflection of light around the Sun (Chiba 2003;Erickcek et al. 2006), there are specific forms of the function f (R) which pass Solar System tests and lead to a late-period of accelerated expansion (Chiba et al. 2007;Hu & Sawicki 2007). f (R) gravity is an example of a scalar-tensor theory (see, e.g., Ref. (Will 1993)): by performing a conformal transformation the theory can be described as a scalartensor theory with a scalar mass where R 0 is the Ricci scalar for the background over which the GWs propagate. An analysis of the polarization modes shows that this theory predicts the standard plus/cross tensor polarizations propagating at the speed of light and again, a linear combination of the breathing and longitudinal scalar-modes (Eqs. 20 and 21 Liang et al. 2017) and follow the dispersion relation m 2 = ω 2 − k 2 .

Relic Gravitational Waves and Early-Universe Cosmology
Leading theories for the evolution of the early Universe invoke an inflationary epoch to explain the isotropy and other broadly observed properties of the Universe.
If this epoch occurred, it would have produced GWs by rapidly expanding quantum fluctuations present in the pre-inflation epoch. The nature of inflationary expansion will be encoded in the strength and spectrum of the produced broad-band GW background. For standard inflation models, the inflationary background will likely be fainter than that of SMBHBs, however some inflationary modes may have a spectrum that rises into the PTA/LISA/LIGO bands. PTAs have already begun to place stringent limits on those models. PTAs will most sharply probe the spectrum (scalar spectral index), while cosmic microwave background (CMB) experiments will probe the relative strength of these "relic" GWs and the more standard density waves (the latter have already been detected and characterized by CMB probes).
If primordial GWs can be detected, PTAs may also be sensitive to phase transitions in the early Universe, allowing (amongst other possibilities) an independent constraint on the existence of the cosmological constant.

Relic Gravitational Waves
GW signals can arise from the early Universe if a period of rapid inflation occurred (Fig. 15). Quantum fluctuations from early in the Universe would have been amplified by inflation, and like strings, would produce a broad-band signal detectable by multiple instruments. These "relic" GWs are a long-standing target for current and future CMB experiments, looking for the tensor modes induced by these waves (Kamionkowski & Kovetz 2016;and references within). CMB experiments will constrain only the longwavelength (low-frequency) portion of the spectrum of inflationary GW signals, and can most effectively constrain the scalar-to-tensor ratio. However, higher-frequency experiments, like PTAs and space-and ground-based laser interferometry, are able to Figure 15: The quantum fluctuations in the very early Universe, if rapidly amplified by inflation, could cause a highly broad-band GW background signal in addition to affecting the cosmic microwave background. While cosmic microwave background experiments can constrain the scalar-to-tensor ratio, higher-frequency experiments (as PTAs, LISA, LIGO/VIRGO shown here) will most sharply probe the spectrum of these early fluctuations. Figure reproduced from (Battye & Shellard 1996).
more effectively constrain the spectrum of these early fluctuations by looking at the scalar spectral index (the spectral index of the detected background, which reflects the spectrum of the fluctuation scales and the mode of their amplification). Various models of inflation predict differing values for the observed scalar spectral index. Only weak-and often highly model-dependent constraints-exist on the shape of the inflationary GW spectrum. Following the recognition that primordial black holes could contribute to the LIGO detections, there has been a renewed interest in examining what kind of constraints could be placed on the spectrum of inflationary perturbations at shorter wavelengths, where PTAs can contribute broad-band limits. In fact, the most stringent limit on relic GWs, before LIGO's first observation, had been set at Ω gw (f ) < 2.3 × 10 −10 by the combination of PTA, LIGO, and CMB limits (Lasky et al. 2016). This marked the first time the most optimistic predictions of various models were impinged upon by observations. Given the rapid progress across the GW spectrum (from ground-based interferometers to PTAs to CMB experiments), our assessment is that this area is likely to remain observationally driven. That is, while models can always be constructed to evade the observational limits, the extent to which a blue inflationary GW spectrum (i. e. one where the energy density rises with GW frequency) remains viable will be determined by the observations across the GW spectrum.
While we can currently place constraints on inflationary GWs, we state with some confidence that PTAs will not be the primary driver for the study of inflationary GWs. This is because it is highly unlikely that the inflationary epoch signal will dominate over the much brighter GW background from inspiraling SMBHBs.

Cosmological Measurements and the Cosmological Constant
As the early Universe cooled, successive constituents decoupled and began to evolve (more or less) independently. The most notable such example is the decoupling of photons, which led to the formation of the CMB. Prior to photon decoupling, it is expected that neutrinos underwent the same process, leading to a relic neutrino background. Lattanzi et al. (2010) and Benini et al. (2011) note that cosmic neutrino decoupling should have happened a few seconds after the Big Bang, at a redshift z ∼ 10 10 . A GW entering the horizon at this time, currently has frequency of order 1 nHz (e.g., see Fig. 15), suggesting that the neutrino decoupling time can be probed by nanohertz GWs. Specifically, they predict that the effective viscosity from the neutrinos will suppress such GWs. They also conclude that a multi-decadal dataset would be required to detect this effect. With current PTAs beginning to surpass the decadal observational duration, it seems that this effect still remains out of reach.
The early Universe may have also experienced phase transitions as it cooled, likely before the epoch of neutrino decoupling. Caprini et al. (2010) consider a quantum chromodynamics phase transition (which they take to occur at z > 10 17 ), and the GWs produced during that epoch. Such a phase transition could produce GWs with f ∼ 1 Hz. Their assessment was that the NANOGrav sensitivity, at the time, was insufficient to detect GWs from this phase transition, but predicted that PTAs might be able to detect these GWs on the time scale of the year 2020. However, their prediction was based on an idealized PTA (with higher precision timing residuals, and on a smaller number of pulsars than currently timed). For a realistic estimate, this calculation would need to be reformulated to encompass the actual sensitivity of current PTAs, i.e. taking into account the true number of pulsars being timed by the various PTAs, and realistic timing residuals. The standard cosmological model includes a "dark energy" component Λ (Planck Collaboration et al. 2016; and references within), which may be equivalent to the "cosmological constant." The expectation is that dark energy becomes relevant on large scales, and significant international efforts are devoted to placing constraints on the properties of dark energy. Bernabeu et al. (2011) and Espriu & Puigdomènech (2013) describe an approach in which measurements of pulsar timing residuals due to GWs emitted in the nearby Universe (z 1) could place a complementary constraint on Λ. One (acknowledged) caution about their results is that their estimates are based on a somewhat idealized PTA with a relatively short observation duration (3 yr). Given that current PTAs have durations in excess of a decade, and substantially more pulsars, it is likely that different (and probably more strigent) constraints could be obtained by repeating their analysis for current PTAs.

Additional Science from "Hidden Planets" to Dark Matter
There are numerous scientific endeavors that may be attainable with PTAs, including those which: 1) our theoretical understanding is still under active development or is speculative in nature, or 2) the pursuit/detection of the science is likely much further in the future than the topics described in previous sections. This includes GW studies of stellar convection and dark matter, as well as the observed effect of direct gravitational perturbations on Earth or the pulsar by primordial black holes or our own solar system bodies.
The discussion here is structured in terms of specific sources and predictions. We also recommend reading Cutler et al. (2014), which considers the blind discovery space of PTAs. That work showed that, whatever the source, GW memory at high redshift is a potential discovery area.

Stellar Convection
Although the common assumption for nanohertz GWs is that they are produced by extragalactic or cosmological sources, Bennett & Melatos (2014) evaluate the mass quadrupole produced by turbulence within a convective star, such as the Sun. They show that the ensemble of stars should produce an irreducible background. While their primary focus is LISA, they note that the amplitude of the GW signal is amplified in the near-field zone, which is certainly the case for the Earth term for frequencies relevant to PTAs. A simple extrapolation of their results into the nanohertz GW frequency band suggests that the Sun may be a contributor to the PTA noise budget, and thus potentially detectable. Such an extrapolation relies upon assumptions regarding the nature of turbulence within a star, and that the actual magnitude of any GW signal may be substantially lower; thus, with increasingly stringent limits, PTAs may place constraints on solar and stellar convection.

Dark Matter
Several authors have considered various classes of dark matter candidates that may produce observable signatures in PTA data.

Warm Dark Matter
Khmelnitsky & Rubakov (2014) discuss warm dark matter models that address some of the issues that Λ Cold Dark Matter (ΛCDM) has in reproducing the observed number density of sub-galactic scale structures in the Universe. Structure formation is suppressed below the de Broglie wavelength, λ dB ≈ 600 pc 10 −23 eV m where m is the mass of the dark matter particles and v is their characteristic velocity in units of c. Such a population of particles will produce an oscillating pressure that, though averaging to zero, will cause sinusoidal variations in the local gravitational potential with a frequency f ≈ 5 · 10 −9 Hz m 10 −23 eV and an amplitude where ρ(x) is the density of dark matter particles at position x. While it is not a GW effect, pulsar signals propagating through such a time-variable gravitational potential will have their pulsation frequencies sinusoidally modulated. The perturbation to the observed times of arrival could be as large as several hundred nanoseconds, which would be a signal that is accessible to a number of currently timed pulsars. Inspired by the work of Khmelnitsky & Rubakov (2014), using the NANOGrav Five Year Data Release, Porayko & Postnov (2014) searched for this signature of warm dark matter and placed the first observational constraints on the mass of such particles. Their upper limits on the local oscillating gravitational potential are an order of magnitude above the current predictions. However, Porayko & Postnov (2014) also note that the ability to measure the oscillating gravitational potential improves as more pulsars are added to the PTA. The current NANOGrav PTA contains more than twice the number of pulsars than were published in the NANOGrav Five Year Data Release, and more continue to be added on an annual basis.
Prior to many of the publications of the current international PTAs, Ishiyama et al. (2010) considered the potential perturbations due to dark matter (sub-)halos in the Galaxy. While their focus was on dark matter sub-halos produced as the result of a specific form of dark matter, the concept is likely to be more generally applicable. These results would be detectable in individual pulsars, rather than a correlated signature between multiple pulsars in an array. With current pulsars, the predicted timing perturbation ( 10 ns) is not yet detectable. A further complication is that, due to the Galactic formation and potential, dark matter sub-halos would tend to be closer to the Galactic center than the Sun. Because of this, it is not clear that any of the pulsars timed by the current PTAs would be good candidates to detect this effect; as an example, few of the NANOGrav pulsars are more distant than 1 kpc and less than 10% are (roughly) in the direction of the Galactic center. Kashiyama & Oguri (2018) note that the concordance ΛCDM model for the evolution of the Universe predicts that the mass function of dark matter halos may extend to very small masses ( M ), with the mass function cutoff related to the nature of dark matter. Thus, detection of small-scale dark matter clumps can both provide support of the cold dark matter scenario and provide understanding of the mass of dark matter particles. As in the warm dark matter case, this signature is not one due to GWs, but instead involves an acceleration of the Earth or pulsar due to the nearby passage of a dark matter clump. Kashiyama & Oguri (2018) find that PTAs can test or detect the existence of clumps from 10 −11 − 10 −8 M using approximately 100 pulsars with 10 ns timing precision that are not close to the Galactic center ( 3 kpc).

Axion-Field Dark Matter
More recent work in this vein from De Martino et al. (2017) discusses a phenomenologically similar dark-matter candidate based on axion particles that naturally arises in many treatments of string theory. They emphasize that the axion field will form clumps on the de Broglie scale and that its average density will track a Navarro-Frenk-White profile (Navarro et al. 1996) with a dense soliton at the Galactic center. Pulsars within approximately 200 pc of the Galactic center will show the highest amplitude timing modulations. While no pulsars have been discovered this close to the Galactic center to date, ongoing surveys seek to find such pulsars.

Primordial Black Holes
The first detection of GWs from merging stellar-mass BHs by (LIGO, Abbott et al. 2016c), as well as the subsequent mergers detected, raised the question of the formation channel that produced these BHs. While a full discussion of this topic is beyond the scope of this paper, it has been proposed these BHs are primordial, produced during the inflationary period in the early Universe. Primordial BHs have also been proposed as relevant to PTAs, by Seto & Cooray (2007) and Kashiyama & Seto (2012), due to the perturbations they can cause, when passing close to the Earth. In contrast to the BHs responsible for the LIGO events, the primordial BHs considered by these authors have much smaller masses (< 10 −10 M ). Nonetheless, from the perspective of an observationally driven constraint on primordial BHs, these constraints remain of potential interest. These authors show that a primordial BH may produce perturbations of order 20 ns over a duration of order 15 yr. With the various PTA data sets now exceeding a decade, this signal may become feasible to detect in the next decade, if other contributions to pulsar timing residuals at similar levels (tens of nanoseconds) can be sufficiently controlled or modeled.

Solar System Ephemerides and Wandering Planets
A fundamental term in pulsar timing is an astrometric term, designed to transfer the pulsar TOAs into the frame of the solar system barycenter, which is assumed to be (quasi-)inertial (Lorimer & Kramer 2012), where R SSB is the vector connecting the Earth to the solar system barycenter,n is the unit vector in the direction of the pulsar, and c is the speed of light. The barycenter position is the center of mass of the solar system, involving weighted contributions from all planets and important dynamical objects. Any uncertainties in the position or masses of solar system bodies will create a corresponding uncertainty in the barycenter position.
In Equation 30, it is clear that uncertainties in the knowledge of the position of the solar system barycenter will affect the accuracy to which pulsars can be timed. As the position of the solar system barycenter R SSB is determined from the orbits and masses of the solar system planets (and minor bodies), there is a long history of assessing whether, and to what extent, precision pulsar timing can be used to constrain properties of the solar system (e.g., Li et al. 2016;Mulholland 1971). These efforts generally find that the position of the solar system barycenter is not known to better than about 100 m, a value that is consistent with expectations from the spacecraft data used to construct the solar system ephemeris (W. M. Folkner 2017, private communication). There have also been efforts to measure the masses of the solar system planets by using precision pulsar timing (Caballero & et al. submitted;Champion et al. 2010), though the results are inferior to those obtained by orbiting spacecraft.
Recent efforts to constrain the nanohertz stochastic GW background with the NANOGrav 11-yr dataset uncovered differences in upper-limits and detection statistics caused by systematic errors in published solar system ephemerides (Arzoumanian et al. 2018b). Upper limit variations were small, but Bayes factors for a long-timescale red noise process shared by all pulsars (a potential first signature of GWs) varied by over an order of magnitude, from compelling to insignificant evidence. To mitigate these systematic errors, (Arzoumanian et al. 2018b) added gas-giant mass perturbations and Jupiter orbital-element perturbations to the global PTA signal and noise model. This led to consistent constraints and detection statistics, regardless of the assumed baseline ephemeris version. Work is ongoing to expand this model into a full Bayesian solar system ephemeris.
There is also a long history of assessing whether precision pulsar timing could reveal currently unknown members of the solar system or a distant companion star to the Sun (e.g., Guo et al. 2018;Harrison 1977;Zakamska & Tremaine 2005). Though some of the early efforts may have been affected by severe selection effects (Henrichs & Staller 1978), the existence of distant members of the solar system is a question that has recently attracted considerably more attention, given the suggestion of a previously unknown "Planet Nine" (Batygin & Brown 2016). While we are unaware of any large-scale effort to constrain the existence of this body with the modern PTAs, Guo et al. (2018) estimate that PTAs could constrain the mass to be 10 −4 M , and Zakamska & Tremaine (2005) show that measuring the acceleration of the solar system's barycenter with precision pulsar timing may be highly competitive with other methods for constraining the existence of distant companions (> 300 au).

Gravitational-Wave Synergies: Multi-band GW Science
PTAs have some target science in common with other GW detectors at higher frequencies (LIGO, LISA) and lower frequencies (CMB experiments). In some cases, this allows for synergistic or even direct multi-band GW studies.
PTAs and higher-frequency millihertz experiments like LISA uniquely share SMBHBs as a direct common target. PTAs and LISA will make synergistic measurements of galaxy and SMBHB evolution. However, LISA will probe lower-mass and higher-redshift systems, while PTAs will preferentially detect the largest systems in the nearest few Gpc. These experiments will work together to fully characterize black hole growth over cosmic time. For a small subset of discrete systems, both experiments may co-detect the selfsame system at different phases of evolution.
Cosmic strings and relic GW signals are also intrinsically broad-band GW emitters, and where relevant we have summarized the links between experiments sensitive to those phenomena in Sections 4 and 6.1, respectively. Complementary constraints from multiple wave bands on the nature of gravity are described in Section 5.
The LIGO GW detections (Abbott et al. 2016a in the high-frequency regime (∼ 100s Hz) have ushered us into the era of GW astrophysics. In addition to PTAs, there are many other instruments and techniques currently in the design, planning and prototyping phases for future GW observatories. that the same fundamental physics is required to produce and evolve both populations of BH binaries, there exists a strong link between the methodology of evolutionary models used to study them, and the insights that observations of their GW signatures will provide. In particular PTA observations of the GWB, and measurement of its spectral index, will provide valuable constraints on the mass-function and eccentricitydistribution of SMBH (e.g.  which will better constrain the detection rates for LISA. Under the right circumstances, an individual source could be observed by PTAs and LISA at different stages of its evolution (Pitkin et al. 2008;Spallicci 2013). For a PTA with high-frequency cutoff of 4 × 10 −7 Hz, a SMBHB will transition from being observable by PTAs to being observable by LISA in 1 -50 years. This transition time can be reduced by increasing the pulsar observing cadence, which improves the sensitivity of PTAs at high frequencies. However, astrophysical rate estimates suggest the probability of a sequential detection is extremely low (4.7 × 10 −4 − 3.3 × 10 −6 per year to merger per year of survey) due to the small number of individual sources observable by both detectors (Spallicci 2013).
It is also possible to use ringdown observations made by LISA as triggers to search PTA data for past continuous waves, or for memory-inducing SMBHB coalescence events (Sec. 3.1.3). LISA can observe the ring-down of higher mass sources, even when the inspiral and merger happen outside of the LISA band, meaning there is better overlap for direct observations of these sources with both PTAs and LISA. Parameter constraints from observing the ring-down can be used to improve the search for the inspiral, extrapolating a SMBHB model back in time to predict the expected gravitational waveform throughout the previous years of pulsar observations. Currently, the planned launch date for LISA is 2034, at which point PTAs will have accumulated over 30 years of data that can be used for such a search.
Galactic sources of GWs that LISA will be able to study may, under certain circumstances, be possible to investigate with pulsar timing. Globular clusters (GCs) likely host GW sources detectable by LISA (Kremer et al. 2018). GCs are also known to host large populations of pulsars (Freire et al. 2017;Ransom et al. 2005). Pulsars in a GC will be within a few parsecs of GW sources in that cluster and could act as sensitive probes of those GWs (Jenet et al. 2005;Madison et al. 2017). However, pulsars in GCs are not typically observed as part of PTA efforts to detect GWs because accelerations from intra-cluster dynamics tend to cause low-frequency structure in the timing residuals of those pulsars, possibly masking or mimicking low-frequency GWs. Alternatively, a pulsar behind a GC, significantly more distant than the cluster, would be extremely sensitive to GWs from sources near its line-of-sight. However, such an arrangement is unlikely. In the absence of multi-messenger information, several such pulsars would also be necessary in order to distinguish GWs from unmodeled effects in an individual pulsar. Additionally, for pulsars within or behind a GC, the conceptually straightforward division of the timing signal into an Earth term and pulsar term does not apply because the GWs cannot be approximated as plane waves. This fact will require PTAs to adopt currently undeveloped detection techniques. Finally, several pulsar-timing GW limits at high GW frequencies (10 −6 Hz -10 −3 Hz) overlapping with the LISA band have been published (Perera et al. 2018, Dolch et al. 2016, Yi et al. 2014, Yardley et al. 2010, due to high-cadence observations of selected pulsars. These limits are many orders of magnitude above the Doppler tracking GW limits from the Cassini spacecraft (Armstrong et al. 2003) and the expected GW sensitivity of LISA. However, both Cassini and LISA are entirely solar-system bound; one possible advantage of pulsar timing in the µHz to mHz band is its sensitivity to a Galactic GW source close to a pulsar line-of-sight. Studying Galactic sources is an area of investigation that needs to be further developed by the PTA community, but it could yield exciting overlap with LISA science.
Recently, there has also been work on the possibility of astrometric GW detection (Book & Flanagan 2011;Schutz 2010), for example using an instrument like GAIA (Gaia Collaboration et al. 2016) which can accurately measure the positions of over a billion stars in the Milky Way. This technique uses correlated deviations in the observed positions of astrophysical objects (most likely stars, or possibly quasars) to measure passing GWs. PTA use a similar procedure based on time delays which are determined by integrating a GW induced redshift over the path of each photon. This integration introduces a frequency dependence in which the characteristic-strain sensitivity of the array is linearly related to the GW frequency, and thus is worse at high frequencies. Astrometric detection, on the other hand, performs an effectively instantaneous measurement of each photon's source direction. Within the Nyquist band, determined by the observational cadence and duration of the observatory, astrometric detection has a sensitivity relatively independent of frequency. This detection method is still in its early days of study, however astrometric GW-detection may be able to complement PTAs at the high-frequency end of the PTA sensitivity range (f 3 × 10 −8 Hz) (Moore et al. 2017), and could potentially be used in a coherent multi-messenger PTA+Gaia analysis to validate the detection of nanohertz GWs, or even constrain beyond-GR GW polarizations (Mihaylov et al. 2018;.

Electromagnetic Synergies: Multi-messenger Science
Almost all of the science cases described in the previous sections can be significantly augmented with electromagnetic studies. However, for the PTA GW band, the science that would benefit most clearly from a concerted multimessenger and multi-wavelength effort is the identification of the inspiral and/or coalescence from discrete SMBHBs. Here, we outline a selection of studies related to SMBHs that can be uniquely achieved with a host galaxy identification and the direct measurement and tracking of a SMBHB electromagnetic counterpart (note, the latter implies the host galaxy has been identified).
We also summarize a few practical aspects of multi-messenger detection in the nanohertz wave-band, which may include both new and archival data.

Identifying and Tracking a SMBHB Host
To identify the host of a discrete (continuous wave/memory/burst) GW source, we need to be able to localize the object on the sky. The size of a PTA's localization error depends on the signal-to-noise ratio of the GW detection, and the relative position of pulsars with respect to the GW source (e. g. Ellis 2013). However, typical localizations for at least the near future (detection significance of ∼7σ) will be up to a few times 1000 deg 2 (Fig. 17). Although the chirp mass of and distance to a SMBH binary are covariant in a PTA detection (Eq. 8), we can estimate an upper limit on the distance by assuming a maximum chirp mass (∼10 10 M ). Still, the large number of resulting galaxies in this error region will require us to turn to electromagnetic information to identify the most likely host.
There are two main goals of multi-messenger astronomy in the nanohertz regime: (1) We aim to simply identify the likely galactic host of the GW signal, and determine its distance. This information will allow us to determine the GW source's chirp mass, by breaking the chirp-mass/luminosity-distance degeneracy, mentioned above.
(2) We aim to perform true "multi-messenger" monitoring of a target, by detecting electromagnetic signals from the vicinity of a SMBH binary, during the inspiral or coalescence. This will be possible, if the SMBHs are illuminated as AGN, or if the binary leaves a distinct observable signature on the stellar or gas dynamics, the photometry, or the morphology of their host. We summarize the methods (and efforts) to electromagnetically identify SMBHBs in section 9.2.
SMBHBs are formed in galaxy mergers, which for much of their duration are ostentatious events; they exhibit large-scale asymmetries, tidal tails, sudden bursts of star-formation (e.g., ULIRG stage), and potentially an abundance of tidal disruption events (Burke-Spolaor 2013). However, since, as discussed extensively in Section 3, the efficiency of SMBHB formation and inspiral is highly uncertain, it is also unclear whether these easily-observable features would be present by the time a binary SMBH enters the GW-dominated phase. If the final-parsec problem is resolved efficiently, it is possible that these large-scale indicators of a galaxy merger persist, until the binary reaches the inspiral phase. Therefore, we can easily survey for this type of signature in the GW error region, and potentially narrow down the list of candidate hosts. However, if the archival data prove inadequate, new dedicated observing campaigns will also be required to systematically catalog and identify outstanding features of galaxies in the error region of PTAs.
Before we move on to discuss direct SMBHB detection and multi-messenger science, it is worth noting an important practical point about electromagnetic SMBHB emission. The evolution of SMBHBs within the PTA band is slow, and the inpsiral phase may last for long periods of time (of order years to millennia). Additionally, SMBHBs at these stages may have long-enduring electromagnetic signatures. Therefore, search efforts for multi-messenger signals in the nanohertz regime have a tremendous advantage (e.g., compared to LIGOs electromagnetic counterparts); for the majority of objects, tracking of a GW target will not necessarily be time-critical. Electromagnetic data collected years before the GW detection might reveal key information for a binary SMBH, its host, and its evolution. Therefore, the vast availability of archival images, and spectra, along with time series from the ongoing time-domain surveys, as well as data from the emergent highly sensitive instruments will allow great potential for discovery. They may identify any electromagnetic signatures on the large scale (i.e. unique identifiers in the properties of the host galaxy), or they may allow the tracking of any multi-messenger counterparts that mark the GW-emitting binary itself.

Direct Searches for SMBHBs in Electromagnetic Bands
Given the significance of SMBHBs in galaxy evolution and the prospects for GW astrophysics, over the past few decades, several groups have searched for compact sub-pc binaries using electromagnetic data across the electromagnetic spectrum. In principle, direct imaging of material near the SMBHB would be the most direct identification method. However, only radio interferometry can provide sufficient resolution to probe sub-parsec scales (Burke-Spolaor 2011; Kharb et al. 2017); although this provides a convenient observation method to test some binary SMBHB candidates, it can only reach those up to moderate redshifts at the earlier stages of binary evolution.
Therefore, many recent searches have relied on the effects of the binary on its environment (e.g., Doppler-shifted broad emission lines, periodic variability, distorted morphology of radio jets, etc; see Burke-Spolaor (2013) for a discussion of multiwavelength counterparts to SMBHBs). However, since these signatures are indirect, other physical processes can also explain the observed features.
For instance, in the standard picture of AGN, the broad emission lines arise in regions close to the central SMBH, whereas the narrow emission lines are associated with large-scale features of the host galaxy. In the presence of a compact binary, the orbital motion of the SMBHB will be imprinted in the broad emission lines, producing noticeable shifts compared to the narrow lines. Alternatively, if both SMBHs have their own broad-emission-line region, the broad lines should be double-peaked, reflecting the motion of the binary. Several candidates have been identified, especially from systematic searches in the spectroscopic database of SDSS (Eracleous et al. 2012;Ju et al. 2013). However, similar features can be produced by inflows/outflows, the geometry of the broad-emission-line region, as well as by a special type of AGN, the so-called double-peaked emitters. Therefore, only long-term monitoring can prove/disprove the binary nature of these candidates, if coherent changes in the spectra are observed, as expected due to the orbit of a binary.
Periodic variability in AGN and quasars can also signify the presence of SMBHBs. The binary is expected to periodically perturb the circumbinary disk and several hydrodynamical simulations have found that SMBHBs can produce bright quasar-like luminosities, with the mass accretion rate onto the BHs periodically modulated at the orbital period of the binary. Several candidates (∼150) have emerged from systematic searches in the photometric databases of CRTS (Graham et al. 2015) and PTF (Charisi et al. 2016), along with some individual detections (Liu et al. 2015;Zheng et al. 2016). However, the identified periods are relatively long compared to the available baselines and only a few cycles have been observed. This, combined with the stochastic underlying variability of quasars, can lead to false detections (Vaughan et al. 2016). Again long-term monitoring can test whether the detected periodicity is persistent and thus attributed to a SMBHB. Additionally, several independent signatures have been proposed, which, if observed along with the periodicity, will increase our confidence in the binary nature of the candidates. These include multi-wavelength observations of Doppler boosted emission (Charisi et al. 2018;D'Orazio & Haiman 2017;D'Orazio et al. 2015b), periodicity with a characteristic frequency pattern (Charisi et al. 2015;D'Orazio et al. 2015a), periodic self-lensing flares (D'Orazio & Di Stefano 2018).
Last but not least, the orbit of the binary can affect the morphology of radio jets. If one of the BHs is emitting a radio jet, the orbital motion of the base of the jet will result in a jet with helical structure (Kaastra & Roos 1992). Furthermore, the presence of a secondary BH can lead to jet precession, producing a jet with conical or helical morphology (e. g. Britzen et al. 2017). As before, these signatures are not unique to SMBHBs, as instabilities in the jet or the misalagnment of the accretion disk with the jet axis can also produce similar features. Independent lines of evidence are required to test whether these galaxies host SMBHBs. We note that the above signatures, albeit indirect, will be significant in searches for the host galaxy, once PTAs detect a GW source. 9.3. Topics in Binary Supermassive Black Hole Multi-messenger Science 9.3.1. Accretion Dynamics; Active Nucleus Geometry In gas-rich mergers, copious amounts of gas will be funneled into the central regions of the post-merger galaxy. Once the two SMBHs become gravitationally bound, the gas will settle in a circumbinary accretion disk. The disk can dissipate the SMBHB energy and angular momentum, driving the binary to the GW regime. Since the existence of ambient gas can catalyze the binary evolution, this has been suggested as one solution to the final-parsec problem (Sec. 3). Additionally, torques from a circumbinary disk could influence the inspiral evolution of the binary significantly enough for this effect to be visible in the gravitational waveform. Theoretical work on circumbinary disks covers a variety of topics, including the timescales of SMBHB evolution, whether the disks will be retrograde or prograde with respect to the binary orbit (Schnittman & Krolik 2015), whether such disks will support accretion onto one SMBH or both SMBHs (Cuadra et al. 2009), the effect of the gaseous disk on the binary eccentricity, etc. (see also Sections 3.1.2 and 3.2.4).
More importantly, the presence of gas is crucial in providing bright electromagnetic counterparts, which will allow us to pinpoint the tentative SMBHBs. This is an area of intense investigation, with several remaining open questions regarding the expected emission signatures. For instance: • Will the binary have one or two radio jets?
• Will the circumbinary disk itself support a stable or unstable jet?
• Will a secondary black hole clear a gap or a cavity (Hayasaki et al. 2007), and if so, can we use this as an observable to infer its mass ratio?
The GW signal will provide an independent measurement of the chirp mass M c (provided that the distance is constrained electromagnetically from the redshift of the source). It will also allow for the measurement of the separation of the two black holes and time-tagged tracking of the orbital motion of the binary. These, in combination with any observed electromagnetic signatures, will allow us to explore in detail the interplay between the disk and the binary orbit, the accretion rate onto the individual BHs and the overall disk geometry, as well as the processes involved in jet generation, thus providing unprecedented insight into the physics of active galactic nuclei.

Constraining Black Hole Mass-Host Galaxy Relations
The mass of the central SMBH M • is strongly correlated with several properties of the host galaxy, e.g., bulge mass (Magorrian et al. 1998), galactic velocity dispersion (Ferrarese & Merritt 2000;Gebhardt et al. 2000), and Sérsic index (Graham & Driver 2007) (amongst other properties), which suggests that SMBHs co-evolve with their host galaxies. These relations have been critical throughout astronomy because of their ability to predict SMBH masses where there is no direct measurement. However, the reliability of the SMBH masses derived from these relations has recently come into question, and it has been suggested that SMBH masses might be biased upwards (Shankar et al. 2016). As mentioned above, PTAs can only measure a degenerate chirp mass and distance to the source. However, with the identification of the galactic host, this degeneracy can be broken, permitting a measurement of M c . As we write this, PTAs are sensitive to SMBHBs of M c > 10 9 M out to approximately 190 Mpc, or a redshift of z 0.05 (NANOGrav Collaboration 2018). As the PTA horizon grows and more distant sources are detected, PTAs will have some limited capability to probe the redshift dependence of these relations, particularly at the highest-mass end.
Note that, while M c does not reflect the total black hole mass directly, numerous simulations have demonstrated that all SMBHBs detected by PTAs will likely have mass ratios q 0.1, implying an error when inferring M of ∼0.5 dex. Even in the more conservative limit of q 0.01, the error in the total mass inferred from PTA measurement of M c would be only ∼ 1 dex.
As discussed here, PTAs can probe the SMBH mass-host relations through the detection of continuous waves, however Simon & Burke-Spolaor (2016) demonstrated that they can also probe these relations by looking at the amplitude of the GW background. Because SMBH masses are currently linked to GWB predictions through modelling of host galaxy populations, that study was able to use limits on the GW background to place constraints on the high-mass slope, scatter, and intercept of the relationship between SMBH mass and host bulge mass.

Active Nuclei Preceding and Following SMBHB Coalescence
In the case of a GW memory event, identification of the host will allow us to "wait and watch" for a post-coalescence ignition of an active galactic nucleus. If there is a circumbinary disk, it will closely track the shrinking orbit of the SMBHB. If circumbinary disks are highly viscous, at some point it is possible that the binary and circumbinary disk will "decouple" if the GW-induced inspiral becomes so rapid that the viscosity of the disk no longer allows it to track the binary. After the coalescence, there may be a delay in any nuclear activation while the disk settles; this delay depends both on the disk viscosity, which governs the infall rate of the disk, and the SMBHB masses, which govern the rapidity of the final break-away inspiral. The delay between coalescence and subsequent turn-on of a luminous X-ray source can be modeled as: t X−ray 7(1 + z) M 10 6 M 1.32 yr (31) (Milosavljević & Phinney 2005). For the high-mass systems that PTAs are expected to discover (M 10 8 M ), this implies timescales that are impractically long, on the order of hundreds to thousands of years. However, Tanaka & Menou (2010) proposed a much more rapid temporal evolution of the circumbinary disk, finding that there may be a rapid brightening to super-Eddington luminosities in the few years following a merger. These signatures would be detectable in X-rays, with potential signatures in optical and infrared bands. Subsequent ignition of a radio jet might also arise once the accretion disk is settled. Another promising prospect is to search through archival data to uncover the evolution of the system leading up to and during the merger in electromagnetic bands. In practice, the extent of such studies will be highly dependent on the position of the source on the sky, and the prevalence of historical data in that direction. However, given the broad coverage of current and upcoming synoptic sky surveys, like the Catalina Sky Survey (CRTS, Drake et al. 2009

Tidal Disruption Events
PTAs will not directly detect tidal disruption events, however a number of studies have postulated that tidal disruption events may occur at much higher rates in SMBHB systems. This is largely caused by the chaotic three-body interaction of the SMBHB with stars in the stellar core, which can cause an increase of a few to several thousand over the rate of disruption events around an isolated black hole (Chen et al. 2011;Darbha et al. 2018). It has also been postulated that tidal disruption events might encode electromagnetic information about the binary's orbit in their time-resolved signatures (Liu et al. 2014).
LSST is expected to detect thousands of tidal disruption events in the coming decade (Wegg & Nate Bode 2011). This may highlight individual galaxies with excess event rates. Such galaxies could signify the host of a potential continuous-wave GW source, which would assist in host identification for any future continuous-wave detections by PTAs.
Similarly, if we can otherwise identify the host of a GW, this detection can be cross-matched with transient catalogs to test for excess events, or to allow direct multi-messenger studies of any tidal disruption event caused by the continuous-wave emitter. Any binary modelling extracted from a tidal disruption event will provide PTAs greater sensitivity to the continuous waves from that event.
Extreme-mass-ratio inspiral events (commonly termed EMRIs; in this case, the inspiral of the object before it is disrupted) may also be detected by LISA. The coordinated detection of GWs from a binary SMBH by PTAs, an EMRI signature from the low-mass object, and the electromagnetic observation of a tidal disruption event could provide a fascinating set of studies of distinct processes in a common object.

Testing Electromagnetic SMBHB Emission Models
Several binary SMBHB candidates have emerged, but it has proven difficult to find a smoking-gun signature that can conclusively confirm the binary nature of these sources. PTAs, on the other hand, can provide a straight-forward way to test the binary hypothesis for such candidate systems identified in electromagnetic bands. Given the current PTA capabilities, this test can be applied for candidates that are sufficiently nearby, massive, and at the right orbital phase to be emitting GWs in nanohertz frequencies, i.e., they need to be at sub-parsec orbital separations.
Even without a direct detection of nanohertz GWs, the PTA upper limits already provide astrophysically relevant constraints on the SMBHB population. For instance, recently, Sesana et al. (2018) and Holgado et al. (2018) examined the GW background from the population of SMBHBs inferred from the samples of periodic quasars and blazars, respectively. They found that, even if the individual candidates are below the sensitivity of PTAs, the inferred population is in moderate tension with the current limits on the GW background, which suggests that the samples may be contaminated by false detections.
Additionally, in some cases, electromagnetic constraints on a binary system may raise the PTA sensitivity for the same source by constraining the searchable parameter space. There are already several examples of this type of multi-messenger astrophysical application. Most famously, a M c 10 10 M SMBHB was hypothesized to exist in galaxy 3C66B due to an elliptical motion detected through long-baseline interferometry; it was suggested that this implied a binary-modulated jet precession (Sudou et al. 2003). However, Jenet et al. (2004) used only seven years of data on PSR 1855+09 to place limits on the allowed chirp mass and eccentricity of this system, effectively ruling out the purported binary parameters. More general applications of ruling out binaries in specific systems involve targeted campaigns to place limits on binaries in nearby, massive systems (Lommen & Backer 2001;Schutz & Ma 2016).
These types of studies are constantly improving, as the PTA horizon grows with time, and with the addition of more pulsars. Based on the non-detection of continuous waves in PTA data thus far, we can already en masse rule out nanohertz-frequency binaries with M c > 0.5 × 10 8 M in the Virgo cluster (NANOGrav Collaboration 2018).

Constraining SMBHB Populations
The previous sections discussed electromagnetic signatures of SMBHB systems that are primarily already in the microhertz to nanohertz gravitational waveband. However, we note that there is also a direct benefit to systematically search for "GW precursor" binaries-systems easily resolved as two AGN but well within the ∼few kpc virial radius of a typical merger. In part because they likely have a higher residence time at very large separations (Eq. 14), and in part because they are directly resolvable, these systems may be more readily accessible to large electromagnetic surveys. These can provide direct statistics on the systems that will make up our binary population (e. g. Wen et al. 2009). These systems also complement GW-based studies of massive merger environments and of the efficiency of SMBHB evolution (Sec. 3.2). So far, there have been dozens of such discoveries; a few examples include the resolved dual AGN in NGC6240 (Komossa et al. 2003), the spatially resolved broad line regions in the galaxies of Comerford et al. (2013), and the record-holding binary at a separation of 7 pc in 0402+379 (Rodriguez et al. 2006). Recently, astrometric tracking of 0402+379 led (Bansal et al. 2017) to suggest an orbital period for this binary of ∼ 10 4 yr. For a summary on the detections of dual AGN, we refer the reader to Rubinur et al. (2018) Statistics on the population of dual AGN can directly constrain some of the covariant parameters contributing to the GW background that are discussed in Section 3.2 and summarized in Fig. 4. This includes eccentricity distributions, inferred inspiral rates, and observed evidence of ongoing interaction with the galactic environment. However, the predictive power of these discoveries currently suffer from small-number statistics. Future large-scale systematic surveys, such as LSST, or the Square Kilometer Array, coupled with detailed observations using instruments with specialized capabilities, such as high resolution, rapid spectrum acquisition, and/or time-resolved imaging will significantly enhance our understanding of the binary population.
9.3.7. Tests of Gravity As mentioned in Section 5.2, in general relativity electromagnetic and GW signals propagate at the same speed. This agreement can be tested if a resolvable continuous wave source also happens to be an eclipsing electromagnetic binary (Cutler et al. 2003;Will 1998). Such a system allows for a differential test of the propagation speed of GWs and light, and the mass of the graviton. Using the order-of-magnitude limit on the graviton Compton wavelength (λ g = h/(m g c)) of such a system (Will 1998), and choosing nominal values for the parameters of an SMBHB, of f = 1 × 10 −9 Hz, D = 1 Gpc, and an observing time of ∆t = 5 years, one obtains a lower limit of λ g ∼ 10 19 km. This limit is three orders of magnitude less precise than the current Particle Data Group limit of 10 22 km, suggesting that today's PTAs might not be able to produce competitive limits without more sophisticated GW data analysis schemes (Choudhury et al. 2004). Figure 18: Limits on the mass of the graviton, given a characteristic noise strain amplitude for a pulsar timing array with a continuous wave signal of given chirp mass Hazboun et al. (2013). The lines and coloring represent contours of the graviton mass limit. A fiducial frequency of 10 −9 Hz is used for this illustration.
The approaches in Will (1998), Cutler et al. (2003), and Hazboun et al. (2013) are framed in the context of a null test, in which the assumed observation is the simultaneous arrival, within the margin of error, of some fiducial feature of both the electromagnetic and GW signal. While the error in arrival time is that of both the electromagnetic and GW signals, in practice the electromagnetic arrival time is much more precise, and is often ignored in terms of these calculations. The error, related to the signal-to-noise ratio in general, and to the strain sensitivity for a specific detector, is then used to set a limit on the difference in propagation speed between the electromagnetic and GW signals. This velocity limit can then be used to calculate a lower limit on the Compton wavelength and an upper limit on the mass of the graviton m g , if one assumes a Yukawa-type potential (V g ∼ e −kmgr /r) for the graviton. This type of potential is a phenomenological model which acts to restrict a interaction's ability to act at a distance. Figure 18 shows a contour plot of the limits on m g for a generic PTA at f gw = 10 −9 Hz. Takahashi (2017) considers an alternate approach to testing the concordance of electromagnetic and GW propagation using gravitational lensing. Standard analyses assume that gravitational lensing can be treated with geometric optics. This assumption is no longer applicable if the mass of the lens is too small compared to the wavelength of the radiation; less than 10 5 M (f /Hz) −1 .
Both of these tests of gravitational/electromagnetic wave propagation concordance would require the detection of an individual SMBH binary with a high enough signal-to-noise ratio in GWs and some electromagnetic measure of the orbiting SMBHs (e.g., optical or X-ray light curve). Takahashi (2017) shows that, for signal-to-noise ratios of 10-100 and GW frequencies f ∼ 10 nHz, there is a reasonable probability of obtaining a few lenses, for which the time difference between the GW and electromagnetic signals would be of order 1-10 days, if of order 100 individual SMBH binaries could be detected as (continuous wave) GW sources. (Longer time delays are required for detection at lower signal-to-noise ratio.) At higher (lower) GW frequencies, shorter (longer) time delays result in similar predictions for comparable signal-to-noise ratios.

Current Outlook
The focus of this paper has been on the possible sources that could generate nanohertz GW signals. We now consider briefly the observational status of pulsar timing arrays, both current and near future, as a means of providing some measure of the likelihood of detecting these signals. As indicated in the first two sections above and referenced throughout, the sensitivity of pulsar timing arrays depends on length of data set, RMS timing precision of pulsars, and the total number of contributing pulsars.
The three large national and regional PTAs are the North American Nanohertz Observatory for Gravitational Waves (NANOGrav, Arzoumanian et al. 2016;The NANOGrav Collaboration et al. 2015), the European Pulsar Timing Array (EPTA, Babak et al. 2016;Desvignes et al. 2016;Lentati et al. 2015), and the Parkes Pulsar Timing Array (PPTA, Hobbs 2013). These efforts, in addition to other emerging timing arrays in other nations (China, India) are currently working to combine their data into an International Pulsar Timing Array (IPTA; e. g. Hobbs et al. 2010).
As an approximate illustration of the performance of these PTAs, each typically times 40 millisecond pulsars with sub-microsecond precision, while some such as NANOGrav have begun to time > 70 pulsars to this precision. Currently, the total number of pulsars timed world-wide does not exceed 100, because there is overlap between the PTAs. Some of this overlap is intentional, for the purposes of calibration between the different telescopes, in addition to providing the ability to independently verify any reported GW detection.
Millisecond pulsars continue to be discovered (e.g., Cromartie et al. 2016;Pleunis et al. 2017), and, while not all are suitable for inclusion into a PTA, many are, so these PTAs can and are expanding. As a specific example, the NANOGrav 9 Year Data Release contained 37 pulsars (The NANOGrav Collaboration et al. 2015), the NANOGrav 11 Year Data Release contained 45 pulsars (Arzoumanian et al. 2018a), and NANOGrav is now timing more than 70 pulsars. In addition to new pulsar discoveries, improvements to radio observing systems will continue to improve mitigation of Gaussian pulsar noise , and we have an ever-increasing understanding of other contributions to the PTA noise budget. These improvements can provide sudden jumps in PTA sensitivity, as the data are re-processed using novel noise suppression techniques (e. g. Jones et al. 2017). Given these ongoing improvements, we consider it plausible that PTAs will include approximately 100 pulsars at typical precisions of approximately 500 ns by 2020.
Many of the possibilities discussed in later sections are somewhat more speculative ( § §6.1, 5, and 7). In many cases, these are likely to be "observationally driven," with models that are updated to evade new limits as they are placed. Most of these models are capable of producing data over many orders of magnitude across many parameters, thus updated models will continually be available, however, they will likely cover smaller and smaller allowed parameter space.
Current PTAs are already placing astrophysical limits on the existence of SMBH binaries and their interactions with their environments ( §3) and the existence of cosmic strings ( §4). Based on projections of SMBHB populations, we expect a GW background detection to be imminent from that population (Vigeland & Siemens 2016), with the detection of continuous-wave SMBHBs soon to follow (Mingarelli et al. 2017). The detection of these systems will be occurring concurrently with a number of planned synoptic sky surveys, enabling a much greater scope for multi-messenger studies ( §9) in addition to potential multi-band GW science with LISA when it flies in the mid-2030s ( §8). For the forseeable future, PTAs will maintain unique access to exploring the inspiral of high-mass SMBHB systems, and will uniquely probe the physical properties of strings.