Primordial gravitational waves in the nano-Hertz regime and PTA data -- towards solving the GW inverse problem

In recent years, several pulsar timing array collaborations have reported first hints for a stochastic gravitational wave background at nano-Hertz frequencies. Here we elaborate on the possibility that this signal comes from new physics that leads to the generation of a primordial stochastic gravitational wave background. We propose a set of simple but concrete models that can serve as benchmarks for gravitational waves sourced by cosmological phase transitions, domain wall networks, cosmic strings, axion dynamics, or large scalar fluctuations. These models are then confronted with pulsar timing data and with cosmological constraints. With only a limited number of free parameters per model, we are able to identify viable regions of parameter space and also make predictions for future astrophysical and laboratory tests that can help with model identification and discrimination.


Introduction
The discovery of the first gravitational wave (GW) signal by the LIGO/VIRGO [1] collaboration has started a new era in astrophysics and cosmology.GWs offer a new avenue to explore the physics of the very early universe, since they travel almost entirely undisturbed through space-time.While LIGO and other ground based GW detectors are most sensitive to GWs with kilo-Hertz frequencies, here we focus on much longer wavelengths down to nano-Hertz (nHz).GWs in that frequency range are searched for by pulsar timing array (PTA) experiments such as the European Pulsar Timing Array (EPTA) [2,3], the North American Nanohertz Observatory for Gravitational Waves (NANOGrav) [4,5] and the Parkes Pulsar Timing Array (PPTA) [6,7], as well as their joint collaboration, the International Pulsar Timing Array (IPTA) [8,9].
An interesting aspect of the PTA frequency range is that, due to causality, it requires sources which are active down to cosmological temperatures of a few Mega-electronvolt (MeV).This is exactly the time when we start to have, at least indirect, constraints on the content and dynamics of the universe, from observations of the cosmic microwave background (CMB), big bang nucleosynthesis (BBN) and neutrino decoupling.Thus, while there are many mechanisms known to produce sufficiently large GW signals to explain the PTA data, it is non-trivial to write down concrete models which satisfy all existing constraints.
The main goal of our paper is therefore to identify well defined benchmark models that can be confronted with both the GW data and cosmological constraints.Our focus here is on simplicity, i.e. to introduce as few new fields, parameters and couplings as possible, that nevertheless allow a consistent cosmological history.This also reduces the penalty for having too many parameters in Bayesian model comparison, which can in particular plague models of FOPTs, for which the signal parameterisation can be quite complicated even for simple underlying models.Other open questions that motivate our work are whether the frequency spectrum of the PTA observations can already discriminate between different models, whether the data from different PTAs can be consistently explained within these models, and which other cosmological observables can be used to identify or rule out different explanations of the signal.
In section 2, we introduce our benchmark models and their GW spectra.These are a simple abelian Higgs model with classical scale invariance as a benchmark for GWs from supercooled FOPTs, several axion and axion-like particle (ALP) models that serve as benchmarks for GWs from global CSs, DWs and from bosonic instabilities, and a single field inflationary model for secondary GWs from large density fluctuations.The GW spectra of these models are confronted with the PTA data using the enterprise software suite [37,38] and the rapid fitting tool ceffyl [39], as discussed in more detail in section 3. Cosmological constraints in general and also separately for each model are discussed in section 4.There we also present the results of our fits to the PTA data and identify the regions of parameter space which satisfy all cosmological constraints.We conclude with some thoughts about the necessary steps for addressing the GW inverse problem [40], in anticipation of upcoming data releases by the PTA collaborations.

Primordial sources of GWs in the nano-Hertz regime
A variety of models and scenarios can give rise to a primordial SGWB, see e.g.Ref. [41] for a review.Our goal here is not a comprehensive overview, but to study a selection of simple, but concrete, models that can produce a GW signal compatible with the current PTA data.This allows us to also confront the models with cosmological and laboratory constraints which provide complementary insight into their viable parameter space.
An important consideration is the magnitude and frequency range of the signal.To fully explain the observed excess noise, the GWs should today have a fractional energy density Ω GW h2 ∼ 10 −9 in the (10 −9 -10 −8 ) Hz range.GWs of this frequency are produced before matter-radiation equality, and thus subject to dilution by a redshift factor F ∼ 10 −5 between production and today.The largest possible amplitude is further constrained by the fraction of the total energy density carried by the source at the time of production Ω S , and by the ratio of the typical length scale of the source divided by the Hubble radius at production, (LH * ), to some power.Thus, the expressions for the peak amplitude can all be cast in a form where the power n depends on the type of source and the expression should be evaluated at the time of GW production.Consistency and causality imply that these factors cannot exceed unity, but they can easily suppress the signal by several orders of magnitude in realistic models.Said differently, most scenarios that give rise to primordial GWs will fail to produce a sufficiently large signal.We therefore focus on models and scenarios that are known to produce large GW signals at least in parts of their parameter space, which are also the models that will first be observed, or constrained, by future GW searches. 1Since it will appear frequently below, let us also define the reduced Planck scale here as M P = 2.4 × 10 18 GeV.

Strong first order phase transitions
Spontaneously broken symmetries are typically restored at high temperatures [43].As a consequence, the early universe probably went through at least two cosmological phase transitions (PTs) related to the breaking of the electroweak gauge symmetry and chiral symmetry breaking.While both transitions in the Standard Model (SM) are crossovers [44,45] and hence do not generate GWs, new physics effects can modify the SM transitions to first order, or induce additional symmetry breaking transitions, which can then generate a SGWB [46,47].
Cosmological FOPTs proceed via the nucleation of spherical bubbles of the true vacuum within the false vacuum [48][49][50][51].The latent heat released in the transition drives the expansion of the bubbles, which then collide and merge and eventually fill the entire universe with the true vacuum.Interactions of the vacuum bubbles with the ambient primordial plasma of the universe further induce bulk motion of the fluid, forming fluid shells around the bubble walls and producing sound waves as well as turbulent motion in the plasma.Collisions of the vacuum bubbles and fluid shells as well as sound waves and turbulence can then act as sources of a SGWB [46,47,52].
A FOPT can be characterized by four parameters: the temperature T * at which it occurs, the transition strength α, its characteristic timescale β −1 , and the wall velocity v w at which the bubbles expand.The transition temperature is approximately given by the percolation temperature [53,54], T * ∼ T p , at which the probability to be in the false vacuum falls below e −0.34 [55].The strength parameter α is often defined as the latent heat release normalized to the energy density ρ R in radiation at the time of the transition. 2  We here use the approximation α ≈ ∆V (T * )/ρ R (T * ), where ∆V (T * ) is the potential difference between the minima, which holds for strong transitions.If the energy released in the transition is non-negligible compared to the radiation energy density, and the transitioning field couples to the plasma, the universe is subsequently reheated to the temperature T rh ∼ T p (1 + α) 1/4 .The timescale β = Γ/Γ is related to the time derivative of the nucleation rate Γ.The wall velocity v w , last but not least, depends on the pressure on the wall exerted by the plasma, and hence on the particle physics model under consideration.In the following, we will focus on PTs that release sufficient energy to accelerate the bubbles close to the speed of light, i.e. v w = 1.
The corresponding GW spectrum can be extracted from numerical simulations and is approximated as a single or double broken power-law for the bubble and fluid shell collision [57][58][59][60] or sound wave [61][62][63] and turbulence [64,65] contribution, respectively.The parametric dependence of the peak frequency and peak amplitude on the PT parameters is given by where H * is the Hubble rate at the temperature T * .Furthermore, we assumed v w = 1, and F is a redshift factor that depends on the transition temperature only indirectly through the effective number of degrees of freedom.The powers p and q as well as the efficiency factors κ for converting vacuum energy into the respective source depend on the contribution under consideration (cf.e.g.Ref. [66]).The collision contributions for instance scale as p = q = 2.
As can be inferred from the scaling of Eq. (2.2), observable GW signals typically require strong transitions with α ≳ 1.We hence here focus on transitions that exhibit a large amount of supercooling, where the energy released in the tunneling can be comparable or even exceed the energy of the radiation bath, and the plasma friction can be insufficient to prevent the bubble walls from accelerating until collision.In this case, the dominant contributions to the GW spectrum come from the collisions of the vacuum bubbles as well as the relativistic fluid shells [60].
A simple class of models known to feature strong FOPTs are scenarios of near-conformal dynamics undergoing a supercooled PT [21,[67][68][69][70][71][72][73][74].Arguably the simplest model is that of a classically scale invariant scalar field with radiative breaking of conformal symmetry, the famous Coleman-Weinberg (CW) model [75].At the classical level the CW theory is defined by the Lagrangian density where D µ Φ = ∂ µ Φ−igA µ Φ is the covariant derivative, and the tree-level potential of the conformal theory is V 0 = λ|Φ| 4 .Loop corrections to the potential then induce radiative symmetry breaking.The corresponding PT can be of first order, provided that the potential exhibits a sufficiently flat direction, i.e. λ ≪ g 4 .To simplify the analysis, we hence neglect the tree-level contribution and focus on the potential V 1 (ϕ, T ) generated at one-loop level for the order parameter ϕ of the theory, such that Φ = (ϕ + iG)/ √ 2, and where G denotes the Goldstone boson that becomes the longitudinal mode of the gauge boson after symmetry breaking.The theory is then described by only two parameters: the gauge coupling g, and the renormalization scale µ R at which the coupling is given.
At zero-temperature, the potential is given by the CW one-loop potential [75].The field develops a global minimum at ⟨ϕ⟩ = e 1/6 µ R /g, breaking the gauge symmetry spontaneously.The gauge bosons then acquire a mass m A = g⟨ϕ⟩, while the mass of the scalar mode is loop-suppressed, m ϕ = 3/2g 2 /(2π)⟨ϕ⟩.
The finite-temperature effective potential at one-loop can be expressed in the high-temperature expansion as In this equation m 2 (T ), δ(T ) and λ(T ) are functions of the temperature T and of other possible couplings or mass scales of the theory.For the CW model, we find the following form for these parameters [69] where N b = 3 is the number of the degrees of freedom of the (massive) gauge boson.Following [69] we introduce the mass scale M of the theory as such that the zero-temperature vacuum expectation value becomes v ϕ = ⟨ϕ⟩ ≈ 11.6 M/g, and m ϕ ≈ 2.3 gM .Note that Eq. (2.4) includes both, the finite-temperature and zero-temperature contributions, see appendix C for further details.Using the expressions above, we can computationally determine the parameters associated with the PT (see also appendix C for further details on the determination of the PT parameters).The parameters are then used to determine the corresponding SGWB, cf.appendix D. This then allows a comparison between our predicted spectra and empirical data from PTAs.
Figure 1 depicts the percolation and reheating temperatures T p and T rh as well as the transition strength α and inverse timescale β in the CW model as a function of the mass scale M and gauge coupling g.A significant amount of supercooling with percolation temperatures orders of magnitude below the mass scale, T p ≪ M , can be achieved in a large part of the parameter space, in particular at low gauge couplings.If the gauge coupling becomes too low, the field does not tunnel and the universe remains stuck in the false minimum, leading to eternal inflation, indicated by the white region.The supercooling goes along with strong transitions as well as large amount of reheating, with the reheating temperature on the order of the mass scale.As the reheating temperature eventually determines the frequency scales of the corresponding SGWB, M ∼ MeV is required for a signal in the NANOGrav frequency band.We further find α > 1 throughout most of the parameter space.The largest values of α are obtained close to the region where the field gets stuck in the false vacuum, however, which is also the region where our calculation is unreliable as higher-order corrections3 such as Daisy resummation become important.The inverse timescale parameter is typically of the order of β/H * ∼ 10 to 100.Note that for β/H * < 3 bubbles cannot percolate efficiently due to the expansion for the universe, and for β/H * < 10, the GW spectrum is likely overestimated as the expansion during the transition has to be taken into account [21,23,63].

Meta-stable topological defects, remnants of symmetry breaking
Spontaneous symmetry breaking in the early universe is accompanied, in many extensions of the SM, by the production of topological defects such as CSs and DWs [79,80].If symmetry breaking happens after inflation, these topological defects can carry a significant fraction of the total energy density in the early universe, and they give rise to GW spectra with distinct features.
Of particular relevance for GW production are networks of CSs from the breaking of a global U(1) symmetry, or networks of annihilating DWs from the breaking of a discrete symmetry.A simple and well studied model that gives rise to CSs is that of ALPs with post-inflationary Peccei-Quinn (PQ) breaking.For DWs, we consider two scenarios.In aligned axion models [81,82] the QCD axion is accompanied by several heavier partners, which necessarily couple to photons and gluons and thus reheat the visible sector after DW annihilation.Instead in heavy ALP models [83] the ALPs could also decay to dark radiation after DW annihilation, thus leading to a "dark DW" scenario [84].In the following we introduce these scenarios in more detail.

Global (ALP) Strings
CSs are one-dimensional objects originating from the spontaneous breaking of a U(1) symmetry.Here we will solely discuss global strings, i.e. strings stemming from breaking a global U(1) symmetry, for instance, an ALP string network.The core of the CS has a typical size that is of the order of the inverse of the symmetrybreaking scale, usually much smaller than the horizon scale.This allows us to employ the Nambu-Goto approximation, where CSs are one-dimensional infinitely thin objects whose tension µ is in units of energy per unit length.The string tension of ALP strings is given by where f a is the symmetry-breaking scale for an ALP string network with winding number n = 1.The network forms when the temperature of the universe is of the order of the PQ symmetry-breaking scale through the Kibble mechanism [85], This implies that there is a UV cut-off for the network associated with the time of formation, but this is in frequency ranges far above our region of interest.Once the network is formed it evolves towards an attractor solution that is independent of the initial conditions, the so called scaling regime.This is the result of equilibration between the competing effects of string recombination and the Hubble expansion.GWs from strings are primarily radiated off loops.For global strings, this happens shortly after the loops are produced.GWs are only a subcomponent of the total emission from CS networks, with the main radiation emitted from long strings and loops going into axions.We base our quantitative analysis on the GW spectrum that was obtained in Ref. [86] from numerical simulations.In the scaling regime, it takes the form . (2.9) If the U(1) symmetry was exact, the string network would exist until today and would be subject to strong bounds from the non-observation of gravitational imprints on the CMB.We therefore consider a scenario where the U(1) is only approximately realized.The amount of symmetry breaking can be parameterized by the axion mass m a (the Goldstone boson associated with the symmetry).Once the Hubble rate falls below the mass, DWs form and lead to the rapid annihilation of the string network given that there is only one minimum in the axion potential (see below for multiple minima).The time of the decay can be related to a pivot frequency today at which the spectrum deviates from the form given above and transitions to a ∝ f 3 slope in the IR.The pivot frequency can be expressed in terms of the mass as which is the annihilation frequency of the string network.We use a smooth step function for interpolating the GW spectra between the regime f > f pivot described by Eq. (2.9) and a ∝ f 3 dependence in the IR regime.
Before moving on, let us note that in the literature there is an ongoing debate about whether the GW spectrum features the log enhancement that is apparent in Eq. (2.9).When these simulations are carried out, technical constraints force one to only consider times close to the creation of the network when log(f a /H) = O(1 -10).While there are many simulations in this regime [87][88][89][90] that agree up to order one factors, there is disagreement on how one should extrapolate these results to log(f a /H) = O(10 -100) which is the relevant regime for predictions on GWs in PTAs.Suggestions reach from an exactly scale invariant spectrum [90,91] to the log 4 enhanced one [87,88] that we are showing.The analytic observation that the string tension is expected to have a log enhancement together with other considerations on the scaling of the number of strings per Hubble patch leads semi-analytic approaches to suggest an in-between behavior [92][93][94].
The main consequence of this choice is that it changes the value of f a towards larger values by a factor of up to O(10) to explain the amplitude.When comparing the PTA fit to the strength of other constraints these effects partially cancel, however, since also other observables such as the axion abundance are predicted to be log enhanced if the GW amplitude is (see section 4.2.1 for more details).A minor effect is that a log enhancement leads to a small slope in the scaling regime.

Annihilating ALP/Axion DWs
DWs are two-dimensional topological defects that form due to the spontaneous breaking of a discrete symmetry.We refer the reader to Ref. [80] for a review of their cosmology, which we briefly summarize here.After the breaking of the symmetry, different patches in the universe populate the different degenerate minima.At the intersections of the patches the field has to interpolate between these ground states, which leads to a concentration of energy in thin two-dimensional sheets, the DWs.They carry a characteristic energy per area, or surface tension, σ.Subsequently, these networks evolve to minimize the area of the DWs, while radiating the majority of the released energy as particles and a small fraction as GWs.
Numerical simulations show that the evolution rapidly reaches a scaling regime, with roughly one DW per Hubble volume.The typical curvature radius is on the order of O(H −1 ) and the energy density is given by ρ DW ≃ A σH , (2.11) where H is the Hubble rate and the area parameter A ≃ O(1) is a numerical prefactor extracted from simulations.In a model with a single real scalar field with broken Z 2 symmetry, its value is A = 0.8±0.1 [95].
For larger numbers of degenerate minima, N DW > 2, it is slightly bigger [96], but we will restrict our discussions to the case of N DW = 2 below.
As the universe cools down and the Hubble rate decreases, the total energy ∝ H 2 is depleted faster than the one in the DW network.Thus the energy fraction in the DW network is growing as and eventually comes to dominate the energy density at a temperature . (2.13) In order not to overclose the universe, DWs must be unstable and decay at temperatures above T dom .This requires an explicitly symmetry-breaking term in the Lagrangian, lifting the degeneracy between the vacua.We denote by V bias the energy difference between minima.This difference generates a volume pressure which leads to the annihilation of the network once the pressure becomes comparable to the surface tension in the walls.In a radiation-dominated universe, the network then annihilates at a temperature [96] T ann ≃ 20 MeV σ TeV 3 . (2.14) In order to avoid overclosure, V bias has to satisfy V 1/4 bias ≳ 0.03 MeV σ/TeV 3 1/2 .The network's energy is located in the DWs that are moving at relativistic velocities.Additionally, the scalar particles that get radiated from the DWs also possess large inhomogeneities.Both contribute to an anisotropic stress that leads to the emission of GWs.The majority of GWs get emitted right before the network annihilates, when its energy Ω DW is largest.The GW spectrum can be found from numerical simulations [80] and its peak amplitude and frequency are found to be T ann 10 MeV . (2.15) Here we included redshift factors assuming that annihilation takes place during radiation domination at temperatures close to 10 MeV that lead to the correct frequencies for PTAs.We follow Ref. [24] and assume that away from the peak the spectral shape is given as The behavior for frequencies below the peak as f 3 is dictated by causality, while the f −1 above the peak is found in numerical simulations.Additionally, there will be a cutoff at much larger frequencies that is correlated with the time when the network is first formed.These frequencies are, however, way too large to be relevant for the PTA data we consider.
So far, the discussion has been fairly model-independent.We will now explore two explicit models, where the DWs arise in a generic ALP model or in a clockwork realization of the QCD axion.While the two scenarios lead to almost identical GW spectra, their cosmological signatures are quite distinct.
We first consider the case of an ALP as the pseudo Nambu-Goldstone boson generated from the spontaneous breaking of an anomalous U(1) symmetry, which we will refer to as a Peccei-Quinn symmetry, at a scale f a .The Lagrangian takes the form where Φ = ρ/ √ 2 exp (ia/v a ) is a complex scalar field and the axion a is its angular part.The potential of Eq. (2.17) is such that the U(1) symmetry is spontaneously broken, with a vev ⟨Φ⟩ = v a / √ 2 and a ∈ [0, 2πv a ).The term V (a) in Eq. (2.17) is the anomaly-induced U(1) breaking under the influence of a strongly coupled gauge theory with dynamical scale Λ ≃ √ m a f a .This explicitly breaks the U(1) symmetry into its Z NDW subgroup.The conventional form of such explicit breaking at zero temperature is where f a = v a /N DW .The tension of ALP DWs in the absence of finite temperature effects can be estimated as [80] σ = 8m a f 2 a . (2.19) We will work under the assumptions that v a < T rh , such that the U(1) symmetry is restored after inflation and the network forms as the universe cools down, and that there is a large separation of scales between v a and Λ.This initially leads to the formation of a CS network which persists until the time of DW formation when H ≈ m a .Since N DW ≥ 2, there are multiple DWs attached to every string and the network is stable.Shortly after this time, the combined network is dominated by the dynamics of the DWs and one can neglect any effect the remnant strings have on the evolution.GWs produced by strings, as well as cosmological constraints such as those coming from N eff , are negligible with respect to the contributions from DWs.This is because, as will be clear from section 4.2.2, the decay constant f a is much lower than in the CS scenario discussed above.
In addition, the global U(1) symmetry is expected to be broken quantum gravity effects.Therefore, additional breaking terms, if not accidentally aligned with the anomalous breaking, lift the degeneracy between the minima.They provide the necessary V bias for the network to annihilate, with the temperature of annihilation given by Eq. (2.14).
In addition to the generic ALP model, we consider DWs in models of the QCD axion, i.e. models that solve the strong CP problem.One such scenario is that of axion alignment [81] realized by a clockwork mechanism.Here a collection of N axions that individually respect a shift symmetry is considered, where C i is a real-valued transformation parameter.One then assumes that N − 1 of these shift symmetries get explicitly broken into their discrete subgroups, giving rise to the potential for N − 1 linear combinations of the axions.The remaining flat direction is then identified as the QCD axion with its associated gluon coupling in, for instance, the Kim-Shifman-Vainshtein-Zakharov (KSVZ) model.The main advantage of this scenario is that it gives a light QCD axion with an exponentially enhanced effective decay constant F a ∼ f a e N , while the actual symmetry breaking scales f a can be much lower, e.g.around the TeV-PeV scale, thus making the model testable at particle physics experiments.This also ensures that the symmetry breaking can take place after reheating, and thus a DW network, made from the N − 1 heavy axions predicted by the model, can form.Ref. [81] found that the DWs are long-lived and survive until the QCD axion potential becomes relevant.For simplicity we take equal masses m a and equal decay constants f a for all the heavy axions.In terms of these, the DW tension is again given by Eq. (2.19).
Different from the generic ALP model, here the network is destabilized by QCD instantons at the time of the QCD phase transition.This lifts the degeneracy between the different minima by ∆V ≃ Λ 4 QCD , and the annihilation temperature can be predicted as [81] T ann ∼ 1 GeV g * (T ann ) 80 Up to order-one factors, the GW spectrum obtained in Ref. [81] agrees with Eq. (2.15), and we will therefore use the latter for both models.The main difference instead is that in the aligned axion model, also the heavy axions couple to the SM via the usual axion couplings [97].Shortly after DW annihilation, the energy is therefore transferred back to the visible sector via decays of the heavy axions into SM particles.As discussed in more detail in section 4.2.2, these two DW scenarios are therefore subject to very different cosmological constraints.

Bosonic instabilities and late preheating
Instabilities in bosonic fields can lead to the exponential amplification of fluctuations, which then act as sources for GWs.Such mechanism are best known in the context of preheating [98][99][100][101][102][103] but can also take place in a purely gravitationally coupled sector contributing a sub-dominant fraction of energy [104][105][106][107][108][109].In these models the energy is initially stored in one homogeneous field, typically a scalar.The self-couplings of the field or couplings to other bosons then induce a time-varying effective mass that leads to the exponential amplification of fluctuations in the secondary field.Once the fluctuations start to dominate the energy density, the distribution of the energy also becomes highly inhomogeneous.With this transition, a large anisotropic stress get generated that leads to the emission of GWs.
The peak wavelength of the produced GWs is given by the fastest growing mode.Due to causality this wavelength is always smaller than the horizon at the time of production.On the other hand, the wavelength cannot be much larger in order for the mechanism to efficiently source GWs, establishing a firm relation between the peak frequency of the GW spectrum and the time of production.Similar to the PTs discussed above, any such process must therefore occur at temperatures of T ∼ (10 -100) MeV in order to account for the PTA signal.
The amplitude of the GW signal is mainly set by the fraction of energy in the bosonic sector.Explaining the PTA signal requires energies that cannot easily be hidden from other cosmological probes.We can distinguish two possible scenarios: If the GW production is highly efficient, a purely gravitationally coupled sector might still be able to evade the bounds on dark radiation and matter.If this is not possible, all or part of the energy in the dark sector must be depleted into the visible sector via direct couplings.Such energy transfers would need to occur before BBN.A detailed discussion of the bounds can be found in section 4.3.
As a concrete example, we consider a model where an ALP is coupled to a dark photon [104,110,111].The dark sector consists of a CP odd scalar ϕ with mass m a and decay constant f a and a dark photon X µ .The two are coupled through where α denotes a dimensionless coupling constant while X µν and Xµν denote the dark photon field strength and its dual.We assume the axion is initially displaced by ϕ 0 = θf a from the minimum of its potential that we take as ) to be explicit.Once the Hubble rate drops below the axion mass, H(t osc ) = m a , the axion starts oscillating around the minimum of the potential.The non-vanishing velocity of the axion φ ̸ = 0 leads to a tachyonic instability in one of the chiral polarizations of the dark photon induced by the coupling.As a result, fluctuations with momenta k ∼ αθm a grow exponentially.In order for the tachyonic production to be efficient the product αθ needs to be larger than 20, although larger values lead to more efficient emission of GWs.For simplicity we will fix θ = 1 and α = 100. 4The tachyonic growth is very rapid as compared to the process of parametric resonance occurring in other models (cf.e.g.Refs.[103,105]) and therefore leads to more efficient GW emission.
The GW peak frequency in this case is largely controlled by the mass m a , since it determines the time at which the axion starts to oscillate.The peak amplitude at emission is controlled by the energy in the dark sector which is given as Ω d,osc ≈ (ϕ 0 /M P ) 2 .In principle one needs to distinguish here between ϕ 0 ≳ M P and ϕ 0 ≲ M P .In the first case the axion dominates the energy density and possibly even drives a period of inflation.It must then necessarily reheat the SM.Also the expansion is primarily driven by the bosonic sector and should be self-consistently taken into account in simulations [114,115].In the other case the bosonic sector only comprises a small amount of the total energy and can be purely gravitationally coupled [104,116].
Using the data from Ref. [117] we update the template for the GW spectrum given in Ref. [118].We find that the peak frequency and amplitude of the signal today are given by f peak ≈ 3.9 × 10 −9 Hz αθ 100 ) (2.24) The shape of the spectrum can be characterized by a broken power-law of the form where we determined the values of n 1 = 0.73, n 2 = −4.96and ∆ = 0.24 by fitting.The small prefactor here includes the usual redshift, while the main additional suppression of the signal is due to the energy density of the source scaling as (f a /M P ) 2 .Thus, an observable signal requires the scale f a to be close to the Planck scale.

Scalar-induced GWs
When scalar perturbations generated during inflation reenter the horizon, they act as a second-order source term for GWs, schematically through the relation Ω GW (k) = T P ζ P ζ , where T is an appropriate transfer function [119][120][121][122] (see Ref. [123] for a review).The predicted GW abundance for a power-law spectrum compatible with CMB observations is Ω GW ≃ 10 −21 at NANOGrav frequencies [121].If scalar perturbations are enhanced at scales smaller than the ones probed by CMB, they could instead generate a visible signal that may explain the NANOGrav data.This may happen typically because of some feature in the inflaton potential which leads to slow-roll violation and the onset of an ultra slow-roll phase [124,125], or because of the dynamics of a second field in multi-field or spectator field scenarios [126,127].Other, more exotic options are also possible, including e.g. the violation of the null energy condition (see Ref. [128] and references therein).At the same time, such large scalar fluctuations are expected to induce a population of primordial black holes (PBHs).Imposing that these PBHs do not overclose the universe, and that they comply with astrophysical bounds, strongly constrains the amplitude of the primordial scalar fluctuations [36].
A conventional parametrization for the curvature power spectrum is a log-normal function centered around some momentum k * and of width ∆.Assuming standard expansion history and that the modes reenter the horizon during radiation dominance, analytic expressions can be obtained for the induced GW spectrum [36,129] which, around the peak, is well approximated by a log-normal function with width ∆/ √ 2. The problem can be studied in a more general background [130,131], but we will stick to radiation dominance for simplicity, as it is expected in the standard cosmological history.
In inflationary models featuring an inflection point, the log-normal parametrization discussed above does not reproduce the qualitative behaviour of the curvature power spectrum, which, in this case, is rather asymmetric around the peak, with a steep tail in the IR and a relatively flat (although red-tilted) plateau above the peak.We found that a reasonably good parametrization around the peak region can be written as a log-normal function below the peak and a red-tilted power spectrum above it, where we have stitched the two pieces with a θ function centred at k/k * = e n∆ 2 , ensuring a continuous first derivative.Following conventions, the peak amplitude in this parametrization is ).From Eq. (2.26), one can derive the GW abundance by integrating numerically P ζ P ζ with the correct transfer function [127,130], or try to obtain an analytic approximation along the lines of Ref. [129].As it turns out, due to the smallness of the cross terms, a good approximation is to simply glue the analytic expressions for the GW power spectrum obtained for the log-normal and for the power-law scalar spectra.These are given by the following expressions, for the log-normal part, where α = f /f ⋆ , K = α exp 3∆ 2 /2 , and we assumed ∆ ≳ 0.5 [36,129], and (2.28) for the power-law part [130].Putting things together, our ansatz for the GW power spectrum is where a tanh function has been used to glue the two contributions together.The GW frequencies f correspond to momenta k of the scalar power spectrum through the relation In section 4 we will use Eq.(2.29) to fit PTA data, while Eq.(2.26) will be needed to discuss constraints from the overproduction of PBH and CMB spectral distortions.
In order to motivate an interpretation of PTA data in terms of scalar-induced GWs, it is important to discuss models of inflation that could generate such a signal.Refs.[132] and [133] discuss examples of this in the context of thermal inflation and two-fields inflation with a waterfall phase, respectively.The focus of these works is on PBH production, but the same scalar perturbations would induce a GW signal in the right ballpark.In both cases, a symmetric log-normal distribution seems to be a reasonable proxy for the power spectrum.Another model is the one shown in Ref. [134], in which the fluctuations of an axion-curvaton field are enhanced by the rolling phase of its radial counterpart.Interestingly, in this model the power spectrum can be enhanced for many efolds, generating a GW signal spanning from PTA up to the LISA range.An important consideration made in this paper concerns the role of non-gaussianities.In their model, non-gaussianities suppress the production of PBHs, while only mildly affecting the spectrum of GWs, thus avoiding the bound from the overproduction of PBHs.This highlights the fact that the correlation between the GWs and the PBH signals can only be discussed once non-gaussianities are taken into account.Finally, in Refs.[135,136], a reverse-engineering approach is taken, in which a suitable power spectrum is generated by assuming a specific behaviour of the second slow-roll parameter as a function of the number of efolds during inflation η(N ), and then a slow-roll potential is obtained numerically.
Here we try a more direct approach, and consider whether a simple single-field inflationary model with an inflection point could reproduce our signal, while still respecting CMB bounds.We show below that this appears to be impossible.The simplest example one can think of is a ϕ 4 theory with a running coupling and non-minimal coupling to gravity [125], where The parameters b 1 , b 2 , b 3 mimic the logarithmic running of the potential, and can be tuned in such a way to reproduce an inflection point around ϕ = ϕ 0 .A simplified version of this model is obtained by setting b 1 = 0, and reproduces a possible scenario for Higgs inflation with a reasonable choice of parameters [124,137].The potential for the field ϕ in the Einstein frame can be rewritten as ) with β = 0, the potential has an exact inflection point at x = x c .The parameter β ̸ = 0 allows for a small deviation from an exact inflection point.We assume β ≪ 1, and take it as a measure of the amount of fine-tuning needed.
The curvature power spectrum is obtained in the standard way by solving the equations of motion for the zero-mode field and the Mukhanov-Sasaki equation for the curvature power spectrum, for each choice of parameters.One should remember here that the field ϕ is not canonically normalized after going to the Einstein frame, and thus the equations of motion are not the usual ones.In order to speed up the calculation, it is useful to adopt the approximate expression for the curvature power spectrum Figure 2. Scan of the parameter space of the inflationary model.The green horizontal bands identify the target region that could fit PTA data [36], while the gray bands correspond to the CMB measurements.The blue dots corresponds to the full scan.The three stars are the benchmark points discussed in appendix E. Shown in the plots are the peak position k * , the peak amplitude P ζ, max , the spectral index at the pivot scale and the number of efolds from when the pivot scale exits the horizon until the end of inflation.
Strictly speaking, this expression is valid only at slow-roll, while a peak in the power spectrum is obtained precisely due to its violation.Nevertheless, it gives a reliable enough indication of the position of the peak in momentum space and of its amplitude.Once some relevant benchmarks are identified, one can then solve the Mukhanov-Sasaki equation for a more reliable calculation.
The parameter space of the model is scanned in the range reported in Tab. 3 in appendix E. We fix the amplitude of the scalar power spectrum at the pivot scale k = 0.05 Mpc −1 to P ζ = 2.1 × 10 −9 [138], and compute the peak amplitude P ζ, max , the position of the peak k * , and the momentum k 0.1 < k * at which the amplitude decreases by a factor 0.1.The last quantity is important when the spectrum has a very flat plateau, in which case k 0.1 represents better than k * the point above which the power spectrum is enhanced.In order to compare with CMB observations, we also compute the scalar spectral index and the number of efolds from when the pivot scale exits the horizon to the end of inflation.
The results of the scan are shown in Fig. 2. The situation is particularly clear from the first plot.A peak in the power spectrum in the PTA region inevitably leads to a low spectral index at the pivot scale.This is ultimately due to how the enhancement is generated: in field space, the CMB pivot scale 0.05 Mpc −1 and the PTA scale are too close to each other to have an important feature at the latter without affecting the former, unless the feature is generated via some kind of step function, which is difficult to justify in a single field model.We checked that this result does not depend on the finiteness of our scan region by extending it in all directions.
A similar tension is present even if one is interested in a signal in the LISA frequency band, that corresponds to k * ∼ (10 12 -10 14 ) Mpc −1 .This can be generated in this scenario, at the price of a small n s and a large number of efolds between the horizon crossing of the CMB scale and the end of inflation [137].
The same exercise we discussed here can be done with a polynomial potential inspired by EFT considerations, of the form (2.36) Ref. [139], for example, considers such a potential including only even terms up to n = 6, and mimics the inclusion of a non-minimal coupling to gravity by dividing V by (1 + ξϕ 2 ) 2 .Notice that they do not canonically normalize the field ϕ, so this is not equivalent to what is done here and in Ref. [125].While they are able to find a peak corresponding to NANOGrav frequencies, they obtain n s ∼ 0.9 (see their Fig.3), in accordance with our results and in 10 σ tension with the n s measurement by Planck.

PTA data and fitting method
Since 2020, three of the currently operating PTA observatories, NANOGrav [10], EPTA [13] and PPTA [12] have reported strong evidence for a common-spectrum red process across pulsars in their data.Similar results were found by the joint IPTA [14] collaboration combining previous data releases of the former three.Although the evidence for the interpulsar Hellings-Downs correlations [140] required for establishing the observation of a GW signal is still not conclusive, it is intriguing to attribute this signal to a SGWB.We here operate upon the assumption that the putative SGWB signal is of cosmological origin and assume that the potential SGWB from SMBHBs is subdominant in the frequency range we consider.We fit the GW signals produced by the sources discussed in section 2 to pulsar timing data.For PTA data this fit is typically done in terms of the timing-residual cross-power spectral density, Here, Γ ab denotes the overlap reduction function between two pulsars a and b and H 100 = 100 km s −1 Mpc −1 .However, a thorough Bayesian analysis of this quantity for PTA data is a computationally expensive task.Hence, fast refitting techniques have been developed to reinterpret free-spectrum fits (to the entire PTA or for each pulsar individually) in terms of arbitrary SGWBs, resulting in the development of the ceffyl analysis suite [39].In this vein, we perform a PTA free-spectrum refit in ceffyl to the NANOGrav 12.5 year dataset [5] and the second data release (DR2) of IPTA [9,14].The kernel density estimator (KDE) representation required by ceffyl for the NANOGrav dataset is provided by the collaboration [141].For the IPTA data, the KDE is created reproducing the 30-frequency-bin free-spectrum fit with enterprise [37] and enterprise extensions [38].We verified that ceffyl reproduces the best-fit regions for a simple power-law SGWB that were obtained by NANOGrav and IPTA.Furthermore we also checked that this remains true for broken power-law signals by comparing the best-fit regions obtained with ceffyl to those from a full enterprise run for the scenario section 2.3.We use PTMCMCSampler [142] to sample the respective parameter spaces of our signal models, except for the CW model for which we evaluate the likelihood on a regular grid.An overview of the priors we use in our Bayesian analysis can be found in appendix A. Following the NANOGrav searches for a general common-spectrum signal [10] and for a SGWB from FOPTs [11], we use only the five lowest-frequency bins of the NANOGrav data, corresponding to frequencies below f ≲ 12.5 nHz.These bins give the strongest contribution to the signal-to-noise ratio (SNR) of the NANOGrav signal [10].For consistency, we cut the IPTA data at the same frequency.Due to the longer observation period of 30 years, this corresponds to twelve bins in the IPTA 30-component free-spectrum fit.Other choices regarding the IPTA frequency range are discussed in appendix B.
Intuitively, the most robust insight into the low-frequency regime can be obtained by combining data from different PTA collaborations.However, this is a very challenging task.The different datasets comprise pulsars observed by multiple collaborations, 5 timed over various observation periods at different times and with different telescopes.A considerable amount of care has to be taken to properly account for all correlations between the individual measurements due to overlaps in the timing data as well as instrumentspecific noise contributions.In order to address this, EPTA, NANOGrav and PPTA, as well as the Indian Pulsar Timing Array (InPTA) project, have formed the joint IPTA consortium.While the current evidence in the individual PTAs could only be established with the respective latest data releases, IPTA was able to detect the common spectrum combining data available in 2016 and before.Hence, the question arises what can be expected when all currently available measurements are taken into account.
Clearly, a forecast with a proper account of pulsar overlaps is beyond the scope of this work.To approximate the signal reconstruction obtained in a future IPTA release, we here employ a naive combination by multiplying the likelihoods of the NANOGrav 12.5 year data and IPTA DR2, where we again evaluate the likelihood of each experiment using ceffyl, i.e. treating the two datasets as independent.In addition to the overlap discussed above, this procedure evidently double-counts pulsars, as IPTA includes the nine-year NANOGrav data.However, not all pulsars in the nine-year dataset have been observed for a sufficient amount of time to contribute to the frequency bins we consider, somewhat reducing the number of doublecounted pulsars.Nonetheless, a significant overlap between the two datasets remains.Consequently, while we still present the naive combination as a rough indicator for how the results might change in the near future, the corresponding fit should be taken with caution.

Cosmological constraints and final results
As discussed in section 2, in order to explain the large observed signal, the GW sources should constitute a significant fraction of the total energy density in the universe at the time of GW production, and are therefore likely to be subject to other cosmological constraints.Furthermore, the frequency range puts the time of GW production close to that of nucleosynthesis and neutrino decoupling at T ≲ 1 MeV and T ≈ 2 MeV, respectively.More precisely, the frequency today can be written as [143] where T * is the temperature of the universe at the time of production.Causality requires the last factor to be larger than unity, therefore GWs in the frequency range probed by NANOGrav cannot be produced much before the time of nucleosynthesis.Overall, the following constraints are considered for each model: • The abundance of relativistic degrees of freedom, commonly expressed as ∆N eff , should satisfy the current constraints ∆N BBN eff ≤ 0.39 at 2σ and ∆N CMB eff ≤ 0.29 at 2σ [138] confidence level (CL), at the relevant temperatures.
• Big Bang Nucleosynthesis should not be affected significantly, which roughly implies that the universe should look standard model like at temperatures below 1 MeV.More precise constraints on the allowed amounts of energy influx into the SM thermal bath during and after BBN are available e.g. in Refs.[144,145] and are taken into account when relevant.
• If the energy in the GW source leads to significant late time reheating, then the universe should reheat to T RH ≳ 2 MeV such that also the neutrino sector is re-thermalised [145,146].Furthermore the dilution of previously produced baryon and dark matter (DM) abundances has to be taken into account.
• Spectral distortions of the CMB can be induced by late decays of particles [147,148] or by large fluctuations in the plasma or even decoupled sectors [149,150], and thus can both directly and indirectly constrain models that produce large GW signals at low frequencies.The current limit on µ-type spectral distortions from COBE/FIRAS is µ < 4.7 × 10 −5 [151], which could be improved to µ < 3 × 10 −8 by future missions such as PIXIE [152].
• If the energy stored in the GW source is partially converted to non-relativistic dark matter, its abundance should not exceed the observed value today Ω DM h 2 ≤ 0.12.If this becomes a significant fraction of the total dark matter, also isocurvature constraints apply.
• The large energy anisotropies typically required to produce GWs may lead to the formation of primordial black holes or dark matter mini-clusters.Their abundance should obviously agree with observational constraints.
While above these cosmological observations are discussed as constraints, more precise measurements in the future will allow us to discriminate between different sources of GWs.Below we show this in particular for the case of constraints on ∆N eff , which is expected to improve by an order of magnitude in the coming decade [153,154], and for the case of CMB spectral distortions, which could be improved by several orders of magnitude in the not too distant future [155,156].

Strong first order phase transitions
Only very strong PTs can explain the observed GW signal, which implies that at the time of GW production the energy in the source is a sizable fraction of the total energy density of the universe.Fits of generic PT spectra have obtained α ≳ 0.1 [11,19,145], which however requires pathologically small values of β/H * that might be difficult to realise in concrete models.Instead in our model we find α ≫ 1 for all viable parameter points.This implies that converting the energy into dark radiation and thus hiding it, as suggested in Ref. [66], is not a viable option.Instead, the energy should be transferred back to the visible sector, which leads to a substantial amount of reheating.We quantify this using the ratio of entropies in the visible sector at percolation T p and after reheating T rh , s rh /s p .This is more than one order of magnitude for most parameter points, and can exceed 10 10 in some regions, as shown in Fig. 3.This ratio is important because it corresponds to the amount by which previously generated abundances such as the baryon asymmetry or the dark matter density are diluted.A dilution factor of order 10 10 would require an order one baryon asymmetry before the phase transition, so significantly smaller values are clearly preferred.
Let us consider the results of the fit, shown in Fig. 3, in more detail.First we see that the NANOGrav data prefers a lower mass scale of M ∼ (1 -10) MeV, while the IPTA data prefers a higher mass but also significantly more supercooling/reheating.This can easily be understood from the left figure, where one sees that IPTA data prefers to sit on the IR tail of a very large signal, which pushes the fit to larger masses and larger amplitudes, while NANOGrav prefers to have the GW peak at lower frequencies.This discrepancy depends quite sensitively on the number of frequency bins included for each dataset, as discussed in more detail in appendix B. Ultimately, additional data will hopefully resolve this situation.
In fact most of the parameter range preferred by IPTA would be challenging to accommodate in standard baryogenesis scenarios. 6While one could imagine producing the baryon asymmetry directly in this supercooled PT, it would require a departure from the minimal model that we consider here.The reheating constraint is satisfied in most of the preferred parameter space.Another interesting aspect is the percolation temperature, which is the minimal temperature that the thermal bath reaches before the PT completes.In a large fraction of parameter space, this is below 1 MeV, which implies that nucleosynthesis begins, with the undiluted baryon asymmetry.A second phase of BBN then follows after reheating, with the correct  [5,10] (blue) and IPTA DR2 dataset [9,14] (orange), as well as the best-fit of a naive combination of the two datasets (green).Left: Best-fit (solid lines) of the GW spectrum.The violins depict the corresponding 30-bin free-spectrum fits used for the refitting.Right: 2D posteriors for the coupling g and mass scale M of the model.Solid black contours show the dilution factor due to reheating after the PT.Below the dotted line the percolation temperature is below 1 MeV, while to the left of the dashed line the reheating temperature does not reach 2 MeV.The full triangle plot including 1D posteriors is shown in Fig. 9.
baryon abundance.While reheating should dissolve most bound states, this is not immediately obvious for Helium-4 with a binding energy of 28.3 MeV.We plan to revisit potential constraints from this two stage of nucleosynthesis in the future.
Finally, let us discuss how the reheating process can work in practice.From the Friedmann equations, one finds the reheating temperature T rh = 0.55 g where g * ≈ 10.75, and Γ ϕ is the decay width of ϕ.Reheating above 2 MeV requires Γ ϕ ≳ 4 × 10 −20 MeV.The preferred range of m ϕ is (0.92 -6.9) MeV for NANOGrav and (11.5 -124) MeV for IPTA.We can consider different decay operators that can satisfy these constraints.The simplest scenario is the Higgs portal, via the operator which after symmetry breaking leads to mixing of ϕ with the Higgs boson with mixing angle θ ≈ λ hΦ v ϕ v h /m h , where v h = 246 GeV and m h = 125 GeV (see e.g.Ref. [158] for a recent study).This allows ϕ decays to electrons and photons, however both channels are suppressed by the small Higgs couplings to those states, requiring a Higgs mixing of order 10 −4 [159, 160], 7 and thus a large portal coupling λ hΦ .Unfortunately the operator in Eq. ( 4.3) also contributes a large mass for the scalar after electroweak symmetry breaking, and is thus in conflict with our initial assumption of classical scale invariance.
Alternatively we can consider a direct decay channel to electrons or photons, via couplings where Λ is some UV scale.These operators do not violate scale invariance at the tree level, and are otherwise not strongly constrained [161].The main laboratory probes of our PT scenario therefore are searches for a light scalar in the (1 -100) MeV range which decay to electron or photon pairs.In fact there is an intriguing hint for a new boson with a mass around 17 MeV [162,163], which could be either ϕ or the gauge boson of our model, and which could be searched for in high intensity electron beam experiments such as MESA [164].
As far as astrophysical probes of this scenario are concerned, for very small m ϕ , close to the lowest preferred masses, late decays will continue to inject energy into the plasma during BBN and before CMB formation [165,166], and could produce CMB spectral distortions that can be probed in the future [156], while acoustic waves à la Ref. [150] are not expected to contribute here.

Global (ALP) Strings
In Fig. 4 we show the preferred parameter space in which cosmic strings from the spontaneous breaking of a U(1) symmetry can explain the PTA signal.This region changes slope around m a ≈ 10 −14 eV.For lower masses, the network decays at temperatures below 1 MeV and the PTA signal is due to the almost scale invariant part of the spectrum, for larger masses the ∝ f 3 tail of the spectrum that gets created in the decay of the network gives the signal.The NANOGrav data seems to prefer the former scenario, while IPTA favors the latter.
In a minimal scenario ALPs in this parameter space are expected to be cosmologically stable with negligible SM interactions.As the ALP strings continuously radiate ALPs during the evolution of the network, they create an abundance of both relativistic and non-relativistic ALPs contributing to N eff and DM respectively.For the contribution towards N eff we obtain using Eq. ( 21) of Ref. [88], Using this relation, the present bound on ∆N eff corresponds to an ALP decay constant of f a ≲ 10 15 GeV.It is therefore at tension with all of the parameter space favored by the fit.At this point it is however worth mentioning that other predictions concerning the log scaling as discussed in section 2.2.1 might lead to a milder tension.Assuming that this is a viable explanation, the non-relativistic ALPs also contribute to DM. 8 In order to not overproduce DM, very small masses of m a ∼ 10 −22 eV are required.At least the NANOGrav data would still be consistent with such a scenario, our figure only ends at m a ∼ 10 −18 eV due to our choice of priors.Alternatively, also a period of late matter domination can improve the viability of the model, while still yielding observable GWs [92,94,167,168].For completeness, let us also mention one further constraint, even though it is always weaker than N eff .In order for the string network to form, the global symmetry should be broken after inflation and reheating.The current Planck bound on the Hubble scale [138] at the end of inflation, H inf ≤ 6 × 10 13 GeV, one finds that the highest temperature of the post-inflationary universe is T max ∼ 5 × 10 15 GeV.The constraint f a ≤ T max is therefore also indicated in Fig. 4.
Overall, it appears that the global CS scenario is not a viable explanation for the current PTA data.Nevertheless, future GW probes in the nHz range will be able to test viable regions of parameter space (see e.g.Fig. 11 of [150]), and it could easily be that most of the current signal is of astrophysical origin.A complementary probe of such a scenario would also be provided by future measurements of µ-distortions, as shown by the dotted line in Fig. 4. We therefore decided to keep it in our list of simple benchmark models.

Annihilating ALP/Axion DWs
To explain the available PTA data with DWs, the network needs to annihilate close to BBN.In order to produce a sufficient amount of GWs, it needs to comprise at least a fraction of the total energy, Ω DW ∼ 0.1.Annihilating DWs mainly decay into non-relativistic particles, which in our models are the heavy axions and ALPs.They necessarily have to decay further, in order to not overclose the universe.At this point we have to distinguish between the ALP and the aligned axion model.
In the ALP model, for simplicity, we assume that the ALPs subsequently decay to some form of dark radiation, which will contribute to ∆N eff .Assuming efficient annihilation, we can set ρ DW ≃ ρ d where ρ d is the energy density in dark radiation, and obtain ∆N eff ≃ 1.6 g s (T ) 10 where the temperature should be taken at the time of annihilation, T = T ann .Besides ∆N eff constraints, DW networks that are decoupled from the SM were shown in Ref. [150] to exhibit µ-distortions.We follow their approach to compute the induced µ-distortions and impose the constraints from FIRAS as well as the expected sensitivity of the PIXIE proposal.There are two more technical constraints that we impose.First we require that the DW network annihilates sufficiently before it would start to dominate the energy density of the universe, T dom ≲ 4 T ann .On the other hand we also require that T dom ≳ T ann /4, to ensure that plasma effects on the DWs are negligible [83].This should ensure that our results are in the region where the simulations of the GW spectra can be trusted.
Figure 5 shows the range of axion masses and annihilation temperatures T ann preferred by the fit of the ALP DW GW spectrum to NANOGrav (blue) and IPTA (orange) data, as well as the combined fit (green).Here .Fit results of the ALP DW model from section 2.2.2 to NANOGrav (blue), IPTA (orange) and their combination (green).Left: Best-fit GW spectrum alongside the free-spectrum fit (violins).Right: 68 % and 95 % CL fit region in terms of the axion mass ma and annihilation temperature Tann.In the region between the dashed lines, our description of the GW spectrum in terms of the scaling regime is valid.The region below the solid lines is excluded by N eff for fa = 10 5 GeV (black) and for fa = 10 7 GeV (grey).The dotted line shows the projected sensitivity of PIXIE.The full triangle plot including 1D posteriors is shown in Fig. 11.
for NANOGrav, while IPTA prefers values closer to the upper end of the range, as can be seen in the full triangle plot in Fig. 11.Only a small part of the parameter space is disfavored by N eff limits.We indicate them for two characteristic values of f a in the figure.As discussed above, our estimate of the GW signal is only reliable in a certain window, which here is the region between the dashed lines.Finally PIXIE would be able to probe the region below the dotted line.It is apparent that especially at small annihilation temperatures µ-distortions provide a strong independent probe of the model going much beyond the N eff limit.Within the range of decay temperatures favored by the fit their reach is however limited.
In the aligned axion model, we expect instead that the heavy axions rapidly decay to SM particles after DW annihilation.Therefore, we do not expect constraints from N eff or spectral distortions.Instead, one needs to make sure that the decay products of the heavy axions do not jeopardize BBN.To estimate this, we compare their decay rate into gluons and photons with the Hubble rate, i.e.Γ a→gg/γγ ≃ H(T ), where the decay rate at leading order of an axion into two gluons is given by with α s = 0.1 and C gg = 1.Our primary concern will be the decay into gluons for m a ≥ 1 GeV.This rate is fast enough to ensure decays before the onset of nucleosynthesis.In fact it guarantees that the relic axions will almost instantly decay after the DW network annihilates for all values of m a and f a considered here.We show the 68 % and 95 % CL contours as a function of the axion mass and decay constant in Fig. 6.The technical constraints discussed for the ALP case also apply here.Again we see that the best-fit region fully agrees with the range of validity of the DW simulations we are using, and there are no conflicts with cosmological bounds.It should be noted though that the best-fits shown on the left of Figs. 5 and  and their combination (green).Left: Best-fit GW spectrum alongside the free-spectrum fit (violins).Right: 68 % and 95 % CL fit region in terms of the axion mass ma and decay constant fa.In between the dashed lines our description of the GW spectrum in terms of the scaling regime is valid.The full triangle plot including 1D posteriors is shown in Fig. 12.The collider projections from LHC Run 2 in grey are taken from Ref. [169], whereas the projections from searches by FCC and CLIC are from Ref. [170].
the same combination of parameters.This leads to the peak of the spectrum sitting at higher frequencies than the range probed by PTAs, while for the ALP model the peak can be freely adjusted and the fit prefers parameters where it falls into this range.
Furthermore, it can be interesting to ask whether the heavy axions in this model can be probed in the laboratory, in particular at the LHC.It was shown in Ref. [169] that the production of axions in the decay of electroweak bosons provide a particularly sensitive probe for heavy axions in the (1 -100) GeV mass range.While the projected collider reach of the LHC (grey shaded region) is not sufficient to probe the best-fit region, it is still interesting to see that collider probes of such scenarios are in principle possible.In particular, a future linear electron-positron collider such as CLIC with a center-of-mass energy of 3 TeV can explore the best-fit region for axion masses above m a ≳ (10 -100) GeV, whereas a circular collider like FCC-ee would not be able to probe the required decay constants [170].
A potentially important constraint on DWs as a source for the PTA signal comes from the formation of PBH during the annihilation of the domain-wall network.Closed domain walls of typical radius r ∼ H −1 ∼ t will form a BH when their linear size becomes smaller than their Schwarzschild radius r S ∝ M ∝ r 3 [171][172][173].According to a recent study [173], in the parameter space where the PTA signal is reproduced an overabundance of PBH is produced.Still, we believe that the current understanding of this process does not allow to claim the exclusion.First, the scaling regime r ∼ t is expected to be violated when the Schwarzschild radius becomes comparable to r, but dedicated simulations are lacking.Secondly, the assumption r = t implies a monochromatic spectrum, which is not expected to be realized in a consistent cosmological history.Last, the evolution from the start of the annihilation process T ann until the time of BH formation T * is still very uncertain, and its impact on f PBH is parametrized by the ratio (T * /T ann ) α , with α ≃ 7−28.Given these large uncertainties, we refrain from showing these constraints in our figures, but we stress their potential relevance and encourage a dedicated analysis, which goes beyond our current scope.

Bosonic instabilities and late preheating
Explaining the PTA signal requires the bosonic sector to comprise a non-negligible amount of the total energy.In our model of an axion coupled to a dark photon we will have two components, the axion behaving as DM and the photon contributing to N eff , in the case where there are only gravitational interactions with the visible sector.The contribution to N eff can be estimated as [117] ∆N As one can see from Fig. 7, this puts the parameter space preferred by the fit in mild tension with the current bound of ∆N eff ≤ 0.29.Furthermore, as pointed out in Refs.[116,117,174], the relic abundance of the axion is typically larger than the observed amount of dark matter.This problem has also been observed in models relying on a parametric resonance instead of tachyonic growth [105,108,109].A possible solution to this problem might be model extensions that allow for a time dependent axion mass as discussed in Refs.[117,174].
As with the other models, the second option is to deplete the energy to the SM plasma.Here this is however very challenging, and impossible to achieve via perturbative processes.The reason is that the axion mass sets the energy scale of all processes.The decay rate of the axion is for instance proportional to Γ ∝ m 3 /f 2 , and therefore efficient decays can only occur long after the onset of the axion oscillations, in our case preventing the decay before BBN.The large field strengths present in this model might possibly allow for the non-perturbative production of particles through the Schwinger effect as considered in Ref. [175].This process, while efficient in reducing the energy in the bosonic sector, however also lowers the amount of produced GWs.

Scalar-induced GWs
Our results for scalar-induced GWs are presented in Fig. 8.We see that the PTA signal can be reproduced with a rather large amplitude, A ζ ∼ 10 −2 -10 −1 , and peak momentum around (10 6 -10 7 ) Mpc −1 , in agree-  ment with previous results [36]. Figure 14 shows the contours and the posteriors for the other parameters of the model.
Different from the other scenarios, here the underlying source of the GWs is active during inflation.Therefore, after inflation ends and the universe reheats, only the curvature perturbations remain as traces, frozen outside the horizon.Once they reenter the horizon, besides sourcing GWs, very large curvature perturbations can lead to (over)production of PBHs.Since this has to happen at scales not too far away from the CMB, spectral distortions of the latter are also typically induced.
The production of PBHs is a delicate issue.The fraction f PBH of DM in the form of PBHs depends strongly on the choice of the window function for the variance of the density contrast, for which there is no unique prescription, and exponentially on the exact value of the critical density for collapse.This delicate sensitivity prevents a reliable calculation of f PBH from the model parameters.Nevertheless, as pointed out in Ref. [36], the reverse approach is possible: imposing f PBH < 1, reliable bounds on the parameters of the model can be derived.Assuming Gaussian distributed perturbations, the upper bound on A ζ derived from f PBH < 1 is A ζ < 0.01 -0.04 [36], falling right inside the best-fit region shown in Fig. 8.This claims for a very careful assessment of the PBH constraint.An important role is played by non-Gaussianities (NGs), which come from two sources: primordial NGs in the inflationary spectrum and NGs coming from nonlinearities in PBH formation.A proper inclusion of these two effects can suppress PBH formation and lift the constraint, only mildly affecting the GW spectrum [134].This effect can only be computed in a model-by-model basis.We therefore refrain from presenting PBH constraints here, but we highlight their potential importance.
The absence of spectral distortions in the CMB sets strong constraints on the amplitude of the power spectrum at low momenta.In particular, COBE/FIRAS set an upper limit on the µ parameter [176,177], where W µ is the window function [178] W In Fig. 8 we show as solid lines the bounds obtained by imposing µ < 4.7×10 −5 (from a recent reanalysis of COBE/FIRAS data [151]) and µ < 3 × 10 −8 as a benchmark for future observers such as PIXIE [152].The bounds are obtained with fixed ∆ = 1 or ∆ = 2, while the tilt n has no impact because the integral in Eq. (4.9) is dominated by momenta below the peak.Precisely for this reason, constraints for smaller ∆ vanish, as the spectrum becomes sharper.
Depending on the value of ∆, current µ-distortion constraints exclude interesting portions of the parameter space in this model, and future CMB observations have an important discovery potential, especially for spectra with broad tails in the IR, which correspond to ∆ ≳ 1 in our model.

Conclusions
In this paper, we have proposed and studied a series of benchmark models that can produce GWs in the frequency bands probed by PTA experiments.By focusing on models with a minimal set of free parameters, we were able to identify the best-fit regions for each model directly in terms of the model parameters such as the masses of the new particles and their couplings.In particular we have shown that it is possible to directly reconstruct the parameters of a model with a first order phase transition from GW data, without taking the detour of reconstructing the signal parameterization first.This has the clear advantage that one immediately knows that a given point in the fit can actually be realised by a particle physics model.
Fitting a large variety of models to two different PTA datasets was in particular enabled by using the new tool ceffyl, which dramatically speeds up obtaining the viable parameter regions.We fit our models to two datasets, the 12.5 year NANOGrav search and the data from IPTA DR2 (which includes the NANOGrav nine-year data).For some models, the best-fit regions disagree at 95% CL.This was more pronounced for models that feature a sharper peak, since IPTA seems to prefer a flatter spectrum.The discrepancy therefore also seems to grow if more frequency bins are included in the fit, see appendix B for more details.It will be interesting to see if this discrepancy disappears with more data, or if there is a difference in the way the noise is subtracted.
In addition to the PTA data, we include a variety of cosmological constraints on the models.The main results are shown in Figs. 3 to 8, where the constraints are superimposed on the regions preferred by PTA data.Let us briefly summarise the results for the individual models.We find that the signal can be well explained in scenarios with a supercooled phase transition that subsequently reheats the visible sector.In our concrete realisation a large initial baryon asymmetry is required due to the large dilution, but we expect that this dilution can be more moderate in variations of this scenario.GWs from topological defects also provide a good fit of the PTA data, however constraints from N eff rule out the global cosmic string scenario, while domain walls are viable both if they annihilate into a dark sector or into the visible sector.In the former case, CMB spectral distortions are predicted in range of future probes, while the latter case future colliders will be sensitive to parts of the parameter space preferred by the PTA signal.Tachyonic instabilities in an axion-dark photon model can also produce a GW background that fits the observed signal, however, the preferred parameter range is in tension with constraints from N eff .Further studies are required to find out if variations of this scenario can improve this.Finally we also find that the GW spectrum produced by large scalar perturbations during inflation can provide a good fit to the data, with the caveat that the required amplitude for the perturbations cannot be obtained in the simplest single field inflationary models.
On more general grounds, we observe the following.First, since the signal is very large, the source must at some point carry a significant fraction of the total energy density in the universe.Hiding this energy in a dark sector fails in most models due to N eff constraints, with the exception of the dark ALP DW model, which seem to be a very efficient source of GWs, and a tiny slice of parameter space in the audible axion model.This bound might tighten in the future and will remain an important constraint (or hint) for these models.The obvious alternative is to transfer the energy back to the visible sector after the GWs are produced.If this happens through particle decays, as in the phase transition and domain wall scenarios, the models may be probed by laboratory experiments in the future.A third option is realised by the scalar-induced GW scenario.Here the GW source is produced and frozen in during inflation, while the energy of the original source field, the inflaton, goes into reheating of the visible sector.The large scalar fluctuations nevertheless have other observable consequences such as spectral distortions and production of primordial black holes, and the GWs would be a smoking gun for a non-minimal inflationary sector.
One aspect that should not be neglected is naturalness, or how likely these models can be realised in nature.Often, a large separation of scales is implicitly required in these models.An example is the bias term in DW models, which is many orders of magnitude smaller than the other mass scales that appear there, and which has to be tuned such that the DWs annihilate shortly before they start to dominate the energy density of the universe.Similarly the operators that allow the decay of the lightest scalar in the phase transition model may spoil the classical scale invariance that is required in order to obtain the largest signals.There are also typically more parameters required in each model than those which determine the GW signal, but which could impact other observables.
While here we have taken the next step in establishing simple and testable models, in most cases additional model building and more detailed phenomenological studies will be required.Furthermore, the prediction of the GW signal is still affected by large uncertainties, which are at least of order one in most models, and which often rely on extrapolation from complicated numerical simulations that are only available for a limited set of parameters.The large fluctuations in energy density associated with GW production can also have other observable consequences like the formation of dark matter substructures and mini-clusters.One may also wonder if the tension in the Hubble rate determination can be improved in one of these scenarios.
An important question that we have not addressed here is what happens if an astrophysical background of GWs is included in the fit.It will probably open up the parameter space in most models, since not all of the GW signal has to be explained.Furthermore it will be interesting to see if a primordial component of the stochastic GW background can be distinguished from the contribution of SMBHBs.Certainly precise predictions of the spectral shape of the GW signals are essential for this task.

Note added
After the appearance of our work, PTA collaborations have released new data in which, for the first time, they claim evidence for the Hellings-Downs correlation curve, a clear indication of a GWs signal in pulsar timing arrays [179][180][181][182][183][184] the overall conclusions of our work should be unaffected.Certainly, a study including the new data from all PTA experiments is beyond the scope of this work.

A Priors, full fit results and best-fit parameters
For completeness, here we show the full results of our fits of PTA data to the various model parameters, including the 1D posterior probabilities.The best-fit results of the model parameters for the NANOGrav 12.5-year dataset are shown in Tab. 1.The priors used for the fits in ceffyl are listed in Tab. 2. Figure 9 shows the results for a FOPT in the CW model, the results for CS from ALPs are shown in Fig. 10.The posteriors for DW networks from ALPs and heavy axions are depicted in Figs.11 and 12, respectively, whereas Fig. 13 displays the corresponding results for the audible axion model.Last but not least, the corner plot for a scalar-induced SGWB can be found in Fig. 14.

B Choosing the IPTA frequency range
In the main body of this work, we present results from IPTA data based on twelve frequency bins.We here briefly investigate the effects of changing the number of bins in the analysis.
We use the second IPTA data release [9], consisting of 65 pulsars with an observation time of up to 30 years.The dataset comprises data from the first data release of EPTA [3]    PPTA [185,186], and the NANOGrav nine-year dataset [187].Similar to the NANOGrav collaboration [10], IPTA has performed a free spectrum fit with 30 frequency bins [14].We repeat this fit using the enterprise code [37] with enterprise extensions [38] and process the result for further fitting in ceffyl [39].
We here follow the NANOGrav search for a SGWB from a FOPT [11] and focus on the first five frequency bins of the 12.5 year dataset, as these are most sensitive to a cosmological background and give the dominant contribution the SNR of the observed common red process signal [10].This hence includes   In Fig. 15 we hence show results for fits of the SGWB from a FOPT in the CW model taking into account five, twelve, thirteen, fourteen, as well as all thirty frequency bins of IPTA.The left panel depicts the resulting mean spectrum (solid lines) in fractional energy density Ω GW = ρ GW /ρ crit (bottom) and timing residual amplitude (top).The violins represent the result of the thirty-bin free-spectrum fit, and the vertical dotted lines indicate the position of the fifth and sixth NANOGrav frequency bin.The corner plot in the right panel depicts the posterior probabilities in terms of the mass scale M and gauge coupling g.The colored regions in the lower left corner correspond to the 95 % CL region.
An analysis including only the first five bins (i.e. the same number of bins as NANOGrav) yields on average a GW spectrum peaking at a few nHz, corresponding to transitions at temperatures and mass scales of a few MeV.Including more frequencies moves the peak to higher frequencies and scales around (10 -100) MeV, leading to a slight tension with our NANOGrav 12.5 year fit (cf.also Fig. 3).As the thirteenth and fourteenth bin, however, favor lower amplitudes, adding these bins shifts the spectrum back to somewhat lower frequencies, alleviating the tension a bit.Including further bins then again moves the spectrum back to higher scales.

C Details on the Coleman-Weinberg model
Neglecting the tree-level contribution to the potential, the leading-order effective potential consists of the Coleman-Weinberg zero-temperature and the thermal one-loop potential.The former is given by [75] where the sum runs over species with field-dependend masses m 2 i (ϕ), N i counts the degrees of freedom for each species, c i = 5/2 (3/2) for gauge bosons (scalars and fermions) respectively, and µ R is the renormalization scale.In our case, the only species contributing are the gauge bosons with N A = 3 and m 2 A = g 2 ϕ 2 .Using this, we obtain a zero-temperature vacuum expectation value ϕ 0 = (e 1/6 µ R )/g.For the thermal corrections, we use the high-temperature expansion, which for bosonic DOFs becomes where γ E is the Euler-Mascheroni constant.Summing Eqs.(C.1) and (C.2), the logarithms of the field cancel, and one obtains the polynomial potential in Eq. (2.4).Note that we here neglect the contribution from Daisy resummations.These become important for g < 0.53 [69], and should hence be negligible for most of the parameter space we consider.
It is possible to further redefine the potential Eq. (2.4) in terms of a single parameter κ ≡ λ(T )m 2 (T ) , leading to a simplified form for which the tunneling action can be calculated as a function of a κ only.See Ref. [69] for further details.
For the CW model in the high-temperature expansion, κ = log(T /M )/6.Now we want to calculate the microphysics inputs to describe PTs, i.e. the PT strength α, the duration of the PT β and the temperature at which the PT occurs T * .For this we use the approximations in Ref. [69] and calculate the three-dimensional Euclidean tunneling action S 3 (T ) of the bounce solution [48][49][50][51] as a function of κ from Eq. (C.3).The corresponding expression for S 3 /T in terms of κ can be found in Ref. [69].
We here employ the tunneling potential method [188].The transition strength and inverse timescale can be calculated from the potential and tunneling actions as where we need to calculate the parameters of the PT at the temperature T * .A good approximation in the case of a supercooled sector is using the percolation temperature T p .In the expressions above ρ R ≡ π 2 (g SM * + g BSM * )T 4 /30 is the energy density of the universe in radiation at the time of the transition, and ∆V (T ) is the positive potential difference between the false and true vacuum.While Eq. (2.4) gives a good approximation around the barrier and hence also for the tunneling action, we obtain κ < 0 in a large fraction of the parameter space.It can thus not be used to determine the finite-temperature minimum.As we are considering supercooled transitions, we however expect that the corrections around the true vacuum at the percolation time are small.Therefore, we approximate the transition strength using the difference of the potential at zero temperature, α = ∆V 0 /ρ R .For ∆V 0 we obtain ∆V 0 = 43.5 M 4 [69] using the CW potential.

D FOPT GW spectrum and PT parameter fit
We here present results for supercooled phase transitions in terms of the thermodynamics parameters describing a FOPT, i.e. the percolation temperature T p , the transitions strength α, and its inverse timescale β.
As before, the wall velocity is fixed to v w = 1.The present-day GW spectrum of a supercooled PT is the sum of the contributions from vacuum bubble and fluid shell collisions, Ω GW h 2 = Ω vac GW h 2 + Ω fl GW h 2 .The latter two take the form [60] Ω GW (f ) h 2 = 1.67 × 10 −5 g * (T rh ) 100 100 g s (T rh ) where g * and g s are the effective number of relativistic and entropic degrees of freedom, and the reheating temperature T rh = T p (1 + α) 1/4 is used for the redshift factors [53].The coefficients A and r p as well as the spectral slope parameters a, b and c depend on the contribution under consideration and, in case of the vacuum bubble collisions, on whether the broken symmetry is gauged or global.We refer the reader to table I of Ref. [60] for the corresponding numerical values.The respective efficiency factors are κ vac = 1/(1 + R eff /R eq ) for the vacuum bubble collisions and κ fl = (1 − κ vac ) κ sw (α) for the fluid shells, with the sound wave efficiency κ sw (α) taken from Ref. [189].We use R eff = 5/β for the bubble radius upon collision [60].The radius R eq at which the leading and next-to-leading order pressures [190,191] exerted on the bubble wall are equal depends on the underlying particle physics model.Figure 16 depicts a fit of the GW spectrum contributions in Eq. (D.1) in terms of the PT parameters with ceffyl, using the NANOGrav 12.5 year dataset (blue) and IPTA DR2 (orange). 10We also show the naive combination of the two datasets (green).The fit considers only one contribution at a time, setting the bubble collision efficiency factors to κ vac = 1.We here employ uniform priors on log 10 (T p /GeV) ∈ (−4, 2),

Figure 1 .
Figure 1.Percolation (Tp) and reheating temperature (T rh ) as well as transition strength α and timescale parameter β/H * as a function of the mass scale M and the gauge coupling g in the CW model.In the white region, the PT does not complete.

Figure 3 .
Figure3.Fit of the supercool dark sector model to the PTA data as discussed in section 3. Shown are the bestfit regions for the NANOGrav 12.5 year dataset[5,10] (blue) and IPTA DR2 dataset[9,14] (orange), as well as the best-fit of a naive combination of the two datasets (green).Left: Best-fit (solid lines) of the GW spectrum.The violins depict the corresponding 30-bin free-spectrum fits used for the refitting.Right: 2D posteriors for the coupling g and mass scale M of the model.Solid black contours show the dilution factor due to reheating after the PT.Below the dotted line the percolation temperature is below 1 MeV, while to the left of the dashed line the reheating temperature does not reach 2 MeV.The full triangle plot including 1D posteriors is shown in Fig.9.

Figure 4 .
Figure 4. Best-fit regions from fitting the GW signal from an ALP string network (cf.section 2.2.1) to NANOGrav (blue) and IPTA (orange) data, as well the naive combination of both (green).Left: Fitted GW spectrum and input data for the refitting (violins).Right: Contours of the ALP mass ma and decay constant fa.The regions above the solid lines are excluded by constraints from reheating and N eff , as discussed in the text.The projected sensitivity of PIXIE is indicated by the dotted line.The full triangle plot including 1D posteriors is shown in Fig.10.

f a takes values between 5 × 10 4f a = 10 5
Figure 5. Fit results of the ALP DW model from section 2.2.2 to NANOGrav (blue), IPTA (orange) and their combination (green).Left: Best-fit GW spectrum alongside the free-spectrum fit (violins).Right: 68 % and 95 % CL fit region in terms of the axion mass ma and annihilation temperature Tann.In the region between the dashed lines, our description of the GW spectrum in terms of the scaling regime is valid.The region below the solid lines is excluded by N eff for fa = 10 5 GeV (black) and for fa = 10 7 GeV (grey).The dotted line shows the projected sensitivity of PIXIE.The full triangle plot including 1D posteriors is shown in Fig.11.

Figure 6 .
Figure 6.Fit results of the aligned QCD axion DW model from section 2.2.2 to NANOGrav (blue), IPTA (orange) and their combination (green).Left: Best-fit GW spectrum alongside the free-spectrum fit (violins).Right: 68 % and 95 % CL fit region in terms of the axion mass ma and decay constant fa.In between the dashed lines our description of the GW spectrum in terms of the scaling regime is valid.The full triangle plot including 1D posteriors is shown in Fig.12.The collider projections from LHC Run 2 in grey are taken from Ref.[169], whereas the projections from searches by FCC and CLIC are from Ref.[170].

Figure 7 .
Figure 7. Fit results of the audible axion model from section 2.3 to NANOGrav (blue), IPTA (orange) and their combination (green).Left: Best-fit GW spectrum alongside the free-spectrum fit (violins).Right: 68 % and 95 % CL fit region in terms of the axion mass ma and decay constant fa.Decay constants above the dashed line are excluded by effective number of neutrino species N eff .The full triangle plot including 1D posteriors is shown in Fig.13.

Figure 8 .
Figure 8. NANOGrav (blue), IPTA (orange) and combined (green) fit of the scalar-induced SGWB (cf.section 2.4).Left: Best-fit GW spectrum.Right: 1σ and 2σ regions of the peak amplitude A ζ and frequency f⋆ of the curvature power spectrum from inflationary scalar perturbations.Solid lines show the upper bound on A ζ from existing limits on µ-distortions for the indicated values of ∆, while dashed lines show the potential future sensitivity of PIXIE.The full triangle plot including 1D posteriors is shown in Fig. 14.

Figure 9 .
Figure 9. 1D and 2D posteriors for the fit to the CW model for a FOPT of sections 2.1 and 4.1.

Figure 10 .
Figure 10.1D and 2D posteriors for the fit to the ALP CS model of sections 2.2.1 and 4.2.1.

Figure 13 .
Figure 13.1D and 2D posteriors for the fit to the audible axion model of sections 2.3 and 4.3.

Figure 14 .Figure 15 .
Figure 14.1D and 2D posteriors for the fit to the scalarinduced GWs model of sections 2.4 and 4.4.

Figure 16 .
Figure 16.1D and 2D posteriors for the fit to the SGWB contributions from a supercooled PT of appendix D in terms of the PT parameters for collisions of fluid shells (left) as well as vacuum bubbles from breaking a global (center) or gauged (right) symmetry.

Table 1 .
. While we expect that the best fit regions would slightly shift with the new data, Best-fit values of the model parameters obtained in ceffyl for the NANOGrav 12.5-year dataset.The corresponding models are presented in sections 2 and 4.
, the extended first data release of

Table 2 .
Parameter priors used in ceffyl for the models presented in sections 2 and 4.
Figure 11.1D and 2D posteriors for the fit to the ALP DW model of sections 2.2.2 and 4.2.2.Figure 12. 1D and 2D posteriors for the fit to the heavy axion model generating domain walls of sections 2.2.2 and 4.2.2.