Probing Small-Scale Power Spectra with Pulsar Timing Arrays

Models of Dark Matter (DM) can leave unique imprints on the Universe's small scale structure by boosting density perturbations on small scales. We study the capability of Pulsar Timing Arrays to search for, and constrain, subhalos from such models. The models of DM we consider are ordinary adiabatic perturbations in $\Lambda$CDM, QCD axion miniclusters, models with early matter domination, and vector DM produced during inflation. We show that $\Lambda$CDM, largely due to tidal stripping effects in the Milky Way, is out of reach for PTAs (as well as every other probe proposed to detect DM small scale structure). Axion miniclusters may be within reach, although this depends crucially on whether the axion relic density is dominated by the misalignment or string contribution. Models where there is matter domination with a reheat temperature below 1 GeV may be observed with future PTAs. Lastly, vector DM produced during inflation can be detected if it is lighter than $10^{-16} \,{\rm GeV}$. We also make publicly available a Python Monte Carlo tool for generating the PTA time delay signal from any model of DM substructure.


I. INTRODUCTION
Dark Matter (DM) is one of the pillars of the standard cosmological model. Perturbations in the DM density field generated by inflation provide the seeds for the hierarchical structure formation we observe in the Universe. On large scales the matter power spectrum of these primordial perturbations can be inferred from the anisotropies in the cosmic microwave background (CMB). These observations indicate a nearly scale-invariant spectrum of primordial fluctuations, which is compatible with the large scale structures we observe on galactic and extra-galactic scales.
However on smaller length scales, k pc −1 , different theories of DM leave unique fingerprints on the primordial perturbations and/or their evolution. Here we focus on four theories which produce different small scale structures, avoid current experimental bounds, and are theoretically well motivated: • ΛCDM: A nearly scale invariant spectrum of adiabatic perturbations is produced at the end of infaltion [1,2].
• Post-inflationary QCD axion: The U (1) PQ symmetry is broken after inflation. The decay of axion field defects at the QCD phase transition then induces large amplitude isocurvature fluctuations on scales smaller than the horizon at the QCD epoch [3,4].
• Early Matter Domination: Adiabatic perturbations which are within the horizon during an early stage of matter domination grow linearly instead of logarithmically, enhancing the amount structure on these scales [5,6].
• Vector DM Produced During Inflation: If the DM is a massive spin-1 particle, the longitudinal modes produced at the end of inflation can peak the power spectrum at small scales, with the location of the peak determined by the DM mass [7].
These model specific features in the primordial seeds, and their evolution, translate to different predictions for the amount, and properties, of sub-galactic DM halos (subhalos). Therefore measuring this population of subhalos can be a powerful tool in pinning down the model of DM.
Unfortunately DM subhalos are elusive objects, mostly because they are expected to contain very little baryonic matter and are therefore nearly invisible (assuming a weak indirect detection signal).
The signals DM subhalos can induce in a PTA measurement of the pulsar phase have been studied in depth [16,17]. Each subhalo induces a shift to the residual phase measurement which can be non-degenerate with the timing model. Previous works [16,17] have used analytic approximations of the signal, denoted as static, dynamic, and stochastic limits. These correspond to three different limits where the timescale, τ , for a typical subhalo to transit the line-of-sight is much greater than the observing time T (τ T , static), is much less than the observing time (τ T ), but the signal is dominated by a single transiting subhalo (dynamic) or many transiting subhalos (stochastic). Here we will limit these simplifications by generating the full time series produced by a population of DM halos by using a Monte Carlo (MC) simulation. One goal of this paper is to develop this numeric tool, which we make publicly available on GitHub .
The second goal of this work is to use this MC to generate the PTA signal that would be produced by the subhalo populations predicted by the four DM models listed above. To obtain an estimate of the local population of DM subhalos, one needs to know how primordial perturbations evolve from the early Universe until today. This is a complex problem, especially at the sub-galactic scales relevant for PTA searches where non-linearities and tidal effects play a crucial role. The Press-Schechter formalism [18] is known to give a reasonably good analytic description of the non-linear physics related to the clustering of DM overdensities (at least for the case of ΛCDM, where a direct comparison with numerical simulations is possible). We then use this model, together with semi-analytic description of tidal effects, to relate the primordial power of primordial perturbations to the local population of DM subhalos. We should caution that a number of the methods we are using have not be tested against N -body simulations for such low mass and high density subhalos. We leave for future work validating and calibrating the analytic results for small scale DM subhalos.
The outline of the paper is as follows. In Section II we review the derivation of the PTA signal induced by a population of DM subhalos, along with the signal-to-noise ratio, and discuss the MC algorithm used to compute it. In Section III we illustrate the semi-analytic procedure used to relate the primordial power of density perturbation to the local subhalo population. In Section IV we apply these results to the benchmark scenarios discussed above, and derive constraints for planned [19], and future PTAs. Finally, in Section V we conclude.

II. DARK MATTER SUBHALO SIGNATURES IN PULSAR TIMING ARRAYS
We begin with a discussion of the signal induced in PTAs by DM subhalos, how we generate this signal with the Monte Carlo, and the signal-to-noise ratio (SNR) of such a signal. Many of the formulae presented here were previously derived in Refs. [16,17] and we review them here for completeness. Previously both the Doppler effect (acceleration induced by a DM subhalo of either the Earth or a pulsar (called the Earth or pulsar terms, respectively)), and Shapiro effect (a change in the light arrival time due to the DM subhalo gravitational potential) were considered. Here we will only consider signals from the Doppler effect, which is dominant over the Shapiro effect for subhalos with mass below M 10 −3 M for any concentration parameter (see Ref. [17]).

A. Phase Shifts from Dark Matter Subhalos
Pulsars are rotating, highly magnetized neutron stars that emit beams of electromagnetic radiation from their magnetic poles. Given a misalignment between the rotation and magnetic axes, the pulsar rotation can cause the radiation beam to sweep across Earth. If this happens, a pulsar will appear to an Earth observer as a periodic emitter. When a DM subhalo approaches either the Earth, or a pulsar, in the array it will cause an acceleration and therefore change the observed pulsar frequency, ν. This frequency shift, δν, for a subhalo passing with position r = r 0 + vt, where r 0 is the initial position and v is the velocity, is given by where Φ is the subhalo gravitational potential. The coordinate system for the Earth (pulsar) term is chosen with the Earth (pulsar) at the origin, withd pointing from the Earth to the pulsar (pulsar to Earth). We also parameterize the position vector as impact parameter, and t = −r 0 · v/v 2 is the time to reach the point of closest approach. The phase shift, δφ, is then simply For spherically symmetric halos, the gradient of gravitational potential appearing in Eq. (1) can be written in terms of a form factor F(s, c) as where M is the virial mass defined as the subhalo mass contained inside the virial radius r v , the radius within which the mean halo density is 200 times the critical density of the Universe, ρ c . In the following we will assume that DM subhalos have an NFW density profile where r s is the scale radius, and ρ s ≡ ρ(r s ). For NFW subhalos the form factor is where we have introduced the concentration parameter, c ≡ r v /r s , related to the subhalo scale density by ρ s = 50 c 3 ρ c 3 (ln(1 + c) − c/(1 + c)) .
We see that for very compact halos, c → ∞, the form factor reduces to one and subhalos behave as point-like object. We will refer to this as the 'PBH' limit since when F → 1 the gravitational potential reduces to that of a primordial black hole (PBH).
In the PBH limit Eq. (2) can be simplified to [16] 1 where x = (t − t)/τ is a normalized time variable, and τ = b/v.
For an F given by Eq. (5), δφ can be challenging to compute analytically due to the discontinous derivatve in F at s = 1 which can be important if the subhalo passes from s > 1 to s < 1. In principle, one could perform the integrations numerically, though this becomes intensive with a large number of subhalos. We instead use a conservative estimate of the signal size by substituting F(r/r v , c) with where r min is the distance of closest approach over the observation time. This approximation changes the amplitude by O(10%) for c ∼ 100 but quickly shrinks to O(1%) for c ∼ 10 4 .
Using this conservative estimate the phase shift for an NFW subhalo is simply This effect is now easily generalized to a PTA with N P pulsars, each of which is influenced by N subhalos. In this case the phase shift of the I th pulsar is given by where the sum runs over all of the DM subhalos affecting the I th pulsar.
As discussed in detail in Ref. [17], the phase shift itself is not directly observable. This is because the shift induced by transiting DM subhalos can be partially degenerate with the phase shift induced by the 'natural' evolution of the pulsar frequency, described by the timing model where φ 0 , ν,ν are the phase offset, pulsar frequency, and its first time derivative. In general, if a DM signal is present, each pulsar measures a total residual phase shift, s I , given by the sum of signal, h I , and noise, n I : The DM signal, h I , is defined as the part of the phase shift, ∆φ I , that is not absorbed by the pulsar timing model, where with f n = √ 2n + 1P n (2t/T − 1), P n the n th Legendre polynomials, and φ(t) given by Eq. (10).
Similarly, the post-fit residual noise, n I (t), is given by the intrinsic pulsar phase evolution that is not reabsorbed by the timing model.
Sample unsubtracted and subtracted signal shapes (i.e. before and after fitting for φ 0 , ν, and ν) are shown in Fig. 1. The left and middle columns show the signal shapes from a single subhalo.
Depending on the timescale of the interaction, τ , we classify the signals as either "static" (τ T , left column) or "dynamic" (τ T , right column). Both limits allow for simplifications of Eq. (8). In the static limit δφ can be Taylor expanded in t/τ . The total δφ, in the top row, then looks nearly linear, while the measurable subtracted signal, the bottom row, is ∝ f 3 (t), defined below Eq. (13), due to the timing model fit of φ 0 , ν andν. A dynamic signal, on the other hand, has a more characteristic signal shape. Expanding the term ∝ b · d in Eq. (8) for small τ leads to δφ ∝ |t − t 0 |, which explains the kink in the upper middle panel in Fig. 1. This parameterization can be simplified further, since the terms ∝ t will be subtracted, to δφ(t) ∼ (t − t)Θ(t − t), and therefore the signal can be parameterized in terms of two variables: an amplitude and t. Lastly, the stochastic signal in the right column is the sum of a large number of individual subhalos with τ T and we see that the signal has no simple parameterization.

B. Constructing the SNR
With the signal defined, we now discuss the signal significance relative to noise, or SNR. As done in Refs. [16,17] we will use a matched filter procedure [20][21][22]. As discussed previously, each pulsar in the PTA measures the residual phase shift, s I , given in Eq. (11) which is the sum of signal, h I , and noise, n I . Naively one would expect that the signal is only significant if |h I | > |n I |. However if we know the form of the signal, we can filter the noise and boost the significance. The optimal scenario for this filtering procedure is the one where the shape of the signal is known a priori, or can be parameterized by a computationally searchable space of parameters. In this case the best test statistic is T ≡ I dt s I (t)Q opt I (t), where the signal of each pulsar is convolved with its own optimal filter, Q opt I = h I / N I , where ∼ denotes the Fourier transform and N I is the noise power spectrum of the I th pulsar. We will assume the noise is white, so that we have where t rms is the root mean square residual timing noise, and ∆t is the time between measurements (the cadence). However to know, or find through a Markov Chain Monte Carlo, the optimal filter for all the pulsars in the array is not always possible. We discuss alternatives for both the pulsar and Earth terms.
We start by discussing the pulsar term. For a generic pulsar in the array, many transiting events contribute to the signal s I (t). This makes unfeasible to look for the optimal filter of a generic pulsar, given the large number of parameters needed to describe the signal shape. However, we expect the largest signal in the array to be dominated by a single halo transiting very close to the pulsar.
Therefore, the largest signal is expected to have the simple parametrization discussed below Eq. (13), which makes the search for its optimal filter feasible. We therefore define the Pulsar SNR, SNR P , to be the largest across the array: Concretely, to place a constraint on a model we would generate an array of filters with different values of the parameters in the signal model (e.g. an amplitude for the static signal, and an amplitude plus epoch, t, for the dynamic signal), use these as the filter Q in Eq. (15), and then place constraints with the largest SNR P . 2 In practice, we assume we can select a filter which is arbitrarily close to Q opt , and compute the expected maximum SNR from it, The statistical significance of a given SNR, for the purposes of limit setting, is discussed in Appendix C.
The h I 's are generated with the MC which will be discussed in more detail below, as well as in Appendix A. The resulting constraints for a monochromatic population of subhalos are labeled "Closest only" in the right panel of Fig. 2, where only the signal from the closest subhalo is kept. To verify this approximation we also show the constraints obtained by taking h I to be the sum of the signals from all the subhalos around each pulsar, they are labelled "All." in the right panel of Fig. 2.
The second signal, labelled "Earth" in the left panel of Fig. 2, originates from a large flux of subhalos transiting near the Earth. Since the signal is generated by an acceleration of the Earth it will leave an imprint across the entire array with correlations between pulsars, similar to the correlations in the stochastic gravitational wave background [24,25]. This signal is generated stochastically by many passing subhalos; as such, the optimal filter is not easily reconstructed. One can, however, parameterize where denotes an ensemble average. As in Ref. [17], we then define the SNR in terms of the signal correlator, s I (t)s J (t ), where is known a priori then the optimal filter to use is Q opt Since this is not the case, we use a filter based on the expected signal, Since the expected filter will not be optimal, this introduces a mis-filtering effect. In the right panel of Fig. 2 we illustrate the effect that misfiltering has on the constraints, lowering them by an O(1) number. We compute this constraint by 10 -11 10 -10 10 -9 10 -8 10 -7 10 -6 10 -5 10 -4 10 -1 using the MC to generate S IJ (f, f ) in numerous realizations, taking the ensemble average R IJ (f, f ) over all the realizations, re-labelling it to Q IJ (f, f ) and computing the SNR of the signal from new realizations using Eq. (17). We will take the latter, more conservative, limits below.
As discussed in detail in Appendix A, to generate signals for monochromatic halo mass functions, as shown in Fig. 2, the MC first generates the pulsar positions uniformly on a sphere. Then, for the Pulsar (Earth) term, around each pulsar (the Earth), subhalo initial positions r 0 i and velocities v i are generated and used to compute the signal in Eq. (9). The initial positions are assumed to be uniformly distributed throughout space with density fixed to be 0.46 GeV/cm 3 , and the velocities drawn from a Maxwell-Boltzmann distribution with RMS velocity v 0 = 325 km/s. The relevant SNR is then computed in 1000 realizations and constraints are placed setting the 10 th percentile SNR equal to 4 which, as detailed in Appendix C, correspond to a signal significance of σ significance ∼ 2.

III. FROM PRIMORDIAL PERTURBATIONS TO THE LOCAL SUBHALO POPULATION
As we have seen in the previous section, PTAs are a powerful tool to constrain the abundance of Milky Way (MW) DM subhalos. These sub-galactic structures are seeded by the primordial perturbations on scales much smaller than the ones tested by CMB observables, where DM models can leave unique fingerprints. In principle, once the power spectrum of primordial perturbations is known the abundance of these DM subhalos can be derived. In practice, however, this is an intricate problem where non-linearities, tidal effects, and baryonic feedback play an important role. For the ΛCDM model we can rely on numerical simulations which have been well-tested with semi-analytic fits. For other models of DM we need to employ motivated analytic estimates. The goal of this section is to illustrate the analytic prescription that we will adopt in the rest of this work to compute the subhalo mass function (sHMF) and concentration parameters.
We start by writing the dimensionless primordial power spectrum for DM density perturbations, where ∆ 2 MD (k) is the model dependent contribution to the small scale power, and ∆ 2 SI (k) is the scaleinvariant primordial power spectrum for adiabatic perturbations as extracted from CMB measurements (and extrapolated to smaller scales) [26]: with n s = 0.9665, A s = 2.101 × 10 −9 , and pivot scale k 0 = 5 × 10 −3 Mpc −1 . Following the standard assumption of linear evolution of the density perturbations, we relate this primordial power spectrum to its late-time value by means of transfer, T (k), and growth, D(z), functions: where z eq 3.3 × 10 3 is the redshift at matter-radiation equality, and we allow ∆ SI and ∆ MD to have different transfer and growth functions.
The first category of models we consider modifies ΛCDM by increasing the amplitude of small-scale perturbations (i.e. ∆ 2 MD (k) = 0). The post-inflationary axion and vector DM models, discussed in Secs. IV B and IV D respectively, belong to this category. The second category of models preserves the primordial power of ΛCDM (i.e. ∆ 2 MD (k) = 0) but modifies the growth of small-scale perturbations in the early Universe, i.e. an enhanced, non-standard T (k). Models featuring an early stage of matter domination, discussed in Sec. IV C, belong to this category. The net effect of both of these categories is similar: they enhance the power of matter density fluctuations on small scales, as shown in Fig. 3. We will discuss these figures in more detail below, but simply note that the models under consideration

A. Halo Mass Function
The first quantity we derive from ∆ 2 (k, z) is the global Halo Mass Function (HMF), a function that gives the comoving number density of isolated halos in the Universe as a function of mass. Despite the non-linear nature of the problem, the analytic approach pioneered by Press & Schechter [18] has been proven to well-approximate numerical N-body simulations (at least for the standard ΛCDM).
In the Press-Schechter (PS) formalism, overdense regions are expected to detach from the Hubble flow and gravitationally collapse when their average overdensity, δ ≡ δρ/ρ, becomes larger than a critical value, δ c (which in a flat Universe is found to be δ c 1.686). 3 Therefore, for Gaussian perturbations, the probability an overdense region of mass scale M is collapsed by redshift z is given where σ(M, z) is the variance of density perturbations smoothed over a sphere of size R(M ) = (3M/4πρ m ) 1/3 : where W (x) is the window function of choice, and ρ m is the background matter density of the universe today. In the following, unless otherwise specified, we will use a top-hat window function, The fact that a region of size R(M ) is collapsed does not prejudice against being part of a larger overdensity which has also collapsed. The HMF only counts collapsed objects which are not contained within larger halos; to obtain the comoving number density of these we differentiate Eq. (21) with respect to M and multiply by the average number density, ρ m /M , with the mass fraction per logarithmic interval, df /d ln M , is where we have defined ν(M, z) ≡ δ c /σ(M, z).
The PS formalism has been validated against numerical simulations for ΛCDM cosmology [29,30].
But whether it is still a good approximation for models that, on small scales, differ vastly from ΛCDM has yet to be studied in detail. We plan to study this question in future work, but for the moment we limit our discussion to PS predictions and compare them to numerical results available for the case of axion miniclusters [31]. The results of this comparison, shown in Appendix B, suggest that -at least for axion miniclusters -PS still provides a good estimate of the HMF. All the following results are based on this assumption.

B. Subhalo Mass Function
As already mentioned, the target of PTA searches are pulsars within a radius of O(5 kpc) from our solar system. Because of this, PTAs can only detect signals generated by the local population of MW DM subhalos, and not from the cosmological population described by the HMF. Unfortunately, estimating the local subhalo density is a much more complicated problem than deriving the HMF.
Indeed, in the galactic environment these subhalos are exposed to tidal forces that strip part of their mass, and which may ultimately lead to their complete disruption.
We divide the problem of deriving the subhalo mass function (sHMF), dñ/d log M , in two steps.
First we derive its value at infall: dñ/d log M acc , where M acc is the subhalo mass at the moment of its accretion into the MW and before tidal effects may reduce its value. Then, we analytically estimate the impact of tidal effects and derive the relation between the infall and final subhalo mass where we have implicitly defined the bound fraction, f b (M acc ), as the fraction of the infall subhalo mass that survives tidal effects. Finally, by using Eq. (25), we relate the infall sHMF to its final value, where here and in the following, we adopt a ∼ notation to distinguish between the local sHMF and the global HMF.
As we will see, the density profile plays a key role in determining the impact of tidal effects.
Subhalos are expected to have a radial density profile which is well described by the usual NFW profile given in Eq. (4). The characteristic density of the subhalo, Eq. (4), is set by the Universe's energy density at the time of collapse, where C ρ is a free parameter that encodes the growth of the halo density during the virialization processes, and z col is the collapse redshift. We fix the value of C ρ by fitting Eq. (27) against N -body simulations. For ΛCDM halos (for which z col 10) we use the results in [32] and find C ρ ∼ 600, for axion miniclusters (for which z col 10 3 ) we use the results of [31] and find C ρ ∼ 9 × 10 4 . In the following we use a z col -dependent C ρ which smoothly interpolates between these two values.
Following Ref. [33], we define z col for a subhalo of final mass M as the redshift at which half of its final mass is contained in progenitors more massive than M . Using the extended Press-Schechter model [34], this redshift can be estimated by where and the best fit to ΛCDM halos is found for = 0.01. In the following we will assume that the same value of provides a good estimate also for different DM models. The typical concentration parameters, as a function of mass, are shown in the lower panels of Fig. 4, where Eq. (6) was used to relate ρ s to the concentration parameter. As expected, larger amplitude primordial perturbations lead to an earlier collapse and in turn larger concentration parameters. Lower: typical concentration parameter as a function of the subhalo mass. Additional power on small scales leads to an early collapse and therefore a larger concentration parameter for light halos.

sHMF at infall
In the PS formalism the probability that a halo of mass M at redshift z will be part of a larger halo of mass M 0 at redshift z 0 is given by [34,35] where s ≡ σ 2 (M, 0) and S ≡ σ 2 (M 0 , 0). Therefore, as a proxy for the sHMF at infall we can use where ρ DM is the local DM energy density, M MW 200 ≈ 1.5 × 10 12 M is the MW virial mass, and z MW is the MW collapse time, derived from Eq. (28).

Tidal effects
Subhalos in the MW can experience different kinds of tidal forces. Subhalos on nearly circular orbits are subjected to an almost constant tidal pull from the galactic halo. The effect of this gravitational pull is to strip away the halo mass contained beyond its tidal radius, r t , defined as the distance from the subhalo center at which the gravitational pull of the galaxy is stronger than the self-gravity of the subhalo [36]: where r ⊕ is the radius of the Earth's circular orbit (assumed to be the distance between the solar system and the center of the galaxy, 8 kpc), M MW (r < r ⊕ ) is the Milky Way mass enclosed within radius r ⊕ , and M acc (r < r t ) is the accreted mass within r t . For an NFW profile, the enclosed mass is given by M (r < r * ) = M F(r * /r v , c), where F is defined in Eq. (5), and M is the total mass.
Assuming that the mass outside the tidal radius is instantaneously stripped away, the halo bound fraction surviving tidal stripping is 4 The treatment of tidal effects performed in this analysis should be seen as a simple order of magnitude estimate whose main goal is to take into account the large tidal disruption suffered by the standard ΛCDM halos. To accurately estimate the smaller impact that tidal effects have on high concentration subhalos a more sophisticated analysis is needed. Specifically, additional processes like subhalo-subhalo encounters and interactions with MW stars can be the dominant disruption mechanism for high density subhalos, and need to be considered. A detailed study of these effects would require dedicated numerical simulation that we leave for future works. However, we do not expect these effects to change our results drastically, as also suggested by the recent results of Ref. [37].

IV. CONSTRAINTS ON PRIMORDIAL POWER SPECTRA
In this section we use the tools developed to derive the discovery potential of PTAs for four benchmark DM models. However before discussing model-specific results, we want to give a rough idea of the constraints that PTAs can place on the DM primordial power spectrum. Assuming this spectrum is locally scale invariant, it can be shown [40] that where the constraints for a monochromatic population of subhalos, f mono (M, c), are shown in Fig. 2.
Assuming the transfer function is the one predicted by ΛCDM, and r ⊕ = 8. can strongly constrain models that have a small scale power spectrum that is enhanced compared to ΛCDM. However, standard ΛCDM subhalos, being almost completely disrupted by tidal effects (see Fig. 4), are a much more challenging target (as discussed in Sec. IV A).

A. ΛCDM
The ΛCDM model is reproduced taking ∆ 2 MD (k) = 0 in Eq. (18), and the standard growth and transfer functions in Eq. (20). Specifically, we use the BBKS transfer function [41], and a linear growth function, Here a eq 3 × 10 −4 is the scale factor at matter radiation equality, k eq 0.01 Mpc −1 the largest scale that enters the horizon before equality, and we use the approximated growth factor given in [42]: where Ω m and Ω Λ are the matter and vacuum energy densities, normalized to the critical density, ρ c , respectively.
The sHMF for standard CDM halos has been computed in the mass range (10 −5 − 10 12 )M [32] but little is known for lighter subhalos. Because of this, we compute the sHMF at infall by using Eq. (30). As expected, the result is compatible with the numerical simulations, and the sHMF at infall is well fitted by The evolution of the PQ field down to the QCD phase transition has been studied numerically in [43,44]. These simulations need to resolve string cores and contain a few Hubble patches at the same time. These two scales are respectively fixed by the mass of the radial mode, m r ≈ f a , and 1/H ≈ M P l /T 2 . At the QCD phase transition the scale separation is log(M P l f a /Λ 2 QCD ) ≈ 70, which makes numerical progress to evolve from the PQ to QCD phase transition nearly impossible.
Luckily strings are expected to enter a scaling regime, during which their length per Hubble patch remains constant, soon after the PQ phase transition. Therefore simulations can be stopped for reasonable values of log(m r /H) and extrapolated to the QCD era. However, the authors of [45] recently pointed out that large logarithmic corrections to this scaling regime are present, the extrapolation of which suggests that axions from strings may dominate the axion relic density contrary to what is assumed in [43,44]. This would not only change the predicted axion mass but also the spectrum of its primordial perturbations. In the following we will set constraints assuming that the axion relic density is dominated by the misalignment contribution, with the possibility of updating the results in case the conclusion of reference [45] is confirmed.
Given this assumption we use the primordial spectrum for the axion field derived in [43]. Since isocurvature perturbations are expected to experience a very small logarithmic growth during radiation domination, and linearly evolve afterwards, in Eq. (20) we take while for the transfer and growth function of the scale invariant part of the spectrum (T (k) and D(z)) we use the same conventions described in the previous subsection. The properties of the resulting local population of DM subhalos are shown in the left panels of Fig. 7.
As before we use the MC to simulate the local population of DM subhalos and derive the induced PTA signal with the methods discussed in Sec. II, and the results are shown in the right panel of Fig. 7. For SKA parameters the detection of axion MC appears to be challenging, but more futuristic PTA experiments should be able to test this model.
Similarly to the axion case, models featuring a cosmological phase transition in the early Universe can boost the DM power spectrum on small scales [46] and be tested by PTAs. Interestingly, if the phase transition happens at low enough temperatures (T 1 GeV), these models are also expected to generate a background of gravitational waves which could be searched for in PTAs (see for example [47]). We leave to future works a detailed study of these scenarios.

C. Early matter domination
Models with an early stage of matter domination (MD) are another class of models potentially testable by PTAs. Here, after the end of inflation, the Universe's energy density is dominated by a non-relativistic spectator field. Sub-horizon perturbations of this field then grow linearly with the scale factor during this early stage of MD. When the spectator field decays and reheats the Universe, these density fluctuations are inherited by the DM density field. This primordial growth of matter perturbations can be parametrized in terms of a modified matter transfer function, T MD . Following [5], we include a period of early MD by modifying the transfer function, where T (k) is the standard BBKS transfer function, R(k) encodes the modifications induced by the early matter domination era, and the exponential is given by the free streaming scale, induced by the finite velocity of the DM when it is produced from the decay of the spectator field. Large scales are not affected by the early matter domination so R(k < 0.05 k RH ) = 1, where k RH is the wave number of the mode that enters the horizon at reheating, k RH k eq = 1.72 × 10 11 T RH 100 GeV g * 100 with k eq 0.01 Mpc −1 the largest scale that enters the horizon before matter-radiation equality, T RH the temperature at which the spectator field decays (i.e. the reheating temperature), and g * the with a hor (k) the scale factor at reheating and horizon crossing, and f 1 and f 2 related to the baryon On small scales, the functions A(x) and B(x) are fit according to where S(y) = [tanh(y/2) + 1]/2. Finally, ignoring DM interactions, the free-streaming scale, k fs , appearing in Eq. (42) is approximately given by (47) where v RH is the DM velocity at reheating. In deriving our results we have assumed that v RH = 10 −3 .
For smaller values of v RH the results shown in Fig. 8 remain almost unchanged. However if the DM is produced with v RH 0.06 then k RH k fs and free streaming erases the small scale perturbations of the scalar field entirely. In this case PTAs will not be able to set constraints.
The properties of the subhalo population resulting from this modified transfer function are shown in the left panels of Fig. 8. While the significance of the predicted signal is reported on the right panel of the same figure. We see that, assuming optimistic PTA parameters, the model can be tested for T RH 1 GeV. For higher reheating temperatures typical subhalo masses become too small to induce a large enough signal and sensitivity is lost.

D. Vector DM
As shown in [7], massive vector bosons can be produced by quantum fluctuations during inflation.
The relic abundance of particles produced in this way is given by where Ω DM 0.265 is the observed DM relic density, m is the mass of the boson, and H I is the Hubble rate during inflation. In addition to the adiabatic and scale invariant fluctuations inherited from perturbations in the inflaton field, the inflationary production generates isocurvature fluctuations on small scales. Specifically, the primordial power spectrum of the field amplitude is [7] ∆ 2 Notice that this is not the power spectrum for DM density perturbations; we will discuss shortly how the two are related. At late times the small-scale power spectrum for the field amplitude has the usual relation where D(z) is the standard growth function, while the transfer function is and k * = a eq H eq m, with a eq = 3×10 −4 and H eq 10 −29 eV the value of the scale factor and Hubble rate at equality. Finally, the small-scale power spectrum for density perturbations is given by [7]: where A 2 = d ln k ∆ 2 A (k, z). In Fig. 9 we show the properties of the subhalo population predicted by this model (left panel), and the significance of the signal that it would produce (right panel). We see that future PTA experiments will have a good sensitivity for DM masses below 10 −16 GeV.

V. CONCLUSIONS
We have studied the detectability in Pulsar Timing Arrays of a variety of well motivated DM models of substructure including: standard ΛCDM, axion models where the PQ symmetry breaks after inflation, early matter domination, and vector DM. Given the low concentration, ΛCDM subhalos are particularly susceptible to tidal effects which drastically reduces their local abundance. As a result, we found that ΛCDM will not be testable by present or (near-)future PTAs. The other models, which feature subhalos with much larger concentration parameters, and hence much lower tidal disruption, are better candidates for detection. Specifically, we found that models featuring an early stage of matter domination ending at temperatures lower than 1 GeV will be testable by PTAs with SKAlike capabilities. Similarly, vector DM candidates produced during inflation with a mass smaller than 10 −16 GeV are within the reach of PTAs with SKA-like parameters. Finally, axions whose PQ symmetry breaks after inflation (if the production is dominated by misalignment) are out of reach for an SKA-like experiment, but could be probe by a slightly more optimistic set of experimental parameters.
To generate the signals we have developed a Python Monte Carlo tool that, given the subhalo mass function, DM velocity distribution, and concentration parameters, generates a population of subhalos and computes the acceleration of the Earth, or pulsar, induced, and the resulting shift on the phase of the pulse time-of-arrival. We make the code publicly available on GitHub .
In future works we plan to improve the present analysis in two ways. First, by performing dedicated N-body simulations to more precisely describe the impact of tidal effects on high-density subhalos.
Second, by performing a more realistic treatment of the background noise through the NANOGrav software Enterprise [23].
• f M (M ): the subhalo mass distribution (related to the subhalo mass function, dñ/d log M as described below), all of which are normalized to 1: dXf X (X) = 1. Since PTA searches are only sensitive to DM subhalos in the neighborhood of the solar system, we expect the position distribution to be uniform.
Therefore we take f r = 1/V , where V is the simulation volume. Practically this volume is limited by the total number of subhalos that can be kept in the simulation. The velocity distribution, f v (v), is taken to be a Maxwell-Boltzmann distribution with v 0 = 325 km/s, v esc = 600 km/s, and the angular dependence assumed to be isotropic. The larger value of v 0 relative to the standard choice of ∼ 200 km/s is chosen to account for the Earth or pulsar velocity relative to the Galactic rest frame. 5 Lastly, the mass distribution can be written in terms of the subhalo mass function as where n ≡ dM dñ/dM . The concentration parameters are then taken from the concentration-mass relationship, c(M ), as discussed in Sec. IV. We also generate the pulsar directions,d in Eq. (8), from a uniform distribution on the sphere.
Given these pdfs, the Monte Carlo generates a large number (taken to be 1000 in our results) of different universe realizations and in each computes the total phase shift, Eq. (9), due to subhalos surrounding the Earth or pulsar. The total phase shift is computed on a uniform grid of time points with spacing equal to the cadence, ∆t, and total number of points equal to T /∆t. The residual signal in each pulsar, h I , is then computed by finding the best fit to the parameters φ 0 , ν, andν in the timing model of Eq. (10) and subtracting the best fit timing model from the total phase shift. These h I 's are then used to compute the SNRs discussed in Sec. II for each realization. To draw constraints we take the 10 th percentile SNR across universes. The statistical significance of a given SNR is discussed below in App. C.
The results derived here are more complete and subsume the previous works [16,17]. To illustrate the differences with the previous results, in Fig. 10 we compare the constraints on the fraction of DM in PBHs, f = Ω PBH /Ω DM . The curves labelled "DopDet-P" in the left panel and "DopStoch" in the right panel are taken from [17], which are analogous to the "pulsar term" and "Earth term" analysis presented in this work. The main difference between the more recent work, Ref. [17], and this analysis is how the signal is calculated. Previously the signal h I (t) and its autocorrelator were computed using analytic approximations, and the constraints were cut off when these were no longer good expressions. Here we explicitly generate the residual signal h I which avoids the need of analytic simplifications. More specifically, for the "DopDet-P" curve in [17] the signal δφ(t) was approximated by the leading order term in the power series in the small (dynamic) and large (static) τ /T limits.
By contrast, this work does not use the aforementioned approximations. This allows the constraint to smoothly interpolate between the two regimes and asymptote to the "DopDet-P" curves in both limits. Similarly, the "DopStoch" curve was obtained using an approximate form of the correlation function R(t, t ), which was derived analytically from the step function approximation of δφ(t). As shown in Fig. 10, the constraint has a non-negligible deviation from this work, which is an indication that the approximations used in [17] is overly optimistic near the peak of the constraint. For an explicit comparison we ran the MC using the same approximate form of δφ(t) in [17], keeping only subhalos with impact parameter satisfying b <vT . The resulting constraints are labelled "Step" in the right hand panel of Fig. 10 and we see good agreement with the result in [17].

Appendix B: Comparison between analytic and numerical HMF
In this appendix we compare the numerically derived HMF for axion models where the PQ symmetry is broken after inflation [31], against some commonly used analytic prescriptions. Specifically, we compare the numerical result against the analytic predictions of the Press-Schechter [18], and Sheth-Tormen [48] formalisms. The first one has been described in the main text, for convenience of reference we report here the differential halo mass fraction predicted by the PS formalism: where the choice of parameters A = 0.3222, a = 707, and p = 0.3 has been showed to best reproduce the numerical results (at least for the the case of ΛCDM).
The results of the comparison, at four different redshifts, are shown in Fig. 11. The agreement between numerical and analytic results is quite good, with the exception of the high mass regime where simulations start to lose sensitivity.

Appendix C: SNR statistical significance
In this section we discuss the statistical significance associated with a given observed value of the SNR, more specifically, we derive its p-value (i.e the probability that an SNR at least as large as the one observed is generated in presence of noise only).
In absence of a signal, the pulsar SNR given in Eq. (15) can be written as where we have drop the expectation value in the numerator because we want to study the full statistical distribution of the SNR, and ignored the filter functions for simplicity. Assuming that the noise is Gaussian, it is evident from the previous equation that the SNR in absence of a signal is distributed as the absolute value of a Gaussian variable with zero mean and unit standard deviation. From this follow that, given an array of N P pulsars, the p-value for the pulsar term is Assuming pulsar independent noise, the Earth SNR in absence of a signal takes the form SNR = I,J dtdt n I (t)n J (t ) where, as before, we have dropped the filter functions and the expectation value in the numerator.
We now define ρ ≡ I dt n I (t) N N P T (C4) which, by construction, is a Gaussian variable with zero mean and unit standard deviation. In terms of ρ, Eq. (C3) can be rewritten as from which conclude that, in absence of signal, the Earth SNR follows a rescaled χ-squared distribution.
Therefore, the p-value for the Earth term is given by where P (s, t) is the regularized gamma function.
Finally, when showing our final results in the main text, we express p-values in terms of standard deviations by using the relation As an example, an SNR = 4 corresponds (both for the pulsar and Earth term) to a p-value of p ∼ 0.01, and a signal significance of σ significance ∼ 2.