A taste of dark matter: flavour constraints on pseudoscalar mediators

Dark matter interacting via the exchange of a light pseudoscalar can induce observable signals in indirect detection experiments and experience large self-interactions while evading the strong bounds from direct dark matter searches. The pseudoscalar mediator will however induce flavour-changing interactions in the Standard Model, providing a promising alternative way to test these models. We investigate in detail the constraints arising from rare meson decays and fixed target experiments for different coupling structures between the pseudoscalar and Standard Model fermions. The resulting bounds are highly complementary to the information inferred from the dark matter relic density and the constraints from primordial nucleosynthesis. We discuss the implications of our findings for the dark matter self-interaction cross section and the prospects of probing dark matter coupled to a light pseudoscalar with direct or indirect detection experiments. In particular, we find that a pseudoscalar mediator can only explain the Galactic Centre excess if its mass is above that of the B mesons, and that it is impossible to obtain a sufficiently large direct detection cross section to account for the DAMA modulation.


Introduction
One of the major objectives of modern particle physics is to determine the properties of particle dark matter (DM). A critical part of this task is understanding how the DM particle interacts with Standard Model (SM) states and how these interactions lead to the observed DM relic abundance. Stringent bounds from direct detection experiments [1][2][3][4] and recent constraints on the invisible width of the SM Higgs [5,6] make it increasingly difficult to understand these interactions in terms of the known interactions of the SM. It is therefore a well-motivated possibility to assume that the interactions of DM are mediated by an additional new particle that couples weakly to the visible sector.
Out of the various possibilities for this new particle, pseudoscalar mediators are particularly interesting for several reasons. First of all, with the discovery of the Higgs boson, there is now convincing evidence that fundamental scalars exist in nature. Many extensions of the Higgs sector (such as Two-Higgs Doublet Models [7] e.g. in the context of Supersymmetry [8]) naturally include additional pseudoscalar states, making searches for such particles a well-motivated and timely task. Pseudoscalar mediators are at the same time attractive from a purely phenomenological point of view, because they predict a strong suppression of the event rate in direct detection experiments, thus avoiding some of the most severe constraints on the interactions of DM [9,10]. This consideration provides one reason why DM models with a pseudoscalar mediator (sometimes referred to as 'Coy DM' [11][12][13]) have received much attention in the context of explaining the diffuse GeV-energy excess of gamma-ray emission from the Galactic Centre observed with the Fermi-LAT instrument [14][15][16][17][18][19][20][21][22][23][24] (see also [25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40][41] for further model-building involving pseudoscalars).
An interesting possibility is that the pseudoscalar mediator mass is sub-GeV and is therefore light compared to Large Hadron Collider (LHC) energies. Such a set-up was for example advocated in the context of asymmetric DM [42] in order to avoid constraints from LHC monojet searches. At the same time, for light mediators the suppression of DM scattering in direct detection experiments is significantly reduced, so that it might be possible to obtain observable signals in present or future experiments [11] and possibly even explain the DAMA modulation signal [13]. Moreover, if the pseudoscalar has a mass smaller than the DM mass, DM can annihilate into pairs of pseudoscalars, which subsequently decay into SM particles. This way, DM can obtain the required relic density even if the interactions between the pseudoscalar and SM particles are constrained to be rather weak.
Furthermore, the presence of a light mediator offers the interesting possibility to obtain large self-interactions in the dark sector. Such self-interactions have received much interest in the context of explaining the discrepancies between N -body simulations of collisionless cold DM and the observations of small-scale structures [43]. The central idea is that DM scattering can reduce both the central densities of DM halos and the size and number of Milky Way satellites. For velocity-independent self-interactions, however, there are important constraints from colliding galaxy clusters [44,45], sub-halo evaporation [46,47] and elliptical galaxies [48][49][50]. These constraints can be evaded if self-interactions are suppressed for large relative velocities [51,52]. Such a velocity dependence can for example arise from a light mediator inducing a Yukawa potential [48,49,[53][54][55]. In addition, for a light mediator DM self-interactions may be additionally enhanced at low velocities by non-perturbative effects corresponding to the (temporary) formation of DM bound states [49,55].
However, there are stringent constraints on new light states coupling to SM particles. Of particular interest in this context are experimental searches for rare meson decays, because the presence of a new light pseudoscalar mediator A will in general lead to a large enhancement in the rates of flavour-changing processes such as K → π A or B → K A [56,57]. Flavour observables therefore provide a unique opportunity to constrain the interactions of the dark sector with SM particles via a light mediator. While similar constraints have been studied for light vector mediators [58,59], many other cases remain relatively unexplored, although there has been some interest in flavoured DM [60,61].
The topic of this paper is to explore in detail various constraints on the SM couplings of a new light pseudoscalar particle and infer the implications for the interactions between DM and SM particles. We will show that flavour constraints completely rule out the possibility to obtain an observable DM signal in direct detection experiments from scattering via the exchange of light pseudoscalars. Similarly, indirect detection signals can only be sizeable if the mediator mass is so large that it cannot be produced on-shell in the decay of B mesons. In particular, it appears impossible to obtain both indirect detection signals and large DM self-interactions from the same pseudoscalar mediator.
Our paper is structured as follows. Sec. 2 contains the general set-up for our study and discusses the various ways in which flavour-changing processes can induce rare meson decays. In Sec. 3 we use various experimental results to constrain a light pseudoscalar coupling to the SM. The resulting bounds are presented in Sec. 4. The focus of Sec. 5 is the connection to the dark sector and the resulting cosmology. Sec. 6 considers implications for possible DM signals, in particular concerning the interpretation of the DAMA annual modulation and the Galactic Centre excess. Various details of our calculations are provided in Appendices A-D.

General set-up and conventions
We are interested in the interactions of a light real pseudoscalar A with the DM particle χ, which we take to be a Dirac fermion, and with SM fermions. Neglecting CP -violating couplings, we write the DM-pseudoscalar coupling as where we introduce a factor of i so that the coupling g χ is real. For the interactions between A and SM particles we write in general where g f is the effective coupling and f refers to all SM quarks q = {u, d, s, c, b, t}, all charged SM leptons = {e, µ, τ } and all SM neutrinos ν.
In the following we will consider different cases for the coupling structure with the charged SM fermions; unless explicitly stated otherwise, we assume that g ν 0.
• Yukawa-like couplings: Arguably the most natural case is the one where the couplings to all charged SM fermions are proportional to the SM Yukawa couplings: where m f is the fermion mass and v 246 GeV is the vacuum expectation value (vev) of the SM Higgs field. In this case g f = √ 2 g Y m f /v. This coupling structure is expected for pseudoscalars arising from an extended Higgs sector, because the couplings of the pseudoscalar to SM fermions arise from mixing with the SM Higgs boson and are therefore automatically proportional to the SM Yukawa couplings. Such extended Higgs sectors often contain additional CP -even and charged Higgs particles as well. Our results should apply in such theories as long as the effects of these particles decouple.
• Quark Yukawa-like couplings: As we shall see many experimental constraints assume that the pseudoscalar can decay into charged leptons. These constraints can be significantly relaxed -or even removed altogether -if the pseudoscalar is assumed to couple only to quarks i.e. g f = √ 2 g Y q m f /v for f = q and g f = 0 otherwise. Such a coupling structure can be expected for axion-like particles with a shift symmetry, which would have a coupling proportional to e f ∂ µ af γ µ γ 5 f , where e f is the charge of the fermion under the new global U (1) symmetry. This coupling structure leads to Yukawa-like couplings after integrating by parts and using the equations of motion. If e f = 0 for leptons, such a particle would couple only to quarks (like the QCD-axion).
• Quark universal couplings: The assumption of Yukawa-like couplings for the pseudoscalar is consistent with the hypothesis of minimal flavour violation (MFV) [62]. Consequently, one would expect other (non-MFV) coupling structures to lead to significantly stronger experimental bounds. Moreover, we will see below that an arbitrary coupling structure typically does not lead to a consistent renormalisable quantum field theory. Nevertheless, it is interesting from the phenomenological point of view to consider different coupling structures as a low-energy effective theory resulting from more complex new physics at higher scales. In particular, we will be interested in the case that the pseudoscalar has universal couplings to all quarks and no couplings to leptons: L (q) Figure 1. Flavour-changing transitions such as b → sA (and also s → dA after relabelling the external lines) are generated by diagrams with heavy quarks and W ± -bosons.
Experimental searches (to be discussed in the following section) typically look for rare decays of the form K → π + X or B → K + X where X is a set of (potentially invisible) SM particles. In the context of our model, these processes can be decomposed into the production of a pseudoscalar in a flavour changing process, such as K → π A followed by the decay of A into SM particles. We will therefore now discuss the theoretical predictions for both contributions.

Effective flavour-changing interactions
Although the tree-level interactions of A are assumed to be flavour-diagonal, flavourchanging neutral currents (FCNCs) arise at the one-loop level from diagrams with heavy quarks and W -bosons, such as those depicted in Fig. 1. We will be interested in the transitions b → s A and s → d A. The relevant flavour-changing terms are typically parameterised in the form [63]: where the coefficients h S,P qq are typically complex, so we do not include an extra factor i in front of the pseudoscalar coupling. To connect to various results in the literature, we note that this expression can also be written as where q R,L = 1 2 (1 ± γ 5 )q and the couplings are related by In order to calculate the loop-induced flavour-changing couplings, we first of all need to determine the quark field renormalisation constants. This can be done by calculating the loop-induced contribution to the quark two-point function and fixing the counterterms in such a way that all flavour changing transitions q → q vanish for on-shell quarks [64,65]. The same counterterms contribute to the processes b → s A and s → d A. Since we assume that the pseudoscalar has no flavour-changing interactions at tree-level, there is also no corresponding counter term for the three-point vertex, so that the quark field renormalisation constants must cancel the divergent parts of the diagrams in Fig. 1. In other words, the diagrams in Fig. 1 should be finite when considering renormalised quark fields.
This cancellation of divergences does indeed occur for the case of Yukawa-like couplings and we obtain a finite result for the effective flavour-changing couplings. Using FeynArts, FormCalc and LoopTools [66,67], we find in the approximation that the top-quark mass m t and the W -boson mass m W are large compared to all other masses where α ≡ e 2 /(4π), θ W is the Weinberg angle, V is the CKM matrix and Analogous expressions are obtained for h R ds and h L ds , except that in this case a second term with t replaced by c gives a relevant contribution. Using the full expression and substituting the measured masses, couplings and mixing angles, we find (2.10) If we deviate from the assumption of MFV, however, the quark field renormalisation constants are no longer sufficient to cancel the divergence of the diagrams in Fig. 1. Indeed, for general couplings and using dimensional regularisation we find a pole in = (4 − d)/2 of the form 11) and analogous expressions with t replaced by c or u. These expressions are non-zero in general unless g q ∝ m q . In particular, the divergence does not cancel for universal couplings. The fact that for general coupling structures the one-loop divergences do not cancel demonstrates that it is inconsistent to assume general but flavour-diagonal couplings. In other words, in order to render the theory renormalisable, we need to include flavourchanging interactions already at tree-level, so that an additional counter term can be introduced that cancels the divergence of the three-point functions. Clearly, such a model would be extremely tightly constrained by experiments and would therefore typically not be phenomenologically viable. 1 For the purpose of this paper, we will take a more optimistic approach and assume that general (but flavour-diagonal) couplings can still be obtained in an effective theory that arises in the low-energy limit of a more complete theory. In other words, we assume that additional new physics at a high scale Λ cancels the divergences present in the effective theory. This new physics will in general induce additional higher-dimensional operators with flavour-changing interactions, but we assume that these effects are small compared to the effects that we consider here and that the coupling structure is not significantly changed by renormalisation group evolution over the energy range that we consider. Clearly, these assumptions are highly optimistic, but we will show that due to the loopinduced flavour-changing couplings models with general coupling structure are nevertheless tightly constrained. In the presence of a UV completion (if such a completion can indeed be constructed), even stronger constraints are expected.
Interpreting models with general coupling structure as effective theories below some new physics scale Λ corresponds to making the replacement 1/ +log(µ 2 /m 2 ) → log(Λ 2 /m 2 ), where m is the relevant mass scale for the process under consideration. Keeping only the logarithm and taking Λ = 1 TeV and m to be m b for b → s A and m s for s → d A, we find for quark-universal couplings for couplings only to the third generation.
Using these results, we can now calculate the partial decay widths for various flavour changing meson decays. The relevant formulae are provided in Appendix A. To obtain the branching ratios (BR) into SM final states, we need to divide these partial widths by the total meson decay width and multiply with the branching ratio of the pseudoscalar into the appropriate final state. For example, where we have made use of the narrow width approximation in the second line. We will therefore now discuss the decays of the pseudoscalar mediator.

Pseudoscalar decays
In principle, the pseudoscalar A can decay into SM states which are kinematically accessible, such as leptons, photons and hadrons. Our results for the various branching ratios in the case of Yukawa-like couplings -and the total width of the pseudoscalar -are shown in Fig. 2. 2 We provide details of our calculations of these quantities in Appendix B, being  . Right: Total width in units of the pseudoscalar mass for Yukawa-like couplings to all fermions (with g Y = 10), Yukawa-like couplings only to quarks (with g Y q = 10) and universal quark couplings (with g q = 0.1). For the latter case, we do not attempt to estimate the total width once hadronic decay channels open up (when m A > 3m π ). The narrow width approximation is valid for all parameter values that we consider.
careful to provide a self-consistent estimation of the hadronic decay width. In particular, we point out that for all couplings and masses we consider, the pseudoscalar state is a narrow resonance (i.e. Γ A /m A 1) so that the narrow width approximation introduced above is valid.
For m A < 2 m π , hadronic decays are kinematically forbidden. For m A > 2 m π , the decays A → ππ and A → ππγ are kinematically allowed, but forbidden by CP . Consequently, sizeable hadronic decay channels only open up for m A > 3 m π when A → πππ becomes possible. While the decay A → πγγ is allowed for m A > m π , it is always suppressed compared to A → γγ because of the smaller available phase-space.
For m A ≤ 3 m π the pseudoscalar will therefore dominantly decay into pairs of leptons or photons. In the case of Yukawa-like couplings, decays into electrons dominate for pseudoscalar masses of about (1-100) MeV. As the pseudoscalar mass approaches the µ + µ − threshold, the branching ratio for A → γγ becomes sizeable, while above the threshold the µ + µ − decay channel is the most important one. If couplings to leptons are absent, the only allowed two-body decay for m A < 3 m π is A → γγ. Since loops with light quarks give an important contribution to this process, the total decay width depends on the precise matching of the light quark masses. We discuss this complication in more detail in Appendix B.
Predicting the hadronic decay width of A for m A > 3 m π is a notoriously difficult problem. Indeed, even for the SM-like Higgs with a mass m h ∼ (0.5-1) GeV there is an unresolved debate about the ratio of Γ(h → ππ) to Γ(h → µ + µ − ) [71] (see also [72] for a recent review). To obtain an approximate expression for the case of Yukawa-like couplings to quarks, we employ the perturbative spectator model [7], even though this treatment is not expected to be very accurate for m A 1 GeV. Given that we find the partial decay width for hadronic final states to be significantly smaller than the corresponding width for leptons (due to the phase-space suppression for three-body final states), the total width and the leptonic branching fraction do not suffer significantly from these uncertainties and our results based on these quantities are robust. Still, it is important to bear in mind that in the presence of resonance effects, using the perturbative spectator model might significantly underestimate the branching fraction into hadrons, so that bounds based on leptonic decays of A might be significantly suppressed for particular values of the pseudoscalar mass.
For the case of quark universal couplings we assume that the branching ratio to hadrons is 100% for pseudoscalar masses above 3 m π . This is a reasonable approximation since the typical phase-space suppression for three-body decays is only 1/(32π 2 ), compared to a suppression factor α 2 /(16π 2 ) for loop-induced decays into photons. Nevertheless, since the tree-level couplings to light quarks give a significant contribution to hadronic decays, it is very difficult to reliably estimate the total decay width of the pseudoscalar. Fortunately, our results do not depend sensitively on this quantity since the total width is neither so small that one could obtain displaced vertices, nor so large that the narrow width approximation becomes invalid.

Experimental constraints
To constrain the interactions of the pseudoscalar mediator with SM particles, we study a large set of flavour constraints and other experiments searching for rare processes. We provide a concise list of the searches we consider in Tab. 1. We will now discuss in more detail those measurements which are responsible for the best limits in pseudoscalar parameter space.

Flavour constraints
The simplest constraints can be obtained by requiring that the partial widths Γ(K → πA) and Γ(B → X s A) (where in the inclusive decay X s is any strange meson) do not exceed the experimentally measured total width of the K and B mesons, or in other words, by imposing Γ(K → πA)/Γ exp K < 1 and Γ(B → X s A)/Γ exp Bs < 1. In principle, one could obtain stronger constraints by taking into account the partial widths of well-known (or well-measured) decay channels. For kaons, for example, the partial widths for leptonic decays can be accurately calculated and could therefore be subtracted when setting a bound on the pseudoscalar couplings. For B 0 mesons, on the other hand, the PDG quotes BR(B 0 → cc + X) = 119 ± 6% [96], which would imply that BR(B → X s A) must be significantly smaller than unity. We do not attempt to include these contributions and therefore set a very conservative bound.
The pseudoscalar mediator introduced in Sec. 2 does not have any invisible decay modes (except for negligible loop-induced decays to neutrinos). However, as we discuss in more detail in Appendix C, its lifetime can be so long that it will leave the detector without decaying at all. If such a pseudoscalar is produced in a kaon decay, the resulting experimental signature (a single pion in association with missing energy) is identical to the one from the rare SM process BR(K → π + νν), which has recently received much interest. Indeed, the E949 collaboration [73,74] has measured this branching ratio to be in good agreement with the theoretical SM prediction. To suppress backgrounds, however, these searches restrict the momentum range for the pion in the final state. Most searches have focussed on the range 211 MeV < p π < 229 MeV, corresponding to the maximum pion momentum allowed by kinematics for an off-shell mediator [73]. For an on-shell mediator, the pion momentum will be reduced as m A increases. The search window above therefore ceases to be constraining for m A > 110 MeV. An additional search for low-momentum pions has been performed in Ref. [74], covering the momentum range 140 MeV < p π < 199 MeV, which corresponds to 150 MeV < m A < 260 MeV. In both search regions, the experiment places an upper bound of BR(K + → π + + inv) 4 · 10 −10 . (3.1) The overwhelming SM background for m A ∼ m π makes it impossible to search in the intermediate region. Nevertheless, E949 has also measured the branching ratio for the process K + → π + π 0 → π + νν, finding BR(K + → π + π 0 → π + νν) ≈ 6 · 10 −8 . (

3.2)
For the mass range 110 MeV < m A < 150 MeV we therefore require that the branching ratio for K + → π + A multiplied with the probability that A escapes from the detector does not exceed this value. In practice, we require that the pseudoscalar travels at least 4 metres in the laboratory frame before decaying.
For Yukawa-like couplings and 210MeV < m A < 420MeV, the pseudoscalar will dominantly decay into a pair of muons. The KTeV/E799 experiment constrains the decay of kaons into π 0 µ + µ − [80]: This search is inclusive in the sense that it does not depend on the invariant mass of the dilepton pair. Nevertheless, an important complication arises from the fact that experimental searches require the tracks from the three particles to originate from a common vertex. Consequently, if the lifetime of the pseudoscalar is too large, its decays into muons will be vetoed because of the poor quality of the reconstructed vertex. We therefore have to multiply the theoretical prediction for the branching ratio with the probability that the pseudoscalar decays instantaneously (i.e. within the vertex resolution of the detector). We discuss the calculation of this probability, taking into account the boost factor of the pseudoscalar, in Appendix C. For our analysis we take the vertex resolution of KTeV/E799 to be 4 mm.
Similar searches exist for the decays of the pseudoscalar into a pair of electrons, which are relevant for m A < 210 MeV. These searches, however, typically require that the invariant mass of the electron pair satisfies m ee > 140 MeV. Consequently, these searches do not constrain pseudoscalars with very low mass, even though the branching ratio into electrons is largest at these masses.
The most promising place to look for the process K → πA → πγγ is in decays of K L , since the background from K L → π 0 π 0 violates CP and is therefore suppressed. Nevertheless, this background still reduces the experimental sensitivity for m A ∼ m π . For this reason, the KTeV experiment [81] excludes the mass range 100 MeV < m γγ < 160 MeV. In the remaining mass range, the experiment finds This measurement is in agreement with the SM expectation [97]. The theoretical uncertainties for the SM prediction are of the order of 10 −7 , i.e. larger than experimental uncertainties. We will therefore take the experimental data to give a constraint on new physics contributions of the order of As before, we cannot obtain such a strong constraint for m A ∼ m π . Nevertheless, we can still obtain a relevant constraint by requiring that BR(K L → π 0 A → π 0 γγ) does not exceed BR(K L → π 0 π 0 ). This estimate yields [82] BR(K L → π 0 γγ) < 9 · 10 −4 . (3.6) As in the case of decays into charged leptons, we require that the pseudoscalar decays within the vertex resolution of the detector.
The K µ2 experiment has studied the momentum distribution of charged pions produced in the decays of K + [83]. In the presence of a new light pseudoscalar, the decay channel K + → π + A would lead to a bump in the momentum spectrum. The absence of such a bump allows K µ2 to constrain the branching ratio for K + → π + A independent of the consecutive decay channels of A. Consequently, this search is particularly interesting in the case where the decays of the pseudoscalar are such that they would be hard to see in most experiments (e.g. due to displaced vertices). To set a limit, we take the bound from Fig. 2 of [83].
This search channel works in complete analogy to the case K + → π + + inv. The strongest bound results from CLEO [84]. Again, we require that the pseudoscalar travels a distance of at least 4 m before decaying.
This search channel proceeds in analogy to the case K + → π + µ + µ − , with the strongest constraint resulting from LHCb [90]. We use the Feldman-Cousins method [98] to obtain an upper bound on the new physics contribution in each of the bins considered by LHCb. We take the vertex resolution of LHCb to be 5 mm. 3 As is usually done in the analysis of B → K + − , LHCb does not consider events where the invariant mass of the two muons is close to the charmonium resonances J/ψ and ψ(2S). To obtain approximate bounds in this regions, we require that the branching ratio If the pseudoscalar couples only to quarks, the only allowed decay channel for m A < 3 m π is A → γγ. For m A > m K − m π , however, the pseudoscalar cannot be produced in kaon decays. An interesting way to constrain the mass range 350 MeV < m A < 420 MeV (apart from the trivial constraint from the total B-width) would be to search for the process B → Kγγ. Nevertheless, this decay mode is not listed in the PDG review. To indicate the potential impact of constraining this channel, we show the bound obtained from the assumption that the branching ratio is smaller than 10 −2 .
b → s g CLEO has performed a fully inclusive search for any B decays involving an FCNC process, such as b → s g or b → sqq, which are collectively denoted as b → s g. The resulting bound BR(b → s g) < 6.8% can be used to constrain the branching ratio for B → X s A for the case that A decays hadronically (see Appendix A). [90] and CMS [91,100] have determined the branching ratio for B s → µ + µ − to be in good agreement with the SM prediction. However, since the contribution from the pseudoscalar can in principle interfere with the SM processes, a simple subtraction of the SM prediction from the observed branching ratio is not possible [101]. We therefore take a more conservative approach and require that the contribution from the pseudoscalar should not exceed the SM prediction. It is clear that a very strong bound will be obtained for m A ≈ m Bs , when the B s decay receives a resonant enhancement from the on-shell mediator. For other values of m A the pseudoscalar is forced to be off-shell, so that the branching ratio is proportional to g 4 f . Consequently, for very small couplings this search will not be competitive with decays that produce the pseudoscalar on-shell. On the other hand, the constraints do not vanish above the B s meson mass and are very relevant there.

Fixed target experiments
It has long been known that beam-dump experiments are sensitive to long-lived light new states. The majority of beam-dump experiments operate using electron beams which are insensitive to new light scalars or pseudoscalars with Yukawa-like couplings or couplings only to quarks. However, there are a number of proton beam-dump results in the literature including CHARM [95], NuCal [102] and E613 [103] as well as the recent DAEδALUS [104] proposal. We have calculated the constraints from CHARM, following [72,105] and estimated the implications of NuCal, following [106,107]. We find that the CHARM reach is greater than NuCal, and so we present only results for CHARM here. Reproducing the calculations of [108] in order to derive results from E613 is beyond the scope of this paper.
Constraints from the fixed target CHARM experiment [95] are important when the pseudoscalar can be produced in K L , K + and B decays [105] and subsequently decays into e + e − , µ + µ − or γγ. Since CHARM observed 0 events, we set a bound at 90% confidence level of N det < 2.3 events [72]. We assume that the A production cross section is dominated by meson decays from either kaons or B mesons (i.e. we neglect direct mediator production) such that the approximate pseudoscalar production cross section is (3.9) where σ pp is the proton-proton cross section, M pp is the average hadron multiplicity and the numerical prefactors for the fraction of kaons and B mesons are taken from [105]. Using the relation σ π 0 ≈ σ pp M pp /3, we can normalise this cross section to the neutral pion yield. Since this yield is known, we can then write the number of pseudoscalars produced in the solid angle of the detector as N A ≈ 2.9 · 10 17 σ A σ π 0 . (3.10) To obtain the number of decays in the detector region we need to multiply this number with the branching ratio of the pseudoscalar into electrons, muons and photons, and with the probability that the pseudoscalar decays inside the detector. The detector was placed 480 metres away from the beam-dump, and was 35 metres long. The number of events in the detector region is thus (see Appendix C) As long as the pseudoscalar can be produced in K decays, we typically find N A 1 in the parameter region that we study. The expected number of events in CHARM therefore depends crucially on whether the lifetime of the pseudoscalar τ A = Γ −1 A is long enough for it to reach the detector.

Radiative Υ decays
Once m A > m B − m K , constraints on the pseudoscalar couplings can only come from either processes where the pseudoscalar is off-shell (such as B s → µ + µ − ) or from Υ decays. In particular, there are strong constraints from BaBar on new states A produced in the radiative decay Υ → Aγ, which apply for a wide range of different final states. For Yukawalike couplings the strongest bound comes from A → µ + µ − for m A < 2 m τ [92] and from A → τ + τ − above the kinematic threshold [93]. For universal quark couplings, strong bounds can still be obtained from hadronic decays of A by searching for a bump in the momentum spectrum of the photon [94].

Excluded parameter regions
The parameter regions excluded by the various experimental results discussed above are presented in Fig. 3 for the case of Yukawa-like couplings and Yukawa-like quark couplings, and in Fig. 4 for the case of universal quark couplings and third generation quark couplings. Let us briefly discuss the different cases in more detail.

Yukawa-like couplings
A straight-forward bound on g Y can be obtained from K µ2 , which gives BR(K + → π + A) < 10 −6 for m A 100 MeV independent of the decay modes of A. Substituting the value for h S ds from Eq. (2.10) into Eq. (A.2), we obtain the prediction BR(K L → π 0 A) ∼ 10 −4 g 2 Y in this mass region. Consequently, the bound from K µ2 implies g Y 0.1 for m A ∼ 100 MeV. As many other searches, this bound is significantly weakened for m A ∼ m π . 4 Most of the experimental constraints that we consider depend on the pseudoscalar branching ratios and its decay length. For example, the bound BR(B → K +inv) 5·10 −5 together with the prediction BR(B → K + A) ∼ 0.06 g 2 Y gives the bound g Y 3 · 10 −2 provided the pseudoscalar escapes from the detector before decaying. This is indeed the case for g Y ∼ 10 −1 and m A 100MeV, but it is no longer true for larger couplings or larger mediator masses. In particular, searches for B → K+inv cannot exclude pseudoscalars with m A > 2 m µ , which would typically decay within the detector. In a similar way the shape of the CHARM exclusion can be understood from the requirement that the pseudoscalar should neither decay too quickly nor too slowly in order to give an observable signal. The feature at m A ∼ 210 MeV results from the rapidly decreasing lifetime of the pseudoscalar as decays into muons become allowed.
Searches for K L → π 0 + − or B → K + − , on the other hand, can only be applied to pseudoscalars that decay promptly within the detector. In other words, these searches lose sensitivity towards small couplings not because the number of pseudoscalars produced is too small, but because the pseudoscalar lifetime becomes so large that most decays occur from displaced vertices (see Appendix C). For example, the bound BR(K L → π 0 µ + µ − ) 10 −10 cannot exclude couplings all the way down to g Y ∼ 10 −3 , because the decay length of the pseudoscalar for such small couplings would be significantly larger than the vertex resolution of the detector. Consequently, there is a useful complementarity between searches for invisible decays and searches for leptonic decays. By implementing a search for displaced vertices, LHCb should be able to significantly improve its sensitivity and probe pseudoscalar couplings down to g Y ∼ 10 −3 .
Finally, above the B meson mass, the only sensitive searches are those for Υ decays at BaBar and for B s → µ + µ − at LHCb, both probing roughly g Y ∼ 1. Note that the process B s → µ + µ − is the only search channel that involves an off-shell pseudoscalar. Consequently, the resulting bound in principle extends up to arbitrarily large pseudoscalar mass. For m A m Bs it can be written as

Yukawa-like couplings to quarks
For small pseudoscalar masses the case of Yukawa-like couplings only to quarks proceeds in complete analogy to the one including leptons, with the only differences being that the bound on K L → π 0 γγ replaces the search for K L → π 0 + − and that there is no special feature at m A = 2 m µ . In the absence of the process B → Kµ + µ − , however, it becomes very difficult to constrain the mass region m K − m π < m A < 3 m π , when the pseudoscalar can no longer be produced in kaon decays, but hadronic decays are still forbidden. A promising way to search for such pseudoscalars would be the decay B → Kγγ, but no such search exists in the literature. To estimate experimental sensitivity we show the bound corresponding to BR(B → Kγγ) < 10 −2 . In the absence of such a search, we still obtain g Y q < 1 from the total B width. For m A > 3 m π hadronic decays become allowed and completely dominate over photonic decays. Consequently, the only relevant searches are b → s g and Υ → γ + hadrons, the latter one giving stronger constraints. Note that in the absence of couplings to muons, the lifetime of the pseudoscalar is significantly increased above 210 MeV. Consequently, the CHARM bound extends up to larger pseudoscalar masses and includes a relevant contribution from B meson decays.

Universal couplings to quarks
For the case of universal quark couplings, the constraints on g q are considerably stronger than the corresponding ones for g Y . This is partially due to the fact that there is no factor √ 2 m f /v in the couplings, but more importantly due to larger flavour-changing effects resulting from the non-MFV coupling structure. The former reason also implies that experiments probing rare kaon decays become more important compared to experiments probing rare B meson decays. The enhancement of flavour-changing effects, on the other hand, implies that bounds from processes like b → sg give stronger bounds than processes like Υ → γ + hadrons.
For small pseudoscalar masses, we again find a clear complementarity between searches for K L → π 0 γγ and searches for K + → π + + inv (see left panel of Fig. 4), ruling out the entire parameter region m A < m K − m π and g q ∼ 10 −8 . To close the gap between m A < m K − m π and m A > 3 m π , we again show the bound corresponding to BR(B → Kγγ) < 10 −2 . By assumption, we take photonic decays to be negligible above the hadronic decay threshold. Consequently, the dominant bound comes from b → sg for m A m B , and from BaBar for m B m A m Υ .

Universal couplings to b and t quarks only
If we assume that the pseudoscalar couples only to bottom and top quarks with the same coupling strength g Q , the effective flavour changing coupling h S ds is reduced by more than three orders of magnitude, as can be seen by comparing Eq. (2.13) with Eq. (2.12). Consequently, any bounds on g Q from kaon decays are relaxed by more than three orders of magnitude (cf. left and right panels of Fig. 4). At the same time, the lifetime of the pseudoscalar grows significantly, because for universal quark couplings, light quarks give the dominant contribution to the loop-induced decay of the pseudoscalar into photons. In this scenario, these light quark contributions are absent with the result that, even for g Q ∼ 10 −2 , the pseudoscalar will almost always escape from the detector without decaying. Therefore searches for pions in association with missing momentum are constraining up to large couplings.
The effective coupling h S sb responsible for flavour-changing B meson decays receives its dominant contribution from the heavy-quark couplings of the pseudoscalar. Consequently, bounds from B → K A remain largely unaffected if we exclude couplings to light quarks. The same is true for radiative Υ decays, which only probe the bottom-quark coupling of A.

The case of an invisibly decaying pseudoscalar
One might be tempted to think that the constraints discussed above can be evaded if the pseudoscalar dominantly decays into neutrinos or other invisible states (such as a DM particle with m χ < m A /2). The experimental bounds on new contributions to B → K + inv and K + → π + + inv, however, imply that these decay channels are in fact tightly constrained. We show these constraints in Fig. 5. For the case that the pseudoscalar is produced via Yukawa-like quark couplings, but decays dominantly into invisible final states, we find If the pseudoscalar has universal couplings to quarks, the constraints are g q 1 · 10 −8 for m A 100 MeV 2 · 10 −5 for 100 MeV m A 1 GeV . (4.3)

The dark matter connection
So far we have mainly discussed the couplings of the pseudoscalar to SM fields and found that they are strongly constrained by flavour observables. In the following we will concentrate on the pseudoscalar couplings to DM. As we shall see, the presence of a new light particle that can communicate interactions between DM particles and the SM can have important implications for both the generation of DM in the early Universe and the detection of DM in present and future experiments. In this section we discuss the general concepts, while more specific direct and indirect signals will be considered in the following section.

Relic density constraints
Two processes are relevant for the freeze-out of DM in the early Universe: Annihilation into pairs of pseudoscalarsχχ → AA and annihilation into SM fermionsχχ →f f . For a Dirac DM particle, the thermally-averaged annihilation cross section into SM fermions is given by while for the annihilation into pseudoscalars we find Here x = m χ /T , where T is the temperature; we see that the annihilation into SM fermions is s-wave while the annihilation to pseudoscalars is p-wave. Since both annihilation cross sections depend on g χ , we can always eliminate this parameter (for given m A , m χ and g f ) by imposing that our model reproduces the observed abundance of DM. In Fig. 6, we show the dependence of g χ on m A and g f for the fixed value m χ = 10 GeV. While these results are obtained numerically from micrOMEGAs [109], it is instructive to use an approximate freeze-out calculation to estimate the required value of g χ .
First, we consider the case g f g χ so thatχχ → AA dominates. When m A m χ , the relic density requirement reads where x f is the value of x at freeze-out and there is one extra factor of 1/2 for Dirac DM and another since the annihilation is a p-wave process [110]. Taking x f = 20, we obtain which is in good agreement with the value in Fig. 6. As m A → m χ , annihilation into pseudoscalars becomes suppressed due to the reduced phase space, so a larger value of g χ is necessary to compensate.
For larger values of g f the annihilation to SM fermions becomes increasingly relevant. The purple lines in Fig. 6 indicate the transition from freeze-out dominantly into pseudoscalars to freeze-out dominantly into SM fermions. For m A m χ , we can estimate the corresponding value of g f by equating Eq. (5.1) and Eq. (5.2), again taking into account a factor of 1/2 for the p-wave process. Substituting the estimate for g χ from Eq. (5.4), we Again, this estimate is in good agreement with the values shown in Fig. 6. For larger values of g f , the annihilation cross section into pseudoscalars (and hence the coupling g χ ) must be reduced to avoid underproduction of DM.
As we have shown in Figs. 3 and 4, the coupling g f is strongly constrained by precision measurements. Comparing these figures with Fig. 6, we see that in most of the allowed parameter space the dominant contribution to DM annihilation will result from annihilation to pseudoscalars, which is independent of g f . The fact that annihilation into pseudoscalars is p-wave only means that present-day indirect detection signals are unobservably small over much of the parameter space. We will, however, also consider parameter regions where the annihilation cross section into SM fermions is comparable to the thermal cross section during freeze-out and indirect detection signals may be observable.

Thermal equilibrium
As we have seen above, for very small values of g f the DM relic density becomes independent of the SM couplings of the pseudoscalar, since annihilation into pseudoscalars completely dominates for the freeze-out calculation. However, we cannot make the couplings g f arbitrarily small, because the calculation of the relic density relies crucially on the assumption that the dark sector and visible sector are at the same temperature. If the coupling between the two sectors becomes too weak, the energy transfer becomes so scarce that thermal equilibrium between the two sectors may not be achieved. Of course, the dark sector may still thermalise if the intra-sector coupling g χ is large enough, but it will in general have a temperature different from the visible sector. DM can therefore still freeze out into pseudoscalars, but the resulting abundance becomes sensitive to the details of reheating [111]. 5 In this section we estimate the couplings necessary to obtain thermal equilibrium. For smaller couplings, it is no longer possible to reduce our model to three parameters by imposing the relic density constraint, so we will not consider such cases further. Fortunately, we will see that this requirement is not very constraining.
We impose the usual requirement for thermal equilibrium (see e.g. [111]), namely that the reaction rate is larger than the expansion rate of the Universe σv n eq > H , (5.6) where as usual, H = 1.66 √ g * T 2 /M P and n eq is the equilibrium number density. In principle, the left-hand side can refer to both the production of SM particles from the dark sector or DM production from the visible sector (in thermal equilibrium both rates must be equal).
Using Maxwell-Boltzmann statistics for the equilibrium distributions, we obtain [110,111] σv n eq = f σv f f →χχ n f eq (5.7) with where K i (s) are the modified Bessel functions of the second kind and is the annihilation cross section into two Dirac DM particles as a function of the centreof-mass energy √ s. Note that for definiteness we consider the production of DM particles from the visible sector, although the final expression is valid also for the inverse process. 6 We find that the condition for thermal equilibrium is most easily satisfied for T ∼ max(m f , m χ ). The reason is that the cross section decreases proportional to 1/s for large s, so that at high temperatures σv f f →χχ n f eq = N c g 2 For T m χ , m f , on the other hand, the reaction rate is exponentially suppressed. In practice, we check the condition for thermal equilibrium for a range of temperatures around the masses of the particles involved. Nevertheless, it is instructive to obtain a rough estimate by considering the case that T ∼ m χ m f . In this approximation we find σv f f →χχ n f eq ≈ 1.45 N c g 2 f g 2 χ m χ /(128π 3 ) and hence the condition If thermal equilibrium is achieved, we want the annihilation of DM into pairs of pseudoscalars to yield the observed relic abundance. As shown above, this gives the approximate relation g 2 χ /m χ ≈ 0.006 GeV −1 . Substituting into the condition for thermal equilibrium and using g * ∼ 80 yields For Yukawa-like couplings and m b m χ m t , the dominant contribution therefore comes from the top quark at T ∼ m t giving g Y > 3 · 10 −7 v/ √ m t m χ . For universal couplings to all quarks, on the other hand, the contribution of the top quark is negligible, while all light quarks give an equal contribution at T ∼ m χ , leading to the requirement g q 2 · 10 −7 . Fig. 7 shows the minimum coupling values required in order to obtain thermal equilibrium as a function of m A and m χ (with g χ fixed by the relic density requirement). As expected, for m A m χ we find that this lower bound is independent of m A . We also confirm the expectation that for universal quark couplings the bound is largely independent of m χ provided m b m χ m t , while for Yukawa-like couplings the bound is proportional to 1/ √ m χ .

Big Bang Nucleosynthesis
If DM annihilates predominantly into pseudoscalars, these pseudoscalars must decay sufficiently quickly before Big Bang Nucleosynthesis (BBN) to avoid changes to the expansion rate and the entropy density during BBN or the destruction of certain elements. While the precise constraints are rather difficult to calculate (see e.g. [55]), such constraints should not be important if the average lifetime of the pseudoscalar is less than 1 second. As discussed in Appendix C and illustrated in Fig. 2, however, the lifetime of the pseudoscalar can be large, in particular if tree-level decays are suppressed or forbidden. Consequently, BBN constraints are most severe for very light pseudoscalars and cease to be constraining for larger masses.
We show the parameter region where BBN constraints are relevant in Fig. 8 as a function of m A and g f for different values of m χ , fixing g χ such that the observed DM relic density is reproduced. We also show the parameter region where the dark sector does not reach thermal equilibrium with the visible sector. Moreover, Fig. 8 shows the parameter regions relevant for DM self-interactions, DM direct detection and DM indirect detection, which will be discussed next.

Dark matter self-interactions
As we have shown in Fig. 6 above, the relic density constraints require the coupling g χ between DM and the pseudoscalar to be rather large. For DM masses in the range (10-100) GeV, we typically find α χ ≡ g 2 χ /(4π) ∼ O(10 −2 ). If the pseudoscalar is significantly lighter than the DM particle, DM scattering may therefore be in the stronglycoupled regime corresponding to α χ m χ /m A > 1, potentially leading to interesting effects and relevant constraints.
The quantity most relevant for DM self-interactions is the momentum transfer cross section 7 In the weakly-coupled regime, we can calculate the momentum transfer cross section in the Born approximation. For particle-particle scattering, one finds approximately 14) while for particle-antiparticle scattering the low-velocity regime is dominated by s-channel pseudoscalar exchange, giving  Wherever thermal equilibrium can be achieved in the early Universe, g χ has been fixed by the requirement to obtain the observed DM relic abundance. For the left (right) column, we have assumed Yukawa-like couplings to fermions (universal couplings to quarks).
In particular, the Born approximation predicts that for scattering via pseudoscalar exchange there is no enhancement of DM self-interactions for small velocities, which has been invoked to help reconcile observational constraints from galaxy clusters and elliptical galaxies with the self-interaction cross sections favoured by small-scale systems [48,49,53]. In addition, the absence of an enhancement at small velocities means that, as long as α χ m χ /m A < 1, the momentum transfer cross section is significantly too small to lead to any observable effects in astrophysical systems.
Nevertheless, a very different behaviour can be expected in the strongly-interacting regime [49,55]. To find the momentum transfer cross section in this case, one needs to solve the Schroedinger equation for the potential created by pseudoscalar exchange. It is well known from pion-nucleon interactions that the relevant potential is the so-called one-pion-exchange potential [113]: This potential depends on the spin configuration of the two scattering DM particles through the spin variables σ 1,2 . For a singlet configuration, the second line of Eq. (5.16) vanishes and we obtain an attractive Yukawa-like potential (together with a repulsive contact interaction). For a triplet configuration, on the other hand, the potential at small distances (i.e. m A r 1) is dominated by the tensor contribution in the second line, resulting in an attractive 1/r 3 potential, which is singular at the origin.
The one-pion-exchange potential has been discussed in the context of DM both regarding a potential Sommerfeld enhancement of DM annihilation [114] and an enhancement of DM self-interactions [115], both references being careful to renormalise the divergences resulting from the singular potential. The conclusion is that resonances can significantly boost the interaction rates at low velocities, even though significant tuning may be required to obtain a sufficiently large momentum transfer cross section in small-velocity systems [115].
Although DM self-interactions via pseudoscalar exchange are significantly more involved than for scalar exchange, the latter are strongly constrained when imposing relic density constraints and BBN constraints, because the predicted event rates in direct detection experiments are typically too large [116] (see [117] for how to potentially evade these constraints). As we will see below, our scenario is much less constrained, since direct detection constraints are largely absent. Consequently, large DM self-interactions via pseudoscalar exchange are a viable and very interesting possibility. We leave a detailed study of the resulting effects to future work and simply show in Fig. 8 the parameter region corresponding to strong coupling as an indication of where DM self-interactions may become important.
We observe from Fig. 8 that for the case of Yukawa-like couplings and the values of m χ that we consider, it is always possible to choose g Y such that strong DM selfinteractions can be obtained while evading both flavour constraints and BBN constraints. For m A < 2 m µ the required value is typically g Y ∼ 10 −4 . For 2 m µ < m A < m K − m π the strong constraints from CHARM require g Y ∼ 10 −5 , while for m A too heavy to be produced in kaon decays it is typically sufficient to have g Y ∼ 10 −3 . For the case of quark universal couplings, on the other hand, we find that flavour constraints exclude the entire range of g q that lead to thermal equilibrium between the dark and the visible sector, unless m A > m K − m π . Consequently, in order to obtain strong DM self-interactions, we must have m A 350 MeV, m χ 30 GeV and 10 −7 g q 10 −4 .

Direct detection
DM that scatters though the exchange of a pseudoscalar mediator leads to a spin-dependent interaction at direct detection experiments. As we show in Appendix D, the differential cross section to scatter off a nucleus is given by where E is the nucleus recoil energy, v is the DM speed, m N and m T are the mass of the nucleon and the mass of the target nucleus respectively, F N,N Σ is the spin form-factor and the coefficients g N are functions of g f . Crucially, the differential event rate also depends on the momentum transfer q. This is in contrast to the cross section obtained from a scalar or vector mediator. Typically, the momentum transfer in DM-nucleus scattering is approximately q ∼ µ v ∼ 100 MeV, where µ is the reduced mass of the DM-nucleus system and v 10 −3 c is the typical speed of DM particles in the Galactic halo. For m A q, the differential cross section is proportional to q 4 /(m 2 χ m 2 N ), leading to a suppression of differential event rates by up to twelve orders of magnitude. At face value, direct detection signatures with a pseudoscalar mediator are therefore unobservably small and as a result this interaction has frequently been neglected altogether.
For the typical parameters that we consider, however, the mass of the pseudoscalar can be comparable to -or smaller than -the momentum transfer. In this case, the q dependence of the propagator must be correctly accounted for i.e. the factor (m 2 A + q 2 ) −2 cannot be neglected. It then becomes evident that for m A q the momentum suppression in the numerator is cancelled by a small denominator and the differential event rate can become large enough to lead to observable signatures.
To compare the current bounds from direct detection experiments with our other constraints, we consider the recent results from the LUX experiment [2]. 8 Although there is no published limit for this case, it is straightforward to derive a bound at 90% confidence level. We follow the analysis strategy described in [121] based on the 'pmax' statistic [122] and we have checked that the binned likelihood method from [123] gives very similar results. We take the xenon form factors F N,N Σ from [124], the Earth's velocity from [125,126], 8 The SIMPLE [118], PICASSO [119] and COUPP [120] experiments also give similar constraints. assume the Standard Halo Model and use v 0 = 230 km/s, v esc = 550 km/s and ρ = 0.3 GeV/cm 3 for the astrophysical parameters [127].
The parameter regions excluded by LUX constraint are shown in Fig. 8. As expected, we find that the LUX limit is more constraining for smaller values of m A because of the enhancement by a small denominator. For the case of Yukawa-like couplings (left panels), LUX excludes values of g Y O(10 −1 ). For the case of universal couplings to quarks (right panels), LUX provides no constraints i.e. the LUX constraint on g q g χ is higher than the values required to obtain the observed DM abundance in all parameter regions. It is obvious from this figure that for all the scenarios we consider, LUX cannot probe the parameter region allowed by flavour constraints. In fact, the allowed parameter region is out of reach even for next-generation direct detection experiments.

Indirect detection
Indirect detection experiments are sensitive to the s-wave annihilation of DM into SM fermionsχχ →f f . Our estimate of the sensitivity of near-future searches for gamma rays from DM annihilation in Dwarf Spheroidal Galaxies using Fermi-LAT data is shown in Fig. 8. This estimate is based on the preliminary 95% confidence level bound on σv χχ→bb presented in [128] and the additional assumption that a similar bound will apply for annihilation into light quarks, i.e. σv χχ→qq ∼ σv χχ→bb . This assumption is in agreement with previous results from the Fermi-LAT collaboration [129].
In Fig. 8 we have fixed g χ by the requirement to obtain the observed DM relic abundance. It is therefore not a surprise that the Fermi-LAT bound constrains the parameter region where s-wave annihilation into SM fermions dominates (see Fig. 6), since the p-wave annihilation into two pseudoscalars is unobservably small. The annihilation cross section constrained by Fermi-LAT is larger than the thermal cross section (i.e. 2.5 · 10 −26 cm 3 /s) for DM mass above approximately 100 GeV. This is why there is no Fermi-LAT constraint for the m χ = 100 GeV cases in Fig. 8. We see clearly that the indirect detection constraints only become more constraining than the flavour constraints above the B meson mass, when the flavour constraints lose sensitivity.

Implications for dark matter signals
In the previous section we have focussed on the general bounds that can be placed on thermal DM. However, there are two longstanding experimental signatures that are consistent with a DM origin. The first is the DAMA annual modulation signal [130,131] and the second is the Galactic Centre excess of gamma rays observed by Fermi-LAT [14][15][16][17][18][19][20][21][22]24]. In both cases, pseudoscalar mediated interactions between the DM and SM fermions have been proposed to explain the putative signals. In this section we compare the flavour constraints discussed above to the respective preferred parameter regions.

DAMA
The DAMA modulation signal remains a longstanding puzzle. Under the standard assumptions of spin-independent contact interactions, many other direct detection experiments exclude the preferred DM scattering cross section by many orders of magnitude [1][2][3][4]. It was pointed out in [13] that pseudoscalar exchange may be one avenue for reconciling these direct detection experiments. For DM scattering via pseudoscalar exchange, both the shape of the recoil spectrum and the relevant nuclear matrix elements and form factors differ from the standard analysis of direct detection experiments under the assumption of spin-independent contact interactions. In particular, in the non-relativistic limit the pseudoscalar couples more strongly to the proton spin than to the neutron spin. 9 The resulting effects lead to a large enhancement of the signal expected in experiments such as DAMA, which have nuclei with unpaired protons, compared to experiments such as LUX, which contain nuclei with unpaired neutrons.
As emphasised in Sec. 5.5, in order to obtain observable scattering rates from pseudoscalar exchange in any direct detection experiment, the mediator mass must be comparable to the momentum transfer q. Once the mediator mass becomes smaller than the typical momentum exchange in DM scattering, the event rate becomes independent of m A and decreasing the mediator mass further does not lead to an additional enhancement. In the case of DAMA, an observed energy of E ee = (2-4) keV corresponds to a momentum transfer for scattering off iodine and sodium of q I = 2 m I E ee /Q I = (70-100) MeV and q Na = 2 m Na E ee /Q Na = (17-24) MeV respectively, where we have assumed quenching factors of Q I = 0.09 and Q Na = 0.3 and ignored channeling [132]. Consequently, the transition between contact interactions and long-range interactions for DM scattering off iodine occurs around m A ∼ 100 MeV. This observation is in contrast with the treatment in [13], where contact interactions are assumed to remain valid down to mediator masses of around 30 MeV.
To analyse DAMA, we include the energy resolution from NaI and LIBRA weighted appropriately and perform a goodness-of-fit test to 12 equally spaced bins between 2 keVee and 8 keVee. We take the sodium and iodine form factors from [124] and use the astrophysical parameters mentioned in Sec. 5.5. We find that, in order to fit the annual modulation observed by DAMA, two mass values are preferred. For scattering dominantly off iodine, we require m χ ≈ 36 GeV, while for scattering dominantly off sodium, m χ ≈ 9 GeV. Although we focus our discussion on the case of scattering off iodine here for clarity the conclusions are the same for scattering off sodium. Furthermore, we focus on the case where the pseudoscalar couples either universally to all quarks or only to the third generation, since the suppression of LUX is strongest in this case [13].
For m χ ≈ 36 GeV and universal quark couplings, we find that the pseudoscalar couplings must approximately satisfy the relation (6.1) 9 The precise value of the ratio of neutron coupling to proton coupling, gn/gp, depends sensitively on the assumed current quark masses and nuclear matrix elements. By varying these parameters within their 1σ range, we find for Yukawa-like couplings −0.4 gn/gp 0 and for universal couplings −0.25 gn/gp 0.2. In the following, we adopt the values from [13] and take gp/gn = −4.1 for Yukawa-like couplings and gp/gn = −16.4 for universal couplings.
Substituting these values into the expression for σv χχ→qq in Eq. (5.1), we find that for any value of m A , the couplings favoured by DAMA predict an annihilation cross section significantly larger than the thermal cross section, i.e. σv χχ→qq 2.5 · 10 −26 cm 3 /s. In other words, to account for the DAMA modulation, the couplings must be so large that DM annihilates too efficiently in the early Universe and will be underproduced even when we neglect the direct annihilation into pseudoscalars. The presence of this additional annihilation channel will only make the tension worse. Assuming that the pseudoscalar couples only to the third generation of quarks (a scenario advocated in [13]) also does also not help to reduce the tension.
We conclude that within our model the DAMA signal is incompatible with the assumption that DM is a thermal relic. Even if we were to invoke a different production mechanism for DM, the resulting annihilation cross section would still be so large that the favoured parameter region can be excluded by indirect detection experiments [133]. Nevertheless, the indirect detection constraints can be avoided and the observed DM relic density can be obtained if the dark sector has an initial particle-antiparticle asymmetry similar to the baryon asymmetry of the visible sector. In the presence of such an asymmetry, all anti-particles annihilate away so that the final DM number density is entirely determined by the asymmetry [134][135][136][137][138].
While the constraints on the DM annihilation cross section can thus be avoided, the flavour constraints presented in Sec. 4 still apply. Of course, these constraints only probe the quark coupling g q . Nevertheless, we cannot make g χ arbitrarily large and therefore g q cannot be arbitrarily small if we want to keep the product g q g χ fixed to the value preferred by DAMA. In Fig. 9 we consider the extreme case g χ = √ 4π and show that even then, the values of g q required to account for the DAMA modulation are excluded by many orders of magnitude. Fig. 9 also shows the naive extrapolation of contact interactions down to small mediator masses. In agreement with [13] we find that this line crosses the relic density constraint around m A ∼ (30-40) MeV.
Let us briefly discuss whether other possible interactions can improve the agreement between DAMA and the flavour constraints. In this context it is important to note that the enhancement of DAMA compared to LUX is largely a result of how the pseudoscalar couples to SM quarks, i.e. it is mostly independent of the DM-pseudoscalar interactions. Therefore a possible modification of the dark sector is to consider a CP -violating coupling between DM and the pseudoscalar, since the resulting coupling structure between DM and nuclei is the same as in the CP -preserving case [139]. We therefore now introduce an additional CP -violating coupling between DM and the pseudoscalar: Since the term proportional to g S χ violates CP , one would expect that g P χ g S χ , so that this additional term does not significantly change the freeze-out of DM. In particular, g S χ does not induce s-wave annihilation of DM into pseudoscalars. Moreover, the contribution of g S χ to the annihilation of DM into quarks is p-wave suppressed and hence completely negligible. Nevertheless, even a small CP -violating coupling will significantly change the predictions for direct detection experiments [10]. The reason is that the additional term drastically reduces the momentum suppression. While we previously considered differential event rates proportional to (g P χ ) 2 g 2 N q 4 /(16 m 2 χ m 2 N ), the new term will introduce an additional contribution proportional to (g S χ ) 2 g 2 N q 2 /(4 m 2 N ) (see Appendix D). This new contribution will dominate, and therefore enhance the differential event rate, as soon as g S χ /g P χ > q/(2 m χ ) ∼ 10 −3 . The modified momentum dependence implies that experiments probing large momentum transfer receive a smaller enhancement than for the case g S χ = 0. Indeed, for m 2 A q 2 , the cross section is proportional to 1/q 2 , thus favouring experiments with low energy thresholds and light target materials. In the case of DAMA, we find that for small mediator masses, DM scattering on sodium always dominates over scattering on iodine, even for heavy DM. Consequently, we no longer obtain two separate best-fit DM masses corresponding to scattering on the two different isotopes, but rather there is now only the low-mass solution with m χ 10 GeV.
In Fig. 10 we show the DAMA best-fit region for this DM mass and two different scenarios. In the left panel, we fix g S χ = 0.01, so that the CP -violating coupling plays no role in cosmology. Furthermore, we assume that the pseudoscalar has universal couplings to all quarks and that CP is not violated in the visible sector. In the right panel of Fig. 10, we consider a more optimistic choice, namely g S χ = 0.1. As long as g S χ 0.3, DM can still be a thermal relic, provided g P χ ∼ g S χ , and there are no severe constraints The left-hand side assumes universal quark couplings and g S χ = 0.01, the right-hand side assumes couplings only to the third generation and g S χ = 0.1. The (Na) indicates that scattering in DAMA is off sodium. The flavour constraints exclude the DAMA region by several orders of magnitude. from DM self-interactions. We make the interesting observation that for sufficiently low pseudoscalar masses, the enhancement for small momentum transfer is sufficient to reconcile DAMA and LUX. However, even if we make the additional optimistic assumption that the mediator couples only to the third generation of quarks, the parameter region favoured by DAMA remains solidly excluded by flavour constraints. Thus even a CP -violating coupling (together with additional optimistic assumptions) is insufficient to explain the DAMA modulation while at the same time evading flavour constraints.

Fermi Galactic Centre excess
Using data from Fermi-LAT, a number of independent groups have reported an excess in gamma rays at the Galactic Centre above the expected astrophysical emission [14][15][16][17][18][19][20][21][22]24]. The excess is largest for gamma-ray energies around a few GeV and it has many features, including the morphology and radial profile, expected from DM annihilating to SM fermions. Previous analyses have shown that the preferred annihilation cross section is around 10 −26 cm 3 /s for annihilation into SM fermions. Uncertainties in the DM halo parameters translate to about a factor five uncertainty in the annihilation cross section [24], so we do not quote a more precise value.
A majority of the proposals to explain the Galactic Centre excess within a particle physics model have included DM annihilation into SM fermions mediated by a pseudoscalar mediator [11][12][13][25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40]. This annihilation process is s-wave and the annihilation cross section compatible with the excess are easily achievable. An advantage of a pseudoscalar mediator, which we have discussed above, is that scattering at direct detection searches  Figure 11. The parameter region favoured by the Galactic Centre excess (green). As before, g χ has been fixed by the requirement to obtain the correct DM relic abundance. In all cases, an interpretation of the Galactic Centre excess with a pseudoscalar mediator requires m A 5 GeV. The lighter region corresponds to σv χχ→qq > 2 · 10 −26 cm 3 /s, the darker to σv χχ→qq > 10 −26 cm 3 /s. is suppressed. It is therefore relatively straightforward to avoid constraints from these experiments.
To determine the preferred DM mass for the different coupling structures, we generate the gamma rays from DM annihilation with Pythia 8.186 [140] and fit to the spectral shape of the Galactic Centre excess data in [22,24], which is sensitive to the mass, but not the normalisation of the signal. We find m χ = 43.5 +5.2 −4.5 GeV, m χ = 44.9 +5.3 −4.6 GeV, m χ = 28.1 +3.6 −2.8 GeV and m χ = 45.9 +5.4 −4.7 GeV for the case of Yukawa-like couplings, quark Yukawa-like couplings, universal quark couplings and universal couplings only to the quark third generation, respectively.
As already mentioned, owing to the large uncertainties in the DM halo we do not attempt to reproduce a precise value of the cross section. Instead, Fig. 11 indicates the parameter regions that give σv χχ→qq > 10 −26 cm 3 /s (dark green) and σv χχ→qq > 2 · 10 −26 cm 3 /s (light green), both of which are consistent with the cross section required to fit the Galactic Centre excess. The upper value of the cross section is chosen because it is just excluded by the preliminary Fermi-LAT Dwarf Spheroidal limit presented in [128]. In this figure we have again fixed g χ by the requirement to obtain the observed relic abundance. Therefore the dark green region indicates parameter space that can fit the Galactic Centre excess, be consistent with the preliminary Fermi-LAT Dwarf Spheroidal limit and obtain the observed relic abundance. This demonstrates a second advantage of DM models with a pseudoscalar mediator. The s-wave annihilation of DM into SM fermions can be below the thermal value (i.e. 2.5 · 10 −26 cm 3 /s) so that it is not in tension with the Fermi-LAT Dwarf limits, while the larger p-wave annihilation into pseudoscalars then ensures that the observed relic abundance is achieved.
Comparing the Galactic Centre excess region with the bounds from rare decays presented in Sec. 4, we see from Fig. 11 that an interpretation of the Galactic Centre excess in terms of a pseudoscalar mediator requires m A 5 GeV. For mediator masses above this, we also see that it is impossible to obtain large enough DM self-interaction cross sections or large enough event rates in direct detection experiments to confirm these models with these search strategies.

Conclusions
Pseudoscalar mediators coupling the visible and dark sectors are interesting from both the model-building and phenomenological perspectives. In this article we have investigated constraints on light pseudoscalar mediators using results from flavour physics, which is less studied as a tool to constrain theories and properties of dark matter. In particular, we have calculated the branching ratios for various rare meson decays resulting from the loop-induced flavour-changing couplings of the pseudoscalar. In this context, we have also discussed general theoretical problems that can arise from assuming an arbitrary coupling structure not consistent with Minimal Flavour Violation.
We have then focussed on the various decay channels for the pseudoscalar (see Fig. 2) and how the resulting experimental signatures can be constrained with existing searches (see Tab. 1). The limits we obtain on the couplings of the pseudoscalar to the Standard Model are summarised in Figs. 3 and 4 for a variety of pseudoscalar coupling structures: Yukawa-like (both including and excluding couplings to leptons), universal to quarks only and universal to third quark family only. The case of invisibly decaying pseudoscalars is also strongly constrained (see Fig. 5).
Our results also suggest new searches which could be performed to tighten the limits on the parameter space. In particular, a search for B → Kγγ would set constraints on models where the mediator couples only to quarks and has a mass below the three-pion threshold. Displaced vertex searches for K L → π 0 + − and B → K + − would also help to improve limits for smaller mediator couplings which lead to longer decay lengths.
Cosmological and astrophysical measurements, on the other hand, allow us to set constraints on the direct couplings of such a pseudoscalar to dark matter and on the interactions between dark matter and Standard Model quarks mediated by it. In particular, it is generally possible to adjust the couplings such that dark matter freeze-out yields the observed dark matter relic density (see Fig. 6). Doing so allows us to directly compare cosmological and astrophysical constraints with the bounds from rare decays and beam dump experiments and the results from traditional searches for dark matter in direct and indirect detection experiments. We present this comparison in Fig. 8 and find that the flavour constraints are substantially stronger. While we have focussed on Dirac dark matter, similar results also hold for the Majorana case.
From these results we may draw a number of interesting conclusions. Firstly, a light pseudoscalar mediator can induce strong dark matter self-interactions via non-perturbative effects. These effects will be largest for small mediator masses, which is exactly the parameter region most tightly constrained by rare decays. Consequently, large self-interactions will only be viable if the pseudoscalar couples very weakly to Standard Model particles, so that it does not seem possible to obtain both large self-interactions and at the same time a dark matter signal from direct or indirect detection experiments given current bounds (see Fig. 8). Due to the complexity of the pseudoscalar potential, we have not attempted to fully characterise the regions in parameter space which may explain the various small-scale structure discrepancies but leave this problem to future work.
Secondly, pseudoscalar mediators have also been of interest since they naturally lead to suppressed event rates in direct detection experiments ('Coy Dark Matter'). While this suppression helps to evade the strong bounds from experiments such as LUX, observable event rates in direct detection experiments may still be obtained if the mediator is sufficiently light. In particular, it was recently suggested that a O(10 2 ) MeV pseudoscalar mediator could explain the DAMA modulation. Our results not only rule out such an interpretation, but indeed the possibility of any direct detection signal observed in the foreseeable future being due to pseudoscalar exchange (see Fig. 9).
Finally, while the direct detection cross sections for pseudoscalar mediators are suppressed, the corresponding annihilation cross sections are not, making them of interest in explaining the Galactic Centre excess seen in the Fermi-LAT data. We find that in order for such an explanation to be consistent with flavour constraints, the mediator mass must be greater than the mass of the B mesons i.e. m A 5 GeV (see Fig. 11). Improved measurements of BR(B s → µ + µ − ) provide an exciting opportunity for extending this bound to larger mediator masses.
While we have focussed in this article on pseudoscalar mediators, rare decays should also set strong constraints on light scalar mediators, which have also been studied in the context of self-interactions. Indeed, our work suggests that a detailed comparison of the two cases together with an analysis of the pseudoscalar potential will be very relevant in this context. A dedicated in-depth study of measurements from proton beam-dump experiments would also be highly interesting. Finally, it would certainly warrant further investigation how the particular coupling structures of the pseudoscalar considered in this paper may be obtained from a more complete high-energy theory and how our calculations would be modified in a given UV completion.
At first sight, flavour physics appears unrelated to the puzzle of dark matter. Whenever the mediator of the interactions between dark matter and Standard Model particles is light, however, constraints from rare decays can give very relevant and highly complementary information. In such cases our study demonstrates the need to take these constraints into account when constructing phenomenological models in order to predict or explain signals in direct or indirect detection experiments. Clearly, input from flavour physics is a valuable and underappreciated tool in the challenge to identify the properties of dark matter.

A Rare meson decays
In this appendix we provide the expressions for the partial decay widths of various rare meson decays in terms of the effective couplings h S qq and h P qq defined in Sec. 2.1.

Kaon decays
Under the assumptions of CVC and PCAC, the decay width for K + → π + A is [63]: For a more complete calculation, the expression above should be multiplied by a form factor |f K + 0 (m 2 A )| 2 but this form factor is close to unity [141] so we neglect it. Note that the pseudoscalar coupling proportional to h P ds does not contribute to this decay. The decay width for K L → π 0 A can be written as [141][142][143]: 10 There is a similar expression for the decay B → K A [56,65,144]: This time, however, the form factor is approximately f B 0 0 ∼ 0.3-0.4 and therefore cannot be neglected [145]. Here we use the parameterisation from [146]: , which should be accurate within 10%.
To avoid these theoretical uncertainties, it is possible to study the more inclusive decays B → X s A, where X s can be any strange meson. One then finds [56] Γ(B → X s A) = 1 8π The decay B s → A * → µ + µ − will give relevant constraints for the case of non-vanishing coupling to muons, e.g. for Yukawa-like couplings. In order to compare the contribution of the pseudoscalar mediator to the SM prediction, it is convenient to write the amplitude in the form where C P is given in terms of the effective coupling h P as . (A.7) We then obtain the simple relation [147] BR where C SM 10 = −4.103 and we have neglected terms due to the width difference in the B s system, which lead to a correction of about 10% [148].
Substituting the expression for C P we therefore obtain For Yukawa-like couplings, g µ should be replaced by g Y √ 2 m µ /v.  Figure 12. The form factor F(m A ) for Υ decays (solid purple). We assume that the QCD and relativistic effects factorise such that F(m A ) = F QCD · F rel . Green dashed and cyan dotted show F QCD and F rel respectively.

Radiative Υ decays
In contrast to the decays considered above the radiative decay Υ → γA does not probe the flavour-changing couplings h S,P qq , but the tree-level coupling g b , where g b = g Y √ 2 m b /v for Yukawa-like couplings and g b = g q for universal quark couplings. The easiest way to remove theoretical uncertainties is by considering the ratio [149] BR(Υ(nS) → γA) where F(x) is a form factor which parameterises the effects of higher order QCD contributions [150,151] and relativistic corrections. 11 We assume that the QCD effects and relativistic effects 'factorise', i.e. that F = F QCD · F rel . We take the QCD corrections for a pseudoscalar from [68,150], keeping in mind that these corrections are accurate only if m A is not too close to m Υ . Following [68] we write where C F = (N 2 − 1)/(2N ) = 4/3, α S (m Υ ) ≈ 0.184 [154] and F A (x) can be extracted from Fig. 1 of [68] (see also [150]). For m A m Υ we find F QCD (x) ≈ 0.5, in agreement with [69].
For large pseudoscalar masses, on the other hand, the dominant effect comes from the relativistic corrections, which suppress the effective form factor significantly. We take these corrections from [155] for m A 7 GeV, and from [156] for m A 7 GeV. Note that these corrections were calculated for a scalar rather than a pseudoscalar, but lacking an explicit calculation for the latter we assume both cases to be similar. We show the individual contributions and the total form factor F(m A ) in Fig. 12.

B Decays of the pseudoscalar mediator
In this appendix we discuss the various decay channels of the pseudoscalar A (see also [56,57]). As before, we use g f to denote the coupling of A to SM fermions. For Yukawa-like couplings one has g f = g Y √ 2m f /v, which depends on the fermion mass m f .

Leptonic and Photonic Decays
For m A ≤ 3 m π , the pseudoscalar can only decay into pairs of leptons and pairs of photons. One finds [157] Γ For universal quark couplings and small pseudoscalar masses, light quarks give the dominant contribution to the decay of the pseudoscalar into photons. The partial decay width therefore depends sensitively on the assumed values of the quark masses in the loop (i.e. the values of τ used to evaluate F A ). Instead of using the current quark masses, we take τ u = τ d = m 2 A /(4 m 2 π ) and τ s = m 2 A /(4 m 2 K ). We have checked that this choice approximately reproduces the results from chiral perturbation theory [158] for the scalar case.

Hadronic decays
When m A > 2 m π , decays into hadrons become kinematically allowed. However, since the decay A → ππ is forbidden by CP symmetry, the pseudoscalar can only decay into threebody final states, such as πππ for m A > 3 m π . This observation differs from the results of [70], which are based on NMHDECAY [159]. In these references, the pseudoscalar hadronic decay width is taken to be the same as the A → gg partial width even below Λ QCD , and in fact all the way down to m A = 0 MeV. Nevertheless, once m A > 3 m π , there will be hadronic decays. For the case of Yukawalike couplings we use the perturbative spectator model to obtain an estimate of the partial decay widths. In contrast to the case of a scalar boson, the decay of a pseudoscalar proceeds via s-wave rather than p-wave. Consequently, we replace the factor (1 − 4m 2 π /m 2 A ) 3/2 /(8π) appearing for a SM-like Higgs boson by the phase space for isotropic three-body decays, ρ(m A , m π , m π , m π ), to obtain The parameters for the perturbative spectator model are inferred from matching to chiral perturbation theory at about 1 GeV. We follow [68] and use m d = m u = 50 MeV. An analogous expression is obtained for the decays A → KKπ and A → DDπ, e.g.
where we take m s = 450 MeV [68]. Finally, there are also loop-induced decays into gluons, which again hadronise into (at least) three pions. To obtain an approximate expression for this decay, we consider k heavy quarks, where we take k = 4 for 3 m π < m A < m K and k = 3 for m A > m K . Taking furthermore into account that the pseudoscalar form factor asymptotically yields F A (0) = 2, while the scalar form factor gives F S (0) = 4/3 we obtain where we take α s = 0.47 [68]. For the case of universal quark couplings, we do not attempt to derive corresponding expressions. Instead, we only assume that the partial width for decays into hadrons dominate over the one for decays into photons above the three-pion threshold. This assumption is fully sufficient, since our bounds are not sensitive to the total width of the pseudoscalar, as long as it is small compared to the mass so that the narrow width approximation is valid.

C Pseudoscalar total decay width and lifetime
For sufficiently small values of m A , more specifically m A < 2 m µ for Yukawa-like couplings and m A < 3 m π for couplings only to quarks, the pseudoscalar will typically have a very small decay width, and consequently a very long lifetime. In addition, light pseudoscalars produced in meson decays will often be highly boosted so that their decay length can become comparable to the spatial dimensions of the detector. Such a long decay length can lead to two important effects: displaced vertices and escaping particles.
If a pseudoscalar is produced with momentum p A in the laboratory frame, its decay length will be [57,99] Defining l max (p A ) as the spatial extension of the detector from the interaction point in the direction of p A , the probability for such a particle to escape from the detector without decaying is then given by To determine the total probability for a pseudoscalar to escape, we need to multiply this expression with the predicted momentum distribution in the laboratory frame, f (p A ), and integrate over p A .
If the mother particle (which can be K, B or Υ) is approximately at rest in the laboratory frame and for a two-body decay, the expression above can be significantly simplified, because p A ≡ |p A | is fixed. We then obtain In many of the experiments that we consider, however, the mother particles will travel in a collimated beam with high energy. In this case, Eq. (C.3) will remain valid in the transverse direction, with l max replaced by the transverse dimension l T,max . In the direction of the beam there is an additional factor 1/γ M in the exponent reflecting the boost of the mother particle. For typical forward detectors, such as NA48/2, this additional factor is precisely cancelled by the larger spatial extend of the detector in the direction of the beam. Consequently, we can use the same expression also for decays along the direction of the beam, taking the transverse dimensions of the detector l T,max instead of l max everywhere. Nevertheless, the approximation in Eq. (C.3) is not sufficient for high-energy colliders like LHCb, where we have to include the factor 1/γ M explicitly. We will determine an appropriate value for γ M below. If the pseudoscalar produced in a meson decay escapes from the detector without decaying, one would typically only observe a single decay product with large unbalanced momentum. These kinds of decays are strongly constrained, giving us a unique opportunity to test pseudoscalars with very small couplings. Another possibility is that the pseudoscalar decays inside the detector, but at a visible distance from the interaction point. While such displaced vertices are in principle a very promising way to search for weakly-coupled pseudoscalars, most existing searches will discard these events as background, since the event reconstruction would yield a vertex with too low quality.
If p A is the momentum of the pseudoscalar in the laboratory frame and l min (p A ) is the vertex resolution in the direction of p A , the probability for the decay of A to yield a sufficiently high quality vertex is approximately As before, we will evaluate this expression in the rest frame of the mother particle. Again, it is a reasonable approximation for forward detectors to use the transverse vertex resolution l T,min instead of l min (p A ). 12 We then find To check our approximation, we have compared our results for P prompt with the more detailed study of displaced vertices in [99]. We find that for BELLE the predictions agree well without the need to introduce an additional factor γ M > 1. In the case of LHCb the B mesons are much more boosted, so our approximation is no longer valid. Nevertheless, we can recover the values of P prompt shown in Fig. 4 of [99] by setting γ M ∼ 20, which corresponds roughly to the average boost of the B mesons.
Although it is possible that the lifetime of the pseudoscalar is such that the fraction of decays from displaced vertices is large, there will always be either a sizeable fraction of prompt decays or a sizeable fraction of escaping particles. Obviously, the two requirements are highly complementary in the sense that it is impossible to reduce the fraction of escaping particles (e.g. by increasing the coupling strength) without at the same time increasing the fraction of prompt decays. By combining searches for both prompt decays and missing momentum, we are able to probe many orders of magnitude in coupling strength.

D Direct detection from pseudoscalar exchange
In this appendix we derive the scattering cross section at direct detection experiments for a pseudoscalar A that interacts with the DM and SM particles through the interaction terms in Eqs. (2.1) and (2.2). Let us for the moment make the usual assumption that the pseudoscalar is heavy compared to the typical momentum transfer q in DM direct detection experiments, i.e. m 2 A q 2 , where the momentum transfer is related to the target nucleus mass m T and the recoil energy E by q = √ 2m T E ∼ 100 MeV. In this case, the pseudoscalar can be integrated out and the effective DM-quark interaction is L q = c qχ iγ 5 χqiγ 5 q , (D.1) where c q = g χ g q /m 2 A . Here we make no assumption on the coupling structure of g q . Following the notation from [124], the DM-nucleon interaction is given by (D.4) The spin-averaged matrix element can therefore be written as and C(j χ = 1/2) = 1. The from factors F N,N Σ for various target nuclei are tabulated in the appendices of [124]. Bringing these results together, we obtain our final result for the scattering cross section: which agrees with the result in [13].
For the parameters we consider, the mass of the pseudoscalar can be comparable to (or even smaller than) the typical momentum transfer. In this case, the overall pre-factor m −4 A (absorbed into the coefficients c N ) must be replaced by (m 2 A + q 2 ) −2 . Defining which is identical to the expression above for m 2 A q 2 , but becomes independent of both q 2 and m 2 A for m 2 A q 2 .

CP violating interaction
We also briefly discuss the case where the DM and pseudoscalar are coupled with a CPviolating coupling. Integrating out the mediator leads to the effective DM-quark interaction LC P = d qχ χqiγ 5 q , (D.10) where d q = g S χ g q /m 2 A . Performing the same procedure as in the case above, we obtain the final result dσC P dE = m T 8π where the effective nucleon coupling g N is again given by Eq. (D.8).