Dark Matter Bound State Formation in Fermionic $Z_2$ DM model with Light Dark Photon and Dark Higgs Boson

If fermionic dark matter (DM) is stabilized by dark $U(1)$ gauge symmetry that is spontaneously broken into its subgroup $Z_2$, the particle contents of the model becomes very rich: DM and excited DM, both of them are Majorana fermions, as well as two dark force mediators, dark photon and dark Higgs boson are naturally present due to the underlying dark gauge symmetry. In this paper, we study the DM bound state formation processes within this scenario, assuming both dark photon and dark Higgs are light mediators and including the effects of excited DM. The Goldstone boson contributions to the potential matrix in the Schr\"{o}dinger equations are found to be important. The emissions of a longitudinal vector boson (or somehow equivalently a Goldstone boson) during the DM bound state formations are crucial to induce a significant reannihilation process, reducing the dark matter relic abundance. Most of the stringent constraints for this kind of dark matter considered in the literature are simply evaded.


Introduction
A number of cosmological observations through gravitational interaction indicate that about 25% of the energy budget of the current Universe consists of nonbaryonic dark matter (DM). So far almost nothing is known about the physical nature of DM: the number of DM species in the universe, their masses and spins, and their interactions among themselves and with the Standard Model (SM) particles. These can be revealed only by nongravitational observation of physical effects related with DM particles. And various types of DM searches have been performed all around the globe.
From the view point of particle physics described by quantum field theory, one of the most important and fundamental properties of DM is that it should be absolutely stable or long-lived enough in order to make DM of the Universe. Let us remind ourselves that electron stability in the SM is related with unbroken U (1) em and massless photon. The longevity of proton is also attributed to the baryon number being an accidental global symmetry of the SM and being broken only by dim-6 operators. Likewise, one can assume that the absolute stability of DM is due to some local dark gauge symmetry, and long-lived DM is due to some accidental global symmetry of the underlying dark gauge symmetry 1 . Then this class of DM models come with extra particles such as dark photon (or dark gauge bosons), dark Higgs and sometimes excited DM because of the underlying local dark gauge symmetries. Depending on the mass scales of these new particles and their interaction strengths, one can imagine new interesting phenomenology would be anticipated in the dark sectors. In particular if some of them are light, they can play the role of light mediators which are often introduced to the DM phenomenology in order to solve some puzzles in the vanilla CDM paradigm. In short one can introduce light mediators in order to keep gauge invariance, which is well tested principle in the SM.
In the literature, based on the weakly interacting massive particle (WIMP) models 2 , scalar/vector light force mediators interchanged among the dark matter particles are sometimes introduced as a solution to some problems that the vanilla CDM models encounter. Such models are called the self-interacting dark matter (SIDM). The attractive/repulsive forces between the two dark matter particles can enhance/reduce the annihilation cross section times the relative speed (σv) which is an important input for determination of both thermal relic density and indirect detection signatures of DM. This is called the Sommerfeld effect 3 . If the annihilation rate of the dark matter particles in our galaxy is boosted by this effect, SIDM might become a solution to the positron excess observed by PAMELA and AMS-02 [17][18][19]. SIDM also provides the potential to resolve the "missing-satellite problem" [20,21], the "core-cusp problem" [22,23], and the "too-big-to-fail problem" [24][25][26]. These problems are beyond the scope of this paper, and due to the controversaries on these issues [27][28][29], we do not consider these effects, but only point out that the SIDM models are stringently constrained by the CMB distortion observations [30]. The cluster observations and similations, e.g., the bullet cluster also constrains the self-interaction parameters of the dark matter particles [31][32][33][34][35][36][37].
The Sommerfeld effects are the resonant effect of the so-called "zero-energy" bound state of a dark matter particle pair. If the interaction is sufficiently strong and the mediator is sufficiently light, the dark matter can also form a real bound state while emitting a mediator particle to keep the energy conservation. Many of the models are built and calculated 4 . Ref. [38,39] had shown the general derivations of the dark matter bound state formations 1 There are other possibilities: lightest supersymmetric particles (including massive gravitino case) become good cold dark matter (CDM) if R-parity is assumed to be consderved. Or lightness of DM particles can make them long-lived enough, e.g. light axions or sterile neutrinos. We do not consider these possibilities since there are no light force mediators that can make DM bound states, which is the main theme of this paper.
2 See Ref. [1] for example 3 For the original work by A. Sommerfeld, see [2]. And for some early applications in the dark matter, see [3][4][5][6][7][8][9][10][11][12][13][14][15][16] 4 See Ref. [38] for the references on the early dark matter bound state models therein on various situations with the tools of Bethe-Salpeter wave functions. For the applications on the WIMP model, Ref. [40] calculated the modified Boltzmann equation including the contributions from the bound state formations. Its Eqn. (34) clearly shows the competition of the decay and dissociation of the bound state particles. Ref.  are the recent papers which had built or calculated the dark matter models in which bound state can be formed.
In this paper, we shall consider a case where DM is absolutely stable due to the unbroken Z 2 symmetry assumed to be the remnant of an underlying local U (1) dark gauge symmetry 5 . In this case, both the dark photon and dark Higgs boson are mediators of the dark force. It should be interesting to study the dark matter bound state formation in addition to the Sommerfeld enhancement in such a scenario. We will find that the emission of the longitudinal dark photon plays a crucial role compared with the Refs. [38,39]. Together with the situation of the dark Higgs boson emission, these processes are controlled by the wave function "overlap" I, and its zeroth order expansion is no longer zero in our case, unlike in Refs. [38,39]. This will affect significantly the DM relic density after the first-step annihilation of the DM particles, and such a second epoch process is called the "reannihilation" [82][83][84][85][86] in the literature. In our model, the reannihilation mainly occurs in the co-annihilation channel of the dark matter and its nearly-degenerated partner. Therefore, the suppression of the relative number density of the heavier component automatically switches off the re-annihilation before x f 10 6 . Such an early re-annihilation does not disturb the Big-Bang Nucleosynthesis (BBN), as well as the cosmological epochs afterwards.
Another possibility for the DM stability is to assume a global dark symmetry. For example in Ref. [87], S. Weinberg introduces a global dark U (1) symmetry that is spontaneously broken into its Z 2 subgroup. In that framework, the Goldstone boson becomes the fractional cosmic neutrinos (or dark radiation) and is constrained by CMB and other cosmological observations. However DM-stabilizing global symmetry may be broken by nonrenormalizable operators, especially due to the gravity effects, which may induce fast decay of the DM particle with the O(10) GeV or heavier masses. This issue could be simply evaded by implementing a global dark gauge symmetry to its local version. Compared with the global dark U (1) symmetry, the local U (1) dark gauge symmetry could guarantee the DM stability even in the presence of the nonrenormalizable higher dimensional operators (see discussions in Refs. [75,88]). And due to the existence/absence of the dark photon, the resulting phenomenology varies significantly in these two situations. For example, the viable mass ranges of the DM particles would be completely different in both cases 6 .
In some literature, one introduces the soft-breaking term to explicitly break the local or global dark U (1) symmetry without a detailed mechanism [89,90], or considers nonrenormalizable interactions [91]. These models suffer from the potential risk to break the unitarity 7 . One can cure this problem by introducing the dark Higgs mechanism to spontaneously break the dark U (1) symmetry and keeping only the renormalizable couplings 5 The case of the scalar DM model with local Z2 and Z3 symmetries were considered in Refs. [74] and [75,76], respectively. And similar models have been discussed in Ref. [77][78][79][80][81] in different contexts. 6 Work in preparation, with Seungwon Baek, Toshinori Matsui and Wan Il Park. 7 See Appendix A for a detailed calculation.
between the dark Higgs and the fermionic dark matter. The paper is organized as follows. In Sec. 2, we show our relied Lagrangian of the Z 2 Fermionic model. Some basic features of this model are also discussed. In Sec. 3, we classify the bound states by their quantum numbers. The potentials in the Schroedinger equations are also derived. These potentials are generated by the mediation of the dark photon and dark Higgs boson exchanged between the (excited) DM particles, and in particular, we derive the potential terms induced by the longitudinal dark photon, or equivalently the Goldstone boson for the first time to our best knowledge. In Sec. 4, we illustrate the methods to calculate the bound state formation cross sections and the bound state decay induced by the component annihilation in our model. The modified Boltzmann equations are also demonstrated. In Sec. 5, we present the numerical results based on the formulas in the previous sections. Experimental constraints and comparisons with some earlier literature which ignored the longitudinal dark photon are also presented. Finally we summarize in Sec. 6 with future prospects. A number of technical issues are described in detail in Appendices.

Model Setup
We start from a dark U (1) model, in which there is a Dirac fermion χ with a nonzero dark U (1) charge Q χ . We also introduce a complex dark Higgs field Φ, which takes a nonzero vacuum expectation value and thus breaks the dark U (1) symmetry into a dark Z 2 symmetry with a judicious choice of its dark charge Q Φ .
Then the gauge invariant and renormalizable Lagrangian for this system is given by where g is the dark coupling constant, and Q is the dark charge that Φ and χ takes. Note that we made a judicous choice Q Φ = 2Q χ in order to validate the (yΦχ C χ + h.c.) terms. Here χ C means the charge conjugate of the Dirac field χ, and B µν = ∂ µ B ν −∂ ν B µ is the field strength tensor associated with the SM U (1) Y hypercharges. The kinetic mixing term (∝ ) and the Higgs portal interaction (∝ λ ΦH ) communicate the dark and the standard model sectors. Various experimental results put the tight constraints on them, especially for form the dark photon searches. The λ ΦH is constrained from both the Higgs exotic decay width collider searches and the Higgs-portal dark matter direct detections. However, an appropriate value of λ ΦH within the constraints is enough to contact the dark and the standard model sectors, keeping them to be in thermal equilibrium in the early universe. We will explain the details in later discussions.
Decompose χ into two Weyl spinors, χ = χ L iσ 2 χ * R , and notice that if µ 2 < 0, Φ will take a vacuum expectation value Φ = v Φ +R+iI √ 2 . Written in the basis of χ L and χ R , the mass matrix of the fermions becomes where δm = yv Φ . After diagonalizing (2.2), we acquire and the corresponding mass matrix We can clearly see that one Dirac fermion χ splits into two nearly-degenerate majorana fermions χ 1 and χ 2 if δm m χ . The lighter one is the candidate of dark matter. Its stability is preserved by a remained Z 2 symmetry. In this paper, without loss of generality, we adopt δm > 0 thus χ 1 is the dark matter particle.
After taking the vacuum expectation value

5)
R and dark photon acquires a positive mass The Goldstone boson I is "eaten" by the dark photon to become its longitudinal mode. However, it is convenient to apply for the "Goldstone equivalent theorem" to calculate and understand some processes. Therefore, I will appear in some following discussions. The √ 2yΦχ C χ term will induce the following Yukawa terms −yR yI yI yR The dark photon interactions also become In the following we shall focus on the case of light mediators, In addition to the usual pair annihilation channels of (excited) DM when the dark Higgs degrees of freedom is absent, if we include the dark Higgs degree of freedom, besides the channels, the co-annihilation channel arises and becomes important during the freeze-out processes. s-channel Z mediated χ 1 + χ 2 → Z * → SM particles channel is suppressed by the small mixing between the Z and SM vector bosons, so it is ignored in our paper. In the current universe, χ 2 had all decayed away, so (2.11) is absent when we consider the indirect detection constraints.

Dark Matter Bound State Classification
In the parameter space m R,γ m χ , two χ 1,2 particles can form bound states by emitting R and/or γ . Before calculating the χ 1,2 bound state formation rates and their relevance to DM phenomenology, it is beneficial to discuss the dark matter bound states when dark U (1) symmetry is strictly conserved.
In the dark U (1) conserving case, the dark fermion χ, together with its anti-particle χ, are a pair of Dirac (anti-)fermions. In this case, the γ exchange mediates a repulsive interaction in the |χχ or |χχ states, while it becomes attractive in the |χχ state. Interestingly, the interchange of the boson Φ ( * ) will cause the oscillation between |χχ ↔ |χχ states, unlike the usual case that scalars will always induce an attractive force. The total wave function of such kinds of fermion pairs can be written as where we define as the wave function in the coordinate representation, and x is the relative distance between the two particles. The Schroedinger equations are given by where which contains both the contributions from the real and imaginary parts of the Φ ( * ) . Here m Φ is the mass of Φ in the U (1) symmetry conserving case. Diagonalizing the V d will simplify each equation in the (3.3 to four decoupled functions. Although V s has already been diagonalized, we still rotate the basis for further discussions. The four corresponding particle states with their potential can then be written as It will be convenient to rewrite the above bases in the |χ i χ j forms. Notice that if we define the 4-spinor with the Weyl spinors defined in (2.3), (3. 6) we find that the four-spinor χ and χ C can be written in the form of This prompt us that Now we rewrite (3.5) to be From (3.9) we can clearly see that, only 1 √ 2 (|χ 1 χ 1 + |χ 2 χ 2 ) feels a completely attractive potential. The interactions in 1 √ 2 (|χ 1 χ 1 + |χ 2 χ 2 ) are completely repulsive, thus a bound state cannot be formed.
Remember that the total state of a fermion pair should be anti-symmetric when we interchange the two components. From (3.9) we can clearly see the symmetry of the particle wave functions. Therefore, L + S should be even combined with the 1 √ 2 (|χ 1 χ 1 ± |χ 2 χ 2 ), 1 √ 2 (|χ 1 χ 2 + |χ 2 χ 1 ), and L + S should be odd combined with the 1 √ 2 (|χ 1 χ 2 − |χ 2 χ 1 ). In the dark U (1) spontaneously broken case, δm = 0. It is then convenient to write the Schroedinger equation in the |χ i χ j case. The total wave function becomes (3.10) If we define the Schroedinger equation can be written as (3.14) Here three potentials are defined as follows: e −m γ r r , (3.16) One can compare (3.14) with (3.9) for a validation.
term in the γ propagator, where k µ and k ν finally contract with the on-shell spinors, inducing a (m χ 1 − m χ 2 ) 2 term, which is proportional to with the m 2 γ in the denominator, and Q 2 Φ g 2 cancels with the coupling constants, we finally realize that V γ L is contributed from the longitudinal polarization of γ .
Another analysis of the V γ L is through the Goldstone equivalent theorem. This term can be understood originating from the exchange of the Goldstone boson I. In the non-breaking limit, v Φ → 0, one can find that the (3.9) is recovered. 8 For the (3.13), the result is the same as the δm = 0 case, because from the structure of V d in the (3.14), we can see the extra δm appeared in both of the diagonal elements does not disturb the diagonalizing process of the wave functions. Such a δm here only shift the total energy, so the (3.13) can still be decoupled into two independent equations by diagonalizing the V d . A standard "shooting-method" is applicable for these equations. However, for the (3.12), the final wave functions will be a mixing between the |χ 1 χ 1 ± |χ 2 χ 2 basis. In the following text, we will address the method for solving these equations in detail.
The general wave functions described by the Eqn. (3.12) are given by , where κ = µα , and α is some reference value which reflect the typical dark interaction strength. Then, we have where is the radial wave function vector, and the potential term is given by Here are the relative interaction strength compared with the α , ξ 1 = κ m R and ξ 2 = κ m γ are the characteristic distance of the Yukawa potentials. γ is the relative eigen-energy in the unit of the "Bohr energy", and is defined by where µ is the reduced mass, and E is the eigen-energy of the Schroedinger equation. δγ 2 is the reduced 2δm, For the value of α , we recommend α = Max{ The method for solving these equations on bound states is based upon the "shooting method". The boundary conditions at the x → ∞ are replaced with some finite values. Here we adopt χ nl (x = 12) = 0. Without loss of generality, we assume δγ > 0. Notice that if x is sufficiently large, the asymptotic behaviours of the χ 1,2nl become This means that χ 2nl drops faster than the χ 1nl in the x → ∞ condition, so a universal infinite boundary condition χ nl (x = 12) = 0 will cause the numerical instability in the χ 2nl calculations. Therefore, we need to reduce the χ 2nl boundary area. This is done by solving the following equation to acquire x 0 > 0. Obtain x 0 = Min{x 0 , 12}, fix some γ 2 , and adopt the initial condition to use some numerical method to solve the (3.20) from x → 0 to x 0 . Then we can find the appropriate A for χ 2nl (x 0 ) = 0. Delete all the terms involving χ 2nl (x), and solve the χ 1nl (x) equation, we continue to solve the χ 1nl (x) within [x 0 , 12] area. Change different γ 2 and repeat the above process, we can finally reach χ 1nl (12) = 0 to determine the eigen value γ 2 . We need to note that when we determine the A in the first step, there might be two solutions. One is A > 0, and the other is A < 0. This can give different γ 2 's in the final step. In the δγ → 0 limit, these are exactly corresponding to the 1 √ 2 |χ 1 χ 1 + |χ 2 χ 2 and 1 √ 2 |χ 1 χ 1 − |χ 2 χ 2 states described in the (3.9). As the δγ 2 accumulates, the wave function will depart from the (3.9). This can be clearly seen from Fig. 1.
As the δγ 2 accumulates, the χ 2 χ 2 elements will be reduced in the ground-state wave functions, so the system will become more similar to the one-particle pair situations. In Fig 2, we can see that as δγ 2 increases, the bound energy parameter γ 2 approaches 0. This indicates that the χ 2 decouples in the large δγ 2 limit.
For the completeness of this section, we point out that following the similar processes, (3.13) can also be decomposed to the combinations of the angular and radial components. The radial functions are similar to (3.20), and the potential term V s,α should be replaced with where the universal energy shift δm terms are removed. One can still follow the steps described above, or beforehand diagonalize (3.28) to solve the wave functions. Combining all the descriptions above, and considering the anti-symmetry character of the fermion pairs, we can then classify the bound states with the quantum numbers characterized in Tab. 1. We had extended the traditional spectroscopic notation n 2s+1 l J symbol to n 2s+1 l s/d J ±, where "s" or "d" indicates the "same" or "different" in the |χ i χ j wave function, and ± indicates the sign of A at the origin of the wave functions.
In the following of this paper we are going to calculate the contributions from these bound states listed in Tab. 1 to the freeze-out processes.
Bound state does not exist.

Calculation of Bound State Formation Cross Section
The dark matter bound states are formed by the scattering of two free dark matter particles, with emission of γ or R bringing the extra energy away from the bound system. Now we are going to calculate the transition amplitudes. Our discussions and derivatives are based on the symbols in Ref. [39]. We omit the radiation from the interchanging mediators due to the reason described in Appendix D. The key point is to calculate the "overlap" integrals I, J , K. These integrals should be modified to be where I and K are related to the R−emission processes, while J corresponds to the γ −emission process. Compared with the Ref. [39], we discuss the two-component dark Figure 3: Diagrams emitting a R scalar boson. Notice that since the χ 1 χ 1 R and χ 2 χ 2 R couplings take different sighs, so the χ * 1l 2 ki (x)χ 1nl 1 (x) and χ * 2l 2 ki (x)χ 2nl 1 (x) should also take the opposite sign. matter model, so different signs of the coupling constants should be considered. Therefore, we introduce 2 × 2 Pauli spin matrices σ i in (4.1-4.12) to connect the two doublet wave functions, although these wave functions havbe no direct relationship with some SU (2) group. The reason for us to adopt σ 3 in the I and K has been sketched in the Fig 3. Notice that the couplings of χ 1 -χ 1 -R and χ 2 -χ 2 -R take opposite signs, so that ψ χ 1 χ 1 ↔ ψ χ 1 χ 1 and ψ χ 2 χ 2 ↔ ψ χ 2 χ 2 , or ψ χ 1 χ 2 ↔ ψ χ 1 χ 2 and ψ χ 2 χ 1 ↔ ψ χ 2 χ 1 contributions are opposite. In Fig. 4, we can see that the J + connects the crossing components of the ψ χ 1 χ 1 ψ χ 2 χ 2 and ψ χ 1 χ 2 ψ χ 2 χ 1 multiplets if we take i = 1 or 2 into Fig. 4. This is the reason why the σ 1 and σ 2 appear in the (4. 1-4.12).
Equipped with all these overlap integrals, we are now ready to calculate the transition amplitudes. In the rest frame of the final bound state system, they are given by where p R,γ are the momentums of the emitted R, γ particles respectively and M is the Figure 4: Diagrams emitting a γ . All the coupling signs are the same, but J + will induce a cross relation between the components of the ψ s and ψ d total mass of the two-body system, µ r is the center of mass reduced mass. Since the two element particles are nearly degenerate, we adopt M = 2m χ , and µ r = mχ 2 . With these amplitudes, we can utilize to calculate the differential and the total cross sections times velocity. Here P φ indicates p R or p γ . The calculation of the k → nlm + R is direct. Just take (4.13) directly into (4.16). For the k → nlm + γ cross section, Eqn. (3.3) in Ref. [39] needs to be modified because Ward identity p µ γ M µ = 0 is no longer satisfied. It should be replaced with where ∆m is the mass difference between the previous and latter fermion species in the F-F-V vertex that had been cut, M GS is the amplitude of changing the emitting γ into the Goldstone boson. It is calculated to be M GS, s→d, or d→s, k→nlm+γ Therefore, the so finally, For all of these overlap integrals, when center of mass momentum µv rel = k κ, we can use plane wave function e −i k·r 0 or 0 e −i k·r to estimate φ k in different initial states.
where j l is the spherical Bessel function. Then, after some expansions and contractions of the integrations, we acquire . This can be used to estimate (4.1-4.6). For the (4.11-4.12), we can use the Schroedinger equation to eliminate the ∇ 2 in ∇ 2 [ψ s or d,nlm ( r)], then again apply (4.22). For the (4.7-4.10), we can use the partial integration method to cast the ∇ to before e −i b · r to extract a b factor, and then apply (4.22) to calculate the remained part.
To solve (3.12) and (3.13), we start from the x → 0 boundary. The (3.27) need to be modified to where χ ik indicate the radial wave functions satisfying (3.20, 3.22, 3.28), and In the −γ 2 > δγ 2 case, the threshold χ 1 χ 1 → χ 2 χ 2 opens. There are two linearly independent solutions to (3.20) with potential (3.22, 3.28), A 1 = 0, A 2 = 0 and A 1 = 0, A 2 = 0. With these two initial conditions, and solve (3.20) to the x → ∞ limit, and adjust the absolute value of A 1 and A 2 , we acquire the asymptotic form of χ (1),(2) li corresponding to the two initial conditions, , for the A 2 = 0 boundary condition, , for the A 1 = 0 boundary condition. (4.25) Recombine the two solutions (4.25), we acquire the general solution of Eqn. (3.20) (4.26) The sine functions can be decomposed into exponential incoming and outgoing wave func- , and if we want a pure χ 1 χ 1 initial state, we acquire This is the no-incoming wave function condition for χ 2 χ 2 pair, and the incoming wave function of χ 1 χ 1 is normalized. The solution for A l and B l is (4.28) For the pure χ 2 χ 2 initial state, we have The corresponding solutions are With (4.28) and (4.30), we can also calculate the χ 1 χ 1 ↔ χ 2 χ 2 cross sections, and the χ i χ i self-scattering cross sections. However, in this paper, we only utilize A l1,2 χ (1) li (x) to calculate the wave functions' "overlap" between the χ 1 χ 1 or χ 2 χ 2 initial states and the bound states.
Following the Ref. [39] and expand e −i b· r into partial waves, we acquire for the lowest order where I t ... , upper symbol t can be empty or ±, indicates the corresponding partial wave expansions for all the (4.1-4.6), and A is the corresponding σ 3 or iσ 2 for each integrals. χ n,l (x) and χ | k|,l (x) are the 2-dimensional radial wave functions for the corresponding bound states or scattering states.
Unlike in Ref. [39], the appearance of A = σ 0,1,2,3 , and the non-orthogonal wave functions in I functions leave us non-zero For the K's, just remember that the ∇ 2 in the (4.11-4.12) can be replaced with other parts of the Schroedinger equation in (3.12-3.13), so it is easy to insert these terms in (4.33-4.38) to acquire the corresponding K ··· expansion terms. Now, we are going to calculate the J functions.
Here, again A is the iσ 2 or σ 3 corresponding to the + or − upper indices.

Pair Annihilations Inside the Bound State Particles
Once a dark matter bound state is formed, it might dissociate into two free dark matter particle again after scattering with a dark photon or dark Higgs boson within the plasma. During the dark matter freeze-out processes, the bound state formation processes are competing with the bound state dissociation processes. Only when the bound states decay rapidly enough before their dissociation can the bound state formation effectively contribute to the total annihilation processes [40]. The dark matter dissociation rate can be evaluated conveniently by (4.52), and the annihilation of the two components inside the bound state particle is the main contribution to the bound state decay processes. In this subsection, we concentrate at the algorithm to calculate the such decay widths.
To calculate the decay width, we need to calculate the squared annihilation amplitude of the particle pair in a particular wave function of relative motion, which is proportional to the ψ(0) 2 in the s-wave, and proportional to the ψ (0) 2 in the p-wave situation. However, it is more convenient to calculate the perturbative squared annihilation amplitude of a plane wave. For example, if we calculate the squared annihilation amplitude of two particles with the momentum (E 1 , 0, 0, p) and (E 2 , 0, 0, −p) travelling along the ±z direction, and the total momentum of the particle pair is zero, we write down the phase space integrated square amplitude in the form of where θ, φ are the final state phase space angles, and i indicates all of the polarization indices in the final states. Then, we need to extract the s-wave and p-wave contributions from the (4.43). This is done by "somehow" expanding the (4.43) and find the coefficient of the v 0 and v 2 terms, where v is the relative velocity of the two particles, and can be expressed by Make the expansion and take (4.45) into (4.43) to find out the v 0 and v 2 terms to acquire the s-wave and p-wave contributions to the amplitudes. We then need multiply the v 0 term with |ψ(0)| 2 , and replace v 2 term with |ψ (0)| 2 µ 2 (practically these should be |ψ , depending on the final states) to acquire the decay amplitudes.
However, we should note that E 1 and E 2 also depend on v. Directly expanding on E 1 and E 2 will also contribute to a v 2 term, which is not the real p−wave contributions. Notice that we should not expand E 1 and E 2 on v during our separation of the s-and p-wave contributions.
From the (4.21) we can see that the plane wave along the z-direction only contains the l z = 0 element in the l = 1 partial wave. We will also need the l x = 0 or l y = 0 annihilation amplitude to calculate the decay of the states in different total angular momentum j's. This is done by calculating M xi (E 1 , E 2 , p, θ, φ) and M yi (E 1 , E 2 , p, θ, φ), which are corresponding to the annihilation of two particles travelling along the ±x and ±y directions, with the total momentum of the particle pair being zero.
We also need to calculate the amplitudes of different total spins of the initial states.
where X indicates the total spin of the particle pair. For the S = 0 case, the X = σ 0 √ 2 . For the S = 1 case, the X = σ 1,2,3 √ 2 , which are corresponding to the S x = 0, S y = 0, and S z = 0 respectively. Now we are ready to calculate the annihilation amplitude for all the combinations of different |LL z SS z . Since in the L = 1, S = 1 case, (|L x = 0, S x = 0 + |L y = 0, S y = 0 + |L y = 0, S y = 0 ), (4.48) χ/χ states χ 1,2 states 2S+1 L J J CP two-body decay channels χχ/χχ Table 2: Two-body decay channels. The 2S+1 L J and J CP are the usual spectroscopy notations, indicateing the quantum numbers of the total spin S, orbital angular momentum L, total angular momentum J, C-parity, and parity under space inversion.
we can follow this to recombine the amplitudes to calculate the decay width of different |JJ Z states. In this section, since m γ m χ 1,2 , we apply the Goldstone equivalence theorem. We ignore the mass of the γ , and replace the longitudinal polarization of it with the Goldstone boson I. We also ignore the masses of the scalar particles.
Not all of the bound states can have two-body decay channels. Some are prohibited by the conservation of the quantum numbers, such as parity or the dark U (1) charge. This will be discussed in detail in Appendix E. It is convenient to analyse the two-body decay channels in the basis of χ, χ before the spontaneously symmetry breaking, and then rotate to the χ 1,2 basis for calculations. The result is shown in Table 2. The integrated amplitude of all the possible two-body final states are listed in Table 3.
For these states which do not have a two-body decay channel, we need to calculate the three-body decay amplitudes. In many of the three-body final state channels, there will be a inferred divergence in the phase space. To cancel these divergences requires the expansion of higher Fock's state |χ i χ j γ /I/R . This is equivalent to a cut-off on the phase space. We define a as the cut-off on the corner of the Dalitz-plot. Define the three momentums of the three particles in the final states to be k 1 , k 2 , k 3 ., and (k 1 + k 2 + k 3 ) 2 = 4E 2 . Since we ignore all the masses of the final state particles, the three Dalitz plot parameters are defined to be (4.49) Since x 1 + x 2 + x 3 = 2, there are only two independent parameters. The divergences usually exist in the x i = 0, x j = 0 (i = j) region. If one divergence is located in the, e.g., x 1 = 0, Table 3: Two-body decay squared matrix elements, phase-space integrated, final-state summed. The "LSJC" indicates the quantum numbers of the orbital angular momentum, total spin, total angular momentum, and C-parity. Compared with the Tab. 2, the parity is absent, because many states are no longer eigenstates of the parity.
x 2 = 0 region, we just modify the integral  Table 4. In our following calculations, we adopt a = 0.1.
With these amplitude informations, we can finally calculate the decay width where phase d(phase)|M Tab | 2 is the result multiplied by the expression listed in the "Kernel" and "Coefficient" column in Table 3 and Table 4 for each decay channels. f s is the factor for the final state identical particles. f s = 1 2 for case of two final identical particles, and f s = 1 6 for three final identical particles.

Boltzmann Equations
Now let us discuss the Boltzmann equations for our model. In principle, we need to write down the Boltzmann equations of all the components, including χ 1 , χ 2 , γ , R and all of the bound states as well as their temperatures. This will result in a set of complicated coupled equations, which is beyond our current computing resources and necessity to solve these equations numerically on this stage. Therefore, we shall simplify the situation with the following considerations:  Table 4: Three-body decay squared matrix elements, phase-space integrated, final-state summed. Here, for abbreviation, we define g = Q χ g. The "LSJC" indicates the quantum numbers of the orbital angular momentum, total spin, total angular momentum, and Cparity.
• When T γ m γ , m R , where T γ is the photon temperature of the plasma, although γ or R might have frozen out from the thermal plasma before the dark matter freezes out from them, their red-shifts as nearly massless particles let them cool down synchronously with the plasma. Therefore, the kinetic equilibrium keeps T χ = T γ = T R ≈ T γ , where T χ is the temperature of χ 1 , χ 2 , and T γ , T R are the temperatures of the γ and R.
• When T γ m γ , m R , the light components become non-relativistic and their number densities drop dramatically. The kinetic equilibrium between m γ and χ 1,2 is broken.
Also based upon the steps listed in the Ref. [40], we define where Γ X and Γ X,dis are the thermal averaged bound state decay width and the dissociation rate respectively. There definitions are where K 1 and K 2 are the Bessel functions, Γ X is the decay width calculated from (4.50), i,j = 1, 2 indicate the χ 1 , χ 2 components, n eq A (T ) is the thermal-equivalent number density of component A at the temperature T . The σv T,χ i χ j →X+γ /R is the thermal-averaged bound state formation cross section where n χ i is the number density of component χ i , σ χ i χ j →X+γ /R is calculated following the steps listed in section 4, and λ(1, a, b) = (1 − a − b) 2 − 4ab. We should note that in Ref. [40], there is also an additional term σv gR→gg ng, which should be something like, e.g., σv χ i X→χ j γ n χ i in our paper. We ignore this term because compared with the Γ X,dis , this term will be suppressed by n χ when the temperature drops below the χ i,j mass. For a pair of particle χ i χ j , define their effective cross section times velocity to be where σv χ i χ j ,anni indicates the annihilation rate considering the Sommerfeld effect. In this paper, we only consider the 2-body final state when calculating the σv χ i χ j ,anni . Although Ref. [92,93] provided a complete unitarity-preserved calculation of the Sommerfeld effect in the large boost situation, we only perform a rough numerical scan on the parameter space while missing the resonances in this paper, so the methods in Ref. [16] are sufficient. Therefore, the amplitudes are again from the Tab. 3, and calculate the cross section in the center of mass reference with where E χ i is the energy of component χ i , | p| is the magnitude of the 3-momentum of either particle in the center of mass frame, E cm is the center of mass energy of the initial particle pair. Ω dΩ cm |M χ i χ j Tab | 2 is the summation of all the annihilation channels listed in Tab. 3 with the initial particles χ i χ j . To calculate the M Tab , we need the zero-point wave functions ψ χ i χ j (0) (or derivatives ψ χ i χ j (0)) value of the scattering state. These can be extracted from the radial wave functions χ 1lk (x) and χ 2lk (x), and the coefficients A li , B li should also be multiplied for the corresponding initial states. Here we do not write the details how we acquired the Sommerfeld coefficients. These are fundamentally similar to the methods in the usual literature. Besides the amplitudes listed in the Tab. 3, we should also note that there is s-d-wave interaction term which contribute to the IR final state. This is absent in the Tab. 3, because the decay width calculations do not involve it. The "Kernel" part of the squared amplitude is calculated to be (4.56) In the "Coefficient" part, there should include the Sommerfeld boost factor "ψ(0)ψ (0)". However, the interference term is sub-dominant, so we do not consider this term and set the boost factor to be 1. Then the total effective annihilation cross section is given by where n eq = i n eq χ i . The Boltzmann equation is then written in the familiar form where H(T ) is the Hubble constant of temperature T, m χ is the mass of the dark matter, x f = mχ Tγ , and g * s is the effective degree of freedom when calculating the entropy density. Y is the effective dark matter particle number per co-moving volume. To solve (4.58), we also need to know the relationship between T χ and T γ . This requires another equation. However, in our paper, we simplify this by setting the kinetic decoupling constant T kd = min{m γ , m R }, and (4.59)

Numerical Results
Based upon the internal functions embedded in the micrOMEGAs [94,95], we solve the differential equation in (4.58). Among the parameter spaces, we choose m = 3 TeV and m = 10 TeV. ξ 1 and ξ 2 are fixed to be 15 and 100, respectively. Therefore, we vary α g = We can easily see that the Sommerfeld effect can significantly modify the relic density, while the bound state formation effects further alter the results. To see how this can happen, we plot the thermally averaged cross section times velocity and the evolution of the dark matter relic density in Figs. 7, 8.   In Figs. 7, 8, we show and compare the contributions of the perturbative calculations, the calculations considering Sommerfeld effects, and considering the bound state formation effects. We can also learn that when y takes some moderate value, the main contributions from the dark matter bound state formations mainly become effective in a late time. The main contribution in the bound state formation processes originates from the χ 1 χ 2 → 1 1 S s± 0 + γ channel. This process becomes significant because of the large longitudinal γ emissions, which is equivalent to the Goldstone emissions, and described by the (4.18). This is large due to its non-zero 0th order expansions. Finally, after T δm, the contributions from the χ 1 χ 2 channels will be suppressed according to the (4.57). Finally, the re-annihilation processes cease when z 10 5 . On the right pannel of the Figs. 7, 8, we plot a benchmark point for the y 1 situation. In this situation, the global symmetry on χ 1,2 as well as the Ward identity M µ P µ φ recovers, therefore the longitudinal contributions in the (4.19) disappears. Compared with the longitudinal contributions, the emissions of the transverse γ described by the (4.7-4.10) will be suppressed by the v 2 , so the re-annihilation process vanishes. However, a sufficiently small relic abundance of the dark matter requires a larger (Qχg) 2 4π , inducing a much larger bound state formation rate in the freeze-out epoch. At the parameter point we selected in the right panels of Figs. 7, 8, the bound state formation corrects the relic density result by several percent. If we assume that the χ 1 particles only contribute to a fraction of the now observed dark matter density, the corrections on the relic density result can be much larger.

Constraints from the Experiments
In this paper, we have ignored the λ ΦH |Φ| 2 H † H and F µν B µν terms to simplify the relic density calculations. As have been mentioned, these are the only interactions that communicate between the dark and visible sectors. Practically, λ ΦH receives loop corrections and is unable to be non-zero in all energy scales. A strict = 0 also induces a dark photon relic which is not the main topic to be described in this paper. For the non-zero λ ΦH and , the dark matter direct detection bounds constrain both of these two interactions. Higgs exotic decay data mainly confine λ ΦH , and is limited through various collider experiments. One should also be careful to avoid the big bang nucleosynthesis (BBN) to be disturbed by the long-life dark Higgs and dark photon boson decay. Although the details are beyond the discussions of this paper, we will briefly go through all these constraints to show the possibility to evade all of them. According to Ref. [96], the spin-dependent cross section for a dark matter particle scattering with a nucleon is given by where m h is the SM Higgs boson mass, m N is the mass of a nucleon, and f 0.35. µ φh is the mixing parameter between a SM Higgs boson and a dark Higgs boson. In our model, it is given by If, e.g., we would like σ SI 10 −45 cm 2 , this requires y 2 µ 2 φh 2×10 −10 GeV 2 if m φ ∼ 0.3GeV. Then, λ ΦH v Φ 1.5×10 −5 GeV. If we would like the dark photon mass to be m γ ∼ 0.5 GeV, this implies v Φ ∼ 0.5 GeV, therefore λ ΦH 10 −5 ∼ 10 −6 . Such a constraint is much more stringent than the Higgs exotic decay results. The h → φφ partial width is We just estimate the bounds Γ h→φφ Γ h SM = 4.07 MeV (For the standard model Higgs widths, one can see the reviews in Ref. [97][98][99][100], which requires λ ΦH 8 × 10 −3 . Therefore, the direct detection bounds on λ ΦH are far beyond the ability of the collider searches on Higgs exotic decay channels. The BBN constrains the φ decay lifetime. (One can also consult the axion(-like) particle results in Ref. [101] for a careful constraints on light scalar particles. However, this is beyond our current scope.) Roughly speaking, if τ φ 1 s, the cosmological φ particles disappear before the BBN epoch. The non-hadronic partial width of the φ decay width is calculated to be [96] The previous upper bound we had acquired y 2 µ 2 φh 2×10 −10 GeV 2 can result in τ φ 0.02 s when m φ = 0.3 GeV above the double muon threshold, and in the y 1 case. This is enough for the dark Higgs bosons to vanish before the beginning of the BBN.
Compared with the dark Higgs boson φ, the width of which is suppressed by the Yukawa couplings of the electrons and muons, the couplings between the dark photon γ and the charged leptons are universal. Currently, for the m φ > 2m µ ≈ 0.2 GeV, the constraint on the mixing parameter < 10 −10 is still safe. The lifetime of the dark photon is expressed as [102] τ γ 3 2 α = 6 × 10 5 yr × 10 MeV m γ × 10 −35 2 α . (5.5) This implies that τ γ 0.1 s, that can be smaller than the beginning of the BBN. In many of the parameter spaces, the can be larger than 10 −7 or 10 −5 (For the constraints on dark photon, one can see [101,102] and for references therein), which induce looser bounds on the BBN constraints.
The non-zero can also give rise to the direct detection signals. The dark matternucleon scattering cross section can be written as where a X = eQ χ g m 2 χ .
(5.7) choices of the parameters can easily evade the direct detection bounds. Finally, the strong interacting dark matter (SIDM) models are robustly constrained by the CMB spectrum [30]. The enhancement of the dark matter annihilation cross section in the very late epoch may disturb the CMB observations, if the mediator particles decaying into the SM particles faster than the Hubble rate. At the recombination epoch, where σv rec is the annihilation rate at recombination epoch, N χ = 1(2) for Majorana (Dirac) dark matter. Generally, f eff 0.1 for the SM final states other than neutrinos. Generally, such a constraint is very difficult to escape. Ref. [103] provides a model that is similar with us. In their model, there is no dark photon, so the χ 1 χ 1 annihilation is controlled by the p-wave channels to the RR or II. In our model, the mass difference between χ 1 and χ 2 is sufficient for the χ 2 to nearly disappear in the recombination epoch, therefore only the s-wave χ 1 χ 1 → γ γ might affect the CMB [104]. The bound state formation processes χ 1 χ 1 → X + γ /R are also set to be prohibited by the insufficient threshold energy in the parameter space we had studied in this paper. This is Sommerfeld enhanced and might contribute to the σv rec . Fortunately, from (3.14) we can learn that the Sommerfeld boost factor is mainly controlled by the Yukawa coupling terms. Firstly, we are interested in the y < Q χ g area, a sufficiently small but moderate y is enough for us to acquire a significant bound state formation effect while small enough boost factor. Secondly, m R = √ 2µ is independent on all the other parameters, and can be adjusted to reduce the saturated Sommerfeld factor. These can all help us easily evade the bound in (5.8).
Finally, we discuss about the cluster constraints on dark matter self-interactions. A complete calculation involves the full solution to the Schroedinger equation. Practically, it is done by a calculation up to l 20 partial wave expansions. In our paper and the parameter space adopted here, the kinetic energy for a dark matter particle inside a halo is far from the threshold to produce a χ 2 . In this situation, only the R-mediated force takes place, and for a correct dark matter relic density, our Yukawa coupling strength is weaker than the usual values in the literature. From Ref. [105], we can see that it is quite easy to evade such bounds σ T mχ 1cm 2 g −1 in the usual case, let alone our smaller Yukawa coupling situations.

Comparison with Other Similar Situations in the Literature
Before closing, let us compare our results based on the model Lagrangian (2.1) with those in other approaches in the literature.
In the literature, most studies of excited fermionic DM are based on the model without dark Higgs, described by the following model Lagrangian : Note that this Lagrangian can be obtained by integrating out the physical dark Higgs boson after the dark U (1) symmetry breaking in (2.1). Note that in this case dark photon is assumed to get its mass by Stückelberg mechanism. And the mass splitting (δ) between χ 1 and χ 2 is simply generated by the dim-3 operator which breaks U (1) softly.
Ref. [89,90] made attempts on solutions for some line-shaped photon excesses observed from near the galactic center. Both these papers are based upon one single model, in which the dark Higgs sector is eliminated compared with our model. There the mass difference of the nearly-degenerate dark matter particles is introduced with a soft breaking term, which breaks the U (1) D symmetry explicitly. Interestingly, although Ref. [90] utilized the annihilation process χχ → γ γ followed by γ → SM to explain the continuous gamma-ray spectrum from the galactic center, detailed calculations were not proceeded. We performed such calculations in Appendix A and found that the longitudinal dark photon can become catastrophic. The simplest way to cure this problem is to introduce a dark Higgs boson Φ, and the mass difference of the dark matter particles therefore originate from the Φχ C χ + h.c. term as in our paper. In this situation, besides the purely dark photon annihilation channels, Rγ , RR final states also arise. From Tab. 3, we can learn that the Rγ L , in which γ L indicates the longitudinal polarization of a dark photon, or equivalently RI channel contribute to an additional s-wave annihilation term, which can be significant compared with the γ T γ T process, in which γ T indicates a transverse polarization of a dark photon.
Ref. [91] provides another approach. There the dark matter mass difference originate from the vacuum expectation value of the dark Higgs boson, however the dark matter and dark Higgs interaction term is non-renormalizable: which induces the mass splitting δ = yv 2 φ /Λ after φ develops a nonzero VEV 9 . There the renormalizable dim-4 operator (the last term in Eq. (2.1)) was ignored without any reasons. In fact, the coupling described by (5.10) requires m γ Λ for the effective "Yukawa coupling constant" 2yv φ Λ < 1, as well as a cross section well below the unitary bound. The amplitude is calculated to be ∝ 1 . This is somehow similar to the results without a Higgs cancellation in Appendix A. In fact, the non-renormalizable model introduced by Ref. [91] still has a diagram similar to the third one with a s-channel Higgs boson listed in Fig. 9, however the h-γ -γ coupling constant is only one half compared with our model. Therefore, a precise cancellation does not occur, so the g 2 δm m 2 γ term remains.
In the Ref. [72,[89][90][91], compared with our paper, the longitudinal contribution to the off-diagonal element of the potential matrix which is proportional to c 1 in our (3.22) is absent. This is a good approximation if δm m γ . However, in Ref. [89,90], the longitudinal contribution can be absorbed by redefining the α D , with all the phenomenologies unchanged.
Finally, we need to mention that although Ref. [38,39,68] had classified nearly all the emission product cases during the dark matter bound state formation, the longitudinal dark photon emission is absent there. In their models involving a massive dark vector boson, the Ward-Takahashi identity was utilized, e.g., in the Eqn. (3.3) of Ref. [39]. Therefore, only transverse polarization dark photon emission was concerned. In our paper, we have applied a general version of the Ward-Takahashi identity given by (4.17). Therefore, the longitudinal polarization of a dark photon is considered through an equivalent Goldstone boson emission term. This term can be neglected in the Ref. [72], once the mass difference of the dark matter is much smaller than the dark photon mass.

Conclusions and Future Prospect
In this paper, we considered Z 2 fermion dark matter model with a pair of nearly-degenerate fermions defined by Lagrangian (2.1). The Yukawa potential induced by the real scalar R between the same component particle pairs χ 1 χ 1 and χ 2 χ 2 are attractive, while it is repulsive between different component pair χ 1 χ 2 . The longitudinal vector boson, or equivalently, Goldstone boson, also contributs to an additional off-diagonal term in the potential matrix. Additionally, we want to mention that such kind of Goldstone contributions to the potential should also appear in some other dark matter models with massive vector boson mediations. For example, in the bino/wino-like dark matter models, the standard model Goldstone bosons might lead to extra potential terms, modifying the Sommerfeld enhancement results. We might further work on such topics in the future.
In our model, besides the emission of the scalar R and the usual transverse vector boson γ to form a dark matter bound state, emission of a longitudinal dark photon γ , or somehow equivalently a Goldstone boson I also arises because of our dark charge assignment of the dark matter particle. The mass difference between the two components plays a crucial role in this process. Unlike Ref. [38,39], the zeroth "mono-pole" contributions to the (4.1-4.6) in our paper are non-zero, because we either need to replace the direct inner product by something inserted with a σ 1,3 , or need to compute the inner products of the wave functions acquired under different potentials. Finally, we find that the contribution from the longitudinal vector boson emission is extremely important. This leads to a reannihilation process, reducing the relic abundance of the dark matter significantly. Because this reannihilation process ceased before the BBN, the following cosmological parameters remain undisturbed 10 .
Constrained by our current computing resources and programming techniques, we are only able to calculate a simple approximation (4.58) rather than a complete version of Boltzmann equation, like those appeared in Ref. [86]. In the future, we are planning to make more complete calculations by considering various effects. For example, the deviation of the velocity distributions of each component from the Maxwell-Boltzmann distributions should also be taken into account. A careful scanning of the parameter space considering all the experimental constraints and the solutions to the "missing-satellite problem", "core-cusp problem", and "too-big-to-fail problem" will also be considered. 10 We choose benchmark parameters such that ∆m TBBN and Γχ 2 HBBN. Therefore we can safely ignore χ1 + χ2 ↔ X + γ . And χ1 + χ1 → X + R is set to be kinematically forbidden. If we get rid of the Higgs sector, and introduce the soft mass terms for the dark photon and the mass differences between χ 1 and χ 2 , we can still calculate the Higgsless diagrams of the dark matter pair χ 1 χ 1 → γ L γ L . The polarization of the longitudinal γ can be written as where k = (E, k) is the 4-momentum of the dark photon. For simplicity, we replace L (k) with k m , and omit the sub-dominant O The first term is proportional to , and the second one is proportional to In the moderate parameter space m χ 1 − m χ 2 ∼ m γ , the second term will induce a ∝ 1 m 2 γ cross section, and is very easy to break the unitarity bound if m γ m χ 1,2 . The introduction of a Higgs boson can cancel this term. A direct calculation of the third diagram in Fig. 9 gives rise to where y and A are the χχh and hγ γ couplings, respectively. In the large momentum limit, the large unitarity breaking term is precisely cancelled. Notice that in our model, m χ 1 − m χ 2 = −2yv Φ . If y = y, A = 8Q 2 χ g 2 v Φ , the unitarity breaking term is automatically cancelled in our model. This shows that the Z 2 DM model with dark Higgs mechanism behaves better.

B Modified Ward Identity in the Broken Phase
In (4.17), we modified the Ward identity by adding a non-conserving term on the righthanded side. A path-integral proof can be found in Ref. [106] for a general R-ξ gauge. Here we provide a diagramatic proof in the unitary gauge following the steps in Section 7.4 of Ref. [107]. Replace the µ (k) with the k µ in the Fig. 10, we acquire (B.1) Therefore, the (7.65) in Ref. [107] should be modified to terms will disappear again when we sum over all the possible diagrams, while has the structure of Goldstone boson emission term. However, a dark photon outer leg might "skip over" a Higgs vertex. See Fig. 11 for the diagrams. Since the χ 1 χ 1 R and χ 2 χ 2 R couplings differ by an extra minus sign, the corresponding terms in the first two diagrams in Fig. 11 can no longer cancel with each other. We then need to take into account the third diagram in Fig. 11. The method of calculating the first and second diagrams are very similar to (B.2), and we do not repeat them. The third diagram gives the result Finally, the diagramatic result becomes in which M GS are the corresponding diagrams that replace all of the dark photon k-leg with a Goldstone boson. In this paper, as we have mentioned, we have ignored the contributions that dark boson radiation from the interchanging mediators due to the extra vertices multiplied when calculating the wave function overlap, so only emissions from the fermions are calculated. Therefore, (B.6) finally becomes the M GS term in the (4.17). M begin/end are diagrams without the dark photon insertion, and the mass input of the fermionic line had been shifted by increasing/reducing k. Note that when we are calculating the diagrams in Fig. 4, we need to amputate the initial/final bound state poles to extract the S-matrix by adopting the coefficients of the double-poles, but M begin/end only indicate one bound state propagator, which contain only single-poles. Therefore, M begin/end does not contribute to the transition amplitude in (4.17).
One might consider a proof in the general R − ξ gauge. Rather than give a lengthy and complete proof, we only note that the propagator of the dark photon can be decomposed into The first term on the right handed side of the (B.7) is exactly the unitary gauge propagators. The k µ , k ν in the ξ-dependent second term will be decomposed according to (B.2) and these terms will be cancelled by the corresponding Goldstone propagator induced terms.

C Gauge Dependence of the Bound State
We have calculated the (3.17) with the 2-2 diagrams shown in Fig. 12 by assuming the onshell external legs (on-shell approximation [38]). The ξ-dependent terms in both diagrams disappear by the on-shell conditions. The most convenient way to describe the potential is working in the Feynman gauge. The left-panel in Fig. 12 exchanging the dark photon leads to (3.16), while the right-panel exchanging the Goldstone boson contributes to the (3.17).
In the unitary gauge, equivalently, k µ k ν m 2 γ term in the dark photon propagator contributes to the (3.17). However, the diagrams in Fig. 12 are only a fraction inside the ladder diagrams, and usually the fermionic external legs in them are slightly off-shell. Therefore, the ξ-dependent k µ k ν terms in the dark photon propagator cannot be simply contracted with the fermionic lines. The "off-shellness" of a fermionic component is typically ∼ µα 2 in a bound state. If ξ 1 α 2 , and notice that k ∼ µα , the k µ k ν 1, this term can be safely neglected, and the final result is (nearly) ξ-independent. However, if ξ 1 α 2 , e.g., in the unitary gauge, since µα 2 is much larger than the typical dark photon mass m γ , we will have a large deviation ∝ µ 2 α 4 m 2 γ compared with the on-shell approximation, which cannot be simply neglected. Although we can confine ourself in the ξ 1 α 2 area to apply the "on-shell approximation" to directly write down the (3.16-3.17) both for on-shell and slightly off-shell fermionic legs, we still want to see whether the large ξ, especially the unitary gauge will ruin all of our previous discussions. Since Feynmann gauge ξ = 1 is the most "natural" gauge to generate the potentials in (3.16-3.17), we will compare other ξ with the Feynmann gauge to illustrate the gauge independence of the physical results in the following text of this appendix.
Showing a complete proof on this issue is far beyond the main topic of this paper, so we only give the most important instructions. For a general R ξ gauge, we again apply the standard trick in (B.2) to decompose the k µ k ν terms in all of the exchanging dark photon propagators (B.7). Notice that only the m χa − m χ b term preserves all the fermionic propagators, and combined with the (B.7), part of its contributions can be attributed to the Feynmann gauge Goldstone, while the other part is cancelled with the general R-ξ gauge Goldstone propagators. The (p / i + k / − m χ b ) and (p / i − m χa ) terms in (B.2) will kill one of the fermionic propagators when it is internal, and will be finally cancelled with other terms if we sum over all the possible mediator connections between the bound state component lines. Finally, we acquire a complete series of Feynmann-gauge diagrams with all the gauge dependent terms only remaining in the external legs. These terms are proportional to the p / i − m i for each external fermionic leg with the momentum p i and mass m i .
For the scattering state, all the external legs are on-shell, so all the gauge dependent p / i − m i terms disappear. Therefore, the scattering state calculations are eventually ξindependent. On the other hand, for the bound state, the external leg p / i −m i remains. With the symbols of (2.16) in Ref. [108], we define L ξ as the diagram summation over all possible mediator connections, and L ξ=1 will be the Feynmann gauge results. After decomposing and cancelling all the dark photon and dark Goldstone propagator corresponding terms in L ξ , we acquire such kind of format: where S L,R (ξ) contains the p / i − m i terms for the external legs. We did not write down the loop integrals explicitly between S L,R (ξ) and L ξ=1 for brevity. Notice that S L and S R should have the same dependence on the corresponding external momentums. Therefore, it is easy to see that the general L ξ and L ξ=1 share the same pole if we define Ψ P,ξ (p) = Ψ P,ξ=1 (p) + S L (ξ, P, q)Ψ P,ξ=1 (q), Ψ * P,ξ (p) = Ψ * P,ξ=1 (p) + Ψ * P,ξ=1 (q)S R (ξ, P, q), where the q should be integrated out in a loop integration, and Ψ P,ξ=1 is the wave function of some bound state extracted from L ξ=1 , Then L ξ can be written in the form of which has exactly the same pole structures with the L ξ=1 , and Ψ P,ξ is the corresponding wave function. This means that the bound state energies are exactly gauge (or ξ-) independent. However, the wave functions need to be transformed according to (C.2). We should also note that for large ξ, Bethe-Salpeter wave function Ψ P,ξ can not even be reduced to the equal-time Schroedinger equation, so we need to solve the Bethe-Salpeter equation in this situation. This is because when ξ 1 α 2 , large awkward time-dependent gauge transformations are introduced, disrupting the time dependence of the wave functions.
We then point out that the ξ-dependence on the wave functions do not affect the physical S matrix calculations. For an example, if we want to calculate the state transition characterized by the diagrams in Fig. 4, we acquire where K ξ is the perturbative kernel to emit the dark photon. Note that after summing over all possible mediator connections to attribute all of the ξ-dependent terms to the external legs, and then adopt the poles of the initial and final states, the internal mediators in (C.5) finally become Feynmann gauge propagators, and the final result is of the format According to the principles of the LSZ reduction formula, the can be regarded as the bound state propagator as well as the "renormalzation factor" compared with the (C.4), and it should be amputated, leaving us only the Ψ * P i ,ξ=1 K ξ=1 Ψ Po,ξ=1 . Then what if we calculate the Ψ * P i ,ξ K ξ Ψ Po,ξ ? Note that when we transform all the wave functions according to (C.2), the S L,R will also exert on the interaction kernel K by adding lots of off-shell terms proportional to p / i − m i . This changes K ξ=1 to K ξ . Therefore, Ψ * P i ,ξ=1 K ξ=1 Ψ Po,ξ=1 = Ψ * P i ,ξ K ξ Ψ Po,ξ , which is exactly the gauge (or ξ-) independent physical S-matrix. Finally, we can see that the off-shell corresponding terms S L,R does not  contribute to this matrix element because their contributions to the Ψ P i ,ξ , K ξ , Ψ Po,ξ respectively cancel with each other.
Furthermore, although we need to calculate all the possible mediator connections between the bound state component fermions, contributions other than ladder diagrams are actually ignored in our paper for the same reason demonstrated in section II-A from Ref. [108]. We can easily verify that most of the ξ-dependent off-shell contributions in the ladder diagrams are actually also absorbed by these non-ladder diagrams. The remained terms are attributed to the external legs and does not disturb all of the physical results as we have discussed. Therefore, we finally recover the "on-shell" approximation principle in the most general R ξ gauge even for large ξ 1 α 2 : when calculating the diagrams in Fig. 12, we shall safely assume all the external legs to be on shell. Because all the off-shell contributions are ξ-dependent and will be exactly cancelled and absorbed into the non-ladder diagrams and the physical state definitions.

D Reasons for Neglecting the Emission from the Mediators
Ref. [68] had calculated the emission from one of the "ladder" mediators. Naively, such terms contain more vertices and can be considered as higher-order contributions. However, the momenta exchanging among these diagrams p − q are of the order µα , which appears in the propagator as 1 | p− q| n , canceling the extra coupling constants in the numerator. Ref. [68] calculated and showed that such a contribution is nearly equivalent to the diagrams where vector bosons emit from the bounded components.
In our situation, we have only two diagrams where a dark Higgs boson or a dark photon emit from a mediator, which are listed in Fig. 13. All the out leg fermions are nearly on-shell, as the process described in Ref. [38]. The unitary gauge propagator for a dark photon propagators.
In each diagram of Fig. 13 and the first diagram in Fig. 14, one of the vertices include a v Φ . For the second diagram in Fig. 14, there will be a p−q contracted with the χ i χ j γ vertex and gives a m χ i − m χ j . If we ignore the mediator masses, and similar to the Eqn. (B.2) in Ref. [68], we will try perform an integral .
This gives an inferred divergence. By analysing the precise momenta flows, we can learn that the physical cut-off on this divergence can be very complicated and depends on 1 2 µv 2 , m χ 1 − m χ 2 , and µα 2 , m R and m γ . We define the effective inferred cut-off scale Λ IR ∼ max 1 2 µv 2 , m χ 1 − m χ 2 , µα 2 , m R , m γ . With this cut off, (D.1) will then become ∼ i 2π 2 πe −Λ IRr 4Λ IR , and after picking up all the coupling constants, we can finally see roughly a suppression of Λ IR µ suppression compared with the (4.18). Only the last diagram in Fig. 14 might be problematic. Here the transverse dark photon was emitted from the mediator. The momenta dependence of this diagram is exactly similar to the (2.23a) in Ref. [68], with all the other group factors disappeared. And the factor of 8 that equation also disappear in our model due to the different Lorentz structures of the vertices. This means that our result will be suppressed by at least a factor of 1 8 . Compared with the (B.4d) in Ref. [68], we also have a y 2 Q 2 Φ g 2 < 1 factor, which is usually smaller than 1 to evade the CMB recombination bound described in (5.8), giving an extra suppression. Finally, a practical calculation had shown that the transverse dark photon emitting process is far from the main contribution compared with other bound state formation channels. Therefore, we also neglect this diagram.
One might concern whether the extended Ward identity is satisfied if we drop out these diagrams, since from the section B, we have learned that the extended Ward identity establishes only when we sum over all the possible diagrams in the same order. However, according to Eqn. (2.12) in Ref. [108], we actually acquiesce to drop out all the off-shell contributions when we are calculating the Schroedinger equations to resum the ladder diagrams. Analysis in Appendix C also reveals that we can safely apply the "on-shell" approximation to calculate the perturbative kernel in a general R-ξ gauge. It is easy to verify that if all the external fermions to be on-shell, all the ξ-dependent terms then disappear separately for a χ i χ j γ kernel and the emission from mediator described in Fig. 13. Therefore, Ward identity can still be satisfied even if we neglect some of the diagrams.

E Quantum Numbers of the DM Bound States
In this paper, before the spontaneously symmetry breaking, the Lagrangian we have adopted in (2.1) keeps both the parity and C-parity conserved, if we define the intrinsic parity of the dark Higgs boson Φ to be odd, so P Φ(t, x)P = −Φ(t, − x). This can derived if we notice P Cχ(x)CP = −CP χ(x)P C, (E.1) or P χ C (x)P = −(P χ(x)P ) C , (E.2) where C and P are the charge conjugate and parity operators. Therefore, wherex = (t, − x) are the parity transformed coordinates. After we decompose χ into χ 4 1 and χ 4 2 through (3.7), it is easily known that the C-parities for χ 1 and χ 2 are "+" and "-", respectively.
The (C-)parity of the dark photon is similar to the visible photon, which is defined to be odd.
For a two-fermion system, the total C-parity is calculated to be (−1) L+S for the χχ/χχ system, or equivalently, it is (−1) i+j for all the |χ i χ j systems, while the total parity should be (−1) L for two same charged fermions, and (−1) L+1 for opposite charged fermions. One can compare the angular momentums and the (C-)parity informations listed in the Tab. 2.
For the two-boson systems RR, II, IR, Rγ , Iγ , or γ γ , the total C-parity can be directly acquired through multiplying all the C-parities of the particle components. They are +, +, −, −, +, + respectively. While all of their parities are (−1) L .
Then we are ready to present how we acquired the selection rules shown in Tab. 2. For the χχ/χχ initial states, the dark charge conservation could only permit the γ γ , and Φ * Φ final states. Φ * Φ can be decomposed into RR, II and IR. From the C-parities of these boson pairs, we know IR can only be decayed through |χ 1 χ 2 − |χ 2 χ 1 initial states. For the 101−+ situation, it was prohibited by the parity conservation law. RR and II might appear in other combinations, however their orbital momentums are constrained to be even, so their parity are always +. Therefore, they are ruled out in the 1 S 0 (0 +− ) case due to the parity conservation, and are forbidden in the 111++ situation due to the angular momentum conservation. Finally, γ γ channel was forbidden in all the J = 1 initial states because of the Landau-Yang theorem.
For the χχ/χχ initial states, the dark charge conservation only permits the Φ ( * ) γ final states. This can be composed into Rγ and Iγ . J = 0 situations are eliminated because a transverse photon cannot form a J = 0 state together with a scalar. Rγ and Iγ were separately placed because of the C-parity conservation law.
After the symmetry breaking, the parity-odd scalar Φ takes a vacuum expectation value. This means that the parity is spontaneously broken, and the bound state eigenstates can no longer be discriminated by the parity. Therefore, in Tab. 3 and 4, we eliminate the parity. However, the selection rules discussed before P breaking are very good approximations in calculating the dark matter annihilation S-matrices because of the small v Φ we have adopted in this paper.