Perturbative Unitarity of Strongly Interacting Massive Particle Models

Dark pion is a promising candidate for the strongly interacting massive particle dark matter. A large pion self-coupling $m_\pi/f_\pi$ tends to be required for correct relic abundance, and hence the partial-wave amplitudes can violate the perturbative unitarity even for the coupling within na\"ive perturbative regime. We improve the partial-wave amplitudes in order to satisfy the optical theorem. We demonstrate that the improvement is relevant only for semi-relativistic pions, and thus this does not affect the self-scattering cross section at the cosmic structures. We also discuss the impact of the improvement of the $\pi \pi \pi \to \pi \pi$ scattering process, and we find that there is an upper bound on $m_\pi$ at which the correct relic abundance is never achieved even at large $m_\pi/f_\pi$ due to the optical theorem.


Introduction
Little is known about particle nature of dark matter (DM) even though the existence of DM has been firmly confirmed by the astrophysical observations. Strongly interacting massive particles (SIMPs) [1] are an interesting framework for the thermal relic of sub-GeV DM: the thermal relic DM is determined by the freeze-out of 3 → 2 processes. Self-interactions of DM are generically sizable to get the correct relic abundance in this framework, and it leads to a large 2 → 2 self-scattering. The small-scale structure of the Universe may indicate the sizable self-scattering of DM [2] (see Ref. [3] a review).
Dark sector is a hypothetical sector where DM resides, and has its own gauge dynamics. As a consequence of dark strong dynamics, the dark sector would consist of composite particles (dark hadrons) at the low-energy scale as with the SM hadrons  (see Ref. [52] for a review). Ref. [53] proposed a model of the SIMP framework where the dark pion is identified as DM. The dark pions arise as the pseudo-Nambu-Goldstone boson (pNGB) from the strong dynamics in the dark sector. The dark pions have the 2 → 2 self-interaction and the 3 → 2 number-changing process induced by the Wess-Zumino-Witten (WZW) term [54,55].
The chiral perturbation theory (χPT) describes the interactions among dark pions. The pion self-coupling, which is defined by the ratio of the pion mass and decay constant m π /f π , determines the size of the pionic scattering processes. The 2 → 2 self-scattering interaction arises at the leading order terms of O(m 2 π /f 2 π ), while the 3 → 2 numberchanging interaction via the WZW term appears as an O(m 5 π /f 5 π ) term of χPT. The pion self-coupling is found to be larger than unity in order to explain the relic abundance and to evade constraints on the self-scattering cross section [53]. Meanwhile, we cannot validate the perturbative expansion of χPT unless m π /f π 4π. We would not be able to ignore contributions from resonances and the higher-order terms of the chiral Lagrangian for the pion self-coupling near the naïve perturbative bound. For the consistent treatment of the chiral expansion, Ref. [56] has discussed the impact of the higher order of the chiral Lagrangian on SIMP scenarios.
We may encounter the other bound on the self-scattering cross section of dark pions in the SIMP models even when the perturbative expansion of χPT is valid. The unitarity of the S-matrix imposes constraints on the partial-wave amplitude. The partial-wave amplitude for the 2 → 2 self-scattering is T m 2 π /(32πf 2 π ) at the tree-level, and the perturbative unitarity places an upper bound 1/2β on ReT with β corresponding to velocity of pions. The perturbative unitarity bound m π /f π 4 √ π/ √ β gets weaker at the cosmic structures due to the DM velocity at maximum of ∼ 10 −2 , while this bound can be important for the annihilation mechanism determining the current relic abundance since dark pions are semi-relativistic. In other words, there are two bounds on the pion selfcouplings: one originates from the limitation of the perturbative expansion and another is the perturbative unitarity of the scattering processes. In the chiral limit (m π → 0), the perturbative unitarity violation will be cured by resumming multiple rescattering processes, which is known as the "self-healing" mechanism [57]. In this paper, we will propose the improvement of 2 → 2 and 3 → 2 partial-wave amplitudes in a similar way to the "self-healing" mechanism in the non-chiral limit since we focus on the dark pion DM. This paper is organized as follows. We will show the chiral Lagrangian for the SIMP models in Section 2. We discuss the perturbative unitarity of ππ → ππ scattering cross section and thermally-averaged πππ → ππ cross section in Section 3, and we will see these cross sections would violate the perturbative unitarity at large m π /f π . In Section 4, we propose the improved amplitude, which satisfies the optical theorem automatically, and then we will apply the procedure to the SIMP models. Section 5 is devoted to conclusions of our work.

Chiral Lagrangian for SIMP
We discuss the SIMP model that is realized by the confining gauge dynamics of a gauge group G local . We consider N f -flavor quarks with the mass below the dynamical scale of G local in the ultraviolet description of this model. This model possesses an approximate global symmetry G among quarks that is broken by the mass terms. It is believed that this model leads to the chiral symmetry breaking, and that chiral condensation breaks the global symmetry into the subgroup H. In the following, we consider three classes of the models: The dark pions, which are the pNGBs of the chiral symmetry breaking, are the fundamental degrees of freedom in the low-energy effective theory of this model. We are interested in the dark pions for realizing SIMP framework, and hence we focus on the chiral expansion with typical momentum p 2 O(m 2 π ) in the non-relativistic limit. The coset space G/H is parameterized by N π pNGB fields π a [62,63], which corresponds to the broken generators T a with a = 1, · · · , N π . The parameterization and the normalization of the dark pion fields are the following.
and only for the case (iii ), the non-linear sigma model field Σ is where J is the symplectic metric that satisfies J T = −J , J 2 = −1. Here, f π denotes the pion decay constant. Under the residual symmetry H, the dark pion fields transform as (i ) adjoint representation, (ii ) rank-2 symmetric tensor representation, and (iii ) rank-2 antisymmetric tensor representation for each different symmetry. The relevant Lagrangian of the dark pions is given by Here, the first-two terms are the leading order (LO) terms in the χPT and the soft chiral symmetry breaking term, and the third term is the Wess-Zumino-Witten (WZW) term [54,55]. We take the quark mass matrix M to be invariant under H, and hence the pion mass is universal m 2 π = Bm q with m q being the quark mass. The relevant Lagrangian contains the four-point interactions among the dark pions, which induce the self scattering of the dark pions. We obtain the four-point interaction terms from the kinetic and mass terms by expanding Σ.
where r abcd and c abcd are the coefficients defined by group-theoretical constants, which is discussed in Appendix C. We will discuss the self-scattering cross section in detail in the next section. Meanwhile, the five-point interaction among the dark pions arises from the WZW term. The πππ → ππ scattering process arising from the WZW term determines the relic abundance of the dark pions. The WZW term is given by 1 Here, k = N C for SU (N C ) and 2k = N C for SO(N C ) and U Sp(N C ) [65]. T [abcde] denotes the anti-symmetrization of five broken generators: The four-point interactions arise at the order of m 2 π /f 2 π , while the five-point interaction is at the order of m 5 π /f 5 π . In accordance with the standard order-counting of the χPT using p 2 expansion in the chiral limit, the former is the LO contribution and the latter is the next-to-leading order (NLO) contribution in the non-chiral limit.
A large coupling tends to be required for correct relic abundance in the SIMP scenarios. In the dark-pion realization, the number-changing process arises from the WZW term, which is the higher order of the chiral expansion, with the velocity suppression. Meanwhile, the χPT breaks down m π Λ χSB , and the cutoff scale is expected to be This is known as the scale estimated by naïve dimensional analysis (NDA) [66,67] with taking into account the large-N c scaling and the large-N f scaling. The large-N c scaling of f π and Λ χSB is known to be f 2 π O(N c ) and Λ χSB O(1) [68][69][70][71][72][73]. It is also known that some of the partial-wave amplitudes for the scattering process of ππ → ππ is proportional to N f in the large N f limit [74,75]. 2 Since the pion self-coupling is required to be close to its naïve perturbative bound m π /f π Λ χSB /f π in the SIMP scenario, the higherorder contributions of the χPT may affect the predictions of the relic abundance and the self-scattering cross section. Ref. [56] has discussed the impacts of the higher-order contributions of the χPT in the context of the SIMP: in particular, the NLO and the next-to-next-to-leading order (NNLO) contributions.

Perturbative Unitarity
We discuss the perturbative unitarity of ππ → ππ and πππ → ππ scattering processes. The partial-wave decomposition of the invariant amplitude for elastic scattering process is given by 3 Here, P is the Legendre polynomial with P (1) = 1, s denotes the collision energy, and θ is the scattering angle of the final state pions with respect to the collision axis. P ab;cd R denotes the projection operators: the product of two pions is projected into the irreducible representation R of residual global symmetry H. T R (s) is the partial-wave elastic amplitude for the channel of a representation R. The projection operators satisfy Here, d R denotes the dimension of the representation R. With this decomposition, the total cross section for π a π b → π c π d takes the form The unitarity of the S-matrix imposes the optical theorem that relates the imaginary part of the invariant amplitude M 2→2 for the forward scattering to the total cross section σ total . Below the four-pion threshold, the optical theorem is given by Here, p = √ sβ/2 is the momentum of incoming particles with β = (1−4m 2 π /s) 1/2 denoting the velocity of the particles in the center of mass frame. We use the relativistic formula for the two-body system (p and β), while taking the non-relativistic limit of the three-body system. This is because the two-body system is semi-relativistic for the 2 → 3 process. We include the inelastic cross section σ 2→3 in the right-hand side since it is also sizable in the SIMP models, and it vanishes below the inelastic threshold.
We define the partial-wave amplitude for the ππ → πππ process in the non-relativistic limit of the three-body final state. It is challenging to give the explicit form of the decomposition of the three-body final state into its irreducible representations, in other Figure 1: Unitarity circle on a complex plane of the partial-wave amplitude for the elastic scattering with a fixed partial-wave amplitude squared for the inelastic scattering. Unitarity circle without the inelastic scattering is shown as the gray-dashed line.
words to construct the explicit form of the projection operator P abc;de R . Meanwhile, the cross section can be decomposed into the irreducible representations of the initial state. The cross section for the ππ → πππ process may be written in a similar way to the ππ → ππ process. The elastic cross section and the inelastic cross section averaged over all initial states are given by Here, we define the square of the inelastic amplitude |T R in, | 2 . When we include the ππ → πππ inelastic scattering process, the optical theorem for each partial-wave amplitude takes the form, Let us apply this condition to perturbative analysis in the case that inelastic scattering is negligible. The optical theorem relates a fixed-order amplitude to the imaginary part of the higher-order amplitude: the tree-level partial-wave amplitude for the elastic scattering T tree is real, and is related to the imaginary part of the one-loop amplitude. We give the perturbative analysis of λφ 4 theory in Appendix A, and discuss this point explicitly. In other words, the fixed-order amplitude does not satisfy Eq. (13). Hence, it is important to unitarize the amplitude in order to satisfy the optical theorem automatically as the pion self-coupling is large.
The optical theorem places upper bounds on the partial-wave amplitudes. Fig. 1 shows the unitarity circle on a complex plane of the partial-wave amplitude for the elastic scattering. The gray dashed line depict the unitarity circle without the inelastic scattering; its center is located at (0, 1/2β) and its radius is 1/2β. The radius shrinks in the presence of the inelastic scattering as depicted as the solid-line circle, and the single point (0, 1/2β) is allowed once the inelastic scattering saturates the unitarity bound (|T R in, | 2 = 1/4β 2 ). The upper bound on the partial-wave amplitude for the elastic scattering is given by |T R | ≤ 1/β as shown in the figure. The equality holds when ReT R = 0, |T R in, | 2 = 0, and ImT R = 1/β, corresponding to the top of the gray-dashed circle. The perturbative unitarity is often imposed as |ReT R | ≤ 1/2β. The equality holds when the imaginary part satisfies ImT R = 1/2β and the inelastic scattering is negligible.
We discuss the unitarity bound on the pion self-coupling based on the partial-wave unitarity of tree-level π a (p 1 )π b (p 2 ) → π c (p 3 )π d (p 4 ) self-scattering cross section. At the LO terms of χPT, the four-point interaction is given by Eq. (4), and the tree-level amplitude of the process is obtained as follows: Here, the Mandelstam variables are defined by s = (p 1 + p 2 ) 2 , t = (p 1 − p 3 ) 2 , and u = (p 1 − p 4 ) 2 . The singlet channel for ππ → ππ has the larger coefficient compared to the others in an SU (N f ) × SU (N f )/SU (N f ) with a large N f limit, and the s-wave amplitude for this channel takes the form, where we take the non-relativistic limit of two-body system, s = 4m 2 π + 4m 2 π β 2 with small β. Once the pion self-coupling is in the range of m π /f π 4π/N f β, the partialwave amplitude goes beyond the unitarity bound. The elastic scattering cross section can exceed the perturbative unitarity bound when the pion self-coupling within the following range: This upper bound originates from the condition where tree and one-loop contributions are compatible to each other. We also discuss the unitarity bound on the partial-wave amplitude for the inelastic scattering, π a (p 1 )π b (p 2 ) → π c (p 3 )π d (p 4 )π e (p 5 ). The WZW term (5) provides the five-pion interaction that induces the inelastic scattering. The invariant amplitude at the tree-level is given by M abcde tree (p 1 , p 2 , p 3 , p 4 , p 5 ) = 16k Here, T [abcde] is defined in Eq. (6). We compute the ππ → πππ cross section in the nonrelativistic limit of the three-body final state, and we extract the partial-wave amplitude for the square of the inelastic scattering by comparing the cross section with Eq. (12).
Since the amplitude has only p-wave component at the tree level even for the relativistic limit, only antisymmetric representations in the irreducible decomposition of initial twopion states contribute to the amplitude. We find that the square of the partial-wave amplitude takes the form, where we take the non-relativistic limit of the three-body system, s = 9m 2 π + 6m π E with the relative energy E =μv 2 /2 and the three-body reduced massμ = m π / √ 3. Meanwhile, β 2 = 1 − 4m 2 π /s 5/9 + 4 √ 3v 2 /81 is semi-relativistic. t 2 R denotes a group theoretical factor, which scales as N 5 f . 2β and v correspond to the relative velocities of the twobody initial state and of the three-body final state, respectively. In general, the inelastic partial-wave amplitude squared has the velocity dependence of (T R in, ) 2 ∝ β 2 −1 v 4+2λ with λ denoting hyperangular quantum number, and the 2 → 3 cross section may scale as v 6 (λ = 1) at the leading order of the p-wave cross section. In the dark-pion realization, the partial-wave amplitude is the next-leading order of the p-wave amplitude (λ = 2).
The unitarity bound on the partial-wave amplitude, 1/2β, places an upper bound on the pion self-coupling m π /f π . The smallest representation in dimension provides the largest partial-wave amplitude. In an SU (N f ) × SU (N f )/SU (N f ) with a large N f limit, d R for the adjoint representation scales as N 2 f , while d R for the larger representations scale as N 4 f . The partial-wave amplitude for the adjoint representation gives stronger bound on m π /f π in a large N f limit, and it takes the form, This bound on m π /f π is numerically weaker than the perturbative unitarity bound obtained from the 2 → 2 scattering for the velocity β v 1. Next, we discuss the partial-wave unitarity of the thermally-averaged cross section for the πππ → ππ process. We will give detailed computations in Appendix B. When the invariant amplitude for the ππ → πππ has time-reversal invariance 4 , the cross section for πππ → ππ in the non-relativistic limit of the three-body system (s = 9m 2 π + √ 3m 2 π v 2 ) is also written by the use of the partial-wave amplitude squared |T R in, | 2 , and hence the cross section takes the form: with the Jacobi momentumk ≡ µv. The partial-wave unitarity |T R in, =1 | 2 ≤ 1/4β 2 places an upper bound on the cross section.
Here, a factor of 3 arises from the fact that πππ → ππ scattering involves only p-wave. The thermally-averaged cross section is obtained by integrating the cross section over k with the Maxwell-Boltzmann distribution.
Here, x = m π /T with T denoting temperature. This bound differs by a factor of 1/4 from what is obtained in Refs. [76,77]. We use the upper bound on the partial-wave amplitude for the inelastic scattering given by 1/2β in the presence of the elastic scattering, while the upper bound for the elastic scattering (1/β) has been used for constraining the thermally-averaged cross section in Refs. [76,77]. 5 The summation runs only over the antisymmetric representations since the square of the inelastic amplitude is nonzero only for the antisymmetric representations.
Let us now consider the πππ → ππ scattering at the leading contribution in the chiral Lagrangian. We have given the partial-wave amplitude for the ππ → πππ cross section at the leading contribution for the process in Eq. (18). The thermally-averaged cross section for the πππ → ππ process is 6 where We show the explicit forms of t 2 in different symmetry structures in Appendix C. t 2 /N 3 π approaches to 1/N f at the large-N f limit. The thermally-averaged cross section will exceed the unitarity bound given by Eq. (22) in the large m π /f π limit. Once we take the naïve perturbative bound of the pion selfcoupling, m π /f π 4π, the cross section takes the form The annihilation cross section is safe from the unitarity violation at large x, namely at low temperature, since the cross section is suppressed x −2 while the unitarity bound is proportional to x 2 . The cross section, however, will be beyond the unitarity bound at small x even for the pion self-coupling below the naïve perturbative bound.

Improved Cross Section
At the large pion self-coupling, both of ππ → ππ and πππ → ππ scattering processes at tree-level can be beyond the perturbative unitarity even for the coupling within the naïve perturbative range. In this section, we propose the improvement of the partial-wave amplitude to take into account the partial-wave unitarity of the cross section. We may use perturbative analysis for the pion self-coupling within the naïve perturbative range given by Eq. (16), but we have to modify the amplitude to satisfy the optical theorem. In particular, we need to incorporate the imaginary part of the amplitude at the LO into an improved amplitude. We define the improved amplitude whose real and imaginary parts appropriately reproduce the original amplitude at the LO as follows: This improved partial-wave amplitude automatically satisfies Eq. (13) at any m π /f π : namely, Im for each R and . In the absence of the inelastic scattering amplitude squared (T tree in ) 2 , this improvement of the partial-wave amplitudes coincides with the "self-healing" mechanism in which the perturbative unitarity violation is cured by resumming multiple rescattering processes [57].
Our improvement amplitudes are not either unique or complete to remedy the situation violating the unitarity condition, but are associated with resumming the multiple rescattering processes that arise not only from the four-point vertex but also from the treelevel five-point vertex.
where k denotes the loop momentum. These improved amplitudes correspond to a geometric sum of multiple self-rescattering processes, in other words a geometric sum of diagrams where all loops are cut as shown in Fig. 2. This improved amplitude consists only of the tree-level amplitude and cut diagrams, and hence this procedure is not appropriate when full one-loop corrections are important.

Analytic Properties
Before showing the impact of the improved amplitude on the SIMP phenomenology, we discuss the analytic properties of the improved amplitudes, such as the scattering phase shift and the poles. The phase shift e 2iδ R ≡ 1 + 2iβT R may be written as Meanwhile, the phase shift in the low-energy limit can be expanded where p = √ sβ/2 denotes the relative momentum of the system. a and r e, are known as the scattering length and the effective range, respectively. The phase shift is generically complex in the presence of the inelastic scattering, such as DM annihilation [79,80].
Using our improved partial-wave amplitude, the phase shift is given by the tree-level partial-wave amplitude as follows.
The phase shift with our improved amplitude is indeed complex since the partial-wave amplitudes are real at tree level. The imaginary part of the phase shift leads to shrinking the radius of the unitarity circle of elastic scattering as shown in Fig. 1. As for s-wave amplitudes, the elastic amplitude is proportional to β 0 in the non-relativistic limit and there is no inelastic scattering. On the other hand, the p-wave elastic amplitude is proportional to β 2 . Therefore, we confirm that β 2 +1 cot δ imp,R is analytic at β = 0 even in our improvement procedure for the application to the dark pion SIMP models. The square of the inelastic amplitude vanishes β √ 5/3 due to the kinematics, while the β-dependence of the square of the inelastic amplitude is compatible with the expression of the effective range theory. We note that, assuming that the imaginary part of the phase shift is negligible compared to the real part, the phase shift is given by the tree-level partial-wave amplitude as Re(cot δ imp,R ) (βT tree,R ) −1 and Im(cot δ imp,R ) −(T tree,R ,in /T tree,R ) 2 . Under the Born approximation, the phase shift based on the tree-level amplitude is given by δ βT tree : i.e., cot δ (βT tree ) −1 , which is reproduced by the expression of δ imp,R .
Our improved amplitudes would involve a pole. There is no inelastic channel for swave amplitudes, and the amplitude is proportional to β 0 at the LO. Therefore, the pole would appear at It is indicated that |T tree,R =0 | 1, i.e., |β| 1, in the non-relativistic limit. The bound state (Imβ pole > 0) or the virtual level (Imβ pole < 0) would appear depending on the sign of the partial-wave amplitudes.
As for the p-wave amplitudes, the amplitude for the elastic scattering is proportional to β 2 at the LO: we set the β-independent part of the amplitude as T tree,R =1 = β 2 t tree,R =1 . The partial-wave amplitude squared for inelastic scattering (T tree,R ) 2 is proportional to β, and hence we set the β-independent part as (T tree,R ) 2 = β(t tree,R ) 2 . As far as the inelastic channels is negligible, the pole would appear at We can find a pole on the imaginary axis of β, and hence the pole corresponds to the bound state (t tree,R =1 > 0) or the virtual level (t tree,R =1 < 0).

Application to dark pion SIMP
Now, we discuss the impact of the improvement on the SIMP models. Using the tree-level amplitudes given by Eqs. (4) and (5), we obtain the improved amplitudes for ππ → ππ and πππ → ππ processes. The improved amplitudes for 2 → 2 scattering coincide with the tree-level amplitudes in the β → 0 limit, when the inelastic channel closes. Hence, the improvement has a low impact on the pion 2 → 2 self-scattering cross section in cosmic structures due to the DM velocity of ∼ 10 −2 at maximum. Meanwhile, this improvement procedure will be efficient when the pions are semi-relativistic. The tree-level partial-wave amplitudes scale as T tree ∝ (m π /f π ) 2 and (T tree in ) 2 ∝ (m π /f π ) 10 , and hence the inelastic scattering more significantly affects the improvement at the large m π /f π .
We plot the improved p-wave amplitude for the elastic scattering, T imp , on a complex plane in Fig. 3 for the adjoint channel of SU (4)×SU (4)/SU (4). Both elastic and inelastic scattering processes exist at the tree-level in this channel. In the figure, we take different velocities v of the three-body initial state: v = 0.1 (brown), v = 0.7 (cyan), and v = 2.0 (red). 7 Once we specify the symmetry structure and the channel, the improved amplitude is a function depending only on m π /f π and velocity v. We note that the relative velocity of the two-body final state is fixed as β = √ 5/3 in the non-relativistic limit of the threebody initial state. As m π /f π increases from zero to infinity, the amplitude for fixed v moves from bottom to top in the figure. We put marks on each lines at m π /f π = 4π/ √ N c ( ) and 4π ( ). The improved amplitude T imp is on the unitarity circle (with the radius Figure 3: Trajectories of the improved partial-wave amplitude on the complex plane of the amplitude for the elastic scattering as m π /f π increased. The gray-dashed line corresponds to the unitarity circle (without the inelastic scattering), whose center is at (0, 1/2β) and radius is 1/2β. We assume and N c = 4, and the adjoint representation for R. Each line corresponds to fixed v: v = 0.1 (brown), v = 0.7 (cyan), and v = 2.0 (red). There are two markers on each line, which correspond to m π /f π = 4π/ √ N c and 4π. of 1/2β) when the inelastic scattering is negligible. For small m π /f π , therefore, each line follows the unitarity circle since the square of the tree-level partial-wave amplitude for the inelastic scattering is suppressed by a higher power of m π /f π than the tree-level partialwave amplitude for the elastic scattering. T imp in can be comparable to T imp at a certain point of m π /f π , and thus the radius from the center of the circle shrinks as we see in Fig. 1. Since the square of the partial-wave amplitude for the inelastic scattering, (T R in,, =1 ) 2 , is proportional to v 8 as shown in Eq. (18), the path goes inside the circle for smaller m π /f π as the velocity v gets larger. The improved partial-wave amplitudes approach to (0, 1/β) for sufficiently large m π /f π . As shown in Eq. (25), T imp → i/β and T imp in → 0 when the tree-level inelastic scattering gets increased.
We compare the (thermally-averaged) cross section with improvement (cyan), that without improvement (orange), and the unitarity bound (black) in Fig. 4. We choose different m π /f π for the top panels and for the bottom panels. We use the improved partialwave amplitude squared |T imp in | 2 in the πππ → ππ cross section given by Eq. (20). In the left panels, we show the cross section σ 3→2 v 2 as the function of velocity v = k/µ. Since the tree-level partial-wave amplitude squared (T tree in ) 2 is proportional to v 8 , the improved cross section follows the original cross section at small v. The improved cross section gets larger as v increases, and the cross section almost saturates the unitarity bound at a certain v. The partial-wave amplitude for each representation does not simultaneously saturate the unitarity bound at a certain v. As we see in small windows in Fig. 4, the improved cross section does not fully coincide with the unitarity bound, but approaches to the bound. For a larger v, the improved partial-wave amplitude (T tree in ) 2 follows the scaling of v −8 , and hence the improved cross section is suppressed compared to the unitarity-saturated Figure 4: Comparison of the πππ → ππ (thermally-averaged) cross sections. The symmetry structure is assumed to be SU (4) × SU (4)/SU (4), and we take m π = 0.1 GeV. We take m π /f π = 4π in top panels and m π /f π = 4π/ √ N c in bottom panels. (Left): Shown is the cross sections σv 2 with improvement (cyan), the cross section without improvement (orange), and unitarity bound (black) as a function of velocity v. (Right): Shown is the thermally-averaged cross section σv 2 with improvement (cyan), without improvement (orange), and unitarity bound (black) as a function of x = m π /T . cross section, whose partial-wave amplitude follows the scaling of v 0 .
The thermally-averaged cross section with the improvement hardly saturates the unitarity bound as shown in the right panels of Fig. 4 since σv 2 with the improvement saturates the unitarity bound just in a narrow range of v. A larger v is required for saturating the unitarity bound for a smaller m π /f π , and hence the improvement hardly affects the thermally-averaged cross section for a smaller m π /f π due to the Boltzmann suppression factor. In particular, the tree-level inelastic amplitude squared is quite suppressed at small v in the dark pion realizations, which is proportional to v 8 , and hence the improvement has little impact on the cross section for the small v. This is the main reasons why the improvement is less effective even for the coupling close to the naïve perturbative bound m π /f π 4π/ √ N c . We note that there is a small difference between (thermally-averaged) cross sections with and without improvement even at small v (large x). The relative velocity of the two-body final state is given by 2β 2 √ 5/3 in the small v limit, and hence the improvement by the 2 → 2 amplitude remains.
We illustrate the impact of the improvement of the πππ → ππ cross section in Fig. 5. We compute the relic abundance of the SIMP DM by the use of an approximate formula, which ignores the temperature change of the effective degrees of freedom g * during freezeout, given by Ref. [81]: Here, M Pl = 2.4 × 10 18 GeV.
We take x f = m π /T f = 15 and 20 in the figure. We use the effective degrees of freedom g * at the freeze-out temperature given by Ref. [82]. In Fig. 5, we compare the required m π /f π for the correct relic abundance Ωh 2 0.12 with the thermally-averaged cross section with and without the improvement (cyan lines): solid-thick lines for the improved cross section, while dashed-thick lines for the cross section without improvement. The self-scattering cross section σ scatter /m π , obtained along with m π /f π required for the correct relic abundance, is shown as thin-red lines; these linetypes correspond to that of cyan lines, namely, solid (dashed) lines for the cross section with (without) improvement. We take the number of flavor and colors to be fixed as N f = 4 and N c = 4 in all panels. Each panel shows the different symmetry structure.
The horizontal solid-line shows the unitarity bounds m π /f π = 4π and m π /f π = 4π/ √ N c given by Eq. (7), and the red-shaded band indicates the self-scattering cross section in a range of 0.1 cm 2 /g ≤ σ scatter /m π ≤ 1 cm 2 /g.
The required m π /f π gets larger in order to explain the correct relic abundance as the pion mass gets increased. The number-changing cross section is suppressed by the improvement procedure compared to the tree-level cross section for a large m π /f π , and hence the required m π /f π is larger than the case without the improvement. We find that any large value of m π /f π cannot explain the correct relic abundance above m π 700 MeV (m π 900 MeV) for x f = 20 (x f = 15). However, we have to incorporate the higher-order terms of the χPT when the pion self-coupling gets close to the naïve perturbative bound.
Once we take the naïve perturbative bound m π /f π 4π/ √ N c , there is no room for the allowed range of m π for x f = 20 while there remains the allowed range in the left-bottom and right bottom panels for x f = 15. The improvement is less effective for the coupling close to the naïve perturbative bound m π /f π 4π/ √ N c . Meanwhile, this bound gets mild when we take a weaker bound, e.g. m π /f π 4π, and then the allowed range of m π can be enlarged. As approaching to m π /f π 4π, the improvement numerically modifies m π /f π at the ten-percent level.

Conclusion and Discussions
The dark pion is a promising realization of the SIMP models. The pion self-coupling m π /f π tends to be large in order that the SIMP framework works, and hence, it is expected that scattering cross sections easily violate the perturbative unitarity. In this study, we have proposed the improvement of partial-wave amplitudes, which automatically hold the optical theorem. We have found that the impact of the improvement on phenomenology is less significant when the DM velocity is small. Therefore, the improvement of the cross section does not change the self-interaction of dark pions at the cosmic structures. Meanwhile, we have also examined the impact of the improvement on πππ → ππ processes, whose final states are semi-relativistic. The pion self-coupling required to achieve the correct relic abundance is hardly changed below the naïve perturbative bound m π /f π 4π/ √ N c by the improvement. The change will be more efficient near the original NDA limit m π /f π 4π [66], and it can be up to O(10) %. The impact of the improvement will be comparable to that of the NNLO correction of the χPT studied in Ref. [56], which is of order of 10 %. m π /f π required for the correct relic abundance drastically changes for m π 300 MeV, and we cannot find m π /f π explaining the abundance for m π 700 MeV. Since m π /f π is beyond the naïve perturbative bound in the mass range, there are uncertainties due to the resonances and the higher-order corrections of the χPT.
Ref. [56] has discussed the impacts of the NLO and NNLO contributions of chiral Lagrangian. They include only the real part of the amplitude since the imaginary part provides only the higher-order contribution to the cross section. On the other hand, their amplitude does not satisfy the optical theorem. Our improved amplitudes incorporate the imaginary part, but consists of the LO amplitudes of the elastic and inelastic scattering processes. We may include the higher-order contributions systematically by using the inverse amplitude method, which automatically satisfies the optical theorem, in the absence of the inelastic channel [83]. However, it is not straightforward to include inelastic channels in the method. This is beyond scope of this study.
In this study, we have focused only on the dark pion realization of the SIMP scenario. The amplitude for the number-changing process is suppressed in the following ways in the dark pion realization. The WZW term arises as the NLO contributions with a one-loop suppression factor in the chiral perturbation. Furthermore, the square of the inelastic amplitude is suppressed by v 8 in the dark pion realization, and it is a sub-leading contribution even in the p-wave contributions. In other realizations, such as the dark glueball realization [84,85] and the vector SIMP model [86], the square of the inelastic amplitude can be leading-order contribution of the perturbation theory and does not suffer from further velocity suppression. There, unitarizing the amplitude will be more important within the naïve perturbative range in other realizations. A One-loop amplitude in λφ 4 theory Let us consider a λφ 4 theory of a scalar field φ with a mass of m. The tree-level amplitude for φφ → φφ is given by iM tree = −iλ, while the one-loop amplitude is given by Here, x denotes the Feynman parameter and µ denotes the renormalization scale. The Mandelstam variables s , t , and u are given in the center-of-mass frame by where β 2 = 1 − 4m 2 /s and γ −2 = 1 − β 2 . A branch cut appears along Re(s) > 4m 2 . We obtain the partial wave amplitudes by multiplying the Legendre polynomial and integrating over the angle θ. We focus only on the s-wave partial amplitude. The tree-level amplitude is given by while the one-loop amplitude is Here, we follow the definition of the partial-wave amplitude Eq. (8). The one-loop corrected amplitude Eq. (33) is totally symmetric under the exchange of s , t , and u. Therefore, the partial-wave amplitude has another branch cut, which corresponds to branch cuts along Re(t) > 4m 2 or Re(u) > 4m 2 . In this region, Re(s) < 0, and hence β 2 γ 2 < 0. The another branch cut along Re(s) < 0 originates from the last term. As far as we consider the physical region of the scattering process, Re(s) > 0, the imaginary part of T 1-loop 0 appears only from the first logarithmic term. The argument inside the logarithmic function is negative only when (1 − β)/2 < x < (1 + β)/2, and hence the imaginary part of the one-loop partial-wave amplitude is Here, we confirm the optical theorem Eq. (13) without the inelastic scattering at the LO of the coupling λ in the λφ 4 theory. Finally, we discuss the coupling expansion of the total cross section and the importance of the imaginary part. We find the total cross section up to the one-loop level as follows: Here, we ignore the logarithmic factors of the real part of the one-loop correction in this expression. In the square bracket, the first term corresponds to the tree-level contribution, the second is the interference between the tree-level and the real part of the one-loop correction, the third shows the imaginary part of the one-loop correction, and the last is the contribution from the real part of the one-loop amplitude. We note that the last-two terms must be affected by the interference between the tree-level and 2-loop amplitudes. All loop corrections similarly contribute to the cross section when 16π 2 λ, and hence the coupling expansion is no longer validated. The improved amplitude in the text incorporates only the imaginary part, and hence the improvement is expected to be important as the coupling is just below the naïve perturbative bound.

B N-body Cross Sections and Thermal Averaging
In this appendix, we discuss the N -body cross section and its thermal averaging.

B.1 Cross sections of non-relativistic N -body system
We consider the N -body system, which is composed of N particles with momentum p i and mass m i (i = 1, · · · , N ). Jacobi coordinates of the system are often useful to simplify the N -body system. When the particle i is positioned at x i , the Jacobi coordinates of the system are defined by Here, r i (i = 1, · · · , N − 1) represents the relative coordinates between the position of the particle i + 1 and the center-of-mass position for the sub-system of i particles. r N denotes the center-of-mass position of the total system. We introduce Jacobi coordinates of the momentum space, which we call the Jacobi momenta k i (i = 1, · · · , N − 1) and the center-of-mass momentum k N .
The Jacobi momenta corresponds to the relative momentum of a particle i + 1 and the center-of-mass of subsystem with i particles. The total angular momentum is written as the vector product of the Jacobi coordinates and Jacobi momenta.
In terms of k i (i = 1, · · · , N ), the total energy of the system is Here, µ i (i = 1, · · · , N − 1) denotes the reduced mass of a particle i + 1 and the centerof-mass of subsystem with i particles, and µ N denotes the total mass of all particles, µ N = m 1 + · · · + m N . We compute the phase space integral by the use of the Jacobi momenta. The Jacobian of this transformation is unity, and hence the N -body phase space integral is computed as follows.
Here, P denotes the total 4-momentum P = (E, P ). One of advantages in using the Jacobi momenta is the simplification of the momentum integral with the delta function. The delta function of 3-momentum is given as a function of a single variable k N by the use of the Jacobi momenta, and hence we can easily carry out the k N -integral. We take the center-of-mass frame of p i , with P = 0, in the following. We can simplify the phase space integral by replacing the variables The phase space integral takes the following form: Here, dΩ d denotes the integration over the solid angle in d dimensions, and k 2 N ≡ κ 2 1 +· · ·+ κ 2 N −1 . The solid angle in d dimensions is given by We note that the subscripts for µ N and k N just denote the N -body phase space, not N -th variables. Now, we consider 2-body elastic and inelastic cross sections of identical scalar particles χ i . The scattering amplitudes for these processes are generally the functions of solid angles for the initial states and final states, even though these are independent of the angles for the 2-body scattering. Integrating the amplitude over the angle of the final state, and averaging over the angle of the initial state, we obtain the averaged cross sections [87]. For our purpose, we consider the 2 → 2 elastic scattering and the 2 → N inelastic scattering. These cross sections take the form Here, T R and |T R in, | 2 are the partial-wave amplitude integrated over the phase space of the initial state. N χ denotes the number of degrees of freedom of χ i . p = √ sβ/2 is the momentum of incoming particles. β = (1 − 4m 2 /s) 2 for the identical (up to flavor indices) particles with the mass m, where √ s is the collision energy. We use the relativistic formula for the two-body system (p and β), while taking the non-relativistic limit of the N -body system. This is because the two-body system is semi-relativistic for the 2 → N process. The prefactor in the cross section originates from the two-body phase space integral including a factor for the identical particles, 2!. A factor of 4β 2 in front of the partial-wave amplitude originates from the definition of the S-matrix elements S for the partial-wave , S = 1 + 2iβT .
Once the system has the time-reversal invariance, the partial-wave amplitudes for the 2 → N scattering and for the N → 2 scattering are given by |T R in, | 2 . In a similar way to the 2-body scattering, we define the initial-angle averaged scattering cross section for N → 2 process in the non-relativistic limit of the initial states.
Here, the prefactor is due to averaging the N initial states. We may define the velocity for the N -body states as follows: v ≡ k N /µ N . Now, we compute the thermally-averaged cross section. Another advantage of the use of the Jacobi momenta is that the Jacobi momenta are invariant under the Galilei transformation except for k N . Indeed, the momentum variables of the thermal averaging do not necessarily coincide with those of a scattering system, and hence the difference of the momentum variables does affect the thermal averaging unless we consider only s-wave process. The Jacobi momenta describe the relative motion of the subsystem except for k N describing the motion of the center-of-mass. The cross section must be expressed only by the Jacobi momenta (except for k N ) since the cross section depends only on the relative motions of scattering particles. The thermally-averaged cross section is defined as We substitute the integral variables p i → k i , and the Jacobian is unity as stated before. The integral of the cross section with respect to k N is trivially carried out since the cross section does not depend on k N . We may simplify the integral again by replacing the variables k i (i = 1, · · · , N − 1) to k N and hyperangles as shown before, and the phase space integral of the initial state may be rewritten as Here, averaging over the angle variables gives the initial-angle averaged cross section (49). Taking the partial-wave amplitudes to be the maximum value |T R in, | 2 ≤ (1/2β) 2 , we obtain the upper bound of the thermally-averaged cross section for N → 2 process.
Here, R and run only over representations whose amplitudes are non-zero, x = m χ /T , and we take the identical mass to be m χ . We again note that we independently obtain the upper bounds from Refs. [76][77][78]. As we have stated in the text, our result is consistent with Refs. [76,77] up to a factor of 1/4 due to maximizing the partial-wave amplitude, but our result disagrees with the results in Ref [78] by a factor of N √ N /2 √ 2.

B.2 Application to the SIMP models
The tree-level five-pion amplitude is given by Eq. (17) in the text. We compute the thermally-averaged cross section of the 3 → 2 process. First, we compute the 2 → 3 cross section in the non-relativistic limit of the three-body final state, and then we obtain the 3 → 2 cross section by the use of the relation between cross sections Eqs. (48) and (49) (with N = 3). We take the center-of-mass frame of the initial pions of the 2 → 3 process, p 4 + p 5 = 0, and the collision energy is assumed to be √ s, and thus the 4-momentum of the initial pion is given by with | β| = (1 − 4m 2 π /s) 1/2 . The amplitude takes the form We carry out the three-body phase-space integral for the final-state pions to obtain the 2 → 3 cross section. We introduce a practical technique to compute the six-dimensional hyperspherical integral. k 3 3 = κ 2 1 + κ 2 2 since we take the center-of-mass frame. We may parametrize each momentum as follows.
Here, (θ i , φ i ) denotes the solid angle for the momentum κ i , and θ distributes k 3 to κ 1 and κ 2 . This assignment of angles is called as the Delves coordinate [88,89]. The integral measure is where dΩ i 3 denotes the solid-angle integral for the momentum κ i . The three-body phase space integral in the non-relativistic limit takes the form: Here, k 2 3 = 2µ 3 E. Meanwhile, the relativistic form of the three-body phase space integral is given by where s 23 = (p 2 + p 3 ) 2 . The relativistic counterpart of the Delves angle θ is the Dalitz variable s 23 . s 23 takes a value within the range of (m 2 + m 3 ) 2 ≤ s 23 ≤ ( √ s − m 1 ) 2 . We may consider the non-relativistic limit of the integral by parameterizing as √ s 23 = being the center-of-mass energy in the limit. We expand β and ds 23 in small E limit as follows.
and therefore, we obtain the non-relativistic form of the phase-space integral given by Eq. (57) by replacing E = (k 2 3 /2µ 3 ). The 2 → 3 cross section in the non-relativistic limit of the three-body final state is Here, t 2 R is the decomposition of t 2 into each representations. We carry out the three-body phase-space integral by following Eq. (57). By comparing the cross section with Eq. (48) (with N = 3), we find the square of the partial-wave amplitude (18) for the inelastic scattering in the text.
The thermal averaging of the 3 → 2 annihilation is defined by Eq. (51) (with N = 3). Since the partial-wave amplitude is averaged over hyperangles, the hyperangle integral is trivially carried out and only k 3 -integral remains. We find the thermally-averaged cross section as follows.
Our result is smaller than the cross section in the original paper [53] by a factor of 1/3. The difference could arise from the Galilei transformation of the invariant amplitude Eq. (54), which is computed in the center-of-mass frame. p 1 × p 2 is not Galilean invariant, and in fact it transforms as v being a constant velocity. The corresponding part may be rewritten in terms of the Jacobi momentum as follows.
As we mentioned, k 3 = 0 in the center-of-mass frame. Hence, the second term vanishes since the amplitude (54) is computed in the center-of-mass frame, and then we obtain Eq. (61). Meanwhile, we reproduce the result in Ref. [53] by keeping the second term artificially and by keeping k 3 integral with the Maxwell distribution (with µ 3 ). We note again that it is important to rewrite the cross section in terms of the Jacobi momenta, namely the Galilean-invariant way, especially for computing thermally-averaged cross section.

C Group-theoretical Computation
We consider the irreducible decomposition of products of broken generators of G/H. It is convenient to use the SU (N f ) generators for describing the broken generators even for G/H = SU (N f )/SO(N f ) and SU (N f )/U Sp(N f ). The fundamental generators are denoted by T a , whose normalization is defined by The Fierz identity for the generator is The commutation and anticommutation of the generators gives the group-theoretical constants, d and f as and we use these constants for the short-hand notation.
The broken generator is the adjoint representation of SU (N f ), and the direct product of the generator is symbolically given by Here, Ad denotes the adjoint representation of SU (N f ). The subscripts S and A denote symmetrization and anti-symmetrization of indices of adjoint representations in the lefthand side. Other representations are summarized in Table 1. As seen in the table, several representations vanish for N f ≤ 3. For N f = 2, Ad S , T A ⊕ T A , and Y S do not appear and the decomposition is given by and for N f = 3, Y S does not appear and the decomposition is given by The projection operators with adjoint indices are defined by Here, the subscript R stands for the representations, and a factor of 4 is for the Dynkin normalization. (P R ) jl;i k ik;j l denotes the projection operators for the fundamental indices, which reflect the symmetries of corresponding Young tableaux. Now, we decompose the product (T a ) i j (T b ) k l into its irreducible representations, which corresponds to (T a ) i j (T b ) k l (P R ) jl;i k ik;j l . Ad We use the short-hand notation a i j ≡ (T a ) i j . We define the traceless part of the product a i j b k l as follows.
Here, the trace is given by tr(ab) = a i j b j i . Using this quantity, we can decompose a i j b k l into their irreducible parts. We introduce (P ab R ) ik jl as a projection of a i j b k l into its irreduceble representation R and each representation takes the form Here By multiplying 4c j i d l k , we obtain the projection operators with indices of pions, P ab;cd R . The projection operators are P ab;cd To verify the second equality, we use several identities with d abc and f abc , such as and where d R refers to the dimensions of the representation R (see Table 1 for the dimensions).
Let us consider the symmetry breaking SU (N f ) → SO(N f ). The number of unbroken generators is N f (N f − 1)/2, and hence the number of broken generators is (N f + 2)(N f − 1)/2. The broken generator satisfies T a = (T a ) T , and hence the pions transform as a second rank symmetric tensor field. The product of the symmetric tensors is decomposed as follows.
Here, the subscripts S and A denote symmetrization and anti-symmetrization of indices of symmetric representations in the left-hand side. Table 2 shows the representations and their dimensions. As seen in the table, several representations vanish for N f ≤ 3. For N f = 3, Y S does not appear and the decomposition is given by It is convenient to make the upper indices of the broken generator lower since the broken generators are symmetric: (T a ) ij ≡ (T a ) j i . The Fierz identity for the broken generator is given by Here, the summation over a runs only over the broken generators. We construct the projection operators for the product of broken generators as with the SU (N f ). The projection operators with the labels of the broken generators are defined by Now, we decompose the product (T a ) ij (T b ) kl into its irreducible representations, which corresponds to (T a ) ij (T b ) kl (P R ) ijkl;i j k l . We use the short-hand notation a ij ≡ (T a ) ij . We define the traceless part of the product a ij b kl as follows.
The trace is defined by tr(ab) ≡ a ij b ji . Using this quantity, we can decompose a ij b kl into their irreducible parts. We introduce (P ab R ) ijkl as a projection of a ij b kl into its irreducible representation R and each representation takes the form (P ab S ) ijkl = 1 6 (ã ijbkl +ã kjbil +ã kibjl +ã ljbik +ã ilbkj +ã klbij ) , By multiplying 4c ij d kl , we obtain the projection operators with indices of pions, P ab;cd R . Since the broken generators are parts of SU (N f ) generators, we may write the projection operators in terms of the group-theoretical constants of SU (N f ), f abc and d abc , as in the previous subsection. Hence, the commutation and anticommutation relations of the broken generators are given by where the indices with bars run over all generators of SU (N f ), but those without bars denote the broken generators. We find f abc is non-zero only whenc runs over the unbroken generators, while d abc is non-zero only whenc runs over the broken generators, which are known as Cartan decomposition.
We find the projection operators P ab;cd R in SU (N f )/SO(N f ) as follows.
These are consistent with the projection operators given in Ref. [90]. The trace of projection operators gives the dimension of the representation as with the previous subsection, a,b,c,d P ab;cd R P cd;ab where d R refers to the dimensions of the representation R (see Table 2 for the dimensions). We confirm properties of the projection operators by taking the trace of the projection operators:  a ⊗ a = 1 S ⊕ a S ⊕ Ad A ⊕ T A ⊕ Y S ⊕ S S .
Here, the subscripts S and A denote symmetrization and anti-symmetrization of indices of pions (or π a T a ) in the left-hand side. Table 3 shows the representations and their dimensions. As seen in the table, several representations vanish for N f ≤ 6. For N f = 4, a S , T A , and S S do not appear and the decomposition is given by and for N f = 6, S S does not appear and the decomposition is given by We construct the projection operators for the product of broken generators as with the previous subsections. We define the generator with the symplectic metric as (τ a ) ij ≡ (T a ) j i J jj , which satisfies (τ a ) T = −τ a . The Fierz identity for the broken generator is given by The projection operators with the labels of the broken generators are defined by Now, we decompose the product (τ a ) ij (τ b ) kl into its irreducible representations as with previous subsection. We use the short-hand notation a ij ≡ (τ a ) ij . We define the traceless part of the product a ij b kl as follows.
2(3N f + 2)J ij J kl + (N 2 f + 4)(J il J jk − J ik J jl ) tr(ab) Here, the invariant tensor is J ij , but not δ ij , and the trace is defined by tr(ab) ≡ J ik J jl a ij b kl . This trace coincides with the trace of SU (N f ) generators T a as follows: Using the traceless part, we can decompose a ij b kl into their irreducible parts. We introduce (P ab R ) ijkl as a projection of a ij b kl into its irreducible representation R, and each representation takes the form (P ab S ) ijkl = 1 6 (ã ijbkl −ã ikbjl +ã ilbjk +ã jkbil −ã jlbik +ã klbij ) , (P ab Y ) ijkl = 1 6 (2ã ijbkl +ã ikbjl −ã ilbjk −ã jkbil +ã jlbik + 2ã klbij ) , We obtain the projection operators in terms of the pion indices by multiplying 4c ij d kl . Similarly to the previous subsection, the broken generators are parts of the SU (N f ) generators, and hence we may write the projection operators in terms of the grouptheoretical constants for the SU (N f ) generators.
Here, the indices without bars denote the labels of broken generator, and the indices with bars run over all SU (N f ) generators. These are consistent with the projection operators given in Ref. [90]. Again, the trace of projection operators gives the dimension of the representation, a,b,c,d P ab;cd R P cd;ab where d R refers to the dimensions of the representation R (see Table 3 for the dimensions). We confirm properties of the projection operators by taking the trace of the projection operators: R a,b P ab;ab Here, as we discussed in the previous section, we use the Fierz identity for the broken generators [see Eq. (114)] instead of Eq. (87).

C.4 Partial-wave amplitudes in 2-to-2 vertices
We summarize the tree-level partial-wave amplitudes T R , which are defined in Eq.