Probing $e\mu$ flavor-violating ALP at Belle II

Recently, it was pointed out that the electron and muon g-2 discrepancies can be explained simultaneously by a flavor-violating axion-like particle (ALP). We show that the parameter regions favored by the muon g-2 are already excluded by the muonium-antimuonium oscillation bound. In contrast, those for the electron g-2 can be consistent with this bound when the ALP is heavier than 1.5 GeV. We propose to search for a signature of the same-sign and same-flavor lepton pairs and the forward-backward muon asymmetry to test the model at the Belle II experiment.

Recently, a new candidate for a BSM signal has been reported in the electron $g-2$, $(g-2)_e$. The fine-structure constant was determined precisely by measuring the cesium mass [20]. Consequently, the theoretical uncertainty of $a_e$ was reduced by a factor of 3 compared to the previous result based on the measurement of the rubidium mass [21]. It revealed that the experimental result [22] deviates from the SM prediction [6] (cf. Ref. [7]), corresponding to a $2.5\sigma$ discrepancy. Note that the experimental value of $a_e$ is smaller than the SM prediction. This is in contrast to the result for $a_\mu$, where the experimental value is larger than the SM one.
Although it is not difficult to explain either one of the $g-2$ anomalies by BSM models, it is challenging to explain both of them simultaneously by a single model because of this sign difference. Furthermore, the magnitude of the electron $g-2$ discrepancy is large: if $\Delta a_\mu$ is explained by a BSM model, the electron $g-2$ is expected to receive a contribution from this model of $\sim m_e^2/m_\mu^2 \times \Delta a_\mu = \mathcal{O}(10^{-13})$, because the BSM contributions are proportional to the lepton mass squared in a wide class of models. A contribution of $\mathcal{O}(10^{-13})$ is too small to saturate the discrepancy in Eq. (1.2). Therefore, an additional mechanism is required to enhance the contribution to the electron $g-2$.
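The order-of-magnitude estimate above can be checked with a few lines of arithmetic. The following is only a sketch: the lepton masses are PDG values supplied here as assumptions, the central value of $\Delta a_\mu$ is the one adopted later in the text, and the function name is hypothetical.

```python
# Naive mass-squared scaling of a BSM shift from the muon g-2 to the electron g-2.
M_E_MEV = 0.51099895     # electron mass in MeV (PDG value, assumed here)
M_MU_MEV = 105.6583755   # muon mass in MeV (PDG value, assumed here)
DELTA_A_MU = 27.8e-10    # central value of Delta a_mu adopted later in the text


def naive_electron_shift(delta_a_mu: float) -> float:
    """Return (m_e / m_mu)^2 * delta_a_mu, the naive electron g-2 shift."""
    return (M_E_MEV / M_MU_MEV) ** 2 * delta_a_mu


naive_shift = naive_electron_shift(DELTA_A_MU)
print(f"naive electron shift = {naive_shift:.1e}")
```

Under these inputs the shift falls somewhat below $10^{-13}$, in line with the order of magnitude quoted in the text.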
In the literature, many BSM models have been proposed to explain the anomalies simultaneously [23][24][25][26][27][28][29][30][31][32]. In particular, an axion-like particle (ALP) with lepton-flavor violating (LFV) interactions was introduced recently [31,32]. #1 Although the ALP is likely to have small couplings, it is expected to be light, and thus its contribution to the $g-2$ can be large. The sign difference between $\Delta a_e$ and $\Delta a_\mu$ is accommodated if axial-vector couplings are larger than vector ones. Besides, the LFV couplings of the ALP enable us to enhance the contribution to the electron $g-2$ due to a chirality enhancement.

#1 To explain the mass hierarchy and mixing matrices of the matter fermions, flavor symmetries (horizontal symmetries) are often introduced. Flavor-violating ALPs were originally proposed as pseudo-Nambu-Goldstone bosons associated with the spontaneous breaking of flavor symmetries at a high scale [33][34][35][36][37], and are called flavons or familons.
In this paper, we study a unique LFV ALP model which can accommodate the electron and muon $g-2$ anomalies. #2 Since the ALP is a real scalar field, its couplings to the electron and muon induce the transition of muonium into antimuonium. It will be shown that the parameter regions which explain both of the $g-2$ anomalies are already excluded by this process. In particular, the ALP contributions to the muon $g-2$ are tightly constrained. In contrast, those to the electron $g-2$ can be consistent with the bound. We then study future prospects of the Belle II experiment to probe such parameter regions. The following signatures are evaluated: (1) the production of same-sign and same-flavor lepton pairs, $e^+e^- \to \mu^\pm\mu^\pm e^\mp e^\mp$, via on-shell ALP production, and (2) the forward-backward (FB) asymmetry in muon pair production, $e^+e^- \to \mu^+\mu^-$.

This paper is organized as follows. In Sec. 2, we briefly introduce a flavor-violating ALP model. In Sec. 3, the ALP contributions to the electron and muon $g-2$ are explained, and the bound from the muonium-antimuonium oscillation is examined in Sec. 4. Future prospects at the Belle II experiment are investigated in Sec. 5: in Sec. 5.1, a signature of the on-shell production of the ALP is studied, and the FB asymmetry of the muon is discussed in Sec. 5.2. Finally, Sec. 6 is devoted to the conclusions and discussion. In Appendix A, the ALP contribution to the transition probability of muonium into antimuonium is derived, and an analytic formula for the FB asymmetry is provided in Appendix B.

Flavor-violating ALP model
Let us introduce an ALP, $a$, which is a real (pseudo-)scalar field, with flavor-violating interactions. Since such an ALP is regarded as a pseudo-Nambu-Goldstone boson of a broken symmetry, its mass $m_a$ is naturally small compared to the symmetry-breaking scale $\Lambda$. The low-energy effective Lagrangian is obtained as [31,32,39] up to dimension-five operators. Here, $v_{ij}$, $a_{ij}$, $c^{\rm eff}_{\gamma\gamma}$ and $c^{\rm eff}_{gg}$ are dimensionless parameters, which depend on details of the UV models. In particular, $v_{ij}$ and $a_{ij}$ are Hermitian matrices with flavor indices $i, j$, and $f_i$ denotes a matter fermion, including the leptons, of the $i$-th generation in the mass basis.
For the flavor-conserving interaction ($i = j$), the contributions of $v_{ii}$ vanish automatically and only the pseudo-scalar term is left. This is obvious if we consider a case when the ALP couples to on-shell fermions. Then, the $a \bar{f}_i f_j$ interaction is rewritten as by using the equation of motion, where $m_i$ is the mass of $f_i$. For $i = j$, the coefficient of $v_{ii}$ is found to be zero. Note that $c^{\rm eff}_{\gamma\gamma}$ and $c^{\rm eff}_{gg}$ are induced generally by fermion one-loop diagrams as $c^{\rm eff}_{\gamma\gamma}, c^{\rm eff}_{gg} = \mathcal{O}(a_{ii})$, even if they are suppressed at a high energy scale [31,40]. In this paper, the ALP interactions are assumed to satisfy to evade severe constraints from LFV observables. #3 The second inequality is imposed because $v_{e\mu}$, $a_{e\mu}$ can induce LFV decays of muons such as $\mu \to e\gamma$, $\pi \to e\mu$, $\mu \to eee$, and $\mu \to e$ + invisible when combined with $a_{ii}$, $c^{\rm eff}_{\gamma\gamma}$ or $c^{\rm eff}_{gg}$ [31,32,41]. On the other hand, the first condition is required because other types of LFV decays such as $\tau \to \mu\gamma$, $\tau \to e\gamma$, $\tau \to \mu\mu e$, and $\tau \to \mu e e$ can be generated by combining $v_{e\mu}$, $a_{e\mu}$ with couplings involving the tau lepton [32,41]. Such a hierarchy between the ALP interactions could be obtained from a $Z_4$ lepton flavor symmetry (cf. Ref. [42]).

#2 See also Refs. [32,38] for phenomenology of general flavor-violating ALP models.
From Eq. (2.1), it is seen that $v_{ij}$ and $a_{ij}$ appear in association with $\Lambda$ in the expressions of ALP contributions to observables. It is convenient to define the dimensionless scalar coupling $(y_V)_{e\mu}$ and pseudo-scalar coupling $(y_A)_{e\mu}$ as #4 For instance, by neglecting the electron mass, the interactions (2.2) are expressed as Note that $v_{\mu e} = v^*_{e\mu}$ and $a_{\mu e} = a^*_{e\mu}$ by Hermiticity. Therefore, the ALP contributions are represented by the following parameters, (2.6) Note that the partial wave unitarity sets the upper bound [32]: We ignore ALP interactions with quarks because they are irrelevant for the lepton $g-2$.
The interactions with neutrinos are also irrelevant in the following analysis.

#4 The couplings $y_V$ and $y_A$ are related to those in Ref. [31] as

Lepton $g-2$
The ALP contributions to the lepton $g-2$ are generated at the one-loop level as displayed in Fig. 1. From the effective Lagrangian in Eq. (2.1), we obtain the results as, #5 for $m_e^2 \ll m_\mu^2, m_a^2$, where $x_a = m_a^2/m_\mu^2$. These results are consistent with those in Refs. [31,32]. #6 Although $a_\mu^{\rm ALP}$ seems to diverge at $x_a = 1$, this is because the approximation becomes invalid; at $x_a = 1$, the result becomes for $m_e^2 \ll m_\mu^2$. As pointed out in Ref. [31], the sign of $a_e^{\rm ALP}$ can be opposite to that of $a_\mu^{\rm ALP}$, particularly when $|(y_A)_{e\mu}| > |(y_V)_{e\mu}|$ is satisfied. In fact, the loop function in $a_e^{\rm ALP}$ is positive for any $x_a$, while the function in $a_\mu^{\rm ALP}$ is positive when $x_a \gtrsim 0.9$.
This result is understood as follows: the anomalous magnetic moment $a_\ell$ is normalized by the lepton mass $m_\ell$ and requires a chirality flip of the lepton. In the ALP contributions (see Fig. 1), the chirality flip for the electron $g-2$ is caused by the intermediate muon, while it is provided by the external muon for the muon $g-2$. Thus, the ratio is scaled by a single power of the lepton mass. Then, if the ALP contribution to the muon $g-2$ is comparable to $\Delta a_\mu$, the contribution to the electron $g-2$ becomes as large as $\mathcal{O}(10^{-11})$, which overshoots $\Delta a_e$ in Eq. (1.2) by an order of magnitude. Therefore, to explain the anomalies simultaneously, a parameter tuning between $|(y_V)_{e\mu}|$ and $|(y_A)_{e\mu}|$ is necessary at the level of 1-10%, depending on $m_a$.

#5 In this paper, we focus on the flavor-violating ALP contributions. However, it was argued that the electron and muon $g-2$ discrepancies can also be explained simultaneously by large flavor-conserving ALP couplings, with $|c^{\rm eff}_{\gamma\gamma}|$ and $|a_{\mu\mu}|$ dominating over the flavor-violating $|v_{ij}|$, $|a_{ij}|$ ($i \neq j$) [31,32].
#6 For a model with the flavor-violating Yukawa interactions (2.5), the muon $g-2$ was studied in Ref. [43], which provides the exact formula, and its relation to the electron $g-2$ was discussed in Ref. [44].
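The linear mass scaling described above can be made concrete with a short numerical sketch. Since Eq. (1.2) is not reproduced in this text, the magnitude of $\Delta a_e$ used below ($8.8 \times 10^{-13}$) is an assumption, as are the PDG lepton masses; the variable names are illustrative only.

```python
# Linear (chirality-enhanced) scaling of the ALP contribution from a_mu to a_e.
M_E_MEV = 0.51099895     # electron mass in MeV (PDG value, assumed here)
M_MU_MEV = 105.6583755   # muon mass in MeV (PDG value, assumed here)
DELTA_A_MU = 27.8e-10    # central value of Delta a_mu adopted in the text
ABS_DELTA_A_E = 8.8e-13  # ASSUMED magnitude of Delta a_e (Eq. (1.2) not shown here)

# If the ALP saturates Delta a_mu, the electron shift scales with one power of m_e/m_mu.
linear_shift = (M_E_MEV / M_MU_MEV) * DELTA_A_MU

# Ratio to the assumed electron discrepancy: how strongly the estimate overshoots.
overshoot = linear_shift / ABS_DELTA_A_E

print(f"linear estimate = {linear_shift:.1e}, overshoot factor = {overshoot:.0f}")
```

With these inputs the linear estimate is of order $10^{-11}$ and exceeds the assumed $|\Delta a_e|$ by roughly an order of magnitude, which is the mismatch that the tuning between $|(y_V)_{e\mu}|$ and $|(y_A)_{e\mu}|$ has to compensate.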
In Fig. 2, the $g-2$ favored regions are shown for $m_a = 110$ MeV (left), 1.5 GeV (middle) and 10 GeV (right). The discrepancies of $(g-2)_e$ and $(g-2)_\mu$ are explained within the $2\sigma$ level in the green and yellow regions, respectively. Here and hereafter, $\Delta a_\mu = (27.8 \pm 7.4) \times 10^{-10}$ [13] is adopted. It is found that both anomalies can be reconciled by a mild tuning between $(y_V)_{e\mu}$ and $(y_A)_{e\mu}$ when the ALP is light, while the parameter tuning becomes tighter for larger $m_a$, as mentioned in Ref. [31]. In the figures, we also show the exclusion limit from the muonium-antimuonium oscillation measurement by the blue-shaded regions, which will be discussed in the next section.
When m a is smaller than (m µ − m e ), the decay of µ → ea is kinematically open and tightly constrained by a search for µ → e + invisible [31,32,45]. Hence, no parameter space is available in m a < (m µ − m e ) to explain both of the anomalies.
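As a small illustration of the kinematic condition above, the sketch below evaluates the threshold $m_\mu - m_e$ and checks whether $\mu \to e a$ is open for a given ALP mass. The lepton masses are PDG values supplied as assumptions, and the helper name is hypothetical.

```python
# Kinematic threshold for the two-body decay mu -> e a.
M_E_MEV = 0.51099895     # electron mass in MeV (PDG value, assumed here)
M_MU_MEV = 105.6583755   # muon mass in MeV (PDG value, assumed here)

threshold_mev = M_MU_MEV - M_E_MEV


def mu_to_e_a_is_open(m_a_mev: float) -> bool:
    """True if an on-shell ALP of mass m_a (in MeV) can be emitted in mu -> e a."""
    return m_a_mev < threshold_mev


print(f"threshold = {threshold_mev:.3f} MeV")
print("open for m_a = 110 MeV:", mu_to_e_a_is_open(110.0))
```

The lightest benchmark of Fig. 2, $m_a = 110$ MeV, sits just above this threshold, so the decay is closed there.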
Finally, let us comment on CP-violating contributions from the flavor-violating ALP couplings. The most severe constraint comes from the electric dipole moment (EDM) of the electron, $|d_e| < 1.1 \times 10^{-29}\, e\,$cm (90% CL) [46]. The contribution is generated by a diagram similar to that in Fig. 1, but it is proportional to ${\rm Im}[(y_V)_{e\mu} (y_A)^*_{e\mu}]$. Naively, it is estimated as $|d_e| \sim e/(2m_e)\, a_e^{\rm ALP} \sim \mathcal{O}(10^{-23})\, e\,$cm, by requiring $a_e^{\rm ALP} \sim \Delta a_e$. Therefore, the ALP couplings are required to satisfy Im (
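The naive EDM estimate $|d_e| \sim e/(2m_e)\, a_e^{\rm ALP}$ can be evaluated numerically by converting $1/(2m_e)$ to centimeters with $\hbar c$. This is only a sketch: the value of $|\Delta a_e|$ is an assumption (Eq. (1.2) is not reproduced in this text), the constants are standard values supplied here, and the function name is hypothetical.

```python
# Naive electron EDM estimate |d_e| ~ e/(2 m_e) * a_e, expressed in units of e*cm.
HBAR_C_MEV_CM = 1.973269804e-11  # hbar*c in MeV*cm (standard value, assumed here)
M_E_MEV = 0.51099895             # electron mass in MeV (PDG value, assumed here)
ABS_DELTA_A_E = 8.8e-13          # ASSUMED magnitude of Delta a_e (Eq. (1.2) not shown)
D_E_LIMIT_E_CM = 1.1e-29         # experimental bound quoted in the text (90% CL)


def naive_edm_e_cm(a_e: float) -> float:
    """Return a_e * hbar*c / (2 m_e), i.e. the naive EDM in e*cm."""
    return HBAR_C_MEV_CM / (2.0 * M_E_MEV) * a_e


d_e_naive = naive_edm_e_cm(ABS_DELTA_A_E)

# Factor by which the CP-violating combination must be suppressed to satisfy the bound.
phase_suppression = D_E_LIMIT_E_CM / d_e_naive

print(f"naive |d_e| = {d_e_naive:.1e} e cm, required suppression = {phase_suppression:.1e}")
```

Under these assumptions the naive estimate is of order $10^{-23}\, e\,$cm, so the CP-violating combination of couplings has to be suppressed by roughly six orders of magnitude relative to the CP-conserving one.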

Muonium-antimuonium oscillation
Since the ALP is a real scalar boson, an exchange of the flavor-violating ALP generates effective $[\bar{\mu}e][\bar{\mu}e]$-type operators at low energies. Such operators can be probed by measurements of the transition of muonium ($M = \mu^+ e^-$) into antimuonium ($\bar{M} = \mu^- e^+$), the so-called muonium-antimuonium ($M$-$\bar{M}$) oscillation [47][48][49]. Since the SM prediction is highly suppressed by the neutrino masses [50], it is very sensitive to the $e\mu$ flavor-violating interactions.
The derivation is provided in Appendix A. Note that this formula is obtained by integrating out the intermediate ALP, i.e., it is valid for $m_a > m_\mu$. We also assumed CP symmetry, and thus parity-violating interferences between the scalar and pseudo-scalar interactions are dropped [54]. The most precise measurement has been performed by the MACS experiment at PSI [55]. The upper bound on the $M$-$\bar{M}$ oscillation was obtained as at the 90% CL. In the experiment, an external magnetic field of $B = 0.1$ Tesla was applied to detect an energetic $e^-$ from the antimuonium decay in a magnetic spectrometer. Besides, $c_{F,m_F}$ stands for the population of the muonium state in the experimental setup, where $F$ is the total angular momentum of the muonium and $m_F = -F, -F+1, \ldots, F$. For the MACS experiment, it is estimated as $|c_{0,0}|^2 = 0.32$ and $|c_{1,0}|^2 = 0.18$ [51,52]. #7 For specific cases, we obtain the upper bounds, #8

#7 As a crosscheck, we reproduced the magnetic field correction factors $S_B$ in Table II of Ref. [55] by supposing an equal population, $|c_{0,0}|^2 = |c_{1,-1}|^2 = |c_{1,0}|^2 = |c_{1,1}|^2 = 0.25$.
#8 These bounds are consistent with those in Refs. [41,56], where the magnetic effects are taken into account via $S_B$. Also, our results are slightly more severe than those in Refs. [57,58].
In Fig. 2, the blue-shaded regions are excluded by the $M$-$\bar{M}$ oscillation. It is shown that the region favored by the muon $g-2$ is completely excluded. #9 Since the ALP contributions to both $a_\mu$ and $P_{M\bar{M}}$ depend on the couplings only through the combination $(y_{V,A})^2/m_a^2$ for $m_a \gg m_\mu$, $\Delta a_\mu$ cannot be explained even by a heavier ALP. On the other hand, $\Delta a_e$ can be explained for $m_a > 1.5$ GeV.
Since the loop function of the ALP contribution to $a_e$ is proportional to $\log m_a^2$ for $m_a \gg m_\mu$, it is likely to be amplified compared to that for $a_\mu$, and the constraint from the $M$-$\bar{M}$ oscillation becomes relaxed. In the next section, we will study how to search for such an ALP at the Belle II experiment.

Collider signals at Belle II
In this section, we investigate experimental sensitivities to the $e\mu$ flavor-violating ALP at the Belle II experiment. First, we consider the process $e^+e^- \to \mu^\pm e^\mp a \to \mu^\pm\mu^\pm e^\mp e^\mp$, which is effective for $m_a \leq \sqrt{s_{\rm BelleII}}$, where $\sqrt{s_{\rm BelleII}} = 10.58$ GeV is the center-of-mass energy of the experiment. Next, the FB asymmetry in the process $e^+e^- \to \mu^+\mu^-$ is investigated, where an off-shell ALP contributes, and thus the observable has sensitivity to the ALP with $m_a \geq \sqrt{s_{\rm BelleII}}$.
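The quoted center-of-mass energy can be cross-checked against the asymmetric beam energies used in Sec. 5.1, $E(e^-) = 7$ GeV and $E(e^+) = 4$ GeV. The sketch below assumes massless beams colliding head-on (the beam crossing angle is neglected), and the function names are illustrative.

```python
import math

E_ELECTRON_GEV = 7.0  # e- beam energy quoted in Sec. 5.1
E_POSITRON_GEV = 4.0  # e+ beam energy quoted in Sec. 5.1


def sqrt_s_head_on(e1_gev: float, e2_gev: float) -> float:
    """Center-of-mass energy for massless, head-on beams: sqrt(s) = 2 sqrt(E1 E2)."""
    return 2.0 * math.sqrt(e1_gev * e2_gev)


def boost_beta_gamma(e1_gev: float, e2_gev: float) -> float:
    """Lorentz boost beta*gamma of the collision system in the laboratory frame."""
    return abs(e1_gev - e2_gev) / sqrt_s_head_on(e1_gev, e2_gev)


sqrt_s = sqrt_s_head_on(E_ELECTRON_GEV, E_POSITRON_GEV)
beta_gamma = boost_beta_gamma(E_ELECTRON_GEV, E_POSITRON_GEV)
print(f"sqrt(s) = {sqrt_s:.2f} GeV, beta*gamma = {beta_gamma:.3f}")
```

The boost is the reason the kinematical cuts in Sec. 5.1 are defined in the laboratory frame, whereas the FB asymmetry in Sec. 5.2 is defined in the center-of-mass frame.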

5.1 $e^+e^- \to \mu^\pm\mu^\pm e^\mp e^\mp$ via on-shell ALP
In the $e\mu$ flavor-violating ALP model, the production of same-sign and same-flavor lepton pairs can proceed, #10
$e^+e^- \to \mu^\pm e^\mp a \to \mu^\pm\mu^\pm e^\mp e^\mp$. (5.1)
The branching ratios of the ALP are ${\rm BR}(a \to \mu^+e^-) = {\rm BR}(a \to \mu^-e^+) = 0.5$ for $m_a \geq m_\mu + m_e$. In Fig. 3, we show some of the diagrams which contribute to the process. In the left and middle diagrams, an on-shell ALP is produced, while it is exchanged in an off-shell state in the right diagram. #11 This process is quite unique; the final state includes a pair of same-sign electrons and a pair of same-sign muons. Such same-flavor and same-sign lepton final states are never produced within the SM. Besides, the charge reconstruction of leptons is quite accurate in the Belle II experiment [59][60][61]. Hence, we simply neglect the SM background in the following analysis. #12 The fiducial production cross section is estimated by using MadGraph5 [63] with the ALP model file generated by FeynRules [64]. Signal events with the asymmetric beam energies $E(e^-) = 7$ GeV and $E(e^+) = 4$ GeV [61] are generated, and the following kinematical cuts for the final-state leptons are imposed in the laboratory frame [60,61]: where $\theta$ is the angle to the $e^-$ beam line in the laboratory frame. Besides, the electron and muon tagging efficiencies are taken into account for each final-state lepton by following Ref. [61]; they depend on the electron/muon energy but are assumed to be independent of the polar and azimuthal angles for simplicity.

Figure 4 shows the fiducial cross section after the kinematical cuts as a function of the ALP mass. Here, $(y_A)_{e\mu}$ and $m_a$ are varied with $(y_V)_{e\mu} = 0$ fixed. #13 The green band represents the predicted cross section for which the electron $g-2$ discrepancy is explained within the $2\sigma$ level, and the green dashed line stands for the central value of the discrepancy. The horizontal dotted, dashed, and solid blue lines represent the expected sensitivities of the Belle II experiment with the integrated luminosity of 1, 5, and 50 ab$^{-1}$, respectively. If no signal events are observed, the region above the horizontal lines will be excluded at 95% CL, where the Poisson distribution is applied. It is found that when the ALP accounts for the central value of $\Delta a_e$, the Belle II experiment with the integrated luminosity of 5 (50) ab$^{-1}$ can test the model for $1.5 \leq m_a \leq 9.8$ GeV ($0.15 \leq m_a \leq 9.9$ GeV). In a smaller mass region, the muon emitted by the ALP decay becomes soft and cannot pass the kinematical cut in Eq. (5.3). On the other hand, the production cross section of $e^+e^- \to \mu^\pm e^\mp a$ for a heavier ALP is suppressed both by the phase space and by the kinematical cuts.

#9 The same conclusion was made in Ref. [41] for a light scalar model with effective Yukawa couplings.
#10 A process $e^+e^- \to \mu^\pm e^\mp a \to \mu^\pm\mu^\mp e^\pm e^\mp$ is also predicted but is less distinctive due to larger backgrounds.
#11 In the analysis, although the off-shell contributions are included, we checked that the production cross section is dominated by the on-shell ALP productions.
#12 Another distinctive feature of the signal is an invariant mass peak around the ALP mass, $(p_e + p_\mu)^2 = m_a^2$. Although there are two possible combinations to construct the $e\mu$ resonance, the wrong combination just provides a continuum distribution. To obtain a clear peak, we further need to drop it on an event-by-event basis as discussed in Sec. 3.3.1 of Ref. [62]. In this paper, we do not impose this condition.
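The background-free Poisson exclusion described above has a simple closed form: observing zero events excludes, at confidence level CL, any expected signal yield above $-\ln(1 - {\rm CL})$. The sketch below converts this into a cross-section limit for a given integrated luminosity, assuming that acceptance and tagging efficiencies are already folded into the fiducial cross section; the function names are illustrative.

```python
import math


def poisson_zero_event_upper_limit(cl: float = 0.95) -> float:
    """Expected signal yield excluded at confidence level `cl` when zero events are seen.

    For a background-free Poisson process, P(0 | s) = exp(-s) = 1 - cl.
    """
    return -math.log(1.0 - cl)


def fiducial_xsec_limit_ab(lumi_ab_inv: float, cl: float = 0.95) -> float:
    """Upper limit on the fiducial cross section in attobarn for luminosity in ab^-1."""
    return poisson_zero_event_upper_limit(cl) / lumi_ab_inv


for lumi in (1.0, 5.0, 50.0):
    print(f"L = {lumi:4.0f} ab^-1 -> sigma_fid < {fiducial_xsec_limit_ab(lumi):.3f} ab")
```

The limit scales inversely with the luminosity, which is why the three horizontal sensitivity lines in Fig. 4 are evenly ordered on a logarithmic axis.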
Let us comment on the case in which $m_a$ exceeds $\sqrt{s_{\rm BelleII}}$. Although the on-shell production of $a$ is kinematically forbidden, the off-shell processes are still allowed. However, since the production cross section becomes very small in the parameter region where the electron $g-2$ discrepancy is explained, the ALP model will not be probed by the Belle II experiment via $e^+e^- \to \mu^\pm\mu^\pm e^\mp e^\mp$. #14

#13 When $(y_V)_{e\mu} \neq 0$, larger couplings are required to explain $\Delta a_e$ because of the cancellation in Eq. (3.1). Then, larger cross sections are predicted and the model can be tested with smaller luminosity.
#14 Recently, LHC sensitivities to the flavor-violating ALP were studied in Ref. [58] in the process $pp \to h \to \mu^\pm\mu^\pm e^\mp e^\mp$. It was shown that the region of $20 \lesssim m_a \lesssim 80$ GeV can be probed accurately and its sensitivity is better than that of the $M$-$\bar{M}$ oscillation. Future lepton-collider sensitivities were also discussed in Ref. [41].

Figure 4. The fiducial production cross section of $e^+e^- \to \mu^\pm\mu^\pm e^\mp e^\mp$ as a function of $m_a$. Here, $(y_A)_{e\mu}$ and $m_a$ are varied, while $(y_V)_{e\mu} = 0$ is fixed. The electron $g-2$ discrepancy is explained within the $2\sigma$ level in the green band with the central value shown by the green dashed line. The horizontal dotted, dashed, and solid blue lines represent the expected sensitivity of the Belle II experiment with the integrated luminosity of 1, 5, and 50 ab$^{-1}$, respectively.

5.2 Forward-backward asymmetry of $e^+e^- \to \mu^+\mu^-$
Next, we consider the FB asymmetry of $e^+e^- \to \mu^+\mu^-$. Within the SM, it arises from an interference of the $s$-channel $\gamma$ and $Z$ exchanges at the tree level. Since the latter contribution is suppressed by a factor of $s_{\rm BelleII}/m_Z^2 \ll 1$, the SM value becomes tiny at the Belle II experiment [65,66]. On the other hand, the flavor-violating ALP contributes in a $t$-channel process. Such a contribution can modify the angular distribution of the muon pair and be probed by measuring the FB asymmetry: where $\theta^*$ is the angle between $\mu^+$ and $e^+$ in the center-of-mass frame and $N_+(\cos\theta^* > 0)$ is the number of $\mu^+$ satisfying $\cos\theta^* > 0$. The ALP contribution to the FB asymmetry is evaluated by calculating the interference of the transition amplitudes of the SM and ALP contributions to the production cross section. Since the SM amplitude is dominated by the $s$-channel $\gamma$ contribution, we obtain the interference term $I$ as where $s = s_{\rm BelleII}$ and $k$ is the momentum transfer of the ALP, The derivation is provided in Appendix B. Consequently, the ALP contribution to the FB asymmetry is given as When $|(y_A)_{e\mu}| \gg |(y_V)_{e\mu}|$ and $m_a^2 \gg s$, it is approximated as If the Belle II experiment is supposed to detect $A_{\rm FB}^{\rm ALP} > \delta A_{\rm FB}$ in the future, the ALP coupling could be probed for $|(y_A)_{e\mu}| \gg |(y_V)_{e\mu}|$ and $m_a^2 \gg s_{\rm BelleII}$. The Belle II experiment may achieve a statistical uncertainty of $\delta A_{\rm FB} \sim 10^{-5}$ with the integrated luminosity of 50 ab$^{-1}$ [65], though the systematic uncertainty is still unknown. On the other hand, the uncertainty of the SM prediction is likely to be dominated by that of the vacuum polarization. #15 Its size is inferred to be as large as or slightly larger than the experimental statistical uncertainty by considering an analogy to a study of the FB asymmetry in the electroweak precision test (see e.g., Ref. [67] for a recent work). Thus, reductions of the SM uncertainties could be essential to improve the Belle II sensitivity. See Refs. [68,69] for prospects of the vacuum polarization contribution.
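The quoted statistical reach $\delta A_{\rm FB} \sim 10^{-5}$ can be cross-checked at the order-of-magnitude level. The sketch below is not the estimate of Ref. [65]: it assumes the tree-level QED cross section $\sigma = 4\pi\alpha^2/(3s)$ with the low-energy value of $\alpha$, full acceptance, and $\delta A_{\rm FB} \approx 1/\sqrt{N}$ for a small asymmetry. All names are illustrative.

```python
import math

ALPHA = 1.0 / 137.035999084   # fine-structure constant at low energy (assumed, no running)
GEV2_TO_NB = 3.893793721e5    # (hbar c)^2 in GeV^2 * nb
SQRT_S_GEV = 10.58            # Belle II center-of-mass energy quoted in the text
NB_INV_PER_AB_INV = 1.0e9     # 1 ab^-1 expressed in nb^-1


def sigma_mumu_nb(sqrt_s_gev: float) -> float:
    """Tree-level QED cross section for e+ e- -> mu+ mu- in nb (masses neglected)."""
    s = sqrt_s_gev ** 2
    return 4.0 * math.pi * ALPHA ** 2 / (3.0 * s) * GEV2_TO_NB


def stat_delta_afb(lumi_ab_inv: float, sqrt_s_gev: float = SQRT_S_GEV) -> float:
    """Statistical uncertainty 1/sqrt(N) on a small FB asymmetry, assuming full acceptance."""
    n_events = sigma_mumu_nb(sqrt_s_gev) * lumi_ab_inv * NB_INV_PER_AB_INV
    return 1.0 / math.sqrt(n_events)


print(f"sigma(mu mu) = {sigma_mumu_nb(SQRT_S_GEV):.2f} nb")
print(f"delta A_FB (50 ab^-1) = {stat_delta_afb(50.0):.1e}")
```

With these simplifications the result lies somewhat below $10^{-5}$; acceptance and efficiency losses, which are ignored here, push it toward the quoted value.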
In this paper, we consider $\delta A_{\rm FB} = 10^{-4}$ and $10^{-5}$ for the future sensitivity as a reference. In order to investigate the ALP contribution to the FB asymmetry at these accuracies, one needs event samples of $\mathcal{O}(10^{8})$ to $\mathcal{O}(10^{10})$ for the Monte Carlo simulation. This is beyond the scope of this paper, and we compare Eq. (5.7) with $\delta A_{\rm FB}$ to derive the future sensitivity. Note that although signal acceptance and efficiencies should be taken into account, we have checked that such corrections are minor; the event number may be reduced by $\sim 10\%$. We neglect them in the analysis for simplicity. In Fig. 5, the Belle II sensitivity to the FB asymmetry is shown for $\delta A_{\rm FB} = 10^{-4}$ and $10^{-5}$ by the dashed and solid red lines, respectively. Here, $(y_V)_{e\mu} = 0$ is fixed. In the green and yellow regions, $\Delta a_e$ and $\Delta a_\mu$ are explained within the $2\sigma$ level with the central values drawn by the green and orange dotted lines, respectively. The $M$-$\bar{M}$ oscillation excludes the blue-shaded region at the 90% CL. It is seen that the region favored by the muon $g-2$ is completely excluded by the $M$-$\bar{M}$ oscillation. In contrast, although $\Delta a_e$ can be explained while satisfying the $M$-$\bar{M}$ oscillation bound, it is found that the Belle II measurement of the FB asymmetry with $\delta A_{\rm FB} = 10^{-5}$ can probe most of such parameter regions for $m_a \gtrsim 9$ GeV.

#15 Although uncertainties from long-distance QED corrections could also be large, they might be suppressed by investigating $\mu^+\mu^-$ angular distributions thanks to high statistics at the Belle II experiment [66].
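The sample sizes quoted above follow from counting statistics. The sketch below uses the standard counting form $A_{\rm FB} = (N_F - N_B)/(N_F + N_B)$ with binomial error $\sqrt{(1 - A_{\rm FB}^2)/N}$, which is assumed here to match the definition used in the text; the function names are hypothetical.

```python
import math


def fb_asymmetry(cos_theta_values):
    """Counting estimate of A_FB and its binomial uncertainty from cos(theta*) values."""
    n_forward = sum(1 for c in cos_theta_values if c > 0)
    n_backward = sum(1 for c in cos_theta_values if c < 0)
    n_total = n_forward + n_backward
    a_fb = (n_forward - n_backward) / n_total
    error = math.sqrt((1.0 - a_fb ** 2) / n_total)
    return a_fb, error


def events_needed(delta_a_fb: float, a_fb: float = 0.0) -> float:
    """Number of events required to reach a statistical precision delta_a_fb."""
    return (1.0 - a_fb ** 2) / delta_a_fb ** 2


# Toy input with three forward and one backward muon.
a_fb_toy, err_toy = fb_asymmetry([0.5, 0.2, -0.3, 0.9])
print(a_fb_toy, round(err_toy, 3))
print(f"{events_needed(1e-4):.0e}", f"{events_needed(1e-5):.0e}")
```

For a small asymmetry the required number of events is simply $1/\delta A_{\rm FB}^2$, which spans the range of sample sizes mentioned in the text for the two reference precisions.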
The LEP experiments also performed measurements of the FB asymmetry of $e^+e^- \to \mu^+\mu^-$ for center-of-mass energies from 130 GeV to 207 GeV [70]. The experimental uncertainties are dominated by the statistical one, and one of the most precise results is provided at $\sqrt{s} = 207$ GeV as
$A_{\rm FB}^{\rm LEP} = 0.535 \pm 0.028 \pm 0.004$, (5.10)
where the first and second errors are statistical and systematic uncertainties, respectively. On the other hand, it is reported in Ref. [70] that the SM value is 0.552 with an uncertainty smaller than the above statistical error. Since this is consistent with the experimental result, the ALP contribution is constrained by the LEP experiment. In Fig. 5, we show the LEP $A_{\rm FB}$ bound at the $2\sigma$ level by the red-shaded region. #16 It is found that the LEP measurement does not constrain the parameter region favored by the electron $g-2$. Although the limit may be improved by combining all the LEP results at various center-of-mass energies, we expect that it is still weaker than the constraint from the $M$-$\bar{M}$ oscillation and cannot reach the Belle II sensitivities. #17 Finally, in Fig. 5, the Belle II sensitivity to a search for $e^+e^- \to \mu^\pm e^\mp a \to \mu^\pm\mu^\pm e^\mp e^\mp$, which was discussed in the last subsection, is also shown for the integrated luminosity of 1 and 50 ab$^{-1}$ by the dashed and solid blue lines, respectively. For $1 \lesssim m_a \lesssim 10$ GeV with 50 ab$^{-1}$, it is found that the process is useful to probe the parameter region which is favored by the electron $g-2$ and consistent with the $M$-$\bar{M}$ oscillation data. This result is complementary to the FB asymmetry; we conclude that the ALP parameter region favored by the electron $g-2$ could be tested almost entirely by the Belle II experiment.
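The consistency between Eq. (5.10) and the SM value, and the room left for an ALP contribution at the $2\sigma$ level, can be quantified as follows. This is a sketch that assumes the statistical and systematic errors are combined in quadrature and that the SM uncertainty is neglected; the variable names are illustrative.

```python
import math

A_FB_LEP = 0.535   # measured value at sqrt(s) = 207 GeV, Eq. (5.10)
STAT_ERR = 0.028   # statistical uncertainty
SYST_ERR = 0.004   # systematic uncertainty
A_FB_SM = 0.552    # SM value quoted from Ref. [70]

# Quadrature combination of the two experimental uncertainties (an assumption).
total_err = math.hypot(STAT_ERR, SYST_ERR)

# Pull of the measurement with respect to the SM value, in units of sigma.
pull = (A_FB_LEP - A_FB_SM) / total_err

# Range of an additive ALP shift of A_FB that remains allowed at the 2 sigma level.
shift_min = (A_FB_LEP - A_FB_SM) - 2.0 * total_err
shift_max = (A_FB_LEP - A_FB_SM) + 2.0 * total_err

print(f"pull = {pull:.2f} sigma")
print(f"allowed ALP shift: [{shift_min:.3f}, {shift_max:.3f}]")
```

The allowed shift is at the level of a few times $10^{-2}$, several orders of magnitude looser than the $\delta A_{\rm FB}$ values considered for Belle II.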

Conclusion
In this paper, we revisited the $e\mu$ flavor-violating ALP model motivated by the electron and muon $g-2$ anomalies. Such an ALP inevitably induces the muonium-antimuonium oscillation, and it was found that the whole parameter region which explains both of the discrepancies simultaneously is already excluded. Nevertheless, the model can still accommodate the electron $g-2$ anomaly alone. In order to test such parameter regions, we investigated the Belle II sensitivity to the on-shell ALP production $e^+e^- \to \mu^\pm e^\mp a$ followed by $a \to \mu^\pm e^\mp$, and to the forward-backward asymmetry of $e^+e^- \to \mu^+\mu^-$. We found that the former provides a good sensitivity for $m_a < 10$ GeV, and the latter does for $m_a > 10$ GeV. Hence, these measurements are complementary and provide better sensitivities than the muonium-antimuonium oscillation bound for $m_a > 1$ GeV. As a result, the parameter region favored by the electron $g-2$ anomaly can be tested almost entirely by the Belle II experiment with the integrated luminosity of 50 ab$^{-1}$.

#16 In the analysis, we include the $s$-channel $Z$ contribution to the SM amplitude as well as that from the $\gamma$ exchange, though the asymmetry is dominated by the interference of the ALP amplitude with the $\gamma$ contribution for $\sqrt{s} = 207$ GeV.
#17 In Ref. [70], constraints on four-fermion interactions are also investigated by combining all the LEP results; according to Table 3.15, one finds $\Lambda > 12.1$ TeV for the A0$^-$ model, which corresponds to a purely pseudo-scalar electron-muon interaction. See the reference for definitions of the parameters. Since $(y_A)_{e\mu}$ is related to $\Lambda$ as $|(y_A)_{e\mu}| = \sqrt{8\pi}\, m_a/\Lambda$, the limit is converted to be at the 95% CL, where the ALP is assumed to be decoupled. This result is consistent with Ref. [41]. On the other hand, by using Eq. (5.10) we obtain for $m_a \gg 207$ GeV. Thus, our result relying solely on Eq. (5.10) is not much weaker than the fully combined limit, Eq. (5.11).
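The conversion described in footnote #17 amounts to a one-line calculation. The sketch below applies the stated relation $|(y_A)_{e\mu}| = \sqrt{8\pi}\, m_a/\Lambda$ to the quoted LEP contact-interaction limit; it only illustrates what that relation gives and is not a reproduction of Eq. (5.11), which is not shown in this text. The function name is hypothetical.

```python
import math

LAMBDA_LIMIT_TEV = 12.1  # LEP lower limit on the contact-interaction scale (footnote #17)


def coupling_over_mass_limit(lambda_tev: float) -> float:
    """Upper limit on |(y_A)_emu| / m_a in TeV^-1 from |(y_A)_emu| = sqrt(8 pi) m_a / Lambda."""
    return math.sqrt(8.0 * math.pi) / lambda_tev


y_over_m_limit = coupling_over_mass_limit(LAMBDA_LIMIT_TEV)
print(f"|(y_A)_emu| / m_a < {y_over_m_limit:.2f} TeV^-1")
```

Because the bound constrains only the ratio of coupling to mass, it applies in the decoupling limit where the ALP is much heavier than the LEP center-of-mass energies.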
Finally, we briefly comment on a light and flavor-violating complex scalar field. If this complex scalar $S$ carries an electron or muon charge, the low-energy effective interactions become $-(\partial_\mu S/\Lambda)\,\bar{e}\gamma^\mu (v_{e\mu} - a_{e\mu}\gamma_5)\mu$ or $(y_V)_{e\mu} S\,\bar{e}\mu + (y_A)_{e\mu} S\,\bar{e}\gamma_5\mu$ with their Hermitian conjugates, while interactions of $e \leftrightarrow \mu$ are forbidden. Although the contributions to the lepton $g-2$ become analogous to Eqs. (3.1) and (3.2), the transition of muonium into antimuonium does not proceed via the scalar. Hence, the flavor-violating complex scalar can explain both $g-2$ anomalies simultaneously, though the UV completion is nontrivial [58]. #18 At the Belle II experiment, the measurement of the forward-backward asymmetry could probe such a parameter region when $m_S > \mathcal{O}(1)$ GeV; the muon $g-2$ discrepancy requires larger couplings, which enhance $A_{\rm FB}$, too. Although the same-flavor and same-sign productions of the electron and muon pairs never proceed, one can consider a flavor-violating resonance search in the same-flavor and opposite-sign lepton pair production $e^+e^- \to \mu^\pm e^\mp a \to \mu^\pm e^\mp \mu^\mp e^\pm$, which could be a target in the early stage of the Belle II experiment.

A Muonium-antimuonium transition
In this appendix, the probability of the transition of muonium ($M = \mu^+ e^-$) into antimuonium ($\bar{M} = \mu^- e^+$) is derived under an external magnetic field $B$. We follow the analysis of Ref. [52].
Neglecting effects of the $B$ field, depending on the spins of the leptons, there are four types of the 1S muonium state $|M; J, m_J\rangle$: Under a nonzero external $B$ field, they are modified via the magnetic dipole moments of the leptons: and the $|M; 1, \pm 1\rangle$ states change only their energy levels. Here, $s$ and $c$ are with a dimensionless parameter, where $\mu_B = e/(2m_e)$ is the Bohr magneton, $g_e \simeq g_\mu \simeq 2$ are the $g$-factors of the electron and muon, and $a \simeq 1.846 \times 10^{-5}$ eV is the 1S muonium hyperfine splitting. Similarly, the antimuonium states are represented as The transition probability of $M \to \bar{M}$ under the nonzero $B$ field is denoted as $P_{M\bar{M}}$ $|c_{0,0}|^2 P$ where $|c_{J,m_J}|^2$ is the population probability of the muonium initial state with $(J, m_J)$, which satisfies $|c_{0,0}|^2 + |c_{1,-1}|^2 + |c_{1,0}|^2 + |c_{1,1}|^2 = 1$. Note that the transitions $|M; 0, 0\rangle_B \to |\bar{M}; 1, 0\rangle_B$ and $|M; 1, 0\rangle_B \to |\bar{M}; 0, 0\rangle_B$ are extremely suppressed due to the hyperfine splitting, and that of $|M; 1, \pm 1\rangle_B \to |\bar{M}; 1, \pm 1\rangle_B$ is also suppressed because the energy levels of the initial and final states are modified under the nonzero magnetic field. The transition probabilities are represented by the transition amplitudes as [48,49], In the case of the effective Lagrangian of Eq. (2.5), when the ALP is sufficiently heavier than the muonium, the effective Hamiltonian is obtained by integrating out the ALP field as where $S = \bar{\mu}e$ and $P = \bar{\mu}\gamma_5 e$. Here, CP conservation is assumed. Neglecting effects of the magnetic field, the transition matrix elements are [51] M; F = 0|S 2 |M; F = 0 0 = 2 πa

B Forward-backward asymmetry
In this appendix, the ALP contribution to the FB asymmetry of e + e − → µ + µ − is calculated. The asymmetry is obtained by an interference of the SM and ALP scattering amplitudes.
In the former, a contribution with the virtual $\gamma$ exchange dominates the amplitude for the energy $s \ll m_Z^2$. Then, it becomes We define $\theta^*$ as the angle between $\mu^+$ and $e^+$ in the center-of-mass frame. The integration of the squared amplitude over $\cos\theta^*$ including a factor of the polarization sums leads to $(\bar{v}_{e^+}\gamma_\mu P_R u_{e^-})(\bar{u}_{\mu^-}\gamma^\mu P_L v_{\mu^+}) + \left|(y_V)_{e\mu} + (y_A)_{e\mu}\right|^2 (\bar{v}_{e^+}\gamma_\mu P_L u_{e^-})(\bar{u}_{\mu^-}\gamma^\mu P_R v_{\mu^+})$ for $s \gg m_\mu^2$. Note that the differential cross section is given as where the second line corresponds to the interference term.