Hunting for ALPs with Lepton Flavor Violation

We examine the low-energy signatures of axion-like particles (ALPs) in lepton flavor violating (LFV) processes. By using a dimension-5 effective Lagrangian, we compute the most general ALP contributions to LFV decays of leptons and mesons. The provided expressions are valid for any choice of ALP mass and couplings. We explore the complementarity of different processes, identifying specific patterns to be experimentally tested. Constraints on LFV couplings are derived from existing data and prospects for forthcoming experiments are also discussed. As a by-product, we revisit the possibility of a simultaneous explanation of the observed discrepancies in the muon and electron $g-2$ through ALP interactions.

alizing and complementing previous analyses [20,21]. We consider two distinct classes of processes: (i) purely leptonic and (ii) hadronic decays. The first class comprises radiative decays j → i γ, three-body decays j → i k k and j → i γγ (with j = µ, τ ), as well as µ → e conversion in nuclei, which we consider to be a leptonic process since the only coherently-enhanced contributions are the ones coming from photon penguins. Hadronic decays include leptonic meson decays P → i j , their inverse processes τ → P j , as well as semileptonic decays P → P (V ) i j , where P ( ) and V stand for pseudoscalar and vector mesons, respectively. We derive general formulae for these processes that can be applied for any choice of the ALP mass and couplings. We show in detail that the relative importance of the various processes depends not only on the relative strength of ALP couplings to photon/gluons and fermions but also on the ALP mass. In particular, the correlations among observables can change drastically depending on whether ALPs are produced onshell or off-shell. This makes it possible to infer the ALP mass indirectly, through their virtual effects to LFV processes.
Another observable which is highly sensitive to ALP couplings is the anomalous magnetic moment of leptons a = (g − 2) /2 [10,22]. This quantity received a lot of attention due to the longstanding ≈ 3.6σ discrepancy between the experimental measurement and the SM prediction, ∆a µ = a exp µ − a SM µ = (27.1 ± 7.3) × 10 −10 [23][24][25], which might be soon clarified by the ongoing analysis at the Muon g-2 experiment (Fermilab) [26]. In a large class of NP scenarios, contributions to magnetic moment scale with the square of lepton masses (the so-called "naive scaling"), in such a way that the current discrepancy ∆a µ suggests a NP effect in a e at the level of (7 ± 2) × 10 −14 [27]. 1 Testing NP with (g − 2) e became possible only very recently thanks to the improved measurement of the fine-structure constant α em from atomic physics experiments [29]. Remarkably, the reevaluation of ∆a e employing the latest α em value has shown a 2.4σ deviation from the SM prediction, namely ∆a e = (−87 ± 36) × 10 −14 [30], which departs considerably from the "naive scaling" expectation and has the opposite sign compared to ∆a µ . Another goal of this paper is to assess to which extent ALPs can explain simultaneously both g − 2 deviations, exploring the potential interplay with LFV observables (see also Ref. [21] for a recent discussion).
The paper is organized as follows. In Sec. 2, we remind the reader of the effective description of ALP interactions in terms of dimension-five operators. In Sec. 3, we discuss the purely leptonic probes of LFV ALP couplings, deriving general expressions for the relevant observables and extracting constraints from available experimental results. In Sec. 4, we extend our discussion to LFV processes involving hadrons. Our findings are then summarized in Sec. 5.

Effective field theory description
At low energies, the most general dimension-5 effective Lagrangian describing ALP interactions with fermions and photons/gluons reads [6] L d≤5 eff = 1 2 (∂ µ a)(∂ µ a) − m a a 2 2 where the invariance under the gauge symmetry SU (3) c × U (1) em has been imposed. In this equation, f ∈ {u, d, } denotes SM fermions in the mass basis, i, j stand for flavor indices and the dual field strengths are defined as X µν = 1 2 ε µναβ X αβ , with ε 0123 = +1. The effective couplings to photons and gluons are denoted by c γγ and c gg , while v f ij and a f ij stand for the vector and axial-vector ALP couplings to SM fermions, and Λ for the EFT cutoff. 2 The matrices v f and a f are taken to be hermitian and real to avoid CP violating effects. By using the equations of motion, the last term in Eq. (1) can be recast as The anomaly equation for the axial-vector current divergence also entails a modification of c γγ and c gg couplings, see e.g. discussion in Ref. [10]. In the following, we will denote the effective couplings accounting for the full one-loop contributions as c γγ ≡ c eff γγ and c gg ≡ c eff gg . From Eq. (2), we see that v f ij contributes to flavor-violating observables only, as expected from the vector-current conservation, while a f ij enters both flavor-conserving and violating observables. For future convenience, we define the effective couplings s ij which appears in the expressions of most LFV processes, as shown below. Before studying the phenomenological implications of the effective Lagrangian defined above, we discuss the theoretical bounds on the ALP couplings arising from perturbative unitarity [31]. Indeed, for sufficiently large values of the energy √ s, the EFT description is expected to break down. We estimate these constraints by computing the partial wave unitarity bounds on γγ → γγ, gg → gg andf f →f f amplitudes mediated by a pseudoscalar boson in the limit of high-energies ( √ s m a ). By requiring partial waves of total angular momentum J = 0 to satisfy |Re a 0 | < 1/2, we obtain the following conditions As an example, if Λ = 1 TeV and c gg (c γγ ) = 1, from Eq. 4 we learn that our EFT remains unitary up to energies √ s 10(40) TeV. On the other hand, bounds on the Yukawa couplings v f ij , and a f ij do not depend on √ s. For instance, for Λ = 1 TeV, Eq. 5 yields . We now proceed to examine several LFV observables sensitive to the couplings defined above, starting with purely leptonic processes.

Purely leptonic processes
The most promising decay channels in this category are the radiative decays j → i γ, and the three-body decays j → i k k and j → i γγ (with i < j). In the following, we provide the general expressions for each of these processes and discuss their potential in constraining ALP couplings.

j → i γ
The amplitude for the process j → i γ can be generically parameterized as where p and q denote the momentum of the heavy lepton and the photon, respectively. Since the photon is on-shell in this process, gauge invariance implies that the most general Dirac structure is given by where F ij 2 (0) and G ij 2 (0) are dimensionless form-factors. The above expression allows us to write the decay rate as where m j m i has been used. The leading contributions to the dipole form-factors F ij 2 (0) and G ij 2 (0) come from the diagrams illustrated in Fig. 1, which depend either linearly (left diagram) or quadratically (right diagram) on the lepton Yukawas, cf. Eq. (2).
Linear diagrams, which mimic the so-called Barr-Zee contributions [22,32], are typically dominant due to the logarithmic dependence on the ultraviolet cut-off. Their contribution to the dipole form factors reads where x j = m 2 a /m 2 j − iη (with η → 0 + ) and g γ (x) ≈ 2 log Λ 2 /m 2 a . The complete loop function is reported in Apendix A.1. By contrast, the quadratic contributions are finite, once self-energies are included, and depend on the flavor of the lepton k running in the loop. The contributions involving only one LFV coupling (i.e. with k = i or k = j) read 3 In the µ → eγ case, there is an additional contribution from the τ -loop exchange which is induced by a double LFV source. We find which show a m τ /m µ enhancement compared to contributions involving a single LFV coupling. Similarly, τ → µγ receives contributions from electron loops, which read while muon loops contribute to τ → eγ as follows, with the loop functions g i (x) also collected in Appendix A. We find full agreement with the results reported in Ref. [21]. In summary, the most general contributions to j → i γ with a single LFV coupling are given by the sum of the linear and quadratic contributions, see Eq. (9)- (12). Moreover, one should include the additional effects of Eq. (13) and (14) in the µ → eγ case, Eq. (15) and (16) for τ → µγ, and finally Eq. (17) and (18) for τ → eγ.

j → i k k
The processes j → i k k are described by the diagrams shown in Fig. (2). The ALP can contribute both at tree-level (right panel) or at one loop, via the effective j → i γ * vertex (left panel). Depending on the ALP mass, two different regimes arise: (i) for m a > m j −m i or m a < 2 m k the ALP is never produced on-shell, in such a way that there is a competition between tree and loop-level contributions, while (ii) for 2 m k < m a < m j − m i the ALP can be produced on-shell, making the tree-level exchange dominant. In the following we provide the relevant expressions in both cases and discuss the phenomenological implications. We start by parameterizing the general amplitude for the emission of an off-shell photon. In this case, Eq. (6) should be replaced by where the F ij 1,2 (q 2 ) and G ij 1,2 (q 2 ) are form-factors depend on q 2 and on the masses of the particles running in the loops depicted in Fig. 1. The general expression for these functions, which are reported in Appendix A.3, have been computed by independently using the packages Feyncalc [33] and Package-X [34]. We verified that these expressions coincide with the results given in Sec. 3.1 in the limit q 2 → 0. In particular, F ij 1 (0) = G ij 1 (0) = 0, as expected by gauge invariance.
Even though the form-factors reported in Appendix A.3 provide the most general description of the transition j → i γ * , it is convenient to derive simplified expressions which are valid for off-shell ALPs, i.e. for m a > m j − m i , and which are more convenient for phenomenological analyses. In this case, F ij 1 and G ij 1 are well-approximated by a series around q 2 = 0, whereΦ ≡ dΦ/dq 2 . Similarly, the dipole form-factors F ij 2 (q 2 ) and G ij 2 (q 2 ) are well described at leading order by setting q 2 = 0. The complete expressions for F ij The one-loop contributions computed above can now be combined with the tree-level one (see Fig. 2) to provide the general expression for Γ( j → i k k ). For compactness, form factors are expanded around q 2 = 0, as described above.
Off-shell decay rates We first consider the off-shell scenario with m a > m j − m i . The j → i k k decay rate in this regime can be decomposed in three pieces, namely (i) the photonic contribution, (ii) the ALP-mediated tree-level exchange and (iii) their interference, namely We compute each of these contributions by keeping the leading q 2 -dependence in the oneloop form factors. For the photonic contribution, we find where we have used m j m i , finding agreement with the standard expressions available in the literature [35]. For the tree-level term, we obtain the following expression by neglecting the ALP width, where x j = m 2 a /m 2 j , as before, and the phase-space function ϕ ik 0 (x) is reported in Appendix A.2. In the limit of large ALP masses, this function satisfies ϕ ik 0 (x) ∝ 1/x 2 + O(1/x 3 ), in agreement with the decoupling limit. Similarly, we computed the interference of both contributions, which is given by where the phase-space functions satisfy ϕ 1,2 (x) = 1/x + O(1/x 2 ), with the complete functions collected in Appendix A.2. Note that the interference between tree and loop-level contributions vanishes identically for k = i. The phenomenological implication of these expressions are discussed in Sec. 3.5.
In the off-shell scenario with m a < 2 m k , Eq. (23) remains the same, while Eq. (24) and (25) should be reevaluated with the appropriate phase-space integration. In this case, it is more difficult to provide a compact analytical expression as there are more mass scales involved. In the phenomenological analysis, we integrate the form factors reported in Appendix A.3 numerically. We have also checked that the form factor expansion around q 2 = 0 remains a reasonable approximation.
On-shell decay rates The above expressions can be simplified in the case where the ALP is produced on-shell, i.e. for 2m k < m a < m j − m i . In this case, the interference term in Eq. (25) becomes negligible, while the photonic contribution remains identical to Eq. (23). On the other hand, the tree-level ALP exchange can be described in the narrow-width approximation, with where we have used that m j m i , and where τ a denotes the ALP lifetime.
To assess the limits on ALP couplings in the on-shell regime, one should estimate the ALP flight distance and verify that it decays inside the detector, as shall be discussed next. The ALP boosted decay length in the lab frame is given by where p a denotes the ALP momentum in the lab frame. By assuming that j decays at rest, as in the case of µ → e experiments such as MEG [36] or Mu3e [17], |p a | reads . In our numerical analysis, we will naively impose the relaxed bound that a is not larger than ≈ 1 m in order for the ALP decay to be considered prompt. A more refined analysis can be performed in experimental searches, by using the displaced vertex as a tool to set even more stringent limits than searches based on prompt decays [20]. 4 Another limitation of our analysis is the reinterpretation of τ → e and τ → µ limits. Indeed, since τ 's are not produced at rest in current experiments, Eq. (30) does not apply in this case. The correct assessment of τ LFV limits in the resonant region would require a dedicated experimental study, as already suggested in Ref. [20] 3.3 j → i γγ The next purely leptonic decay mode we discuss is j → i γγ. We focus on the tree-level contribution illustrated in Fig. 3, which is the dominant one for light ALPs. Once again, we separate the off-shell region, m a > m j − m i , from the on-shell one, m a < m j − m i .
Off-shell decay rates The general expression for the branching ratio of j → i γγ in the off-shell regime (m a > m j ) is given by where x = m 2 a /m 2 j and the loop-function ϕ(x) reads On-shell decay rates In the on-shell regime, the branching ratio of j → i γγ can be obtained exploiting the narrow-width approximation, leading to the following result where and the leptonic decay rate is given in Eq. (27). As discussed in Sec. 3.5, given the present experimental constraints, the process j → i γγ turns out to be effective in limiting the ALP parameter space only in the on-shell regime.

µ − → e − conversion in nuclei
We now discuss µ − → e − conversion in nuclei, which will become one of the most sensitive LFV probes in the coming years (cf. Ref. [39] for an overview of experimental prospects). We consider this observable to be a purely leptonic probe, since ALP couplings to quarks and gluons do not induce coherent contributions to the conversion rate, being therefore negligible [35]. The only relevant contributions are then the ones stemming from the (offshell) photon penguins computed in Sec. 3.2.
The conversion branching fraction is defined as the ratio of the µ → e conversion rate over the nuclear capture one. Following Ref. [35], we have where Z eff is the effective atomic charge, F p parameterizes the nuclear matrix element and Γ capt stands for the total muon capture rate. The factor ξ 2 accounts for the photonexchange contributions as follows where (20) and Appendix A.3. For m a > m µ , evaluating the form-factors at q 2 = 0 is a very good approximation. In particular, considering only one flavor-violating coupling and neglecting the electron mass, this expression simplifies to as in the LFV quantities discussed above. In the specific case of gold ( 197 79 Au) and aluminum ( 27 13 Al) atoms, the necessary inputs are given respectively by [40], which can be replaced in Eq. (35) to give Currently, the most stringent limit is B(µ − Au → e − Au) < 7 × 10 −13 , obtained by the Sindrum-II collaboration [41]. In the near future, the Mu2e experiment at Fermilab [17] and Comet [19] at J-PARC aim to improve the experimental sensitivity to O(10 −17 ) by using aluminum atoms, cf. Table 1.

Numerical results and discussion
In this section we discuss the phenomenological implications of the results obtained above. Our main goal is to explore the complementarity of the different leptonic probes and their experimental prospects at current/future experimental facilities.
To derive constraints from existing data, we focus on a benchmark scenario defined by lepton-flavor universal couplings to leptons (a ≡ a ii ), with the other flavor-conserving couplings vanishing at tree-level. In this case, the effective aγγ coupling is generated by the (irreducible) one-loop contributions from leptonic couplings [10], where τ i = 4m 2 i /m 2 a − i η and the loop-function is given by which behaves as B 1 (τ ) ≈ 1/ √ τ in the τ → ∞ asymptotic limit, in such a way that only fermions lighter than m a contribute significantly to Eq. (41). To derive the relevant constraints to this scenario, it is crucial to determine the ALP total width, Γ a , and its flight distance in a given experiment [20], as discussed above. The constraints on |a µe |, |a τ e | and |a τ µ | are shown in Fig. 4, as a function of the ALP mass m a , by setting the flavor-conserving coupling to a = 1. Several comments are in order: • Out of the whole set of µ → e processes, B(µ → 3e) and B(µ → eγγ) impose the most stringent bounds for 20 MeV < m a 100 MeV due to the resonant-enhancement of the branching fractions. The upper bound corresponds to the kinematical threshold of on-shell ALP production, i.e. m a < m µ − m e , while the lower bound m a 20 MeV comes from the requirement that the ALP decays inside the detector, i.e. a 1 m, cf. Eq. (29).
• For m a 20 MeV, the ALP decays outside of the detector, in such a way that µ → ea is indistinguishable from µ → e + inv., from which we obtain the most stringent constraint in this mass range. In principle, these limits could be improved if dedicated experimental searches for displaced vertices are performed [20].
• For m a > m µ , the process B(µ → 3e) is dominated by the loop exchange of ALPs. Indeed, while tree-level effects decouple with inverse powers of m a , loop-induced contributions exhibit a smoother logarithmic dependence. For this reason, µ → eγ is currently the most constraining process for these masses. Currently, µ → e conversion in nuclei is less sensitive to ALP couplings, since . This is expected to change in the future thanks to the Mu2E [17] and COMET [19] experiments, which will improve the present sensitivity by orders of magnitude, cf. Table 1.
• Bounds on τ → µ and τ → e transitions share similar features to the ones already outlined above for the µ → e transition. An important difference stems from the interplay between constraints from ALP on-shell production and the requirement that the ALP decays inside the detector. In particular, the three body decays τ → 3µ and τ → eµµ set the most stringent bounds in the ranges 2 m µ < m a < m τ − m µ and Decay mode Exp. limit Future prospects Ref.   Table 1. We consider a benchmark scenario where a ii = 1 are the only flavor-conserving tree-level couplings, while c γγ is induced at one-loop level, cf. Eq. (41). The other couplings are neglected in our analysis. 2 m µ < m a < m τ − m e , respectively, which correspond precisely to the kinematical thresholds for on-shell ALP production. On the other hand, τ → µee and τ → 3e are the most sensitive processes in the region 7 MeV m a < 2m µ , where the lower limit stems from the ALP lifetime constraint. Note, in particular, that our flavor ansatz is such that the relation holds exactly in the region where these observables are resonantly enhanced.
Let us now analyze the interplay between the different contributions to j → i k k computed in Sec. 3.2 and the correlations among LFV processes. From the above discussion, it is clear that j → i k k is the dominant decay mode if the ALP can be produced on-shell, i.e. for 2 m i < m a < m j −m i . On the other hand, this comparison is less evident below and above the resonant mass interval, as the interplay of the tree and loop-level contributions to j → 3 i becomes non-trivial. To better illustrate this feature, in Fig. 5 we plot B( j → 3 i )/B( j → i γ) as a function of m a for µ → e (left panel) and τ → µ (right panel) transitions. In this plot, the ALP couplings are set to c γγ = 1/(16π 2 ) and a jj = 1, with a ii /a jj fixed to a few representative values. The LFV coupling a ji is not specified as it cancels in the ratios we are interested in, while the other couplings are neglected.
From Fig. 5 we see that for m a > m j the tree-level contribution to j → 3 i decouples as 1/m 4 a , becoming subdominant compared to the loop-induced photon-penguin contributions, which have a milder logarithmic dependence on m a . We find that tree-level contributions to B(µ → 3 e) can be neglected for m a 1 GeV, while masses as large as m a ≈ 100 GeV are needed for τ → 3 µ (see also Fig. 4). In both cases, for sufficiently large m a values, B( → 3 i )/B( j → i γ) approaches the prediction of dipole form-factor dominance, which yields ≈ 6 × 10 −3 , ≈ 10 −2 and ≈ 2 × 10 −3 for the µ → e, τ → e and τ → µ transitions, respectively. In principle, this relation can be broken by contributions to j → 3 i stemming from the anapole form-factors F 1 (q 2 ) and G 1 (q 2 ). Nonetheless, in most cases these contributions are sub-dominant with respect to the tree-level ones or the ones originating from dipoles. This can be understood from the ultraviolet logarithmic enhancement of the Barr-Zee dipole form-factor in Eq. (9) and (10), and/or the kinematical logarithmic enhancement of Eq. (23). Lastly, we also comment on the case where m a < 2 m i . For these values of m a , the tree-level contribution dominates for |a ii |/|a ij | 1, as depicted in Fig. 5. Loop-induced contributions can become dominant only for large values of a ii compared to a jj and c γγ . Another peculiar signature of ALP-induced LFV arises in the correlation between µ + N → e + N and µ → eγ. If the effects from dipole form-factors are dominant, as predicted in many NP scenarios, the following prediction holds To investigate possible departures from this relation, we plot the same ratio in Fig. 6 for universal ALP couplings a ii ≡ a and various values of c γγ . For large values of c γγ (red line), we find that the dipole effects driven by the Barr-Zee contributions dominate for any values of m a , reproducing the results of Eq. (44). For c γγ = ±1/16π 2 , the above relation is broken by the anapole contributions for small values of m a . Instead, for large m a , the Barr-Zee effects dominate due to their logarithmic sensitivity to m a , cf. Eq. (9), and Eq. (44) holds. Finally, for c γγ = 0, anapole and dipole effects have a comparable size for any value of m a yielding a significant departure from Eq. (44). Before moving into the discussion of processes involving hadrons, we turn our attention to the anomalous magnetic moment of leptons (g − 2) which are tightly related to LFV processes, as we are going to see in the following.
3.6 On the (g − 2) e and (g − 2) µ anomalies The anomalous magnetic moment of leptons, a = (g − 2) /2, provides one of the most accurate tests of the SM validity. The longstanding discrepancy between the experimental value and the SM prediction, ∆a µ = a exp µ − a SM µ = (27.1 ± 7.3) × 10 −10 [23][24][25], at the level of ≈ 3.6 σ, received increased attention recently due to the anticipated new experimental results by the Muon g − 2 Collaboration at Fermilab [26]. Furthermore, the recent measurement of the fine-structure constant α em in atomic physics experiments [29] allow us to concretely use for the first time the electron g − 2 as a NP probe [27]. Surprisingly, the reevaluation of ∆a e employing the latest value of α em shows a mild discrepancy ∆a e = (−87 ± 36) × 10 −14 [30], at the level of 2.4σ. This value not only has a different sign than ∆a µ , it also shows a departure from the expectations derived by considering the "naive scaling", i.e. the assumption that ∆a scales as m 2 , which is valid for a large class of NP models [27].
Light ALPs are promising candidates to accommodate both g − 2 anomalies since they can contribute in different ways to (g − 2) , with = e, µ, τ , breaking the "naive scaling" expectations. The leading ALP contributions are given by the one-loop diagrams shown in Fig. 1 (with i = j), as well as by the two-loop light-by-light and vacuum polarization diagrams, which are entirely induced by the aγγ coupling [22]. Since we are mainly interested in the connection between a and LFV observables, hereafter we focus only on the contributions arising from Yukawa interactions, cf. Fig. 1. The expression for a i = F ii 2 (0) can then be written as where the two contributions derive from lepton flavor-conserving (LFC) and LFV couplings, respectively. Indeed, although a i are flavor-conserving observables, in presence of LFV couplings they can receive contributions from loops involving a lepton of different flavor, cf. Fig. 1 with i = j = k. The LFC expressions are found to be universal and read while the LFV contributions are different for each lepton flavor, where x i = m 2 a /m 2 i , as before, and the loop functions h i (x) and g i (x) are reported in Appendix A. Note, in particular, that (∆a i ) LFV receives a chiral enhancement of order m τ /m i for i = e, µ, hereby violating the "naive scaling". We find that the leading UVsensitive term in Eq. (46) agrees with [22], while the sub-leading finite terms proportional to h 1,2 (x i ) agree with Refs. [47] and [21], respectively. Moreover, the LFV contributions depicted in Eq. (47) agree with the recent results from Ref. [21].
The inspection of Eqs. (45) and (47) leads to the following remarks: • The functions h 1 (x), h 2 (x) and g 3 (x) are identically positive, while the sign of h 3 (x) depends on the value of the ALP mass [21]. Therefore, LFC contributions from Yukawa interactions (cf. right panel of Fig. 1) cannot account for the (g−2) µ anomaly due to the wrong sign, as noted before in Ref. [21,22,27,47].
• Flavor conserving contributions of Barr-Zee type (cf. left panel of Fig. 1) can accommodate both (g − 2) e and (g − 2) µ anomalies provided |c γγ | 10 × |a µµ | and |a ee | 10 × |a µµ |, with c γγ a µµ < 0 and a ee a µµ < 0. This is illustrated in Fig. 7 for two benchmark values for m a , namely m a = 1 GeV and 10 GeV, and for fixed values of a ee /Λ = 10 TeV −1 (left panel) and 50 TeV −1 (right panel). Although phenomenologically viable, this scenario requires large couplings to electrons and photons, which might be challenging to obtain in a ultraviolet complete scenario. In this plot, the different shape of the (g − 2) e,µ constraints can be traced back to the interplay of Barr-Zee and pure Yukawa effects, cf. Eq. (45). For a sufficiently large (small) a µµ (c γγ ) coupling, the Yukawa contribution -which has the wrong sign to explain the (g − 2) µ anomaly -tends to dominate over the Barr-Zee one, setting a lower bound on c γγ . This lower bound is relaxed for increasing values of m a as the Yukawa effects decouple faster than the Barr-Zee ones.
• LFV contributions can be especially relevant for (g −2) e given the large m τ /m e chiral enhancement in Eq. 47. Therefore, a natural explanation of these anomalies can be obtained invoking flavor conserving contributions of Barr-Zee type for (g − 2) µ and LFV effects for (g − 2) e . This solution is viable as long as |a µτ /a eτ | 10 −4 and |a τ τ /a eτ | 10 2 , in order to avoid the experimental bounds from µ → eγ and τ → eγ, respectively. 5 This is illustrated in Fig. 8 for m a = 1 GeV (left panel) and m a = 10 GeV (right panel), for a fixed values c γγ /Λ = 1 TeV −1 . In this scenario, the (g − 2) µ anomaly is explained by flavor-conserving couplings (the dominant effects stemming from Barr-Zee diagrams), while (g − 2) e is explained via chirally-enhanced LFV contributions induced by a τ e . The most important feature in these plots is the different impact of LFV processes depending on the ALP mass. In particular, for m a > m τ − m µ , the most stringent bounds arise from τ → eγ, which leaves the possibility of explaining the (g − 2) anomalies (right panel). By contrast, for 2m µ < m a m τ − m µ (left plot) tree-level effects to B(τ → eµµ) are so large that the existing limits precludes the possibility of explaining these anomalies.
• For completeness, we also comment on (g − 2) τ . In this case, there is no parametric enhancement for LFV effects, since m τ is the largest fermionic mass, therefore the dominant effects stem from LFC couplings. For a τ τ , v τ τ ∼ O(1), we expect that |∆a τ | 10 −5 , still far from the poor sensitivity of current experiments [48].

Hadronic processes
LFV decays of mesons also provide a powerful probe of ALP interactions. On the one hand, they are highly complementary to the purely leptonic processes discussed in Sec. 3 as they are sensitive to different combinations of ALP couplings. On the other hand, we expect significant experimental improvements in the coming years thanks to the effort of the NA62 [14], LHCb [15,49] and Belle-II [16,50] collaborations. There are three types of hadronic processes that can be studied experimentally: (i) two body-decays P → i j , (ii) semileptonic decays P → P i j and (iii) P → V i j , where P ( ) and V denote generic pseudoscalar and vector mesons, respectively. Concerning the two body-decays, we focus on pseudoscalar mesons instead of vector ones, since parity conservation implies that 0|G µν G µν |V = 0, while the relation 0|q ( ) γ 5 q|V = 0 can be derived from the Ward identities. We will derive now the most general expressions for the branching ratios of these processes and discuss their potential to probe ALP couplings. Our notation is such that leptonic charged conjugated final states are added i j ≡ ±

General expressions
We start by considering the simplest LFV hadronic processes, namely the decays P → i j . These processes can be induced via ALP couplings to quarks and/or gluons, as illustrated in Fig. 9. While the quark contribution is always present, the gluonic one can only contribute to decays of unflavored mesons, such as P = π, η, η . Assuming that m j m i , the branching fraction for P → i j is given by where N P ≡ Λ 0|L D≤5 eff |P is a function of the ALP couplings to quarks and gluons, and of the relevant hadronic parameters, which can be expressed as follows where the summation over quark-flavor indices is implicit and the hadronic constant a P is defined as In Eq. (49), the matrix element 0|q i γ 5 q j |P depends on the relevant meson decay constants which should be determined together with a P by non-perturbative means, as will be discussed below. If kinematically allowed, the same couplings generating P → i j also generate the inverse process j → P i . The corresponding branching fraction is which depends on the parameter N P defined above.

Hadronic inputs: N P
The next step is to derive the expression for N P , defined in Eq. (49), for each pseudoscalar meson. This quantity depends on the pseudoscalar density, 0|q i γ 5 q j |P , as well as on the anomaly matrix element, parameterized by a P in Eq. (50). In this case, the only non-vanishing contribution comes from the second term in Eq. (49). By expressing the heavy-light meson as P = Qq, the axial matrix element reads from which one can show that where f P is the P -meson decay constant. By replacing this expression in Eq. (49), we obtain In these expressions, the only needed inputs are f P , which have been determined in all cases by means of numerical simulations of QCD on the lattice, cf. Table 2. Similarly, in the kaon system, we define |K L(S) = (|K 0 ± |K 0 )/ √ 2 and write showing that K L(S) leptonic decays can probe either the real or imaginary part of a d 12 .  Pseudoscalar quarkonia states For the heavy quarkonia states η c and η b , we find in a similar way that where f ηc and f η b are also listed in Table 2. The anomaly contribution is also present in this case, but it is sub-dominant since these particles are much heavier than Λ QCD .
Light unflavored mesons Finally, we discuss the more subtle case of π 0 and η ( ) mesons. For pions, one can use the exact isospin limit to derive the pseudoscalar density, while the anomaly contribution can be obtained by taking the divergence of the axial current [54,55]. We find that where z = m u /m d , in such a way that the anomaly contribution vanishes in the isospin conserving limit, m u = m d . For η ( ) , the anomaly contribution plays an even more important role. By denoting q = u, d and taking m q = (m u + m d )/2, the pseudoscalar densities can be parameterized as where h q η ( ) and h s η ( ) are decay constants. 6 These definitions allow us to write The best available computation of a η ( ) , h q η ( ) and h s η ( ) relies on the so-called Feldmann-Kroll-Stech (FKS) mixing scheme [56]. This phenomenological approach is based on the assumption that the states |η q = (|uū + |dd )/ √ 2 and |η s = |ss only mix through the anomaly. In this case, by using inputs such as the mixing angle between η and η , the authors of Ref. [56,57] obtained the inputs collected in Table 3.

General expressions
The next processes we consider are semileptonic decays of the type P → P i j . These processes can be induced via the ALP couplings to quarks, cf. Fig. 9 with q = q , and are complementary to the processes P → i j described above due to the parity symmetry. More specifically, since these decays arise from the underlying transition q k → q l − i + j (with i = j and k = l), the relevant hadronic matrix elements read where q 2 = (p − k) 2 and f 0 (q 2 ) ≡ f P →P 0 (q 2 ) denotes the P → P scalar form factor. Since the pseudoscalar matrix element vanishes, these decays can only constrain the vector ALP couplings v q kl . The general branching fraction is then given by [58] dB where we have assumed once again m j m i , and the phase-space function is given by λ P ≡ λ(m P , m P , q 2 ). This expression can be directly applied to the decays D → π, D s → K and B → K, among others. The only subtle case regards kaon decays, for which the above expression should be amended by replacing |v q kl | 2 by In other words, the different neutral kaon decays can probe either the real or imaginary parts of the Wilson coefficients, depending on their CP properties. The largest contribution to B(P → P i j ) arises when the ALP can be produced onshell, i.e. for m a ∈ (m i + m j , m P − m P ). In this case, the above expression can be simplified by means of the narrow-width approximation where and the leptonic branching fraction is given in Eq. (28). 7

Hadronic inputs
The last theoretical input needed in the above expression is the form factor f 0 . The latest LQCD results are summarized in Ref. [59]. For the B → π and B → K transitions, for which LQCD form factors are not available in the full q 2 -range, we use results from light-cone sum rules [60].

General expressions
We now turn to P → V semileptonic decays, with V being a generic vector meson, which turn out to be complementary to the observables described above. In this case, the relevant matrix elements are where q 2 = (p − k) 2 , ε µ denotes the V -meson polarization, and A 0 ≡ A P →V 0 stands for the pseudoscalar P → V form factor. Since the scalar matrix element vanishes in this case, these decays can only probe the axial coupling a q kl , differently from the P → P processes described above, that are only sensitive to v q kl . The general expression for their branching fraction can be recast from Ref. [58], giving where λ V (q 2 ) = λ(m P , m V , q 2 ). 7 Note that our notation is such that For on-shell ALP production, i.e. m a ∈ (m i + m j , m P − m V ), the above formula can be simplified as where the ALP branching fraction is given in Eq. (28) and

Hadronic inputs
The most relevant P → V transition for our study is B → K * . Since the nedeed form factor is not yet available from LQCD simulations in the full q 2 range, we consider the combination of LQCD and light-cone sum rules results from Ref. [61].

Numerical results and discussion
In this section we derive the limits from existing data and compare the sensitivity of the different decay channels listed above. The present experimental constraints are collected in Table 4 along with the future prospects when available. Let us first consider the ALP couplings to gluons, c agg , which can trigger LFV processes involving light unflavored mesons, namely π 0 , η and η . Starting with the µ → e transition, the only kinematically-allowed processes are π 0 → µe and η ( ) → µe. By combining Eq. (48) with the hadronic inputs in Eq. (61) and (64), and neglecting the ALP couplings to quarks, we find that B(P → µe) = c µe P |s 12 | 2 with {c µe π 0 , c µe η , c µe η } {2.6, 0.02, 0.2} × 10 −13 .
From these expressions, given the current experimental sensitivity depicted in Table 4, as well as the existing limits on s 12 derived from leptonic observables in Sec. 3, we conclude that such processes are not promising probes of ALPs. The only exception are the very narrow regions around m a ≈ m P , where a resonant contribution is produced. Similar conclusions can be obtained by considering, instead of c gg , the axial ALP couplings to quarks with appropriate flavor indices. The situation is much more promising for the τ → e and τ → µ transitions. In this case, the available processes are τ → lπ 0 and τ → lη ( ) , with l = e, µ. By only keeping the contributions driven by c agg and by focusing on the decays into muons, we find that which are considerably larger than the values found in Eq. (77). These decay modes are particularly interesting given the expected experimental resolutions at Belle-II, which are going to improve the present limits by at least one order of magnitude [16], cf. Table 4.
To understand why the processes τ → P are more sensitive to new physics than P → µe, one should compare the total lifetime of the decaying particles. More precisely, one finds that τ τ τ π ≈ 3.4 × 10 3 , τ τ τ η ≈ 5.8 × 10 5 , and τ τ τ η ≈ 8.6 × 10 7 which can be understood from the fact that τ 's can only decay through weak interactions, differently than π 0 and η ( ) . For that reason, τ decays are much better probes of new physics than light-meson decays.
To further explore the potential of τ decays to constrain ALPs, in Fig. 10 we plot B(τ → µP ), normalized to |s 23 | 2 , as a function of m a in two scenarios: (i) c agg = 1/(16π 2 ) and (ii) a d ii = a u ii = 1, with the other couplings taken to be zero. From these plots, we see that the largest branching fractions are obtained at the resonances, i.e. m a ≈ m P . However, large branching fractions can also be attained for other masses. Notice, also, that these decays modes have a complementary sensitivity to c agg , a d ii and a u ii , as it can be seen by comparing the left and right panels of Fig. 10.
Next, we discuss the potential of (semi)leptonic kaon and B-meson decays to probe ALP couplings. These decay modes are very sensitive to NP contributions due to their quark-flavor changing nature. Furthermore, there is a rich experimental program at NA62,  LHCb and Belle-II experiments that will improve the experimental sensitivity on many observables. Unlike the processes discussed above, (semi)leptonic decays can only probe the flavor violating ALP couplings to quarks, a q ij and v q ij , with i = j. Leptonic decays are sensitive to axial couplings, while semileptonic decays can probe either the axial or vector ones, depending on the spin of the meson in the final state. For this reason, these processes provide complementary information on the NP couplings. To quantitatively compare the different decay modes, we assume that a d ij = v d ij = 10 −3 and plot the branching fractions normalized by the LFV coupling as a function of m a . This is shown in Fig. 11 and 12 for kaon and B (s) -meson observables, respectively. We find that the semileptonic rates are always the largest ones in the resonant regions, while the leptonic ones are typically dominant for large and/or small values of m a , outside the resonant region.
To compare the sensitivity of kaon and B-meson decays channels, we consider a benchmark model that allows us to connect different quark transitions. We assume that the matrices a d and v d satisfy where V denotes the CKM matrix and c d is a constant. Such relation is predicted, for instance, from electroweak loops in models with predominant ALP couplings to the Wboson and/or the top-quark [13]. The constraints on s ij derived from kaon and B-meson decays are shown in Fig. 13 for a fixed value of c d . To derive these constraints, one should also specify the total ALP width, which greatly affects the constraints derived in the resonant regions. In Fig. 13, we assume for illustration a constant value Γ a = 10 −6 GeV. Smaller (larger) values would imply stronger (weaker) constraints in these regions, without affecting the off-shell ones. We stress that our expressions and experimental/theoretical inputs are given in full generality, so that the correct (resonant) bound could be easily assessed to specific flavor models. For the illustrative case we consider, as shown in Fig. 13, we see that the most stringent constraints indeed come from semileptonic decays such as K → πµe, B → K ( * ) µe and B → K ( * ) µτ , in the resonant regions. On the other hand, for small masses the most stringent constraints come from the decay K L → µe. Prospects for existing/future experiments are also shown in Fig. 13 when available, cf. Table 2.

Decay mode
Exp. limit Future prospects Ref.
has been considered, while the ALP width is fixed to an illustrative value Γ a = 10 −6 GeV. Larger (smaller) values of Γ a would decrease (increase) the excluded regions in the on-shell regions for semileptonic decays. Constraints on |s 31 | are not depicted, since they turn out to be very similar to |s 32 |.

Conclusions
In this work, we have explored the signatures of axion-like particles (ALPs) in lepton flavor violating (LFV) observables at low energies. By using the most general dimension-5 effective Lagrangian, which accounts for the ALP couplings to SM fermions and gauge bosons, we have derived complete expressions for the most relevant LFV decays of leptons and hadrons. These general formulae can be applied for any ALP mass, as well as for any choice of ALP couplings, thus generalising and complementing previous results available in the literature [20,21].
Purely leptonic observables comprise the decays j → ı γ, j → i k k and j → i γγ as well as µ → e conversion in nuclei. We find that, currently, the most stringent limits concern the µ → e transition, and arise from the decay modes µ → 3e and µ → eγγ for m a < m µ . For m a > m µ , the most stringent constraint arises from µ → eγ, which is going to be superseded in the future by the experimental searches for µ → e conversion in nuclei at COMET and Mu2E experiments [17,19]. Likewise, for the τ → µ and τ → e transitions, the three body decays τ → 3µ and τ → eµµ set the most stringent bounds in the range 2m µ m a m τ , while the radiative decays are the most constraining modes for m a > m τ . We have fully explored the complementarity of the different decay modes by showing, in particular, that tree-level contributions to j → i k k are not entirely negligible above the threshold of on-shell ALP production, and by estimating these contributions along with their interference with loop-level contributions, see Fig. 5. Furthermore, we show that the ALP mass can be inferred from the correlation among the different leptonic processes, as illustrated, for instance, by comparing B(µ → eγ) with B(µ → 3e) and B(µ + N → e + N ) in Figs. 5 and 6, respectively.
Concerning hadronic processes, we have focused on the leptonic decays P → i j and τ → P j , and the semileptonic ones P → P (V ) i j , where P ( ) and V stand for pseudoscalar and vector mesons, respectively. Leptonic decays can be induced via ALP couplings to quarks and/or gluons. We find that τ decays into light unflavored mesons (π 0 , η and η ) are the most sensitive probes of the gluonic coupling, while decays of kaons and B-mesons are particularly sensitive to quark-flavor violating ALP interactions. On other hand, semileptonic decays can only probe the vector or axial ALP couplings, depending on the spin of the meson in the final state. For this reason, these processes are very complementary probes of ALP interactions, as we have made explicit for a benchmark scenario in Fig. 13.
As a by-product of this study, we have also revisited ALP contributions to the anomalous magnetic moment of leptons, and we have reassessed the possibility of simultaneously explaining the observed discrepancies in (g − 2) e and (g − 2) µ via ALP contributions. For flavor conserving contributions, we find that very large ALP couplings to electrons and photons are needed. On the other hand, LFV contributions can be especially relevant for (g − 2) e given the large m τ /m e chiral enhancement at the amplitude level, cf. Eq. 47. This enhancement can be exploited to provide a more natural explanation of these anomalies, by invoking flavor conserving contributions of Barr-Zee type for (g − 2) µ and LFV effects for (g − 2) e , see Fig. 8. The main prediction of this scenario would be large values of B(τ → eγ), within reach of Belle-II.
In summary, ALPs can induce a plethora of low-energy LFV phenomena with specific patterns that we have identified in this paper. The ongoing experimental program at present NA62 [14], LHCb [15] and Belle-II [16] experiments, as well as the future ones Mu2E [17], Mu3E [18] and COMET [19], plan to improve the current experimental sensitivities by orders of magnitude, offering many possibilities to discover ALPs indirectly through their LFV interactions.

A.3 Form factors for j → i γ *
In this Appendix we collect the complete one-loop form factors for j → i γ * in terms of the Passarino Veltman (PaVe) functions, as defined in the Package-X documentation [34]. The photon 4-momentum is denoted as q and we use the following shorthand notation for the ALP coupling to leptons, where m i ≡ m i . The three-point PaVe appearing in our expressions have three possible arguments, which we denote as The results provided below can be evaluated by using, for instance, Package-X [34].
These expressions have been obtained within dimensional regularization, in a scheme where the Levi-Civita symbol is a D-dimensional object [64]. The scheme choice affects the finite terms for the dipole form-factors since they are UV-sensitive, see discussion in Ref. [10].

A.4 Taylor-expanded anapole form factors
For m a > m j , it is a good approximation to Taylor expand the form factors F 1 and G 1 around q 2 = 0, cf. Eq. (21). In this Appendix, we provide the explicit expression for the q 2 -derivative of F 1 (q 2 ) and G 1 (q 2 ) evaluated at q 2 = 0.

A.4.2 Quadratic contributions
Similarly to the discussion in Sec. 3.1 for the dipole operators, the expression for the chirality-conserving form-factors will depend on the mass of the particle running in the loop, as described in the following.
• For j = k > i, we obtain thaṫ • Similarly, for j > k = i, we obtain thaṫ • For µ → eγ, there is an additional contribution from τ -loops (i.e. k > j > i in our notation), which is given bẏ G 1 (0) = + v τ e a τ µ + a τ e v τ µ where h 4 (x) is given by • Lastly, τ → µγ might receive a contribution from electrons running in the loop. In this case,Ḟ with For compactness, we left explicit the dependence on the PaVe function C 0 in the above expressions. This function can be evaluated by using the integral form with ∆(x, y) = (1 − x − y) m 2 0 + x m 2 1 + y m 2 2 − x(1 − x) r 2 10 − y(1 − y) r 2 20 + xy(r 2 10 + r 2 20 − r 2 21 ) , or, alternatively, by using Package-X [34].