1 Introduction

The axion mechanism [1, 2] is currently our best solution to the strong CP puzzle. The non-observation of a neutron electric dipole moment (EDM) constrains the QCD theta term, \(\theta G_{\mu \nu } \tilde{G}^{\mu \nu }\), to be tiny, \(\theta \lesssim 10^{-10}\) [3]. As this coupling receives contributions from two unrelated sectors of the Standard Model (SM), a topological QCD contribution and an electroweak contribution from the quark Yukawa couplings, both a priori of \( \mathcal {O}(1)\), such a tiny value requires an unacceptable fine-tuning. The axion solution [4, 5] relies on the axion a being the Goldstone boson associated with the spontaneous breaking of an anomalous U(1) symmetry, the PQ symmetry [1, 2]. This ensures a coupling of the axion to gluons, \(aG_{\mu \nu }\tilde{G} ^{\mu \nu }\), which develops into a potential for the axion in the low-energy limit. At the minimum of this potential, the axion field absorbs the \(\theta \) term, making it unobservable.

The initial implementation of the axion mechanism relied on the axion emerging at the electroweak scale, and was quickly ruled out as this would imply far too large couplings to matter particles. Invisible axion scenarios were then developed, most notably the DFSZ [6, 7] and KSVZ [8, 9] models, in which the axion is very light and very weakly coupled. In most cases, being in addition very long-lived, the axion emerged as a viable dark matter (DM) candidate (for a review of the axion in a cosmological context, see e.g. Ref. [10]). Interestingly, experimental strategies could then take advantage of the rather high flux of such dark matter axions (see e.g. Refs. [11, 12]). In practice, dark matter axion production mechanisms ensure the axion is rather cold, and being in addition very light, it can be represented by a classical coherent pseudoscalar field, typically \(a(\textbf{r},t)=a_{0}\cos (Et-\textbf{p}\cdot \textbf{r})\), with \(E^{2}=\textbf{p}^{2}+m_{a}^{2}\), \(m_{a}\) the axion (or axion-like particle) mass, and \(a_{0}\) set by the local DM density, \( m_{a}a_{0}=\sqrt{2\rho _{DM}}\) with \(\rho _{DM}\approx 0.4\ \)GeV/cm\(^{3}\) [13].
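To give a feeling for the size of the background field, the relation \(m_{a}a_{0}=\sqrt{2\rho _{DM}}\) can be evaluated in natural units. The following is a minimal sketch, not part of the original analysis: the function names and the benchmark mass are illustrative assumptions, and \(a_{0}\) here is the canonically normalized field amplitude, before any coupling constant is absorbed into it.

```python
import math

# hbar*c in GeV*cm, used to convert cm^-3 into GeV^3 (natural units, hbar = c = 1).
HBARC_GEV_CM = 1.973269804e-14


def dm_amplitude_product(rho_gev_per_cm3: float) -> float:
    """Return m_a * a_0 = sqrt(2 * rho_DM), expressed in GeV^2."""
    rho_gev4 = rho_gev_per_cm3 * HBARC_GEV_CM**3  # GeV/cm^3 -> GeV^4
    return math.sqrt(2.0 * rho_gev4)


def field_amplitude(m_a_ev: float, rho_gev_per_cm3: float = 0.4) -> float:
    """Return a_0 in GeV for an axion mass given in eV (hypothetical helper)."""
    m_a_gev = m_a_ev * 1e-9
    return dm_amplitude_product(rho_gev_per_cm3) / m_a_gev


if __name__ == "__main__":
    print(f"m_a * a_0 = {dm_amplitude_product(0.4):.3e} GeV^2")
    # Benchmark mass of 1e-6 eV chosen purely for illustration.
    print(f"a_0(m_a = 1e-6 eV) = {field_amplitude(1e-6):.3e} GeV")
```

Since only the product \(m_{a}a_{0}\) is fixed by the local density, lighter axions correspond to larger field amplitudes.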

The goal of the present paper is to analyze the couplings to SM fermions of such a dark matter axion background, in the non-relativistic limit. The usual starting point is the axion Lagrangian (to simplify the notation, a coupling constant \(g=m/\Lambda \) with \(\Lambda \) the PQ breaking scale is understood to be absorbed into a throughout this paper)

(1)

Such a derivative interaction to the axion is reminiscent of its Goldstone boson nature. The corresponding Dirac equation is \(i\partial _{t}\left| \psi \right\rangle =\mathcal {H}_{D}\left| \psi \right\rangle \) with

$$\begin{aligned} \mathcal {H}_{D}=\gamma ^{0}\left( \varvec{\gamma }\cdot \textbf{p}+m-\frac{ \gamma ^{0}\gamma _{5}\dot{a}}{m}+\frac{\gamma _{5}\varvec{\gamma }\cdot \varvec{\nabla }a}{m}\right) , \end{aligned}$$
(2)

where \(\dot{a}=\partial _{t}a\). In the Dirac representation, where \( \gamma ^{0} \) is diagonal, \(\gamma ^{5}\) directly couples the fermion and antifermion degrees of freedom of \(\left| \psi \right\rangle \). Consequently, in the non-relativistic limit, the \(\dot{a}\) term receives a dependence on \(\textbf{p}=-i\varvec{\nabla }\):

$$\begin{aligned} \mathcal {H}_{D}^{\textrm{NR}}{} & {} =\gamma ^{0}\left( m+\frac{\textbf{p}^{2}}{2m}+ \frac{i\gamma ^{5}\varvec{\gamma }\cdot \varvec{\nabla }a}{m}\right) \nonumber \\{} & {} \quad +\frac{ \gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}}{2m^{2}}+\mathcal {O} (1/m^{3}). \end{aligned}$$
(3)

These two leading interactions have been extensively studied in the literature [14, 15]. The so-called axion wind term, \(\gamma ^{5}\varvec{\gamma }\cdot \varvec{\nabla }a\), leads to a coupling of the gradient of the axion field to the spin of the fermion. It can be searched for experimentally e.g. using NMR techniques  [16,17,18,19] or magnons [20].

The second term is dubbed the axioelectric effect [21,22,23,24]. It translates as a coupling of \(\dot{a}\) to the combination \(\textbf{p}\cdot \textbf{S}\) of the momentum \(\textbf{p}\) and spin \(\textbf{S}\) of the fermion. As a result, sufficiently energetic axions could kick bound electrons out, in analogy with the photoelectric effect. The Sun could produce a sizeable flux of such axions, whose possible detection via these ionization processes, or more generally electron recoil effects, gave rise to a rather intense experimental activity [25,26,27,28,29,30,31,32]. The corresponding constraints on the axion are reviewed e.g. in Ref. [33], as well as more recently in Refs. [34, 35] in the context of the excess events observed at XENON1T [36]. Note, though, that these experiments also probe different mechanisms and/or the coupling of the axion to photons.

A peculiar feature of Goldstone bosons is that there are different ways to parametrize them. For the axion, an equally valid Lagrangian uses the so-called polar or exponential parametrization:

(4)

The derivative interaction is replaced by an infinite tower of interactions, starting with the pseudoscalar coupling \(a\bar{\psi }i\gamma ^{5}\psi \). The corresponding Hamiltonian is then

$$\begin{aligned} \mathcal {H}_{E}= & {} \gamma ^{0}\left( \varvec{\gamma }\cdot \textbf{p} +m\exp \left( 2i\gamma ^{5}\frac{a}{m}\right) \right) \nonumber \\= & {} \gamma ^{0}\left( \varvec{\gamma }\cdot \textbf{p}+m+2i\gamma ^{5}a\right) +\mathcal {O} (a^{2}), \end{aligned}$$
(5)

and its non-relativistic limit can be worked out to be

$$\begin{aligned} \mathcal {H}_{E}^{\textrm{NR}}= & {} \gamma ^{0}\left( m+\frac{\textbf{p}^{2}}{2m}+ \frac{i\gamma ^{5}\varvec{\gamma }\cdot \varvec{\nabla }a}{m}\right) \nonumber \\{} & {} +\frac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}}{4m^{2}}+\mathcal {O} (1/m^{3}). \end{aligned}$$
(6)

The same axion wind and axioelectric interactions emerge, but the coefficient of the latter differs by a factor of two. Historically, this fact was well known in the context of nucleon-pion interactions. The equivalence of the pseudoscalar and axial interactions was first discussed by Dyson in 1948 [37] (see also Refs. [38, 39]), on the basis of the axion wind term being the same. Later, this ambiguity in the time-dependent term, as well as in some higher order terms in the non-relativistic expansion, attracted a lot of attention [40,41,42,43,44,45]. As we will see, part of the issue was related to the truncation of the exponential parametrization. After all, many of these works date back to before Goldstone’s theorem was formulated, let alone the pion identified as a pseudo-Goldstone boson of chiral symmetry breaking. Nowadays, the equivalence between the derivative and exponential representations is an established fact, but surprisingly, a non-relativistic expansion truly reflecting this has not been worked out yet. This is the purpose of the present paper.

In particular, adopting a modern language, we will see that the \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) coupling can be systematically rotated away for a neutral fermion. The demonstration is actually quite simple and can readily be given. First, remember that a non-relativistic expansion is not unique (see footnote 1). As is customary in quantum mechanics, unitary transformations cannot change the physics. So, performing such a transformation, and provided the block-diagonal nature of the Hamiltonian is maintained, an equally valid non-relativistic expansion is found. Now, as proposed a long time ago in Refs. [40, 42], consider

$$\begin{aligned} \left| \psi \right\rangle \rightarrow \left| \psi ^{\prime }\right\rangle =\exp (iS)\left| \psi \right\rangle ,\ \ S=\frac{\mu }{ 4m^{2}}\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},a\}. \end{aligned}$$
(7)

If \(i\partial _{t}\left| \psi \right\rangle =\mathcal {H}\left| \psi \right\rangle \), then \(i\partial _{t}\left| \psi ^{\prime }\right\rangle =\mathcal {H}^{\prime }\left| \psi ^{\prime }\right\rangle \) with \( \mathcal {H}^{\prime }=\mathcal {H}-\dot{S}\) to \(\mathcal {O}(1/m^{2})\) since \([ \mathcal {H},S]\) starts at \(\mathcal {O}(1/m^{3})\). Thus, acting on \(\mathcal {H }_{D}^{\textrm{NR}}\) with \(\mu =2\), or on \(\mathcal {H}_{E}^{\textrm{NR}}\) with \(\mu =1\), the \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a} \}\) coupling is replaced by \(\mathcal {O}(1/m^{3})\) and higher terms.

At the time, this was interpreted as an ambiguity that should cancel out in physical observables. Here, we will go one step further and argue that \( \gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) does not encode any true physical effects. In other words, for neutral fermions, the operator \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) is totally screened at \(\mathcal {O}(1/m^{2})\) in the non-relativistic expansion. One reason for this interpretation has to do with Schiff’s theorem [47], which states that charged fermion EDMs are screened. The transformation S of Eq. (7) is closely related to Schiff’s transformation, and even becomes Schiff’s transformation for a charged fermion. The consequence in that case is that the covariant \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}\) coupling becomes equivalent to an axion-induced EDM operator, \(a\varvec{ \sigma }\cdot \textbf{E}\), at \(\mathcal {O}(1/m^{2})\). Phenomenologically, these two operators are indistinguishable, ensuring that physics is independent of the choice of parametrization. Yet, this equivalence makes it manifest that unexplored but promising avenues do exist to search for dark matter axions.

The paper is organized as follows. To set the stage, we start in the next section with a brief overview of the construction of non-relativistic expansions via the Foldy–Wouthuysen method [46]. This also gives us the opportunity to introduce Schiff’s theorem [47] and its generalizations. Then, in Sect. 3, we enter the core of the subject, and perform the non-relativistic expansion of the axionic Hamiltonian up to and including \(\mathcal {O}(1/m^{3})\) terms, firstly in the absence of electromagnetic (EM) fields, secondly for a charged fermion minimally coupled to EM fields, and thirdly for a neutral fermion having electric and magnetic dipole interactions with the EM fields. These results are then put to use in Sect. 4 to analyze axion-induced lepton and nucleon EDMs, showing how and when some new effects could be expected. Finally, in Sect. 5, our results are summarized along with their phenomenological consequences.

2 Brief overview of the non-relativistic expansion

The techniques used in the present paper are covered in most textbooks on relativistic quantum mechanics. In particular, recovering the Pauli equation by a non-relativistic expansion of the Dirac Hamiltonian for a spin 1/2 field minimally coupled to EM fields,

$$\begin{aligned} i\partial _{t}\left| \psi \right\rangle =\mathcal {H}_{EM}\left| \psi \right\rangle \ ,\ \ \mathcal {H}_{EM}=\gamma ^{0}(\varvec{\gamma }\cdot \textbf{P}+m)+e\phi \ , \end{aligned}$$
(8)

where \(\textbf{P}=\textbf{p}-e\textbf{A}\), \(\textbf{p}=-i\varvec{\nabla }\), and the EM potential is \(A^{\mu }=(\phi ,\textbf{A})\), is a standard exercise. Though it is well known, we think it is nevertheless useful to briefly review this derivation so as to fix our notation, and because it forms the backbone on which we will add axions later on. Further, once the magnetic moment and electric moment operators \(\sigma _{\mu \nu }F^{\mu \nu }\) and \(\sigma _{\mu \nu }\tilde{F} ^{\mu \nu }\) are added, it allows us to introduce Schiff’s theorem, which will be central to the axion discussion.

2.1 Foldy–Wouthuysen transformation

The Dirac equation involves four-dimensional spinors, and thus includes both particles and antiparticles simultaneously. In the non-relativistic limit though, the energy is not sufficient for pair creation, and the antiparticle degrees of freedom are not dynamical. In practice, the Dirac equation must reduce to a decoupled pair of two-dimensional Pauli equations, describing the dynamics of spin 1/2 particles only. Several procedures exist to perform this reduction, starting historically with Pauli’s elimination method [48]. To set the stage, let us briefly describe the main idea. We first adopt the Dirac representation for the gamma matrices, that is,

$$\begin{aligned} \gamma ^{0}=\left( \begin{array}{cc} \textbf{1} &{} 0 \\ 0 &{} -\textbf{1} \end{array} \right) ,\ \ \varvec{\gamma }=\left( \begin{array}{cc} 0 &{} \varvec{\sigma } \\ -\varvec{\sigma } &{} 0 \end{array} \right) ,\ \ \gamma ^{5}=\left( \begin{array}{cc} 0 &{} \textbf{1} \\ \textbf{1} &{} 0 \end{array} \right) , \end{aligned}$$
(9)

where \(\varvec{\sigma }\) are the usual Pauli matrices. Note also the identities \(\gamma ^{i}\gamma ^{j}=(-\delta ^{ij}-i\varepsilon ^{ijk}\sigma ^{k}) \textbf{1}\) and \(\varvec{\sigma }\otimes \textbf{1}=-\gamma ^{0}\gamma ^{5} \varvec{\gamma }\), as well as the fact that \(\varvec{\gamma } ^{\dagger }=\gamma ^{0}\varvec{\gamma }\gamma ^{0}=-\varvec{\gamma }\), but \(\gamma ^{0\dagger }=\gamma ^{0}\) and \(\gamma ^{5\dagger }=\gamma ^{5}\). The diagonal form of \(\gamma ^{0}\) is instrumental for performing the non-relativistic expansion. Indeed, if the Dirac spinor \(\left| \psi \right\rangle \) is split into a pair of two-component spinors, the Dirac equation Eq. (8) takes the matrix form (after \(\chi \rightarrow -\chi \))

$$\begin{aligned} \left( \begin{array}{cc} m-E+e\phi &{} \varvec{\sigma }\cdot \textbf{P} \\ -\varvec{\sigma }\cdot \textbf{P} &{} m+E-e\phi \end{array} \right) \left( \begin{array}{c} \varphi \\ \chi \end{array} \right) =\left( \begin{array}{c} 0 \\ 0 \end{array} \right) . \end{aligned}$$
(10)
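The matrix identities quoted for the Dirac representation of Eq. (9) are easy to confirm numerically. The sketch below is only a sanity check under our own conventions: the helper names are hypothetical, and the spin matrices are embedded in the four-dimensional space by acting identically on the upper and lower two-component blocks.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Dirac representation, as in Eq. (9).
G0 = np.block([[I2, Z2], [Z2, -I2]])
G5 = np.block([[Z2, I2], [I2, Z2]])
GAM = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]
# Pauli matrices acting identically on both two-component blocks.
SIG4 = [np.block([[s, Z2], [Z2, s]]) for s in SIGMA]


def levi_civita(i: int, j: int, k: int) -> float:
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2


def check_dirac_identities() -> dict:
    """Check the identities quoted below Eq. (9); returns a dict of booleans."""
    product = all(
        np.allclose(GAM[i] @ GAM[j],
                    -float(i == j) * np.eye(4)
                    - 1j * sum(levi_civita(i, j, k) * SIG4[k] for k in range(3)))
        for i in range(3) for j in range(3))
    spin = all(np.allclose(SIG4[i], -G0 @ G5 @ GAM[i]) for i in range(3))
    herm = all(np.allclose(GAM[i].conj().T, G0 @ GAM[i] @ G0)
               and np.allclose(GAM[i].conj().T, -GAM[i]) for i in range(3))
    herm = herm and np.allclose(G0.conj().T, G0) and np.allclose(G5.conj().T, G5)
    return {"product": product, "spin": spin, "hermiticity": herm}


if __name__ == "__main__":
    print(check_dirac_identities())
```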

Factoring out the time evolution due to the rest mass, defining \( E^{\prime }=E-mc^{2}\), and \(\chi \rightarrow -\chi \), this becomes

$$\begin{aligned} \left( \begin{array}{cc} e\phi &{} \varvec{\sigma }\cdot \textbf{P} \\ \varvec{\sigma }\cdot \textbf{P} &{} -2m+e\phi \end{array} \right) \left( \begin{array}{c} \varphi \\ \chi \end{array} \right) =E^{\prime }\left( \begin{array}{c} \varphi \\ \chi \end{array} \right) . \end{aligned}$$
(11)

Thus, because of the 2m term, \(\chi \) is essentially determined by \(\varphi \). It corresponds to a small \(\mathcal {O}(v/c)\) component relative to the large \(\varphi \) component. Plugging \(\chi \approx \varvec{\sigma }\cdot \textbf{P}\varphi /2m\) back into the equation for \(\varphi \) reduces the Dirac equation to a Pauli equation for \(\varphi \),

$$\begin{aligned} i\partial _{t}\varphi =\left[ \frac{(\varvec{\sigma }\cdot \textbf{P})^{2}}{ 2m}+e\phi \right] \varphi =\left[ \frac{(\textbf{p}-e\textbf{A})^{2}}{2m}- \frac{e}{2m}\varvec{\sigma }\cdot \textbf{B}+e\phi \right] \varphi . \end{aligned}$$
(12)

This is the essence of Pauli’s elimination method, which can be generalized to the presence of other interactions and to higher orders. In those cases though, the method becomes very cumbersome because hermiticity of the Hamiltonian is not guaranteed, and additional renormalizations of the \( \varphi \) field are in general required [49].

The Foldy–Wouthuysen (FW) procedure is designed to systematize the block-diagonalization of the Dirac Hamiltonian [46]. Starting from \(i\partial _{t}\left| \psi \right\rangle =\mathcal {H}\left| \psi \right\rangle \), the idea is to construct a unitary rotation \( \psi \rightarrow \psi ^{\prime }=e^{iS}\psi \) such that \(i\partial _{t}\left| \psi ^{\prime }\right\rangle =\mathcal {H}^{\prime }\left| \psi ^{\prime }\right\rangle \) with

$$\begin{aligned} \mathcal {H}^{\prime }=e^{iS}\left( \mathcal {H}-i\partial _{t}\right) e^{-iS}\ , \end{aligned}$$
(13)

with \(\mathcal {H}^{\prime }\) now block-diagonal. This decouples the large two-component spinor from the small one, and should be valid as long as the energy involved does not allow for pair creation. Since we started by performing a unitary transformation, there is no hermiticity issue with \( \mathcal {H}^{\prime }\). However, in the presence of interactions, an exact solution for S cannot be found in general, and one relies instead on a perturbative expansion in \(c^{-1}\). That is, instead of a single unitary transformation S, a sequence of unitary transformations is performed to bring \(\mathcal {H}^{\prime }\) to a block diagonal form, up to some order \( c^{-n}\). For dimensional reasons, an expansion in 1/c is essentially identical to an expansion in 1/m, so we will rather concentrate on the latter and keep \(c=1\).
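The free case is a useful benchmark for Eq. (13), since a closed-form rotation is then available: with \(\tan 2\theta =|\textbf{p}|/m\), the textbook choice \(e^{iS}=\cos \theta +\varvec{\gamma }\cdot \hat{\textbf{p}}\sin \theta \) diagonalizes \(\gamma ^{0}(\varvec{\gamma }\cdot \textbf{p}+m)\) exactly. The following sketch checks this numerically; it treats \(\textbf{p}\) as a fixed c-number vector (a plane-wave state), and the function name is a hypothetical helper.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
G0 = np.block([[I2, Z2], [Z2, -I2]])
GAM = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]


def fw_rotate_free(p, m):
    """Apply the exact free-particle FW rotation for momentum p (nonzero) and mass m.

    Returns (H_prime, U) with U = exp(iS) = cos(theta) + gamma.p_hat sin(theta).
    S is time independent here, so H' = U H U^dagger.
    """
    p = np.asarray(p, dtype=float)
    p_norm = np.linalg.norm(p)
    gp = sum(pi * gi for pi, gi in zip(p, GAM))   # gamma . p
    H = G0 @ (gp + m * np.eye(4))                 # free Dirac Hamiltonian
    theta = 0.5 * np.arctan2(p_norm, m)           # tan(2 theta) = |p| / m
    U = np.cos(theta) * np.eye(4) + np.sin(theta) * gp / p_norm
    return U @ H @ U.conj().T, U


if __name__ == "__main__":
    H_prime, U = fw_rotate_free([0.3, -0.4, 1.2], 2.0)
    # Expect gamma^0 * sqrt(p^2 + m^2): no entries mixing upper and lower blocks.
    print(np.round(H_prime.real, 6))
```

For small \(|\textbf{p}|/m\), the angle reduces to \(\theta \approx |\textbf{p}|/2m\), which corresponds to the first perturbative step \(iS=\mathcal {O}/(2m)\) with \(\mathcal {O}=\varvec{\gamma }\cdot \textbf{p}\).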

Details of this construction are in Appendix A. In summary, one first uses the diagonal \(\gamma ^{0}\) to write the Hamiltonian as

$$\begin{aligned} \mathcal {H}=\gamma ^{0}(m+\mathcal {O})+\mathcal {E}\ ,\ \end{aligned}$$
(14)

where \(\mathcal {O}\) stands for odd terms, \(\mathcal {O}\gamma ^{0}=-\gamma ^{0} \mathcal {O}\), and \(\mathcal {E}\) for even terms, \(\mathcal {E}\gamma ^{0}=\gamma ^{0}\mathcal {E}\). In general, \(\mathcal {O}\) and \(\mathcal {E}\) are differential operators that do not commute. The term \(\mathcal {O}\) is the offending one that couples small and large components. So, in the first step, we must remove it by some unitary transformation S. Since to leading order, \(\mathcal {H}^{\prime }=\mathcal {H}+[iS,\mathcal {H}]-\dot{S}+\cdots \), this cancellation must come from \([iS,\gamma ^{0}m]=-\gamma ^{0}\mathcal {O}\), that is, \(iS=\mathcal {O}/(2m)\). Performing that transformation cancels the \( \mathcal {O}\) term in \(\mathcal {H}\), but brings back odd terms at higher orders (proportional to \([\mathcal {O},\mathcal {E}]\), \(\mathcal {O}^{3}\), etc), so the procedure must be iterated up to some given order in 1/m. After three steps, the Hamiltonian becomes

$$\begin{aligned} \mathcal {H}^{\textrm{NR}}= & {} \gamma ^{0}\left( m-\frac{\mathcal {O}^{2}}{2m}- \frac{\mathcal {O}^{4}}{8m^{3}}+\frac{\mathcal {V}_{1}^{2}}{8m^{3}}\right) +\mathcal {E}\nonumber \\{} & {} +\frac{[\mathcal {O},\mathcal {V}_{1}]}{8m^{2}}+\mathcal {O} (1/m^{4}), \end{aligned}$$
(15)

where \(\mathcal {V}_{1}\equiv [\mathcal {O},\mathcal {E}]+i\mathcal {\dot{O} }\). When applied on a four-component spinor, the upper two and lower two components are decoupled. Given the choice of \(\gamma ^{0}\), only the large upper component needs to be kept, as the lower small component dynamics is dampened by the rest mass, i.e., by a \(\textbf{P}/m\) factor.
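As a quick consistency check of the construction, one can specialize to a free fermion, \(\mathcal {O}=\varvec{\gamma }\cdot \textbf{p}\) and \(\mathcal {E}=0\), so that \(\mathcal {V}_{1}=0\). The sketch below, written under the assumption that \(\textbf{p}\) is a c-number vector, verifies the first-step condition \([iS,\gamma ^{0}m]=-\gamma ^{0}\mathcal {O}\) for \(iS=\mathcal {O}/(2m)\), and compares the energy read off from Eq. (15) with the exact \(\sqrt{m^{2}+\textbf{p}^{2}}\). The function names are illustrative only.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
G0 = np.block([[I2, Z2], [Z2, -I2]])
GAM = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]


def gdot(v):
    """Return gamma . v for a c-number three-vector v."""
    return sum(vi * gi for vi, gi in zip(v, GAM))


def first_step_residual(p, m):
    """[iS, gamma^0 m] + gamma^0 O with iS = O/(2m); should vanish."""
    O = gdot(p)
    iS = O / (2 * m)
    return iS @ (m * G0) - (m * G0) @ iS + G0 @ O


def nr_energy(p, m):
    """Upper-block energy from Eq. (15) for O = gamma.p, E = 0 (so V_1 = 0)."""
    O2 = gdot(p) @ gdot(p)                        # equals -p^2 times identity
    H_nr = G0 @ (m * np.eye(4) - O2 / (2 * m) - O2 @ O2 / (8 * m**3))
    return H_nr[0, 0].real


if __name__ == "__main__":
    p, m = [0.1, 0.2, -0.05], 1.0
    exact = np.sqrt(m**2 + np.dot(p, p))
    print("residual norm:", np.linalg.norm(first_step_residual(p, m)))
    print("Eq. (15):", nr_energy(p, m), " exact:", exact)
```

The difference between the two energies is of relative order \((|\textbf{p}|/m)^{6}\), as expected for an expansion truncated at \(\mathcal {O}(1/m^{3})\).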

The FW transformation will be the first step in all our developments. Yet, it is important to stress that it is not the end of the story. As was realized comparing various block-diagonalization methods, including the elimination method, there are some ambiguities in the final form of \( \mathcal {H}^{\textrm{NR}}\). This simply reflects the fact that additional unitary transformations \(\psi \rightarrow \psi ^{\prime }=e^{iS}\psi \) are still allowed as long as S is even (for a review, see e.g. Ref. [49]). This feature, at the root of Schiff’s theorem, will be used extensively in the following.

2.2 Application to electromagnetic interactions

Taking \(\mathcal {O}=\varvec{\gamma }\cdot (\textbf{p}-e\textbf{A})\equiv \varvec{\gamma }\cdot \textbf{P}\) and \(\mathcal {E}=e\phi \), and keeping only terms up to \(\mathcal {O}(1/m^{3})\), the standard result is recovered:

$$\begin{aligned} \mathcal {H}_{EM}^{\textrm{NR}}&=\gamma ^{0}\left( m+\frac{\textbf{P}^{2}}{2m }-\frac{e\varvec{\sigma }\cdot \textbf{B}}{2m}\right. \nonumber \\&\quad \left. -\dfrac{\textbf{P}^{4}-e\{ \textbf{P}^{2},\varvec{\sigma }\cdot \textbf{B}\}-e^{2}(\textbf{E}^{2}- \textbf{B}^{2})}{8m^{3}}\right) +e\phi \nonumber \\&\quad -\frac{e\left( (\varvec{\nabla }\cdot \textbf{E})+i\varvec{ \sigma }\cdot (\varvec{\nabla }\times \textbf{E})+2\varvec{\sigma }\cdot ( \textbf{E}\times \textbf{P})\right) }{8m^{2}}\nonumber \\&\quad +\mathcal {O}(1/m^{4}). \end{aligned}$$
(16)

By convention, \(\varvec{\nabla }\) acts on the quantity immediately to its right, but \(\textbf{P}\) acts on everything. Note that the \(\varvec{ \sigma }\) matrices occurring are to be interpreted as \(\textbf{1}\otimes \varvec{\sigma }\), since this Hamiltonian still acts on four-dimensional spinors. Yet, the Hamiltonian being block-diagonal, the reduction to the Pauli equation is now trivial. As is well known, one can identify the Zeeman magnetic coupling \( \varvec{\sigma }\cdot \textbf{B}=2\textbf{S}\cdot \textbf{B}\) with \( \textbf{S}\) the spin operator and \(g=2\) the gyromagnetic factor, the spin-orbit coupling \(\varvec{\sigma }\cdot (\textbf{E}\times \textbf{P})\), and the Darwin term \(\varvec{\nabla }\cdot \textbf{E}\).

A more interesting application starts by including the higher order magnetic moment and electric moment operators

$$\begin{aligned} \mathcal {H}_{EM}&=\gamma ^{0}\left( \varvec{\gamma }\cdot \textbf{P}+m\!+\! \frac{\delta _{\mu }}{2}\sigma ^{\mu \nu }F_{\mu \nu }\!-\! i\frac{d}{2}\sigma ^{\mu \nu }\gamma ^{5}F_{\mu \nu }\right) \!+\!e\phi \nonumber \\&=\gamma ^{0}\left( \varvec{\gamma }\cdot \textbf{P}+m+i\gamma ^{0} \varvec{\gamma }\cdot ((\delta _{\mu }\textbf{E}+d\textbf{B})\right. \nonumber \\&\quad \left. +i\gamma ^{5}(\delta _{\mu }\textbf{B}-d\textbf{E}))\right) +e\phi \ , \end{aligned}$$
(17)

where electromagnetic fields satisfy \(F^{0i}=-E^{i}\), \(B^{i}=-1/2\varepsilon ^{ijk}F_{jk}\) and \(\delta _{\mu }\equiv ea/2\,m\). Plugging the odd term \( \mathcal {O}=\varvec{\gamma }\cdot \textbf{P}+i\gamma ^{0}\varvec{ \gamma }\cdot (\delta _{\mu }\textbf{E}+d\textbf{B})\) and the even term \( \mathcal {E}=e\phi +\gamma ^{5}\varvec{\gamma }\cdot (\delta _{\mu } \textbf{B}-d\textbf{E})\) in Eq. (15), keeping in mind that \(\delta _{\mu }\) and d are \(\mathcal {O}(m^{-1})\), and discarding terms of \( \mathcal {O}(m^{-4})\) and higher, the block-diagonal Hamiltonian is now

$$\begin{aligned} \mathcal {H}_{EM}^{\textrm{NR}}&=\gamma ^{0}\left( m+\frac{\textbf{P}^{2}}{2m }-\frac{e\left( 1+a\right) \varvec{\sigma }\cdot \textbf{B}}{2m}\right. \nonumber \\&\quad \left. +d \varvec{\sigma }\cdot \textbf{E}-\frac{\textbf{P}^{4}-e\{\textbf{P}^{2}, \varvec{\sigma }\cdot \textbf{B}\}-e^{2}(\textbf{E}^{2}-\textbf{B}^{2})}{ 8m^{3}}\right) \nonumber \\&\quad +e\phi +\frac{ie(1+2a)[\varvec{\gamma }\cdot \textbf{P}, \varvec{\gamma }\cdot \textbf{E}]}{8m^{2}}+\frac{id[\varvec{\gamma } \cdot \textbf{P},\varvec{\gamma }\cdot \textbf{B}]}{2m} \nonumber \\&\quad +\gamma ^{0}\left( \frac{a(a+1)e^{2}}{8m^{3}}\textbf{E}^{2}+\frac{ d^{2}}{2m}\textbf{B}^{2}\right. \nonumber \\&\quad \left. -\frac{e(1+2a)d}{8m^{2}}\{\varvec{\gamma }\cdot \textbf{B},\varvec{\gamma }\cdot \textbf{E}\}\right) \nonumber \\&\quad +\frac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\{ \varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E})\}\}+\gamma ^{0}\{\varvec{\gamma }\cdot \textbf{P },\varvec{\gamma }\cdot (\delta _{\mu }\varvec{\dot{E}}+d\varvec{\dot{B}})\}}{8m^{2}}\nonumber \\&\quad +\mathcal {O}(1/m^{4}). \end{aligned}$$
(18)

The magnetic operator thus describes the deviation of the magnetic moment from its Dirac value, \(a=(g-2)/2\). The \(\varvec{\sigma }\cdot \textbf{E}\) term describes the electric dipole interaction, with d the EDM. If we remember the identities

$$\begin{aligned}{}[\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{X}]&=-(\textbf{p}\cdot \textbf{X})-i\varvec{\sigma }\cdot ( \textbf{p}\times \textbf{X})+2i\varvec{\sigma }\cdot (\textbf{X}\times \textbf{P})\ , \end{aligned}$$
(19a)
$$\begin{aligned} \{\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{X} \}&=-(\textbf{p}\cdot \textbf{X})-i\varvec{\sigma }\cdot (\textbf{p} \times \textbf{X})-2(\textbf{X}\cdot \textbf{P})\ , \end{aligned}$$
(19b)

the Darwin and spin-orbit couplings are identified inside \([\varvec{ \gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{E}]\), now modified by a magnetic moment contribution and accompanied by magnetic interactions induced by d.
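The algebraic core of Eq. (19) can be checked in the simplified situation where both vectors are constant c-numbers, so that the derivative terms \(\textbf{p}\cdot \textbf{X}\) and \(\textbf{p}\times \textbf{X}\) drop out. In that limit, which is an assumption of this sketch and not the general operator statement, the commutator reduces to \(2i\varvec{\sigma }\cdot (\textbf{X}\times \textbf{P})\) and the anticommutator to \(-2\textbf{X}\cdot \textbf{P}\). The helper names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
GAM = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]
SIG4 = [np.block([[s, Z2], [Z2, s]]) for s in SIGMA]


def gdot(v):
    """gamma . v for a c-number three-vector v."""
    return sum(vi * gi for vi, gi in zip(v, GAM))


def sdot(v):
    """sigma . v, acting identically on both two-component blocks."""
    return sum(vi * si for vi, si in zip(v, SIG4))


def check_eq19_constant_vectors(P, X):
    """Compare both sides of Eq. (19) for constant (commuting) vectors P and X."""
    P, X = np.asarray(P, float), np.asarray(X, float)
    comm = gdot(P) @ gdot(X) - gdot(X) @ gdot(P)
    anti = gdot(P) @ gdot(X) + gdot(X) @ gdot(P)
    comm_ok = np.allclose(comm, 2j * sdot(np.cross(X, P)))
    anti_ok = np.allclose(anti, -2 * np.dot(X, P) * np.eye(4))
    return bool(comm_ok), bool(anti_ok)


if __name__ == "__main__":
    print(check_eq19_constant_vectors([0.2, -1.0, 0.7], [1.5, 0.3, -0.4]))
```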

2.3 Schiff’s theorem and beyond

As stated before, the FW transformed Hamiltonian can still be unitarily rotated without breaking its block-diagonal character. The simplest such transformation is

$$\begin{aligned} iS_{1}=-\frac{i\alpha }{m}\gamma ^{5}\varvec{\gamma }\cdot \textbf{P}\ . \end{aligned}$$
(20)

The transformation \(\exp (iS_{1})\) is unitary, and importantly, it commutes with the mass term \(\gamma ^{0}m\). One should not be put off by the fact that this transformation involves the external fields via \(\textbf{P}\). Actually, we already did many such transformations to block-diagonalize the Hamiltonian, since the first FW transformation is \(\exp (iS)\) with \(iS= \mathcal {O}/(2\,m)\) and \(\mathcal {O}=\varvec{\gamma }\cdot \textbf{P} +i\gamma ^{0}\varvec{\gamma }\cdot (\delta _{\mu }\textbf{E}+d\textbf{B})\). All that differs here is the \(\gamma ^{5}\) factor, making \(S_{1}\) even with respect to \(\gamma ^{0}\).
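The stated properties of \(S_{1}\) follow from the matrix structure of \(\gamma ^{5}\varvec{\gamma }\). The sketch below checks them numerically for a c-number \(\textbf{P}\), which is a simplifying assumption: in that case \((\gamma ^{5}\varvec{\gamma }\cdot \textbf{P})^{2}=\textbf{P}^{2}\), so the exponential has the closed form used in the hypothetical helper `schiff_rotation`.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
G0 = np.block([[I2, Z2], [Z2, -I2]])
G5 = np.block([[Z2, I2], [I2, Z2]])
GAM = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]


def gdot(v):
    """gamma . v for a c-number three-vector v."""
    return sum(vi * gi for vi, gi in zip(v, GAM))


def is_even(M):
    """True if M commutes with gamma^0."""
    return bool(np.allclose(M @ G0, G0 @ M))


def is_odd(M):
    """True if M anticommutes with gamma^0."""
    return bool(np.allclose(M @ G0, -G0 @ M))


def schiff_rotation(P, alpha, m):
    """exp(iS_1) with iS_1 = -(i alpha/m) gamma^5 gamma.P, for nonzero c-number P."""
    M = G5 @ gdot(P)                      # Hermitian, with M @ M = |P|^2
    p_norm = np.linalg.norm(P)
    x = alpha * p_norm / m
    return np.cos(x) * np.eye(4) - 1j * np.sin(x) * M / p_norm


if __name__ == "__main__":
    P = [0.2, 0.1, -0.3]
    U = schiff_rotation(P, alpha=0.7, m=1.5)
    print("gamma.P odd:", is_odd(gdot(P)))
    print("gamma^5 gamma.P even:", is_even(G5 @ gdot(P)))
    print("exp(iS_1) unitary:", np.allclose(U @ U.conj().T, np.eye(4)))
```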

The new Hamiltonian \(\mathcal {H}^{\prime }=e^{iS_{1}}\left( \mathcal {H} -i\partial _{t}\right) e^{-iS_{1}}\) can be expanded as before, and since \( iS_{1}\sim \mathcal {O}(m^{-1})\), we need to compute:

$$\begin{aligned} \mathcal {H}^{\prime }= & {} \mathcal {H}+[iS_{1},\mathcal {H}]-\dot{S}_{1}+\frac{1}{2 }[iS_{1},[iS_{1},\mathcal {H}]-\dot{S}_{1}]\nonumber \\{} & {} +\frac{1}{3!} [iS_{1},[iS_{1},[iS_{1},\mathcal {H}]-\dot{S}_{1}]]+\mathcal {O}(m^{-4}). \end{aligned}$$
(21)

Now, the key in Schiff’s theorem [47] is to note that the \( \mathcal {O}(1/m)\) terms miraculously combine as

$$\begin{aligned}{}[iS_{1},e\phi ]-\dot{S}_{1}&=-\frac{e\alpha }{m}\gamma ^{5}\varvec{\gamma }\cdot (\varvec{\nabla }\phi +\varvec{\dot{A}})\\&=\frac{e\alpha }{m}\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}=-\frac{e\alpha }{m}\gamma ^{0}\varvec{\sigma }\cdot \textbf{E}\ . \end{aligned}$$
(22)

Thus, with \(\alpha =md/e\), the EDM term in \(\mathcal {H}_{EM}^{\textrm{NR}}\) is rotated away! More accurately, we should say that it is transformed into higher order corrections coming from the rest of Eq. (21). After some algebra, the transformed Hamiltonian is found to be, keeping only terms at most linear in either a or d, since these quantities are experimentally small,

$$\begin{aligned} \mathcal {H}_{EM}^{\textrm{NR}}&=\gamma ^{0}\left( m+\frac{\textbf{P}^{2}}{2m }-\frac{e\left( 1+a\right) \varvec{\sigma }\cdot \textbf{B}}{2m}\right. \nonumber \\&\quad \left. -\frac{ \textbf{P}^{4}-e\{\textbf{P}^{2},\varvec{\sigma }\cdot \textbf{B}\}-e^{2}(\textbf{E}^{2}-\textbf{B}^{2})}{8m^{3}}\right) +e\phi \nonumber \\&\quad +ie\frac{1+2a}{8m^{2}}[\varvec{\gamma }\cdot \textbf{P}, \varvec{\gamma }\cdot \textbf{E}]+\frac{id}{2m}[\varvec{\gamma } \cdot \textbf{P},\varvec{\gamma }\cdot \textbf{B}] \nonumber \\&\quad +\gamma ^{0}\left( \frac{ae^{2}}{8m^{3}}\textbf{E}^{2}-\frac{ed}{ 8m^{2}}\{\varvec{\gamma }\cdot \textbf{B},\varvec{\gamma }\cdot \textbf{E}\}\right) \nonumber \\&\quad +\frac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\{ \varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E})\}\}+\gamma ^{0}\{\varvec{\gamma }\cdot \textbf{P },\varvec{\gamma }\cdot (\delta _{\mu }\varvec{\dot{E}}+d\varvec{\dot{B}})\}}{8m^{2}} \nonumber \\&\quad +\frac{d}{8m^{2}}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P} ,[\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{E} ]]+\mathcal {O}(1/m^{4})\ , \end{aligned}$$
(23)

where the only non-trivial reduction is \([\varvec{\gamma }\cdot \textbf{P },\textbf{P}^{2}]=-e\gamma ^{0}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{ P},\varvec{\gamma }\cdot \textbf{B}]\), the rest being straightforward algebraic manipulations.
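The leading-order cancellation behind Schiff’s theorem, namely that the shift of Eq. (22) with \(\alpha =md/e\) removes the \(d\,\gamma ^{0}\varvec{\sigma }\cdot \textbf{E}\) term of Eq. (18), only relies on the matrix identity \(\gamma ^{5}\varvec{\gamma }=-\gamma ^{0}\varvec{\sigma }\). A minimal numerical sketch, with arbitrary illustrative values for \(\textbf{E}\), d, e and m and a hypothetical function name, is:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
G0 = np.block([[I2, Z2], [Z2, -I2]])
G5 = np.block([[Z2, I2], [I2, Z2]])
GAM = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]
SIG4 = [np.block([[s, Z2], [Z2, s]]) for s in SIGMA]


def gdot(v):
    """gamma . v for a c-number three-vector v."""
    return sum(vi * gi for vi, gi in zip(v, GAM))


def sdot(v):
    """sigma . v, acting identically on both two-component blocks."""
    return sum(vi * si for vi, si in zip(v, SIG4))


def edm_residual(E, d, e, m):
    """EDM term of Eq. (18) plus the Schiff shift of Eq. (22), for alpha = m d / e."""
    alpha = m * d / e
    edm_term = d * (G0 @ sdot(E))                       # + d gamma^0 sigma.E
    schiff_shift = (e * alpha / m) * (G5 @ gdot(E))     # + (e alpha/m) gamma^5 gamma.E
    return edm_term + schiff_shift


if __name__ == "__main__":
    # Arbitrary illustrative numbers; the residual should vanish identically.
    print(np.linalg.norm(edm_residual([0.3, -1.0, 0.5], d=0.02, e=0.3, m=1.0)))
```

Only this leading term is covered here; the higher order operators generated by the rest of Eq. (21) are not reproduced by such a matrix-level check.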

2.3.1 Higher order Schiff transformations and operator redundancies

Central to Schiff’s theorem is the presence of the \(-\dot{S}_{1}\) piece that directly enters in the transformed Hamiltonian in Eq. (21), and can thus directly interfere with the other terms. When applied on \(\gamma ^{5}\varvec{\gamma }\cdot \textbf{P}\), it generates a \(\gamma ^{5} \varvec{\gamma }\cdot \varvec{\dot{A}}\) term out of which \(\gamma ^{5} \varvec{\gamma }\cdot \textbf{E}\) emerges without an additional \(m^{-1}\) factor. This same trick can be used for any term that involves time derivatives of external fields. For instance, consider now

$$\begin{aligned} iS_{2}=\frac{i\beta }{8m^{2}}\gamma ^{0}\{\varvec{\gamma }\cdot \textbf{P}, \varvec{\gamma }\cdot (\delta _{\mu }\textbf{E}+d\textbf{B})\}. \end{aligned}$$
(24)

Since it is already of \(\mathcal {O}(m^{-3})\) (remember \(\delta _{\mu }\) and d are \(\mathcal {O}(m^{-1})\)), only the leading commutator with \(e\phi \) needs to be computed. Again, \([iS_{2},e\phi ]\) combines with the \( \varvec{\dot{P}}\) in \(-\dot{S}_{2}\) to give a \(\varvec{\nabla }\phi +\varvec{\dot{A}}=-\textbf{E}\) factor:

$$\begin{aligned}{}[iS_{2},e\phi ]-\dot{S}_{2}= & {} -\frac{e\beta }{8m^{2}}\gamma ^{0}\{ \varvec{\gamma }\cdot \textbf{E},\varvec{\gamma }\cdot (\delta _{\mu }\textbf{E}+d\textbf{B})\}\nonumber \\{} & {} -\frac{\beta }{8m^{2}}\gamma ^{0}\{\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot (\delta _{\mu }\varvec{\dot{E}}+d \varvec{\dot{B}})\}. \end{aligned}$$
(25)

This time though, we find a redundancy among \(\mathcal {O}(1/m^{3})\) operators, up to higher order corrections. Our preferred choice is to take \( \beta =1\) to get rid of the \(\varvec{\dot{E}}\) and \(\varvec{\dot{B}}\) operators, but one could equally well decide to keep the \(\varvec{\dot{E}}\) operator and eliminate the \(\textbf{E}^{2}\) term, or keep the \(\varvec{\dot{B }}\) term and eliminate the \(\textbf{E}\cdot \textbf{B}\) couplings. A third possible transformation is

$$\begin{aligned} iS_{3}=\frac{i\varepsilon }{m^{3}}\gamma ^{5}\{\{\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{P}\},\varvec{\gamma }\cdot \textbf{P}\}\ , \end{aligned}$$
(26)

which also introduces a redundancy among \(\mathcal {O}(1/m^{3})\) operators, up to higher order corrections,

$$\begin{aligned}{}[iS_{3},e\phi ]-\dot{S}_{3}= & {} -\frac{e\varepsilon }{m^{3}} \gamma ^{5}\left( 2\{\varvec{\gamma }\cdot \textbf{P},\{\varvec{\gamma } \cdot \textbf{P},\varvec{\gamma }\cdot \textbf{E}\}\}\right. \nonumber \\{} & {} \left. +\{\varvec{\gamma } \cdot \textbf{E},\{\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{P}\}\}\right) . \end{aligned}$$
(27)

These redundancies can be used to reduce the number of relevant operators. In Appendix B, we present one possible choice of \(S_{2}\) and \( S_{3} \) that bring \(\mathcal {H}_{EM}^{\textrm{NR}}\) to a somewhat optimal form. It should be stressed though that the final coefficients for the higher order operators in \(\mathcal {H}_{EM}^{\textrm{NR}}\) should not be taken too literally. Indeed, once adopting an effective description with the \(\mathcal {O}(1/m)\) couplings \(\sigma ^{\mu \nu }F_{\mu \nu }\) and \(\sigma ^{\mu \nu }\tilde{F}_{\mu \nu }\), one could in principle also include \(\mathcal {O} (1/m^{2})\) or \(\mathcal {O}(1/m^{3})\) operators. For example, if one adds the \(F_{\mu \nu }F^{\mu \nu }\) or \(F_{\mu \nu }\tilde{F}^{\mu \nu }\) operators in \( \mathcal {H}_{EM}\), their coefficients will directly correct those of \( \textbf{E}^{2}-\textbf{B}^{2}\) and \(\textbf{E}\cdot \textbf{B}\) in \(\mathcal {H }_{EM}^{\textrm{NR}}\).

Finally, it is worth stressing that this list certainly does not exhaust the possible unitary transformations, and that not all such transformations encode useful information. For example, consider

$$\begin{aligned} iS_{4}=\frac{i\eta }{m^{2}}\{\varvec{\gamma }\cdot \textbf{P}, \varvec{\gamma }\cdot \textbf{P}\}\ , \end{aligned}$$
(28)

which is even and hermitian for \(\eta \) real. The change in the Hamiltonian is

$$\begin{aligned}{}[iS_{4},\mathcal {H}]-\dot{S}_{4}=-\frac{2e\eta }{m^{2}}\{\varvec{ \gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{E}\}+\mathcal {O} (1/m^{4}). \end{aligned}$$
(29)

This transformation just adds the \(\{\varvec{\gamma }\cdot \textbf{P}, \varvec{\gamma }\cdot \textbf{E}\}\) operator to the Hamiltonian, up to higher order terms. To understand why this has no impact on the physics, let us first expand it using Eq. (19),

$$\begin{aligned}{} & {} -\frac{2e\eta }{m^{2}}\{\varvec{\gamma }\cdot \textbf{P},\varvec{ \gamma }\cdot \textbf{E}\}=-i\frac{2e\eta }{m^{2}}\nonumber \\{} & {} \quad \times \left( \varvec{\nabla } \cdot \textbf{E}+i\varvec{\sigma }\cdot (\varvec{\nabla }\times \textbf{E }) +2\textbf{E}\cdot \varvec{\nabla }\right) . \end{aligned}$$
(30)

If we could take \(\eta \) imaginary, this operator would interfere with the Darwin and spin-orbit operator \(i[\varvec{\gamma }\cdot \textbf{P}, \varvec{\gamma }\cdot \textbf{E}]\), but this would make \(\exp (iS_{4})\) non-unitary. Actually, this operator has no impact because \(\varvec{\nabla } \cdot \textbf{E}\) and \(2\textbf{E}\cdot \varvec{\nabla }\) compensate each other when acting on wavefunctions (both are standard forms for the Darwin operator), while the \(\varvec{\sigma }\cdot (\varvec{\nabla }\times \textbf{E})=-\varvec{\sigma }\cdot \varvec{\dot{B}}\) term drops out for static fields (and could be rotated away by a dedicated unitary transformation with \(S_{5}\sim \varvec{\sigma }\cdot \textbf{B}\) anyway).

2.3.2 Schiff theorem and charged fermion EDMs

Schiff’s theorem shows that the energy of a charged particle cannot be influenced by its EDM at leading order. The naive interpretation of this result is that a charged particle placed in an electric field would feel the Lorentz force and fly away. Schiff’s transformation is then viewed as a translation that moves us to some sort of rest frame for the charged fermion in which there is no electric field anymore, hence where the EDM operator vanishes and cannot contribute to Stark energy shifts. Thus, for charged fermions, \([\varvec{\gamma }\cdot \textbf{P}, \varvec{\gamma }\cdot \textbf{B}]\) encodes the leading impact the EDM has on the particle energies in the non-relativistic limit. Using Eq. (19), one can recognize in this term the spin-dependent \(\varvec{ \sigma }\cdot (\textbf{p}\times \textbf{B})\) coupling discussed originally by Schiff [47]. To feel the EDM with electric fields, one has to go to the \(\mathcal {O}(1/m^{3})\) operators \(\gamma ^{5}\{ \varvec{\gamma }\cdot \textbf{P},\{\varvec{\gamma }\cdot \textbf{P}, \varvec{\gamma }\cdot \textbf{E}\}\}\) or \(\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},[\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma } \cdot \textbf{E}]]\) which, thanks to Eq. (27), can both be reduced to \(\gamma ^{0}\{\textbf{P}^{2},\varvec{\sigma }\cdot \textbf{E} \}\). This operator thus encodes the leading relativistic corrections. For the case of an electron in a heavy atom, significant enhancements of this operator have been found that guarantee an experimental sensitivity to the electron EDM [50]. Finally, it should be mentioned that another way to evade the shielding of the EDM is to account for finite-size effects, which clearly go beyond the current formalism (for a review, see Refs. [51, 52]).

Schiff’s theorem is a statement about the Hamiltonian, and thus applies to the energy levels of bound charged fermions. It does not mean that the EDM operator cannot be felt using other observables. In particular, the electric field does exert a torque on the spin of charged fermions, leading to its precession in adequate experimental settings. Specifically, to leading order, the spin operator \(\varvec{S}=\varvec{\sigma }/2\) evolves according to \(\mathcal {H}=\gamma ^{0}\left( m+d\varvec{\sigma }\cdot \textbf{E}\right) +e\phi \) as

$$\begin{aligned} \varvec{\dot{S}}^{i}{} & {} =-i[\varvec{S}^{i},\mathcal {H}]=\frac{i}{2} d\gamma ^{0}[\varvec{\gamma }^{i},\varvec{\gamma }\cdot \textbf{E} ]=d\gamma ^{0}\varepsilon ^{ijk}\varvec{\sigma }^{k}\textbf{E} ^{j}\nonumber \\{} & {} =-2d\gamma ^{0}\varvec{S}\times \textbf{E}. \end{aligned}$$
(31)

which is nothing but one term of the generalized Bargmann–Michel–Telegdi equation [53, 54]. After the Schiff rotation, the \(d\varvec{\sigma }\cdot \textbf{E}\) operator is removed from \(\mathcal {H}\), and the spin operator in that basis satisfies

$$\begin{aligned} \varvec{\dot{S}}^{\prime }=-i[\varvec{S}^{\prime },\mathcal {H} ^{\prime }]=-i[\varvec{S}^{\prime },e\phi ]. \end{aligned}$$
(32)

The crucial point that makes this equation compatible with that of \( \varvec{S}\) is that the spin operator does not commute with the Schiff transformation, so that \(\varvec{S}^{\prime }=e^{iS_{1}}\varvec{S} e^{-iS_{1}}\ne \varvec{S}\). Plugging this in the above equation implies

$$\begin{aligned}{} & {} \frac{d}{dt}(e^{iS_{1}}\varvec{S}e^{-iS_{1}})=-i[e^{iS_{1}}\varvec{S} e^{-iS_{1}},e\phi ]\nonumber \\{} & {} \qquad \Longrightarrow \varvec{\dot{S}}=i\left[ \varvec{S },[iS_{1},e\phi ]-\dot{S}_{1}\right] , \end{aligned}$$
(33)

which obviously holds by construction, since \([iS_{1},e\phi ]-\dot{S} _{1}=-\gamma ^{0}d\varvec{\sigma }\cdot \textbf{E}\), see Eq. (22). This exercise provides another interpretation of Schiff’s theorem. In a gauge in which \(\phi =0\), \(\varvec{S}^{\prime }\) appears constant in time, so the Schiff rotation of Eq. (20) is actually that to the rotating frame in which the spin appears static. This could have been guessed from the start if one notes that \(S_{1}\) actually involves the helicity operator, \(\gamma ^{5}\varvec{\gamma }\cdot \textbf{P}=-\gamma ^{0}\varvec{\sigma }\cdot \textbf{P}\).
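The torque equation (31) can be cross-checked numerically. The following is a minimal sketch, assuming we work in the upper (\(\gamma ^{0}=+1\)) block so that only Pauli matrices are needed, and dropping the \(m\) and \(e\phi \) terms since they commute with the spin. The values of `d` and `E` are arbitrary illustrative choices, not taken from the text.

```python
# Pauli matrices as nested lists of complex numbers.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lincomb(*terms):
    """Return sum_k c_k * M_k for (c_k, M_k) pairs of 2x2 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    return lincomb((1, matmul(a, b)), (-1, matmul(b, a)))

d = 0.3               # hypothetical EDM strength (arbitrary units)
E = (0.5, -1.2, 2.0)  # hypothetical electric field components

S = [lincomb((0.5, s)) for s in SIGMA]                      # S = sigma / 2
H_edm = lincomb(*[(d * E[j], SIGMA[j]) for j in range(3)])  # d sigma.E

# Left-hand side of Eq. (31): -i [S^i, H]
lhs = [lincomb((-1j, commutator(S[i], H_edm))) for i in range(3)]

# Right-hand side of Eq. (31): -2 d (S x E)^i
cross = [
    lincomb((E[2], S[1]), (-E[1], S[2])),
    lincomb((E[0], S[2]), (-E[2], S[0])),
    lincomb((E[1], S[0]), (-E[0], S[1])),
]
rhs = [lincomb((-2 * d, c)) for c in cross]

max_dev = max(abs(lhs[i][r][c] - rhs[i][r][c])
              for i in range(3) for r in range(2) for c in range(2))
print("largest deviation between both sides:", max_dev)
```

Both sides agree to machine precision for any choice of `d` and `E`, since the check only relies on \([\sigma ^{i},\sigma ^{j}]=2i\varepsilon ^{ijk}\sigma ^{k}\).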

2.3.3 Schiff theorem and neutral fermion EDMs

Schiff’s theorem cannot apply to neutral fermions. Indeed, one can simply send \(e\rightarrow 0\) to decouple EM fields in Eq. (17) while keeping an explicit EDM term, but the parameter of the Schiff transformation in Eq. (20) has to be set to \(\alpha =md/e\), which is undefined in that limit. Said differently, it is only through a delicate interplay with the couplings to the external EM fields that the Schiff transformation can interfere destructively with the EDM term. Thus, for the neutron, all one can do is to eliminate the \(\varvec{\dot{E}}\) and \(\varvec{\dot{B}}\) couplings, and starting from Eq. (17) in the \(e\rightarrow 0\) limit, one ends up with

$$\begin{aligned} \left. \mathcal {H}_{EM}^{\textrm{NR}}\right| _{e\rightarrow 0}&=\gamma ^{0}\left( m+\frac{\textbf{p}^{2}}{2m}-\frac{\textbf{p}^{4}}{8m^{3}}-\delta _{\mu }\varvec{\sigma }\cdot \textbf{B}+d\varvec{\sigma }\cdot \textbf{E}\right. \nonumber \\&\quad \left. +\frac{(\delta _{\mu }\textbf{E}+d\textbf{B})^{2}}{2m}\right) +\frac{i[\varvec{\gamma }\cdot \textbf{p},\varvec{\gamma } \cdot (\delta _{\mu }\textbf{E}+d\textbf{B})]}{2m}\nonumber \\&\quad +\frac{\gamma ^{5}\{ \varvec{\gamma }\cdot \textbf{p},\{\varvec{\gamma }\cdot \textbf{p}, \varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E})\}\}}{8m^{2}}\! +\!\mathcal {O}(1/m^{4}). \end{aligned}$$
(34)

It is not possible to rotate away the EDM. At the fundamental level, d is induced by all the CP-violating operators involving gluons and/or quarks (for a review, see e.g. Ref. [55]). The most important contribution is that of the \(\theta \) term, at the root of the strong CP puzzle, and which is estimated as [56, 57]

$$\begin{aligned} d_{n}=-(2.7\pm 1.2)\times 10^{-16}\theta ~e\text { cm .} \end{aligned}$$
(35)

In the SM, the CKM contribution is negligible, but in principle, some New Physics may also induce fundamental EDMs for the quarks (see e.g. Ref. [58] and references cited there). As those are certainly far from non-relativistic inside a neutron, Schiff’s theorem should be largely evaded. In an SU(6) model, the neutron EDM then receives the additional contribution [55]

$$\begin{aligned} d_{n}=\frac{4}{3}d_{d}-\frac{1}{3}d_{u}. \end{aligned}$$
(36)

Note that the same rather naive model gives \(\delta _{\mu }=(4\mu _{d}-\mu _{u})/3\) with \(\mu _{u,d}=Q_{u,d}e/(2m_{u,d})\), \(Q_{u}=2/3\) and \(Q_{d}=-1/3\). With constituent quark masses \( m_{u}=m_{d}=m_{N}/3\), this gives \(\delta _{\mu }=-2e/(2m_{N})\), in fairly good agreement with the measured \(\delta _{\mu }=-1.913e/(2m_{N})\). Though this hardly suffices to justify Eq. (36), as there is no analog of Schiff’s screening for the magnetic moment, Eq. (36) is in fairly good agreement with recent lattice calculations [59], \(d_{n}\approx (0.82\pm 0.03)d_{d}-(0.21\pm 0.01)d_{u}\). This shows that Schiff’s screening theorem does not apply to quarks, as could have been expected since those are bound not by the electromagnetic interactions but by the strong interactions.
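The arithmetic behind these quark-model numbers can be reproduced in a few lines. The sketch below assumes the usual Dirac moments \(\mu _{q}=Q_{q}e/(2m_{q})\) with \(Q_{u}=2/3\), \(Q_{d}=-1/3\) (an assumption spelled out here for definiteness), works in units of \(e/(2m_{N})\), and compares the SU(6) coefficients of Eq. (36) with the central lattice values quoted above. All function and variable names are illustrative.

```python
from fractions import Fraction

# Quark electric charges in units of e (standard assignments).
Q_U, Q_D = Fraction(2, 3), Fraction(-1, 3)

def quark_moment(charge, mass_over_mN):
    """mu_q = Q_q e / (2 m_q), expressed in units of e / (2 m_N)."""
    return charge / mass_over_mN

def neutron_moment(mu_d, mu_u):
    """SU(6) combination (4 mu_d - mu_u) / 3."""
    return (4 * mu_d - mu_u) / 3

# Constituent quark masses m_u = m_d = m_N / 3.
constituent_mass = Fraction(1, 3)
delta_mu = neutron_moment(quark_moment(Q_D, constituent_mass),
                          quark_moment(Q_U, constituent_mass))
print("delta_mu in units of e/(2 m_N):", delta_mu)

# Coefficients of d_d and d_u in d_n: SU(6), Eq. (36), versus the
# central lattice values quoted in the text.
SU6 = {"d": Fraction(4, 3), "u": Fraction(-1, 3)}
LATTICE = {"d": 0.82, "u": -0.21}
ratios = {q: LATTICE[q] / float(SU6[q]) for q in SU6}
print("lattice / SU(6) coefficient ratios:", ratios)
```

With these inputs, the magnetic moment comes out at exactly \(-2\) in units of \(e/(2m_{N})\), while both lattice coefficients are roughly 0.6 times their SU(6) counterparts, with the same signs and nearly the same relative weight.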

3 Axion interactions in the non-relativistic limit

Nowadays, the equivalence between the pseudoscalar and derivative axial interactions is understood as a particular application of the general reparametrization theorem to Goldstone bosons [60]. Let us recall the essence of the argument (see Ref. [61] for more details). For a typical axion model, one starts with a spontaneously broken chiral symmetry, \(U(1)_{PQ}\). Then, the statement that the Goldstone boson a must interact derivatively leads to the unique interaction term (remember that \(g=m/\Lambda \) is absorbed into a):

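$$\begin{aligned} \mathcal {L}_{D}=\bar{\psi }\left( i\gamma ^{\mu }\partial _{\mu }-m+\frac{\gamma ^{\mu }\gamma _{5}\partial _{\mu }a}{m}\right) \psi . \end{aligned}$$
(37)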

Obviously, this interaction is invariant under constant shifts of the Goldstone field. This is the standard form for most axion analyses, but one should emphasize that the Goldstone field is actually parametrized non-linearly in this representation, since its simple shifts \(a\rightarrow a+m\theta \) must span the vacuum manifold as \(\theta \) varies between 0 and \( 2k\pi \). Another peculiarity of this representation is that the fermion field \( \psi \) ends up neutral under the original \(U(1)_{PQ}\) symmetry, even though it must initially be charged, as otherwise it would not occur in the Noether current of the \(U(1)_{PQ}\) symmetry.

This interpretation is made clearer by adopting a different parametrization for the fields, that in which the fermion field keeps its original charge. This is achieved via a chiral reparametrization of the fermion field

$$\begin{aligned} \psi \rightarrow \exp (ia\gamma _{5}/m)\psi . \end{aligned}$$
(38)

When plugged in \(\mathcal {L}_{D}\), the derivative coupling is replaced by a tower of pseudoscalar interactions

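$$\begin{aligned} \mathcal {L}_{E}=\bar{\psi }\left( i\gamma ^{\mu }\partial _{\mu }-m\exp \left( \frac{2i}{m}a\gamma _{5}\right) \right) \psi . \end{aligned}$$
(39)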

A trivial mass term \(m\bar{\psi }\psi \) necessarily breaks the axial symmetry \(U(1)_{PQ}\), so the fermion mass must arise through the symmetry breaking itself, like in the SM. Such a mass term is still invariant under the original chiral symmetry because the phase the fermion field acquires under \( U(1)_{PQ}\), \(\psi \rightarrow \exp (i\theta \gamma _{5})\psi \), is compensated by the shift of the Goldstone field \(a\rightarrow a+m\theta \).

Now, to leading order in a, this interaction produces a pseudoscalar coupling of the fermion to the axion:

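$$\begin{aligned} \mathcal {L}_{E}=\bar{\psi }\left( i\gamma ^{\mu }\partial _{\mu }-m-2ia\gamma _{5}\right) \psi +\mathcal {O}(a^{2}). \end{aligned}$$
(40)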

Truncating the theory in this way, one should remember that \(\mathcal {O} (a^{2})\) terms and above are neglected. This approximation is only valid for on-shell fermions, since by integration by parts, \(\bar{\psi }(\gamma ^{\mu }\gamma _{5}\partial _{\mu }a)\psi =-\bar{\psi }(2im\gamma _{5}a)\psi \) upon enforcing the free equation of motion. As mentioned in the Introduction, part of the historic controversy on the equivalence between the axial and pseudoscalar descriptions of nucleon-pion interactions has to do with this truncation. Nowadays, the equivalence between both representations is built into chiral effective theories. For axion models, it is not always fully embedded yet, as we will see. Further, additional care is needed because, the \(U(1)_{PQ}\) symmetry being anomalous, so is the chiral reparametrization Eq. (38). As analyzed in detail in Ref. [61] (see also Refs. [62, 63]), the two representations are then equivalent only up to the presence of specific anomalous contact interactions of the axion to gauge bosons. In the present section, these effects are not relevant and will not be discussed further, but we will come back to them when analyzing the passage from the quark to the nucleon level in Sect. 4.

Our goal is to construct and analyze the axion-fermion interactions in the non-relativistic limit. To treat both representations simultaneously, we adopt the trick proposed by Friar a long time ago and described in Ref. [45]. Specifically, let us start from \(\mathcal {L}_{D}\). The Euler-Lagrange equation gives \(i\partial _{t}\left| \psi \right\rangle = \mathcal {H}_{D}\left| \psi \right\rangle \) with

$$\begin{aligned} \mathcal {H}_{D}=\gamma ^{0}\left( \varvec{\gamma }\cdot \textbf{p}+m- \frac{\gamma ^{0}\gamma _{5}\dot{a}}{m}+\frac{\gamma _{5}\varvec{\gamma } \cdot \varvec{\nabla }a}{m}\right) . \end{aligned}$$
(41)

Then, we partially perform the fermion reparametrization Eq. (38), which is nothing but a unitary transformation \(\psi \rightarrow e^{iS(\mu )}\psi \) with

$$\begin{aligned} iS(\mu )=-\frac{i\mu }{m}a\gamma _{5}. \end{aligned}$$
(42)

Calculating \(\mathcal {H}(\mu )=e^{iS(\mu )}\left( \mathcal {H}_{D}-i\partial _{t}\right) e^{-iS(\mu )}\) with the help of \(\exp (i\alpha \gamma _{5})=\cos \alpha +i\gamma _{5}\sin \alpha \), we find

$$\begin{aligned} \mathcal {H}(\mu )= & {} \gamma ^{0}\left( \varvec{\gamma }\cdot \textbf{p}+ \frac{\mu -1}{m}\gamma ^{0}\gamma _{5}\dot{a}+\frac{1-\mu }{m}\gamma _{5} \varvec{\gamma }\cdot \varvec{\nabla }a\right. \nonumber \\{} & {} \left. +m\exp \left( \frac{2i\mu }{m} a\gamma ^{5}\right) \right) . \end{aligned}$$
(43)

So, this form allows one to interpolate between the exponential and derivative representations, with \(\mathcal {H}(0)=\mathcal {H}_{D}\) and \(\mathcal {H}(1)= \mathcal {H}_{E}\). Let us now perform the non-relativistic expansion of this expression, first as it stands, and then adding electromagnetic interactions.
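The two algebraic ingredients behind the mass term of Eq. (43) can be checked with explicit matrices. The sketch below assumes the Dirac representation for \(\gamma ^{0}\) and \(\gamma _{5}\) and arbitrary illustrative values for \(\mu \), \(a\), and \(m\). It verifies the closed form \(\exp (i\alpha \gamma _{5})=\cos \alpha +i\gamma _{5}\sin \alpha \) against a truncated power series, and then that rotating \(\gamma ^{0}\) with \(e^{iS(\mu )}\) of Eq. (42) produces \(\gamma ^{0}\exp (2i\mu a\gamma _{5}/m)\).

```python
import math

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# Dirac representation (an assumption of this sketch).
GAMMA0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
GAMMA5 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def chiral_phase(alpha):
    """Closed form exp(i alpha gamma5) = cos(alpha) + i gamma5 sin(alpha)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c * IDENTITY[i][j] + 1j * s * GAMMA5[i][j] for j in range(4)]
            for i in range(4)]

def exp_series(alpha, order=40):
    """Truncated power series of exp(i alpha gamma5)."""
    total = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for n in range(1, order):
        step = [[1j * alpha * g / n for g in row] for row in GAMMA5]
        term = matmul(term, step)  # (i alpha gamma5)^n / n!
        total = [[total[i][j] + term[i][j] for j in range(4)]
                 for i in range(4)]
    return total

def max_diff(a, b):
    """Largest entrywise deviation between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

series_error = max_diff(exp_series(0.9), chiral_phase(0.9))

mu, a, m = 0.7, 0.4, 1.3  # hypothetical values
theta = mu * a / m
# e^{iS} gamma0 e^{-iS} with iS = -i mu a gamma5 / m
rotated = matmul(matmul(chiral_phase(-theta), GAMMA0), chiral_phase(theta))
expected = matmul(GAMMA0, chiral_phase(2 * theta))
rotation_error = max_diff(rotated, expected)

print("series vs closed form:", series_error)
print("rotated mass term vs gamma0 exp(2 i mu a gamma5 / m):", rotation_error)
```

The second check only uses the fact that \(\gamma ^{0}\) anticommutes with \(\gamma _{5}\), so it holds in any representation.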

3.1 In the absence of EM fields

In the Hamiltonian Eq. (43), the terms \(\varvec{\gamma }\cdot \textbf{p}\) and \(\gamma _{5}\varvec{\gamma }\cdot \varvec{\nabla } a=i\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p},a]\) are diagonal since \( \gamma ^{5}\varvec{\gamma }=-\gamma ^{0}\varvec{\sigma }\), but the \( \gamma _{5}\) piece coming from the exponential and \(\gamma _{5}\dot{a}\) are not. Splitting the exponential using \(\exp (i\alpha \gamma _{5})=\cos \alpha +i\gamma _{5}\sin \alpha \), the elements to be used for the FW transformation are

$$\begin{aligned} \mathcal {O}= & {} \varvec{\gamma }\cdot \textbf{p}-\frac{1-\mu }{m} \gamma ^{0}\gamma ^{5}\dot{a}+i\gamma ^{5}S_{a},\nonumber \\ \mathcal {E}= & {} \frac{1}{m} \gamma ^{0}C_{a}+i\frac{1-\mu }{m}\gamma ^{0}\gamma ^{5}[\varvec{\gamma } \cdot \textbf{p},a], \end{aligned}$$
(44)

with \(S_{a}\equiv m\sin (2\mu a/m)=2\mu a+\cdots \) and \(C_{a}\equiv m^{2}(\cos (2\mu a/m)-1)=-2\mu ^{2}a^{2}+\cdots \). The calculation, though cumbersome, does not present any particular difficulty and we find

$$\begin{aligned} \mathcal {H}^{\textrm{NR}}(\mu )&=\gamma ^{0}\left( m+\frac{\textbf{p}^{2}}{2m}-\frac{\textbf{p}^{4}}{8m^{3}}+\frac{i}{2m}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p},S_{a}+2(1-\mu )a]\right) \nonumber \\&\quad +\frac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{S} _{a}+4(1-\mu )\dot{a}\}}{8m^{2}}+\frac{\gamma ^{0}\mathcal {H}_{3}}{8m^{3}} \nonumber \\&\quad +\gamma ^{0}\left( \frac{S_{a}^{2}+2C_{a}}{2m}-\frac{ 4S_{a}^{2}C_{a}+S_{a}^{4}}{8m^{3}}\right) +\mathcal {O}(1/m^{4})\ , \end{aligned}$$
(45)

with

$$\begin{aligned} \mathcal {H}_{3}&=4(1-\mu )^{2}\dot{a}^{2}+2(1-\mu )\dot{a}\dot{S}_{a}+\dot{S} _{a}^{2}-i(1-\mu )\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p},\ddot{a} ]\nonumber \\&\quad -2(1-\mu )\ddot{a}S_{a} +i(1-\mu )\gamma ^{5}\nonumber \\&\quad \times [\varvec{\gamma }\cdot \textbf{p},[ \varvec{\gamma }\cdot \textbf{p},[\varvec{\gamma }\cdot \textbf{p} ,a]]]-i\gamma ^{5}\{\textbf{p}^{2},[\varvec{\gamma }\cdot \textbf{p} ,S_{a}]\} \nonumber \\&\quad -[\varvec{\gamma }\cdot \textbf{p},S_{a}]^{2}+\{\varvec{ \gamma }\cdot \textbf{p},\{\varvec{\gamma }\cdot \textbf{p},C_{a}\}\}-\{ \textbf{p}^{2},S_{a}^{2}\}\nonumber \\&\quad +(1-\mu )\{S_{a},[\varvec{\gamma }\cdot \textbf{p} ,[\varvec{\gamma }\cdot \textbf{p},a]]\} \nonumber \\&\quad -2i\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p} ,S_{a}C_{a}]+i\gamma ^{5}[S_{a},\{\varvec{\gamma }\cdot \textbf{p} ,C_{a}\}]\nonumber \\&\quad -2i\gamma ^{5}S_{a}^{2}[\varvec{\gamma }\cdot \textbf{p},S_{a}]. \end{aligned}$$
(46)

The non-derivative sine and cosine terms are singled out in the last line of Eq. (45) because they can be dropped. The specific combination \( S_{a}^{2}+2C_{a}\) already gives a term of \(\mathcal {O}(a^{4})\) and, when combined with the \(\mathcal {O}(1/m^{3})\) terms, gives the totally negligible contribution

$$\begin{aligned}{} & {} \frac{S_{a}^{2}+2C_{a}}{2m}-\frac{4S_{a}^{2}C_{a}+S_{a}^{4}}{8m^{3}}\nonumber \\{} & {} \quad =-2m\sin ^{8}\left( \mu a/m\right) =-\frac{2\mu ^{8}}{m^{7}}a^{8}+\cdots . \end{aligned}$$
(47)

So, even though the polar representation initially involves non-derivative operators in \(a^{n}\), \(n>1\), none of them survive in the non-relativistic limit. This fact was not realized in Ref. [45], where only terms linear in the pseudoscalar field were kept.
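The cancellation in Eq. (47) is easy to confirm numerically from the definitions of \(S_{a}\) and \(C_{a}\) given below Eq. (44). The following sketch uses arbitrary illustrative values of \(a\), \(\mu \), and \(m\); the function names are hypothetical.

```python
import math

def S_a(a, mu, m):
    """S_a = m sin(2 mu a / m)."""
    return m * math.sin(2 * mu * a / m)

def C_a(a, mu, m):
    """C_a = m^2 (cos(2 mu a / m) - 1)."""
    return m ** 2 * (math.cos(2 * mu * a / m) - 1)

def nonderivative_terms(a, mu, m):
    """Left-hand side of Eq. (47)."""
    s, c = S_a(a, mu, m), C_a(a, mu, m)
    return (s ** 2 + 2 * c) / (2 * m) - (4 * s ** 2 * c + s ** 4) / (8 * m ** 3)

def closed_form(a, mu, m):
    """Right-hand side of Eq. (47): -2 m sin^8(mu a / m)."""
    return -2 * m * math.sin(mu * a / m) ** 8

def leading_term(a, mu, m):
    """Small-field limit: -2 mu^8 a^8 / m^7."""
    return -2 * mu ** 8 * a ** 8 / m ** 7

# Moderate field value: both sides of Eq. (47) agree.
identity_error = abs(nonderivative_terms(0.6, 0.8, 1.0)
                     - closed_form(0.6, 0.8, 1.0))

# Small field value: the closed form approaches its leading term.
small_field_ratio = closed_form(0.05, 0.8, 1.0) / leading_term(0.05, 0.8, 1.0)

print("identity error:", identity_error)
print("closed form / leading term at small a:", small_field_ratio)
```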

At this stage, we recover the expressions in Eqs. (3) and (6) by setting \(\mu =0\) and \(\mu =1\), respectively. As stated there, the axion wind term is independent of the parametrization, and actually

$$\begin{aligned}{} & {} \frac{i}{2m}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p},S_{a}+2(1-\mu )a]=\frac{1}{m}\gamma ^{5}\varvec{\gamma }\cdot \varvec{\nabla }a\nonumber \\{} & {} \qquad -2\mu ^{3}\frac{ a^{2}\varvec{\nabla }a}{m^{3}}+\mathcal {O}(1/m^{5}). \end{aligned}$$
(48)

On the other hand, the time-derivative term is not [45]

$$\begin{aligned} \frac{1}{8m^{2}}\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{S} _{a}+4(1-\mu )\dot{a}\}=\frac{2-\mu }{4m^{2}}\gamma ^{5}\{\varvec{\gamma } \cdot \textbf{p},\dot{a}\}+\mathcal {O}(1/m^{4}). \end{aligned}$$
(49)

This coupling even disappears for the specific choice \(\mu =2\). Since there are no other \(\mathcal {O}(1/m^{2})\) terms, for this to make sense, this operator must not embody any real physical effects.
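The \(\mu \) dependence of the coefficient in Eq. (49) can be made explicit with a short sketch. It assumes \(\dot{S}_{a}=2\mu \cos (2\mu a/m)\dot{a}\), as follows from the definition of \(S_{a}\), and treats the cosine as a plain number (operator-ordering corrections only enter at higher order). The function name is illustrative.

```python
import math

def time_derivative_coefficient(mu, a_over_m=0.0):
    """Coefficient c of c * gamma5 {gamma.p, adot} / m^2 in Eq. (49).

    Obtained from (dS_a/dt + 4 (1 - mu) adot) / 8 with
    dS_a/dt = 2 mu cos(2 mu a / m) adot.
    """
    return (2.0 * mu * math.cos(2.0 * mu * a_over_m) + 4.0 * (1.0 - mu)) / 8.0

# Leading-order values for the derivative (mu = 0) and exponential (mu = 1)
# representations, and for the special choice mu = 2.
coefficients = {mu: time_derivative_coefficient(mu) for mu in (0.0, 1.0, 2.0)}
for mu, c in coefficients.items():
    print(f"mu = {mu}: coefficient = {c}, (2 - mu)/4 = {(2.0 - mu) / 4.0}")
```

At \(a/m\rightarrow 0\) the coefficient reduces to \((2-\mu )/4\), so it changes from 1/2 to 1/4 between the two standard representations and vanishes at \(\mu =2\).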

3.1.1 Schiff’s transformations

Let us first concentrate on the \(\mathcal {O}(1/m^{2})\) terms. In analogy with the transformation done in Sect. 2.3 to eliminate the EDM operator, we can perform the unitary transform \(\psi \rightarrow e^{iS_{1}}\psi \) with [40, 42]

$$\begin{aligned} iS_{1}=\frac{i}{8m^{2}}\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p} ,S_{a}+4(1-\mu )a\}. \end{aligned}$$
(50)

This transformation is unitary and commutes with the mass term, \([\exp (\pm iS_{1}),\gamma _{0}m]=0\). This means that, with \(iS_{1}\sim \mathcal {O} (m^{-2})\) and \(\mathcal {H}^{\textrm{NR}}(\mu )-\gamma _{0}m\sim \mathcal {O} (m^{-1})\), only the first term of the expansion needs to be kept

$$\begin{aligned} \mathcal {H}^{\textrm{NR}\prime }(\mu )= & {} e^{iS_{1}}\left( \mathcal {H}^{\textrm{ NR}}(\mu )-i\partial _{t}\right) e^{-iS_{1}}=\mathcal {H}^{\textrm{NR}}(\mu )\nonumber \\{} & {} +[iS_{1},\mathcal {H}^{\textrm{NR}}(\mu )]-\dot{S}_{1}+\mathcal {O}(m^{-4}). \end{aligned}$$
(51)

Explicitly, plugging in the expression of \(S_{1}\),

$$\begin{aligned}&[iS_{1},\mathcal {H}^{\textrm{NR}}(\mu )]-\dot{S}_{1} =-\frac{1}{ 8m^{2}}\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{S}_{a}+4(1-\mu )\dot{a}\} \nonumber \\&\quad +\frac{1}{16m^{3}}\gamma ^{0}[\{\varvec{\gamma }\cdot \textbf{p },S_{a}+4(1-\mu )a\},[\varvec{\gamma }\cdot \textbf{p},S_{a}+2(1-\mu )a]]\nonumber \\&\quad +\frac{i}{16m^{3}}\gamma ^{0}\gamma ^{5}[\{\varvec{\gamma } \cdot \textbf{p},S_{a}+4(1-\mu )a\},\textbf{p}^{2}]+\mathcal {O}(m^{-4}). \end{aligned}$$
(52)

The \(\dot{S}_{1}\) term cancels precisely the \(\mathcal {O}(1/m^{2})\) terms, by construction. That is Schiff’s trick in action. What it means is that this operator is actually a higher order effect, now embodied in the \( \mathcal {O}(m^{-3})\) operators. In other words, we have succeeded in replacing the \(\mathcal {O}(1/m^{2})\) terms involving time derivatives by \( \mathcal {O}(1/m^{3})\) terms involving only space derivatives, that is, axion wind operators.

At this stage, it is clear that Schiff’s trick can be used to remove or simplify the terms in \(\mathcal {H}_{3}\) involving time derivatives (see Footnote 2). Specifically, we can perform

$$\begin{aligned} iS_{2}=\frac{1}{8m^{3}}(1-\mu )\gamma ^{0}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p},\dot{a}]\ , \end{aligned}$$
(53)

to remove the term \(\gamma ^{0}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p}, \ddot{a}]\), up to some \(\mathcal {O}(1/m^{4})\) contributions. The final transformation we consider presents us with an alternative. Let us now rotate \(\mathcal {H}^{\textrm{NR}}(\mu )\) with

$$\begin{aligned} iS_{3}=-\frac{i}{4m^{3}}(1-\mu )\gamma ^{0}\dot{a}S_{a}. \end{aligned}$$
(54)

Since \([iS_{3},\mathcal {H}^{\textrm{NR}}(\mu )]\) is of \(\mathcal {O}(1/m^{4})\), the transformed Hamiltonian is just \(\mathcal {H}^{\textrm{NR}\prime }(\mu )= \mathcal {H}^{\textrm{NR}}(\mu )-\dot{S}_{3}\). This kills the \(\ddot{a}S_{a}\) coupling and corrects the \(\dot{a}\dot{S}_{a}\) term in precisely the right way to make it \(\mu \) independent:

$$\begin{aligned}{} & {} 4(1-\mu )^{2}\dot{a}^{2}+2(1-\mu )\dot{a}\dot{S}_{a}+\dot{S}_{a}^{2}-2(1-\mu ) \ddot{a}S_{a}\overset{-\dot{S}_{3}}{\rightarrow }\nonumber \\{} & {} \qquad (2\dot{a}(1-\mu )+\dot{S} _{a})^{2}=4\dot{a}^{2}+\mathcal {O}(1/m^{2}). \end{aligned}$$
(55)

Now, we could have done the opposite, that is, make the \(\ddot{a}S_{a}\) coupling \(\mu \) independent by removing entirely the \(\dot{a}^{2}\) coupling. This time, the Schiff transformation is not removing an operator, but telling us that two of them are redundant, up to higher order corrections.
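The bookkeeping of Eq. (55) can be illustrated numerically. The sketch below assumes a classical background \(a(t)=a_{0}\cos (\omega t)\) with arbitrary illustrative parameters, and writes the contribution of \(-\dot{S}_{3}\) in units of \(\gamma ^{0}/(8m^{3})\), which from Eq. (54) is \(2(1-\mu )\,d(\dot{a}S_{a})/dt\). All names are hypothetical.

```python
import math

def fields(t, a0, omega, mu, m):
    """Return (adot, addot, S_a, dS_a/dt) for a(t) = a0 cos(omega t)."""
    a = a0 * math.cos(omega * t)
    adot = -a0 * omega * math.sin(omega * t)
    addot = -omega ** 2 * a
    s = m * math.sin(2 * mu * a / m)
    sdot = 2 * mu * adot * math.cos(2 * mu * a / m)
    return adot, addot, s, sdot

def h3_time_terms(t, a0, omega, mu, m):
    """Time-derivative terms of H_3 on the left of Eq. (55)."""
    adot, addot, s, sdot = fields(t, a0, omega, mu, m)
    return (4 * (1 - mu) ** 2 * adot ** 2 + 2 * (1 - mu) * adot * sdot
            + sdot ** 2 - 2 * (1 - mu) * addot * s)

def schiff_shift(t, a0, omega, mu, m):
    """Contribution of -dS_3/dt, in units of gamma0 / (8 m^3)."""
    adot, addot, s, sdot = fields(t, a0, omega, mu, m)
    return 2 * (1 - mu) * (addot * s + adot * sdot)

def completed_square(t, a0, omega, mu, m):
    """Right-hand side of Eq. (55) before expanding in 1/m."""
    adot, _, _, sdot = fields(t, a0, omega, mu, m)
    return (2 * adot * (1 - mu) + sdot) ** 2

params = dict(t=0.9, a0=0.3, omega=1.7, mu=0.6, m=2.0)
square_error = abs(h3_time_terms(**params) + schiff_shift(**params)
                   - completed_square(**params))

# For a0 << m the square tends to 4 adot^2 for any mu.
small_field_ratios = []
for mu in (0.0, 1.0, 2.0):
    adot = fields(0.9, 1e-5, 1.7, mu, 1.0)[0]
    ratio = completed_square(0.9, 1e-5, 1.7, mu, 1.0) / (4 * adot ** 2)
    small_field_ratios.append(ratio)

print("error in completing the square:", square_error)
print("square / (4 adot^2) for mu = 0, 1, 2:", small_field_ratios)
```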

3.1.2 Final Hamiltonian in the non-relativistic limit

All in all, after the unitary transformations \(S_{1}\) in Eq. (50), \(S_{2}\) in Eq. (53), and \(S_{3}\) in Eq. (54), and after expanding \(S_{a}\) and \(C_{a}\) and keeping only terms up to \( \mathcal {O}(1/m^{3})\), the Hamiltonian becomes

$$\begin{aligned} \mathcal {H}^{\textrm{NR}}(\mu )= & {} \gamma ^{0}\left( m+\frac{\textbf{p}^{2}}{2m}- \frac{\textbf{p}^{4}}{8m^{3}}+\frac{i}{m}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p},a]\right) \nonumber \\{} & {} +\frac{1}{8m^{3}}\gamma ^{0}\mathcal {H}_{3}+\mathcal {O} (m^{-4}), \end{aligned}$$
(56)

with

$$\begin{aligned} \mathcal {H}_{3}&=4\dot{a}^{2}+i(1-\mu )\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p},[\varvec{\gamma }\cdot \textbf{p},[\varvec{\gamma }\cdot \textbf{p},a]]]\nonumber \\&\quad -2\mu i\gamma ^{5}\{\textbf{p}^{2},[\varvec{\gamma }\cdot \textbf{p},a]\} \nonumber \\&\quad +(2-\mu )i\gamma ^{5}[\{\varvec{\gamma }\cdot \textbf{p},a\},\textbf{ p}^{2}]-4\mu ^{2}[\varvec{\gamma }\cdot \textbf{p},a]^{2}\nonumber \\&\quad -2\mu ^{2}\{\varvec{\gamma }\cdot \textbf{p},\{\varvec{\gamma }\cdot \textbf{p} ,a^{2}\}\} \nonumber \\&\quad -4\mu ^{2}\{\textbf{p}^{2},a^{2}\}+2\mu (1-\mu )\{a,[\varvec{\gamma }\cdot \textbf{p},[\varvec{\gamma }\cdot \textbf{p},a]]\} \nonumber \\&\quad +2(2-\mu )[\{\varvec{\gamma }\cdot \textbf{p},a\},[\varvec{ \gamma }\cdot \textbf{p},a]]\nonumber \\&\quad -4\mu ^{3}i\gamma ^{5}[a,\{\varvec{\gamma }\cdot \textbf{p},a^{2}\}] \nonumber \\&\quad +8i\mu ^{3}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p} ,a^{3}]-16i\mu ^{3}\gamma ^{5}a^{2}[\varvec{\gamma }\cdot \textbf{p},a]\nonumber \\&\quad -\frac{16}{3}i\mu ^{3}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p},a^{3}], \end{aligned}$$
(57)

where the last term with the 16/3 coefficient comes from the expansion of the \(\mathcal {O}(m^{-1})\) term involving \(\varvec{\nabla }S_{a}\), see Eq. (48). At this stage, algebraic manipulations of \(\mathcal {H}_{3}\) using commutator and anticommutator identities, e.g.,

$$\begin{aligned} \{a,[\varvec{\gamma }\cdot \textbf{p},[\varvec{\gamma }\cdot \textbf{p} ,a]]\}&=[\{a,\varvec{\gamma }\cdot \textbf{p}\},[\varvec{\gamma }\cdot \textbf{p},a]]\ , \end{aligned}$$
(58a)
$$\begin{aligned} \{\varvec{\gamma }\cdot \textbf{p},\{\varvec{\gamma }\cdot \textbf{p} ,a^{2}\}\}&=-[\varvec{\gamma }\cdot \textbf{p},[\varvec{\gamma }\cdot \textbf{p},a^{2}]]-2\{a^{2},\textbf{p}^{2}\}\ , \end{aligned}$$
(58b)
$$\begin{aligned}{}[\varvec{\gamma }\cdot \textbf{p},[\varvec{\gamma }\cdot \textbf{p },a^{2}]]&=2[\varvec{\gamma }\cdot \textbf{p},a]^{2}+\{a,[\varvec{ \gamma }\cdot \textbf{p},[\varvec{\gamma }\cdot \textbf{p},a]]\}\ , \end{aligned}$$
(58c)
$$\begin{aligned}{}[\varvec{\gamma }\cdot \textbf{p},[\varvec{\gamma }\cdot \textbf{p },[\varvec{\gamma }\cdot \textbf{p},a]]]&=-2[\varvec{\gamma }\cdot \textbf{p},\{a,\textbf{p}^{2}\}]-[\{\varvec{\gamma }\cdot \textbf{p},a\}, \textbf{p}^{2}]\ , \end{aligned}$$
(58d)

allow one to show that its \(\mathcal {O}(a^{3})\) terms cancel out completely, and its \(\mathcal {O}(a^{2})\) and \(\mathcal {O}(a)\) terms become independent of \(\mu \). The final Hamiltonian is very simple and contains only five non-trivial operators:

(59)

with the further information that \(\dot{a}^{2}\) can be freely traded for \( \ddot{a}a\). Remember that the axion coupling constant has to be put back by \( a\rightarrow ga\) with \(g=m/\Lambda \) and \(\Lambda \) the PQ breaking scale. Four comments are in order.

  • It is remarkable that all \(\mu \) dependences have cancelled out, and this involved highly non-trivial cancellations. In our opinion, it shows that the essential physical content is correctly identified, and redundancies kept at a minimum. Interestingly, this Hamiltonian cannot be obtained by setting \(\mu \) to some value in \(\mathcal {H}^{\textrm{NR}}(\mu )\) of Eq. (45). This is evident since \(S_{1}\), \(S_{2}\), and \(S_{3}\) do not all vanish for the same value of \(\mu \). Said differently, the sequence of Schiff transformations \(S_{1}\), \(S_{2}\), and \(S_{3}\) does not trivially undo the original Dyson rotation of Eq. (42). Note though that in practice, setting \(\mu =2\) in Eq. (45) already goes a long way since \(S_{1}\) has the most impact but vanishes for that value, at least for operators up to \(\mathcal {O}(m^{-3})\).

  • The \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) operator ends up completely screened, in a way analogous to Schiff’s EDM screening. What is different though is that we do not expect significant violations of this screening. First, finite-size effects were relevant for the EDM as the electric charge density is far from constant in atomic systems. By contrast, the axion background should be relatively homogeneous, even on macroscopic scales. Second, relativistic corrections were found significant for the EDM. But, as discussed in Sect. 2.3, the relativistic corrections to \(\varvec{\gamma }\cdot \textbf{E}\) were embodied in the very similar \(\{\textbf{P}^{2},\varvec{\gamma }\cdot \textbf{E}\}\) operator. Here, the relativistic corrections replacing \(\gamma ^{5}\{ \varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) are totally different in nature: they all involve the axion wind and even vanish if \(\varvec{\nabla } a=0\) (note that \(\left[ \textbf{p}^{2},\{\varvec{\gamma }\cdot \textbf{p},a\}\right] =\{\varvec{\gamma }\cdot \textbf{p},[\textbf{p}^{2},a]\}\)). In that \(\varvec{\nabla }a=0\) scenario, the relativistic corrections replacing \( \gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) would at best arise at \(\mathcal {O}(m^{-4})\). For these reasons, we expect the screening of \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) to be particularly effective.

  • The leading fermionic coupling in a \(\varvec{\nabla }a=0\) scenario is \( \dot{a}^{2}/(2m^{3})\), which is not a relativistic correction to \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) but a genuine independent coupling. In this case though, being quadratic in the axion field, it is presumably totally negligible, and better windows could exist. In particular, in most scenarios, the axion also couples to photons. Classically, the \(aF_{\mu \nu }\tilde{F}^{\mu \nu }\) coupling can generate a \(\dot{a}\textbf{B}\) term that acts as a current density [64].

  • On a technical note, let us stress that it is crucial to use the full exponential parametrization to correctly identify the final operators. Had we truncated the polar representation to its leading term by setting \( S_{a}=2\mu a\) and \(C_{a}=0\), not only would there still be \(\mathcal {O} (a^{3})\) operators in the final Hamiltonian, but the \(\mu \) dependence would not have cancelled completely [45]. This explains why historically, the \(\mu \) dependence was interpreted as an ambiguity. Now, we see that requiring reparametrization invariance actually points to a preferred basis of operators for \(\mathcal {H}^{\textrm{NR}}\).
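The identities (58b), (58c), and (58d) used to simplify \(\mathcal {H}_{3}\) are purely algebraic once \(\textbf{p}^{2}\) is identified with \(-(\varvec{\gamma }\cdot \textbf{p})^{2}\), so they can be spot-checked with random matrices. In the sketch below, the matrices `A` and `X` are hypothetical stand-ins for \(\varvec{\gamma }\cdot \textbf{p}\) and \(a\). Identity (58a) is left out because it additionally relies on \(a\) commuting with \([\varvec{\gamma }\cdot \textbf{p},a]\), which generic matrices do not satisfy.

```python
import random

N = 3  # matrix size (arbitrary)

def matmul(a, b):
    """Product of two N x N matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def combo(*terms):
    """Return sum_k c_k * M_k for (c_k, M_k) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(N)]
            for i in range(N)]

def comm(a, b):
    """Commutator [a, b]."""
    return combo((1, matmul(a, b)), (-1, matmul(b, a)))

def anti(a, b):
    """Anticommutator {a, b}."""
    return combo((1, matmul(a, b)), (1, matmul(b, a)))

def random_matrix(rng):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
             for _ in range(N)] for _ in range(N)]

def norm(a):
    """Largest absolute entry."""
    return max(abs(x) for row in a for x in row)

rng = random.Random(0)
A = random_matrix(rng)          # stands in for gamma.p
X = random_matrix(rng)          # stands in for the axion field a
X2 = matmul(X, X)               # a^2
P2 = combo((-1, matmul(A, A)))  # p^2 = -(gamma.p)^2

# Each residual is (left-hand side) - (right-hand side) of the identity.
residuals = {
    "58b": norm(combo((1, anti(A, anti(A, X2))),
                      (1, comm(A, comm(A, X2))),
                      (2, anti(X2, P2)))),
    "58c": norm(combo((1, comm(A, comm(A, X2))),
                      (-2, matmul(comm(A, X), comm(A, X))),
                      (-1, anti(X, comm(A, comm(A, X)))))),
    "58d": norm(combo((1, comm(A, comm(A, comm(A, X)))),
                      (2, comm(A, anti(X, P2))),
                      (1, comm(anti(A, X), P2)))),
}
print(residuals)
```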

The fact that the axioelectric operator is screened can be demonstrated in an alternative way, shedding a different light on the mechanism at play behind the Schiff transformation. Let us assume for now that \(i[\varvec{ \gamma }\cdot \textbf{p},a]=\varvec{\gamma }\cdot \varvec{\nabla }a=0\), and define

$$\begin{aligned} \mathcal {H}_{0}^{\textrm{NR}}=\gamma ^{0}\left( m+\frac{\textbf{p}^{2}}{2m} \right) \ ,\ \ \ \mathcal {V}(t)=\frac{2-\mu }{4m^{2}}\gamma ^{5}\{ \varvec{\gamma }\cdot \textbf{p},\dot{a}\}. \nonumber \\ \end{aligned}$$
(60)

In the interaction picture, \(\left| \psi ^{I}(t)\right\rangle =\exp (i \mathcal {H}_{0}^{\textrm{NR}}t)\left| \psi (t)\right\rangle \), the time-evolution of \(\left| \psi ^{I}(t)\right\rangle \) can be encoded into the evolution operator

$$\begin{aligned} U(t,t_{0})=T\exp \left[ -i\int _{t_{0}}^{t}dt^{\prime }\mathcal {V} ^{I}(t^{\prime })\right] \ , \end{aligned}$$
(61)

with T the time-ordered product, and such that \(\left| \psi ^{I}(t)\right\rangle =U(t,t_{0})\left| \psi ^{I}(t_{0})\right\rangle \). The interaction picture perturbation is the same as the Schrödinger one at leading order

$$\begin{aligned} \mathcal {V}^{I}(t)=\exp (i\mathcal {H}_{0}^{\textrm{NR}}t)\mathcal {V}(t)\exp (-i\mathcal {H}_{0}^{\textrm{NR}}t)=\mathcal {V}(t)+\mathcal {O}(m^{-3})\ , \end{aligned}$$
(62)

since \([\gamma ^{0}m,\mathcal {V}(t)]=0\). Now, we see that whenever \(\mathcal { V}(t)=\partial _{t}\mathcal {X}(t)\), the evolution operator collapses to the universal \(U(t,t_{0})=\exp (-i(\mathcal {X}(t)-\mathcal {X}(t_{0})))\). If the perturbation disappears at early and late times, this constant phase drops out (see Footnote 3). Thus, perturbations that are total time derivatives do not change energy levels (see Footnote 4). Note also that the exponent of \(U(t,t_{0})\) involves precisely the generator of the Schiff transformation done in Eq. (50) to get rid of the perturbation in the first place. This transformation corresponds to \(\left| \psi ^{I}(t)\right\rangle \rightarrow \left| \psi ^{\prime I}(t)\right\rangle =\exp (i\mathcal {X} (t))\left| \psi ^{I}(t)\right\rangle \), with then \(i\partial _{t}\left| \psi ^{\prime I}(t)\right\rangle =0\) since \(\mathcal {V} ^{\prime I}(t)=\mathcal {V}(t)-\partial _{t}\mathcal {X}(t)=0\). In this picture (in the quantum mechanical sense), since \(\mathcal {X}(t)\) commutes with \(\mathcal {H}_{0}^{\textrm{NR}}\) up to terms of \(\mathcal {O}(m^{-3})\), \( \left| \psi ^{\prime I}(t)\right\rangle \) stays fixed to some linear combination of eigenstates of the free Hamiltonian \(\mathcal {H}_{0}^{\textrm{ NR}}\). In this sense, performing the Schiff transformation to get rid of \( \mathcal {V}(t)\) produces a non-relativistic Hamiltonian that better reflects the physics of the system. That is the same idea as the original Schiff transformation for EDMs: \(\mathcal {H}_{EM}^{\textrm{NR}}\) in Eq. (23) better reflects the energy levels of the system than that in Eq. (18).
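The statement that a total-time-derivative perturbation leaves no trace once it is switched off can be illustrated with a toy two-level system. The sketch below is only a schematic analogue, under stated assumptions: the operator structure \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\cdot \}\) is replaced by a single fixed hermitian matrix \(\sigma _{x}\), the perturbation is \(\mathcal {V}(t)=\dot{x}(t)\sigma _{x}\) with a hypothetical pulse \(x(t)\) that vanishes at early and late times, and the evolution is integrated with a standard fourth-order Runge-Kutta scheme.

```python
import math

EPS, OMEGA, TAU = 0.8, 3.0, 1.0  # hypothetical pulse parameters

def x(t):
    """Pulse X(t): an oscillation under a Gaussian envelope."""
    return EPS * math.exp(-(t / TAU) ** 2) * math.cos(OMEGA * t)

def x_dot(t):
    """Analytic time derivative of x(t)."""
    envelope = EPS * math.exp(-(t / TAU) ** 2)
    return envelope * (-2 * t / TAU ** 2 * math.cos(OMEGA * t)
                       - OMEGA * math.sin(OMEGA * t))

def rhs(t, psi):
    """d psi/dt = -i V(t) psi with V(t) = x_dot(t) * sigma_x."""
    v = x_dot(t)
    return (-1j * v * psi[1], -1j * v * psi[0])

def evolve(psi, t_start, t_end, steps):
    """Integrate the two-level Schrodinger equation with RK4."""
    dt = (t_end - t_start) / steps
    t = t_start
    for _ in range(steps):
        k1 = rhs(t, psi)
        k2 = rhs(t + dt / 2, tuple(p + dt / 2 * k for p, k in zip(psi, k1)))
        k3 = rhs(t + dt / 2, tuple(p + dt / 2 * k for p, k in zip(psi, k2)))
        k4 = rhs(t + dt, tuple(p + dt * k for p, k in zip(psi, k3)))
        psi = tuple(p + dt / 6 * (a + 2 * b + 2 * c + d)
                    for p, a, b, c, d in zip(psi, k1, k2, k3, k4))
        t += dt
    return psi

psi_start = (1 + 0j, 0j)
psi_mid = evolve(psi_start, -6.0, 0.0, 6000)  # pulse at its maximum
psi_end = evolve(psi_mid, 0.0, 6.0, 6000)     # pulse switched off again

# While the pulse is on, the state is rotated by the angle x(0) about sigma_x.
expected_mid = (math.cos(x(0.0)), -1j * math.sin(x(0.0)))

print("state at t = 0:", psi_mid, "expected:", expected_mid)
print("state at late times:", psi_end)
```

While the pulse is on, the state is rotated by an angle set by \(x(t)\) alone; once \(x(t)\) returns to zero, the state is back where it started, whatever the shape of the pulse in between.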

3.2 For charged fermions in an external EM field

The situation described in the previous section changes in a crucial way in the presence of minimally coupled electromagnetic fields. To show this, let us repeat all the steps of the previous section, but starting from

$$\begin{aligned} \mathcal {H}(\mu )= & {} \gamma ^{0}\left( \varvec{\gamma }\cdot \textbf{P}+m- \frac{1-\mu }{m}\gamma ^{0}\gamma _{5}\dot{a}+i\frac{1-\mu }{m}\gamma _{5}[ \varvec{\gamma }\cdot \textbf{P},a]\right. \nonumber \\{} & {} \left. +\left( \exp \left( 2i\frac{\mu }{m} a\gamma ^{5}\right) -1\right) m\right) +e\phi . \end{aligned}$$
(63)

Note that \([\varvec{\gamma }\cdot \textbf{P},a]=[\varvec{\gamma } \cdot \textbf{p},a]=-i\varvec{\gamma }\cdot \varvec{\nabla }a\) since a is electrically neutral. This Hamiltonian can be block-diagonalized by plugging

$$\begin{aligned} \mathcal {O}&=\varvec{\gamma }\cdot \textbf{P}-\frac{1-\mu }{m}\gamma ^{0}\gamma ^{5}\dot{a}+i\gamma ^{5}S_{a}\ ,\ \end{aligned}$$
(64)
$$\begin{aligned} \mathcal {E}&=\gamma ^{0}\frac{1}{m}C_{a}+i\frac{1-\mu }{m}\gamma ^{0}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},a]+e\phi \ , \end{aligned}$$
(65)

in Eq. (15). This produces

$$\begin{aligned} \mathcal {H}^{\textrm{NR}}(\mu )&=\mathcal {H}_{EM}^{\textrm{NR}}+\frac{ i\gamma ^{0}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},S_{a}+2(1-\mu )a]}{2m} \nonumber \\&\quad +\frac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{S} _{a}+4(1-\mu )\dot{a}\}}{8m^{2}}-\frac{eS_{a}\gamma ^{5}\varvec{\gamma } \cdot \textbf{E}}{4m^{2}}\nonumber \\&\quad +\frac{1}{8m^{3}}\gamma ^{0}\mathcal {H}_{3}\ , \end{aligned}$$
(66)

where \(\mathcal {H}_{EM}^{\textrm{NR}}\) is the electromagnetic Hamiltonian, Eq. (16), and \(\mathcal {H}_{3}\) is obtained from the neutral one in Eq. (46) by replacing \(\varvec{\gamma }\cdot \textbf{p} \rightarrow \varvec{\gamma }\cdot \textbf{P}\) and \(\textbf{p} ^{2}\rightarrow \textbf{P}^{2}+e\gamma ^{0}\gamma ^{5}\varvec{\gamma } \cdot \textbf{B}\) (which is nothing but \((\varvec{\gamma }\cdot \textbf{p })^{2}\rightarrow (\varvec{\gamma }\cdot \textbf{P})^{2}\)). Compared to the neutral case, the only unexpected new addition is the EDM coupling \( S_{a}\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}=2\mu a\gamma ^{5} \varvec{\gamma }\cdot \textbf{E}+\cdots \). Because it does not arise starting from the axion derivative interaction, it does not appear in the literature (though it is present in Ref. [45]).

As in the free case, to get a better handle on the physical couplings, let us perform the sequence of Schiff transformations:

$$\begin{aligned} iS_{1}&=\frac{i}{8m^{2}}\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P} ,S_{a}+4(1-\mu )a\}\ , \end{aligned}$$
(67a)
$$\begin{aligned} iS_{2}&=\frac{1}{8m^{3}}(1-\mu )\gamma ^{0}\gamma ^{5}[\varvec{\gamma } \cdot \textbf{P},\dot{a}]\ ,\ \end{aligned}$$
(67b)
$$\begin{aligned} iS_{3}&=-\frac{i}{4m^{3}}(1-\mu )\gamma ^{0}\dot{a}S_{a}. \end{aligned}$$
(67c)

After this, the Hamiltonian becomes

$$\begin{aligned} \mathcal {H}^{\textrm{NR}}(\mu )= & {} \mathcal {H}_{EM}^{\textrm{NR}}+\frac{i\gamma ^{0}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},S_{a}+2(1-\mu )a]}{2m}\nonumber \\{} & {} - \frac{e\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{E},S_{a}+2(1-\mu )a\}}{4m^{2}}+\frac{1}{8m^{3}}\gamma ^{0}\mathcal {H}_{3}, \end{aligned}$$
(68)

with

$$\begin{aligned} \mathcal {H}_{3}&=(2\dot{a}(1-\mu )+\dot{S}_{a})^{2}+i(1-\mu )\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},[\varvec{\gamma }\cdot \textbf{P},[\varvec{\gamma }\cdot \textbf{P},a]]] \nonumber \\&\quad +\{\varvec{\gamma }\cdot \textbf{P},\{\varvec{\gamma }\cdot \textbf{P},C_{a}\}\}-[\varvec{\gamma }\cdot \textbf{P},S_{a}]^{2} \nonumber \\&\quad -i\gamma ^{5}\{\textbf{P}^{2}+e\gamma ^{0}\gamma ^{5}\varvec{\gamma }\cdot \textbf{B},[\varvec{\gamma }\cdot \textbf{P},S_{a}]\}-\{\textbf{P}^{2}+e\gamma ^{0}\gamma ^{5}\varvec{\gamma }\cdot \textbf{B},S_{a}^{2}\} \nonumber \\&\quad +(1-\mu )\{S_{a},[\varvec{\gamma }\cdot \textbf{P},[\varvec{\gamma }\cdot \textbf{P},a]]\}-2i\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},S_{a}C_{a}]+i\gamma ^{5}[S_{a},\{\varvec{\gamma }\cdot \textbf{P},C_{a}\}] \nonumber \\&\quad -2iS_{a}^{2}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},S_{a}]+\frac{i}{2}\gamma ^{5}[\{\varvec{\gamma }\cdot \textbf{P},S_{a}+4(1-\mu )a\},\textbf{P}^{2}+e\gamma ^{0}\gamma ^{5}\varvec{\gamma }\cdot \textbf{B}] \nonumber \\&\quad +\frac{1}{2}[\{\varvec{\gamma }\cdot \textbf{P},S_{a}+4(1-\mu )a\},[\varvec{\gamma }\cdot \textbf{P},S_{a}+2(1-\mu )a]]. \end{aligned}$$
(69)

Let us now expand \(S_{a}\) and \(C_{a}\) and keep only terms up to \(\mathcal {O}(m^{-3})\). This calculation is simpler than it seems because most of the algebra done in the neutral case relied on the use of commutator and anticommutator identities, see Eq. (58), which remain essentially valid. One only has to pay attention to the extra \(\varvec{\gamma }\cdot \textbf{B}\) terms coming from \((\varvec{\gamma }\cdot \textbf{p})^{2}\rightarrow (\varvec{\gamma }\cdot \textbf{P})^{2}=-\textbf{P}^{2}-e\gamma ^{0}\gamma ^{5}\varvec{\gamma }\cdot \textbf{B}\), which implies for example \([\textbf{P}^{2},\varvec{\gamma }\cdot \textbf{P}]=e\gamma ^{0}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{B}]\). This follows from \([\varvec{\gamma }\cdot \textbf{P},\{\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{P}\}]=-2[\varvec{\gamma }\cdot \textbf{P},\textbf{P}^{2}]-2e\gamma ^{0}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{B}]\) together with \([\varvec{\gamma }\cdot \textbf{P},\{\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{P}\}]=0\), the latter being a consequence of \([A,\{B,C\}]=\{C,[A,B]\}-\{B,[C,A]\}\). Putting everything together, the \(\mu \) dependence again cancels out precisely, \(\mathcal {H}^{\textrm{NR}}(\mu )=\mathcal {H}^{\textrm{NR}}\), and \(\mathcal {H}_{3}\) greatly simplifies to only a few operators:

$$\begin{aligned} \mathcal {H}^{\textrm{NR}}&=\mathcal {H}_{EM}^{\textrm{NR}}+\frac{i\gamma ^{0}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},a]}{m}-\frac{ea\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}}{m^{2}} \nonumber \\&\quad +i\gamma ^{0}\gamma ^{5}\frac{2\{\{\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{P}\},[\varvec{\gamma }\cdot \textbf{P},a]\}+[\{\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma } \cdot \textbf{P}\},\{\varvec{\gamma }\cdot \textbf{P},a\}]}{16m^{3}}\nonumber \\&\quad +\frac{\gamma ^{0}a[\varvec{\gamma }\cdot \textbf{P},[ \varvec{\gamma }\cdot \textbf{P},a]]}{m^{3}}+\frac{\gamma ^{0}\dot{a}^{2} }{2m^{3}}+\mathcal {O}(m^{-4}). \end{aligned}$$
(70)

where \(\{\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{P}\}=-2\textbf{P}^{2}-2e\gamma ^{0}\gamma ^{5}\varvec{\gamma } \cdot \textbf{B}\). Apart from the new EDM coupling, this expression is identical to the neutral case, but for \(\varvec{\gamma }\cdot \textbf{p} \rightarrow \varvec{\gamma }\cdot \textbf{P}\).
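The operator identity \([A,\{B,C\}]=\{C,[A,B]\}-\{B,[C,A]\}\) used before Eq. (70) holds for any associative operators, so it can be spot-checked numerically. The following is only an illustrative sketch, not part of the original derivation: random complex matrices stand in for the operators, and the helper names `comm`, `anti` and `random_operator` are hypothetical.

```python
import numpy as np

def comm(x, y):
    """Commutator [x, y]."""
    return x @ y - y @ x

def anti(x, y):
    """Anticommutator {x, y}."""
    return x @ y + y @ x

def random_operator(rng, n=6):
    """Random complex n x n matrix standing in for a generic operator."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

rng = np.random.default_rng(0)
A, B, C = (random_operator(rng) for _ in range(3))

# [A,{B,C}] = {C,[A,B]} - {B,[C,A]}
lhs = comm(A, anti(B, C))
rhs = anti(C, comm(A, B)) - anti(B, comm(C, A))
identity_holds = np.allclose(lhs, rhs)

# Setting A = B = C gives [A,{A,A}] = 0, the step applied to gamma.P in the text.
corollary_holds = np.allclose(comm(A, anti(A, A)), 0)
```

Because the check uses generic matrices, it says nothing about the specific form of \(\{\varvec{\gamma }\cdot \textbf{P},\varvec{\gamma }\cdot \textbf{P}\}\); it only confirms the algebraic identity that the argument relies on.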

This is not our final form for the Hamiltonian. Because redundancies among operators are important, we think it is crucial to keep track of them when they involve operators of the same order. So, let us reintroduce two free parameters explicitly and perform a final unitary transformation

$$\begin{aligned} iS_{4}=-\frac{i\alpha }{2m^{2}}\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},a\}+\frac{i\beta }{2m^{3}}\gamma ^{0}a\dot{a}. \end{aligned}$$
(71)

Then, we obtain:

(72)
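As a consistency sketch, and not a reproduction of the full Eq. (72), the effect of the \(\alpha \) part of \(S_{4}\) at \(\mathcal {O}(m^{-2})\) can be worked out by hand. The sketch assumes the convention \(\mathcal {H}\rightarrow \mathcal {H}+[iS,\mathcal {H}]-\dot{S}\) visible in Eq. (88), together with \(\textbf{P}=\textbf{p}-e\textbf{A}\) and \(\textbf{E}=-\varvec{\nabla }\phi -\varvec{\dot{A}}\). The \(\mathcal {O}(m^{-3})\) terms, including those proportional to \(\beta \), are not attempted here. Since \(\gamma ^{5}\varvec{\gamma }\) commutes with \(\gamma ^{0}m\), only \(-\dot{S}_{4}\) and \([iS_{4},e\phi ]\) contribute at this order:

$$\begin{aligned} -\dot{S}_{4}+[iS_{4},e\phi ]&=\frac{\alpha }{2m^{2}}\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}+\frac{\alpha ea}{m^{2}}\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}+\mathcal {O}(m^{-3})\ , \nonumber \\ \mathcal {H}^{\textrm{NR}}(\alpha ,\beta )&=\mathcal {H}_{EM}^{\textrm{NR}}+\frac{i\gamma ^{0}\gamma ^{5}[\varvec{\gamma }\cdot \textbf{P},a]}{m}+\alpha \frac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}}{2m^{2}}-(1-\alpha )\frac{ea\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}}{m^{2}}+\mathcal {O}(m^{-3}). \end{aligned}$$

The second line adds the first to Eq. (70). Its \(\mathcal {O}(m^{-2})\) operators are the ones that enter Eqs. (73) and (91).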

Remember that the axion scale has to be put back in these operators by writing \(a\rightarrow ga\) with \(g=m/\Lambda \) and \(\Lambda \) the PQ breaking scale. To be more explicit, the leading operators read

$$\begin{aligned} \mathcal {H}^{\textrm{NR}}(\alpha )= & {} \mathcal {H}_{EM}^{\textrm{NR}}-\dfrac{ \varvec{\sigma }\cdot \varvec{\nabla }a}{\Lambda }+\alpha \gamma ^{0} \dfrac{i\varvec{\sigma }\cdot \varvec{\nabla }\dot{a}-2\dot{a} \varvec{\sigma }\cdot \textbf{P}}{2m\Lambda }\nonumber \\{} & {} +(1-\alpha )\dfrac{ea}{ m\Lambda }\gamma ^{0}\varvec{\sigma }\cdot \textbf{E}+\mathcal {O} (m^{-3}), \end{aligned}$$
(73)

where we used \(\gamma ^{5}\varvec{\gamma }=-\gamma ^{0}\otimes \varvec{\sigma }\) to put the operator in the standard form.
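The rewriting \(\gamma ^{5}\varvec{\gamma }=-\gamma ^{0}\otimes \varvec{\sigma }\) can be checked with explicit matrices. The sketch below assumes the Dirac representation, which the text does not state explicitly, and interprets \(\gamma ^{0}\otimes \varvec{\sigma }\) as \(\gamma ^{0}\) times the block-diagonal spin matrices; all variable names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Dirac representation (an assumption about the basis used in the text).
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gamma5 = np.block([[Z2, I2], [I2, Z2]])
gammas = [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
Sigma = [np.block([[s, Z2], [Z2, s]]) for s in pauli]  # block-diagonal sigma

# gamma^5 gamma^i = -gamma^0 Sigma^i for i = 1, 2, 3
rewriting_ok = all(
    np.allclose(gamma5 @ g, -gamma0 @ S) for g, S in zip(gammas, Sigma)
)

# {gamma.v, gamma.v} = -2 v^2 for an ordinary commuting vector v
v = np.array([0.3, -1.1, 0.8])
slash_v = sum(vi * g for vi, g in zip(v, gammas))
square_ok = np.allclose(2 * slash_v @ slash_v, -2 * (v @ v) * np.eye(4))
```

The second check only covers a commuting vector; for the covariant momentum \(\textbf{P}\) the additional \(\varvec{\gamma }\cdot \textbf{B}\) term quoted after Eq. (70) appears because the components of \(\textbf{P}\) do not commute.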

The choice of \(\alpha \) and \(\beta \) is totally free in principle, but care is needed to interpret the phenomenological consequences. What matters in the end are matrix elements of these operators between external states, and it is only at that level that the \(\alpha \) and \(\beta \) parameters cancel out. Below, we give a first set of keys for interpreting the equivalence embodied by the Hamiltonian of Eq. (72), assuming the axion field oscillates sufficiently rapidly. This assumption is important because otherwise, the pseudoscalar coupling becomes essentially an imaginary contribution to the fermion mass term, which can always be disposed of by a (fixed) chiral rotation of the fermionic states. The distinction between fermionic reparametrizations and changes of basis is delicate but phenomenologically crucial in some regions of parameter space, as will be discussed in detail in Sect. 4.

3.2.1 On the axioelectric–axionic EDM equivalence

At this stage, we have two operators at \(\mathcal {O}(1/m^{2})\) whose relative weight can be freely tuned, but whose overall impact must be identical. In other words, starting from \(\mathcal {H}^{\textrm{NR}}(\alpha ,\beta )\), \(\alpha \) and \(\beta \) must drop out of physical observables. Clearly, this means that the covariant axioelectric operator and the axionic EDM operator must be equivalent:

$$\begin{aligned} \dfrac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}}{2m^{2}} \Leftrightarrow -\dfrac{ea\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}}{ m^{2}}. \end{aligned}$$
(74)

At first sight, these operators appear to encode different physics. One depends on \(\dot{a}\) but not on \(\textbf{E}\), while the other depends on \( \textbf{E}\) but not on \(\dot{a}\). Yet, as we now discuss, there are several ways to interpret this equivalence, and to understand that at the level of observables, both operators always end up being strictly equivalent.

  • Double screening: The interplay between the \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}\) and \(a\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}\) operators should have been expected. We know from the previous section that in the absence of electromagnetic fields, \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}\rightarrow \gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) can be eliminated. Moreover, Schiff’s theorem tells us that if the axion field is constant, \(a\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}\) becomes a fixed EDM coupling that can be rotated away. So, we see that both a time-varying axion field and minimal couplings to the external electromagnetic fields are required to get a physical effect. Each form of the operator makes one of these screenings manifest, but only their equivalence embodies their true impact on the physics.

  • Duality in the Dirac equation: Before the Schiff transformation, the derivative and polar representations do not match and rather produce, from Eq. (66),

    (75)
    (76)

    Yet, upon the equivalence of Eq. (74), one can choose to put the whole \(\mathcal {O}(1/m^{2})\) part of the Hamiltonian into the form of an EDM coupling \(-ea\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}/m^{2}\), which then has a very straightforward interpretation. For a charged particle, it is well known that the Dirac equation predicts a magnetic moment \(g=2\) via the Zeeman term,

    $$\begin{aligned} m\bar{\psi }\psi \rightarrow \frac{e}{2m}\gamma ^{0}\varvec{\sigma }\cdot \textbf{B}. \end{aligned}$$
    (77)

    The axion coupling to fermions can be viewed as an oscillating pseudoscalar mass term, dual to the scalar mass term. As a result, the Dirac equation then predicts an electric moment,

    $$\begin{aligned} m(2a/m)\bar{\psi }i\gamma ^{5}\psi \rightarrow \frac{e}{2m}(2a/m)\gamma ^{0} \varvec{\sigma }\cdot \textbf{E}\ , \end{aligned}$$
    (78)

    since duality interchanges \(\textbf{B}\) and \(\textbf{E}\). In this sense, the prediction \(d=ea/m^{2}\) for the axionic EDM is the exact analogue of \(g=2\) for the magnetic moment. It represents an inescapable consequence of the Dirac equation whenever the charged fermion has a pseudoscalar coupling to the axion (see Footnote 5). The observability of this oscillating EDM is another question though, because one must fight the various screening effects, and will be discussed in Sect. 4.

  • Time-dependent perturbation theory: If we set \(\phi =0\) and write \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}=\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}-2e\dot{a}\gamma ^{5}\varvec{\gamma }\cdot \textbf{A}\), the \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) piece can be rotated away as in the neutral case. Basically, this contribution is kinematically suppressed, and encoded into axion wind operators of \(\mathcal {O}(1/m^{3})\). Then, at leading order, what the Schiff transformation Eq. (67a) is telling us is that a coupling \(\dot{a}\varvec{\sigma }\cdot \textbf{A}\) is equivalent (see Footnote 6) to a coupling \(a\varvec{\sigma }\cdot \varvec{\dot{A}}\), that is, \(a\varvec{\sigma }\cdot \textbf{E}\), exactly like the transformation Eq. (67c) is telling us that \(\dot{a}^{2}\) encodes the same physics as \(a\ddot{a}\). These pairs of operators must give the same result when acting on fermion wavefunctions within physical observables (see Table 1). This is clearly in accordance with the interaction picture evolution of Eq. (61), where a perturbation like \(a\varvec{\sigma }\cdot \textbf{E}\) or \(a\ddot{a}\) is to be integrated over time. This also shows how the original Schiff screening comes back if a(t) becomes constant, as the time-integral of \(a\varvec{\sigma }\cdot \textbf{E}=-\partial _{t}(a\varvec{\sigma }\cdot \textbf{A})\) then sums up to an unobservable constant rephasing of the fermion wavefunction (note, though, that boundary terms can encode important physical effects in this situation, as analyzed in Ref. [65]).

  • The axioelectric effects as EDM-induced: If one encodes the \(\mathcal {O}(1/m^{2})\) terms entirely into an EDM operator, the usual axioelectric effect is nevertheless still there. Because the two forms in Eq. (74) are equivalent, the same matrix elements as in Ref. [23] must be recovered. The equivalence in the \(\phi =0\) gauge was discussed above, so let us now concentrate instead on that with \(\textbf{A}=0\), so that \(\textbf{E}=-\varvec{\nabla }\phi \). In this case, the covariant axioelectric operator reduces to \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}=\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\). Though identical in form to the neutral fermion axioelectric operator, it cannot be rotated away when \(\phi \ne 0\) (there is an extra term in Eq. (52) from \([iS_{1},\phi ]\ne 0\) with \(S_{1}\) in Eq. (50)). As explicitly calculated in Ref. [23], the \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}\) operator can then induce transitions between energy levels for an electron bound in the potential \(\phi \). Now, starting instead from the EDM operator, we can write for an electrostatically bound electron (see Footnote 7), \(ea\varvec{\sigma }\cdot \textbf{E}=-ea\varvec{\sigma }\cdot \varvec{\nabla }\phi =-ia\varvec{\sigma }\cdot [\textbf{p},\mathcal {H}^{\textrm{NR}}]+\cdots \). With this, the transition matrix element of Ref. [23] is trivially recovered as \(\left\langle \psi _{f}\right| ea\varvec{\sigma }\cdot \textbf{E}\left| \psi _{i}\right\rangle =\partial _{t}a\left\langle \psi _{f}\right| \varvec{\sigma }\cdot \textbf{p}\left| \psi _{i}\right\rangle \) upon using \(i\partial _{t}\left| \psi _{i,f}\right\rangle =\mathcal {H}^{\textrm{NR}}\left| \psi _{i,f}\right\rangle \) and integrating by parts over time (see Table 1). Note that these mathematical steps essentially undo the Schiff transformation of Eq. (67a).
This shows that whatever the operator, the same matrix element for the axioelectric effect is obtained, as it should be since physics must not depend on the representation. Yet, we think that interpreting the axioelectric effects as a manifestation of an axion-induced EDM sheds new light on them, especially in view of the other points discussed previously.

Table 1 Schematic representation of the equivalence of Eq. (74) at the level of observables. In the top line, time-dependent perturbation theory is understood, while for the bottom line, the operators are understood to be sandwiched between bound fermionic states

3.3 For neutral fermions having an EDM interaction

The final application is the non-relativistic limit of the Hamiltonian for a neutral state coupled to the axion, but in the presence of both the magnetic and electric dipole operators. Those are not invariant under the PQ symmetry, so one has to decide how they should be introduced. We consider that they arise in the same way as the mass term, through the PQ symmetry breaking. The fermion field is neutral under the PQ symmetry only in the derivative representation, so those effective operators can be added to Eq. (37) as

(79)

Then, if we use again Friar’s trick, Eq. (42), the Lagrangian interpolating between derivative and polar representations is

$$\begin{aligned} \mathcal {L}_{D}= & {} \bar{\psi }\left( i\gamma ^{\mu }\partial _{\mu }-m\left( 1+ \frac{C_{a}}{m^{2}}\right) -i\gamma _{5}S_{a}+\frac{1-\mu }{m}\right. \nonumber \\{} & {} \left. \gamma ^{\mu }\gamma _{5}\partial _{\mu }a-\frac{\tilde{\delta }_{\mu }}{2}\sigma ^{\mu \nu }F_{\mu \nu }+i\frac{\tilde{d}}{2}\sigma ^{\mu \nu }\gamma ^{5}F_{\mu \nu }\right) \psi , \nonumber \\ \end{aligned}$$
(80)

where

$$\begin{aligned} \tilde{\delta }_{\mu }=\delta _{\mu }+\frac{S_{a}}{m}d+\frac{C_{a}}{m^{2}} \delta _{\mu }\ ,\ \ \tilde{d}=d-\frac{S_{a}}{m}\delta _{\mu }+d\frac{C_{a}}{ m^{2}}\ , \end{aligned}$$
(81)

and \(S_{a}=m\sin (2\mu a/m)\), \(C_{a}=m^{2}(\cos (2\mu a/m)-1)\) as before. The fact that the dipole operators end up proportional to the axion field in the exponential representation is similar to the situation in the SM, where they necessarily involve the Higgs boson field [68].

The Hamiltonian in the non-relativistic limit can be obtained by plugging the odd and even elements

$$\begin{aligned} \mathcal {O}&=\varvec{\gamma }\cdot \textbf{p}-\frac{1-\mu }{m}\gamma ^{0}\gamma ^{5}\dot{a}+i\gamma ^{5}S_{a}+i\gamma ^{0}\varvec{\gamma }\cdot ( \tilde{\delta }_{\mu }\textbf{E}+\tilde{d}\textbf{B})\ ,\ \end{aligned}$$
(82)
$$\begin{aligned} \mathcal {E}&=\gamma ^{0}\frac{1}{m}C_{a}+\frac{1-\mu }{m}i\gamma ^{0} \gamma ^{5}[\varvec{\gamma }\cdot \textbf{p},a]+\gamma ^{5}\varvec{ \gamma }\cdot (\tilde{\delta }_{\mu }\textbf{B}-\tilde{d}\textbf{E})\ , \end{aligned}$$
(83)

in Eq. (15). After some algebra, and noting that \(\tilde{\delta }_{\mu }=\delta _{\mu }+\mathcal {O}(1/m^{2})\) and \(\tilde{d}=d+\mathcal {O}(1/m^{2})\), we arrive at

$$\begin{aligned} \mathcal {H}^{\textrm{NR}}(\mu )&=\gamma ^{0}\left( m+\frac{\textbf{p}^{2}}{2m }-\frac{\textbf{p}^{4}}{8m^{3}}+\frac{i\gamma ^{5}[\varvec{\gamma }\cdot \textbf{p},S_{a}+2(1-\mu )a]}{2m}\right. \nonumber \\&\quad \left. +\frac{(\delta _{\mu }\textbf{E}+d\textbf{B})^{2} }{2m}+\frac{\{\varvec{\gamma }\cdot \textbf{p},\varvec{\gamma } \cdot (\delta _{\mu }\varvec{\dot{E}}+d\varvec{\dot{B}})\}}{8m^{2}}\right) \nonumber \\&\quad +\gamma ^{5}\varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E })\left( 1+\frac{S_{a}^{2}+2C_{a}}{2m^{2}}\right) \nonumber \\&\quad +\frac{i[\varvec{\gamma }\cdot \textbf{p},\varvec{\gamma }\cdot (\delta _{\mu }\textbf{E}+d\textbf{B})]}{2m} \nonumber \\&\quad -i\frac{\{S_{a},[\varvec{\gamma }\cdot \textbf{p},\varvec{ \gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E})]\}}{8m^{2}}\nonumber \\&\quad +\frac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{S}_{a}+4(1-\mu )\dot{a}\}}{8m^{2}} \nonumber \\&\quad +\frac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\{\varvec{ \gamma }\cdot \textbf{p},\varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E})\}\}}{8m^{2}}\nonumber \\&\quad +\frac{1}{8m^{3}}\gamma ^{0}\mathcal {H}_{3}\ , \end{aligned}$$
(84)

with the same \(\mathcal {H}_{3}\) as before, Eq. (46). We have used \( [A,\{B,C\}]=\{C,[A,B]\}-\{B,[C,A]\}\) to rewrite some operators,

$$\begin{aligned}{}[S_{a},\{\varvec{\gamma }\cdot \textbf{p},\varvec{\gamma } \cdot (\delta _{\mu }\textbf{B}-d\textbf{E})\}]&=\{\varvec{\gamma } \cdot (\delta _{\mu }\textbf{B}-d\textbf{E}),[S_{a},\varvec{\gamma }\cdot \textbf{p}]\}\ , \end{aligned}$$
(85)
$$\begin{aligned}{}[\varvec{\gamma }\cdot \textbf{p},\{S_{a},\varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E})\}]&=\{S_{a},[\varvec{\gamma }\cdot \textbf{p},\varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E})]\}\nonumber \\&\quad -\{\varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E}),[S_{a},\varvec{\gamma }\cdot \textbf{p}]\}. \end{aligned}$$
(86)

Notice that the same combination of \(S_{a}\) and \(C_{a}\) as in the last line of Eq. (45) has already been dropped. Similarly, \( S_{a}^{2}+2C_{a}=-4m^{2}\sin ^{4}(\mu a/m)\sim \mathcal {O}(1/m^{2})\) can be discarded. Again, we observe that the infinite towers of interactions in the polar representation, the \(\exp (2ia/m)\sigma ^{\mu \nu }F_{\mu \nu }\) and \(\exp (2ia/m)\sigma ^{\mu \nu }\tilde{F}_{\mu \nu }\) terms in Eq. (80) for \(\mu =1\), are automatically truncated when expanded in the non-relativistic limit. This fact would have been totally missed if we had truncated the series already in Eq. (80). There is another interesting aspect of this truncation. Setting \(\mu =1\) in Eq. (80), a direct coupling of the axion to the electric dipole operator \(a\bar{\psi }\sigma ^{\mu \nu }\gamma ^{5}\psi F_{\mu \nu }\) is present in the polar representation, but not in the derivative one, and with a coefficient proportional to the magnetic moment of \(\psi \). We now see that this coupling disappears in the non-relativistic limit, making both representations compatible. This information will play an important role in analyzing nucleon EDMs in Sect. 4.
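The closed form \(S_{a}^{2}+2C_{a}=-4m^{2}\sin ^{4}(\mu a/m)\) and its \(\mathcal {O}(1/m^{2})\) scaling can be checked numerically from the definitions of \(S_{a}\) and \(C_{a}\). The sketch below is only an illustration; the function names and the sample values of \(a\), \(m\) and \(\mu \) are arbitrary choices, not taken from the text.

```python
import math

def S_a(a, m, mu):
    """S_a = m sin(2 mu a / m)."""
    return m * math.sin(2 * mu * a / m)

def C_a(a, m, mu):
    """C_a = m^2 (cos(2 mu a / m) - 1)."""
    return m**2 * (math.cos(2 * mu * a / m) - 1)

def combination(a, m, mu):
    """S_a^2 + 2 C_a, evaluated directly from the definitions."""
    return S_a(a, m, mu) ** 2 + 2 * C_a(a, m, mu)

def closed_form(a, m, mu):
    """-4 m^2 sin^4(mu a / m), the closed form quoted in the text."""
    return -4 * m**2 * math.sin(mu * a / m) ** 4

def leading_term(a, m, mu):
    """Leading large-m behaviour, -4 (mu a)^4 / m^2."""
    return -4 * (mu * a) ** 4 / m**2

# Arbitrary sample points (a, m, mu) in units where a and m share dimensions.
samples = [(0.7, 1.3, 0.4), (1.0, 5.0, 1.0), (2.0, 3.0, 0.5)]
differences = [combination(*s) - closed_form(*s) for s in samples]
```

For large \(m\) at fixed \(a\), `closed_form` approaches `leading_term`, which makes the \(1/m^{2}\) suppression explicit.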

At this point, we start to perform some Schiff transformations, specifically, \(S_{1}\) as given in Eq. (50), \(S_{2}\) in Eq. (53), \(S_{3}\) in Eq. (54), and finally,

$$\begin{aligned} iS_{4}=\frac{i}{8m^{2}}\gamma ^{0}\{\varvec{\gamma }\cdot \textbf{p}, \varvec{\gamma }\cdot (\delta _{\mu }\textbf{E}+d\textbf{B})\}\ , \end{aligned}$$
(87)

to remove the operator involving \(\delta _{\mu }\varvec{\dot{E}}+d \dot{\textbf{B}}\). The \(S_{2}\) and \(S_{3}\) transformations reorganize the terms in \( \ddot{a}\) occurring in \(\mathcal {H}_{3}\), exactly as in Eq. (55). For \(S_{1}\), an additional term appears (compare with Eq. (52))

$$\begin{aligned}&[iS_{1},\mathcal {H}^{\textrm{NR}}(\mu )]-\dot{S}_{1} \nonumber \\&\quad =-\frac{1}{8m^{2}}\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{S}_{a}+4(1-\mu )\dot{a}\} \nonumber \\&\qquad +\frac{i}{16m^{3}}\gamma ^{0}\gamma ^{5}[\{\varvec{\gamma }\cdot \textbf{p},S_{a}+4(1-\mu )a\},\textbf{p}^{2}] \nonumber \\&\qquad +\frac{1}{16m^{3}}\gamma ^{0}[\{\varvec{\gamma }\cdot \textbf{p},S_{a}+4(1-\mu )a\},[\varvec{\gamma }\cdot \textbf{p},S_{a}+2(1-\mu )a]] \nonumber \\&\qquad +\frac{i}{8m^{2}}[\varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E}),\{\varvec{\gamma }\cdot \textbf{p},S_{a}+4(1-\mu )a\}]+\mathcal {O}(m^{-4}). \end{aligned}$$
(88)

This new term combines with the \(\{S_{a},[\varvec{\gamma }\cdot \textbf{p },\varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E})]\}\) operator of Eq. (84) to make it \(\mu \) independent. The other terms combine with those in \(\mathcal {H}_{3}\) as in Sect. 3.1, and the final Hamiltonian no longer depends on \(\mu \) at all in the non-relativistic limit:

(89)

One can recognize in the first and fourth lines the terms of \(\mathcal {H} _{EM}^{\textrm{NR}}\) in the \(e\rightarrow 0\) limit, Eq. (34), the terms in the second and third as those of the neutral fermion, Eq. (59), so the only new feature is the operator in the last line. It encodes higher order effects induced by the Schiff transformation of the \( \dot{a}\) term, and can be worked out using

$$\begin{aligned} i[\varvec{\gamma }\cdot \textbf{p},\varvec{\gamma }\cdot (\delta _{\mu }\textbf{B}-d\textbf{E})]= & {} d\varvec{\nabla \cdot E}-i\delta _{\mu }\varvec{\sigma }\cdot \varvec{\nabla }\times \textbf{B}\nonumber \\{} & {} -2\varvec{\sigma }\cdot ((\delta _{\mu }\textbf{B}-d\textbf{E})\times \textbf{p}), \end{aligned}$$
(90)

where we have set \(\varvec{\nabla \cdot B}=0\) and \(\varvec{\nabla }\times \textbf{E}=0\). These couplings are rather similar to the Darwin and spin-orbit couplings, and for a neutral fermion, should not be directly accessible. Further, whenever d is already induced by the axionic background, these couplings represent a negligible second order effect. So, the important conclusion of this calculation is that for a neutral state, there is no coupling to the time-derivative of the axion background, but the leading non-axionic EDM term is physical.

4 Axionic EDM observability and estimates

Whether in their axioelectric or axionic EDM form, the \(\mathcal {O}(m^{-2})\) operators lead to oscillating EDMs for charged states, i.e., for charged leptons and quarks, and thereby presumably for nucleons. Naively, if the analogy with the magnetic moments is valid, the expected size of these oscillating EDMs should be larger than those due to the axion-gluon or axion-photon local anomalous interactions, \(aG_{\mu \nu }\tilde{G}^{\mu \nu }\) or \(aF_{\mu \nu }\tilde{F}^{\mu \nu }\), respectively. In some sense, the \(\mathcal {O}(m^{-2})\) operators are the equivalent of the Dirac prediction \(g=2\) (see Eqs. (77) and (78)), while the loop-level anomalous contributions play the role of the anomalous magnetic moment. At the same time, there are also reasons to think that this analogy does not hold, because the \(\mathcal {O}(m^{-2})\) operators and the anomalous contributions to a given particle EDM appear different in their scaling with the axion mass, in their connection with Schiff’s screening, and, in the case of the nucleons, in their hadronization. So, to confirm the naive expectation, it is necessary to delve into the details of how these operators translate into observables, from which oscillating EDMs could in principle be accessed. We will not do a systematic analysis of all the possible experimental settings, and to avoid complications arising from nuclear and atomic effects, we will focus on the simplest system consisting of a single particle precessing in external EM fields, first in the leptonic case, and then in the more intricate nucleon case. This treatment will prove sufficient to establish once more the equivalence between the axioelectric and axionic EDM operators, and to show and characterize the situations in which they do lead to much larger EDMs than expected on the basis of the anomalous axion couplings alone.

4.1 Leptons

Imagine a charged lepton, not bound in an atom, travelling in a region where \(\varvec{\nabla }a\) is negligible and only some electric fields are present. Then, its spin precession is dictated by the Hamiltonian in Eq. (72) as, to leading order in 1/m,

$$\begin{aligned} \varvec{\dot{S}}= & {} -i[\varvec{S},\mathcal {H}^{\textrm{NR}}(\alpha ,\beta )] \nonumber \\ {}= & {} -i\left[ \varvec{S},\alpha \dfrac{\gamma ^{5}\{\varvec{ \gamma }\cdot \textbf{P},\dot{a}\}}{2m^{2}}-(1-\alpha )\frac{ea\gamma ^{5} \varvec{\gamma }\cdot \textbf{E}}{m^{2}}\right] . \end{aligned}$$
(91)

In writing this equation, we implicitly use the fact that the spin operator is the same for all \(\alpha \) (but for the gauge variance due to \(\textbf{P}\), which cancels only when acting on the fermion wavefunction). This makes sense because, compared to the discussion in Sect. 2.3.2, the Schiff transformation of Eq. (67a) relates two equivalent forms of the axion coupling. It does not affect the mass term, which anchors the rest-frame in which \(\varvec{S}\) is defined. In other words, at the level of the Lagrangian, the original Schiff transformation of Eq. (20) is related to the purely chiral rotation \(\exp (i\alpha \gamma ^{5})\), as was made clear in Eqs. (77) and (78), while that for the axion coupling, Eq. (67a), is rather related to the Goldstone boson reparametrization \(\exp (i\alpha \gamma ^{5}a/m)\).

Reparametrization invariance. Let us first prove that observables do not depend on \(\alpha \). In the \(\phi =0\) gauge, the \(\varvec{\gamma }\cdot \textbf{p}\) term does not contribute and \(\textbf{E}=-\varvec{\dot{A}}\), so the equation for \( \varvec{S}\) reduces to

$$\begin{aligned} \varvec{\dot{S}}=\frac{2e}{m^{2}}\gamma ^{0}\varvec{S}\times (-\alpha \dot{a}\textbf{A}-(1-\alpha )a\textbf{E}). \end{aligned}$$
(92)

Since this equation is of the \(\dot{f}=f\times g\) type, for which a solution involves the time integral of g, the two operators give the same contribution and \(\alpha \) drops out (under appropriate boundary and gauge conditions [65]). We recover the equivalence of the two representations for the \(\mathcal {O}(m^{-2})\) axion couplings, as depicted in the top line of Table 1.
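The \(\alpha \)-independence of the time-integrated source in Eq. (92) is an integration by parts, which can be illustrated numerically. The sketch below is a toy model under stated assumptions, not the paper's calculation: one Cartesian component is considered, the axion is \(a(t)=\cos (m_{a}t+\varphi )\), the vector potential is a Gaussian pulse \(A(t)=\exp (-t^{2}/\tau ^{2})\) that vanishes at early and late times, \(E=-\dot{A}\) as in the \(\phi =0\) gauge, and the function name and parameter values are hypothetical.

```python
import numpy as np

def integrated_source(alpha, m_a=2.0, phase=0.7, tau=1.0,
                      half_width=10.0, n=20001):
    """Time integral of g = -alpha * a_dot * A - (1 - alpha) * a * E."""
    t = np.linspace(-half_width, half_width, n)
    a = np.cos(m_a * t + phase)
    a_dot = -m_a * np.sin(m_a * t + phase)
    A = np.exp(-(t / tau) ** 2)
    E = 2.0 * t / tau**2 * A  # E = -dA/dt for the Gaussian pulse
    g = -alpha * a_dot * A - (1.0 - alpha) * a * E
    # Trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t)))

values = [integrated_source(alpha) for alpha in (0.0, 0.3, 1.0)]

# Closed form for this toy model: m_a sin(phase) tau sqrt(pi) exp(-(m_a tau)^2 / 4)
expected = 2.0 * np.sin(0.7) * 1.0 * np.sqrt(np.pi) * np.exp(-1.0)
```

The three entries of `values` agree because the difference between the \(\alpha =1\) and \(\alpha =0\) integrands is the total derivative \(-\partial _{t}(aA)\), whose integral vanishes when \(A\) does at the boundaries. If the pulse did not vanish there, the boundary term would survive, in line with the caveat on boundary conditions.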

If we take instead a gauge where \(\textbf{A}=0\), the \(\varvec{\gamma }\cdot \textbf{p}\) term produces the required \(\varvec{\nabla }\phi \) term of \(\textbf{E}\), as depicted in the bottom line of Table 1. Let us check this explicitly, assuming \(\varvec{\nabla }a=0\). We start with the EDM form of the operator in the equation of motion of \(\varvec{S}\), and first replace \(e\textbf{E}=-e\varvec{\nabla }\phi =-i[\textbf{p},\mathcal {H}_{0}^{\textrm{NR}}]+\mathcal {O}(1/m)\) with \(\mathcal {H}_{0}^{\textrm{NR}}\) the Hamiltonian with \(a=0\):

$$\begin{aligned} \left. \varvec{\dot{S}}\right| _{a\textbf{E}}=-i\left[ \varvec{S} ,-\frac{ea\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}}{m^{2}}\right] = \left[ \varvec{S},\frac{a\gamma ^{5}\varvec{\gamma }\cdot [\textbf{p},\mathcal {H}_{0}^{\textrm{NR}}]}{m^{2}}\right] . \end{aligned}$$
(93)

Acting with \(\mathcal {H}_{0}^{\textrm{NR}}\) on the external spinors gives the energy difference, which comes entirely from the difference in electric potential felt by the initial and final states. This must match the energy brought in by the axion, i.e. \(m_{a}\), so that \(a\gamma ^{5}\varvec{\gamma }\cdot [\textbf{p},\mathcal {H}_{0}^{\textrm{NR}}]\) collapses to \(\dot{a}\gamma ^{5}\varvec{\gamma }\cdot \textbf{p}\), as it should (see Footnote 8). In full detail, the correspondence is verified by writing explicitly the external spinors. According to the Ehrenfest theorem,

$$\begin{aligned} \frac{d}{dt} \langle \psi \vert \varvec{S}\vert \psi \rangle \vert _{a\textbf{E}}= & {} \frac{1}{m^{2}}\Bigg \langle \psi \left| \left[ \varvec{S},a\gamma ^{5}\varvec{\gamma }\cdot [\textbf{p},\mathcal {H}_{0}^{\textrm{NR}}]\right] \right| \psi \Bigg \rangle \nonumber \\= & {} \frac{1}{m^{2}}\langle \psi \vert -[\mathcal {H}_{0}^{\textrm{NR}},[\varvec{S},a\gamma ^{5}\varvec{\gamma }\cdot \textbf{p}]]-[[\varvec{S},\mathcal {H}_{0}^{\textrm{NR}}],a\gamma ^{5}\varvec{\gamma }\cdot \textbf{p}]\vert \psi \rangle \nonumber \\= & {} \frac{1}{m^{2}}\langle \psi \vert i\overleftarrow{\partial }_{t}[\varvec{S},a\gamma ^{5}\varvec{\gamma }\cdot \textbf{p}]+[\varvec{S},a\gamma ^{5}\varvec{\gamma }\cdot \textbf{p}]i\overrightarrow{\partial }_{t}\vert \psi \rangle \nonumber \\= & {} -i\Bigg \langle \psi \bigg \vert \bigg [\varvec{S},\frac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p},\dot{a}\}}{2m^{2}}\bigg ]\bigg \vert \psi \Bigg \rangle \nonumber \\= & {} \frac{d}{dt} \langle \psi \vert \varvec{S}\vert \psi \rangle \vert _{\dot{a}\textbf{p}}, \end{aligned}$$
(94)

using the equation of motion of the external states \(i\partial _{t}\left| \psi \right\rangle =\mathcal {H}_{0}^{\textrm{NR}}\left| \psi \right\rangle \), the Jacobi identity \([A,[B,C]]+[C,[A,B]]+[B,[C,A]]=0\), and integrating by parts over time, i.e., imposing the conservation of energy. We do not include \(\left\langle \psi \right| \varvec{\dot{S}}\left| \psi \right\rangle \) in the evolution of \(\left\langle \psi \right| \varvec{S}\left| \psi \right\rangle \) and used \([\varvec{S},\mathcal {H}_{0}^{\textrm{NR}}]=i\varvec{\dot{S}}=0\) because as an operator, \(\varvec{S}\) does not vary in time. This is consistent since in this picture, \(\mathcal {H}_{0}^{\textrm{NR}}=\gamma ^{0}(m+\cdots )\) and \(\varvec{S}=\varvec{\sigma }/2=-\gamma ^{0}\gamma ^{5}\varvec{\gamma }/2\), so clearly \([\varvec{S},\mathcal {H}_{0}^{\textrm{NR}}]=0\). We thus recover the equivalence between the axionic EDM and axioelectric operators of Eq. (74). All in all, the situation is totally analogous to that for the axioelectric effect discussed in Sect. 3.2.1: observables are gauge invariant and independent of the choice of operators in \(\mathcal {H}^{\textrm{NR}}(\alpha ,\beta )\).

Numerical estimates. Altogether, the \(\alpha \) parameter drops out from the equation of motion of \(\varvec{S}\), which can be written as

$$\begin{aligned} \varvec{\dot{S}}=-\frac{2ea}{m^{2}}\gamma ^{0}\varvec{S}\times \textbf{E}. \end{aligned}$$
(95)

An immediate question looking at this equation is what happens if the axion is sufficiently slowly varying compared to the observation time T, i.e., when T is small enough compared to \(1/m_{a}\). In effect, the axion field is constant, and it may seem \(\varvec{\dot{S}}\sim a\varvec{S}\times \textbf{E}\) survives as \(m_{a}\rightarrow 0\) since it is linear in a. This is not true though, because if the axion field is constant, then \(\varvec{S}\) is no longer the right spin operator. With \(a(t)=a_{0}\), the mass term becomes complex (see Eq. (39)), and a chiral rotation of the wavefunction becomes necessary to identify the true spin operator. This additional change of basis, required only if the axion field is constant over the observation time (or other relevant time scale), is what distinguishes the situation with \(\alpha =0\) from \(\alpha =1\) in Eq. (91). More generally, if we write \(a(t)=a_{0}+\dot{a}t+\cdots \), then the \(a_{0}\) term disappears when the fermion mass is made real. When \(T\lesssim 1/m_{a}\), the observable change in the spin orientation ends up linear in \(m_{a}\), with \(a\left| \textbf{E}\right| \approx E_{0}a_{0}m_{a}T\). This makes the precession unobservable for very small axion masses, and reproduces the expected decoupling of the axion in the \(m_{a}\rightarrow 0\) limit.
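The linear scaling with \(m_{a}\) at short observation times can be illustrated numerically. The following sketch takes a background \(a(t)=A\cos (m_{a}t+\phi )\) with a fixed amplitude A and a hypothetical phase \(\phi \) (both are assumptions made for illustration), subtracts the constant piece \(a(0)\) that is absorbed when the mass is made real, and compares what is left with the linear term \(\dot{a}(0)\,t\). Quantities are dimensionless, in units where \(T=1\).

```python
import math

def axion_field(amplitude, m_a, t, phase):
    """Coherent background a(t) = A cos(m_a t + phase)."""
    return amplitude * math.cos(m_a * t + phase)

def residual_after_rotation(amplitude, m_a, t, phase):
    """a(t) - a(0): what survives once the constant piece is rotated away."""
    return (axion_field(amplitude, m_a, t, phase)
            - axion_field(amplitude, m_a, 0.0, phase))

def linear_estimate(amplitude, m_a, t, phase):
    """Leading term of the expansion, a_dot(0) * t = -A m_a t sin(phase)."""
    return -amplitude * m_a * t * math.sin(phase)

T = 1.0                  # observation time (units chosen so that T = 1)
PHASE = math.pi / 2.0    # hypothetical phase, chosen so that a_dot(0) is maximal

for m_a in (1e-2, 5e-3):  # both satisfy m_a * T << 1
    exact = residual_after_rotation(1.0, m_a, T, PHASE)
    approx = linear_estimate(1.0, m_a, T, PHASE)
    print(f"m_a*T = {m_a * T:.0e}: residual = {exact:.6e}, linear = {approx:.6e}")

halving_ratio = (residual_after_rotation(1.0, 1e-2, T, PHASE)
                 / residual_after_rotation(1.0, 5e-3, T, PHASE))
print("ratio when m_a is halved:", halving_ratio)
```

Halving \(m_{a}\) at fixed amplitude halves the residual field, which is the decoupling pattern described above.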

Importantly, this means the current limit on the electron EDM, \(d_{e}^{\exp }<1.1\times 10^{-29}\) e cm [69], cannot be used to constrain the axion-electron coupling. Besides the fact that it is extracted from atoms, i.e., bound electrons, it assumes the EDM is sufficiently slowly varying and does not integrate to zero if the observation time T runs over several oscillation cycles. But as we just demonstrated, in that case, the axion background decouples and does not induce any spin precession.

More promising are situations in which the electric field is oscillating in time at some frequency \(\omega \approx m_{a}\). Then, no change of basis is needed, \(a\left| \textbf{E}\right| =E_{0}a_{0}\cos (m_{a}t)\cos (\omega t)\) is not linear in \(m_{a}\), and it integrates to a non-zero value over some long enough observation time. Clearly, one can no longer take the \( m_{a}\rightarrow 0\) limit here, as it is ill-defined for this situation. Also, the matching with the axioelectric form is trivial since to \(\left| \textbf{E}\right| =E_{0}\cos (\omega t)\) corresponds \(\left| \textbf{ A}\right| =(E_{0}/\omega )\sin \omega t\), so \(\dot{a}\left| \textbf{A }\right| \) matches \(a\left| \textbf{E}\right| \), up to fixed boundary terms. This shows how oscillating electric fields really take full advantage of the fact that the derivative of the axion field, \(\dot{a}\), is coupled to the vector potential, and not to the electric field (see footnote 9). Thus, provided the axion mass is not too small so that T covers more than a fraction of an oscillation, it is in principle possible to directly access the axion-induced EDM, which could reach a sizeable value. For an electron, taking the coherent classical axion background \(a(t)=a_{0}\cos (m_{a}t)\) with \( m_{a}a_{0}=\sqrt{2\rho _{DM}}\) and \(\rho _{DM}=0.4\ \)GeV/cm\(^{3}~\) [13]

$$\begin{aligned} d_{e}(t)=\frac{ea(t)}{m_{e}\Lambda }\approx 10^{-11}\frac{a(t)}{\Lambda }~e \text { cm (for } m_a > 0). \end{aligned}$$
(96)

With in addition the QCD axion mass and scale related by \(m_{a}\Lambda \approx f_{\pi }m_{\pi }\approx m_{\pi }^{2}\), we find

$$\begin{aligned}{} & {} d_{e}(t)\approx 10^{-11}\frac{\sqrt{2\rho _{DM}}}{m_{\pi }^{2}}\cos (m_{a}t)\approx 10^{-30}\nonumber \\{} & {} \qquad \cos (m_{a}t)~e\text { cm (for } m_a > 0), \end{aligned}$$
(97)

independently of \(m_{a}\) and \(\Lambda \) [70, 71]. Alternatively, this same estimate can be expressed in terms of the axion-electron coupling (which corresponds to taking \(a\rightarrow g_{ae}a\) in Eq. (72)),

$$\begin{aligned} d_{e}(t)= & {} eg_{ae}\dfrac{\sqrt{2\rho _{DM}}}{m_{e}^{2}m_{a}}\cos (m_{a}t)\approx (10^{-19}~\text {eV})\text { }\frac{g_{ae}}{m_{a}}\nonumber \\{} & {} \cos (m_{a}t)~e\text { cm (for } m_a > 0). \end{aligned}$$
(98)

The electron EDM thus appears very promising to set competitive bounds on the axion (or ALP) couplings using electric microwaves, covering the \(\mu \)eV range of axion masses favored by the misalignment mechanism.
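The orders of magnitude quoted in Eqs. (96)-(98) follow from unit conversions alone, and can be reproduced with the short sketch below. It works in natural units and converts with \(\hbar c\). The electron and neutral pion masses and the physical constants are standard values inserted here as assumptions, \(m_{a}\Lambda \) is set to \(m_{\pi }^{2}\) as in the text, and the last function simply converts an axion mass into an oscillation frequency.

```python
import math

HBARC_GEV_CM = 1.973269804e-14   # hbar*c in GeV cm
H_EV_S = 4.135667696e-15         # Planck constant in eV s
M_E = 0.51099895e-3              # electron mass in GeV (assumed standard value)
M_PI = 0.1349768                 # neutral pion mass in GeV (assumed standard value)
RHO_DM = 0.4                     # local dark matter density in GeV/cm^3, as in the text

def sqrt_two_rho(rho_gev_per_cm3):
    """m_a * a_0 = sqrt(2 rho_DM), returned in GeV^2."""
    return math.sqrt(2.0 * rho_gev_per_cm3 * HBARC_GEV_CM**3)

def edm_prefactor_cm(mass_gev):
    """1/m converted to cm: the coefficient of a/Lambda in Eq. (96), in e cm."""
    return HBARC_GEV_CM / mass_gev

def axion_frequency_hz(m_a_ev):
    """Oscillation frequency m_a / h of the coherent background."""
    return m_a_ev / H_EV_S

# Eq. (96): d_e = (e / m_e) * a / Lambda
de_prefactor = edm_prefactor_cm(M_E)

# Eq. (97): a_0 / Lambda = sqrt(2 rho_DM) / (m_a Lambda), with m_a Lambda ~ m_pi^2
a0_over_lambda = sqrt_two_rho(RHO_DM) / M_PI**2
de_amplitude = de_prefactor * a0_over_lambda

# Eq. (98): coefficient of g_ae / m_a, in eV * (e cm), for m_a expressed in eV
coeff_98 = sqrt_two_rho(RHO_DM) / M_E**2 * HBARC_GEV_CM * 1e9

print(f"Eq. (96) prefactor   : {de_prefactor:.2e} e cm")
print(f"Eq. (97) amplitude   : {de_amplitude:.2e} e cm")
print(f"Eq. (98) coefficient : {coeff_98:.2e} eV e cm")
print(f"frequency at 1 micro-eV: {axion_frequency_hz(1e-6):.3e} Hz")
```

The outputs agree with the quoted powers of ten up to factors of a few, which is the accuracy intended by the estimates in the text.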

This reasoning remains valid in a CASPER-like situation [16] in which the spin of the charged lepton is precessing at a Larmor frequency \(\omega _{L}\approx m_{a}\) in a magnetic field. In that case, a constant electric field is seen in the rotating frame as oscillating at that \(\omega _{L}\) frequency and Eq. (96) applies. Let us stress though that Eq. (96) does not apply to CASPER itself as currently designed, since we are dealing with free charged leptons here. Further, we do expect very significant suppressions for charged fermions bound into a neutral atomic system, as will be shown in the next section for the neutron. Further work is needed to cover these systems, to see whether some sensitivity can be retained for some range of axion masses. In that respect, the equivalence between the axionic EDM and axioelectric forms of the operator may turn out to be useful. We already know that they both have precisely the same capability to kick bound electrons out when the axion brings enough energy, and we have seen above that they act in the same way on free lepton spins, so there is no reason to think they could act differently on bound electrons. As a tool, this equivalence could thus help in obtaining realistic numerical estimates for atomic systems. This is left for future work.
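The role of the resonance condition \(\omega \approx m_{a}\) invoked in this section can be made explicit with the time integral of \(\cos (m_{a}t)\cos (\omega t)\), which has a simple closed form. The sketch below is only an illustration in arbitrary units (the function name and the sample values are ours): on resonance the integral grows like \(T/2\), while off resonance it stays bounded whatever the observation time.

```python
import math

def overlap_integral(m_a, omega, T):
    """Closed form of the integral of cos(m_a t) cos(omega t) over t in [0, T].

    Uses cos(x) cos(y) = [cos(x - y) + cos(x + y)] / 2 and assumes m_a + omega > 0.
    """
    total = m_a + omega
    diff = m_a - omega
    sum_term = math.sin(total * T) / (2.0 * total)
    if diff == 0.0:
        # Exactly on resonance the difference term integrates to T / 2.
        return T / 2.0 + sum_term
    return math.sin(diff * T) / (2.0 * diff) + sum_term

# Arbitrary units with m_a = 1; T = 1000 covers many oscillation cycles.
on_resonance = overlap_integral(1.0, 1.0, 1000.0)
off_resonance = overlap_integral(1.0, 1.5, 1000.0)

print("on resonance :", on_resonance)
print("off resonance:", off_resonance)
```

Off resonance the result can never exceed \(1/(2\vert m_{a}-\omega \vert )+1/(2(m_{a}+\omega ))\) in magnitude, so only a field tuned close to the axion frequency accumulates a signal over long observation times.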

4.2 Nucleons

Let us now turn to the nucleons, for which the situation is much more complicated because of hadronic effects, and because the quarks are not expected to be non-relativistic inside a nucleon. We concentrate on the connection between the quark and nucleon levels here, with an idealized nucleon precession experiment in mind, and leave the discussions about the nuclear or atomic levels to future work. To proceed, let us characterize the various contributions to the nucleon EDMs in terms of effective Lagrangian couplings.

1. Quark constant EDMs. The simplest mechanism to generate a nucleon EDM occurs when quarks develop constant EDMs, like in the presence of some new CP violating sources. The quark electric moment operators then translate naturally into the corresponding nucleon operators

$$\begin{aligned} \mathcal {L}_{q,g}{} & {} \supset \bar{\psi }_{q}\left( i\frac{d_{q}}{2}\sigma ^{\mu \nu }\gamma ^{5}F_{\mu \nu }\right) \psi _{q}\nonumber \\ {}{} & {} \rightarrow \mathcal {L} _{N}\supset \bar{\psi }_{N} \left( i\frac{d_{N}}{2}\sigma ^{\mu \nu }\gamma ^{5}F_{\mu \nu }\right) \psi _{N}, \end{aligned}$$
(99)

with \(q=u,d\). Given current lattice estimates [59], hadronization appears essentially transparent on these local operators, and the naive SU(6) estimate in Eq. (36) is actually quite good. In case the short-distance quark EDMs scale as \(1/m_{q}\), analogy with the magnetic moment would suggest a similarly transparent hadronization, but with running quark masses replaced by constituent masses. The important information is that even if the neutron is neutral, it is not “EDM neutral”, because the EDM interaction involves a combination of spins and electric charges. As a result, even very soft photons, insensitive to the quark structure, do interact with the neutron via its spin.

2. Gluonic contributions. The fundamental axion interactions with quarks or gluons do not involve the photon field at leading order. Yet, the quarks being electrically charged, non-local processes at the partonic level can induce local EDM operators at the nucleon level. The most well-known such non-local EDM contribution comes from the \(\theta \) term of QCD. Indeed, in the presence of an axion field, one expects a coupling

$$\begin{aligned} \mathcal {L}_{q,g}&\supset \frac{g^{2}}{32\pi ^{2}}\left( \frac{a}{\Lambda } +\theta \right) G_{\mu \nu }\tilde{G}^{\mu \nu } \nonumber \\&\quad \rightarrow \mathcal {L} _{N}\supset \bar{\psi }_{N} \left( i\frac{d_{N}}{2}\frac{a}{\Lambda }\sigma ^{\mu \nu }\gamma ^{5}F_{\mu \nu }\right) \psi _{N}. \end{aligned}$$
(100)

The constant \(\theta \) term is cancelled by the axion field falling to its true minimum, but this leaves an \(aG_{\mu \nu }\tilde{G}^{\mu \nu }\) coupling. In the presence of a dark matter axion background, \(a(t,\textbf{x})=a_{0}\cos (m_{a}t-\textbf{k}\cdot \textbf{x}+\phi )\), and from Eq. (35), this term then induces an EDM for the nucleons [70, 71]. Using the matrix element estimates quoted in Ref. [57]:

$$\begin{aligned} d_{n}(t)= & {} -(2.7\pm 1.2)\times 10^{-16}\frac{a(t)}{\Lambda }\ ~e\text { cm,} \end{aligned}$$
(101a)
$$\begin{aligned} d_{p}(t)= & {} +(2.1\pm 1.2)\times 10^{-16}\frac{a(t)}{\Lambda }\ ~e\text { cm, } \end{aligned}$$
(101b)

(we use \(d_{N}(t)\) to denote the total nucleon EDM, and \(d_{N}\) the coefficient of the operator in Eq. (100)). Note though that recent lattice estimates reduce this matrix element by about a factor of two [72, 73]. In any case, this prediction has motivated dedicated experimental searches [16, 74], with specific strategies designed to tackle the oscillatory nature of the EDM. At this stage, we should point out though that strictly speaking, the matrix element was extracted for a constant \(\theta \), by extrapolating the form-factor for \(N\rightarrow N\gamma (q)\) to \(q^{2}\rightarrow 0\). The idea is that provided the axion background is not varying too quickly, QCD has time to account for the presence of the axion field as a kind of effective axionic \(\theta \) term. Yet, notice that in the general case, the axion field is also injecting some energy, albeit a small amount, and \(d_{N}\) in Eq. (100) is actually a form-factor that depends on both the photon and axion momenta. When applying this estimate to the axion EDM coupling, one implicitly makes the assumption that the axion is very soft and that the limit \(m_{a}\rightarrow 0\) is smooth. We will come back to this point below.

3. Ward identity. The pseudoscalar and/or derivative couplings of the axion to the quarks also contribute to the nucleon operator in Eq. (100) via similar non-local processes (see Fig. 1). To estimate their relative size compared to the gluonic contribution, a crucial piece of information comes from the Ward identity of the anomalous PQ symmetry (we assume for now that the axion couples only to a single quark \(\psi _{q}\)):

$$\begin{aligned} \partial _{\mu }(\bar{\psi }_{q}\gamma ^{\mu }\gamma ^{5}\psi _{q})=2im_{q} \bar{\psi }_{q}\gamma ^{5}\psi _{q}+\frac{g^{2}}{16\pi ^{2}}G_{\mu \nu } \tilde{G}^{\mu \nu }. \nonumber \\ \end{aligned}$$
(102)

This means that the Lagrangian interpolating between the derivative and polar representations should read

(103)

where one can recognize Eq. (39) when \(\alpha =0\), and Eq. (37) when \(\alpha =1\), plus the anomalous term. Thus, the gluon-induced and quark-induced contributions cannot truly be disentangled and must be treated together. This fact is actually often used to estimate theoretically the nucleon EDM for constant \(\theta \), as it can be easier to deal with a phase for the quark masses than with the anomalous \(G\tilde{G}\) coupling  [75, 76]. Note, though, that if the \(aG\tilde{G}\) coupling also receives contributions from other heavy states, either SM quarks or new heavy fermions like in the KSVZ scenario, then the quark Lagrangian should rather read

(104)

where we have put back the couplings \(g_{q}\) and \(g_{g}\) to distinguish axion-quark and axion-gluon couplings. Notice that in this case, it is always possible to choose \(\alpha \) (or \(\beta \)) such that one of the couplings disappears, i.e., without the axial, the pseudoscalar, or the anomalous coupling:

(105a)
(105b)
(105c)

Also, notice that in the quark non-relativistic limit, \(g_{g}\) never contributes to the quark axion wind or the quark axionic EDM operator.

4. Sutherland–Veltman theorem. What the Ward identity Eq. (102) shows is that in the soft limit, i.e., \(\partial _{\mu }a\rightarrow 0\), the pseudoscalar axion-quark coupling is strictly equivalent to the anomalous axion-gluon coupling, and thus that

$$\begin{aligned} \left\langle N\gamma \right| -2i\frac{m_{q}}{\Lambda }a\bar{\psi } _{q}\gamma ^{5}\psi _{q}\left| N\right\rangle \overset{\partial _{\mu }a\rightarrow 0}{\rightarrow }\left\langle N\gamma \right| \frac{a}{ \Lambda }\frac{g^{2}}{16\pi ^{2}}G_{\mu \nu }\tilde{G}^{\mu \nu }\left| N\right\rangle . \end{aligned}$$
(106)

Actually, this is essentially the Sutherland–Veltman theorem [77, 78], well-known in the context of \(\pi ^{0}\rightarrow \gamma \gamma \). Here, it proves that the non-local contributions of \(a\bar{\psi }_{q}\gamma ^{5}\psi _{q}\) to the nucleon EDM must reproduce that of \(aG_{\mu \nu }\tilde{G}^{\mu \nu }\) quoted in Eq. (101) in the soft limit. This serves as a baseline, and the question now is whether the EDM can be enhanced compared to Eq. (101) even slightly away from that limit. Note, for completeness, that the Ward identity also relates the matrix elements of the axial current and the anomalous term in the chiral limit. Indeed, for a massless quark, the axion decouples entirely from Eq. (103), as can be seen taking \( m_{q}=0\) and \(\alpha =0\). Thus, the matrix elements of \(\bar{\psi }_{q}\gamma ^{\mu }\gamma _{5}\psi _{q}\partial _{\mu }a\) and \(aG_{\mu \nu }\tilde{G} ^{\mu \nu }\) must match when \(m_{q}\rightarrow 0\) since the two must cancel each other [61].

5. On the \(m_{a}\rightarrow 0\) limit. The soft limit, \(\partial _{\mu }a\rightarrow 0\), and the \(m_{a}\rightarrow 0\) limit are not entirely equivalent. Naively, if \(a(t,\textbf{x})=a_{0}\cos (m_{a}t- \textbf{k}\cdot \textbf{x}+\phi )\) for some constant \(\phi \), \(\partial _{\mu }a\rightarrow 0\) is equivalent to \(\dot{a}=0\) if \(\textbf{k}\) is negligible, and \(\dot{a}(t)=0\) can be attained for all time only with \( m_{a}=0\). Now, simply setting \(m_{a}=0\), the axion field becomes constant and some of its couplings to quarks and gluons in Eq. (103) survive, and so does the nucleon EDM operator in Eq. (100). At the same time, theoretically, \(m_{a}\rightarrow 0\) requires \(\Lambda \rightarrow \infty \) since \(m_{a}\) comes from the gluon coupling \(aG_{\mu \nu }\tilde{G} ^{\mu \nu }\). But if we send \(\Lambda \) to infinity, the axion entirely decouples, there will be no axionic EDM at all, and the strong CP puzzle is back. Actually, there is probably a threshold for \(m_{a}\) below which the axion would take too much time to realign to compensate for some preexisting \(\theta \) term, in which case all the axion contributions would be overshadowed by the large constant EDM due to \(\theta \). This goes beyond what is discussed here, and we still assume that term is absent. Yet, if we expand the coherent background axion field as \(a(x^{\mu })=a_{0}+x^{\mu }\partial _{\mu }a+\cdots \), the constant term \(a_{0}\) has to be treated with caution. In effect, QCD with such an external field looks very much like QCD with a non-zero \(\theta \) term, which is CP-violating and quite different from QCD with an axion and \(\theta =0\), which is CP-conserving (see e.g. Ref. [79] for a comparison). In particular, the \(a_{0}\) term generates complex phases for the quark condensate, and changes how to treat the chiral symmetry breaking terms.
Specifically, when \(a=a_{0}\) is constant, Dashen's theorem must be invoked, and after the necessary realignment of the chiral vacuum [75],

$$\begin{aligned} \frac{m_{q}}{\Lambda }a_{0}\bar{\psi }_{q}(i\gamma _{5})\psi _{q}{} & {} \rightarrow \left( \frac{1}{m_{u}}+\frac{1}{m_{d}}+\frac{1}{m_{s}}\right) ^{-1}\frac{ a_{0}}{\Lambda }\nonumber \\{} & {} \quad \times \sum _{q=u,d,s}\bar{\psi }_{q}(i\gamma _{5})\psi _{q}. \end{aligned}$$
(107)

This drastically changes the character of the axion-quark coupling, and suppresses it significantly (it now vanishes if any of the quark masses vanishes, as it should). This isospin singlet quark current accounts for the isospin singlet anomalous term \(a_{0}G_{\mu \nu }\tilde{G}^{\mu \nu }\). This explains how the Sutherland–Veltman theorem sets in: the \(\mathcal {L} _{q,g}(\alpha =0)\) term contains both a suppressed term collapsing to the anomalous one, and a term matching the axial interaction (i.e., proportional to \(\partial _{\mu }a\)), so that any observable takes the same value whether calculated from \(\mathcal {L} _{q,g}(\alpha =0)\) or from \(\mathcal {L}_{q,g}(\alpha =1)\).
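To get a feeling for the size of the suppression in Eq. (107), the mass combination \((1/m_{u}+1/m_{d}+1/m_{s})^{-1}\) can be evaluated directly. The quark masses used below are illustrative light-quark values in MeV, inserted here as an assumption and not taken from the text; the function name is ours.

```python
def dashen_reduced_mass(masses):
    """Return (sum over q of 1/m_q)^(-1), the mass factor appearing in Eq. (107).

    The combination vanishes as soon as one of the masses vanishes.
    """
    if any(m == 0.0 for m in masses):
        return 0.0
    return 1.0 / sum(1.0 / m for m in masses)

# Illustrative light-quark masses in MeV (assumed values).
M_U, M_D, M_S = 2.16, 4.67, 93.4

reduced = dashen_reduced_mass([M_U, M_D, M_S])
print(f"reduced mass: {reduced:.3f} MeV")
print(f"relative to m_d: {reduced / M_D:.3f}")
print("massless up quark:", dashen_reduced_mass([0.0, M_D, M_S]))
```

With these inputs the combination is smaller than the lightest mass, and it goes to zero with any single quark mass, in line with the remark that the coupling vanishes in that case.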

6. Axion couplings to nucleons. The three axion couplings in \(\mathcal {L}_{q,g}(\alpha )\) of Eq. (103) generate the nucleon EDM operator, but also simpler axion couplings to nucleons. With them, the nucleon effective Lagrangian becomes:

(108)

for some prefactor \(g_{N}\) a priori of \(\mathcal {O}(1)\), and \(D^{\mu }=\partial ^{\mu }-ieQ_{N}A^{\mu }\) with \(Q_{N}\) the nucleon electric charge. From the previous points, all the \(\mathcal {L}_{q,g}(\alpha )\) couplings appear to contribute to all the \(\mathcal {L}_{N}(\alpha ^{\prime }) \) couplings. For instance, the axion-gluon coupling contributes to both \( d_{N}\) and \(g_{N}\), since \(aG_{\mu \nu }\tilde{G}^{\mu \nu }\) does not need photons to generate a CP-violating coupling. Similarly, the axion-quark couplings in \(\mathcal {L}_{q,g}(\alpha )\) naturally induce axion-nucleon couplings, but also the local anomalous term \(d_{N}\), as is obvious starting from \(\mathcal {L}_{q,g}(\alpha =0)\) or invoking the Sutherland–Veltman theorem, Eq. (106). Yet, there are subtleties at play here. First, even if the two axion couplings to nucleons are necessarily present and related under the reparametrization \(\psi _{N}\rightarrow \exp (ig_{N}\alpha \gamma ^{5}a/\Lambda )\psi _{N}\), \(\alpha ^{\prime }\) is not necessarily equal to \( \alpha \). We neglect the QED anomaly here, so the Ward identity underpinning the Goldstone-boson reparametrization invariance at the nucleon level is simply the classical one (and \(\alpha ^{\prime }\) has to cancel from observables). Second, for a constant axion background \(a(t,\textbf{x})=a_{0}\), the axion couplings to nucleons vanish. This is clear for \(\alpha ^{\prime }=1\), while it requires a chiral rotation \(\psi _{N}\rightarrow \exp (i\alpha g_{N}\gamma ^{5}a_{0}/\Lambda )\psi _{N}\) to make the mass term real for \(\alpha ^{\prime }=0\) (this can also be understood as a nucleon-level Sutherland–Veltman theorem: the pseudoscalar coupling is equivalent to the axial one, which vanishes in the \(\partial _{\mu }a\rightarrow 0\) limit). Third, the \(d_{N}\) coupling does not represent the whole axionic EDM of the nucleon.
From an effective theory point of view, \( d_{N}\) only represents the short-distance contribution, to which tree-level (and loop-level once pions and other light mesons are included) contributions from the leading couplings have to be added, see Figs. 1 and 2.

7. Nucleon magnetic moment. Deciding like in Sect. 3.3 to add the magnetic dipole operator to \(\mathcal {L}_{N}(\alpha ^{\prime }=1)\), the nucleon Lagrangian becomes (see Eq. (81))

(109)

up to dipole terms of \(\mathcal {O}(a^{2})\). The magnetic dipole operator accounts for the proton and neutron magnetic moments \(2+\mu _{p}\approx 2.8\) and \(\mu _{n}\approx -1.9\), respectively. This Lagrangian is invariant under the Goldstone boson reparametrizations \(\psi _{N}\rightarrow \exp (ig_{N}\alpha ^{\prime }\gamma ^{5}a/\Lambda )\psi _{N}\), again up to terms quadratic in the axion field. In the non-relativistic limit, a single axionic EDM operator arises at leading order:

$$\begin{aligned} \mathcal {L}_{N}(\alpha ^{\prime })\rightarrow -\left( d_{N}+\frac{eg_{N}}{ m_{N}}Q_{N}\right) \dfrac{a}{\Lambda }\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}. \end{aligned}$$
(110)

The contribution from \(\mu _{N}\) drops out, independently of \(\alpha ^{\prime }\) (as was already apparent for the neutron in Eq. (89)). It is interesting to understand the mechanism of this cancellation (see Ref. [80]): the \(\mu _{N}\)-dependent shift of \(d_{N}\) is compensated by the long-distance contributions arising from one-nucleon reducible diagrams with the photon emitted via the \(\mu _{N}\bar{\psi } _{N}\sigma ^{\mu \nu }F_{\mu \nu }\psi _{N}\) vertex. Now, this cancellation crucially relies on the assumption that one should add the magnetic dipole operator to \(\mathcal {L}_{N}(\alpha ^{\prime }=1)\). In Sect. 3.3, this was justified by the fact that quarks are PQ neutral in the derivative representation. Nucleons, on the contrary, do not have definite PQ charges if the PQ-breaking \(aG\tilde{G}\) coupling contributes to the \(a \bar{N}N\) vertex. To circumvent this, we require that only the local term \( d_{N}\) should be present in the \(\partial _{\mu }a\rightarrow 0\) limit. Indeed, the reparametrization then becomes a chiral rotation, i.e., a change of basis. What is to be called the magnetic moment and the EDM have to be defined in the basis in which the nucleon mass is real (see e.g. Ref. [81] for a detailed discussion). With the assumption that this limit is smooth despite the fact that a change of basis for \(\psi _{N}\) is implied, this restores a definite PQ charge for the nucleons. The representation in Eq. (109) confines the contributions of the \(aG\tilde{G}\) coupling to the local term \(d_{N}\), together with some axion-quark contributions given Eqs. (105) and (106), leaving the quark couplings to induce \(g_{N}\). Thus, we expect \(d_{N}(g_{q},g_{g})\), but \(g_{N}(g_{q})\), with the \(g_{N}(g_{q})\) contribution cancelling out when \( \partial _{\mu }a=0\). Note, finally, that \(\mathcal {L}_{N}(\alpha ^{\prime }=1)\) corresponds to the usual form employed in the literature, see e.g. Ref. [71].

8. Proton EDM. The important property of the first two operators of \(\mathcal {L}_{N}(\alpha ^{\prime })\) is that they combine with the electromagnetic coupling to produce non-relativistic EDM operators for the proton, but not for the neutron. The phenomenology of a precessing proton is thus very similar to that discussed for charged leptons. In the non-relativistic limit for the proton, using \(\mathcal {L}_{N}(\alpha ^{\prime }=1)\), the axion EDM couplings are

$$\begin{aligned} \mathcal {H}^{\textrm{NR}}\supset g_{p}\dfrac{\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}}{2m_{p}\Lambda }-d_{p}\dfrac{a}{\Lambda }\gamma ^{5}\varvec{\gamma }\cdot \textbf{E}. \end{aligned}$$
(111)

The covariant axioelectric operator cannot be rotated away, and does induce spin precession, see Eq. (94). We choose to write \(\mathcal {H}^{ \textrm{NR}}\) starting from \(\mathcal {L}_{N}(\alpha ^{\prime }=1)\) instead of as in the equivalent form of Eq. (110) to emphasize the different nature of these two contributions. First, the \(g_{p}\) term explicitly vanishes if \(\partial _{\mu }a=0\), but not the \(d_{p}\) term. Yet, both sum up to an EDM-like precession, and even more, both operators become identical if \(\left| \textbf{E}\right| =E_{0}\sin (\omega t)\) since then \(\dot{a}\gamma ^{5}\varvec{\gamma }\cdot \textbf{P}\supset \dot{a}\gamma ^{5}\varvec{\gamma }\cdot \textbf{A}=a\gamma ^{5} \varvec{\gamma }\cdot \textbf{E}\) provided \(m_{a}=\omega \ne 0\). Second, these two operators encode different physics: the \(d_{p}\) contribution is local already at the hadronic level, but the axioelectric contribution has an intrinsically non-local origin, see Fig. 2, and becomes local in the non-relativistic limit only (as said earlier, we do not consider nuclear systems here). From the discussion in the lepton case, we expect the axion-induced proton EDM to increase to

$$\begin{aligned} d_{p}(t)\approx & {} g_{p}\frac{ea(t)}{m_{p}\Lambda }\approx 10^{-14}\frac{a(t)}{ \Lambda }~e\text { cm}\nonumber \\\approx & {} 10^{-14}\times \frac{\sqrt{2\rho _{DM}}}{m_{\pi }^{2}} \cos (m_{a}t)\nonumber \\\approx & {} 10^{-33}\cos (m_{a}t)~e\text { cm,} \end{aligned}$$
(112)

in the “resonant” situation in which the EM field matches the axion frequency, and assuming \(g_{p}\sim \mathcal {O}(1)\). This represents an enhancement of the long-distance contribution by about two orders of magnitude compared to the local contribution tuned by \(d_{p}\), Eq. (101). Beware though that, obviously, the same provisions about the implicit observation time constraints as in the lepton case do apply, since the \(g_{p}\) contribution does decouple if \(\partial _{\mu }a=0\), as is manifest in Eq. (111). Note, finally, that there is some model dependence in comparing Eq. (101) to Eq. (112). For instance, hadrophobic scenarios can be designed in which the axion couplings to the u and d quarks conspire to suppress \(g_{p}\) (see e.g. Ref. [82] and references there). Barring these possibilities though, Eq. (112) may represent our best window into the axion-light quark couplings, even compared to the axion wind operators, which are suppressed by the local galactic axion speed [14].
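The size of the enhancement quoted for the proton can be checked with the same unit conversions as for the electron. The sketch below assumes \(g_{p}=1\), a standard value for the proton mass, and uses the central value of Eq. (101b) for the local term; \(m_{a}\Lambda \) is again set to \(m_{\pi }^{2}\). All variable names are illustrative.

```python
import math

HBARC_GEV_CM = 1.973269804e-14   # hbar*c in GeV cm
M_P = 0.93827                    # proton mass in GeV (assumed standard value)
M_PI = 0.1349768                 # neutral pion mass in GeV (assumed standard value)
RHO_DM = 0.4                     # local dark matter density in GeV/cm^3, as in the text
D_P_LOCAL = 2.1e-16              # central value of Eq. (101b), in e cm per unit a/Lambda
G_P = 1.0                        # assumption: g_p of order one

# Coefficient of a/Lambda in Eq. (112): g_p * e / m_p, in e cm.
long_distance_prefactor = G_P * HBARC_GEV_CM / M_P

# a_0 / Lambda = sqrt(2 rho_DM) / (m_a Lambda), with m_a Lambda ~ m_pi^2.
a0_over_lambda = math.sqrt(2.0 * RHO_DM * HBARC_GEV_CM**3) / M_PI**2

dp_amplitude = long_distance_prefactor * a0_over_lambda
enhancement = long_distance_prefactor / D_P_LOCAL

print(f"Eq. (112) prefactor : {long_distance_prefactor:.2e} e cm")
print(f"Eq. (112) amplitude : {dp_amplitude:.2e} e cm")
print(f"ratio to Eq. (101b) : {enhancement:.0f}")
```

The ratio scales linearly with \(g_{p}\) and inherits the large uncertainty of Eq. (101b), so it should be read as an order of magnitude only.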

9. Neutron EDM. The purely long-distance enhancement mechanism at play for the proton is not available for the neutron since it is neutral, see Fig. 2 (as said earlier, this crucially relies on how the neutron magnetic dipole operator is introduced though). Instead, with only the \(a\bar{\psi }_{n}\sigma ^{\mu \nu }\gamma ^{5}F_{\mu \nu }\psi _{n}\) coupling, there will be an enhancement if the non-local quark-level matrix elements

$$\begin{aligned} \left\langle n\gamma \right| \frac{\bar{\psi }_{q}\gamma ^{\mu }\gamma _{5}\partial _{\mu }a\psi _{q}}{\Lambda }\left| n\right\rangle , \end{aligned}$$
(113)

can be significant away from the \(m_{a}=0\) limit, so that the enhancement identified at long-distance somehow spills over at short-distance. If that is the case, this violation would show up in \(d_{N}\), which should be understood to be a form-factor:

$$\begin{aligned} d_{N}(g_{q},g_{g})=d_{N}(g_{q},g_{g};q_{\gamma }^{2},q_{a}^{2},q_{a}\cdot q_{\gamma }). \end{aligned}$$
(114)

While we know that \(d_{N}(g_{q},g_{g};q_{\gamma }^{2},q_{a}^{2},q_{a}\cdot q_{\gamma })\rightarrow ~\) Eq. (101) when \(\partial _{\mu }a=0\), the behavior reaching that limit may not be that smooth if Eq. (113) does not go to zero sufficiently fast as \(\partial _{\mu }a\rightarrow 0\). Let us imagine that the proton and the neutron are simply collections of loosely bound non-relativistic constituent quarks. Then, the long-distance hadronic mechanism at play for the proton would have a direct counterpart as a non-local constituent quark mechanism (e.g. from the third diagram in Fig. 1). Both the proton and the neutron EDM would then be expected to reach Eq. (112) in the presence of “resonant” EM fields since, as explained in point 1 above, the neutron is not neutral for spin-dependent electric interactions. In practice, in this picture, one way to understand Eq. (111) would be from a term in \(d_{N}\) scaling like \(q_{a}^{2}/q_{a}\cdot q_{\gamma }=m_{a}/\omega \), vanishing in the \(m_{a}\rightarrow 0\) limit, but of \( \mathcal {O}(1)\) when \(\omega \approx m_{a}\) and \(m_{a}\) is not too small. Of course, this constituent quark picture is not particularly realistic, but in our opinion, it nevertheless suggests that some level of enhancement of the neutron EDM is possible. Indeed, the real world situation should lie somewhere between no enhancement, as expected looking at Fig. 2 with the neutron not interacting with photons, and a significant enhancement thanks to residual interplays between the axion and photon couplings to the quarks inside the neutron. Obviously, getting a definitive answer from first principles is complicated and probably requires detailed lattice simulations starting from the general Lagrangian of Eq. (103), away from the \(m_{a}^{2}=0\) and \(q_{\gamma }^{2}=0\) limit.
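The kinematic ratio \(q_{a}^{2}/q_{a}\cdot q_{\gamma }=m_{a}/\omega \) used in this argument holds for an axion at rest. The sketch below evaluates it with explicit four-vectors, treating the photon as on-shell with energy \(\omega \) and allowing for a small axion velocity. The metric signature (+,−,−,−), the on-shell photon, the sample velocity of \(10^{-3}\) and the function names are assumptions made for illustration.

```python
import math

def minkowski_dot(p, q):
    """Four-vector product with signature (+, -, -, -)."""
    return p[0] * q[0] - sum(a * b for a, b in zip(p[1:], q[1:]))

def form_factor_ratio(m_a, omega, velocity=0.0, cos_theta=1.0):
    """q_a^2 / (q_a . q_gamma) for an axion moving along z with speed `velocity`
    (in units of c) and an on-shell photon of energy omega at angle theta to z."""
    gamma = 1.0 / math.sqrt(1.0 - velocity**2)
    q_a = (gamma * m_a, 0.0, 0.0, gamma * m_a * velocity)
    sin_theta = math.sqrt(1.0 - cos_theta**2)
    q_gamma = (omega, omega * sin_theta, 0.0, omega * cos_theta)
    return minkowski_dot(q_a, q_a) / minkowski_dot(q_a, q_gamma)

M_A = 1e-6  # axion mass in eV (illustrative)

at_rest_resonant = form_factor_ratio(M_A, M_A)
at_rest_detuned = form_factor_ratio(M_A, 2.0 * M_A)
moving_resonant = form_factor_ratio(M_A, M_A, velocity=1e-3)

print("at rest, omega = m_a  :", at_rest_resonant)
print("at rest, omega = 2 m_a:", at_rest_detuned)
print("v = 1e-3, omega = m_a :", moving_resonant)
```

For a non-relativistic axion the ratio departs from \(m_{a}/\omega \) only at the level of the axion velocity, so the scaling argument is insensitive to the motion of the background.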

Fig. 1 Non-local partonic contributions to the nucleon local EDM operator of Eq. (100)

Fig. 2 Partonic contributions to the nucleon-axion and nucleon-photon interaction generate a non-local, long-distance EDM effect for the proton, at the hadronic level and in the non-relativistic limit. The axion-gluon and axion-quark contributions are related by a Ward identity, but their combination is imposed to ensure that the axion-nucleon interactions decouple entirely in the soft limit, \(\partial _\mu a \rightarrow 0\). This is justified since in that limit, a non-decoupling axion-nucleon coupling corresponds to a complex mass term, so it has to be removed by a chiral rotation of the nucleon field

To close this section, we stress once more that the above discussion does not immediately apply to nuclear or atomic probes of the axion-induced proton and neutron EDMs (assuming the axion-electron coupling is absent). The non-relativistic limit appears crucial to collapse the axion couplings to an EDM-like operator for the nucleons, which can then be enhanced with suitable EM fields. Further, estimating how an oscillatory external electric field can penetrate the nuclear, atomic, and/or even the molecular system, accounting in addition for the presence of resonances, and estimating the resulting observable EDM it would induce is beyond the scope of the present work [83,84,85,86].

5 Summary

In this paper, the non-relativistic description of the axion interactions with fermions was systematically analyzed. We relied on rather old and well-established techniques like the Foldy–Wouthuysen transformation [46], the unitary transformations of Ref. [40], and Schiff theorem [47]. Yet, as these techniques had not been fully combined and supplemented by the reparametrization invariance for the axion field, to our knowledge, none of the final non-relativistic expansions for the Hamiltonian presented here were derived before. Our results can be summarized in three points:

  • For a neutral fermion, we demonstrated by adapting Schiff theorem that the axioelectric operator \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{p}, \dot{a}\}\) is totally screened. As shown in the final Hamiltonian for this scenario, Eq. (59), there are only axion wind operators up to \( \mathcal {O}(1/m^{3})\), except for a very suppressed \(\dot{a}^{2}\) coupling. Since there should be no finite-size effects, and because \(\mathcal {O} (1/m^{3})\) relativistic corrections are of a different nature, this screening should even hold to a much higher level than the usual Schiff screening of charged fermion EDMs. Phenomenologically, this scenario is not very relevant since normal matter is essentially made of charged particles, but it provides the basis to understand the result in the charged case.

  • Specifically, for a charged fermion, the final Hamiltonian is in Eq. (72). The covariant axioelectric operator \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}\) is found equivalent to an axion-induced EDM operator \(a\varvec{\sigma }\cdot \textbf{E}\), see Eq. (74) and Table 1. Both operators encode the same physics and lead to the same observables, but different properties are manifest in each representation. The former clearly shows the decoupling of the axion background in the massless limit, \(m_a \rightarrow 0\), while the latter makes the decoupling manifest in the absence of EM fields, or for a neutral fermion. Phenomenologically, the usual axioelectric effect is recovered whatever the chosen form of the operator, both having the same matrix elements for observables. Besides the axioelectric effect, these operators can also induce EDMs for all charged particles. The important points are first that these EDM operators are, in some sense, tree-level. They are directly predicted by the Dirac equation itself for all charged fermions, in a way totally analogous to the magnetic moment factor of 2. Secondly, these EDMs are not constant in time, and cannot be screened since a Schiff transformation would simply change the relative weight of \(\gamma ^{5}\{\varvec{\gamma }\cdot \textbf{P},\dot{a}\}\) and \(a\varvec{\sigma }\cdot \textbf{E}\), something irrelevant since they lead to the same observables. Thus, though specific search strategies have to be designed to tackle the oscillatory nature of these EDMs as well as their decoupling in the \(m_a \rightarrow 0\) limit, their relatively large sizes, especially for the electron, make them particularly promising.

  • Finally, for the proton and the neutron, the final Hamiltonians are given in Eqs. (72) and (89). The main issue here is whether the axion-induced quark EDMs, which are intrinsically non-relativistic, can manifest themselves at the hadronic level. We find that this is the case for the proton, whose axion-induced EDM is significantly enhanced by long-distance effects compared to current estimates based solely on a local EDM operator induced by the axion-gluon coupling (see Fig. 2). For the neutron, if treated as point-like in a first approximation, the axion-induced EDM coupling coming from the axion-quark couplings vanishes exactly, since the purely long-distance hadronic contribution is absent. Beyond leading order, some effects are likely since the neutron is not transparent to quark EDM interactions, but further work is needed to estimate these finite-size effects and establish whether they can compete with the axionic EDM coming from the axion-gluon coupling.
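
As a rough numerical illustration of the oscillatory character of these EDMs, one may take the classical background field of the Introduction in the cold limit \(|\textbf{p}|\ll m_{a}\), with \(m_{a}a_{0}=\sqrt{2\rho _{DM}}\) and \(\rho _{DM}\approx 0.4\ \)GeV/cm\(^{3}\). The benchmark mass \(m_{a}=1\ \mu \)eV below is an assumption chosen purely for illustration, not a value singled out by our analysis, and the conversion to natural units uses \(\hbar c\approx 1.973\times 10^{-14}\ \)GeV cm:

\[
\begin{aligned}
a(t) &\simeq a_{0}\cos (m_{a}t), \qquad \rho _{DM}\approx 0.4\ \text{GeV/cm}^{3}\times (\hbar c)^{3}\approx 3.1\times 10^{-42}\ \text{GeV}^{4},\\
a_{0} &= \frac{\sqrt{2\rho _{DM}}}{m_{a}}\approx 2.5\times 10^{-6}\ \text{GeV}\times \left( \frac{1\ \mu \text{eV}}{m_{a}}\right) ,\\
\nu &= \frac{m_{a}}{2\pi }\approx 242\ \text{MHz}\times \left( \frac{m_{a}}{1\ \mu \text{eV}}\right) .
\end{aligned}
\]

Any operator linear in \(a\), such as \(a\varvec{\sigma }\cdot \textbf{E}\), would then oscillate at the frequency \(\nu \). The overall size of the resulting EDM depends in addition on the axion-fermion coupling entering Eq. (74), which is not fixed by this estimate.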

All these results clarify the construction of non-relativistic expansions in the presence of Goldstone bosons. Yet, to conclude, we would like to stress again that this formalism, in itself, has some limitations. For instance, our starting point was the Dirac equation for a single fermion in the presence of external background fields, electromagnetic and axionic. In our opinion, further work is urgently needed to obtain estimates for realistic experimental settings, in particular in the atomic or nuclear contexts (or even for the neutron–antineutron system [87]). Thus, extending the formalism itself, or even grounding it within a fully relativistic quantum field theory setting, would be very welcome, not least to confirm the promising phenomenological opportunities we identified for the detection of dark matter axions.

5.1 Note added

While this paper was under review, Ref. [65] appeared. There, the authors confirm the presence of the axionic EDM operator in some representations of the non-relativistic expansion of the Hamiltonian, using the elimination method instead of Foldy–Wouthuysen transformations. They also elucidate the important role that boundary terms play in time-dependent perturbation theory in maintaining the decoupling of the axionic EDM in the \(m_a \rightarrow 0\) limit.