Nucleon axial structure from lattice QCD

We present a new analysis method that allows one to understand and model excited state contributions in observables that are dominated by a pion pole. We apply this method to extract axial and (induced) pseudoscalar nucleon isovector form factors, which satisfy the constraints due to the partial conservation of the axial current up to expected discretization effects. Effective field theory predicts that the leading contribution to the (induced) pseudoscalar form factor originates from an exchange of a virtual pion, and thus exhibits pion pole dominance. Using our new method, we can recover this behavior directly from lattice data. The numerical analysis is based on a large set of ensembles generated by the CLS effort, including physical pion masses, large volumes, and lattice spacings down to $0.039\text{fm}$, which allows us to take all relevant limits. We find that some observables are much more sensitive to the choice of parametrization for the form factors than others. On the one hand, the $z$-expansion leads to significantly smaller values for the axial dipole mass than the dipole ansatz ($M_A^{\text{$z$-exp}}=1.01(9)\text{GeV}$ versus $M_A^{\text{dipole}}=1.30(7)\text{GeV}$). On the other hand, we find that the result for the induced pseudoscalar coupling at the muon capture point is almost independent of the choice of parametrization ($g_P^{\star \ \text{$z$-exp}}=8.69(49)$ and $g_P^{\star \ \text{dipole}}=8.32(18)$), and is in good agreement with both, chiral perturbation theory predictions and experimental measurement via ordinary muon capture.

From the theoretical side, one can gain insight into the form factors through various techniques. At small momentum transfer, chiral perturbation theory (ChPT) yields valueable low energy theorems [8,[22][23][24][25] (motivating, e.g., the above mentioned PPD ansatz), while, at intermediate and large virtualities, the form factors can be determined (up to some systematic uncertainty of ∼ 15%) using light-cone sum rules [26,27]. Another interesting approach is the application of functional renormalization group methods [28].
In this work, we will use lattice QCD, which enables a determination of hadronic observables from first principles. Once all systematic uncertainties are under control, this method provides the cleanest and most direct access to hadron form factors. Many studies of nucleon couplings and form factors have been carried out in the past using a wide variety of lattice actions and analysis methods (see, e.g., refs. ). Recent studies of form factors at finite virtualities with data close to physical pion masses have faced two problems: first of all, it is difficult to reconcile the data with the partial conservation of the axial current (PCAC). Even though PCAC is approximately fulfilled on the correlation function level, the corresponding relation between the ground state form factors, which are extracted using a spectral decomposition, is broken to a much larger extent. Secondly, the PPD ansatz for the induced pseudoscalar form factor fails to describe the data at small momentum transfer and small pion masses, which is the domain where one would expect this ansatz to give the best approximation. In both cases an explanation in terms of finite lattice spacing effects is unlikely, since the violation of PCAC is largest at small virtualities and masses. For nice presentations of these problems see, e.g., refs. [55,58]. A prime suspect that may be responsible for both effects is a particularly large excited state contamination, albeit it was demonstrated in ref. [57] that the problem persist if one uses a traditional fit ansatz with up to three free excited states. In ref. [60] we have proposed a subtraction method that removes excited state contributions that violate the equations of motion for the nucleon. While this leads to a recovery of the PCAC relation on the form factor level, the PPD ansatz still remained strongly broken. While this is not impossible as such, the induced pseudoscalar charge at the muon capture point remained at variance with the experimental value [15,[18][19][20][21].
A deeper insight into the excited state effects is possible using effective field theory (EFT). ChPT based analyses [64][65][66] (along the lines of refs. [67][68][69][70] using interpolating currents from ref. [71]) indicate that the subtraction method mentioned above does not remove all excited states and that the violation of the PPD ansatz is due to additional, large contributions from N π exited states that predominantly affect the induced pseudoscalar and the pseudoscalar form factors. While an a posteriori subtraction of the effect performed in refs. [64,65] leads to satisfying results, such a procedure appears inadequate from the lattice QCD perspective as it introduces a dependence on ChPT input parameters and cannot be consistently combined with standard excited state fits. Moreover, an independence of the operator smearing is assumed, which may not be justified in lattice simulations where spatially extended (smeared) nucleon interpolators with radii that are not very small compared to the inverse pion mass are employed.
The procedure advocated in this article makes use of the same effective field theory methods as refs. [64][65][66]72] in order to calculate the leading excited state contribution to the correlation function explicitly for all axial and pseudoscalar channels with less assumptions. We will show that exploiting the EFT knowledge stabilizes excited state fits considerably and allows us to extract ground state form factors, which are found to obey both the PCAC relation on the form factor level and the PPD ansatz reasonably well. Recently an alternative analysis method has been proposed in ref. [73], which also allows an extraction of nucleon form factors that satisfy PCAC. We will discuss similarities and differences between our approach and this method in sections 3.3 and 4.1.
This article is structured as follows. In section 2 we will give a detailed description of the EFT calculation needed to determine the leading N π contribution to the correlation function and how it can be combined with the usual excited state analysis. The lattice setup and the employed ratios of correlation functions are detailed in section 3. Section 4 contains the results for the form factors (using both, dipole fits and the z-expansion) and includes an analysis of the PCAC relation as well as of the PPD ansatz. We also explore parametrizations that are consistent with PCAC in the continuum. In the latter case the continuum limit is under much better control. We summarize our findings in section 5.

Definitions
In order to study hadron structure using lattice QCD one has to calculate two-and threepoint correlation functions, where hadron states with matching quantum numbers are created by a suitable interpolating currentN at the source time t src , and are destroyed by N at the sink time t snk (here, we will always set t src = 0 and t snk = t without loss of generality). In the case of three-point correlation functions one inserts a local current O at some insertion time τ with t > τ > 0 and, usually, one is interested in the ground state matrix element of this current insertion. The momenta can be fixed by appropriate Fourier transforms, in our case at the sink and the insertion, such that the initial state and final state momenta are p and p , respectively: x,y e −ip · x+i(p −p) · y Γ αβ N β (x, t)O(y, τ )N α (0, 0) , , P + , and Γ are matrices in Dirac space with the corresponding spin indices α and β. The three-quark nucleon interpolating current is defined via the usual quark-diquark structure with the charge conjugation matrix C, N α (x, t) = u(x, t) T Cγ 5 d(x, t) u α (x, t) , (2.3) where each quark is smeared separately in the spatial directions using Wuppertal smearing [74] on spatially APE-smoothed links [75]. Note that Minkowski scalar products and gamma matrix conventions are used throughout this work. At zero three-momentum P + = (1+γ 0 )/2 annihilates the leading negative parity contribution. For the analysis of the pseudoscalar and axialvector form factors we choose Γ to be P i + = P + γ i γ 5 , i = 1, 2, 3. In order to relate the correlation functions to matrix elements, one inserts identity operators (corresponding to sums over all hadronic states) and uses the translational properties of the currents to carry out the Fourier transforms. When evaluating the result at large Euclidean times, t, τ , and t−τ , excited states are exponentially suppressed and the correlation functions can be approximated by the ground state contributions: where all currents are located at the origin and |N p σ corresponds to a nucleon state with three-momentum p and spin-projection σ. The parity projected overlap matrix elements can be parametrized as where u β p,σ is a nucleon spinor and √ Z ± p are momentum-and smearing-dependent overlap factors. For smeared currents √ Z + p and √ Z − p can differ from each other due to the explicit breaking of Lorentz invariance by the operator smearing, cf. refs. [76] and [77] for more details. Since in our analysis only positive parity projected overlap matrix elements occur, we define Z p ≡ √ Z + p . The form factor decomposition for the nucleon-nucleon matrix element of a generic current can be written as where J[O] is matrix valued and can be parametrized in terms of form factors, cf. eqs. (2.12) and (2.13) below. Using the spinor identity σ u p,σūp,σ = / p + m, where m is the nucleon mass, one arrives at the ground state contribution For the two-point function we can explicitly evaluate the trace with P + to find (2.10) For the three-point functions the trace with Γ depends on the current-specific decomposition (2.7). In practice it turns out (in particular in case of the three-point functions) that, at Euclidean time distances t and τ with acceptable signal-to-noise ratio, not only the ground state contributes. In most cases this problem can be treated by taking into account generic excited state contributions in the fit functions (see section 2.3). However, there are situations in which the excited states constitute (at the available temporal distances) the dominant contribution. In the latter case the generic excited state parametrizations fail to describe the data appropriately and further physical insight into the excited state structure is needed, cf. section 2.2. For the isovector pseudoscalar and axialvector currents used in this work, the explicit decompositions in terms of the pseudoscalar form factor, G P (Q 2 ), as well as the axial and induced pseudoscalar form factors, G A (Q 2 ) and GP (Q 2 ), are where q = p − p is the momentum transfer and Q 2 = −q 2 is the virtuality. The three form factors used above are not independent in the continuum theory, since the axial Ward identity yields ∂ µ A µ = 2i m P known as partial conservation of the axialvector current (PCAC). Here m is the light quark mass. On the lattice this relation can be broken by discretization effects. For the nucleon matrix elements it implies that where we can safely ignore discretization effects linear in the lattice spacing a, since our analysis is fully order a improved, cf. section 3.1. Using the definitions (2.12) and (2.13) together with the equations of motion one can deduce the corresponding relation for the form factors (called PCAC FF in ref. [60]): Eqs. (2.14) and (2.15) should both be satisfied, once the ground state matrix elements have been extracted reliably.

EFT-based analysis
Employing a theory where hadrons are the effective degrees of freedom (like baryon chiral perturbation theory) in order to elucidate the excited state structure in correlation functions is appealing, in particular if multi-hadron states with additional pions are the relevant excitations, see refs. [67][68][69][70]78]. In many cases, however, these contributions are relatively small and one can deal with them using standard methods like, e.g., source/sink-smearing and multi-exponential fits that allow for generic excited state contributions. As will be explained in detail in this section, the situation is different in the context of isovector axial and pseudoscalar form factors, where N π contributions can actually be a leading term from the EFT point of view due to pion pole dominance (PPD). Especially for small pion masses this effect outweighs the exponential suppression at the currently available sourcesink distances due to the small energy gap. 1 In this situation multi-exponential fits with generic excited states become very unstable and usually fail to resolve the ground state. 2 In refs. [65,66] nucleon three-point functions with axialvector and pseudoscalar current insertions have been analyzed using ChPT and compelling qualitative evidence has been presented that the violations of the PCAC and PPD relations are indeed caused by N π excited states. This is done as follows: first, one calculates the excited state contribution to the form factor using ChPT. The predicted, excited state contaminated form factor is found to agree quite well with recent data from the PACS collaboration [58], cf. refs. [66,72]. In a second step, one may attempt to correct the error by subtracting the calculated excited state contaminations a posteriori (see, e.g., refs. [64,65], where such a subtraction has been performed for the induced pseudoscalar form factor). While this method yields convincing qualitative results, there are some open questions and limitations that need to be addressed: 1. The EFT calculation disregards the effect of operator smearing. There are heuristic arguments that the effect of the smearing should be negligible as long as the smearing radii r sm are much smaller than the Compton wavelength of the pion λ π ≈ 1.41 fm, cf. refs. [67,68,70,78]. However, these arguments are to some extent called into question by the observation that the operator smearing used in actual simulations has, in general, a strong impact on the signal of excited states. In refs. [38,40] it has been found that smearing radii of roughly r sm ∼ 0.5 fm maximize the ground state overlap. In the lattice analysis performed in this article, the optimized smearing radii are on some ensembles even larger (up to 0.8 fm, cf. table 2), and it is questionable whether a dependence on the smearing can be completely excluded for such smearing radii.
2. So far, an a posteriori subtraction of the excited states has only been performed in combination with the ratio method on the lattice. It is unclear how one would avoid double counting, if one combines it with a standard excited state analysis, e.g., by using multi-exponential fits.
3. Estimating the systematic error tied to the ChPT based subtraction is challenging.
From a lattice QCD perspective the situation is in our opinion quite clear concerning point 2. If there is a large N π excited state contribution, then it should be taken into account explicitly in the multi-state fits to the correlation functions. 3 In this approach point 1 can be addressed simultaneously by allowing for a smearing dependence of the N π 1 Note that, due to the exponential deterioration of the signal, one cannot expect the source-sink distances to become dramatically larger in future simulations. 2 An alternative method has been proposed in ref. [73], which appears to resolve the ground state contribution in this situation. We will comment on this method in some detail in sections 3.3 and 4.1.
3 One can also try to circumvent the problem entirely by either suppressing or subtracting the unwanted excited state contributions. In ref. [79] the pion pole contribution is suppressed by analyzing the matrix elements of currents with a Gaussian profile instead of local currents. Ref. [60] presents a method to subtract some of the excited state contributions. The squares correspond to explicitly inserted operators: the right and left ones correspond to smeared three-quark baryon interpolating currents at the source (at time 0) and the sink (at time t), respectively, while the ones in the middle depict a pseudoscalar or an axialvector operator insertion (at time τ ). The circles correspond to pion-nucleon interaction vertices, while the dashed and solid lines represent pion and nucleon propagators, respectively. The dotted red vertical lines indicate the sums over hadronic states one usually introduces to interpret correlation functions.
coupling to the interpolating currents. Furthermore, we can avoid systematic uncertainties (point 3) by relaxing ChPT constraints. In the following, we will describe in detail how this can be achieved.
The first and second rows of figure 1 show the tree-level Feynman diagrams that contribute to the correlation functions. As discussed in ref. [65], these yield the most important contribution to the correlation function. The squares on the right and left depict the smeared source and sink currents, while the one in the middle corresponds to the inserted local quark bilinears (axialvector or pseudoscalar currents in our case). The dashed and solid lines depict pion and nucleon propagators, while the circle stands for a pion-nucleon interaction vertex. The dotted red lines are for illustration only and indicate the identity operators (i.e., the sum over all hadronic states) that are usually inserted between source and current as well as between current and sink, cf. eq. (2.5). This elucidates that the diagram in the first row yields a contribution to the ground state, while the diagrams on the left-and right-hand sides in the second row give rise to a nucleon-pion excitation in the final and initial state, respectively. For the diagram in the middle of the second row, however, the situation is not that simple, since the nucleon-pion interaction is not restricted to a specific time-slice. As a consequence, the diagram contributes to both the ground state and the excited states, as shown in the bottom row of figure 1. This follows from an explicit calculation of the diagrams (see below). We emphasize that there is no one-to-one correspondence between the individual contributions in the spectral decomposition and the diagrams. For example, both the diagram in the first row and the diagram in the middle of the second row contribute to the ground state and, actually, an infinite number of diagrams will contribute to each state if one takes into account higher orders in ChPT (see ref. [65] for a list of one-loop diagrams). Finally, a single diagram can contribute to multiple states in the spectral decomposition, cf. the bottom row of figure 1. We will exploit the fact that the pion pole contribution to the ground state automatically gives rise to an associated excited state.
Before addressing the details, let us note that the following calculation is in large parts already contained in refs. [65,66], where also one-loop diagrams are taken into account. Also the presentation in ref. [79] is based on similar considerations (cf. also ref. [80]). However, we will present the result in a more general way (without using a particular spin projection or fixing initial and final state momenta to a predefined configuration) such that it can be used in a variety of simulation setups. The first ingredient we need in order to evaluate the diagrams in figure 1 are the corresponding Feynman rules. Here we follow the conventions of ref. [81], but adapt them to our choices for the currents (see eq. (2.11)) and convert them to position space. We work in two-flavor baryon ChPT here. However, since we only consider the nucleon sector and are only working at tree-level accuracy, a three-flavor calculation would give exactly the same result. Note that in this section all time variables are in Minkowski time and will be rotated to imaginary times only at the very end. The pion and nucleon propagators read For the vertices of the current insertions we have where we only take into account the leading contribution in the chiral counting and all derivatives are understood to act on the pion propagator. Here, F π and g A correspond to the pion decay constant and the axial coupling in the chiral limit, respectively, while B is the condensate parameter and σ a are Pauli matrices. For the leading N π interaction vertex we have The vertices for local three-quark currents have been derived in ref. [71]. We adapt these to the smeared interpolating currents used here by allowing for momentum-and smearingdependent couplings. With the nucleon isospinor Ψ N , where Ψ p = (1, 0) T and Ψ n = (0, 1) T , the leading order vertices read where one can actually assume Z p = Z p (p 2 ) and Z p,q = Z p,q (p 2 , p · q, q 2 ) up to lattice artifacts (obviously, the couplings will also depend on the masses, the smearing method and the smearing radii). We will use Z = Z p , Z = Z p ,Z =Z p ,q , andZ =Z p,q as shorthand notations. In the following we always consider protons, i.e.,Ψ p σ 3 Ψ p = 1. We will not assume Z p,q = Z p + higher order , Z p ,q = Z p + higher order , (2.25) which should hold at least approximately for small smearing radii, as discussed above.
Instead, we will test the validity of this assumption by comparing it to our data, cf. figure 5 in section 3.2. We complete the setup with the definition of the following energies and fourmomenta . (2.27) We will now consider one example for each type of diagram in figure 1 with an axialvector current insertion, starting with the purely nucleonic diagram (in the first row of figure 1). Defining the four-vectors x = (t, x), y = (τ, y) and the energies E i = q 0 i , we obtain In the first step, one integrates over the positions which gives delta distributions in momentum space, which in turn eliminate the integrals over the three-momenta from the propagators. Then, we close both integration contours in the lower half of the complex plane and use Cauchy's residue theorem twice. Rotating to imaginary times (t → −it and τ → −iτ ) one obtains the axial part of eq. (2.9) to zeroth order accuracy in ChPT, exactly as expected. Next, we consider the left diagram in the second row of figure 1, where the current insertion couples to a pion that directly connects to the sink, while the nucleon propagates directly from source to sink. We find where we have introduced the notation E 2 q µ , etc., to list the components of a 4-vector.
The pion carries the three-momentum q, while the nucleon propagates with momentum p.
As in the first diagram, the integrals over the energies can be calculated independently. The diagram yields an N π excitation in the final state with the energy E + E π . In general this will not be the excited state with the smallest possible energy. For example, if the final state momentum p is fixed to zero, the excited state with the smallest possible energy gap would be the one where both hadron momenta vanish, and E + E π = m 2 + p 2 + m 2 π + q 2 ≥ m + m π . For the diagram where the pion propagates from the source to the insertion (cf. the right diagram in the second row of figure 1) one obtains, carrying out an analogous calculation, which yields an N π excitation in the initial state. Finally, the diagram where the nucleon-pion interaction happens dynamically (the middle diagram in the second row of figure 1) gives . (2.31) In this case, where the virtual pion has the three-momentum q and the energy E 2 − E 1 , the remaining integrations over E 1 and E 2 are not independent of each other. We will perform them consecutively starting with E 1 . Similarly to the procedure for the other diagrams, both integration contours can be closed in the lower half of the complex plane. There, the integrand has two single poles, which collapse to a double pole, if E 2 = E − E π . The latter case has to be treated separately. The result after the first integration is where, for . (2.33) is finite, which is the only relevant information since it means that there is no pole at this point when using the residue theorem for E 2 later on. Thus, one finds that f (E 2 ) has three poles in the lower half of the complex plane. The first term in eq. (2.33) has two single poles, while the second term in eq. (2.33) has only one single pole. Its second, seeming pole is at E 2 = E − E π , where eq. (2.33) is not evaluated. One obtains three contributions that correspond to the diagrams in the bottom row of figure 1: where we have written the result in terms of the four-vectors defined in eqs. (2.27). The first term yields a contribution to the ground state. It is responsible for the leading, pole dominant contribution to the induced pseudoscalar form factor. The second and the third term contribute to the same N π excitations in the final and initial states as those in eqs. (2.29) and (2.30), respectively. This concludes our calculation of the tree-level diagrams shown in figure 1 for the axialvector current insertion. For the pseudoscalar current the calculation is analogous and we will not repeat it here. By matching the result obtained for the ground state with the usual form factor decompositions (using eq. (2.9) in combination with eqs. (2.12) and (2.13) after rotating to Euclidean times) one finds We emphasize that we will not enforce these results for the ground state contribution. In eq. (2.35) this corresponds to augmenting the axial coupling in the chiral limit to the full axial form factor, which is justified at leading order accuracy. In the same spirit, we have already tacitly used the actual nucleon mass in the propagator instead of its chiral limit value, which is also correct to leading order accuracy in ChPT. It is consistent to perform the same replacement g A → G A in the complete calculation. (We will show that this choice is in much better agreement with the data at nonzero Q 2 , cf. section 3.2 and, in particular, figure 5.) After doing so, eq. (2.36) yields the PPD assumption for the induced pseudoscalar form factor, as expected.
It turns out to be convenient to define the ratios where a = a = 1 would correspond to the assumption that the smearing does not affect the overlap of the interpolating currents with the N π excited states (compared to the ground state). Note that in general a and a are functions of the momenta. Putting everything together and rotating to Euclidean time (t → −it and τ → −iτ ) we find 41) and the dots represent additional excited state contributions. These results can be used for all momentum configurations and with arbitrary spin projections. After taking the trace with the specific matrices P i + that we use here, the result can be further simplified, see below.

Spectral decomposition
In this section we will provide the explicit expressions for the correlation functions that are used in our analysis, including our parametrization of additional generic excited states. For the latter we will assume that they occur with the same energies in both, two-and three-point functions. State-of-the-art lattice analyses of form factors take into account up to three excited states in the two-point and up to two excited states in the threepoint functions, see, e.g., ref. [82]. Whether this is necessary depends on the available statistics and on the applied source/sink smearing. In our simulation a relatively large number of smearing steps was performed, leading to large smearing radii, cf. table 2. In this situation, we find it sufficient to add only one generic excited state to the two-and three-point correlators on top of the pion pole enhanced state that we have calculated in the last section. Including the additional generic excited state term, we obtain for the two-point function In the following we will abbreviate ∆E = ∆E p and ∆E = ∆E p . Note that we do not assume any dispersion relation for the excited state energies, nor do we assume that these are single hadron states. Instead, we treat them as free fit parameters. We define the trace occurring in the ground state contribution to the three-point function as The explicit results can be found in appendix A, together with the remaining traces needed to evaluate eqs. (2.39) and (2.40). For the three-point functions we obtain the parametrization where we have suppressed the dependence of the excited state parameters on the momenta, the spin-projection and the current insertion: (2.47) Equations (2.46) and (2.47) are only valid up to higher order corrections in ChPT. For instance, one could replace G A by (Q 2 + m 2 π )GP /(4m 2 ) or by (Q 2 + m 2 π )m G P /(mm 2 π ) in the N π excited state contributions (cf. eqs. (2.35), (2.36) and (2.37)) and the result would still be valid at leading order. From a plain vanilla ChPT power-counting point of view one could even replace G A by g A . Therefore, in anticipation of possible higher order corrections, we may relax the assumptions even further by using c, c , d, and d as free fit parameters, which minimizes the ChPT bias. This has the additional advantage, that it does not allow the excited state signal to have a direct influence on the result for the ground state form factors. Naturally, one has to pay for the increased number of fit parameters with a slightly larger statistical error for the ground state result -a small price considering that one gets rid of one entire source of systematic uncertainty. In section 3.2 we will assess the validity of the ChPT predictions by comparing them to the results obtained from the fits. In particular we will be able to check whether the data is consistent with the parameter-free ChPT prediction for d and whether the direct coupling of the smeared three-quark interpolating currents to the N π state differs from the leading order ChPT prediction calculated for local currents. Note the elegance of the parametrization given in eqs. (2.44) and (2.45). Even after relaxing the conditions (2.46) and (2.47), it encodes the relative strength of the N π excited state contribution in the different channels. The importance of this knowledge must not be underestimated. For instance, one can directly see that any determination of the axial form factor using solely the A 1 , A 2 , and A 3 channels is not affected by these excited states at all.
Finally, let us note that for the kinematics we use in the numerical analysis, setting the final state momentum to zero, p = 0, such that p = −q (this setup is used in many lattice simulations), the parametrization becomes even simpler since one can replace c p i +d q i = e q i (with e = d −c ) and c p i +d q i = dq i . In this kinematic situation, the N π excited state energy corresponds to E N (0)+E π (−q) in the initial state, and E N (p)+E π (q) in the final state.

Lattice setup
In order to determine the axial and (induced) pseudoscalar form factors using the correlation functions described in section 2, we have analyzed a large set of lattice ensembles generated within the CLS effort [83]. 4 The ensembles have been generated using a treelevel Symanzik improved gauge action and N f = 2 + 1 flavors of nonperturbatively order a improved Wilson (clover) fermions. An efficient and stable hybrid Monte Carlo sampling is achieved by applying twisted-mass determinant reweighting [85], which avoids near-zero modes of the Wilson Dirac operator. The individual quarks in the nucleon interpolators at the source and the sink are Wuppertal-smeared [74], employing spatially APE-smoothed [75] gauge links.
Some of the CLS ensembles (cf. table 2 for a full list of the ensembles used in this work) have been simulated employing very fine lattices down to a = 0.039 fm. For these  lattices we avoid large autocorrelation times by using open boundary conditions in the time direction [85,86]. The latter allow the topological charge to flow into and out of the simulation volume through the temporal boundaries and thus topological freezing is avoided. While employing open boundary conditions is crucial for fine lattice spacings, we use lattices with both open and periodic boundary conditions for the coarser spacings. In total we have five different lattice spacings ranging from a = 0.039 fm to a = 0.086 fm, see table 1.
As illustrated in figure 2, the available ensembles have been generated along three different trajectories in the quark mass plane. 5 This strategy is explained in ref. [88]. The ensembles cover a range of volumes with 3.5 ≤ m π L ≤ 6.4 allowing us to investigate and control finite volume effects. The majority of the ensembles has m π L > 4. Having multiple quark mass trajectories with a wide range of lattice spacings and volumes enables us to simultaneously extrapolate to physical masses, to infinite volume, and to the continuum limit by means of a global fit to 37 ensembles. Our extrapolation strategy is explained in detail in section 4.2.  Table 2.
List of the ensembles used in this work, labeled by their identifier and sorted by the inverse coupling β and the pion masses. We specify the geometries N t × N 3 s as well as the boundary condition in time (periodic (p) or open (o)). The light and strange hopping parameters κ and κ s used in the simulation and the resulting approximate meson masses are given in MeV. Where available, we also provide the root mean squared smearing radii r sm for the light quark sources in fm. #conf. gives the number of configurations analyzed. The column t/a lists the source-sink distances in lattice units that have been analyzed on this lattice. The subscript #meas. specifies how many measurements have been performed for the respective source-sink distance. In physical units these distances roughly correspond to 0.7 fm, 0.9 fm, 1.0 fm, and 1.2 fm. The last column specifies on which trajectories in the quark mass plane the ensemble lies, cf. figure 2. An in-depth description of the ensemble generation can be found in ref. [83]. Note that ensemble D201 was only used for the test with nonzero final momentum shown in figure 7. The local axial and pseudoscalar currents in our calculation have to be renormalized. We use the renormalization factors Z A from ref. [89] (as recommended in this reference, we use the values Z l A,sub from their table 7), which have been determined using a new method based on the chirally rotated Schrödinger functional [90]. In addition, we use the nonperturbative quark mass-dependent order a improvement coefficients described in ref. [91] (but with updated values from ref. [92]). The isovector currents are multiplicatively renormalized using where m imp is the PCAC light quark mass obtained from improved currents, The bare quark mass is calculated using the hopping parameter κ q (cf. table 2) and its critical value κ crit [93].
We exploit the fact that the product of quark mass and pseudoscalar current renormalizes in exactly the same way as the axialvector current.b A has been found to be zero within errors smaller than 0.1 [92]. This corresponds to shifts of at most 4 , depending on the ensemble, that decrease towards the continuum limit. We neglect this effect, which is small compared to the other sources of error, and proceed with continuum limit extrapolations that are quadratic in a. Within the Symanzik improvement program [94,95] also the currents themselves have to be O(a)-improved. For the axialvector current this yields where we use the improvement coefficient c A , nonperturbatively determined in ref. [96], and ∂ µ denotes the symmetrically discretized derivative. For the pseudoscalar current P imp = P.

Fits to the correlation functions
To calculate the three-point functions (2.2) one has to evaluate all possible contractions. Disconnected diagrams do not contribute in our case, since we only consider isovector currents. The connected diagrams can be evaluated using sequential sources [97]. Since each sink momentum requires new inversions, we restrict the numerical analysis to the case in which the final state three-momentum is set to zero (p = 0). Note, however, that the parametrizations provided in section 2.3 are applicable to all possible kinematic situations.
On each ensemble we have analyzed 4 source-sink separations that have been chosen such that they correspond roughly to the physical distances 0.7 fm, 0.9 fm, 1.0 fm, and 1.2 fm. The source-sink distance in lattice units and the corresponding number of measurement per configuration are specified in table 2. On some ensembles we have reduced the computational cost by applying the coherent sink technique [33], where one inverts on multiple, temporally separated sequential sources simultaneously. For the statistical analysis we generate 500 bootstrap samples per ensemble using a bin size of 20 molecular dynamics units to eliminate autocorrelations. The nucleon energies determined from fits to two-point functions using a spectral decomposition with one generic excited state (2.42) agree with the continuum dispersion relation, see figure 3. With this justification, we employ the continuum dispersion relation for single nucleon energies in the subsequent analysis. The nucleon isovector form factors are obtained by a simultaneous fit to two-point functions and to the ratio using the parametrizations given in section 2.3. In the literature also the ratio is found, which is constructed such that the overlap factors drop out and the ground state contribution is time-independent. This is not the case for the ratio (3.6), where the ground state contribution is ∝ e −(E−E )τ . Nevertheless, we find it to be advantageous for various reasons: 1. It allows for a maximal cancellation of correlations, since the interpolating currents at the source and the sink occur at exactly the same spacetime positions with exactly the same phase factors in two-and three-point functions, cf. eqs. (2.1) and (2.2).
2. In contrast to eq. (3.7) it does not introduce additional excited states from two-point functions at small separations τ or t − τ .
3. One avoids a technical problem of eq. (3.7): in the course of the error analysis one can encounter negative values for single bootstrap samples due to statistical fluctuations such that the argument of the square root is negative.
Results of the simultaneous fits using the ratio (3.6) are shown in figure 4, where we have selected cases in which the effect due to the pion pole enhanced excited states is large, i.e., ensembles with small pion masses at small (but nonzero) momentum transfer. Note that for our kinematics the parametrization (2.44) and (2.45) only includes two additional fit parameters (d and e ) in addition to the usual excited state parametrization. These two parameters describe the N π related excited state contributions for the axialvector and pseudoscalar channels simultaneously, for all spin-projections. That this is even possible strongly indicates that the results given in section 2.3 are a very good approximation of the underlying physics.
In order to take into account systematic uncertainties of our excited state analysis, we perform a fit range variation, where the minimal distance between the operators is varied between 2a and 4a in the ratios, and between 2a and 3a in the two-point functions. In figures 4 and 6 full circles (dots) correspond to data points that are always (never) part of the fitted window, while the open symbols indicate data points that are used only in some of the fits. The error bands of the extracted ground state contributions contain both the statistical error and the error related to the choice of the fit range.
In figure 4 the yellow bands correspond to the ground state contributions extracted from the EFT-inspired ansatz for the three-point function (eqs. (2.44) and (2.45)), while the gray band is the ground state signal obtained from a traditional multistate fit ansatz (also using eqs. (2.44) and (2.45), but without the explicit N π contribution, i.e., setting c = c = d = d = 0). As one can see, the ground state contribution can be disentangled from the huge signal of the N π state (which fails to be resolved using the traditional ansatz with generic excited state contributions). Here, it is particularly advantageous that the coefficients of the N π contributions are constrained for various channels and spin projections in our fit. This results in a crosstalk between the different rows in figure 4 and allows us to pin down the corresponding excited state parameters very precisely. To this end, the seemingly strange linear behavior in A 0 (i.e., row 3 in figure 4, where the spin projection is aligned with the momentum) is actually helpful and it is noteworthy that the data can be described almost perfectly by our fit ansatz. The ratio shown in the top panels  Figure 5. The plot on the left shows how well the parameter free tree-level ChPT prediction d ChPT (circles; see eq. (2.47)) describes the data obtained from the fit (d fit ). As anticipated in section 2.2, the estimate using g A instead of G A (crosses) is not as good. This simply means that at nonzero momentum transfer the coupling of the pion to the nucleon is given by G A (Q 2 ) instead of g A , as expected. In the plot on the right we show a (cf. eq. (2.38)) obtained from our fit to the data. A value of a = 1 would imply that the leading order ChPT estimate for the coupling of N π to the three-quark operators is exact and that the operator smearing does not affect the coupling at all. As one can see, the data is not very sensitive to the value of a . We do not see any significant momentum dependence and no strong smearing effect.
(which is sensitive to G A but independent of GP ) is not affected by the N π excited state contribution at all. This explains why past lattice determinations of the axial form factor using this channel in combination with traditional excited state fits were able to extract a correct, reliable result. The ansatz including the N π excited states explicitly allows for a much better description of the data. In the case of D200 for instance, fits using block-correlated covariance matrices yield χ 2 /d.o.f. ≈ 1.05 (including N π) versus χ 2 /d.o.f. ≈ 5.59 (excluding N π). Note, however, that we have decided to use uncorrelated fits to extract the results. This avoids instabilities in the covariance matrix and prevents an underestimation of the statistical errors.
We find that almost the complete excited state contamination can be attributed to this N π state, and that there are only very mild additional contributions in the final state (where p = 0). Nevertheless, we refrain from removing the additional generic excited states from the parametrization, in order to exclude an underestimation of the error in the extracted ground state contribution. Actually, one can also obtain a very good description of the data with even smaller statistical errors if one would use the ChPT-biased parametrizations discussed in section 2.2, which may indicate that possible higher order corrections are small. Nevertheless, the latter would entail a systematic uncertainty that we intend to avoid.
However, we can confront the results of our fits with the corresponding ChPT prediction, see figure 5. In particular for the parameters d and d in eqs. (2.44) and (2.45) ChPT yields a parameter free prediction, see eq. (2.47). Since d corresponds to one of our fit parameters, a direct comparison is possible (left plot in figure 5). As anticipated in section 2.2, the prediction using G A (Q 2 ) (circles) as the pion-nucleon coupling instead Comparison to the subtraction method proposed in ref. [60] for the ratio R 0,p Γ,O (defined in eq. (3.6)) at the momentum transfer q = −p = 2π L (0, 0, 1) T for ensemble D200. The solid and dashed lines show fits to the unsubtracted and subtracted data, respectively, where the yellow and red bands show the corresponding ground state signals. In both cases we have taken into account the leading N π contribution. For the subtracted current the fit ansatz has to be adapted, cf. appendix B.
of g A = G A (0) (crosses) is in almost perfect agreement with our data, even at large Q 2 , where one would usually not expect ChPT to work. For our kinematics, the N π excitation in the final state can also couple directly to the three-quark operator (this corresponds to the diagrams on the left and right in the second row of figure 1). Therefore, we can try to determine a (defined in eq. (2.38)) directly from the data. A value a = 1 means that the leading order ChPT estimate for the coupling of N π to the three-quark operators calculated for local currents is exact in spite of the smearing. As one can see from the large statistical errors in the right plot of figure 5, our data is not very sensitive to a . We neither see a significant momentum dependence nor a strong smearing effect. If anything, the direct coupling of the three-quark operators to N π seems to be slightly enhanced by the smearing.
In figure 6 we reinvestigate the subtraction method that some of us have proposed in ref. [60]. As one can clearly see in the upper panels of figure 6, it removes the strangely linear behavior in the A 0 channel almost entirely. We find that the results for the ground state obtained from fits to the unsubtracted (solid lines; ground state yellow) and the subtracted (dashed lines; ground state red) data are mutually compatible, once we take , but rescaled such that the ground state contribution in all channels corresponds to g A ) at momentum transfer q = 0, with p = p = 2π L (0, 0, 1) T and with p = p = 0 for the contributing axial channels. This analysis has been performed on ensemble D201. The solid lines correspond to a simultaneous fit to all channels taking into account the leading N π contribution using eqs. (2.44) and (2.45), where the yellow band corresponds to the ground state. The bands include the statistical error and an error due to a variation of the fit range. into account the leading N π contribution. 6 For the subtracted correlation functions, the fit ansatz given in section 2.3 has to be adapted appropriately, cf. appendix B. However, the ground state extracted from the subtracted data has a much larger statistical uncertainty. A closer look shows that the subtraction method here has fallen victim to its own success: since the largest and clearest excited state contaminations (in A 0 ) have been subtracted successfully, the corresponding parameters can be determined less reliably, which in turn leads to a large error in the ground state. One can conclude that a combination of the analysis method proposed here (taking into account the relevant N π excitation explicitly in the fit to the correlation function) and the subtraction method proposed in ref. [60] is not advantageous.
As a consistency check, we have also considered the case q = 0 with p = p = 0 on one of our ensembles (D201). In this situation eq. (2.44) predicts that the correlation functions of A 1 , A 2 , and A 3 are not affected by the N π excited state, while A 0 gets a contribution ∝ exp −(E N + m π /2)t cosh m π (τ − t/2) in the three-point function. In figure 7 we show that this is indeed the case and that a simultaneous fit using eq. (2.44) yields a consistent description of the data for all channels.

Excited state energies
In ref. [73] it has been proposed to use the signal of the timelike axialvector channel to determine the energy of the low-lying N π excitation. The main difference to the traditional excited state fit method is that one does not impose that the leading excited states in the two-and three-point functions have the same energy. In figure 8 (which roughly reproduces Fig. 3 of ref. [73] 7 ) we show the energy gaps to the various excited states obtained from two different fits to the correlation functions on ensemble D200. The dots (fit 1) have been obtained using the method proposed in ref. [73] (with the slight difference that we perform a simultaneous fit to all channels instead of the two-step method presented in ref. [73]), while the crosses (fit 2) have been obtained using our fit ansatz from eqs. (2.44) and (2.45) but taking ∆E N π and ∆E N π as free fit parameters. In contrast to fit 1, fit 2 contains the additional excited states known from the two-point function, which leads to larger statistical uncertainties in particular when the energy levels of the N π state and the 7 Figure number from the arXiv v2 version.
excited state from the two point function (blue data points) get close to each other. Both kinds of fits lead to energies for the nucleon-pion states that approximately correspond to those of a noninteracting system (cf. the diagrams in the left and the right column of figure 1), which for our kinematics means E N π = E π (q)+E N (0) in the initial state (orange, dotted line) and E N π = E π (q) + E N (−q) in the final state (green, dashed line). That both methods lead to compatible values for the N π excited state energies is encouraging and suggests that the physical interpretation obtained using EFT (cf. section 2.2) is correct.
In particular for the low-lying N π state (which for our kinematics occurs in the initial state) at intermediate Q 2 one can see that the energies obtained from the fits slightly undershoot those of the noninteracting system. This effect is found to be a bit more significant in ref. [73]. One may speculate that this small deviation is due to an interaction between the nucleon and the pion. For the time being we have chosen to ignore these small deviations in our fits in favor of a better stability.
We comment that the pole enhanced N π excited state contribution occurs either in the initial state or in the final state, but certainly not in both at once. Therefore, it is not necessary to allow for an excited-state to excited-state contribution in this case.

Approximate restoration of PCAC and PPD
As mentioned in the introduction, form factors extracted from data using a traditional fit ansatz (with same excited state energies in the two-and the three-point function) show strong violations of PCAC and PPD. In particular in the case of PCAC this result was puzzling since the latter is fulfilled at the correlation function level (up to some small, expected discretization effects). In order to quantify the violation of the PCAC relation at the form factor level (cf. eq. (2.15)), we define the ratio (cf. also ref. [55]) where r PCAC = 1 if PCAC is fulfilled exactly. As the panel on the left-hand side of figure 9 demonstrates, using the parametrization of excited state contributions described in section 2.3 the PCAC relation is now fulfilled reasonably well on all ensembles, in particular on the ensembles with small pion masses, which previously exhibited the largest deviations. We emphasize that our fit ansatz does not impose PCAC on the ground state. While we see a significant improvement for all ensembles, small deviations of ∼5% remain in some cases. The induced pseudoscalar form factor is often estimated by which is usually referred to as the pion pole dominance (PPD) assumption. Note that this relation does not have to hold exactly, even in the continuum. However, one would expect it to be fulfilled at least approximately for small pion mass. The panel on the right-hand side of figure 9 shows that this is indeed the case if one explicitly takes into account the pion pole enhanced excited states in the spectral decomposition of the correlation function.
As reported in ref. [73] the problem can also be resolved (though within larger statistical uncertainties), if one uses a traditional multi state fit ansatz, but relaxes the condition that the excited state energies in the two-and three-point function have to match. One exploits the huge excited state signal in the timelike axialvector channel to determine the energy gaps quite precisely (cf. also section 3.3). This can be seen as further confirmation that the previously observed large deviations from PCAC and PPD were indeed caused by unresolved, pion pole enhanced excited states. Note, however, that our ansatz (shown in eqs. (2.44) and (2.45)) conveys a deeper insight into the structure of the excited state contribution compared to a generic parametrization. For instance, it is clear that, for A µ with µ = 1, 2, 3, the result for G A cannot be affected by the N π excited state at all. Heuristically speaking, this is because G A is not subject to pion pole dominance. Therefore, the deviation in G A reported in ref. [73] is clearly an artifact.

Parametrization and extrapolation
In this section we will explore two common form factor parametrizations: the traditional dipole ansatz and the z-expansion, which has become fashionable lately. In both cases we also consider parametrizations that are consistent with PCAC in the continuum (section 4.2.3) and we will use a generic ansatz for the combined continuum, chiral and volume extrapolation explained in section 4.2.4.

Dipole ansatz
where the pion pole is isolated (cf. also ref. [52]) such that one can use similar parametrizations for the residual form factors X(Q), X ∈ {A,P , P }. The prefactors not only ensure that all X(Q) have the same mass dimension, but also allow us to obtain the correct chiral behavior of the form factors at small Q 2 despite using the same generic ansatz for all form factors, see section 4.2.4 below. One can consider various parametrizations for the residual form factors. For instance, one can use a dipole ansatz which reproduces the traditional dipole form for the axial form factor with the axial coupling g A and the axial dipole mass M A . This parametrization not only yields the correct low-energy behavior (if one uses a generic parametrization for the pion mass, volume and lattice spacing dependence of g X and M X , cf. section 4.2.4 below), but also yields the correct asymptotic limit G A ∝ 1/Q 4 , GP ∝ 1/Q 6 , and G P ∝ 1/Q 6 [98], at large momentum transfer.

z-expansion
One may also parametrize the residual form factors using the z-expansion [99,100], which automatically imposes analyticity constraints. This corresponds to an expansion of the form factors in the variable where t cut = 9m 2 π is the particle production threshold and t 0 is a tunable parameter. 8 We then parametrize where the X(Q) are defined as in section 4.2.1. Without additional constraints this parametrization has N + 1 free parameters and is usually called a z (N +1) ansatz. Again, the generic parametrization discussed in section 4.2.4 will yield the correct chiral behavior. However, eq. (4.6) does not incorporate any constraints at large momentum transfer. In order to reproduce the correct asymptotic behavior one has to enforce constraints of the type which can be implemented (as long as n < N ) by demanding 0 = N l=0 l k a X l , for 0 ≤ k ≤ n . These can be incorporated, e.g., by fixing Alternatively, one can solve the problem recursively by setting To enforce the correct scaling in the asymptotic limit, G A ∝ 1/Q 4 , GP ∝ 1/Q 6 , and G P ∝ 1/Q 6 [98], we have to apply the formulas above for n = 3, thereby fixing a X k for k = 0, 1, 2, 3, such that 4 coefficients are fixed and only N − 3 coefficients are free parameters. 9 This parametrization with correct asymptotic behavior is usually referred to as the z 4+(N −3) ansatz.

Consistency with PCAC in the continuum
Let us assume the following ansatz for the extrapolation to the physical point (m π → m phys π , a → 0, L → ∞), where we have factorized the dependence on the lattice spacing with for all parameters in the form factor decompositions, i.e., x ∈ {g A , M A , gP , MP , g P , M P } for the dipole ansatz, and x ∈ {a A n , aP n , a P n }, n = 4, 5 . . . , N for the z-expansion. This allows us to perform a combined fit to all ensembles for each form factor. The expressions used for x a and x a will be given below in section 4.2.4.
Since we know that the partial conservation of the axial current has to be satisfied exactly in the continuum limit, we can use eq. (2.15) to obtain G P from G A and GP : However, one then has to impose the additional constraints (4.14) in order to preserve the correct asymptotic behavior of G P , cf. also eq. (4.3). For the dipole parametrizations one gets The equivalent constraints for the z-expansion can be obtained using eq. (4.9) and read , for k = 4, 5 . (4.16) Let us now parametrize the pseudoscalar form factor using The ansatz (4.17) becomes consistent with PCAC in the continuum limit once we demand that Unfortunately, PCAC is broken on the lattice by discretization effects, such that P 1 (Q) and P 2 (Q) differ from A(Q) andP (Q) at nonzero lattice spacing. Hence, we use the same ansatz for both (e.g., the dipole form (4.4) or the z-expansion (4.6)), but we start with independent parameters. Here, the asymptotic constraints yield independent of a. Note, that eq. (4.18) and (4.19) can only be fulfilled simultaneously if the axial and induced pseudoscalar form factors meet the requirement (4.14). For the two parametrizations (cf. sections 4.2.1 and 4.2.2) that we consider, the constraints for n < 4 are fulfilled automatically. Similar to the above, the remaining two constraints can be satisfied by (4.20) when using the dipole ansatz, and by when using the z-expansion.
To summarize, if we want our form factor parametrizations to obey PCAC in the continuum limit, we start by parametrizing P (Q) as in eq. (4.17), thereby introducing more parameters at first. However, as discussed above, these parameters are highly constrained such that the ansatz enforcing PCAC will have less free fit parameters in the end. Using the dipole ansatz, we have g A , M A , gP , MP , g P 1 , M P 1 , g P 2 , M P 2 , which can be factorized in a lattice spacing dependent and a lattice spacing independent part as shown in eq. (4.11). The constraints discussed above can be incorporated by setting , g a P 1 = g a A , g a P 2 = g ã P , (4.22) where the constraint in brackets is not independent of the others. If one uses the zexpansion, one starts with a A n , aP n , a P 1 n , a P 2 n , n = 4, 5 . . . , N . Again, we assume these coefficients to be factorized as in eq. (4.11). Here, the constraints discussed above can be implemented by setting As above, the constraints in brackets are not independent of the others.

Continuum, chiral, and volume extrapolation
In our combined analysis of all ensembles we will consider four kinds of fits: the dipole ansatz (2P), the z-expansion with correct asymptotic behavior (z 4+ (N −3) ), and the two corresponding parametrizations where PCAC holds automatically in the continuum (!2P and !z 4+ (N −3) , respectively). They are listed in table 3. We have factorized the occurring parameters x = x a x a (see eq. (4.11)) into a continuum limit part x a , and a part which describes discretization effects x a . In particular, in the parametrizations that respect PCAC, the number of parameters is reduced due to the constraints derived in section 4.2.3 (see also table 3). We perform a combined continuum, chiral and volume extrapolation using the generic ansatz x a (a, m π , m K ) = 1 + a 2 d x where m 2 η = (4m 2 K −m 2 π )/3. The functional form of the finite volume terms is motivated by the leading contribution found in ChPT calculations of the axial coupling, cf. refs. [101,102]. To parametrize the quark mass plane we have defined the linear combinations such that δm = 0 corresponds to exact flavor symmetery, i.e., the blue line in figure 2, while the green line with physical average masses is defined bym = phys. ≈ 411 MeV. Along Table 3. Overview of form factor parametrizations. We will use the dipole ansatz (2P) and the z-expansion with correct asymptotic behavior (z 4+(N −3) ) as described in sections 4.2.1 and 4.2.2, respectively. For both cases, we also consider parametrizations where PCAC is fulfilled in the continuum limit (marked by a preceding ! in the identifier), cf. section 4.2.3. In the rightmost column, we give the total number of fit parameters used for the combined continuum, chiral, and volume extrapolation per form factor, assuming that formulas (4.28) and (4.29) are used for the extrapolation of x a and x a , respectively.
id PCAC x a x a #params per FF , . . . , a P 1 ,a N , a P 2 ,a 6 , a P 2 ,a 7 , . . . , a P 2 ,a N the line of an approximately physical strange quark mass, i.e., the red line in figure 2, the average mass varies; all ensembles used in this study havem < 500 MeV. Note that we do not consider terms with chiral logarithms (m 2 π ln(m 2 π /m 2 ), etc.) in eq. (4.28) since they are of the same order in chiral counting as m 2 π . Therefore, their inclusion does not really make sense as long as not all of them are known explicitly from a calculation within chiral perturbation theory. Figure 10 provides a compilation of (continuum, chiral and finite volume extrapolated) form factors that have been obtained from the parametrizations discussed in the previous sections. The parameters producing the central values can be taken from table 4. Surprisingly, even the fits using a dipole ansatz (2P) give a reasonable description of the data (actually, it has in most cases the smallest χ 2 /d.o.f. of all fits, cf. table 4), despite the fact that the functional form is very constrained. However, the latter may lead to an underestimation of the error, and it may also induce a smaller slope at zero momentum transfer. In order to reduce this bias one may relax the constraints due to the choice of parametrization. The currently most popular and probably best suited ansatz for this task is the z-expansion described in section 4.2.2, which allows us to increase the number of parameters seamlessly. To this end, we have performed z 4+3 , and z 4+4 fits (and the corresponding fits that are constrained to be consistent with PCAC in the continuum limit, see below). While the   z 4+3 fit is almost as restrictive as the dipole ansatz, expansions with a larger number of parameters (z 4+4 , z 4+5 , etc.) introduce less and less parametrization bias. In practice, however, the choice will always be a balancing act between reducing parametrization bias and being able to control the systematics of all occurring parameters (due to discretization effects, volume effects and the necessary chiral extrapolation, cf. section 4.2.4). Therefore, we have decided to perform the analysis using various parametrizations. We emphasize that PCAC was not enforced when extracting the form factors from fits to the correlators. Nevertheless, due to the advances in the understanding of excited state contaminations in the correlation functions, we are now able to resolve the ground state contributions such that the obtained form factors agree with PCAC (and also PPD) reasonably well. This enables us to perform combined fits to all form factors using parametrizations that automatically obey PCAC in the continuum limit. As one can easily see in table 3, the resulting parametrizations are much more restrictive in the continuum than their counterparts. For example, the dipole fit (!2P) has in total three free parameters (at the physical point in the continuum limit) for all form factors. However, in contrast to the parametrization bias discussed above, the PCAC constraints do not evoke any kind of systematic uncertainty, since they only reflect an exactly known symmetry. Unsurprisingly, we find that the continuum extrapolation is more stable when using these PCAC-consistent parametrizations. Overall, we find that both the !2P and the !z 4+3 fit yield a very good Table 4. Results for the parameters at the physical point in the continuum occurring in the dipole ansatz (4.4) and the z-expansion (4.6) together with the uncorrelated χ 2 per degree of freedom of the corresponding fit. For convenience, we also provide the values for the parameters, which are entirely fixed by constraints.  Our final results will therefore be obtained from these fits. The !z 4+4 fit also provides a very good description of our data (χ 2 /d.o.f. = 0.79). However, it is less trustworthy since it relies on an excessive number of parameters, which leads to larger systematic uncertainties in the combined continuum, chiral, and volume extrapolation.

Results
In figures 11, 12 and 13, we show how well the !z 4+3 fit actually describes the data. (For the !2P fit such plots look similarly convincing.) The 6 rows in each figure correspond to the five available lattice spacings and to the continuum limit, while the columns correspond to the different quark mass trajectories. The yellow band corresponds to the extrapolated result at physical masses and in infinite volume. The data show that the form factors exhibit an increasing slope (in Q 2 ) for decreasing pion masses and lattice spacings. It is particularly encouraging that the data for our physical meson mass ensemble (E250) nicely reproduces the expected pion pole structure in the (induced) pseudoscalar form factor (cf. eq. (4.3)).
Above, in figure 9, we have demonstrated that the nucleon form factor data extracted from the correlation functions using the results presented in section 2.3 agrees reasonably   Figure 14. The r PCAC (left panel) and r PPD (center panel) ratios (defined in eqs. (4.1) and (4.2)) at the physical point in the continuum limit. They are obtained using a 2P (dotted, blue), z 4+3 (solid, green) or a z 4+4 (dashed, red) fit ansatz. In the case of r PPD , we also show results obtained from the corresponding fits that are constrained to be consistent with PCAC in the continuum limit (right panel), cf. table 3. Table 5. Results for the form factors G X (0) at zero momentum transfer and for the mean squared radii r 2 X = −6G X (0)/G X (0) obtained from fits using various form factor parametrizations. We also provide results for the pion-nucleon coupling g πN N and for the induced pseudoscalar coupling at the muon capture point g P , which can be directly compared to the experimental value g P = 8.01 (55) from muon capture [18,19].
id well with PCAC and PPD. In figure 14 we show the result for the ratios r PCAC (left panel) and r PPD (center panel) after the extrapolation using the previously discussed form factor parametrizations that do not enforce PCAC. We find that both PCAC and PPD are fulfilled within very large statistical errors. This observation is true for dipole fits as well as fits using the z-expansion. For the parametrizations that enforce PCAC in the continuum r PCAC = 1 by construction. As one can see by comparing the center and the right panel (note the difference in the scale between the two plots), the PCAC-consistent fits allow for a much better resolution of possible deviations from the pion pole dominance assumption for the induced pseudoscalar form factor. We find an almost perfect realization of the PPD assumption (at a 1%-2% level) for all values of momentum transfer independent of the choice of parametrization.
The results for the form factors at zero momentum transfer and for the mean squared radii are given in table 5, where we also provide the results for the induced pseudoscalar coupling at the muon capture point with the muon mass m µ = 105.6 MeV, and for the pion-nucleon coupling constant where we use the PDG value of F π = 92.07 MeV [103]. As a general trend we find that the fits which ensure that PCAC is fulfilled in the continuum limit yield a smaller statistical uncertainty. A main achievement is that we now have control over the pion pole enhanced excited states occurring in the pseudoscalar channels. As a consequence, we find reasonable values for g P that are in agreement with an approximate realization of PPD in nature. 10 From table 5 one can actually read off that the different parametrizations yield compatible results, with the exception of the axial radius, where the dipole fits give significantly smaller values, and the pion-nucleon coupling, where the !z 4+4 fit seems to be an outlier. In our opinion, the !2P and the !z 4+3 yield the most trustable results (for the fits with more free parameters the chiral and continuum extrapolation is less stable). However, given our set of available data, we cannot decide whether the !2P or the !z 4+3 fit is better. We have therefore decided to perform an analysis of systematic uncertainties for both of these fits. In table 6 we provide, in addition to the statistical error () s , estimates for the systematic uncertainties due to the chiral extrapolation () m and the continuum extrapolation () a . To this end, we have perform additional fits with cuts in the fit ranges (m < 450 MeV and a < 0.08 fm, respectively). We then take the difference between the results from these fits and our main result as an estimate for the corresponding systematic uncertainty.

Discussion
As one can see in table 6, the !2P and the !z 4+3 fit yield compatible results for almost all observables. For definiteness we choose to quote the values from the !z 4+3 fit as our final result in these cases, merely because it might have less parametrization bias and because the slightly larger statistical uncertainty is more conservative. In the case of the axial radius, which is directly linked to the axial dipole mass, however, we find that the dipole fit and the z-expansion yield significantly different results. Our main conclusion here has to be that r A is highly parametrization dependent -a nuisance which also plagues determinations from experiment.
In figure 15 we show a compilation of experimental data and lattice data for the axial dipole mass. While the 20 th century world average (cf. ref. [24]) supports a value around 1 GeV, newer experiments by K2K [108], MINOS [109], and, in particular, Mini-BooNE [110] yield larger values. This has fueled some discussions lately. One possible Table 6. Final results obtained from the !2P and the !z 4+3 fit including the statistical error () s and estimates for the systematic uncertainties due to chiral extrapolation () m and due to the continuum extrapolation () a . Since both fits satisfy PCAC in the continuum, G A (0) = m m G P (0) is fulfilled automatically.
0.275 (18)  explanation is that the discrepancy is caused by nuclear effects that have to be taken into account. In ref. [117] it has been demonstrated that, using a local Fermi gas (LFG) model combined with multi-hadron interactions and the random phase approximation (RPA), one can recover smaller values for M A from MiniBooNE data. As argued in [124], larger values for M A in MiniBooNE may also be a consequence of transverse enhancement (TE) due to meson exchange currents (MEC), cf. ref. [125]. Another line of inquiry is followed, e.g., by refs. [100] and [12], and is based on the suspicion that the dipole ansatz might be too restrictive. Using the z-expansion one finds smaller values and much larger errors for M A . In ref. [100] it is shown that the MiniBooNE data is consistent with old π electroproduction data under these circumstances. Our analysis supports this picture. The results for the axial radii obtained from the dipole fit (!2P) and the z-expansion (!z 4+3 ) correspond to the axial pole masses of M A = 1.30(7) GeV (dipole) and M A = 1.01(9) GeV (z-exp). The situation we find is thus very similar to the one reported in ref. [100], where extractions using a dipole ansatz yield M A = 1.29(5) GeV (dipole, [100]), while the z-expansion yields a smaller value M A = (0.85 +0. 22 −0.07 ± 0.09) GeV (z-exp, [100]). It is notable that the z-expansion coefficients we obtain from our fits (see table 4) roughly fulfill the constraints that are imposed in ref. [100].
For the dipole ansatz our result is in good agreement with previous lattice determinations. In particular the agreement with the continuum extrapolated value from [55] is encouraging. For the z-expansion the situation is not so clear, since the lattice results scatter over a wide range. In part this may be caused by the use of different variants of the z-expansion (number of parameters, use of priors, choice of t 0 , implementation of constraints, etc.).
In figure 16 we have compiled results for the induced pseudoscalar coupling at the muon capture point g P from experiment, ChPT, and lattice QCD. The ChPT predictions 11  are based on measurements of the axial radius and experimental data for g πN N . They persistantly call for a value slightly above 8. While older measurements of ordinary muon capture (OMC) were in agreement with this prediction (within large errors), the TRIUMF measurement [14,15] lies significantly higher. It has to be seen as a success of BChPT that the new OMC measurement by MuCap [18,19] is spot on with a small error. Independent of the choice of parametrization our results are in perfect agreement with both the ChPT prediction and the MuCap result. In particular recent lattice results that include a chiral and a continuum extrapolation using ensembles with close to physical pion masses have yielded much smaller values. In retrospect, it is clear that these findings were caused by   [22] HBChPT; M A from ν scattering; assuming g πN N = 13.31 G [23] HBChPT; M A from π electroproduction [5,115,116]; assuming g πN N = 13.0 H [24] HBChPT; M A from ν scattering; assuming g πN N = 13.10 I [25] covarian BChPT (EOMS); M A from ν scattering; assuming g πN N = 13.21 [123] J [30] N f = 2 DWF; a = 0.116 fm; dipole ansatz K [32] N f = 2 + 1 DWF; RBC/UKQCD; a = 0.114 fm; dipole ansatz L [40] N f = 2 Wilson (clover) fermions; RQCD; CE; EFT ansatz corrected by missing factor of 2 M [52] N f = 2 + 1 Wilson (clover) fermions; a = 0.114 fm; z-exp N [53] N f = 2 Wilson (clover) fermions; ETMC; a = 0.0938 fm; dipole ansatz O [54] N f = 2 Wilson (clover) fermions; CE; EFT ansatz P [55] N f = 2 + 1 + 1 Wilson (clover-on-HISQ) fermions; PNDME; CE; EFT ansatz Q [60] N f = 2 Wilson (clover) fermions; RQCD; subtraction method; CE; z-exp R [73] N f = 2 + 1 + 1 Wilson (clover-on-HISQ) fermions; PNDME; a = 0.0871 fm; takes into account N π state; z-exp S This work; N f = 2 + 1 Wilson (clover) fermions; RQCD; full resolution of N π state; CE S1 dipole ansatz S2 z-exp Figure 16. Compilation of data for the pseudoscalar coupling at the muon capture point g P from experiment (A-E), BChPT (F-I), and lattice simulations (J-S). Extractions based on a dipole ansatz are colored red, while those using any variant of the z-expansion are colored blue. Some lattice calculations use an EFT ansatz colored green (pion pole term combined with Taylor expansion). The dashed lines show the mean result of the !2P (red) and the !z 4+3 (blue) fit. The lattice results in parentheses are outdated, since they are strongly affected by the pion pole enhanced excited states treated in this article, cf. also the discussion in ref. [73]. the pion pole enhanced N π excited state contribution, which was not fully under control. See also ref. [73], where the same conclusion has been drawn.
Results for the pion-nucleon coupling constant g πN N are composed in figure 17. The experimental results from πN scattering, N N scattering, and pionic atoms have reached a high precision, and in particular recent determinations are in quite good agreement with each other and with the ChPT analysis of refs. [140,141]. The discussion is now centering on the understanding of charge and isospin breaking effects -a question that is out of reach of current lattice QCD analyses of nuclean structure, which usually ignore QED effects and use degenerate light quark masses. Also the experimental precision is not yet within reach. 12 However, a comparison of the lattice values with the experimental   [132] πN scattering; PWA; GMO E [133] np backward cross section F [134] πN scattering; PWA; DR G [135] π − p and π − d pionic atoms; GMO H [136] π − p and π − d pionic atoms; GMO I [123] π − p and π − d pionic atoms; GMO J [137] πN scattering; DR; J1 CERN data J2 TRIUMF data K [138] πN scattering; PWA; DR L [139] np, pp scattering; PWA M [140,141] ChPT analysis of πN , πd scattering length, and GMO sum rule N [32] N f = 2 + 1 DWF; RBC/UKQCD; a = 0.114 fm; dipole ansatz O [52] N f = 2 + 1 Wilson (clover) fermions; a = 0.114 fm; z-exp P [55] N f = 2 + 1 + 1 Wilson (clover-on-HISQ) fermions; PNDME; CE; EFT ansatz Q This work; N f = 2 + 1 Wilson (clover) fermions; RQCD; full resolution of N π state; CE Q1 dipole ansatz Q2 z-exp Figure 17. Compilation of data for the pion-nucleon coupling constant g πN N from experiment (A-L), ChPT (M), and lattice simulations (N-Q). We do not discriminate between charged and neutral pion-nuclon couplings here, which can be slightly different. In the lattice section we have only listed direct determinations, ignoring all results that are merely based on the Goldberger-Treiman relation [142]. Extractions based on a dipole ansatz are colored red, while those using any variant of the z-expansion are colored blue. Some lattice calculations use an EFT ansatz colored green (a pion pole term combined with a Taylor expansion). The dashed lines show the mean result of the !2P (red) and the !z 4+3 (blue) fit. For a recent review, see ref. [143]. results and with the ChPT analysis of refs. [140,141], which also includes an estimate of systematic uncertainties, can serve as a consistency check. It is thus quite encouraging that our results for g πN N from both, the !2P and the !z 4+3 fit, are in agreement with these determinations. As one can see in table 6, a meaningful determination of the Goldberger-Treiman discrepancy ∆ GT = 1 − mg A Fπg πN N is not possible with our current accuracy.

Summary
In this article we have presented a method that can control pion pole enhanced excited state contributions that occur in axial and pseudoscalar channels, which previously have not been resolved to a satisfactory degree. The technique is based on similar EFT considerations as refs. [64][65][66][67][68][69][70]78], but simultaneously minimizes the ChPT bias. The EFT analysis presented in section 2.2 is mainly used to understand the general structure of the pole enhanced N π contribution, which then can be taken into account explicitly in the spectral decomposition of the three-point functions, see section 2.3. Our numerical analysis presented in section 3 demonstrates that, using our new technique, the ground state can be extracted reliably, even at small pion masses where the pole enhanced excited state constitutes (at currently available source-sink distances) the largest contribution in some channels. We find that the extracted nucleon form factors satisfy constraints from PCAC up to small deviations of roughly 5%, which can be attributed to discretization effects. We find the PPD assumption to be fulfilled to the same degree. Note, however, that the pion pole dominance assumption for the pseudoscalar form factors is only a (seemingly very good) estimate and is not expected to be fulfilled exactly, even in the continnum. PCAC, however, has to hold exactly in the continuum. We leverage the latter information in our form factor analysis: in addition to the usual dipole ansatz and the z-expansion, we have derived (for both cases) parametrizations that are consistent with PCAC in the continuum, cf. section 4.2.3. The latter stabilize the continuum extrapolation considerably, without adding any parametrization bias.
Using a large set of available CLS ensembles, we are able to take all relevant limits (continuum limit, infinite volume limit, chiral extrapolation to physical masses) in a controlled fashion. To this end, we use generic extrapolation formulas (see section 4.2.4) for the parameters occurring in the form factor parametrization. The results at the physical point (in the continuum and for infinite volume) obtained from various form factor parametrizations are given in tables 4 and 5. For the presently available data we find that both dipole and z-expansion fits (in both cases we use that PCAC is exact in the continuum) yield a very good description of our data. The final numbers, including estimates for systematic uncertainties due to the chiral and the continuum extrapolation, can be taken from table 6. It is particularly interesting that our results for the axial mass are in agreement with the findings obtained from quasi-elastic neutrino nucleon scattering data (MiniBooNE, [100]) and that we find the same parametrization bias, i.e., larger values of the axial pole mass for dipole fits and smaller values for the z-expansion.
In figure 14, we plot the ratios r PCAC and r PPD at the physical point, where deviations from one correspond to a violation of PCAC and deviations from the PPD assumption, respectively. In particular the fits with exact PCAC in the continuum (i.e., r PCAC = 1 automatically) allow us to draw conclusions with respect to the pion pole dominance ansatz for the pseudoscalar form factors. We find that our results are consistent with the PPD ansatz independent of the choice of parametrization for the form factor. The values we extract for the induced pseudoscalar coupling at the muon capture point are in good agreement with the experimental value obtained from muon capture [18,19]. agreement no. 813942 (ITN EuroPLEx).
We used a modified version of the Chroma [147] software package along with the LibHadronAnalysis library [148] and improved inverters [85,[149][150][151]. The configurations were generated as part of the CLS effort [83,88] using openQCD (https: //luscher.web.cern.ch/luscher/openQCD/) [85]. We thank all our CLS colleagues for the joint generation of the gauge ensembles. Additional m = m s ensembles were generated with openQCD by members of the Mainz group on the Wilson and Clover HPC Clusters of IKP Mainz as well as by RQCD on the QPACE computer using the BQCD code [84]. The computation of observables was carried out on the QPACE 2 and QPACE 3 systems of the SFB/TRR-55, on the Regensburg QPACE B machine, the Regensburg HPC-cluster ATHENE 2, and at various supercomputer centers. In particular, the authors gratefully acknowledge computing time granted by the John von Neumann Institute for Computing (NIC), provided on the Booster partition of the supercomputer JURECA [152] at Jülich Supercomputing Centre (JSC, http://www.fz-juelich.de/ias/jsc/).
Regarding the generation of recent gauge ensembles, the authors gratefully acknowledge the Gauss Centre for Supercomputing (GCS) for providing computing time for GCS Large-Scale Projects on the GCS share of the two supercomputers JUQUEEN [153] and JUWELS [154] at JSC as well as on SuperMUC at Leibniz Supercomputing Centre (LRZ, https://www.lrz.de). GCS is the alliance of the three national supercomputing centres HLRS (Universität Stuttgart), JSC (Forschungszentrum Jülich), and LRZ (Bayerische Akademie der Wissenschaften), funded by the German Federal Ministry of Education and Research (BMBF) and the German State Ministries for Research of Baden-Württemberg (MWK), Bayern (StMWFK) and Nordrhein-Westfalen (MIWF).

A Traces
For the ground state contributions defined in eq. (2.43) one finds B p ,p P i + ,A µ = 2G A p i p µ + p i p µ + m(p + p) i g µ0 − g iµ (m 2 + mE + mE + p · p) + 2GP

B Fit ansatz for the subtracted currents
For the subtracted correlation functions defined in ref. [60] one inserts instead of the usual currents. Herep = (p +p)/2. By construction, this does not change the ground state contribution at all. In contrast, the excited state contributions are affected very strongly. Therefore, the fit ansatz given in eqs. (2.44) and (2.45) has to be adapted to this case. Following the same steps as discussed in detail for the standard currents in section 2.2, we find Similar to the situation with unsubtracted correlation functions, the parametrization simplifies for the particular kinematics we are using in our numerical analysis (p = 0 such that q = −p).