Constraining the GENIE model of neutrino-induced single pion production using reanalyzed bubble chamber data

The longstanding discrepancy between bubble chamber measurements of $\nu_\mu$-induced single pion production channels has led to large uncertainties in pion production cross section parameters for many years. We extend the reanalysis of pion production data in deuterium bubble chambers where this discrepancy is solved (Wilkinson et al., PRD 90 (2014) 112017) to include the $\nu_{\mu}n\rightarrow \mu^{-}p\pi^{0}$ and $\nu_{\mu}n\rightarrow \mu^{-}n\pi^{+}$ channels, and use the resulting data to fit the parameters of the GENIE (Rein-Sehgal) pion production model. We find a set of parameters that can describe the bubble chamber data better than the GENIE default parameters, and provide updated central values and reduced uncertainties for use in neutrino oscillation and cross section analyses which use the GENIE model. We find that GENIE's non-resonant background prediction has to be significantly reduced to fit the data, which may help to explain the recent discrepancies between simulation and data observed by the MINERvA coherent pion and NOvA oscillation analyses.


Introduction
A good understanding of single pion production by neutrinos with few-GeV energies is important for current and future oscillation experiments, where pion production is either a signal process, or a large background for analyses which select quasi-elastic events. At these energies the dominant production mechanism is via the production and subsequent decay of hadronic resonances.
Complete models of neutrino-nucleus single pion production interactions are usually factorized into three parts: the neutrino-nucleon cross section; additional nuclear effects which affect the initial interaction; and the "final state interactions" (FSI) of hadrons exiting the nucleus. Experimental data on nuclear targets presents a confusing picture, with recent data from the MINERνA [1,2] and MiniBooNE [3] experiments in poor agreement with each other in the framework of current theoretical models [4,5]. An additional problem is the disagreement between measurements of the neutrino-nucleon single pion production cross section in the 100 MeV to few-GeV energy range most relevant for current and planned neutrino oscillation experiments. The axial form factor for pion production on free nucleons cannot be constrained by electron scattering data, so relies upon data from the Argonne National Laboratory's 12 ft bubble chamber (ANL) and the Brookhaven National Laboratory's 7 ft bubble chamber (BNL). However, these datasets differ in normalization by 30-40% for the leading pion production process ν µ p → µ − pπ + , which leads to large uncertainties in the predictions for oscillation experiments [6][7][8][9][10][11], as well as in the interpretation of data taken on nuclear targets [12].
It has long been suspected that the discrepancy between ANL and BNL was due to an issue with the normalization of the flux prediction from one or both experiments 1 , and it has been shown by other authors that their published results are consistent within the experimental uncertainties provided [6,16]. In Reference [17], we presented a method for removing flux normalization uncertainties from the ANL and BNL ν µ p → µ − pπ + measurements by taking ratios with chargedcurrent quasielastic (CCQE) event rates in which the normalization cancels. Then we obtained a measurement of ν µ p → µ − pπ + by multiplying the ratio by an independent measurement of CCQE (which is well known for nucleon targets). Using this technique, we found good agreement between the ANL and BNL ν µ p → µ − pπ + datasets. In this work, we extend that method to include the subdominant ν µ n → µ − pπ 0 and ν µ n → µ − nπ + channels, and use the resulting data, along with the Q 2 -spectra (where Q 2 is the four-momentum transfer) from the same experiments, to constrain the parameters of the GENIE single pion production model [18]. While more sophisticated single pion production models exist [19][20][21], the GENIE generator is widely used by current and planned neutrino oscillation experiments, so tuning the generator parameters represents a pragmatic approach to improving its description of available data. We find that the reanalyzed data, where the normalization discrepancy has been resolved, is able to significantly reduce the uncertainties on the pion production parameters. We also find that the non-resonant background prediction from GENIE needs to be significantly reduced to fit the data.
Reduced uncertainties on pion production parameters are vital for current and future neutrino oscillation experiments, which have very stringent systematic uncertainty requirements [22,23]. We recommend that our new uncertainties should be used by experiments which use the GENIE neutrino interaction generator, and the reanalyzed ANL and BNL datasets presented here and in Reference [17] should be used instead of the published ANL and BNL datasets for future model comparisons.
The paper is organized as follows. In Section 2, we describe the datasets used in this analysis. In Section 3 we describe the GENIE single pion production model and compare the nominal GENIE model and error bands with the data. The χ 2 statistic which is minimized and the fit machinery are discussed in Section 4.1; the fit results are presented in Section 4.2; and there is a discussion of the goodness of fit in Section 4.3. Finally, our conclusions are presented in Section 5.

Datasets used in this analysis
In this work, we use the Q 2 and E ν -spectra from ANL and BNL for all three charged-current single pion production modes (ν µ p → µ − pπ + , ν µ n → µ − pπ 0 and ν µ n → µ − nπ + ), giving a total of twelve datasets.
The Q 2 -dependent distributions used are presented as flux-integrated event rates without any invariant mass cut applied, and were digitized from References [24] (ANL) and [25] (BNL) for this work. To produce fluxintegrated event rate predictions with GENIE, the flux was taken from References [13] (ANL) and [15] (BNL).
The E ν -dependent distributions for both ANL and BNL are taken from the reanalysis of ν µ p → µ − pπ + data presented in Reference [17], and the reanalysis of ν µ n → µ − pπ 0 and ν µ n → µ − nπ + using the same technique and presented in Appendix A. These datasets are neutrino-deuterium cross sections, as no correction has been applied to account for deuterium nuclear effects. Additionally, in Appendix B we present reanalyzed results for the three pion production channels in which an additional correction has been applied to include the effect of the invariant mass cut W ≤ 1.4 GeV on the E ν -dependent distributions. The reanalyzed results with W ≤ 1.4 GeV were not used in the present work, but we provide it for use in future model comparisons.
The E ν -dependent distributions for both ANL and BNL are shown for all three single pion production channels in Figure 1 along with other bubble chamber measurements available for these channels. The original ANL and BNL results are also shown so the effect of the reanalysis can be seen. It is clear that the reanalysis affects all channels, although the effect is more pronounced for the dominant ν µ p → µ − pπ + channel, where the statistical errors are smaller and biases are easier to see. The ANL and BNL datasets agree well in all three channels after the reanalysis. In Figures 1a and 1b, the BEBC data on a hydrogen target has an invariant mass cut of W ≤ 2 GeV [26], which removes contributions from diffractive processes. The FNAL data on a hydrogen target is selected with an invariant mass cut of W ≤ 1.4 GeV, in order to isolate the ∆ ++ contribution to the cross section, which also cuts out any diffractive contributions from the cross section [27]. Additionally, the FNAL result was scaled by 14% to account for ∆ ++ contributions with W > 1.4 GeV.
Despite the caveats associated with the subdominant ν µ n → µ − pπ 0 and ν µ n → µ − nπ + channels, we recommend that all three channels are used for future comparisons with ANL and BNL data. It should be stressed that most of the deficiencies in the reanalyzed results detailed here are also present in the original ANL and BNL results, and should be borne in mind when using reanalyzed or published ANL and BNL results.
A final note of caution regarding the use of these corrected datasets is that there is a hidden correlation between the three channels for each experiment. As the CCQE events used in the correction are common to all channels, statistical errors are correlated in a way which is difficult to quantify. An example of the problem can be seen by looking at the three ANL channels for SKAT 89 Fig. 1: The published and extracted ANL and BNL data are compared with other measurements of the three pion production channels [26][27][28][29]. All data is on hydrogen and deuterium targets, except for SKAT 1989 data which was taken on heavy freon (CF 3 Br). Note that both published and reanalyzed ANL and BNL data shown here have no invariant mass cut in the event selection, whereas the other datasets have an invariant mass cut of W ≤ 2 GeV applied unless otherwise mentioned.
1.0 ≤ E ν ≤ 1.1 GeV where an upward fluctuation of the CCQE event rate leads to a decrease in the reanalyzed cross section. It should be noted that this problem is also present in the published ANL cross section results because they used the measured CCQE event rate to correct their flux prediction. This problem is not dealt with in the fits presented in this work, we simply use the statistical errors from the reanalysis without considering correlations, but it will lead to an increase in the χ 2 of the fits (because the χ 2 penalty for a large statistical fluctuation is applied three times rather than once).

GENIE single pion production model
The single pion production model in GENIE is described in Reference [18] and does not change significantly between the GENIE major versions 2.6.X and 2.8.X (X denotes a the minor version number) investigated in this study 2 . All processes simulated in GENIE use the Bodek-Ritchie RFG model to describe the initial state nucleon momentum distribution [30] for all nuclear targets, including deuterium. In GENIE, single pion production is separated into resonant and non-resonant terms, with interference terms between the two neglected. The resonant component (RES) is a modified version of the Rein-Sehgal (R-S) model [31]. In the original R-S model, the production and subsequent decay of 18 nucleon resonances with invariant masses W ≤ 2 GeV are considered. In GENIE, only 16 resonances are included, based on the recommendation of the Particle Data Group [32]. The cross section calculation has not been modified to include lepton mass terms, but the effect of lepton mass terms on the phase space boundaries is taken into account. The cross section is cut off at a tunable invariant mass value, which is W ≤ 1.7 GeV by default (and in this study). No in-medium modifications to resonances are considered, and interferences between resonances are neglected in the calculation. By default, the resonances decay isotropically in their center of mass frame. In general, there is an additional contribution to single pion production from coherent pion production processes, which are modeled (by default) in GENIE using the Rein-Sehgal coherent model [33] with lepton mass corrections included [34]. However, for the ANL and BNL channels considered here, the selection criteria include requirements on the struck or spectator nucleon, which effectively excludes coherent pion production as the deuterium is no longer bound in the final state. As such, coherent contributions to sin-gle pion production are not considered further in this work 3 . The original Rein-Sehgal model in Reference [31] includes non-resonant single pion production as an additional resonance amplitude, while in GENIE, the nonresonant component is implemented as an extension of the deep inelastic scattering model. The non-resonant (DIS) contribution to the GENIE single pion production model is calculated using the Bodek-Yang parametrization [36], with other relevant parameters described in detail in Reference [18]. Hadronization is described by the AKGY model [37], which uses KNO scaling [38] for invariant masses of W ≤ 2.3 GeV, and PYTHIA [39] for invariant masses of W ≤ 3.0 GeV, with a smooth transistion between. The low-W KNO model is tuned to data from the Fermilab 15-foot bubble chamber experiment [40], and the high-W PYTHIA model is tuned to BEBC data [41]. We note that retuning of the PYTHIA model has been discussed elsewhere [42], although is not considered here as it affects larger W values than are relevant for this study.
A major difference between the GENIE versions 2.6.X and 2.8.X is the change to the default Final State Interaction (FSI) model, which is applied to all outgoing particles produced at the vertex for both the resonant and non-resonant contributions to single pion production. The default FSI model for both versions of GE-NIE is the hA intranuclear cascade model [18], which is tuned to π + -56 F e and p-56 F e data, then extrapolated to other targets based on A 2/3 scaling (where A is the atomic number). In GENIE v2.6.X, FSI effects are negligible for deuterium, but the hA model was retuned for GENIE v2.8.X [43], which leads the deuterium FSI to reduce the total cross section predictions for all channels relevant for this work by 20-30%. In this work we have ignored this difference and make the assumption that interactions on deuterium can be treated as interactions on quasi-free nucleons which are only loosely bound together, and so neglect FSI effects. Low-Q 2 bins (Q 2 < 0.1 GeV 2 ) are not included in the fit to avoid the region where FSI effects are expected to have a significant effect in deuterium. We note that in Reference [44] a careful study of FSI effects for pion production interactions on deuterium was carried out. This work found that interactions between the final state nucleons significantly modifies the cross section for the ν µ p → µ − pπ + and ν µ n → µ − nπ + channels, the most notable feature being a suppression of the cross section for very forward pions. A more careful treatment of FSI based on the work presented in Reference [44] would be an improvement to future iterations of this work, but the calculation does not currently predict the entire final state of the interaction so is not ready to be implemented in a generator.
In GENIE, there are a number of systematic parameters which can be varied to change the single pion production model [45]. These parameters are summarized in Table 1. Note that although GENIE allows the normalization of charged-current non-resonant single pion production on protons and neutrons separately, we have grouped them here into a single category (labelled "DIS") in the absence of any reason to treat them differently 4 .
The axial form factor used for resonant pion production in GENIE is given by where 267, Z is a renomalization factor for the axial-vector coupling constant of a quark obtained from data in Reference [46] (also considered in Reference [47]) and M RES A is the resonant axial mass, available as a parameter in GENIE's reweighting framework. The normalization of the axial form factor is not available in the GENIE reweighting framework (changing its value requires events to be regenerated with a modified value of the "RS-Zeta" parameter), but a similar parameter for the overall resonant normalization is available. Modifying F A (0) is the more physicallymotivated alternative, but as modifying the resonant normalization is more convenient for users, we perform two fits, each with one of these parameters modified.
The nominal GENIE v2.8.2 cross sections for the three single pion production channels considered in this work (ν µ p → µ − pπ + , ν µ n → µ − pπ 0 and ν µ n → µ − nπ + ) are shown as a function of the neutrino energy E ν in Figure 2, and compared with ANL and BNL data. The nominal GENIE v2.8.2 event rate predictions as a function of Q 2 , produced using the ANL and BNL fluxes, are compared separately to ANL and BNL data in Figure 3. The GENIE prediction for the Q 2 distributions is normalized separately for ANL and BNL such that the total prediction is equal to the measured rate summed over all three channels. The Q 2 predictions shown are therefore shape-only, but with the relative normalization between the different pion production channels preserved 5 . All data considered have no invariant mass cuts applied and the event selection in GENIE is based on the particles produced at the initial interaction vertex, not those surviving GENIE's FSI model as previously discussed. In Figures 2 and 3, the GENIE prediction is also shown broken down into resonant (RES) and nonresonant (DIS) contributions. Additionally, the dominant ∆ contribution to the RES component is shown separately for reference. The total GENIE prediction is the incoherent sum of the RES and DIS contributions, where interference terms have been neglected. On each plot the 1σ error band produced by combining the nominal GENIE uncertainties on M RES A , RES normalization and DIS normalization is also shown for comparison.
It is clear from Figures 2 and 3 that the nominal GENIE prediction cannot describe all of the pion production channels well for the reanalyzed datasets. In Figure 2, it is noticeable that, while the measured cross sections for the subdominant ν µ n → µ − pπ 0 and ν µ n → µ − nπ + channels are similar, there are large differences between the nominal GENIE predictions for these channels. The non-resonant component of the GENIE prediction, which contributes strongly to these channels, appears to be too large. It can be seen from Figure 3 that the nominal GENIE prediction fails to describe the low-Q 2 data well for some channels. We also note that the GENIE uncertainties are larger than the data suggests, and may be reduced by tuning the GENIE model to the ANL and BNL data. These observations motivate this work.

Fitting the GENIE model
In this section, the datasets described in Section 2 are used to constrain the GENIE model introduced in Section 3. The χ 2 statistic which is minimized is given in Section 4.1, and results are given in Section 4.2. A discussion of the goodness of fit is given in Section 4.3. The MINUIT package [48] as implemented in the ROOT library [49] is used to perform all fits.
The fits are performed separately for four GENIE configurations. Both GENIE v2.6.2 and GENIE v2.8.2 predictions are fit using all parameters available in the standard version of GENIE (as described in Table 1). A fit was performed to GENIE v2.8.2 where the normalization of the resonant axial form factor, F A (0), is used  Fig. 3: The nominal GENIE prediction is shown as a function of Q 2 for the three single pion production channels of interest, and is compared separately to the ANL and BNL data. The total prediction is broken down into the resonant (RES) and non-resonant (DIS) contributions, and additionally, the ∆-contribution to the RES component is shown. The error band shown on the total GENIE prediction shows the 1σ error bands for all GENIE default parameters given in Table 1 Table 1: Variable parameters in the GENIE single pion production model [45]. The normalization of the axial form factor is not a variable parameter in GENIE currently, but is varied in this work as described in the text.
as a fit parameter instead of the resonant normalization, as motivated in Section 3. Finally, a fit was performed to GENIE v2.8.2 without the resonant normalization or F A (0) to investigate the effect that correlations between the axial mass (which has a strong effect on the normalization of the cross section) and the normalization parameters have on the results.

χ 2 definition
No information about systematic uncertainties, correlations within datasets or correlations between datasets is available, so only statistical errors are considered for all datasets, and the function to be minimized can be expressed as a sum over the datasets included in the fit. This is reasonable as the statistical uncertainties are large. Additionally, the datasets used are efficiency corrected by the experiments, but are not unfolded, so the treatment of detector effects is likely to be inadequate 6 . For these reasons, any measure of goodness of fit should be treated as approximate.
A Poisson-likelihood statistic is used for the datasets as a function of Q 2 because many of the higher Q 2 bins have low event rates. Note that for Q 2 datasets, the sum is over the N bins with Q 2 ≥ 0.1 GeV 2 .
where n i and µ i (x) are the measured and predicted number of events in the ith bin, σ i is the statistical error on the ith bin, x are the model parameters varied in the fit and the inner summations are over the N bins of each dataset. x also contains normalization terms for ANL and BNL which affect the Q 2 datasets only. As previously remarked, the Q 2 datasets are shape-only in the fit, but the relative normalization between the three pion production modes is preserved separately for ANL and BNL.
Note that this statistic is appropriate for minimization, but χ 2 /N DOF is not strictly correct as measure of the goodness of fit because the Poisson-likelihood terms contribute constant terms to the χ 2 . A more rigorous measure of the goodness of fit is discussed in Section 4.3.  Fig. 2: The nominal GENIE prediction is shown as a function of E ν for the three single pion production channels of interest, and is compared to the corrected ANL and BNL data. The total prediction is broken down into the resonant (RES) and non-resonant (DIS) contributions, and additionally, the ∆-contribution to the RES component is shown. The error band shown on the total GENIE prediction shows the 1σ error bands for all default GENIE parameters given in Table 1 combined in quadrature (note that F A (0) is not a default GENIE parameter so is not included in the error band).

Results
Two fake data studies were performed to validate the fitter. Firstly, Asimov [50] fake data fits were produced for all four GENIE configurations considered. These provide basic validation that the fitter found the correct minimum and give the expected size of the parameter uncertainties, which can be used to validate the fit results. Secondly, pull studies were performed for all GENIE configurations to check that the test statistic is an unbiased estimator of central values and uncertainties for the parameters varied in the fits. No biases were observed.
The best fit results to all twelve datasets are shown in Table 2 for the four GENIE configurations considered in this work. The parameter uncertainties given by the fits are consistent with those predicted by the Asimov fake data study. The normalization of the non-resonant background was reduced significantly in all fits, as was expected given the nominal model comparisons shown in Section 3. The resonant axial mass, M RES A , was also reduced from the GENIE nominal value of M RES A = 1.12 GeV in all fits, although we note that there is a strong anticorrelation between M RES A and the RES normalization, and between M RES A and F A (0) (as can be seen in Figure 4). The GENIE v2.8.2 fits were also repeated using GENIE's free nucleon cross sections in order to ensure that the results are not biased by GENIE's initial state nuclear model. It was found that the results were consistent to within 1σ for all parameters in all fits.
The GENIE v2.6.2 and GENIE v2.8.2 (RES) results, with the same parameters available in the standard version of GENIE (as described in Table 1) give consistent results. The fit which uses the normalization of the axial form factor, F A (0), as a fit parameter is consistent with the other fits, although as F A (0) has a small Q 2 dependence, the correlation with M RES A is different and the value for M RES A is less suppressed than in fits with RES normalization free (which has no Q 2 dependence).
The best fit distributions for the GENIE v2.8.2 (RES) fit are shown for the ν µ p → µ − pπ + distributions in Figure 5, for the ν µ n → µ − pπ 0 distributions in Figure 6, and for the ν µ n → µ − nπ + distributions in Figure 7. Data points with Q 2 ≤ 0.1 GeV 2 are shown but not included in the χ 2 calculation.

Goodness of fit
As has been previously remarked, the χ 2 /N DOF is not an appropriate measure of the goodness of fit for Equation 2 as it involves Poisson-likelihood terms with low-    The nominal prediction is shown for reference, and the χ 2 contribution from each dataset is given in the legend for both the nominal and best fit distributions. The nominal and best fit DIS contribution to the total GENIE prediction (RES+DIS) are also shown for reference. statistics bins. In Figure 8, the expected χ 2 distribution has been produced by making 100,000 toy experiments in which a fake dataset has been produced with statistical errors thrown for all twelve datasets included in the fit. The χ 2 is calculated between each toy experiment and the nominal data (without thrown statistical errors). The p-value of any fit can be calculated by inte-grating the distribution to the right of any given χ 2 min fit value as the χ 2 distribution is independent of the fit type. Figure 8 shows the sampling distribution from this method, with the actual fit value for GENIE v2.8.2 (F A (0)), which has a p-value 10 −4 , indicating a poor fit between data and the model, even after the fit procedure.  Dataset  In Table 3, the contribution that each of the twelve datasets makes to the nominal and best fit χ 2 values is given for the GENIE v2.8.2 (RES) fit. The N DOF contributed by each dataset is also shown for comparison. It is clear that there is a disproportionate contribution from the reanalyzed E ν -dependent datasets for the subdominant channels (ν µ n → µ − pπ 0 and ν µ n → µ − nπ + ). The uncertainty on these distributions only include sta-tistical errors, which are dominant, but there are significant normalization uncertainties due to detector corrections and background subtractions which are not included. These corrections are also likely to have an effect on the shape of the distributions, but it is not possible to calculate meaningful shape-uncertainties for these effects (nor are they included in the published results from ANL or BNL). It should also be noted that there is a correlation between the three E ν -dependent datasets for both ANL and separately for BNL, introduced by the procedure for reanalyzing the datasets, although this is unlikely to be a significant issue. These issues are discussed further in Appendix A.
To ensure that including the four subdominant E νdependent datasets (ν µ n → µ − pπ 0 and ν µ n → µ − nπ + for both ANL and BNL) does not badly bias the results, the fit was repeated without these four problematic datasets included. The fit results are within 1σ of the values in Table 2, which indicates that these datasets do not bias the fit strongly. Indeed, the Q 2 -dependent and E ν -dependent distributions for each channel and experiment agree reasonably well with each other. When these four datasets are excluded, the p-value returned at the best fit point is more reasonable (∼0.02), indicating that the poor quality of fit seen in Figure 8 can be mostly attributed to these four datasets. As there is no reason to suspect that these datasets are biasing the fit, we prefer to leave these datasets in the fit and present the fit with all twelve datasets included as the main result of this work. Using both Q 2 -dependent and E ν -dependent datasets helps break the degeneracy between M RES A and normalization parameters, so there is a strong reason for including all datasets if possible.

Conclusions
ANL and BNL provide the only neutrino-nucleon data for the energies in the few-GeV region most relevant for current and future oscillation experiments. The large normalization discrepancy between them has led to large uncertainties in pion production parameters, which presents a problem for meeting the stringent error budgets required by current and future oscillation analyses. In this work, we use the reanalyzed ANL and BNL ν µ p → µ − pπ + datasets from Reference [17], where this normalization discrepancy has been solved, to constrain the GENIE single pion production model parameters. The reanalysis method from Reference [17] is applied to the subdominant pion production channels ν µ n → µ − pπ 0 and ν µ n → µ − nπ + , and Q 2 -dependent distributions for all three channels for both ANL and BNL are also used in the fits. Although the GENIE single pion model is not state of the art, it is widely used by many currently running experiments, so improvements to the parametrization are of interest to the community. Additionally, the fits described here provide a blueprint for their use in constraining other models.
We find that the uncertainty on variable model parameters can be significantly reduced with respect to the nominal GENIE parameter uncertainties [45], which were necessarily large to cover the disagreement between the published ANL and BNL datasets. A similar conclusion was found in the context of a different model in Reference [51]. The retuned uncertainties on these parameters should be used by neutrino oscillation and interaction experiments. To obtain good agreement with the data it was necessary to significantly reduce the non-resonant background normalization from the GENIE nominal prediction. The result of the GE-NIE v2.8.2 (RES) fit is compared to the global E νdependent data for the three single pion production channels of interest in Figure 9. Most of the higher energy datasets shown (described in Section 2) have an invariant mass cut of W ≤ 2 GeV applied, so the same invariant mass cut has been applied to the GENIE prediction shown. Note that the reanalyzed ANL and BNL data have no invariant mass cut applied, which should be borne in mind when interpreting Figure 9.
We note that the recent coherent pion cross section results from MINERνA [52] found a discrepancy between data and GENIE in a single pion productiondominated background sample that required significant reductions in the prediction, which may be alleviated by a reduction in the non-resonant single pion contribution as found in the fits presented here. We also note that the recent NOνA results [53] found a discrepancy between the hadronic energy distribution observed at the near detector and their GENIE simulation. There were more events in data where the hadronic system had less recoil, and fewer with high recoil, compared to the GE-NIE prediction. Although the discrepancy is treated as calibration effect in the NOνA analysis, it is more likely to be due to deficiencies in the GENIE cross section model. The retuned pion set of production parameters described in this work will ameliorate the NOνA discrepancy because it reduces the non-resonant pion pro- duction component, which will contribute events where the recoiling hadronic system has a lot of energy.
In this work, all available neutrino-nucleon single pion production data for neutrino energies below 10 GeV has been used to constrain the pion production parameters, including the reanalyzed ANL and BNL E ν data for the first time, which is a significant step forward towards reducing the cross section uncertainties on this channel to the level required for future neutrino oscillation experiments. Recent proposals to extract neutrino-proton pion production cross sections from experiments where the target material contains hydrogen [54] raise the possibility of new data which will further reduce the parameter uncertainties.
CCQE as a function of neutrino energy, without any invariant mass cuts, which we correct for detector effects using the recommendations given in the original papers. Then we take the ratio of each exclusive pion production channel to the CCQE event rate (taking the ratio cancels the flux) to get a ratio of the cross sections, and then multiply by the relatively well known CCQE cross section to obtain the cross section for each single pion production channel. Essentially, we replace the flux uncertainty in the published single pion production results with the uncertainty on the CCQE cross section, at the cost of the additional statistical uncertainty on the CCQE event rate.
For ANL, the raw event rates are digitized from References [55] (partial dataset) and [24] (full dataset) and are summarized in Table 4. The CCQE event rates are only given using a partial dataset using ∼30% of the final ANL exposure, whereas the single pion production event rates use the full ANL dataset. The dominant pion production channel ν µ p → µ − pπ + was also given in Reference [55] using the partial exposure, so the ratio of partial to full events in this channel can be used to scale the CCQE event rate to the full statistics. The final fully corrected ANL event rates for the single production channels are shown in Figure 10a. Note that the event rates given for the partial dataset are already corrected for detector effects and backgrounds. For the full dataset, the distributions are given without detector corrections applied, but the total corrected event rate is given, so the corrected event rate can simply be obtained by scaling the raw distribution (detector corrections as a function of E ν were not considered in the ANL analysis). Note also that the ANL data is mostly from a deuterium fill of the detector, but data is also included from an initial hydrogen fill of the detector, which makes up approximately 2% (6%) of the full (partial) dataset. This issue only affects the ν µ p → µ − pπ + channel (as all other channels here are on a neutron), and is discussed in Reference [17].
For BNL, the raw event rates for the single pion production datasets are digitized from Reference [25], and for CCQE from Reference [56], and are summarized in Table 5. Detector and background corrections are applied as calculated by BNL, which are only given without any E ν dependence as used in the original BNL analysis. The final fully corrected BNL event rates for the single pion production channels are shown in Figure 10b.
Using the corrected event rates in Figure 10, and the corresponding distribution for CCQE (shown in Reference [17]), it is possible to form ratios of ν µ n → µ − pπ 0 and ν µ n → µ − nπ + over CCQE, as shown in Figure 11. Finally, corrected cross sections for these subdominant  Fig. 10: The digitized event rates on deuterium for the three interaction channels ν µ p → µ − pπ + , ν µ n → µ − pπ 0 and ν µ n → µ − nπ + , as a function of the reconstructed neutrino energy E ν . The errors are statistical only. Both ANL and BNL event rates and errors have been scaled when necessary to the statistics of their full deuterium samples. pion production channels can be obtained by multiplying the ratio by the known CCQE cross section, to produce the final cross sections given in Figure 12 and used in this work. This procedure has already been applied to the ν µ p → µ − pπ + channel in Reference [17], so is not shown here. Details of the GENIE CCQE cross section for ν µ -D 2 interactions used to produce Figure 12 are also given in Reference [17].

Appendix A.2: Error analysis
For all of the digitized datasets used in this work (summarized in Tables 4 and 5), the agreement between the total digitized event rate and published event rate agrees within 1%. We assume that the effect of digitization on the shape of the event rate distributions is small, and therefore neglect digitization uncertainties in this work, as in Reference [17].  Only statistical errors are shown for the reanalyzed datasets, which are the dominant source of uncertainty for low-statistics bubble chamber data. Flux normalization uncertainties are the second largest source of uncertainty in the original ANL and BNL analyses, at around 15-20%. These uncertainties are not considered here because they cancel (by construction) when taking ratios. However, we note that we have replaced the flux uncertainty with the uncertainty in the ν µ -D 2 CCQE cross section, where the dominant uncertainty is the axial mass, M A , which can be considered to be ∼2% normalization error on the E ν distributions [57,58].
There is an uncertainty on the reconstructed neutrino energy for all channels which is estimated for BNL to be ∆Eν Eν ∼2 % for CCQE and ν µ p → µ − pπ + events [59], and ∼5 % for other charged current production channels which are not kinematically overconstrained (ν µ n → µ − pπ 0 and ν µ n → µ − nπ + ). ANL also quote an uncertainty of ∆Eν Eν ≤ 5 % for the subdominant channels, but do not quote an uncertainty on kinematically overconstrained channels [24]. This uncertainty is therefore more significant for the subdominant channels ν µ n → µ − pπ 0 and ν µ n → µ − nπ + than the dominant ν µ p → µ − pπ + channel, but for all cases, the energy smearing is ∆Eν Eν ≤ 5 %. There are additional uncertainties for all channels which come from the detector corrections and background subtractions which are discussed for ANL in Refences [24,55] and for BNL in References [25,56]. These corrections are given on the total rate only, so no information is available from either experiment on how they may distort the shape of the E ν distributions. For the overconstrained CCQE and ν µ p → µ − pπ + channels, these are mostly corrections for reconstruction and scanning inefficiencies, with small background corrections. A conservative estimate on the normalization uncertainty for the overconstrained channels is ∼5%. For the ν µ n → µ − pπ 0 and ν µ n → µ − nπ + channels, which are not kinematically overconstrained, the normalization uncertainty from the quoted correction factors are ∼10-15% for both experiments. There are significantly more backgrounds for the underconstrained channels, which makes the reanalysis of these channels more dependent on the ANL and BNL calculations than the dominant ν µ p → µ − pπ + channel. These backgrounds are from the misreconstructed CCQE and ν µ p → µ − pπ + events, multipion events with unobserved final state particles and migration between the ν µ n → µ − pπ 0 and ν µ n → µ − nπ + selections. In this analysis we neglect the normalization uncertainty from detector effects for the ν µ n → µ − pπ 0 and ν µ n → µ − nπ + channels for simplicity. Although the ∼10-15% is no longer negligible compared with the statistical errors, applying a fully correlated normalization error of this size would not change the results of this analysis significantly, and still rests on the rather simplistic assumption that all of the detector effects and backgrounds have no E ν dependence (although this assumption is present for the published ANL and BNL cross sections measurements for these channels). We note that the size of this neglected normalization error is smaller than the flux uncertainties which are canceled in this analysis.
Appendix B: Reanalyzed ANL and BNL results with an invariant mass cut of W < 1.4 GeV ANL and BNL also published cross sections with a cut on hadronic invariant mass W < 1.4 GeV, but their publications do not include event rate distributions with the same cut that would allow a similar ratio analysis as carried out in Appendix A. Instead, we use the ratio of reanalyzed to published cross sections without an invariant mass cut as a correction factor for the W < 1.4 GeV cross sections. The published and reanalyzed cross sections have different binnings, so we fit a continuous function in neutrino energy E ν to the ν µ p → µ − pπ + cross section: σ = a 0 tan −1 (a 1 E ν + a 2 ) (B.1) where a i are parameters of the fit. Figure 13 shows the published and renalyzed data sets with their fits  Fig. 12: Comparison of the ν µ n → µ − pπ 0 and ν µ n → µ − nπ + cross sections obtained by multiplying the ratio with CCQE (shown in Figure 11) by the GENIE CCQE cross section prediction for ν µ -D 2 interactions.
to Equation B.1, and the ratio of fit functions which is used to correct the W < 1.4 GeV data given in References [24] (ANL) and [25] (BNL). For E ν < 1 GeV, the value of the correction function at 1 GeV is used. Figure 14 shows the cross sections for W < 1.4 GeV with and without the correction factor applied. In the ν µ p → µ − pπ + channel, where both experiments have data, the agreement is improved by the correction method.