Constraining the GENIE model of neutrino-induced single pion production using reanalyzed bubble chamber data

The longstanding discrepancy between bubble chamber measurements of νμ\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\nu _\mu $$\end{document}-induced single pion production channels has led to large uncertainties in pion production cross section parameters for many years. We extend the reanalysis of pion production data in deuterium bubble chambers where this discrepancy is solved (Wilkinson et al., PRD 90, 112017 2014) to include the νμn→μ-pπ0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\nu _{\mu }n\rightarrow \mu ^{-}p\pi ^{0}$$\end{document} and νμn→μ-nπ+\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\nu _{\mu }n\rightarrow \mu ^{-}n\pi ^{+}$$\end{document} channels, and use the resulting data to fit the parameters of the GENIE pion production model. We find a set of parameters that can describe the bubble chamber data better than the GENIE default parameters, and provide updated central values and reduced uncertainties for use in neutrino oscillation and cross section analyses which use the GENIE model. We find that GENIE’s non-resonant background prediction has to be significantly reduced to fit the data, which may help to explain the recent discrepancies between simulation and data observed by the MINERν\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\nu $$\end{document}A coherent pion and NOν\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\nu $$\end{document}A oscillation analyses.


Introduction
A good understanding of single pion production by neutrinos with few-GeV energies is important for current and future oscillation experiments, where pion production is either a signal process, or a large background for analyses which select quasi-elastic events. At these energies the dominant production mechanism is via the production and subsequent decay of hadronic resonances.
Complete models of neutrino-nucleus single pion production interactions are usually factorized into three parts: the neutrino-nucleon cross section; additional nuclear effects which affect the initial interaction; and the "final state interactions" (FSI) of hadrons exiting the nucleus. Experimental data on nuclear targets presents a confusing picture, with a e-mail: callum.wilkinson@lhep.unibe.ch recent data from the MINERνA [1,2] and MiniBooNE [3] experiments in poor agreement with each other in the framework of current theoretical models [4,5]. An additional problem is the disagreement between measurements of the neutrino-nucleon single pion production cross section in the 100 MeV to few-GeV energy range most relevant for current and planned neutrino oscillation experiments. The axial form factor for pion production on free nucleons cannot be constrained by electron scattering data, so it relies upon data from Argonne National Laboratory's 12 ft bubble chamber (ANL) and Brookhaven National Laboratory's 7 ft bubble chamber (BNL). However, these datasets differ in normalization by 30-40 % for the leading pion production process ν μ p → μ − pπ + , which leads to large uncertainties in the predictions for oscillation experiments [6][7][8][9][10][11], as well as in the interpretation of data taken on nuclear targets [12].
It has long been suspected that the discrepancy between ANL and BNL was due to an issue with the normalization of the flux prediction from one or both experiments, 1 and it has been shown by other authors that their published results are consistent within the experimental uncertainties provided [6,16]. In Ref. [17], we presented a method for removing flux normalization uncertainties from the ANL and BNL ν μ p → μ − pπ + measurements by taking ratios with charged-current quasi-elastic (CCQE) event rates in which the normalization cancels. Then we obtained a measurement of ν μ p → μ − pπ + by multiplying the ratio by an independent measurement of CCQE (which is well known for 1 The ANL neutrino beam [13] was produced by focusing 12.4 GeV protons onto a beryllium target. Two magnetic horns were used to focus the positive pions produced by the primary beam in the direction of the bubble chamber, these secondary particles decayed to produce a predominantly ν μ beam peaked at ∼0.5 GeV. The BNL neutrino beam [14,15] was produced by focusing 29 GeV protons on a sapphire target, with a similar two horn design to focus the secondary particles. The BNL ν μ beam had a higher peak energy of ∼1.2 GeV, and was broader than the ANL beam. nucleon targets). Using this technique, we found good agreement between the ANL and BNL ν μ p → μ − pπ + datasets. In this work, we extend that method to include the subdominant ν μ n → μ − pπ 0 and ν μ n → μ − nπ + channels, and we use the resulting data, along with the Q 2 -spectra (where Q 2 is the four-momentum transfer) from the same experiments, to constrain the parameters of the GENIE single pion production model [18]. While more sophisticated single pion production models exist [19][20][21], the GENIE generator is widely used by current and planned neutrino oscillation experiments, so tuning the generator parameters represents a pragmatic approach to improving its description of available data. We find that the reanalyzed data, where the normalization discrepancy has been resolved, is able to significantly reduce the uncertainties on the pion production parameters. We also find that the non-resonant background prediction from GENIE needs to be significantly reduced to fit the data.
Reduced uncertainties on pion production parameters are vital for current and future neutrino oscillation experiments, which have very stringent systematic uncertainty requirements [22,23]. We recommend that our new uncertainties should be used by experiments which use the GENIE neutrino interaction generator, and the reanalyzed ANL and BNL datasets presented here and in Ref. [17] should be used instead of the published ANL and BNL datasets for future model comparisons.
The paper is organized as follows. In Sect. 2, we describe the datasets used in this analysis. In Sect. 3 we describe the GENIE single pion production model and compare the nominal GENIE model and error bands with the data. The χ 2 statistic which is minimized and the fit machinery are discussed in Sect. 4.1; the fit results are presented in Sect. 4.2; and there is a discussion of the goodness of fit in Sect. 4.3. Finally, our conclusions are presented in Sect. 5.

Datasets used in this analysis
In this work, we use the Q 2 and E ν -spectra from ANL and BNL for all three charged-current single pion production modes (ν μ p → μ − pπ + , ν μ n → μ − pπ 0 and ν μ n → μ − nπ + ), giving a total of 12 datasets.
The Q 2 -dependent distributions used are presented as flux-integrated event rates without any invariant mass cut applied, and were digitized from Refs. [24] (ANL) and [25] (BNL) for this work. To produce flux-integrated event rate predictions with GENIE, the flux was taken from Refs. [13] (ANL) and [15] (BNL).
The E ν -dependent distributions for both ANL and BNL are taken from the reanalysis of ν μ p → μ − pπ + data presented in Ref. [17] and the reanalysis of ν μ n → μ − pπ 0 and ν μ n → μ − nπ + using the same technique and pre-sented in Appendix A. These datasets are neutrino-deuterium cross sections, as no correction has been applied to account for deuterium nuclear effects. Additionally, in Appendix B we present reanalyzed results for the three pion production channels in which an additional correction has been applied to include the effect of the invariant mass cut W ≤ 1.4 GeV on the E ν -dependent distributions. The reanalyzed results with W ≤ 1.4 GeV were not used in the present work, but we provide it for use in future model comparisons.
The E ν -dependent distributions for both ANL and BNL are shown for all three single pion production channels in Fig. 1 along with other bubble chamber measurements available for these channels. The original ANL and BNL results are also shown so the effect of the reanalysis can be seen. It is clear that the reanalysis affects all channels, although the effect is more pronounced for the dominant ν μ p → μ − pπ + channel, where the statistical errors are smaller and biases are easier to see. The ANL and BNL datasets agree well in all three channels after the reanalysis. In Fig. 1a, b, the BEBC data on a hydrogen target has an invariant mass cut of W ≤ 2 GeV [26], which removes contributions from diffractive processes. The FNAL data on a hydrogen target is selected with an invariant mass cut of W ≤ 1.4 GeV, in order to isolate the ++ contribution to the cross section, which also cuts out any diffractive contributions from the cross section [27]. Additionally, the FNAL result was scaled by 14 % to account for ++ contributions with W > 1.4 GeV.
Despite the caveats associated with the subdominant ν μ n → μ − pπ 0 and ν μ n → μ − nπ + channels, we recommend that all three channels are used for future comparisons with ANL and BNL data. It should be stressed that most of the deficiencies in the reanalyzed results detailed here are also present in the original ANL and BNL results, and should be borne in mind when using reanalyzed or published ANL and BNL results.
A final note of caution regarding the use of these corrected datasets is that there is a hidden correlation between the three channels for each experiment. As the CCQE events used in the correction are common to all channels, statistical errors are correlated in a way which is difficult to quantify. An example of the problem can be seen by looking at the three ANL channels for 1.0 ≤ E ν ≤ 1.1 GeV where an upward fluctuation of the CCQE event rate leads to a decrease in the reanalyzed cross section. It should be noted that this problem is also present in the published ANL cross section results because they used the measured CCQE event rate to correct their flux prediction. This problem is not dealt with in the fits presented in this work, we simply use the statistical errors from the reanalysis without considering correlations, but it will lead to an increase in the χ 2 of the fits (because the χ 2 penalty for a large statistical fluctuation is applied three times rather than once). The published and extracted ANL and BNL data are compared with other measurements of the three pion production channels [26][27][28][29]. All data is on hydrogen and deuterium targets, except for SKAT 1989 data which was taken on heavy freon (CF 3 Neutrino energy (GeV) SKAT 89

GENIE single pion production model
The single pion production model in GENIE is described in Ref. [18] and does not change significantly between the GENIE major versions 2.6.X and 2.8.X (X denotes the minor version number) investigated in this study. 2 All processes simulated in GENIE use the Bodek-Ritchie RFG model to describe the initial state nucleon momentum distribution [30] for all nuclear targets, including deuterium. In GENIE, single 2 We include both versions as a sanity check which ensures that our results are consistent between the GENIE major versions used by currently running experiments.
pion production is separated into resonant and non-resonant terms, with interference terms between the two neglected. The resonant component (RES) is a modified version of the Rein-Sehgal (R-S) model [31]. In the original R-S model, the production and subsequent decay of 18 nucleon resonances with invariant masses W ≤ 2 GeV are considered. In GENIE, only 16 resonances are included, based on the recommendation of the Particle Data Group [32]. The cross section calculation has not been modified to include lepton mass terms, but the effect of lepton mass terms on the phase space boundaries is taken into account. The cross section is cut off at a tunable invariant mass value, which is W ≤ 1.7 GeV by default (and in this study). No in-medium modifications to resonances are considered, and interferences between resonances are neglected in the calculation. By default, the resonances decay isotropically in their center of mass frame. In general, there is an additional contribution to single pion production from coherent pion production processes, which are modeled (by default) in GENIE using the Rein-Sehgal coherent model [33] with lepton mass corrections included [34]. However, for the ANL and BNL channels considered here, the selection criteria include requirements on the struck or spectator nucleon, which effectively excludes coherent pion production as the deuterium is no longer bound in the final state. As such, coherent contributions to single pion production are not considered further in this work. 3 The original Rein-Sehgal model in Ref. [31] includes non-resonant single pion production as an additional resonance amplitude, while in GENIE the non-resonant component is implemented as an extension of the deep inelastic scattering model. The non-resonant (DIS) contribution to the GENIE single pion production model is calculated using the Bodek-Yang parametrization [36], with other relevant parameters described in detail in Ref. [18]. Hadronization is described by the AKGY model [37], which uses KNO scaling [38] for invariant masses of W ≤ 2.3 GeV, and PYTHIA [39] for invariant masses of W ≤ 3.0 GeV, with a smooth transition in between. The low-W KNO model is tuned to data from the Fermilab 15-foot bubble chamber experiment [40], and the high-W PYTHIA model is tuned to BEBC data [41]. We note that retuning of the PYTHIA model has been discussed elsewhere [42], although is not considered here as it affects larger W values than are relevant for this study.
A major difference between the GENIE versions 2.6.X and 2.8.X is the change to the default Final State Interac-tion (FSI) model, which is applied to all outgoing particles produced at the vertex for both the resonant and nonresonant contributions to single pion production. The default FSI model for both versions of GENIE is the hA intranuclear cascade model [18], which is tuned to π + -56 Fe and p- 56 Fe data, then extrapolated to other targets based on A 2/3 scaling (where A is the atomic number). In GENIE v2.6.X, FSI effects are negligible for deuterium, but the hA model was retuned for GENIE v2.8.X [43], which leads the deuterium FSI to reduce the total cross section predictions for all channels relevant for this work by 20-30 %. In this work we have ignored this difference and make the assumption that interactions on deuterium can be treated as interactions on quasi-free nucleons which are only loosely bound together, and so neglect FSI effects. Low-Q 2 bins (Q 2 < 0.1 GeV 2 ) are not included in the fit to avoid the region where FSI effects are expected to have a significant effect in deuterium. We note that in Ref. [44] a careful study of FSI effects for pion production interactions on deuterium was carried out. This work found that interactions between the final state nucleons significantly modifies the cross section for the ν μ p → μ − pπ + and ν μ n → μ − nπ + channels, the most notable feature being a suppression of the cross section for very forward pions. A more careful treatment of FSI based on the work presented in Ref. [44] would be an improvement to future iterations of this work, but the calculation does not currently predict the entire final state of the interaction so is not ready to be implemented in a generator.
In GENIE, there are a number of systematic parameters which can be varied to change the single pion production model [45]. These parameters are summarized in Table 1. Note that although GENIE allows the normalization of charged-current non-resonant single pion production on protons and neutrons separately, we have grouped them here into a single category (labeled "DIS") in the absence of any reason to treat them differently. 4 The axial form factor used for resonant pion production in GENIE is given by where 267, Z is a renomalization factor for the axial-vector coupling constant of a quark obtained from data in Ref. [46] (also considered in Ref. [47]) and M RES A is the resonant axial mass, available as a parameter in GENIE's reweighting framework. The normalization of the axial form factor is not available in the GENIE reweighting framework (changing its value requires events to be regenerated with a modified value of the "RS-Zeta" parameter), but a similar parameter for the overall resonant normalization is available. Modifying F A (0) is the more physically motivated alternative, but as modifying the resonant normalization is more convenient for users, we perform two fits, each with one of these parameters modified.
The nominal GENIE v2.8.2 cross sections for the three single pion production channels considered in this work (ν μ p → μ − pπ + , ν μ n → μ − pπ 0 and ν μ n → μ − nπ + ) are shown as a function of the neutrino energy E ν in Fig. 2, and compared with ANL and BNL data. The nominal GENIE v2.8.2 event rate predictions as a function of Q 2 , produced using the ANL and BNL fluxes, are compared separately to ANL and BNL data in Fig. 3. The GENIE prediction for the Q 2 distributions is normalized separately for ANL and BNL such that the total prediction is equal to the measured rate summed over all three channels. The Q 2 predictions shown are therefore shape-only, but with the relative normalization between the different pion production channels preserved. 5 All data considered have no invariant mass cuts applied and the event selection in GENIE is based on the particles produced at the initial interaction vertex, not those surviving GENIE's FSI model as previously discussed. In Figs. 2 and 3, the GENIE prediction is also shown broken down into resonant (RES) and non-resonant (DIS) contributions. Additionally, the dominant contribution to the RES component is shown separately for reference. The total GENIE prediction is the incoherent sum of the RES and DIS contributions, where interference terms have been neglected. On each plot the 1σ error band produced by combining the nominal GENIE uncertainties on M RES A , RES normalization, and DIS normalization is also shown for comparison. Footnote 4 continued prediction for interactions on both a target neutron and target proton, and as fully correlated with the corresponding antineutrino dials. 5 It is not possible to make a correctly normalized event rate prediction because neither experiment gives sufficient information on the number of target nucleons in the bubble chamber.
Neutrino energy (GeV) The nominal GENIE prediction is shown as a function of E ν for the three single pion production channels of interest, and is compared to the corrected ANL and BNL data. The total prediction is broken down into the resonant (RES) and non-resonant (DIS) contributions, and additionally, the -contribution to the RES component is shown.
The error band shown on the total GENIE prediction shows the 1σ error bands for all default GENIE parameters given in Table 1 combined in quadrature (note that F A (0) is not a default GENIE parameter so is not included in the error band) It is clear from Figs. 2 and 3 that the nominal GENIE prediction cannot describe all of the pion production channels well for the reanalyzed datasets. In Fig. 2, it is noticeable that, while the measured cross sections for the subdominant The nominal GENIE prediction is shown as a function of Q 2 for the three single pion production channels of interest, and is compared separately to the ANL and BNL data. The total prediction is broken down into the resonant (RES) and non-resonant (DIS) contributions, and additionally, the -contribution to the RES component is shown.
The error band shown on the total GENIE prediction shows the 1σ error bands for all GENIE default parameters given in Table 1 combined in quadrature (note that F A (0) is not a default GENIE parameter, so it is not included in the error band) ν μ n → μ − pπ 0 and ν μ n → μ − nπ + channels are similar, there are large differences between the nominal GENIE predictions for these channels. The non-resonant component of the GENIE prediction, which contributes strongly to these channels, appears to be too large. It can be seen from Fig. 3 that the nominal GENIE prediction fails to describe the low-Q 2 data well for some channels. We also note that the GENIE uncertainties are larger than the data suggests, and they may be reduced by tuning the GENIE model to the ANL and BNL data. These observations motivate this work.

Fitting the GENIE model
In this section, the datasets described in Sect. 2 are used to constrain the GENIE model introduced in Sect. 3. The χ 2 statistic which is minimized is given in Sect. 4.1, and results are given in Sect. 4.2. A discussion of the goodness of fit is given in Sect. 4.3. The MINUIT package [48] as implemented in the ROOT library [49] is used to perform all fits.
The fits are performed separately for four GENIE configurations. Both GENIE v2.6.2 and GENIE v2.8.2 predictions are fit using all parameters available in the standard version of GENIE (as described in Table 1). A fit was performed to GENIE v2.8.2 where the normalization of the resonant axial form factor, F A (0), is used as a fit parameter instead of the resonant normalization, as motivated in Sect. 3. Finally, a fit was performed to GENIE v2.8.2 without the resonant normalization or F A (0) to investigate the effect that correlations between the axial mass (which has a strong effect on the normalization of the cross section) and the normalization parameters have on the results.

χ 2 definition
No information as regards systematic uncertainties, correlations within datasets or correlations between datasets is available, so only statistical errors are considered for all datasets, and the function to be minimized can be expressed as a sum over the datasets included in the fit. This is reasonable as the statistical uncertainties are large. Additionally, the datasets used are efficiency corrected by the experiments, but are not unfolded, so the treatment of detector effects is likely to be inadequate. 6 For these reasons, any measure of goodness of fit should be treated as approximate.
A Poisson-likelihood statistic is used for the datasets as a function of Q 2 because many of the higher Q 2 bins have low event rates. Note that for Q 2 datasets, the sum is over the N bins with Q 2 ≥ 0.1 GeV 2 .
where n i and μ i (x) are the measured and predicted number of events in the ith bin, σ i is the statistical error on the ith bin, x are the model parameters varied in the fit and the inner summations are over the N bins of each dataset. x also contains normalization terms for ANL and BNL which affect the Q 2 datasets only. As previously remarked, the Q 2 datasets are shape-only in the fit, but the relative normalization between the three pion production modes is preserved separately for ANL and BNL. Note that this statistic is appropriate for minimization, but χ 2 /N DOF is not strictly correct as measure of the goodness of fit because the Poisson-likelihood terms contribute constant terms to the χ 2 . A more rigorous measure of the goodness of fit is discussed in Sect. 4.3.

Results
Two fake data studies were performed to validate the fitter. First of all, Asimov [50] fake data fits were produced for all four GENIE configurations considered. These provide basic validation that the fitter found the correct minimum and give the expected size of the parameter uncertainties, which can be used to validate the fit results. Second, pull studies were  The best fit results to all 12 datasets are shown in Table 2 for the four GENIE configurations considered in this work. The parameter uncertainties given by the fits are consistent with those predicted by the Asimov fake data study. The normalization of the non-resonant background was reduced significantly in all fits, as was expected given the nominal model comparisons shown in Sect. 3. The resonant axial mass, M RES A , was also reduced from the GENIE nominal value of M RES A = 1.12 GeV in all fits, although we note that there is a strong anticorrelation between M RES A and the RES normalization, and between M RES A and F A (0) (as can be seen in Fig. 4). The GENIE v2.8.2 fits were also repeated using GENIE's free nucleon cross sections in order to ensure that the results are not biased by GENIE's initial state nuclear model. It was found that the results were consistent to within 1σ for all parameters in all fits.
The GENIE v2.6.2 and GENIE v2.8.2 (RES) results, with the same parameters available in the standard version of GENIE (as described in Table 1) give consistent results. The fit which uses the normalization of the axial form factor, F A (0), as a fit parameter is consistent with the other fits,  is less suppressed than in fits with RES normalization free (which has no Q 2 dependence).
The best fit distributions for the GENIE v2.8.2 (RES) fit are shown for the ν μ p → μ − pπ + distributions in Fig. 5, for the ν μ n → μ − pπ 0 distributions in Fig. 6, and for the Table 3 Contributions to the nominal χ 2 for GENIE v2.8.2 and to the best fit χ 2 min for the GENIE v2.8.2 (RES) fit from each of the 12 datasets included in the fit.  Fig. 7. Data points with Q 2 ≤ 0.1 GeV 2 are shown but not included in the χ 2 calculation.

Goodness of fit
As has been previously remarked, the χ 2 /N DOF is not an appropriate measure of the goodness of fit for Eq. 2 as it involves Poisson-likelihood terms with low-statistics bins. In Fig. 8, the expected χ 2 distribution has been produced by making 100,000 toy experiments in which a fake dataset has been produced with statistical errors thrown for all 12 datasets included in the fit. The χ 2 is calculated between each toy experiment and the nominal data (without thrown statistical errors). The p-value of any fit can be calculated by integrating the distribution to the right of any given χ 2 min fit value as the χ 2 distribution is independent of the fit type. Figure 8 shows the sampling distribution from this method, with the actual fit value for GENIE v2.8.2 (F A (0)), which has a p-value 10 −4 , indicating a poor fit between data and the model, even after the fit procedure.
In Table 3, the contribution that each of the 12 datasets makes to the nominal and best fit χ 2 values is given for the GENIE v2.8.2 (RES) fit. The N DOF contributed by each dataset is also shown for comparison. It is clear that there is a disproportionate contribution from the reanalyzed E νdependent datasets for the subdominant channels (ν μ n → μ − pπ 0 and ν μ n → μ − nπ + ). The uncertainty on these distributions only includes statistical errors, which are dominant, but there are significant normalization uncertainties due to detector corrections and background subtractions which are not included. These corrections are also likely to have an effect on the shape of the distributions, but it is not possible to calculate meaningful shape-uncertainties for these effects (nor are they included in the published results from ANL or BNL). It should also be noted that there is a correlation between the three E ν -dependent datasets for both ANL and separately for BNL, introduced by the procedure for reanalyzing the datasets, although this is unlikely to be a significant issue. These issues are discussed further in Appendix A.
To ensure that including the four subdominant E νdependent datasets (ν μ n → μ − pπ 0 and ν μ n → μ − nπ + for both ANL and BNL) does not badly bias the results, the fit was repeated without these four problematic datasets included. The fit results are within 1σ of the values in Table 2, which indicates that these datasets do not bias the fit strongly. Indeed, the Q 2 -dependent and E ν -dependent distributions for each channel and experiment agree reasonably well with each other. When these four datasets are excluded, the pvalue returned at the best fit point is more reasonable (∼0.02), indicating that the poor quality of fit seen in Fig. 8 can be mostly attributed to these four datasets. As there is no reason to suspect that these datasets are biasing the fit, we prefer to leave these datasets in the fit and present the fit with all 12 datasets included as the main result of this work. Using both Q 2 -dependent and E ν -dependent datasets helps to break the degeneracy between M RES A and normalization parameters, so there is a strong reason for including all datasets if possible.

Conclusions
ANL and BNL provide the only neutrino-nucleon data for the energies in the few-GeV region most relevant for current and future oscillation experiments. The large normalization discrepancy between them has led to large uncertainties in pion production parameters, which presents a problem for meeting the stringent error budgets required by current and future oscillation analyses. In this work, we use the reanalyzed ANL and BNL ν μ p → μ − pπ + datasets from Ref. [17], where this normalization discrepancy has been solved, to constrain the GENIE single pion production model parameters. The reanalysis method from Ref. [17] is applied to the subdominant pion production channels ν μ n → μ − pπ 0 and ν μ n → μ − nπ + , and Q 2 -dependent distributions for all three channels for both ANL and BNL are also used in the fits. Although the GENIE single pion model is not state of the art, it is widely used by many currently running experiments, so improvements to the parametrization are of interest to the community. Additionally, the fits described here provide a blueprint for their use in constraining other models. The global dataset (described in Sect. 2) is compared with the best fit result and post-fit uncertainties for the GENIE v2.8.2 (RES) fit, for the three single pion production channels investigated in this work. Note that the reanalyzed ANL and BNL data shown have no invariant mass cut in the event selection, whereas the other datasets have an invariant mass cut of W ≤ 2 GeV applied unless otherwise mentioned. An invariant mass cut of W ≤ 2 GeV has been applied to the GENIE prediction for this comparison We find that the uncertainty on the variable model parameters can be significantly reduced with respect to the nominal GENIE parameter uncertainties [45], which were necessarily large to cover the disagreement between the published ANL and BNL datasets. A similar conclusion was found in the context of a different model in Ref. [51]. The retuned uncertainties on these parameters should be used by neutrino oscillation and interaction experiments. To obtain good agreement with the data it was necessary to significantly reduce the non-resonant background normalization from the GENIE nominal prediction. The result of the GENIE v2.8.2 (RES) fit is compared to the global E νdependent data for the three single pion production channels of interest in Fig. 9. Most of the higher energy datasets shown (described in Sect. 2) have an invariant mass cut of W ≤ 2 GeV applied, so the same invariant mass cut has been applied to the GENIE prediction shown. Note that the reanalyzed ANL and BNL data have no invariant mass cut applied, which should be borne in mind when interpreting Fig. 9.
We note that the recent coherent pion cross section results from MINERνA [52] found a discrepancy between data and GENIE in a single pion production-dominated background sample that required significant reductions in the prediction, which may be alleviated by a reduction in the non-resonant single pion contribution as found in the fits presented here. We also note that the recent NOνA results [53] found a discrepancy between the hadronic energy distribution observed at the near detector and their GENIE simulation. There were more events in data where the hadronic system had less recoil, and fewer with high recoil, compared to the GENIE prediction. Although the discrepancy is treated as calibration effect in the NOνA analysis, it is more likely to be due to deficiencies in the GENIE cross section model. The retuned pion set of production parameters described in this work will ameliorate the NOνA discrepancy because it reduces the non-resonant pion production component, which will contribute events where the recoiling hadronic system has a lot of energy.
In this work, all available neutrino-nucleon single pion production data for neutrino energies below 10 GeV has been used to constrain the pion production parameters, including the reanalyzed ANL and BNL E ν data for the first time, which is a significant step forward toward reducing the cross section uncertainties on this channel to the level required for future neutrino oscillation experiments. Recent proposals to extract neutrino-proton pion production cross sections from experiments where the target material contains hydrogen [54] raise the possibility of new data, which will further reduce the parameter uncertainties.
In Ref. [17] we presented a method for removing flux uncertainties from the ANL and BNL bubble chamber datasets, which was applied to both the CC-inclusive and the ν μ p → μ − pπ + cross sections from ANL and BNL. For the fitting work discussed in this work, it is desirable to extend this analysis to the subdominant pion production cross sections ν μ n → μ − nπ + and ν μ n → μ − pπ 0 .
We note that for these subdominant channels, where one of the particles produced at the vertex is unobservable in a bubble chamber, we have to rely more heavily on the ANL and BNL reconstruction and particle identification methods than with the dominant ν μ p → μ − pπ + interaction where all interaction products can generally be observed. 7 It is not possible to accurately assess systematic errors for these selections, so we only quote statistical errors, which are likely to be dominant for all channels.
Appendix A.1 Obtaining corrected cross sections A full description of the method can be found in Ref. [17]. In brief, we take the event rates from ANL and BNL for the exclusive pion production channels ν μ p → μ − pπ + , ν μ n → μ − pπ 0 and ν μ n → μ − nπ + and CCQE as a function of neutrino energy, without any invariant mass cuts, which we correct for detector effects using the recommendations given in the original papers. We take the ratio of each exclusive pion production channel to the CCQE event rate (taking the ratio cancels the flux) to get a ratio of the cross sections, and then we multiply by the relatively well known CCQE cross section to obtain the cross section for each single pion production channel. Essentially, we replace the flux uncertainty in the published single pion production results with the uncertainty on the CCQE cross section, at the cost of the additional statistical uncertainty on the CCQE event rate. For ANL, the raw event rates are digitized from Refs. [55] (partial dataset) and [24] (full dataset) and are summarized in Table 4. The CCQE event rates are only given using a partial dataset using ∼30 % of the final ANL exposure, whereas the single pion production event rates use the full ANL dataset. The dominant pion production channel ν μ p → μ − pπ + was also given in Ref. [55] using the partial exposure, so the ratio of partial to full events in this channel can be used to scale the CCQE event rate to the full statistics. The final fully corrected ANL event rates for the single production channels are shown in Fig. 10a. Note that the event rates given for the partial dataset are already corrected for detector effects and backgrounds. For the full dataset, the distributions are given without detector corrections applied, but the total corrected event rate is given, so the corrected event rate can simply be obtained by scaling the raw distribution (detector corrections as a function of E ν were not considered in the ANL analysis). Note also that the ANL data is mostly from a deuterium fill of the detector, but data is also included from an initial hydrogen fill of the detector, which makes up approximately 2 % (6 %) of the full (partial) dataset. This issue only affects the ν μ p → μ − pπ + channel (as all other channels here are on a neutron), and is discussed in Ref. [17].
For BNL, the raw event rates for the single pion production datasets are digitized from Ref. [25], and for CCQE from Ref. [56], and are summarized in Table 5. Detector and background corrections are applied as calculated by BNL, which are only given without any E ν dependence as used in the original BNL analysis. The final fully corrected BNL event rates for the single pion production channels are shown in Fig. 10b.
Using the corrected event rates in Fig. 10, and the corresponding distribution for CCQE (shown in Ref. [17]), it is possible to form ratios of ν μ n → μ − pπ 0 and ν μ n → μ − nπ + over CCQE, as shown in Fig. 11. Finally, corrected cross sections for these subdominant pion production channels can be obtained by multiplying the ratio by the known CCQE cross section, to produce the final cross sections given in Fig. 12 and used in this work. This procedure has already been applied to the ν μ p → μ − pπ + channel in Ref. [17], so is not shown here. Details of the GENIE CCQE cross section  Fig. 10 The digitized event rates on deuterium for the three interaction channels ν μ p → μ − pπ + , ν μ n → μ − pπ 0 and ν μ n → μ − nπ + , as a function of the reconstructed neutrino energy E ν . The errors are statistical only. Both ANL and BNL event rates and errors have been scaled when necessary to the statistics of their full deuterium samples for ν μ -D 2 interactions used to produce Fig. 12 are also given in Ref. [17].

Appendix A.2 Error analysis
For all of the digitized datasets used in this work (summarized in Tables 4 and 5), the agreement between the total digitized event rate and published event rate agrees within 1 %. We assume that the effect of digitization on the shape of the event  Fig. 11 The ratio of ν μ n → μ − pπ 0 and ν μ n → μ − nπ + events to CCQE events as a function of E ν for both ANL and BNL rate distributions is small, and therefore neglect digitization uncertainties in this work, as in Ref. [17]. Only statistical errors are shown for the reanalyzed datasets, which are the dominant source of uncertainty for low-statistics bubble chamber data. Flux normalization uncertainties are the second largest source of uncertainty in the original ANL and BNL analyses, at around 15-20 %. These uncertainties are not considered here because they cancel (by construction) when taking ratios. However, we note that we have replaced the flux uncertainty with the uncertainty in the ν μ -D 2 CCQE cross section, where the dominant uncertainty is the axial mass, M A , which can be considered to be ∼2 % normalization error on the E ν distributions [57,58].
There is an uncertainty on the reconstructed neutrino energy for all channels which is estimated for BNL to be E ν E ν ∼ 2 % for CCQE and ν μ p → μ − pπ + events [59], and ∼5 % for other charged-current production channels which are not kinematically overconstrained (ν μ n → μ − pπ 0 and ν μ n → μ − nπ + ). ANL also quote an uncertainty of E ν E ν ≤ 5 % for the subdominant channels, but do not quote an uncer- Comparison of the ν μ n → μ − pπ 0 and ν μ n → μ − nπ + cross sections obtained by multiplying the ratio with CCQE (shown in Figure 11) by the GENIE CCQE cross section prediction for ν μ -D 2 interactions tainty on kinematically overconstrained channels [24]. This uncertainty is therefore more significant for the subdominant channels ν μ n → μ − pπ 0 and ν μ n → μ − nπ + than the dominant ν μ p → μ − pπ + channel, but for all cases, the energy smearing is E ν E ν ≤ 5 %. There are additional uncertainties for all channels which come from the detector corrections and background subtractions which are discussed for ANL in Refs. [24,55] and for BNL in Refs. [25,56]. These corrections are given on the total rate only, so no information is available from either experiment on how they may distort the shape of the E ν distributions. For the overconstrained CCQE and ν μ p → μ − pπ + channels, these are mostly corrections for reconstruction and scanning inefficiencies, with small background corrections. A conservative estimate on the normalization uncertainty for the overconstrained channels is ∼5 %. For the ν μ n → μ − pπ 0 and ν μ n → μ − nπ + channels, which are not kinematically overconstrained, the normalization uncertainty from the quoted correction factors are ∼10-15 % for both experiments. There are significantly more backgrounds for the underconstrained channels, which makes the reanalysis of these channels more dependent on the ANL and BNL calculations than the dominant ν μ p → μ − pπ + channel. These backgrounds are from the misreconstructed CCQE and ν μ p → μ − pπ + events, multipion events with unobserved final state particles and migration between the ν μ n → μ − pπ 0 and ν μ n → μ − nπ + selections. In this analysis we neglect the normalization uncertainty from detector effects for the ν μ n → μ − pπ 0 and ν μ n → μ − nπ + GeV with and without the correction described in the text channels for simplicity. Although the ∼10-15 % value is no longer negligible compared with the statistical errors, applying a fully correlated normalization error of this size would not change the results of this analysis significantly and still rests on the rather simplistic assumption that all of the detector effects and backgrounds have no E ν dependence (although this assumption is present for the published ANL and BNL cross sections measurements for these channels). We note that the size of this neglected normalization error is smaller than the flux uncertainties which are canceled in this analysis.

Appendix B Reanalyzed ANL and BNL results with an invariant mass cut of W < 1.4 GeV
ANL and BNL also published cross sections with a cut on hadronic invariant mass W < 1.4 GeV, but their publications do not include event rate distributions with the same cut as would allow a similar ratio analysis to the analysis carried out in Appendix A. Instead, we use the ratio of reanalyzed to published cross sections without an invariant mass cut as a correction factor for the W < 1.4 GeV cross sections. The published and reanalyzed cross sections have different binnings, so we fit a continuous function of the neutrino energy E ν to the ν μ p → μ − pπ + cross section: where a i are the parameters of the fit. Figure 13 shows the published and reanalyzed datasets with their fits to Eq. B.1, and the ratio of fit functions which is used to correct the W < 1.4 GeV data given in Refs. [24] (ANL) and [25] (BNL). For E ν < 1 GeV, the value of the correction function at 1 GeV is used. Figure 14 shows the cross sections for W < 1.4 GeV with and without the correction factor applied. In the ν μ p → μ − pπ + channel, where both experiments have data, the agreement is improved by the correction method.