Neural-network-driven proton decay sensitivity in the p → ν̄K+ channel using large liquid argon time projection chambers

We report on an updated sensitivity for proton decay via p → ν̄K+ at large, dual phase liquid argon time projection chambers (LAr TPCs). Our work builds on a previous study in which several nucleon decay modes were simulated and analyzed [1]. At the time, several assumptions had to be made about the detector and the backgrounds. Since then, the community has made progress in defining these, and the computing power now available enables us to fully simulate and reconstruct large samples in order to obtain a better estimate of the sensitivity to proton decay. In this work, we examine the benchmark channel p → ν̄K+, which was previously found to be one of the cleanest channels. Using an improved neutrino event generator and a fully simulated LAr TPC detector response combined with a dedicated neural network for kaon identification, we demonstrate that a lifetime sensitivity of τ/Br(p → ν̄K+) > 7 × 10^34 years at 90 % confidence level can be reached at an exposure of 1 megaton · year in quasi-background-free conditions, confirming the superiority of the LAr TPC over other technologies to address the challenging proton decay modes.


Introduction
Direct experimental observation of proton decay would constitute evidence for Grand Unification, in which the electromagnetic, weak and strong interactions are combined into a single gauge group with new massive bosons X, Y as force carriers. The minimal SU(5) is the simplest Grand Unified Theory (GUT) and enables proton decay via the transformation of two up quarks into a lepton and an anti-quark through the exchange of an X boson, predicting a lifetime of τ/Br ≈ 10^31 years for p → e+ π0 [4].
Using a simplified simulation and making several assumptions on the detector design, we found in a previous study that large dual phase (DP) LAr TPCs [2,3] can reach a lower lifetime limit of τ/Br > 10^35 years at 90 % CL in the p → ν̄K+ channel at an exposure of 1 megaton · year [1]. In this paper, we update our result using an improved event generator, a well-defined design for a ∼10 kiloton DP LAr TPC, a validated detector simulation based on data of the 3 × 1 × 1 m^3 DP LAr TPC prototype [17,18], a full reconstruction with aided pattern recognition and a neural-network-driven kaon identification. The DP LAr TPC detector design used in this study is considered as an option for the Deep Underground Neutrino Experiment (DUNE) far detector complex, which will deploy a total of four ∼10 kiloton single and dual phase LAr TPCs [19]. The main improvements in the event generator are owed to more precise models for neutrino-nucleus interactions at the GeV scale that are tuned to recent high-statistics neutrino cross section measurements, see e.g. reference [20]. In particular, the production of kaons in neutrino-nucleus interactions, which constitutes an important background for proton decay searches via p → ν̄K+, is better understood (see section 2.1).

Signal and backgrounds
Proton decay via p → ν̄K+ in argon constitutes the signal, and atmospheric neutrino interactions with argon are considered as background. An accurate modeling of the argon nucleus is essential to the presented proton decay sensitivity study, and the signal and background event samples are therefore simulated with the event generator toolkit GENIE [21]. The simulation workflows for both signal and background events include the modeling of the initial state of the argon nucleus in terms of nucleon density, momentum distribution and binding energy as well as the intranuclear propagation of particles emerging inside the nucleus. The momentum distribution and binding energy are modeled together with a so-called spectral function. Furthermore, the background simulation includes the atmospheric neutrino flux and neutrino-argon interaction models. Except for the neutrino flux, all aforementioned processes are implemented in GENIE and different models are available for each process. Consistent combinations of the interdependent processes are bundled in so-called GENIE tunes, and the sensitivity study is carried out for signal and background samples generated with two different tunes in order to assess systematic uncertainties related to the event generation, see table 1.
The HKKM2014 atmospheric neutrino flux at solar maximum for the Sanford Underground Research Facility is used in both background samples [22]. The initial HKKM2014 flux is oscillated with the NuFit v4.1 neutrino oscillation parameters [23]. The starting height of all neutrinos is set to 15 km above the earth's surface and coherent forward scattering between neutrinos and electrons inside the earth is taken into account based on the earth density profile from the Preliminary Reference Earth Model (PREM) [24].
For both signal samples, ∼100 000 events are simulated and only the reference kaon decay mode K+ → µ+ νµ, which has a branching ratio of 63.6 %, is considered [36]. The obtained results are assumed to be transferable to the remaining kaon decay modes, see section 5. The reference background sample corresponds to an exposure of 10 megaton · years and is used to tune the analysis cuts. The alternative background sample has a size of 2 megaton · years and, together with the alternative signal sample, enables the determination of systematic uncertainties related to event generator models, which represent the dominant contribution to systematic uncertainties in this study. If not otherwise mentioned, only the reference signal and background samples are discussed in more detail in the following.

Table 1. Models used in the GENIE tunes G18_02a_02_11a ("reference tune") and G18_10b_00_000 ("alternative tune"). The atmospheric neutrino flux simulation is not part of GENIE but is mentioned in this table to provide a comprehensive overview of all models involved in the event generation. Most models in tune G18_02a_02_11a are empirical, while the G18_10b_00_000 tune uses more theoretically motivated models, making these two tunes a good combination to study event generator related uncertainties. Samples generated with the G18_02a_02_11a tune are called reference samples in the following, while those generated with the G18_10b_00_000 tune are called alternative samples.
In both signal samples, the position of the decaying proton is sampled from the Woods-Saxon nucleon density distribution, and the K+ is propagated through the nucleus in steps of 0.05 fm. The interaction probability during each step is calculated with the local nucleon density and K+-nucleon scattering cross sections that are obtained from fixed target kaon scattering experiments. No binding energy is subtracted from the proton at the time of the decay. As a result of empirical tuning inside GENIE, the binding energy E_b = 25 MeV is subtracted from the scattered kaon and nucleons, and added to the energy of the remnant nucleus, if the initial kinetic energy of the K+ is greater than 100 MeV. For scattered kaons with lower initial kinetic energy and for kaons that leave the nucleus without interaction, no binding energy is removed. Since the nucleon density and K+ scattering cross sections are identical in both GENIE tunes, 32 % of K+ undergo a so-called final state interaction inside the remnant parent nucleus in both signal samples. The scattered K+ typically lose a large amount of their kinetic energy to the struck nucleon, which makes their identification more difficult (see figure 1). In the hA2018 intranuclear propagation model used for the reference sample, 11 % of all signal K+ scatter off a single nucleon while 21 % scatter off a multi-nucleon system. No charge-exchange is simulated and the scattered K+ are therefore always present in the final state outside the nucleus, accompanied by low-energy neutrons and protons. The hN2018 model used for the alternative sample includes both elastic scatters off single nucleons and charge-exchange, with 17 % of the signal K+ undergoing elastic scatters and 15 % charge-exchanging into a K0 inside the nucleus.
This results in an a priori signal selection efficiency loss of 15 % in the alternative sample, as no attempt is made to identify the emerging K0 in the presented analysis, see section 3. On the other hand, the energy loss of K+ in multi-nucleon scatters, which are only simulated in the reference sample, is higher than in single-nucleon scatters, resulting in a lower average K+ kinetic energy after intranuclear propagation in the reference sample compared to the alternative sample (see figure 1). Since low-energy K+ are more difficult to reconstruct and identify, the final signal selection efficiency in the reference sample is lower than in the alternative sample (see section 4).
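The stepping procedure described above can be illustrated with a minimal toy Monte Carlo. This is a sketch under stated assumptions and not the GENIE hA2018 or hN2018 implementation: the Woods-Saxon radius and diffuseness, the constant K+-nucleon cross section and all function names are hypothetical placeholders, and only the 0.05 fm step size is taken from the text. The toy therefore is not expected to reproduce the quoted 32 % interaction fraction.

```python
import math
import random

# Illustrative assumptions (NOT the GENIE inputs), except STEP_FM.
RADIUS_FM = 3.53        # assumed Woods-Saxon half-density radius for argon-40
DIFFUSENESS_FM = 0.54   # assumed surface diffuseness
N_NUCLEONS = 40
SIGMA_FM2 = 1.0         # assumed constant K+ nucleon cross section (1 fm^2 = 10 mb)
STEP_FM = 0.05          # step size quoted in the text
R_MAX_FM = RADIUS_FM + 8.0 * DIFFUSENESS_FM  # density is negligible beyond this


def woods_saxon_shape(r_fm):
    """Woods-Saxon profile normalized to 1 deep inside the nucleus."""
    return 1.0 / (1.0 + math.exp((r_fm - RADIUS_FM) / DIFFUSENESS_FM))


def central_density(n_bins=20000):
    """Central density (fm^-3) such that the profile integrates to N_NUCLEONS."""
    dr = R_MAX_FM / n_bins
    integral = 0.0
    for i in range(n_bins):
        r = (i + 0.5) * dr
        integral += 4.0 * math.pi * r * r * woods_saxon_shape(r) * dr
    return N_NUCLEONS / integral


def max_radial_weight(grid_fm=0.01):
    """Upper bound of r^2 * shape(r), used for rejection sampling."""
    n = int(R_MAX_FM / grid_fm) + 1
    return 1.01 * max((i * grid_fm) ** 2 * woods_saxon_shape(i * grid_fm) for i in range(n))


def sample_decay_radius(rng, weight_max):
    """Rejection-sample the decay radius from a pdf proportional to r^2 * rho(r)."""
    while True:
        r = rng.uniform(0.0, R_MAX_FM)
        if rng.uniform(0.0, weight_max) < r * r * woods_saxon_shape(r):
            return r


def kaon_interacts(rng, rho0, weight_max):
    """Step a K+ along a straight line and test for an interaction at each step."""
    r0 = sample_decay_radius(rng, weight_max)
    # By symmetry only the angle between the radius vector and the direction matters.
    cos_t = rng.uniform(-1.0, 1.0)
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    x, y = r0, 0.0
    while True:
        x += cos_t * STEP_FM
        y += sin_t * STEP_FM
        r = math.hypot(x, y)
        if r > R_MAX_FM:
            return False  # the kaon left the nucleus without interacting
        density = rho0 * woods_saxon_shape(r)
        if rng.random() < 1.0 - math.exp(-density * SIGMA_FM2 * STEP_FM):
            return True


def interaction_fraction(n_events=2000, seed=1):
    rng = random.Random(seed)
    rho0 = central_density()
    weight_max = max_radial_weight()
    n_hit = sum(kaon_interacts(rng, rho0, weight_max) for _ in range(n_events))
    return n_hit / n_events
```

Calling `interaction_fraction()` returns the fraction of toy kaons that interact before leaving the nucleus. A realistic treatment would use energy-dependent cross sections and track the kaon kinematics after each scatter.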
The differential neutrino energy spectra for neutrino-argon interactions in the reference background sample, normalized to an exposure of 1 megaton · year, are shown as a function of neutrino energy and neutrino flavor in figure 2. The number of expected neutrino interactions is obtained by integrating the differential neutrino-argon interaction spectra, yielding a total of ∼212 000 interactions for 1 megaton · year. [...] detector without further interaction. Given the nature of the signal, the production of charged kaons is of special interest. One process through which charged kaons are produced is the so-called resonant associated kaon production. In a first step, the neutrino interacts with a nucleon as a whole to create a baryon resonance, a process important for neutrino energies between 1 GeV and 5 GeV. In GENIE, the production amplitudes of 18 N and ∆ resonances with masses below 2 GeV/c^2 are calculated with the Berger and Sehgal model. Relatively heavy resonances with masses ≳ 1.6 GeV/c^2 can decay with a low probability into a K+ or K0 and an associated hyperon, typically a lambda (Λ) or sigma (Σ) baryon. The hyperons almost exclusively decay into a nucleon and a pion through the weak interaction [36]. In both CC and NC resonant associated K+ production, the hyperon and its decay products can be used to distinguish the interaction from the proton decay signal p → ν̄K+. For CC resonant K+ production, an additional lepton is present. Resonant single kaon production without accompanying hyperons is possible in CC interactions if the exchanged W− boson turns an up quark into a strange quark to produce a strange baryon resonance that can decay into a neutral or negatively charged kaon and a nucleon. This process is not implemented in GENIE, but since it is Cabibbo-suppressed and the charged lepton from the CC interaction makes it distinguishable from the signal, it is not expected to have a big impact on the presented results.
Deep inelastic scatters (DIS) can also give rise to associated and single kaon production, and both processes are implemented in GENIE. In DIS, the squared four-momentum transfer Q^2 is high enough for neutrinos to scatter off individual valence or sea quarks. The struck quark undergoes hadronization and typically produces several nucleons and pions. The radiated gluons involved in the hadronization process can produce strange-antistrange quark pairs that combine with spectator quarks to form kaons and associated hyperons. The hadronization process in GENIE is simulated with an empirical model for low energies and PYTHIA6 for high energies [37]. In CC DIS, an up quark can directly be transformed into a strange quark to produce neutral or negatively charged single kaons. Single kaons from DIS typically have higher energies than the K+ from proton decay via p → ν̄K+, and the charged lepton produced in these interactions is another handle to distinguish them from proton decay.

Detector design and simulation
The DP LAr TPC combines an active volume of liquid argon with a charge amplification and readout system in argon gas. Charged particles produce ionization charge and scintillation light as they travel through liquid argon. The ionization charge is drifted upwards and extracted into an argon gas layer by means of electric fields. Inside the argon gas, the ionization charge is amplified in so-called large electron multipliers and collected at the anode, see figure 3.
The DP LAr TPC design used in this study has been defined in the context of an extensive R&D program and is considered as a far detector option for DUNE [17]. The dimensions of the active volume are 60 × 12 × 12 m^3 (length × width × height), providing an active mass of ∼10 kilotons and an average of ∼6 atmospheric neutrino interactions per day. The 60 × 12 m^2 charge readout plane (CRP) consists of 80 independent 3 × 3 m^2 submodules, each surrounded by a gap of 1 cm. Two perpendicular sets of readout channels with a pitch of 3 mm, called view 0 and view 1, collect the charge signal in the submodules. The scintillation light is not considered in this study.
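The quoted detector numbers can be cross-checked with a few lines of arithmetic. The sketch below uses only values given in the text (active dimensions, liquid argon density of 1.4 kg/l, submodule size, ∼212 000 interactions per megaton · year); the function names and the year length of 365.25 days are choices made for this illustration.

```python
ACTIVE_DIMENSIONS_M = (60.0, 12.0, 12.0)   # length, width, height
LAR_DENSITY_T_PER_M3 = 1.4                 # 1.4 kg/l equals 1.4 t/m^3
SUBMODULE_SIDE_M = 3.0
INTERACTIONS_PER_MEGATON_YEAR = 212_000
DAYS_PER_YEAR = 365.25                     # assumption of this sketch


def active_mass_kt():
    """Active liquid argon mass in kilotons from volume times density."""
    length, width, height = ACTIVE_DIMENSIONS_M
    return length * width * height * LAR_DENSITY_T_PER_M3 / 1000.0


def n_crp_submodules():
    """Number of 3 x 3 m^2 submodules tiling the 60 x 12 m^2 readout plane."""
    length, width, _ = ACTIVE_DIMENSIONS_M
    return round(length / SUBMODULE_SIDE_M) * round(width / SUBMODULE_SIDE_M)


def interactions_per_day(mass_kt):
    """Average atmospheric neutrino interactions per day for a given mass."""
    per_kt_year = INTERACTIONS_PER_MEGATON_YEAR / 1000.0
    return per_kt_year * mass_kt / DAYS_PER_YEAR


print(f"active mass: {active_mass_kt():.1f} kt")
print(f"submodules: {n_crp_submodules()}")
print(f"interactions per day at 10 kt: {interactions_per_day(10.0):.1f}")
```

The product of the quoted dimensions and density gives about 12 kt, and a nominal 10 kt mass yields about 5.8 interactions per day, in line with the rounded values of ∼10 kilotons and ∼6 interactions per day.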

JHEP04(2021)243
In order to reduce computation time, only nine CRP submodules, arranged as a square with a total charge readout area of 9 × 9 m^2, are considered in the simulation. The maximum drift distance remains 12 m. The detector geometry is implemented in the LArSoft framework, a common LAr TPC software package for event simulation and reconstruction [38]. The signal and background final state particle four-vectors obtained from GENIE are imported into LArSoft and placed 6 m below the center of the 9 × 9 m^2 CRP inside liquid argon at event time t = 0. The energy loss, secondary interactions and decays of the final state particles are simulated in step sizes of ∼0.1 mm with GEANT4 [39]. The local number of free electrons per unit length dQ/ds for each step is calculated from the step energy loss dE and step length ds with a modified version of Birks' law:

$$ \frac{dQ}{ds} = \frac{R}{W_e} \left( -\frac{dE}{ds} \right) \qquad (2.1) $$

where W_e = 23.6 eV is the electron work function in liquid argon, which equals the average deposited energy necessary to produce one electron-ion pair [40], and R is the modified Birks' parameter, which equals the fraction of electron-ion pairs that do not recombine and therefore contribute to the charge signal:

$$ R = \frac{A}{1 + \frac{k}{\rho \, \mathcal{E}} \left( -\frac{dE}{ds} \right)} \qquad (2.2) $$

with ρ = 1.4 kg/l the liquid argon density, ℰ = 500 V/cm the nominal drift field and −(dE/ds) the local linear stopping power. The parameter values A = 0.8 and k = 0.0486 kV · MeV^−1 · g · cm^−3 have been used [41]. The electrons are drifted upwards from the center of each step. The drift time to the CRP is calculated with the drift velocity of 1.6 m/ms at the nominal drift field of 500 V/cm [42]. In order to account for longitudinal and transverse diffusion during the drift, the electron distribution at the CRP is smeared along the drift direction and in the plane perpendicular to the drift direction with a mean displacement λ_L,T = √(2 · D_L,T · t_Drift), using the diffusion constants D_L = 0.62 mm^2/ms and D_T = 1.63 mm^2/ms.
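The charge simulation chain described above (recombination, drift and diffusion) can be sketched numerically as follows. The parameter values are those quoted in the text; the functional form of the recombination factor, R = A / (1 + k/(ρ · field) · (−dE/ds)), is the standard Birks form and is an assumption here insofar as the equations are not reproduced verbatim. Function names and the example stopping power of 2.1 MeV/cm are hypothetical.

```python
import math

W_E_EV = 23.6                    # energy per electron-ion pair
BIRKS_A = 0.8
BIRKS_K = 0.0486                 # kV * g / (MeV * cm^3)
LAR_DENSITY_G_CM3 = 1.4          # 1.4 kg/l
DRIFT_FIELD_KV_CM = 0.5          # 500 V/cm
DRIFT_VELOCITY_M_PER_MS = 1.6
D_L_MM2_PER_MS = 0.62
D_T_MM2_PER_MS = 1.63


def recombination_factor(de_ds_mev_cm):
    """Fraction of electron-ion pairs that escape recombination (assumed Birks form)."""
    kappa = BIRKS_K / (LAR_DENSITY_G_CM3 * DRIFT_FIELD_KV_CM)  # cm/MeV
    return BIRKS_A / (1.0 + kappa * de_ds_mev_cm)


def free_electrons(de_mev, ds_cm):
    """Number of free electrons produced in one simulation step."""
    r = recombination_factor(de_mev / ds_cm)
    return r * de_mev * 1.0e6 / W_E_EV


def drift_time_ms(drift_distance_m):
    return drift_distance_m / DRIFT_VELOCITY_M_PER_MS


def diffusion_displacement_mm(d_mm2_per_ms, t_drift_ms):
    """Mean displacement sqrt(2 * D * t) used to smear the charge at the CRP."""
    return math.sqrt(2.0 * d_mm2_per_ms * t_drift_ms)


# Example: a 0.1 mm step with a stopping power of 2.1 MeV/cm, drifting 6 m.
t = drift_time_ms(6.0)
print(f"R = {recombination_factor(2.1):.3f}")
print(f"electrons per step = {free_electrons(0.021, 0.01):.0f}")
print(f"drift time = {t:.2f} ms")
print(f"lambda_L = {diffusion_displacement_mm(D_L_MM2_PER_MS, t):.2f} mm")
print(f"lambda_T = {diffusion_displacement_mm(D_T_MM2_PER_MS, t):.2f} mm")
```

For a 6 m drift the transverse displacement comes out at a few millimeters, which is comparable to the 3 mm readout pitch.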
The longitudinal diffusion constant D_L has been measured by several experiments and the value used in this study is within the measured range [43][44][45]. The transverse diffusion constant D_T has been measured indirectly with high precision for drift fields above 2 kV/cm. The extrapolation towards lower drift fields yields D_T ≈ 1.44 mm^2/ms at ℰ = 500 V/cm, which disagrees with the sparsely available data for low drift fields, and the used value of D_T = 1.63 mm^2/ms is thus a conservative estimate for the transverse diffusion [42]. The total gain in the CRP is set to 20 and the electrons are shared equally between the two readout views, with each electron being assigned to the closest readout channel in its respective readout view. All channels with at least one collected electron hold a waveform with the collected charge as a function of time. The charge waveform is shaped and transformed to a voltage waveform through convolution with the preamplifier shaping function P_S(t), which is parametrized by the preamplifier gain P_G = 2.5 mV/fC and the preamplifier shaping time constants τ_1 = 2.83 µs and τ_2 = 0.47 µs. The voltage waveform is digitized in samples of 400 ns with a 12 bit ADC over a dynamic range of 1800 mV. The preamplifier shaping function and gain are taken from pulsing measurements of the 3 × 1 × 1 m^3 DP LAr TPC prototype at operating conditions [17,18]. Charge attenuation during the drift due to impurities and electronic noise are not simulated. Figure 4 shows example event displays for proton decay via p → ν̄K+ with the kaon decaying into a µ+ and νµ and for a νµ CC quasi-elastic (CC QE) scatter on a neutron, the most common background process. It will be shown in section 3 that νµ CC QE scatters on neutrons are an important background when the emerging proton is misidentified as signal K+ and the muon has a similar energy as the µ+ from the K+ decay.
The event displays are a collection of ADC waveforms of neighboring channels, where the x-axes in both views directly correspond to the readout channel numbers and the drift distance on the y-axes is calculated by multiplying the drift time with the drift velocity. Thanks to the fine-grained imaging capability of LAr TPCs, all particles are clearly visible in the event display. The particle properties are reconstructed with the information stored in the waveforms and used in the analysis to distinguish proton decay from atmospheric neutrino background.

Event reconstruction
Hits are reconstructed by looking for peaks above threshold in the ADC waveforms. Peaks containing inflection points are split into separate hits. The hit charge Q is determined by summing up all samples within a hit and is stored for further reconstruction.
The next step in the reconstruction is the identification of groups of hits that originate from the same particle. Reconstruction algorithms accomplish this task by looking for two types of patterns: continuous lines of hits originating from track-like particles such as kaons, protons, pions and muons, and discontinuous cone-like groups of hits from showering particles such as photons and electrons. As these pattern recognition algorithms for LAr TPCs are currently in development, an aided pattern recognition, in which hits originating from the same particle are grouped in both readout views with Monte Carlo truth information, is used so that the significance of this study is not limited by immature reconstruction algorithms. For all particles with at least two reconstructed hits in each of the two readout views, the 2D end points of the hit groups are matched between the readout views to obtain two 3D end points. The 3D track of the particle, hereafter simply called track, is defined as a straight line with length L_Track that connects the two 3D end points. The end point in the half of the track with the lower charge content is defined as the starting point of the track and the remaining end point as the stopping point. The total charge of the track deposited in liquid argon, Q_Track,LAr, is calculated in the readout view with the most hits, also called best view, by summing up the charge of the contained hits and by correcting for the total gain in the CRP of 20 and the charge sharing between the two readout views. As both track-like and showering particles are reconstructed as straight tracks, the share of readout channels between the starting and stopping points that do not contain hits associated to the track, N_Track,missing hits, is determined and later used to distinguish track-like from showering particles.
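The two track-level definitions above (start and stop point from the charge content of the two track halves, and the share of channels without hits) can be sketched as follows. This is an illustrative sketch, not the reconstruction code of the study: the function names, the representation of hit positions as fractions along the track and the channel numbering are assumptions.

```python
def orient_track(end_a, end_b, hit_fractions, hit_charges):
    """Return (start, stop) end points of a track.

    hit_fractions gives each hit position along the line from end_a (0.0)
    to end_b (1.0). The end point in the half with the lower summed charge
    is taken as the starting point.
    """
    charge_half_a = sum(q for f, q in zip(hit_fractions, hit_charges) if f < 0.5)
    charge_half_b = sum(hit_charges) - charge_half_a
    if charge_half_a <= charge_half_b:
        return end_a, end_b
    return end_b, end_a


def missing_hit_share(first_channel, last_channel, channels_with_hits):
    """Share of readout channels between the track ends without an associated hit."""
    low, high = sorted((first_channel, last_channel))
    n_channels = high - low + 1
    n_missing = sum(1 for ch in range(low, high + 1) if ch not in channels_with_hits)
    return n_missing / n_channels


# Example: the charge rises towards end_b, so end_a is the starting point.
start, stop = orient_track((0, 0, 0), (10, 0, 0), [0.1, 0.4, 0.6, 0.9], [1.0, 1.0, 2.0, 5.0])
share = missing_hit_share(10, 19, {10, 11, 12, 14, 15, 16, 17, 18, 19})
```

The orientation rule relies on the rise of the stopping power towards the end of a stopping track, so the half with more charge contains the stopping point.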
The length of the track segment ds from which a single channel has collected charge is calculated for the best view using the readout channel pitch and the direction of the track. The corresponding local charge deposition in liquid argon dQ/ds and, through equations (2.1) and (2.2), the local energy loss −dE/ds are determined at each hit. The mean stopping power −dE/ds and residual kinetic energy E_kin,residual of the track at all hits are plotted against each other to obtain its stopping power profile, starting with the biggest hit near the track stopping point and walking along the trajectory towards the starting point while excluding the outermost hits with small charge content that originate from diffusing charge. The stopping power profiles are later used for particle identification, see section 3.2.
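Converting a measured dQ/ds back into −dE/ds amounts to inverting the recombination relation. The sketch below assumes the Birks form R = A / (1 + κ · (−dE/ds)) with κ = k/(ρ · field), which gives −dE/ds = q / (A − κ · q) for q = W_e · dQ/ds. The definition of the residual kinetic energy as the energy summed from the stopping point up to the middle of each hit segment is an assumption of this sketch, as are all function names.

```python
W_E_EV = 23.6
BIRKS_A = 0.8
KAPPA_CM_PER_MEV = 0.0486 / (1.4 * 0.5)       # k / (rho * drift field)
ELECTRONS_PER_FC = 1.0 / 1.602176634e-4       # about 6241.5 electrons per fC


def dq_ds_from_de_ds(de_ds_mev_cm):
    """Forward relation: charge per unit length (fC/cm) in liquid argon."""
    r = BIRKS_A / (1.0 + KAPPA_CM_PER_MEV * de_ds_mev_cm)
    electrons_per_cm = r * de_ds_mev_cm * 1.0e6 / W_E_EV
    return electrons_per_cm / ELECTRONS_PER_FC


def de_ds_from_dq_ds(dq_ds_fc_cm):
    """Inverse relation: stopping power (MeV/cm) from charge per unit length."""
    q = dq_ds_fc_cm * ELECTRONS_PER_FC * W_E_EV * 1.0e-6   # MeV/cm before recombination correction
    denominator = BIRKS_A - KAPPA_CM_PER_MEV * q
    if denominator <= 0.0:
        raise ValueError("charge density outside the invertible range")
    return q / denominator


def stopping_power_profile(dq_ds_fc_cm, ds_cm):
    """Return (E_kin_residual, -dE/ds) pairs, hits ordered from the stopping point.

    Assumption: the residual energy at a hit is the energy deposited between
    the stopping point and the middle of that hit segment.
    """
    profile = []
    energy_so_far = 0.0
    for dq_ds, ds in zip(dq_ds_fc_cm, ds_cm):
        de_ds = de_ds_from_dq_ds(dq_ds)
        profile.append((energy_so_far + 0.5 * de_ds * ds, de_ds))
        energy_so_far += de_ds * ds
    return profile
```

A usage example would pass the per-hit dQ/ds values of the best view, ordered from the stopping point, together with the segment lengths ds computed from the channel pitch and track direction.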

Analysis
The goal of this sensitivity study is to determine the lower proton lifetime limit per branching ratio τ/Br(p → ν̄K+) for exposures up to 1 megaton · year if no proton decay is observed. Since the lifetime limit typically increases with decreasing number of expected background events, a strong background rejection is essential to this study.
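To illustrate how exposure and signal selection efficiency translate into a lifetime limit, the following sketch uses a simple counting-experiment estimate. It assumes zero observed events and zero expected background, for which the Poisson upper limit on the number of signal events at confidence level CL is −ln(1 − CL), about 2.3 at 90 %. This is a simplification and not necessarily the statistical treatment used in the study, which also accounts for systematic uncertainties. The efficiency of 0.5 in the example is a hypothetical value.

```python
import math

AVOGADRO_PER_MOL = 6.02214076e23
ARGON_MOLAR_MASS_G = 39.948
PROTONS_PER_ARGON_ATOM = 18
GRAMS_PER_KILOTON = 1.0e9


def protons_per_kiloton():
    """Number of protons in one kiloton of argon."""
    atoms = AVOGADRO_PER_MOL * GRAMS_PER_KILOTON / ARGON_MOLAR_MASS_G
    return PROTONS_PER_ARGON_ATOM * atoms


def lifetime_limit_years(exposure_kt_years, efficiency, confidence_level=0.9):
    """Lower limit on tau/Br for zero observed events and zero background."""
    signal_upper_limit = -math.log(1.0 - confidence_level)
    return protons_per_kiloton() * exposure_kt_years * efficiency / signal_upper_limit


# Hypothetical example: 1 megaton * year exposure and 50 % signal efficiency.
limit = lifetime_limit_years(exposure_kt_years=1000.0, efficiency=0.5)
print(f"tau/Br > {limit:.2e} years")
```

In this approximation the limit scales linearly with both exposure and efficiency, which is why even a small expected background or efficiency loss directly affects the reachable sensitivity.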
The analysis is carried out in three steps with the global strategy of identifying the signal K+ and its decay products: event preselection, neural-network-driven track identification and final event selection.

Event preselection
The event preselection uses reconstructed event variables to reject background events. Three cuts, labeled 1.1, 1.2 and 1.3, are applied. The ranges of cuts 1.1 and 1.2 are chosen to include 99.9 % of signal events while considerably reducing the number of background events. Cut 1.3 allows for three or four reconstructed tracks inside the event, which correspond to the signal K+ and its daughter µ+ and e+ as well as a potential proton knocked out during the intranuclear propagation. Only tracks with a reconstructed charge in liquid argon of Q_Track,LAr > 40 fC are considered in order to prevent low-energy photons emitted after neutron captures from dominating the track multiplicity distribution shown in figure 5. The signal selection efficiencies and total numbers of background events are shown in table 2 for the individual and combined event preselection cuts.

Figure 5. Track multiplicity distribution in signal and background events. The signal distribution reflects the kaon decay chain with a K+, µ+ and e+ as well as a potential proton knocked out during the intranuclear propagation. The background distribution is dominated by quasi-elastic scatters, which typically produce track multiplicities above threshold between zero and three, with the details depending on the neutrino flavor, scattered nucleon and intranuclear propagation.

Track identification
The goal of the track identification is to determine the type of the particle that created a given track. Due to the nature of the signal and the consequent global analysis strategy of identifying the signal K+ and its decay products, a simplified identification with only two classes of particles is used: signal K+ vs. all other particles. In a first step, three preselection cuts are applied to tracks in events that survive the event preselection in order to select signal K+-like tracks and reject tracks of all other particles. Subsequently, a neural network is used to determine how signal K+-like the preselected tracks are. The three track preselection cuts, labeled 2.1 to 2.3, act on Q_Track,LAr, N_Track,missing hits and the stopping power profile, respectively. The lower cut value for Q_Track,LAr of 40 fC corresponds to a K+ length of ∼1 cm in liquid argon, which is the minimum track length for generating two hits in both readout views and thus for a successful particle identification. The upper cut value of 900 fC corresponds to the charge deposition of K+ with the maximum kinetic energy E_kin ≈ 200 MeV, see figure 1.
Since K+ are the highest ionizing particles in signal events, except for low-energy protons in the vertex region in events in which the K+ underwent a final state interaction, the reconstructed signal K+ track usually does not have missing hits from shadowing particles. Cut 2.2 is chosen accordingly to reject shower-like particles such as electrons and photons, which typically have a high share of missing hits. Cut 2.3 defines a sensible range for the stopping power profiles used in the neural network classification. The upper limit in −dE/ds of 20 MeV/cm corresponds to the reconstructed stopping power of protons near their stopping point, and E_kin,residual = 200 MeV is the maximum kinetic energy of signal K+. The neural network will not attempt to classify tracks without at least one hit in this range. 76.4 % of signal K+ tracks survive the event and track preselection, which can be interpreted as the signal selection efficiency at this stage of the analysis since every signal event contains exactly one K+ in the reference sample.
Tracks that pass the preselection are classified by a neural network that is trained with dedicated signal and background training samples generated with the reference GENIE tune. The training signal sample contains ∼65 000 events while the training background sample corresponds to an exposure of 2 megaton · years. The neural network aims at distinguishing between signal K+ and all other tracks in the signal and background samples. It is built using the TensorFlow library with an implementation of the Keras application programming interface [46,47]. The stopping power profiles of tracks that survive event and track preselection cuts are divided into 20 equally sized bins in −dE/ds between 0 and 20 MeV/cm and 20 equally sized bins in E_kin,residual between 0 and 200 MeV to serve as the 400-neuron input layer to the neural network. The stopping power profiles of signal K+ as well as protons, pions and muons in the background sample that survive event and track preselection cuts are shown in figure 6. The 400-neuron input layer is connected to the first inner layer with 64 neurons through the Rectified Linear Unit (ReLU) activation function, and the first inner layer is in turn connected through the ReLU activation function to the second inner layer, which also consists of 64 neurons. Finally, the second inner layer is connected to the output layer with 2 neurons through the softmax activation function. The two output neurons hold information about the signal K+-likeness and signal K+-unlikeness of a track. The softmax activation function forces the sum of both output values to 1 so that their information is redundant, and only the signal K+-likeness output value is used for further analysis. The network is trained for a maximum of 40 epochs. During each epoch, 90 % of the reshuffled training samples are used to train the network while the remaining 10 % are used for validation.
If the performance of the network on the validation subsample does not improve over 10 epochs, the training is stopped.
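The input binning and the 400-64-64-2 layer structure described above can be illustrated without any external library. The sketch below is a plain-Python forward pass with random, untrained weights; it is not the TensorFlow/Keras implementation used in the study. The function names, the weight initialization, the use of hit counts as bin contents and the choice of output index 0 as the signal K+-likeness are assumptions.

```python
import math
import random

N_BINS = 20
DE_DS_MAX_MEV_CM = 20.0
E_RESIDUAL_MAX_MEV = 200.0


def profile_to_input(profile_points):
    """Bin (E_kin_residual, -dE/ds) pairs into a flattened 20 x 20 = 400 input vector."""
    grid = [0.0] * (N_BINS * N_BINS)
    for e_res, de_ds in profile_points:
        if 0.0 <= e_res < E_RESIDUAL_MAX_MEV and 0.0 <= de_ds < DE_DS_MAX_MEV_CM:
            i = int(de_ds / DE_DS_MAX_MEV_CM * N_BINS)
            j = int(e_res / E_RESIDUAL_MAX_MEV * N_BINS)
            grid[i * N_BINS + j] += 1.0
    return grid


def dense(inputs, weights, biases):
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_in, n_out):
    """Untrained layer with small random weights, for illustration only."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(seed=0):
    rng = random.Random(seed)
    return [random_layer(rng, 400, 64), random_layer(rng, 64, 64), random_layer(rng, 64, 2)]


def kaon_likeness(network, input_vector):
    """Forward pass: two ReLU layers, then softmax; index 0 is taken as K+-likeness."""
    hidden = input_vector
    for weights, biases in network[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = network[-1]
    return softmax(dense(hidden, weights, biases))[0]
```

With trained weights, `kaon_likeness` would return the score on which the selection cut is placed. In a Keras implementation, the early stopping described in the text would typically be handled by a callback that monitors the validation metric with a patience of 10 epochs.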
After the training, the network is applied to all tracks that survive event and track preselection cuts in both the reference and alternative analysis samples. Only tracks with a signal K+-likeness of 0.83 or higher are considered as signal K+ in the final event selection in section 3.3, which represents the best compromise between signal K+ track selection efficiency and rejection of other tracks (see left panel of figure 7). This cut value corresponds to a signal K+ track selection efficiency of 75.9 % for tracks that survive the event and track preselection, and thus to an overall signal K+ track selection efficiency of 76.4 % · 75.9 % = 58 % after the neural network classification, with 76.4 % being the efficiency after event and track preselection. The right panel of figure 7 shows the signal K+ track selection efficiency as a function of the number of tracks misidentified as signal K+ in the background sample. Most misidentified tracks are protons since they are the most abundant charged particles in the background and the K+ and proton stopping power profiles have a non-negligible overlap due to smearing effects from the detector simulation and reconstruction, see figure 6. At a signal K+ track selection efficiency of 75.9 %, ∼10 000 tracks are misidentified as signal K+ in the full 10 megaton · years reference background sample.
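The trade-off between signal K+ track selection efficiency and misidentified background tracks can be scanned as a function of the cut on the network output. The sketch below shows the bookkeeping only; the score lists are hypothetical toy values, and the function names are choices made for this illustration. The last lines reproduce the efficiency product quoted in the text.

```python
def selection_at_threshold(signal_scores, background_scores, threshold):
    """Return (signal efficiency, number of misidentified background tracks)."""
    n_signal_pass = sum(1 for s in signal_scores if s >= threshold)
    n_background_pass = sum(1 for b in background_scores if b >= threshold)
    return n_signal_pass / len(signal_scores), n_background_pass


def scan_thresholds(signal_scores, background_scores, thresholds):
    """Efficiency and misidentification count for each candidate cut value."""
    return [(t, *selection_at_threshold(signal_scores, background_scores, t)) for t in thresholds]


# Hypothetical toy scores, for illustration only.
toy_signal = [0.95, 0.90, 0.85, 0.70, 0.40]
toy_background = [0.90, 0.60, 0.30, 0.20, 0.10, 0.05]
for threshold, efficiency, n_misid in scan_thresholds(toy_signal, toy_background, [0.5, 0.83]):
    print(threshold, efficiency, n_misid)

# Overall efficiency quoted in the text: preselection times network cut.
overall_efficiency = 0.764 * 0.759
print(f"overall signal K+ track selection efficiency: {overall_efficiency:.0%}")
```

Raising the threshold lowers both numbers, so the chosen value is a compromise between the two, as shown in figure 7.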

Final event selection
In the final event selection, three cuts are applied to the preselected events that aim at the tracks from the signal K+ and its daughter µ+ and e+. A fourth cut only allows for potential low-energy protons from final state interactions in addition to the tracks from the K+ decay chain. As opposed to cut 4. [...] Since ∼92 % of all signal K+ decay at rest and only the two-body kaon decay mode K+ → µ+ νµ is considered, cut 4.2 aims at monoenergetic µ+ with E_kin = 152.5 MeV and the corresponding charge deposition and length as chosen in cuts 4.2.1 and 4.2.2. As required by cuts 4.2.3 and 4.2.4, most µ+ tracks are track-like with less than 10 % missing hits and close to the end point of the signal K+ track. Cut 4.2.5 is introduced since some events in the background sample contain a proton that is misidentified as signal K+ as well as a muon or pion with similar length and charge deposition as the µ+ from K+ decay at rest, see top left event display in figure 8. If the muon or pion travels in the same direction as the proton in these background events, the first part of its track is shadowed by the proton and it appears as if the muon or pion emerges from the proton, just like the µ+ emerges from the K+ decay. The minimum angle α under which two close tracks can be separated depends on the charge diffusion and the length of the track and was determined to be 10° for the values used in this study.
The Michel positron from the muon decay µ+ → e+ νe ν̄µ is typically reconstructed as a shower-like track with more than 10 % missing hits, and cut 4.3.1 is set accordingly. To prevent low-energy photons in both signal and background samples from being misidentified as Michel positrons, cut 4.3.2 on the charge and cut 4.3.3 on the number of hits in the best view are introduced.
Only low-energy proton tracks can be present in the signal sample in addition to the K+, µ+ and e+. Those tracks usually have less than 10 % missing hits and are shorter than 5 cm, and cuts 4. The losses in signal selection efficiency are mainly due to low-energy K+ that have scattered inside the nucleus, badly reconstructed K+ traveling parallel or antiparallel to the drift direction and K+ decaying in flight. Figure 9 shows the signal K+ selection efficiency as a function of the true K+ kinetic energy throughout the analysis and as a function of true kaon direction after the neural network classification. The selection efficiency for low-energy K+ drops significantly after particle preselection, which can be explained by cut 2.1 that requires a minimum charge deposition in liquid argon of 40 fC per track as well as by cut 2.3 that rejects kaon tracks with unreasonable stopping power profiles, which is more likely to occur for short tracks from low-energy K+. A similar drop is observed after the neural network classification since the direction of short tracks is more likely to be misreconstructed, which leads to shifted stopping power profiles. These effects are enhanced by the diffusion of the drifting charge and could be mitigated by a better spatial resolution, see section 2.2. Additional inefficiencies are introduced by µ+ traveling in the same direction as the parent K+ (cut 4.2.5) and low-energy Michel positrons from the µ+ three-body decay that are not reconstructed and lead to a signal track multiplicity of 2 (cut 1.3 in the event preselection).
Figure 8. Event displays of persistent background events in the reference sample. In all events, the proton is misidentified as signal K+ and the π+ is mistaken for the µ+ from K+ decay at rest. The top left event justifies cut 4.2.5 as the proton shadows the first part of the π+ track in the best view. The top right event fails cut 4.3 as it has two showering particles and the two events at the bottom fail cut 4.4 since there is an additional track present, with the scattered proton in the bottom left event being reconstructed as two separate tracks.
Cut    Signal selection efficiency    Background events (efficiency)
/      100 %                          2 122 620 (100 %)
Table 3. Signal and background selection efficiencies and number of background events for event preselection (cut 1) and consecutive final event selection cuts in the reference signal and 10 megaton · years background samples. Although the background can be completely rejected in the studied sample, rare irreducible background events cannot be excluded for larger exposures and the background efficiency is therefore given as an upper limit after the last cut.
Figure 9. Left: signal K+ selection efficiency as a function of true kinetic energy throughout the analysis. Since every signal event contains exactly one K+ in the reference sample, the y-axis can also be interpreted as signal selection efficiency. The purple points show the signal K+ tracking efficiency for a similar study reported in [48] and are put into context at the end of section 4. Right: signal K+ selection efficiency as a function of true K+ start direction after the neural network (NN) classification. The ranges of θ and φ have been reduced by exploiting different symmetries in the detector: φ = 0° is parallel to the readout strips in one of the readout views and φ = 45° is in the middle of both readout view orientations. θ = 0° is parallel and antiparallel to the drift direction and θ = 90° is parallel to the charge readout plane. The selection efficiency decreases significantly for kaons that travel parallel or antiparallel to the drift direction (θ = 0°) but is stable for kaons that travel parallel to the readout strips in one of the readout views (φ = 0°).
In ten out of the eleven background events that pass cut 4.2, a proton is misidentified as signal K+ and in nine events, a charged pion is mistaken for the µ+ from the K+ decay. Out of the eleven events, six have no shower-like tracks as defined by cut 4.3 since they contain only negatively charged pions or muons and the µ− is captured by an argon atom without producing a Michel electron, and three events have two shower-like tracks instead of one. The remaining two events with one shower-like track that pass cut 4.3 have an additional track that fails cut 4.4. Figure 8 shows four event displays of persistent background events in which the proton is misidentified as signal K+ and the π+ is mistaken for the µ+ from K+ decay at rest. Even the most persistent background events are clearly distinguishable from proton decay via p → ν̄K+ in the event display since the misidentified proton shares the same vertex with other particles and its Bragg peak is not connected to a second track, as is the case for the K+ in the signal sample. The same analysis from event preselection to final event selection is applied to the alternative signal and background samples (see table 1), yielding a signal selection efficiency of 46.8 % and 0 background events in 2 megaton · years.

Proton decay sensitivity results
The lower lifetime limit per branching ratio for p → ν̄K+ can be obtained with:
$$ \tau/\mathrm{Br}\,(p \to \bar{\nu} K^{+}) > \frac{N_p \cdot T \cdot \epsilon}{S} \qquad (4.1) $$
where T is the exposure in kiloton · years, N_p = 2.7 · 10^32 the number of protons in one kiloton of argon, ε the signal selection efficiency and S the upper limit on the number of signal events at 90 % confidence level (CL) that depends on the number of observed events N and the number of expected background events B. In the previous section it has been shown that the background can be reduced to 0 for both samples. Since the considered exposures of 10 megaton · years and 2 megaton · years are beyond the expectation for DUNE, the proton decay sensitivity is only calculated for exposures up to 1 megaton · year, which is a conservative estimate for the maximum achievable exposure with DUNE. The neural network cut 4.1 in the final event selection is adjusted to obtain B = 0.5 background events at exposure steps of 200 kiloton · years for both samples separately, and the concomitant signal selection efficiencies are summarized in table 4. The mean signal selection efficiencies of both samples are used in the sensitivity calculation at the given exposures, and the systematic uncertainty is defined as the full spread between the samples. With B = 0.5 expected background events and in case no event is observed (N = 0), the upper limit on the number of signal events at 90 % CL according to Feldman-Cousins is 1.94 [49]. The resulting sensitivities are obtained with equation (4.1) and interpolated linearly between the studied exposures, see figure 10. Only the kaon decay mode K+ → µ+ νµ has been considered in this study and the obtained results are assumed to be transferable to all other kaon decay modes in the sensitivity calculation, see discussion in section 5.
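The Feldman-Cousins upper limit quoted above can be approximated with a brute-force scan over the signal mean. The following is a minimal sketch of the likelihood-ratio ordering for a Poisson process with known background, not the procedure used to produce the published tables; the function names, the scan step and the truncation at n_max counts are choices made here for illustration.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of n counts for a strictly positive mean."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def fc_accepts(n_obs, mu, b, cl=0.90, n_max=60):
    """True if n_obs is inside the acceptance region for signal mean mu (requires b > 0)."""
    ranked = []
    for n in range(n_max + 1):
        p = poisson_pmf(n, mu + b)
        mu_best = max(0.0, n - b)  # best-fit signal restricted to mu >= 0
        ranked.append((p / poisson_pmf(n, mu_best + b), n, p))
    ranked.sort(reverse=True)  # highest likelihood ratio first
    total = 0.0
    for _, n, p in ranked:
        total += p
        if n == n_obs:
            return True   # n_obs was added before the coverage was reached
        if total >= cl:
            return False  # region is complete without n_obs
    return False

def fc_upper_limit(n_obs, b, cl=0.90, mu_max=15.0, step=0.005):
    """Largest scanned signal mean whose acceptance region still contains n_obs."""
    upper = 0.0
    for i in range(int(round(mu_max / step)) + 1):
        mu = i * step
        if fc_accepts(n_obs, mu, b, cl):
            upper = mu
    return upper

# N = 0 observed events with B = 0.5 expected background events
S = fc_upper_limit(n_obs=0, b=0.5)
print(f"90 % CL upper limit on signal events: {S:.3f}")
```

With the scan step chosen here, the result is expected to agree with the quoted value of 1.94 to within about 0.01.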
The current best published limit of τ/Br (p → ν̄K+) > 5.9 · 10^33 years by Super-Kamiokande can be reached with an exposure of ∼80 kiloton · years. After an exposure of 1 megaton · year, a lower limit of τ/Br (p → ν̄K+) > 7 · 10^34 years can be achieved, reaching the predicted limits of many SUSY GUTs (see section 1).
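As a rough cross-check of these numbers, the sketch below assumes the usual counting relation τ/Br > N_p · T · ε / S with the constants given in the text. The constant efficiency of 0.5 is an assumed round value for illustration; the analysis uses exposure-dependent efficiencies from table 4, so the low-exposure figure only approximately matches the quoted ∼80 kiloton · years.

```python
N_P_PER_KT = 2.7e32  # protons per kiloton of argon (from the text)
S_90 = 1.94          # 90 % CL upper limit for N = 0, B = 0.5 (from the text)
EFFICIENCY = 0.5     # assumed constant signal selection efficiency (illustrative)

def lifetime_limit(exposure_kt_yr, efficiency=EFFICIENCY, s=S_90):
    """Lower limit on tau/Br in years for a given exposure in kiloton years."""
    return N_P_PER_KT * exposure_kt_yr * efficiency / s

def exposure_for_limit(limit_years, efficiency=EFFICIENCY, s=S_90):
    """Exposure in kiloton years needed to reach a given lifetime limit."""
    return limit_years * s / (N_P_PER_KT * efficiency)

limit_1mt = lifetime_limit(1000.0)         # 1 megaton year
exposure_sk = exposure_for_limit(5.9e33)   # current Super-Kamiokande limit

print(f"limit at 1 Mt yr: {limit_1mt:.2e} years")
print(f"exposure to reach 5.9e33 years: {exposure_sk:.0f} kt yr")
```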
A similar sensitivity study for p → ν̄K+ using a ∼10 kiloton single phase LAr TPC at DUNE has been reported in [48], reaching a signal selection efficiency of 15 % at a comparable background level. Based on visual scans, reference [48] further claims that the signal selection efficiency could be increased to 30 % with improved reconstruction algorithms. The signal K+ tracking efficiency in [48], which is the share of K+ with a reconstructed track but without any information on the nature of that track, is shown in the left panel of figure 9 as a function of true kinetic energy. The curve is comparable to the signal K+ selection efficiency after neural network selection in our study, which is the share of K+ that produced a reconstructed track that was already identified as signal K+-like by the neural network. Our analysis benefits from a better charge readout resolution of 3 mm compared to ∼5 mm in [48] combined with a dedicated neural network for kaon identification. The charge resolution is important for the detailed reconstruction of the short Bragg peak, which plays a crucial role in the particle identification (see section 3.2 and figures 4, 6 and 8). Another important difference between the dual phase LAr TPC considered in our study and the single phase LAr TPC in [48] is the number of charge readout views and their orientation: while there are two perpendicular charge readout views that both collect the arriving charge in the dual phase design, the single phase design foresees three charge readout views of which the first two record an induction signal as the charge passes by (induction planes) while the third one collects the arriving charge (collection plane). The angle between the two induction planes and the collection plane is ±35.7° [50].
The availability of a third readout view can improve the reconstruction and identification of particles that travel parallel to one of the readout views, but no major inefficiencies in particle identification have been found for kaons with such topologies in our study with two perpendicular readout views (see right panel of figure 9).

Discussion of uncertainties
The systematic uncertainty related to event generator models was assessed by using two different GENIE tunes. The dominant contribution to the uncertainty originates from the intranuclear propagation of kaons. Although the two tunes use different propagation models with different possible interactions (see section 2.1), the underlying K+-nucleon scattering cross sections are identical and yield a signal K+ scattering probability of 32 % in both tunes. Since the signal K+ typically lose a large amount of their kinetic energy in the scatters, their tracks are often too short to be identified correctly in the analysis independent of the nature of the scatter. The obtained difference in signal selection efficiencies between the two tunes of about 2–4 % is therefore relatively small, see table 4. Furthermore, the final state interaction rate of K+ inside the remnant nucleus has been cross-checked with NEUT, a generator toolkit developed in the context of the T2K experiment [51]. NEUT yields a total interaction rate of 35 % with a model combination similar to the alternative GENIE tune, confirming the interaction rate of 32 % obtained with GENIE. The presented analysis was carried out for the most common kaon decay mode K+ → µ+ νµ and the obtained results were assumed to be identical for all other kaon decay modes in the sensitivity calculation. The main kaon decay modes and branching ratios are summarized in table 5. The event preselection cuts can be easily adjusted for the other kaon decay modes, and the track multiplicity cut 1.3 would likely yield a better background rejection since most background events have a low track multiplicity (see section 3.1 and figure 5). Except for K+ → π+ π+ π−, which shows more activity at the kaon decay point, the neural network signal K+ track identification is not affected in the remaining decay modes (see section 3.2).
Subsequently, the presence of multiple particles emerging from the kaon decay point, as well as their correlations, enables a strong background rejection that is expected to be comparable to the one obtained for K+ → µ+ νµ, and the remaining cuts 4.2 to 4.4 can be adapted accordingly. Moreover, cut 4.2.5, which was introduced for µ+ traveling in the same direction as the parent K+ and which results in a signal selection efficiency loss of about 6 %, is no longer required. It is therefore reasonable to assume that the obtained results for K+ → µ+ νµ are transferable to all kaon decay modes in the sensitivity calculation.
The detector simulation parameter with the highest impact on the sensitivity limit is the transverse diffusion, see section 2.2. The transverse smearing of the charge at the starting and stopping points of a particle reduces the reconstructed charge in the first and last hit and makes the particle's reconstructed track appear longer and tilted. These effects lead to a smearing of the −dE/ds vs. E_kin,residual stopping power profiles used for the neural-network-driven signal K+ track identification, which plays a central role in the presented analysis. Since the mean transverse displacement λ_T is proportional to √t_Drift, placing all events in the center of the detector at 6 m drift in this analysis effectively leads to a higher mean transverse displacement and therefore to a bigger smearing of the −dE/ds vs. E_kin,residual curves compared to the expected random distribution of events between 0 and 12 m drift.
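The size of this effect can be estimated with a short calculation. The sketch below assumes a constant drift velocity, so that the drift time is proportional to the drift distance, and a uniform distribution of events along the 12 m drift; both are simplifying assumptions made here, and the function name is hypothetical.

```python
import math

MAX_DRIFT_M = 12.0     # full drift length (from the text)
CENTER_DRIFT_M = 6.0   # drift distance used for all events in the analysis

def mean_sqrt_uniform(max_drift_m, n_steps=100_000):
    """Average of sqrt(d) for d uniform in [0, max_drift_m] (midpoint rule)."""
    step = max_drift_m / n_steps
    return sum(math.sqrt((i + 0.5) * step) for i in range(n_steps)) / n_steps

# The displacement scales with sqrt(drift time), taken here as sqrt(drift distance)
displacement_center = math.sqrt(CENTER_DRIFT_M)
displacement_uniform = mean_sqrt_uniform(MAX_DRIFT_M)
ratio = displacement_center / displacement_uniform

print(f"mean transverse displacement, center vs. uniform: {ratio:.3f}")
```

Under these assumptions the ratio is 3/(2√2) ≈ 1.06, so fixing all events at 6 m drift overestimates the mean transverse displacement by roughly 6 % relative to uniformly distributed events.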
A process for charged kaon production in atmospheric neutrino interactions on nuclei that has not been considered in this study is the so-called charged current coherent K+ production, in which the neutrino scatters off the entire argon nucleus to create an on-shell K+ while leaving the nucleus intact. First evidence for this process has recently been found by the MINERvA experiment [52]. Although no particles leave the nucleus, the lepton from the charged current interaction makes this process distinguishable from the signal. Cosmic muon-induced backgrounds for p → ν̄K+ have been found to be negligible for large rock overburdens in our previous study and were therefore not considered in this analysis [1].

Conclusions
We have used the p → ν̄K+ benchmark channel to update our previously found sensitivity limits for several proton and neutron decay modes. In our previous study, we performed a simplified detector simulation and had to make assumptions on the detector and backgrounds. Since then, a well-defined DP LAr TPC detector design has been established, precision neutrino cross section measurements have been carried out and more sophisticated event generators have become available. These developments allowed us to update our results for the proton decay mode p → ν̄K+ with a full detector simulation and improved signal and atmospheric neutrino background samples. In this study, we have found a signal selection efficiency of ∼50 % in quasi-background-free conditions, resulting in a lower lifetime limit of τ/Br (p → ν̄K+) > 7 · 10^34 years at 90 % CL for an exposure of 1 megaton · year.
The decrease in signal selection efficiency with respect to the ∼97 % found in our previous study can largely be explained by a low signal K+ identification efficiency for low-energy K+ that scattered inside the nucleus and by badly reconstructed tracks with difficult topologies, especially parallel or antiparallel to the drift direction, two effects that have previously not been considered to their full extent (see figure 9). A better spatial resolution could improve the reconstruction of low-energy K+ and increase the sensitivity to proton decay via p → ν̄K+.
While the detector design and simulation parameters are well defined, the reconstruction and analysis used in this study can be further improved to yield a higher signal selection efficiency, which is supported by the fact that the event displays of some of the most persistent background events in this analysis are clearly distinguishable from the signal (see figures 4 and 8). An aided pattern recognition was used in this study instead of a full pattern recognition algorithm (see section 2.3), but the additional loss in signal selection efficiency by using a full pattern recognition algorithm is expected to be small since it would mainly affect events with short tracks and difficult topologies that already failed the selection cuts in the presented analysis. Except for said short tracks, the neural network signal K+ identification shows a good performance with losses of only ∼5 % for kaons above 80 MeV (see left panel of figure 9).
Considering the latest published Super-Kamiokande result with a signal selection efficiency of 10 % and ∼0.5 expected background events at an exposure of 260 kiloton · years for p → ν̄K+ [16], we can confirm that the LAr TPC technology is superior to Water Cherenkov detectors for many of the challenging nucleon decay modes. Moreover, LAr TPCs are ideal for discoveries at the few events level thanks to their excellent imaging capabilities and concomitant background rejection.