Measurement of the very rare $K^+\rightarrow\pi^+\nu\bar{\nu}$ decay

The NA62 experiment reports the branching ratio measurement BR$(K^+ \rightarrow \pi^+ \nu\bar{\nu}) = (10.6^{+4.0}_{-3.4} |_{\rm stat} \pm 0.9_{\rm syst}) \times 10 ^{-11}$ at 68% CL, based on the observation of 20 signal candidates with an expected background of 7.0 events from the total data sample collected at the CERN SPS during 2016-2018. This provides evidence for the very rare $K^+ \rightarrow \pi^+ \nu\bar{\nu}$ decay, observed with a significance of 3.4$\sigma$. The experiment achieves a single event sensitivity of $(0.839\pm 0.054)\times 10^{-11}$, corresponding to 10.0 events assuming the Standard Model branching ratio of $(8.4\pm1.0)\times10^{-11}$. This measurement is also used to set limits on BR($K^+ \to \pi^+ X$), where $X$ is a scalar or pseudo-scalar particle. Details are given of the analysis of the 2018 data sample, which corresponds to about 80% of the total data sample.


Introduction
The K + → π + νν decay is a Flavour Changing Neutral Current (FCNC) process that proceeds at the lowest order in the Standard Model (SM) through electroweak box and penguin diagrams, both dominated by t-quark exchange. The quadratic Glashow-Iliopoulos-Maiani (GIM) mechanism and the transition from a top to a down quark make this process extremely rare. Using tree-level elements of the Cabibbo-Kobayashi-Maskawa (CKM) matrix as external inputs, the SM predicts the branching ratio to be BR = (8.4 ± 1.0) × 10 −11 [1], where the uncertainty is dominated by those of the CKM parameters V cb and γ. The intrinsic theoretical uncertainty is 3.6%, related to the uncertainty of the NLO (NNLO) QCD corrections to the top (charm) quark contribution [2,3] and NLO electroweak corrections [4]. The uncertainties due to the hadronic matrix element governing the K-π transition are negligible, because they are evaluated using the precisely measured branching ratio of K + → π 0 e + ν, corrected for isospin-breaking and non-perturbative effects calculated in [4][5][6].
The K + → π + νν decay is among the most promising modes to search for non-SM signals in flavour physics, and it probes higher mass scales than other rare meson decays. The largest deviations from SM predictions are expected in models with new sources of flavour violation, where constraints from B physics are weaker [7,8]. Models with tree-level FCNCs of leftor right-handed chirality produce specific correlation patterns between the branching ratios of K L → π 0 νν and K + → π + νν decay modes, which are constrained by the value of the CPviolation parameter K [9,10]. Present experimental constraints limit the range of variation within supersymmetric models [11][12][13]. The K + → π + νν decay is also sensitive to some aspects of lepton flavour non-universality [14] and can constrain leptoquark models [15,16].
The NA62 experimental signature of the K + → π + νν decay is an incoming K + , an outgoing π + , and missing energy and momentum. The K + → π + X decay, where X can be a scalar or a pseudo-scalar particle, has the same signature. A search for a feebly interacting particle X gives access to physics Beyond the Standard Model (BSM) at low energies. In a hidden-sector portal framework, the X particle mediates interactions between SM and hidden-sector fields [20]. A scalar mediator X can mix with the SM Higgs in inflationary [21], scale invariant [22], and relaxion [23] models, which all have cosmological implications. Models where X is an axion, which acquires mass from the explicit breaking of the Peccei-Quinn (PQ) symmetry [24,25], can be a signature of the PQ mechanism and can solve the strong CP problem [26,27]. A QCD axion with a mass of O(10 −4 ) eV could be a dark matter candidate, and specific axion models can also solve the SM flavour problem [28]. In a broader class of models, X is considered as an axion-like particle (ALP) that acts as a pseudoscalar mediator [29].
The NA62 experiment collected its first data in 2016-2018. In the following, the K + → π + νν analysis of the 2018 data set is described and the combined result based on the full threeyear sample is presented. In addition, a search for the decay K + → π + X is performed and upper limits are established in a particular scenario [20], where X is a scalar particle.

Beam line and detector
The NA62 beam line and detector are sketched in Figure 1 and a detailed description can be found in [30]. The beam line defines the Z-axis of the experiment's right-handed laboratory coordinate system. The origin is the kaon production target, and beam particles travel in the  Figure 1: Schematic top view of the NA62 beam line and detector. Dipole magnets are displayed as boxes with superimposed crosses. The label "COL" denotes the collimator named "final collimator" in the text. The label "CHOD" refers both to the CHOD and NA48-CHOD detectors. Also shown is the trajectory of a beam particle in vacuum which crosses all the detector apertures, thus avoiding interactions with material. A dipole magnet between MUV3 and SAC deflects the beam particles out of the SAC acceptance.
positive Z-direction. The Y-axis is vertical (positive up), and the X-axis is horizontal (positive left).
The kaon production target is a 40 cm long beryllium rod. A 400 GeV proton beam extracted from the CERN SPS impinges on the target in spills of three seconds effective duration. Typical intensities during data taking range from 1.9 to 2.2 × 10 12 protons per pulse. The resulting secondary hadron beam of positively charged particles consists of 70% π + , 23% protons, and 6% K + , with a nominal momentum of 75 GeV/c (1% rms momentum bite).
Beam particles are characterized by a differential Cherenkov counter (KTAG) and a threestation silicon pixel matrix (Gigatracker, GTK, with pixel size of 300 × 300 µm 2 ). The KTAG uses N 2 gas at 1.75 bar pressure (contained in a 5 m long vessel) and is read out by photomultiplier tubes grouped in eight sectors. It tags incoming kaons with 70 ps time-resolution. The GTK stations are located before, between, and after two pairs of dipole magnets (a beam achromat), forming a spectrometer that measures beam particle momentum, direction, and time with resolutions of 0.15 GeV/c, 16 µrad, and 100 ps, respectively.
The last GTK station (GTK3) is immediately preceded by a steel collimator (final collimator, COL). In the first part of the 2018 data taking period, as in 2016 and 2017, the inner variable aperture of the 1 m thick collimator was typically set at 66 × 33 mm 2 , and its outer dimensions were about 15 cm. It served as a partial shield against hadrons produced by upstream K + decays. Shortly after the start of the 2018 data-taking period, this collimator was replaced by a 1.2 m thick collimator with outer dimensions 1.7 × 1.8 m 2 and a central, race-track shaped aperture of 76 × 40 mm 2 , designed to absorb all hadrons emitted by upstream K + decays and passing outside its aperture.
GTK3 marks the beginning of a 117 m long vacuum tank. The first 80 m of the tank define a volume in which about 13% of the kaons decay. The beam has a rectangular transverse profile of 52 × 24 mm 2 and a divergence of 0.11 mrad (rms) in each plane at the decay volume entrance.
The time, momentum, and direction of charged daughters of kaon decays-in-flight are measured by a magnetic spectrometer (STRAW), a ring-imaging Cherenkov counter (RICH), and two scintillator hodoscopes (CHOD and NA48-CHOD). The STRAW, consisting of four straw chambers, two on each side of a dipole magnet, measures three-momenta with a resolution, σ p /p, between 0.3% and 0.4%. The RICH, filled with neon at atmospheric pressure, tags the decay particles with a timing precision of better than 100 ps and provides particle identification. The CHOD, a matrix of tiles read out by silicon photomultipliers, and the NA48-CHOD, comprising two orthogonal planes of scintillating slabs reused from the NA48 experiment, are used for triggering and timing, providing a time measurement with 200 ps resolution.
Other sub-detectors suppress decays into photons or into multiple charged particles (electrons, pions or muons) or provide complementary particle identification. Six stations of plastic scintillator bars (CHANTI) detect, with 1 ns time resolution, extra activity, including inelastic interactions in GTK3. Twelve stations of ring-shaped electromagnetic calorimeters (LAV1 to LAV12), made of lead-glass blocks, are located inside and downstream of the vacuum tank to achieve full acceptance for photons emitted by K + decays in the decay volume at polar angles between 10 and 50 mrad. A 27 radiation-length thick, quasi-homogeneous liquid krypton electromagnetic calorimeter (LKr) detects photons from K + decays emitted at angles between 1 and 10 mrad. The LKr also complements the RICH for particle identification. Its energy resolution in NA62 conditions is σ E /E = 1.4% for energy deposits of 25 GeV. Its spatial and time resolutions are 1 mm and between 0.5 and 1 ns, respectively, depending on the amount and type of energy released. Two hadronic iron/scintillator-strip sampling calorimeters (MUV1, MUV2) and an array of scintillator tiles located behind 80 cm of iron (MUV3) supplement the pion/muon identification system. MUV3 has a time resolution of 400 ps. A lead/scintillator shashlik calorimeter (IRC) located in front of the LKr, covering an annular region between 65 and 135 mm from the Z-axis, and a similar detector (SAC) placed on the Z-axis at the downstream end of the apparatus, ensure the detection of photons down to zero degrees in the forward direction. Additional counters (MUV0, HASC) installed at optimized locations provide nearly hermetic coverage for charged particles produced in multi-track kaon decays.
All detectors are read out with TDCs, except for LKr, MUV1 and MUV2, which are read out with 14-bit FADCs. The IRC and SAC are read out with both. All TDCs are mounted on custom-made (TEL62) boards, except for GTK and STRAW, which each have specialized TDC boards. TEL62 boards both read out data and provide trigger information. A dedicated processor interprets calorimeter signals for triggering. A dedicated board (L0TP) combines logical signals from the RICH, CHOD, NA48-CHOD, LKr, and MUV3 into a low-level trigger (L0) whose decision is dispatched to sub-detectors for data readout [31]. A software trigger (L1) exploits reconstruction algorithms similar to those used offline with data from KTAG, LAV, and STRAW to further reduce the data volume before storing it on disk.
The data come from about 5 × 10 5 SPS spills accumulated during a seven-month data-taking period in 2018, recorded at a mean instantaneous beam particle rate of 500 MHz, measured event-by-event using the number of signals recorded out-of-time in the GTK detector. The average beam particle rate per spill was stable within ±10% throughout the data-taking period, while the instantaneous value showed fluctuations up to a factor of two around the average. The data were collected using a trigger specifically setup for the K + → π + νν measurement, called PNN trigger, concurrently with a minimum-bias trigger. The PNN trigger is defined as follows. The L0 trigger requires a signal in the RICH to tag a charged particle. The time of this signal, called trigger time, is used as a reference to define a coincidence within 6.3 ns of: a signal in one to four CHOD tiles; no signals in opposite CHOD quadrants to suppress K + → π + π + π − decays; no signals in MUV3 to reject K + → µ + ν decays; less than 30 GeV deposited energy and no more than one cluster in the LKr to reject K + → π + π 0 decays. The L1 trigger requires: a kaon identified in KTAG; signals within 10 ns of the trigger time in at most two blocks of each LAV station; at least one STRAW track corresponding to a particle with momentum below 50 GeV/c and forming a vertex with the nominal beam axis upstream of the first STRAW chamber. Events collected by the PNN trigger are referred to as PNN events or PNN data. The minimum-bias trigger is based on NA48-CHOD information downscaled by a factor of 400. In this case, the trigger time is the time of the NA48-CHOD signal. Data collected by the minimum-bias trigger are used at analysis level to determine the K + flux, to measure efficiencies, and to estimate backgrounds. These data are called minimum-bias events or minimum-bias data.
Acceptances and backgrounds are evaluated using a Monte Carlo (MC) simulation based on the GEANT4 toolkit [32] to describe detector geometry and response. The K + decays are generated in the kaon rest frame using the appropriate matrix elements and form factors. The simulation also includes a description of the collimators and dipole and quadrupole magnets in the beam line, necessary to accurately simulate the beam shape. Certain aspects of the simulation are tuned using input from data, namely signal formation and readout detector inefficiencies. Accidental activity is added to the KTAG and GTK signals using the distribution of the instantaneous beam particle rate measured with data, and a library of GTK pileup hits built from GEANT4 simulations. No accidental activity is simulated in the detectors downstream of the last station of the beam tracker. Simulated data are subject to the same reconstruction and calibration procedures as real data.

Analysis method
The experimental signature of the K + → π + νν decay consists of a K + with 4-momentum P K in the initial state and a π + with 4-momentum P π and missing energy in the final state. The kinematic variable used to discriminate between the signal and background K + decays is the squared missing mass m 2 miss = (P K − P π ) 2 . This variable is used to define the two regions used to search for the signal (Region 1 and Region 2, as shown in Figure 2), and to separate it from the other K + decay backgrounds.
The data set collected in 2018 is divided into two subsets, S1 and S2, which correspond to the periods before (20% of the data set) and after (80% of the data set) the installation of the new final collimator COL. The subset S2 is further divided into six categories corresponding to equal 5 GeV/c bins of pion momentum, p π + , in the range 15-45 GeV/c. The subset S1 is considered as a separate category and is integrated over p π + due to its small size. A dedicated selection is applied to each category, which improves signal sensitivity. Data sets from 2016-2017, analyzed in [18] and [19], are added as two separate categories, each integrated over p π + , for a total of nine categories.
The measurement of BR(K + → π + νν) relies on the calculation of the single event sensitivity (SES) and the background evaluation for each category. The SES is defined as 1/(N K + · πνν ), where N K + is the effective number of K + decays in a pre-defined fiducial region and πνν is the signal selection efficiency. The K + → π + π 0 decays selected from minimum-bias data are used as normalization to compute N K + . Signal and normalization decays share the same selection defined by the presence of a single π + forming a vertex with a parent K + inside the fiducial region. The rejection of extra activity from photons or charged particles is applied only to the signal selection. Control regions ( Figure 2) are used to validate the background estimates. Control and signal regions are masked until the completion of the analysis to avoid possible bias during the optimization of the selection conditions. The K + → π + νν branching ratio is obtained from a binned log-likelihood fit using the signal acceptance and background expectation in each category. miss as a function of p π + for minimum-bias events selected without applying π + identification and photon rejection, assuming the K + and π + mass for the parent and decay particle, respectively. Signal regions 1 and 2 (hatched areas), as well as 3π, π + π 0 , and µν background regions (solid thick contours) are shown. The control regions are located between the signal and background regions.

Event selection
The signal and normalization channels both require the presence of a downstream charged particle track identified as a π + and of a parent K + track that forms a vertex in the fiducial volume. After these common selection criteria, specific requirements define the normalization and signal events.
Downstream charged particle: One or two isolated STRAW tracks are allowed in an event. If two STRAW tracks are present, the one closest to the trigger time is selected. Events with a negatively-charged STRAW track are rejected to remove K + → π + π + π − and K + → π + π − e + ν decays. The selected track must be within the RICH, CHOD, NA48-CHOD, LKr, and MUV3 sensitive regions and must be spatially associated to signals in the RICH, CHOD, NA48-CHOD, and LKr. The track angle measured after the spectrometer magnet must be geometrically compatible with the centre of the reconstructed RICH ring. Time constraints are imposed on the associated signals in the RICH, CHOD and LKr using the NA48-CHOD time as a reference. A STRAW track with associated signals in CHOD, NA48-CHOD, RICH and LKr defines a downstream charged particle.
Parent K + : The parent K + of a selected downstream charged particle is defined by the signal in KTAG with time T KTAG closest in time and within 2 ns of the downstream particle, and a beam track in GTK with time T GTK within 600 ps of the KTAG signal and associated in space with the downstream track in the STRAW. The association between the GTK, KTAG and STRAW signals relies on a discriminant built from the time difference ∆T = T GTK -T KTAG and the closest distance of approach (CDA) of the downstream charged particle to the GTK track. The templates for the ∆T and CDA distributions of K + decays are obtained from a dedicated sample of K + → π + π + π − decays, where the K + is fully reconstructed using the pion momenta and directions measured by the STRAW. The GTK track with the largest value of the discriminant is then identified as the parent K + . The same discriminant, but with the RICH time of the downstream particle used as input instead of T KTAG , strengthens the time coincidence and further rejects accidental GTK tracks. The selected K + must be consistent with the nominal beam momentum and direction and the CDA must be less than 4 mm. No more than five reconstructed GTK tracks are allowed. Events with exactly one reconstructed GTK track are rejected if there are additional hits within 300 ps in two GTK stations incompatible in space with the selected K + track. The last condition reduces the impact of GTK inefficiencies and of K + decays between the second and third GTK stations.
Kaon decay: The mid-point of the segment at the CDA of the downstream charged particle to the parent K + defines the kaon decay vertex. The Z position of the kaon decay vertex (Z vertex ) must be inside the region 110-165 m (110-170 m) for S1 (S2), referred to as the fiducial volume (FV) in the following. In addition, the FV of the first p π + bin (15-20 GeV/c) is limited to 110-155 m to reduce the background from K + → π + π 0 decays by exploiting the correlation between momentum and Z vertex of the two-body decay kinematics. In the last two momentum bins (35)(36)(37)(38)(39)(40)(41)(42)(43)(44)(45) GeV/c) the FV is limited to 110-160 m to suppress K + → π + π − e + ν decays and upstream background, which dominate at high momentum. Further Z vertex -dependent constraints are imposed on the angle of the downstream charged particle. Its backward-extrapolated position at the exit of the final collimator (COL) must be outside a rectangular box with transverse dimensions 100 × 500 mm 2 . This condition is referred to as the box cut in the following and is applied to S1. The box cut is needed to remove upstream K + decays entering the FV through the aperture of the last achromat dipole, and leads to a 40% signal loss. For the sample collected with the new collimator, which completely covers the aperture, the box cut size is reduced and most of the acceptance loss is recovered. This allows further optimization of selection criteria using a Boosted Decision Tree (BDT) algorithm for S2 instead of the cut-based approach applied to S1. The BDT inputs are nine variables, which quantify the (X,Y,Z) coordinates and direction of the downstream charged particle measured by the STRAW, and those reconstructed at the decay vertex. The BDT is trained on an out-of-time data sample enriched in upstream K + decays. A cut on the resulting BDT output value is chosen to provide the same background rejection as a cut-based selection using the same variables, while increasing the signal acceptance by 8%. Finally, no signal must be present in the CHANTI detector to reduce the contribution from K + inelastic interactions in GTK3.

Pion identification:
The π + identification uses information from the calorimeters and the RICH and requires that no signal is reconstructed in MUV3 within 7 ns of the π + time, which reinforces the trigger condition. A multivariate classifier resulting from a BDT algorithm combines 13 variables describing the energy associated with the π + in the calorimeters, the shape of the clusters and the energy sharing between LKr, MUV1 and MUV2. Samples of π + , µ + and e + selected from 2017 data not included in the present analysis are used for training.
The π + identification by the RICH uses two different approaches to reconstruct a Cherenkov ring. In the first approach, the track direction, as measured by the last two STRAW chambers, is used to predict the position of the ring centre, and the expected ring radius is calculated for each mass hypothesis (e + , µ + , π + , K + ) using the track momentum. For each mass hypothesis, a likelihood is built by comparing the observed hit positions in the RICH and a circle defined by the expected ring radius, and a cut is applied on the value of the largest non-π + likelihood to remove ambiguous events. In the second approach, the ring centre and radius are determined by a χ 2 fit to the hit positions, and the charged particle mass is derived using the track momentum. A cut on the measured mass is then applied to distinguish pions from muons. Particle identification criteria with the calorimeters and RICH are optimized separately for each data category to achieve the best signal sensitivity.
Normalization selection: The selection of the normalization K + → π + π 0 events is applied to minimum-bias data and requires 0.010 < m 2 miss < 0.026 GeV 2 /c 4 and 15 < p π + < 45 GeV/c. The width of the normalization region is defined to be ±8σ, where σ is the m 2 miss peak resolution. The resolution depends on the π + track momentum and angular resolution, mostly due to multiple scattering in the STRAW chambers. The MC simulation reproduces the m 2 miss resolution to 10-20% and the discrepancy is propagated as an uncertainty in the SES.
Signal selection: The selection of the signal events is applied only to PNN data and requires that no in-time photons or additional charged particles are present. An in-time photon in the LKr is defined as an energy cluster located at least 100 mm away from the π + impact point and coincident in time with the π + . The size of the time coincidence window varies with the amount of deposited energy and ranges from ±5 ns below 1 GeV to ±50 ns above 15 GeV. In-time photons in the LAV are identified if a signal is found in any of the twelve LAV stations within 3 ns of the π + time. A similar method is used for the IRC and SAC, where an in-time photon is defined by either a TDC signal within 7 ns of the π + time or a FADC signal of at least 1 GeV within 7 ns of the π + time.
Multi-charged particle rejection discriminates against interactions of photons or charged particles in the RICH mirrors, and against K + → π + π + π − or K + → π + π − e + ν decays with partially reconstructed STRAW tracks. The former category is identified by the presence of isolated signals in at least two of the CHOD, NA48-CHOD and LKr detectors as well as intime signals from the veto detectors MUV0 and HASC. The latter category is characterized by the presence of tracks segments in the first two or the last two STRAW chambers, which are consistent with a particle coming from the kaon decay vertex.
Additional constraints are imposed on the m 2 miss value using: the π + momentum extracted from the RICH ring measurement in the π + mass hypothesis instead of the STRAW measurement; the nominal beam momentum and direction instead of the K + track measured by GTK. This reduces the kinematic tails due to multiple scattering in the STRAW or wrong K/π association.
The minimum momentum value is fixed at 15 GeV/c by the RICH threshold for efficient pion detection. The maximum value is fixed at 35 GeV/c in Region 1, because the K + → µ + ν decay distribution approaches the signal region at high momenta, and at 45 GeV/c in Region 2 to remove K + → π + π − e + ν and upstream backgrounds (see Figure 2).

Single event sensitivity
The following expression is used to compute the SES value: Here N ππ is the number of selected K + → π + π 0 normalization events; BR(K + → π + π 0 ) is the K + → π + π 0 branching ratio [33]; D = 400 is the downscaling factor of the minimum-bias trigger; A πνν and A ππ are the signal and normalization acceptances, respectively, evaluated with simulations; 1 − RV is the inefficiency resulting from the random veto induced by the photon and multi-charged particle rejection due to the presence of accidental activity in the detectors; 0.54 ± 0.04 0.14 ± 0.01 N exp πνν 1.56 ± 0.10 ± 0.19 ext 6.02 ± 0.39 ± 0.72 ext Table 1: Inputs to the SES evaluation, SES values and numbers of expected SM signal events in the S1 and S2 subsets.
resulting SES values, and the corresponding numbers of expected SM K + → π + νν events for the S1 and S2 subsets, integrated over p π + , are summarized in Table 1.
The K + → π + νν decays are simulated using form factors derived from the K + → π 0 e + ν decay. The accuracy of the description of particle identification and K/π association dominates the uncertainties of A πνν and A ππ in Table 1, but these effects cancel to first order in the ratio A ππ /A πνν . The relative contribution of π 0 Dalitz decays, π 0 → e + e − γ, to the SES result is estimated to be 0.7% and is assigned as a systematic uncertainty to A ππ . A systematic uncertainty of 3.5% is propagated to the SES value to take into account the quality of the description of π + interactions with the material upstream of the LKr, as well as of the m 2 miss distribution. The former effect is estimated using K + → π + π 0 decays with two photons selected in the LAV stations, and the latter by the study of the data/MC agreement of simulated K + → π + π 0 decays. The PNN trigger efficiency is a product of L0 and L1 efficiencies. The non-calorimetric L0 trigger efficiency is measured using a sample of K + → π + π 0 events selected from minimumbias data by applying signal-like selection criteria without tagging the π 0 . The additional requirement that both photons are detected by the LAV stations defines the sample used to measure the calorimetric L0 trigger efficiency. The L1 trigger efficiency is measured using a K + → µ + ν sample selected from minimum-bias data and a K + → π + π 0 sample triggered by the PNN L0 condition. The 2% uncertainty of this efficiency is propagated to the SES measurement. To test the assumption that the L0 and L1 trigger efficiencies are uncorrelated, a sample of K + → µ + ν events is obtained from minimum-bias data by applying the K + → π + νν selection criteria except for RICH π + identification. The same selection is applied to PNN data and the number of events in the µν region is compared to the number of minimum-bias events after correction for the measured trigger efficiency and the downscaling factor D. The observed discrepancy of up to 5%, stable across the whole 2018 period, is propagated to the SES measurement.
The random-veto parameter RV is measured using a sample of K + → µ + ν decays from minimum-bias data, selected similarly to K + → π + νν decays but with inverted particle identification criteria (µ + instead of π + ) and in the µν m 2 miss region. The fraction of events left after applying the photon and multi-charged particle rejection is measured to be RV = 0.66 ± 0.01, including a correction of +0.02 to account for activity in LAV and CHOD induced by the δrays produced by muons in the RICH mirrors, as calculated from simulation. The value of RV depends on the instantaneous beam intensity, and its uncertainty is evaluated by extrapolating  Figure 3: Evaluation of the π + π 0 background. Left: Distribution in the (p + π , m 2 miss ) plane of events in the π + π 0 region and in the adjacent control regions after the complete signal selection is applied to the S1 and S2 subsets. The intensity of the grey shaded area reflects the variation of the SM signal acceptance in the plane. Right: Data/MC comparison of the m 2 miss distribution of minimum-bias K + → π + π 0 events selected by tagging the π 0 → γγ decay. This data sample is used to measure the K + → π + π 0 kinematic factor f kin .

Background evaluation and validation
Background contributions to the K + → π + νν final state can be identified from two processes: K + decays inside the FV to a final state different from the signal; upstream events where a π + originates either from a K + decay or from an interaction between a beam K + and the material upstream of the FV.
The four main K + decay backgrounds are K + → π + π 0 , K + → µ + ν, K + → π + π + π − and K + → π + π − e + ν. The first three enter the signal regions if m 2 miss is mis-reconstructed. The estimation of these backgrounds relies on the assumption that π 0 rejection for K + → π + π 0 , particle identification for K + → µ + ν, and multi-charged particle rejection for K + → π + π + π − are independent of the m 2 miss variable defining the signal regions. After the K + → π + νν selection is applied, the expected number of events in the signal or control regions is computed for each category as: Here N bkg is the number of PNN-triggered events in the π + π 0 , µν or 3π background region and f kin (region) is the fraction of events reconstructed in the signal or control region for each decay mode. The values of the kinematic factor f kin (region) are obtained: for K + → π + π 0 and K + → µ + ν by using minimum-bias data samples with dedicated selections; for K + → π + π + π − by using simulated events. Backgrounds from K + → π + π − e + ν, K + → π + γγ and semileptonic decays K + → π 0 l + ν (l = e, µ) are evaluated only with simulations.
K + → π + π 0 : After the K + → π + νν selection, 471 events are observed in the π + π 0 region (Figure 3, left). The kinematic factor is measured using a minimum-bias data sample with a PNN-like selection applied. The π 0 → γγ decay is tagged by requiring exactly two photon clusters reconstructed in the LKr, consistent with m π 0 assuming that the photons originate from the K + decay vertex, and no other activity in SAC, IRC or LAV. This provides a backgroundfree sample of K + → π + π 0 decays selected with no constraints on the π + kinematic variables. The resulting m 2 miss distribution is shown in Figure 3, right. The assumption that the π 0 tagging is independent of the kinematics is tested by comparing the measured kinematic factor in Region 1 with and without the π 0 tagging applied. The disagreement of 3% is assigned as a systematic uncertainty. The background estimates in the signal and control regions are obtained using equation 2 and are validated by comparing expected and observed numbers of events within the control regions. The presence of radiative K + → π + π 0 γ decays from inner bremsstrahlung increases the fraction of K + → π + π 0 background. A correction is applied to account for this effect using the simulated spectrum, combined with a measurement of the energy-dependent single photon detection efficiency of SAC, IRC, LAV and LKr on data [34]. The value of the correction represents 8% of the K + → π + π 0 background and a 100% systematic uncertainty is assigned to account for the precision of the simulation.
K + → µ + ν: After the K + → π + νν selection, 14112 events are observed in the µν region. To take into account the correlations between the RICH particle identification and the kinematic selection criteria, f kin (region) is measured using a minimum-bias data sample. The K + → µ + ν decays are selected applying signal-like conditions, requiring the charged particle to satisfy the µ + identification criteria in the calorimeters and the π + identification criteria in the RICH. Similarly to K + → π + π 0 , the kinematic factor f kin (region) is measured by dividing the number of events in the signal or control region by the number of events in the µν region. The µ + identification applied to the minimum-bias sample suppresses the contribution from muons decaying in flight as µ + → e + ν eνµ with e + misidentified as π + . A simulation-driven correction of +3% (relative) is applied to the K + → µ + ν background estimation in Region 1, to account for this effect. Events with real photon emission are included in the measured f kin .
K + → π + π + π − : The m 2 miss distribution of the three-body decay spans over a wide kinematic region. This background is computed using an approach similar to the K + → π + π 0 background estimation while the kinematic factor f kin (region) is measured using MC simulations. A selection requiring only a match between a π + in the final state and the parent K + is applied to a sample of simulated K + → π + π + π − decays. To account for the resolution of the m 2 miss variable, the factor f kin (region) is computed in bins of m 2 miss . The m 2 miss -dependent value of f kin (region) is then multiplied by the number of events in the corresponding m 2 miss bin of the 3π region in data after the complete signal selection. The K + → π + π + π − background is obtained after integrating the background estimates in each m 2 miss bin.
K + → π + π − e + ν: This decay is characterized by large values of m 2 miss and contributes only to Region 2. The contribution is suppressed by the O(10 −5 ) branching ratio, multi-charged particle rejection, particle identification and kinematics. A sample of 2×10 9 MC simulated K + → π + π − e + ν decays is used to estimate the background. The simulation is validated by selecting four dedicated control samples. All samples are statistically independent of the signal selection and are obtained by inverting multi-charged particle rejection cuts or the charge sign of the downstream charged particle. The m 2 miss region used for the validation is 0.030 < m 2 miss < 0.072 GeV 2 /c 4 , free from other background processes. Agreement is observed across all samples as shown in Figure 4, left.
Other backgrounds from K + decays: The contributions from K + → π 0 l + ν (l = µ, e) and K + → π + γγ decays are found to be negligible given the particle identification and photon rejection criteria applied to the simulated samples. Upper limits of O(10 −3 ) and O(10 −2 ) events are obtained for the K + → π 0 l + ν and K + → π + γγ contributions, respectively.

Number of events
Expected events Observed events S1 S2 Figure 4: Validation of the K + → π + π − e + ν and upstream backgrounds. Left: Comparison between expected and observed number of events for the K + → π + π − e + ν validation samples in the S1 (1-4) and S2 (5-8) subsets. Right: Comparison between expected and observed number of events for the upstream background validation samples in the S1 (1-5) and S2 (6-8) subsets.
Upstream background: The background from upstream events receives contributions from two types of processes: a π + from K + decays occurring between GTK stations 2 and 3, matched to an accidental beam particle; a π + from interactions of a K + with the material in the beam line, produced either promptly or as a decay product of a neutral kaon and matched to the in-time K + . Studies on data and MC simulations validate the above classification of upstream events. The evaluation of the background from upstream events follows a data-driven approach. A sample of PNN data is selected with all K + → π + νν criteria applied, but requiring: CDA > 4 mm; no K/π association; m 2 miss value inside Regions 1 or 2. The events selected define the upstream sample: the distribution of the π + tracks at the (X COL , Y COL ) plane (Section 4) is shown in Figure 5, left. Contamination from K + decays in the FV is at the per cent level and therefore negligible. The upstream background is computed as the product of the number of events in the upstream sample, N ups , and the probability, P mistag , that an upstream event has CDA < 4 mm and satisfies the K/π association criteria. The probability P mistag depends on the shapes of the distributions of CDA and of ∆T. The model of the CDA distribution is extracted from the upstream sample by further removing the veto condition on CHANTI signals and the criteria suppressing pileup events in the GTK, which increase the number of events in the upstream sample without biasing the CDA (Figure 5, right). The probability P mistag is evaluated as a function of ∆T by generating upstream-like events in the (CDA, ∆T) plane and applying the K/π association with the CDA < 4 mm condition. The expected background is computed as The sum runs over the twelve 100 ps wide bins covering the (−600, +600) ps region used to reconstruct the tracks in the GTK; N ups (|∆T i |) is the number of events in the upstream sample in the ∆T bin i; P mistag (|∆T i |) is the mistagging probability; f scale = 1.15 is a scaling factor

Normalized entries/(4 mm)
True K (data)  Figure 5: Properties of upstream background events. Left: Extrapolation of π + tracks of the upstream sample described in the text to the (X COL , Y COL ) plane in the S2 subset. The small (large) red rectangles correspond to the inner (outer) borders of the new collimator. The outline of the last dipole of the beam achromat is shown with blue solid lines. Right: CDA distribution of the events in the upstream sample shown on the left plot (black markers with error bars), compared to the CDA distribution extracted from data and its uncertainty (brown shaded area) and to the same distribution of K + decaying in the FV (grey shaded area). that accounts for upstream events with CDA < 4 mm not included in N ups .
In total, N ups = 9 events are selected in S1 and N ups = 38 in S2, leading to an upstream background of N exp upstream = 3.3 +0.98 −0.73 combining the S1 and S2 subsets. The uncertainty is dominated by the statistical uncertainty of N ups . A systematic uncertainty of 20% is added, related to the modelling of the CDA shape below 4 mm. A 15% systematic uncertainty is assigned to the value of f scale .
The background prediction is validated using a data sample enriched in upstream events obtained by: removing the box cut (S1) or the cut on the BDT output value (S2); selecting the m 2 miss value to be either inside Regions 1 or 2, or in the region m 2 miss < −0.05 GeV 2 /c 4 removing veto conditions against pileup events in the GTK and interactions detected by CHANTI. The selected events are distributed over five validation samples in S1 depending on the π + position in the (X COL , Y COL ) plane, and on m 2 miss . In the S2 subset, three validation samples are obtained by either inverting the cut on the BDT output value or selecting events in the unphysical region m 2 miss < −0.05 GeV 2 /c 4 . The result of the upstream background validation procedure is presented in Figure 4, right.
Background summary: The background prediction for the sum of all contributions described above is validated in the six control regions located between the signal and the π + π 0 , µν and 3π regions. After unmasking the control regions the observed and expected numbers of events are found to be statistically compatible across all control regions (Figure 6, left). A summary of the background estimates summed over Region 1 and Region 2 is presented in Figure 6, right for the two subsets S1 and S2 of the 2018 data.
A background-only hypothesis test is performed using as a test statistic the likelihood ratio for independent Poisson-distributed variables as prescribed in [33]. A p-value of 3.4 × 10 −4 is obtained, corresponding to a signal significance of 3.4 standard deviations.
The K + → π + νν branching ratio is determined using a binned maximum log-likelihood fit to the observed numbers of events in the nine categories comprising the NA62 data set (Figure 7, right). The parameter of interest is the signal branching ratio. The nuisance parameters are the total expected numbers of background events in the signal regions and the single event sensitivities and corresponding uncertainties, obtained separately for each of the nine categories. For each category, the number of background events is constrained to follow a Poisson distribution while the SES follows a Gaussian distribution with mean and sigma as estimated. The resulting branching ratio is miss as a function of π + momentum for events satisfying the K + → π + νν selection criteria. The intensity of the grey shaded area reflects the variation of the SM signal acceptance in the plane. The two boxes represent the signal regions. The events observed in Regions 1 and 2 are shown together with the events found in the background and control regions. Right: Expected numbers of background events and numbers of observed events in the nine categories used in the maximum likelihood fit to extract the K + → π + νν branching ratio. Categories 0,1 and 2 correspond to 2016, 2017 and S1 subsets, respectively. Categories 3 to 8 correspond to the six 5 GeV/c wide momentum bins of the S2 subset. The observed data for each category are indicated by black markers with Poissonian statistical errors. The shaded boxes show the numbers of expected background events and the corresponding uncertainties.
compatible with the SM value within one standard deviation. The first uncertainty is statistical, related to the Poissonian fluctuation of the numbers of observed events and expected background, while the second is systematic, resulting from the uncertainty in the signal and background estimates.
This result is the most precise measurement of the K + → π + νν decay rate to date and provides the strongest evidence so far for the existence of this extremely rare process.
8 Search for K + → π + X decays The existence of a new feebly interacting scalar or pseudo-scalar particle, X, is foreseen in several BSM scenarios. If X decays to invisible particles or lives long enough to decay outside the detector, the signature of a K + → π + X decay is the same as that of the K + → π + νν decay. The two-body decay K + → π + X would result in a peak in the reconstructed m 2 miss distribution, centred at the squared value of the X mass, m 2 X . Using the event sample selected in the K + → π + νν measurement, a search for a peaking signal in the 2016-2018 data set is performed following the procedure detailed in [35]. The width of a signal peak is determined by the resolution of the m 2 miss observable, which decreases monotonically from 0.0012 GeV 2 /c 4 at m X = 0 to 0.0007 GeV 2 /c 4 at m X = 260 MeV/c 2 .
The SES is determined, for each m X , according to equation 1, by replacing A πνν with the acceptance for K + → π + X decays, which is obtained from simulation. Acceptance values for X with finite lifetime, τ X , decaying to visible SM particles are estimated by weighting simulated

Region 2
Data ν ν Total events by the probability that X does not decay upstream of MUV3.
The background contributions in searches for K + → π + X decays are the same as for the K + → π + νν studies with the addition of the K + → π + νν decay itself, which becomes the dominant background. The expected distributions of background processes as functions of m 2 miss , evaluated as described in Section 6 and assuming the SM description for K + → π + νν decay, are shown in Figure 8, left. The largest uncertainty in the estimated background comes from the SM K + → π + νν decay rate. The second largest uncertainty comes from the modelling of the upstream background distribution, which is statistically limited, and a systematic uncertainty of up to 20% is assigned to each m 2 miss bin. The total background is modelled, as a function of the reconstructed m 2 miss , by polynomial functions fitted to the expectations in Regions 1 and 2. The search for K + → π + X decays is performed with a fully frequentist hypothesis test using a shape analysis with observable m 2 miss and an unbinned profile likelihood ratio test statistic. Each X-mass hypothesis is treated independently according to the CLs method [36] to exclude the presence of a signal with 90% CL for the observed data. The statistical analysis is performed using four categories corresponding to the 2016, 2017, S1 and S2 2018 subsets.
Under the assumption that the events observed in the signal regions correspond to the known expected backgrounds, upper limits are established on BR(K + → π + X) at 90% CL for each Xmass hypothesis. Results are displayed in Figure 8, right for a stable or invisibly decaying particle X. For X decaying to visible SM particles, observed upper limits are shown in Figure 9, left as a function of m X and for different values of τ X . These limits improve by a factor of four on those obtained from the 2017 data and improve over previous limits from the E949 experiment [17] for most mass hypotheses. The extension of the FV for the S2 subset (Section 4) with respect to the analysis of the 2017 data set has improved substantially the sensitivity to X with shorter lifetimes.
An interpretation of these limits is presented for a scenario where X is a dark-sector scalar, which mixes with the Higgs boson according to the mixing parameter sin 2 θ [20,37]. Constraints in the parameter space of this scenario are shown in Figure 9, right, under the assumption that X decays only to visible SM particles, with τ X inversely proportional to the mixing parameter.  [20] decaying only to visible SM particles. Exclusion bounds from the present search for the decay K + → π + X are labelled as "K + → π + + inv." and are shaded in red. The constraints from the independent NA62 search for π 0 → invisible decays [34] are shown in purple. Other bounds, shown in grey, are derived from the experiments E949 [17], CHARM [37], NA48/2 [38], LHCb [39,40] and Belle [41].

Conclusions
The NA62 experiment at CERN has analysed the data set collected in 2018, searching for the very rare K + → π + νν decay, taking advantage of new shielding against decays upstream of the kaon decay volume, and of improved reconstruction algorithms and particle identification performance with respect to earlier data sets. The statistical power was increased by three multiplicative factors, one factor of 1.8 due to the larger number of effective kaon decays, and two factors of 1.4, each due to better shielding and to improved analysis technique. Combining the results obtained from the whole 2016-2018 data set, a single event sensitivity of (0.839 ± 0.053 syst ) × 10 −11 has been reached. The number of expected K + → π + νν events in the signal regions is (10.01 ± 0.42 syst ± 1.19 ext ), assuming the Standard Model BR of (8.4 ± 1.0) × 10 −11 , while 7.03 +1.05 −0.82 background events are expected in the same signal regions, mainly due to upstream background. After unmasking the signal regions, twenty candidate events are observed, consistent with expectation. This leads to the branching ratio BR(K + → π + νν) = (10.6 +4.0 −3.4 | stat ± 0.9 syst ) × 10 −11 at 68% CL, which is the most precise measurement to date. In a background-only hypothesis, a p-value of 3.4 × 10 −4 is obtained, which corresponds to a 3.4 standard-deviation evidence for this very rare decay.
This result is also interpreted in the framework of a search for a feebly interacting scalar or pseudo-scalar particle X, produced in the decay K + → π + X with the same experimental signature as the dominant background process K + → π + νν. Upper limits on the branching ratio at 90% CL of 3-6 ×10 −11 are obtained for m X masses in the range 0-110 MeV/c 2 and 1 × 10 −11 for m X masses in the range 160-260 MeV/c 2 . A particular model where X is a darksector scalar mixing with the Higgs boson has been explored, setting more stringent constraints on the allowed region in the plane (m X , sin 2 θ), where θ is the mixing angle.
NA62 will continue taking data in 2021 with an upgraded detector including beam line modifications, with the aim of further reducing the upstream background, thus allowing for an improved signal sensitivity.