1 Introduction

The QCD axion is a hypothetical pseudoscalar particle postulated in [1] to solve the strong CP problem. Its properties are described by the decay constant \(f_{a}\), related to the Peccei-Quinn (PQ) symmetry breaking scale \(\Lambda _{PQ}\): \(f_{a} = \Lambda _{PQ}/4\pi \). The QCD axion mass is \(m_{a} \sim m_{\pi }f_{\pi }/f_{a}\). It decays to two photons with a lifetime \(\tau _{a} \sim 2^{8}\pi ^{3}f^{2}_{a}/(\alpha m^{3}_{a})\). When the axion is considered as a dark matter candidate, its lifetime \(\tau _{a}\) is expected to be comparable to or greater than the age of the Universe, \(\sim \) 13.8 Gyr, and its mass is estimated to be \(m_{a} \lesssim \) 10 eV/\(c^{2}\). The axion also couples to quark currents, in particular to the sd flavour-changing neutral current (FCNC). There are vector and axial-vector couplings: \({\mathscr {L}} = q_{\mu } a \{\bar{d} (\gamma _{\mu }/F^{V}_{sd} + \gamma _{\mu }\gamma _{5}/F^{A}_{sd}) s \}\), where \(q_{\mu }\) is the axion four-momentum and \(F^{A}_{sd}\) and \(F^{V}_{sd}\) are effective, model-dependent constants, \(F^{A/V} = 2f_{a}/C^{A/V}\).
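The order-of-magnitude relations for the axion mass and lifetime quoted above can be evaluated numerically. The following is a minimal sketch, not part of the analysis: the rounded constants, the value \(f_{\pi } \approx 92\) MeV (one common normalization convention, assumed here), and the function names are illustrative assumptions.

```python
import math

# Rounded constants (assumptions for illustration only).
HBAR_GEV_S = 6.582e-25       # reduced Planck constant, GeV * s
ALPHA = 1.0 / 137.036        # fine-structure constant
M_PI_GEV = 0.135             # pion mass, GeV
F_PI_GEV = 0.092             # pion decay constant, GeV (convention assumed)
SECONDS_PER_GYR = 3.156e16   # seconds in 10^9 years


def axion_mass_gev(f_a_gev):
    """Estimate m_a ~ m_pi * f_pi / f_a (natural units, GeV)."""
    return M_PI_GEV * F_PI_GEV / f_a_gev


def axion_lifetime_s(f_a_gev):
    """Estimate tau_a ~ 2^8 pi^3 f_a^2 / (alpha m_a^3), converted to seconds."""
    m_a = axion_mass_gev(f_a_gev)
    tau_inverse_gev = 2**8 * math.pi**3 * f_a_gev**2 / (ALPHA * m_a**3)
    return tau_inverse_gev * HBAR_GEV_S


def lifetime_in_gyr(f_a_gev):
    """Lifetime expressed in Gyr, for comparison with the age of the Universe."""
    return axion_lifetime_s(f_a_gev) / SECONDS_PER_GYR
```

Since \(m_{a} \propto 1/f_{a}\), the lifetime in this estimate scales as \(f_{a}^{5}\), so doubling \(f_{a}\) lengthens the lifetime by a factor of 32.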

Fig. 1

Schematic elevation view of the OKA setup. See the text for details

Due to parity conservation in QCD, the decay \(K^{+} \rightarrow \pi ^{+} \pi ^{0} a\) is sensitive to the axial-vector coupling, while the much better constrained decay \(K^{+} \rightarrow \pi ^{+} a\) [2] tests the vector coupling. More general models of axion-like particles (ALPs) consider cases where the axion mass is determined not only by QCD dynamics but also by some other mechanism. These models have two free parameters: \(m_{a}\) and \(f_{a}\). In this case the limit on the axion mass is softer: \(m_{ALP} < 1\) GeV/\(c^{2}\) [3].

The only result on the search for an axion in the \(K^{+} \rightarrow \pi ^{+} \pi ^{0} a\) decay mentioned in PDG [4] is that of the BNL E787 experiment [5]. Better upper limits can be extracted from the ISTRA+ paper [6], devoted to the search for a pseudoscalar sgoldstino. The resulting limit is approximately \(Br<10^{-5}\) at \(90\%\) C.L.

This article is devoted to a new search for the decay of \(K^{+} \rightarrow \pi ^{+} \pi ^{0} a\). We assume that the ALP has a sufficiently long lifetime to decay outside the detector. In our study, we rely on [3, 7], where the phenomenology of the \(K^{+} \rightarrow \pi ^{+} \pi ^{0} a\)  decay is considered in detail.

2 The OKA setup

The OKA is a fixed-target experiment dedicated to the study of kaon decays using the decay in flight technique. It is located at NRC “Kurchatov Institute”-IHEP in Protvino (Russia). A secondary kaon-enriched hadron beam is obtained by RF separation with the Panofsky scheme. The beam is optimized for a momentum of 17.7 GeV/c with a kaon content of about 12.5% and an intensity up to \(5\times 10^{5}\) kaons per U-70 spill.

The OKA setup (Fig. 1) makes use of two magnetic spectrometers on either side of an 11 m long Decay Volume (DV) filled with helium at atmospheric pressure and equipped with a guard system (GS) of lead-scintillator sandwiches mounted in 11 rings inside the DV for photon veto. It is complemented by an electromagnetic calorimeter BGD [8] (with a wide central opening).

The first magnetic spectrometer measures the momentum of the beam particles (with a resolution of \(\sigma _{p}/p \approx 0.8\%\)). It consists of a vertically (y) deflecting magnet M\(_{1}\) surrounded by a set of 1 mm pitch (beam) proportional chambers BPC\(_{(1Y, 2Y, 2X, ..., 4Y)}\). The second magnetic spectrometer (with a resolution of \(\sigma _{p}/p \approx \) 1.3–2% for a momentum range of 2–14 GeV/c) is used to measure the momentum of the decay product tracks. It consists of a wide aperture 200 \(\times \) 140 cm\(^2\) horizontally (x) deflecting magnet SM with \(\int Bdl \approx 1\) Tm surrounded by tracking stations: proportional chambers PC\(_{(1,...,8)}\), straw tubes ST\(_{(1,2,3)}\), and drift tubes DT\(_{(1,2)}\). A matrix hodoscope (HODO) is used to improve time resolution and to link xy projections of a track.

At the end of the setup, there are an electromagnetic calorimeter GAMS\(_\mathtt{(ECAL)}\) of 18X\(_{0}\) (consisting of \(\sim \) 2300 lead glass blocks of 3.8 \(\times \) 3.8 \(\times \) 45 cm\(^3\)) [9], a hadron calorimeter GDA\(_\mathtt{(HCAL)}\) of 5\(\lambda \) (constructed from 120 iron-scintillator sandwiches of 20 \(\times \) 20 cm\(^{2}\) with WLS plates readout), and a wall of four \(1\times 1\) m\(^{2}\) muon counters \(\mu \)C situated behind the hadron calorimeter.

More details on the OKA setup can be found in [10, 11, 13].

3 The data and the analysis procedure

Two consecutive data sets with a beam momentum of 17.7 GeV/c recorded by the OKA collaboration in 2012 and 2013 are analyzed to search for the ALP.

The present analysis is based on a trigger that selects kaon decays: \(\mathtt{Tr_{Kdecay}} = \mathtt{S_{1}} \cdot \mathtt{S_{2}} \cdot \mathtt{S_{3}} \cdot \mathtt{S_{4}} \cdot \overline{\check{\mathtt{C}}}_{1} \cdot \check{\mathtt{C}}_{2} \cdot \overline{\mathtt{S}}_{\mathtt{bk}}\). Four scintillation counters (S\(_{1}\), S\(_{2}\), S\(_{3}\), S\(_{4}\)) are used to select beam particles. A combination of two threshold Cherenkov counters (Č\(_{1}\) sees pions; Č\(_{2}\) sees pions and kaons) is needed to select kaons. An anti-coincidence with two scintillation counters (S\(_{bk1}\), S\(_{bk2}\)) located on the beam axis behind the SM magnet is used to suppress the recording of undecayed beam particles. The trigger additionally requires an energy deposition in the GAMS e.m. calorimeter above 2.5 GeV, \(\mathtt{Tr_{GAMS}} = \mathtt{Tr_{Kdecay}} \cdot (\mathtt{E_{GAMS}} > 2.5~\mathtt{GeV})\), to suppress the dominant \(K^{+}\rightarrow \mu ^{+}\nu \) decay.
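The trigger condition described above is a simple boolean combination of counter signals, which can be sketched as follows. This is an illustration only: the function and argument names are hypothetical, and treating \(\overline{\mathtt{S}}_{\mathtt{bk}}\) as a veto on either of the two counters S\(_{bk1}\), S\(_{bk2}\) is an assumption, since the text does not state how they are combined.

```python
def tr_kdecay(s1, s2, s3, s4, c1, c2, s_bk1, s_bk2):
    """Kaon-decay trigger: S1*S2*S3*S4 * not(C1) * C2 * not(S_bk)."""
    beam_particle = s1 and s2 and s3 and s4
    # C1 fires on pions only, C2 fires on pions and kaons,
    # so a kaon gives no signal in C1 and a signal in C2.
    is_kaon = (not c1) and c2
    # Assumption: a hit in either beam-killer counter vetoes the event.
    undecayed_beam = s_bk1 or s_bk2
    return bool(beam_particle and is_kaon and not undecayed_beam)


def tr_gams(e_gams_gev, **counters):
    """Full trigger: kaon-decay trigger plus more than 2.5 GeV in GAMS."""
    return tr_kdecay(**counters) and e_gams_gev > 2.5
```

For example, `tr_gams(3.0, s1=True, s2=True, s3=True, s4=True, c1=False, c2=True, s_bk1=False, s_bk2=False)` accepts the event, while the same counters with a signal in Č\(_{1}\) (a beam pion) reject it.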

Monte Carlo (MC) simulations are performed using a software package based on the Geant-3.21 library [12], which includes a realistic description of the setup. The response of the trigger used in the experiment is also included in the simulation. The MC events undergo the full OKA reconstruction procedure. Monte Carlo events for the various decays are weighted in proportion to the squared modulus of the corresponding matrix element. Ten independent samples of events with ten different axion masses ranging from 0 to 210 MeV/\(c^{2}\) were generated according to the matrix element described in [3, 7] to study the expected signal from the \(K^{+}\rightarrow \pi ^{+}\pi ^{0} a\) decay.

To estimate the background, samples of Monte Carlo events are used for the five main decay channels of charged kaon (\(\pi ^{+}\pi ^{0}\), \(\pi ^{+}\pi ^{0}\gamma \), \(\pi ^{+}\pi ^{0}\pi ^{0}\), \(\pi ^{0}\mu ^{+}\nu \), and \(\pi ^{0}e^{+}\nu \)) mixed according to their branching fractions. The generated MC statistics are \(\sim \) 8 times larger than the data sample recorded in the experiment. The weights for the 3-body decays are calculated according to PDG [4]. Processes in which the beam kaon is scattered or interacts while passing through the setup are also added.

3.1 Event selection

The total number of \(\sim \) \(3.65\times 10^{9}\) events with kaon decays is logged, of which \(\sim \) \(8\times 10^{8}\) events are reconstructed with a single charged particle in the final state.

Fig. 2

The distribution of \((m_{\pi ^{+}{a}})^{2}\) vs. \((m_{\pi ^{+}\pi ^{0}})^{2}\) for reconstructed events of the axion (plots a, b, and c) for a set of masses \(m_{a} = \{0, 60, 160\}\) MeV/c\(^{2}\) and (plots d, e, f, g, and h) for the main background processes (\(K^{+}\rightarrow \pi ^{+}\pi ^{0}\), \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\gamma \), \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\pi ^{0}\), \(K^{+}\rightarrow \mu ^{+}\nu _{\mu }\pi ^{0}\), \(K^{+}\rightarrow {e}^{+}\nu _{e}\pi ^{0}\)) according to MC simulation with the matrix elements. The plots correspond to the first run, and the different decays are normalised according to their branching ratios to the \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\) statistics obtained in the experiment. The branching ratio for the \(K^{+}\rightarrow \pi ^{+}\pi ^{0} a\) decay is assumed to be \(10^{-5}\), regardless of the axion mass hypothesis chosen. The selection of the area above the red line strongly suppresses the background processes while preserving about half of the signal events, owing to the shape of the signal matrix element

The selection of events for the \(K^{+}\rightarrow \pi ^{+}\pi ^{0}{a}\) decay process begins with a requirement for a single beam track and a single secondary track with an opening angle of \(>4\) mrad and with the vertex matching distance (CDA) below 1.25 cm. A moderate chi-square cut is applied for the quality of the charged track. To clean the tracking sample, it is required that no extra track segments behind the SM magnet are found. The vertex position is required to be inside the DV. The beam particle momentum is required to be within a range of \(17.0<p_{beam}<18.6\) GeV/c. To select \(\pi ^{0}\) among decay products, the number of showers in electromagnetic calorimeters (GAMS or BGD) not associated with the track must be equal to 2. The identification of \(\pi ^{0}\) for these events is done by invariant mass selection: \(|m_{\gamma \gamma }-m_{\pi ^{0}}|<15\) MeV/c\(^{2}\).
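The preselection listed above can be summarised as a sequence of simple cuts. The sketch below is illustrative and is not the OKA reconstruction code: the `Candidate` record and its field names are hypothetical, the track chi-square and extra-segment requirements are omitted because their thresholds are not quoted in the text, and the two-photon mass is computed from the photon energies and opening angle, \(m_{\gamma \gamma } = \sqrt{2E_{1}E_{2}(1-\cos \theta )}\).

```python
import math
from dataclasses import dataclass

M_PI0_GEV = 0.13498  # neutral pion mass, GeV/c^2 (rounded)


@dataclass
class Candidate:
    """Hypothetical container for the reconstructed quantities of one event."""
    n_beam_tracks: int
    n_secondary_tracks: int
    opening_angle_mrad: float   # angle between beam and secondary track
    cda_cm: float               # vertex matching distance
    vertex_in_dv: bool          # vertex inside the decay volume
    p_beam_gev: float           # beam momentum, GeV/c
    photon_energies_gev: tuple  # showers not associated with the track
    photon_angle_rad: float     # opening angle between the two photons


def diphoton_mass(e1, e2, angle):
    """Invariant mass of two massless photons."""
    return math.sqrt(2.0 * e1 * e2 * (1.0 - math.cos(angle)))


def passes_preselection(c):
    """Apply the cuts quoted in the text, one after another."""
    if c.n_beam_tracks != 1 or c.n_secondary_tracks != 1:
        return False
    if c.opening_angle_mrad <= 4.0 or c.cda_cm >= 1.25:
        return False
    if not c.vertex_in_dv:
        return False
    if not (17.0 < c.p_beam_gev < 18.6):
        return False
    if len(c.photon_energies_gev) != 2:
        return False
    e1, e2 = c.photon_energies_gev
    m_gg = diphoton_mass(e1, e2, c.photon_angle_rad)
    return abs(m_gg - M_PI0_GEV) < 0.015
```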

After those selections, we obtain \(30.3\cdot 10^{6}\) \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\) events in two runs (\(18.4\cdot 10^{6}\) for 2012 and \(11.9\cdot 10^{6}\) for 2013).

3.2 Selecting the decay of interest

The Dalitz plots in the \((m_{\pi ^{+}\pi ^{0}})^2, (m_{\pi ^{+} a})^2\) plane for the selected MC events (for three masses for the signal and for five different background decays) are shown in Fig. 2a–h. The figures correspond to the first run, and the corresponding figures for the second run look very similar.

Fig. 3

Upper plots: the resulting \(m^{2}_{mis}\)-distribution for the 2012 run (left) and 2013 run (right). The data is shown as black points with errors. Different background channels are marked with colors; the stack plot is used to highlight strongly suppressed processes (note the log-y scale). The normalization of the MC background is done using the number of \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\) events before the main cuts. Middle plots: the result of the tuning of the relative magnitudes for the background processes. The corresponding scaling factors are depicted with their fit errors. The bottom plots demonstrate the difference between the experiment and the sum of the tuned background processes. For illustration, the results of the signal fit for five positions of \(m^{2}_{mis}\) are shown

To allow for better separation of signal from background, the area below the red line on the Dalitz plot is rejected. Strong background suppression is achieved, while the signal is reduced moderately (by a factor of \(\sim \) 2).

Furthermore, to disentangle the \(K^{+}\rightarrow \pi ^{+}\pi ^{0} a\) signal from its main backgrounds \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\) and \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\pi ^{0}\), three kinematic cuts are applied: the missing energy must satisfy \(E_{mis} = (E_{K^{+}} - E_{\pi ^{+}} - E_{\pi ^{0}}) > 2.8\) GeV, and the momenta of \(\pi ^{0}\) and \(\pi ^{+}\) in the kaon rest frame must satisfy \(p^{*}_{\pi ^{0}}<150\) MeV/c and \(p^{*}_{\pi ^{+}}<189\) MeV/c.

To suppress misidentified events from \(K^{+}\rightarrow \mu ^{+}\nu _{\mu }\pi ^{0}\), we require the absence of a signal in the muon counter \(\mu \texttt{C}\) matched with the track.

The further suppression of \(K^{+}\rightarrow {e}^{+}\nu _{e}\pi ^{0}\) and \(K^{+}\rightarrow \mu ^{+}\nu _{\mu }\pi ^{0}\) is done with the help of calorimetry.

To suppress \(e^{+}\), the events with \(E/p>0.83\) (E is the energy of the GAMS shower associated with the track of momentum p) are discarded. Then, if the number of cells in the shower is \(N>4\) or the cluster energy is \(E>1.9\) GeV, the track is identified as a \(\pi ^{+}\) with an "early" hadron shower in GAMS. Otherwise, one shower in GDA is required, matching the track within 22 cm and with either the number of cells in GDA \(N_{GDA}>4\) or a sufficiently high ratio \(E_{GDA}/p>0.67\) for \(N_{GDA}>1\). This cut rejects muons and selects pions with a "late" shower in GDA.
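The calorimetric identification described above is a short decision chain, sketched below. This is an illustration rather than the collaboration's code: the function name, the return labels, and the representation of GDA showers as `(distance_cm, n_cells, energy_gev)` tuples are hypothetical, and reading "one shower in GDA" as exactly one matched shower is an assumption.

```python
def classify_track(e_gams_gev, n_gams_cells, p_gev, gda_showers):
    """Classify a charged track using the GAMS and GDA calorimeters.

    gda_showers: list of (distance_cm, n_cells, energy_gev) tuples, where
    distance_cm is the distance between the shower and the extrapolated track.
    """
    # Positron rejection: nearly all the momentum is deposited in GAMS.
    if e_gams_gev / p_gev > 0.83:
        return "electron"
    # Pion with an "early" hadron shower in the e.m. calorimeter.
    if n_gams_cells > 4 or e_gams_gev > 1.9:
        return "pion_early"
    # Otherwise require a GDA shower matched to the track within 22 cm.
    matched = [s for s in gda_showers if s[0] < 22.0]
    if len(matched) == 1:  # assumption: exactly one matched shower
        _, n_gda, e_gda = matched[0]
        if n_gda > 4 or (n_gda > 1 and e_gda / p_gev > 0.67):
            return "pion_late"
    # Muon-like or unidentified tracks are rejected.
    return "rejected"
```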

Finally, the total energy deposition in GS below the noise threshold of 100 MeV is required to suppress events with photons escaping the acceptance of e.m. calorimeters.

3.3 Fits of the missing mass spectrum

As the next step, a search for the signal in the missing mass squared spectrum in the range 0 – 0.05 GeV\({}^{2}\)/\({c^{4}}\) is performed, \(m^2_{a} = m^2_{mis} = (\textbf{p}_{K^{+}} - \textbf{p}_{\pi ^{+}} - \textbf{p}_{\pi ^{0}})^{2}\), where \(\textbf{p}_{K^{+}}\), \(\textbf{p}_{\pi ^{+}}\) and \(\textbf{p}_{\pi ^{0}}\) are four-momenta of the corresponding particles.
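The missing mass squared defined above is the Minkowski square of the difference of the measured four-momenta. A minimal sketch, assuming four-momenta stored as `(E, px, py, pz)` tuples in GeV (the function name is hypothetical):

```python
def missing_mass_sq(p_kaon, p_pip, p_pi0):
    """m_mis^2 = (p_K - p_pi+ - p_pi0)^2 in GeV^2/c^4."""
    e, px, py, pz = (
        k - a - b for k, a, b in zip(p_kaon, p_pip, p_pi0)
    )
    return e * e - px * px - py * py - pz * pz
```

Because of the finite resolution, the reconstructed value can be negative for real events, which is why the spectrum extends to the non-physical side \(m^{2}_{mis}<0\).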

The resulting \(m^{2}_{mis}\)-distributions for the data and the main background processes are shown in Fig. 3. Because the cuts suppress the background strongly (by a factor of \(\sim 10^{5}\)), the efficiency estimates for the background processes carry noticeable uncertainties, so a (maximum likelihood) fit in which the efficiencies of the main background processes are allowed to vary is used to tune the background model to the experimental data. Since at this stage the \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\pi ^{0}\) background is indistinguishable from the signal with \(m_{a}=m_{\pi ^{0}}\), the \(2\sigma \) region around \((m_{\pi ^{0}})^{2}\) is excluded from the fit, and only the combinatorial background tails are used to estimate the magnitude of this background. The scaling factors for the \(K^{+} \rightarrow \pi ^{+} \pi ^{0}\) and \(K^{+} \rightarrow \pi ^{+} \pi ^{0} \gamma \) backgrounds are only allowed to decrease, to prevent these processes from absorbing a potential signal in the low-mass region.

As a result of the fit, we obtain a reasonable description of the background over a wide \(m^{2}_{mis}\) range; see the middle and bottom plots in Fig. 3. The residual discrepancy mainly affects the negative, non-physical side of the \(m^{2}_{mis}\)-distribution, but because of the finite resolution for the signal at zero mass it also affects the extraction of the signal near \(m^{2}_{mis} \approx 0\).

The next stage is a maximum likelihood (ML) fit of the distributions shown in Fig. 3 (middle plots) with a tuned background to search for the signal at 89 different positions of \(m^{2}_{a}\).

According to the MC simulation, the signal \(m^{2}_{mis}\) distribution is well described by a Gaussian over a wide range of axion masses. The \(\pm 2\sigma \) range is used to fit the signal at each \(m^{2}_{a}\) point, with the width (\(\sigma \)) taken from the parameterization shown in Fig. 4 (left).

In the absence of a statistically significant signal in the considered mass region, we estimate the upper limit for the number of signal events. We follow the method used in [6], in which the fitted signal is not constrained to positive values (to avoid a possible bias). The signal width is varied by 5% during the fit procedure to account for possible systematics in the MC signal width. The one-sided upper limit for the number of signal events at the 90% confidence level (CL) is constructed as \(N_{UL@90{\%}CL} = \max (N_{P},0)+1.28\cdot \sigma _{N_{P}}\), where \(N_{P}\) and \(\sigma _{N_{P}}\) are the number of signal events and its error, both evaluated from the ML fit. This approach was chosen because it exploits the knowledge of the signal shape, in contrast to the Feldman and Cousins [14] method, where only the central region of the signal is used. An additional advantage is that the obtained parameters \(N_{P}\) and \(\sigma _{N_{P}}\) are convenient for the statistical combination of the two runs. Nevertheless, for comparison, we present the results obtained with the Feldman–Cousins method (where a window of \(\pm 1.2 \sigma \) for the signal is applied); see Fig. 5 (left, middle). The statistical sum of the two runs is shown in Fig. 5 (right).
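The upper-limit construction is a one-line formula, sketched below together with one possible way to combine two runs. The first function follows the expression in the text; the second is an assumption, since the text does not specify the combination procedure: it applies a standard inverse-variance weighted average to two estimates of the same quantity (for example, the per-run branching ratio estimates). The function names are hypothetical.

```python
import math


def upper_limit_90(n_p, sigma_n_p):
    """One-sided 90% CL upper limit: max(N_P, 0) + 1.28 * sigma_N_P."""
    return max(n_p, 0.0) + 1.28 * sigma_n_p


def combine_measurements(x1, sigma1, x2, sigma2):
    """Inverse-variance weighted average of two independent estimates.

    Assumption: both inputs estimate the same quantity with Gaussian errors.
    Returns the combined value and its error.
    """
    w1, w2 = 1.0 / sigma1**2, 1.0 / sigma2**2
    value = (w1 * x1 + w2 * x2) / (w1 + w2)
    error = math.sqrt(1.0 / (w1 + w2))
    return value, error
```

The factor 1.28 is the one-sided 90% quantile of the standard normal distribution, and the `max(N_P, 0)` term prevents a downward fluctuation of the fitted signal from producing an artificially tight limit.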

3.4 Branching calculations, systematic errors study

The upper limits for the branching ratio are calculated relative to the well-measured \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\) (aka \(K\pi 2\)) decay [4]. During normalization, the \(p^{*}_{\pi ^{+}}\), \(p^{*}_{\pi ^{0}}\), and \(E_{mis}\) cuts used to select \(K^{+}\rightarrow \pi ^{+}\pi ^{0} a\) are disabled for the \(K\pi 2\) selection. It is expected that the main sources of systematics related to trigger selection, track quality, and particle identification would cancel out in this approach. It is also true for the GS (veto) cut, as we apply it to \(K\pi 2\) events as well.

Fig. 4

The width (\(\sigma \) of the \(m^{2}_{mis}\) distribution) and efficiency of the signal calculated at 10 mass points, along with their interpolation curves used in the analysis (shown in red). The figures correspond to the first run; the signal width for the second run is smaller by \(\sim 10\%\), while the efficiency is almost the same

Fig. 5

The 90% CL upper limits on the \(K^{+}\rightarrow \pi ^{+}\pi ^{0}{a}\) branching (left: 2012 run, middle: 2013 run, right: statistical sum of the runs). The Feldman–Cousins method [14] is used for the comparison, see text for details

Fig. 6

The 90% CL upper limits for the \(K^{+}\rightarrow \pi ^{+}\pi ^{0}{a}\) branching (left and middle figures). The scatter plot (left figure) demonstrates the systematics arising from the variation of the cuts used for the selection (flat distribution of 20 variables within \(\pm 1\sigma \)). The RMS of the distribution at each \(m^{2}_{mis}\) is indicated by gray bars superimposed over the scatter plot, while the remaining systematic errors are added quadratically and shown with black bars. The middle plot demonstrates the final result for the ULs: the black triangles correspond to the mean value from the variation of the selection criteria, while the systematic errors are depicted by dotted lines; the result from ISTRA+ [6] is depicted by pink filled circles for comparison. The right plot indicates the corresponding lower limit on the \(|F^{A}_{sd}|\) parameter, together with the corresponding systematic error

The branching for the signal decay is calculated as follows: \(Br(K^{+}\rightarrow \pi ^{+}\pi ^{0}{a})=Br(K\pi 2)\cdot (N_{a}/\varepsilon _{a})\cdot (\varepsilon _{K\pi 2}/N_{K\pi 2})\), where the efficiency and number of events for the 2012 (2013) run are \(\varepsilon _{K\pi 2}=6.9\cdot 10^{-2}\) (\(7.2\cdot 10^{-2}\)) and \(N_{K\pi 2}=9.7\cdot 10^6\) (\(6.1\cdot 10^6\)) events, and \(\varepsilon _{a}\) is the signal efficiency, taken from the parametrization shown in Fig. 4 (right). The systematic error of the \((\varepsilon _{K\pi 2}/N_{K\pi 2})\) ratio is estimated at \(3.5\%\), while the statistical one is negligible. The systematic error in \(\varepsilon _{a}\) is about 5%.
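The normalization formula can be evaluated directly with the per-run numbers quoted above. In this sketch the value \(Br(K\pi 2) \approx 0.2067\) is an assumption (an approximate PDG value; the text only cites [4]), and the signal yield and efficiency passed to the function are inputs to be supplied by the fit and by the parametrization in Fig. 4, not values from the paper.

```python
# Assumption: approximate PDG branching ratio of K+ -> pi+ pi0.
BR_KPI2 = 0.2067

# (efficiency, number of events) of the K pi2 normalization sample per run,
# as quoted in the text.
NORMALIZATION = {
    2012: (6.9e-2, 9.7e6),
    2013: (7.2e-2, 6.1e6),
}


def branching(n_a, eff_a, run, br_kpi2=BR_KPI2):
    """Br(K+ -> pi+ pi0 a) = Br(Kpi2) * (N_a / eff_a) * (eff_Kpi2 / N_Kpi2)."""
    eff_kpi2, n_kpi2 = NORMALIZATION[run]
    return br_kpi2 * (n_a / eff_a) * (eff_kpi2 / n_kpi2)
```

Passing the upper limit on the number of signal events as `n_a` gives the corresponding upper limit on the branching ratio for that run.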

A systematic error of 14.8% is derived from the difference between the \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\) and \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\pi ^{0}\) (with lost \(\pi ^{0}\), which is detected by the missing mass spectrum) normalization alternatives. For the \(K^{+} \rightarrow \pi ^{+}\pi ^{0}\pi ^{0}\) case, the veto (GS) cut is removed to avoid suppression of the second (escaping) \(\pi ^{0}\), and also (in contrast to the \(K^{+}\rightarrow \pi ^{+}\pi ^{0}\)), the \(E_{mis}\) cut is used.

A systematic error from the theory of the \(K^{+}\rightarrow \pi ^{+} \pi ^{0} a\) decay, which uses the experimentally measured \(K_{e4}\) formfactors [15], is \(\lesssim \) 5%. It is estimated via reanalysis with the signal MC, in which two formfactors with the most significant errors, \(f_{p}\) and \({g'}_{p}\), were varied accordingly in different combinations.

To account for possible systematic errors due to the selection criteria, we repeated the standard analysis described above more than a hundred times, each time with the main cuts chosen randomly within a window of \(\pm 1\cdot \sigma \) around their original values (\(\sigma \) is the resolution of the cut variable, estimated from the MC). The resulting combinations of UL vs. \(m^{2}_{mis}\) are shown in Fig. 6 (left) as a scatter plot. The mean upper limit at each mass point is taken as the final result, and the RMS of the distribution is treated as the systematic error of the upper limit, shown with gray bars superimposed in Fig. 6 (left).
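The cut-variation procedure can be sketched as a small toy study. Everything here is illustrative: `analysis` stands for a user-supplied function that reruns the full selection and fit for a given set of cut values and returns the upper limit at one mass point, the cut names and resolutions are hypothetical, and "RMS" is taken to mean the standard deviation around the mean.

```python
import random
import statistics


def vary_cuts(nominal, resolution, rng):
    """Draw each cut uniformly within +-1 sigma of its nominal value."""
    return {
        name: rng.uniform(value - resolution[name], value + resolution[name])
        for name, value in nominal.items()
    }


def scan_upper_limit(analysis, nominal, resolution, n_trials=100, seed=1):
    """Repeat the analysis with varied cuts; return (mean UL, RMS of UL)."""
    rng = random.Random(seed)
    limits = [
        analysis(vary_cuts(nominal, resolution, rng)) for _ in range(n_trials)
    ]
    return statistics.mean(limits), statistics.pstdev(limits)
```

A call such as `scan_upper_limit(my_analysis, {"e_mis": 2.8, "p_star_pi0": 0.150}, {"e_mis": 0.1, "p_star_pi0": 0.005})` would return the mean limit and its spread; the resolutions in this call are placeholders, not measured values.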

Finally, systematic errors on the \(\varepsilon _{K\pi 2}/N_{K\pi 2}\) ratio, \(\varepsilon _{a}\), normalization, and due to the experimental knowledge of \(K_{e4}\) formfactors are added quadratically to the RMS obtained from the variation of the selection criteria and shown by black bars in Fig. 6 (left).
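The quadratic sum of the systematic contributions can be checked with a few lines. The sketch uses the relative errors quoted in the text (3.5% for the \(\varepsilon _{K\pi 2}/N_{K\pi 2}\) ratio, 5% for \(\varepsilon _{a}\), 14.8% for the normalization alternative, and the upper bound of 5% for the \(K_{e4}\) formfactors); the mass-dependent RMS from the cut variation is an input, and the function name is hypothetical.

```python
import math

# Relative systematic errors quoted in the text.
FIXED_SYSTEMATICS = {
    "eff_kpi2_over_n_kpi2": 0.035,
    "signal_efficiency": 0.05,
    "normalization": 0.148,
    "ke4_formfactors": 0.05,   # quoted as an upper bound
}


def total_systematic(rms_cut_variation, fixed=FIXED_SYSTEMATICS):
    """Add the fixed contributions in quadrature to the cut-variation RMS."""
    fixed_sq = sum(value**2 for value in fixed.values())
    return math.sqrt(rms_cut_variation**2 + fixed_sq)
```

With these numbers the mass-independent part alone amounts to about 17%, dominated by the 14.8% normalization term.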

The final upper limit, taking into account the systematic errors, is shown in Fig. 6 (middle figure). The corresponding lower limit for the effective constant \(|F^{A}_{sd}|\) is calculated accordingly and is shown in Fig. 6 (right figure).

4 Conclusions

The OKA data are analyzed to search for a light axion-like particle. A peak search in the missing mass spectrum is used in the analysis. No signal is observed, and upper limits on the branching ratio are set in the mass range 0–200 MeV/c\(^{2}\). The previous best upper limits on Br(\(K^{+}\rightarrow \pi ^{+}\pi ^{0}{a}\)) can be obtained from the ISTRA+ paper [6], which was originally devoted to the search for the pseudoscalar sgoldstino. The only result on the axion search in the \(K^{+}\rightarrow \pi ^{+}\pi ^{0}{a}\) decay mentioned in PDG [4] is that of the BNL E787 experiment [5]. It should be noted that both searches were performed under the uniform phase-space distribution hypothesis, which turns out to be a rough approximation, as can be seen in Fig. 2. Our analysis uses the realistic matrix element from [3, 7] and improves the limit of [6] by a factor of 3.5–10, depending on the axion mass (see Fig. 6, middle). Using the expression from [3] relating \(F^{A}_{sd}\) and the branching, we calculated the lower limits for \(|F^{A}_{sd}|\), shown in Fig. 6 (right). The limit is about \(6.4 \cdot 10^7\) GeV for axion masses below 70 MeV/c\(^{2}\), which is, according to [7], the best limit for \(|F^{A}_{sd}|\) among the HEP experiments.