# Search for the lepton flavour violating decay \(\mu ^+ \rightarrow \mathrm {e}^+ \gamma \) with the full dataset of the MEG experiment

## Abstract

The final results of the search for the lepton flavour violating decay \(\mathrm {\mu }^+ \rightarrow \mathrm {e}^+ \mathrm {\gamma }\) based on the full dataset collected by the MEG experiment at the Paul Scherrer Institut in the period 2009–2013 and totalling \(7.5\times 10^{14}\) stopped muons on target are presented. No significant excess of events is observed in the dataset with respect to the expected background and a new upper limit on the branching ratio of this decay of \( \mathcal{B} (\mu ^+ \rightarrow \mathrm{e}^+ \gamma ) < 4.2 \times 10^{-13}\) (90 % confidence level) is established, which represents the most stringent limit on the existence of this decay to date.

## 1 Introduction

The standard model (SM) of particle physics allows charged lepton flavour violating (CLFV) processes with only extremely small branching ratios (\(\ll \)\(10^{-50}\)) even when accounting for measured neutrino mass differences and mixing angles. Therefore, such decays are free from SM physics backgrounds associated with processes involving, either directly or indirectly, hadronic states and are ideal laboratories for searching for new physics beyond the SM. A positive signal would be an unambiguous evidence for physics beyond the SM.

The existence of such decays at measurable rates not far below current upper limits is suggested by many SM extensions, such as supersymmetry [1]. An extensive review of the theoretical expectations for CLFV is provided in [2]. CLFV searches with improved sensitivity probe new regions of the parameter spaces of SM extensions, and CLFV decay \(\mu ^+ \rightarrow \mathrm {e}^+ \gamma \) is particularly sensitive to new physics. The MEG collaboration has searched for \(\mu ^+ \rightarrow \mathrm {e}^+ \gamma \) decay at the Paul Scherrer Institut (PSI) in Switzerland in the period 2008–2013. A detailed report of the experiment motivation, design criteria, and goals is available in reference [3, 4] and references therein. We have previously reported [5, 6, 7] results of partial datasets including a limit on the branching ratio for this decay \( \mathcal{B} < 5.7\times 10^{-13}\) at 90 % C.L.

The signal consists of a positron and a photon back-to-back, each with energy of 52.83 MeV (half of the muon mass), and with a common origin in space and time. Figure 1 shows cut schematic views of the MEG apparatus. Positive muons are stopped in a thin plastic target at the centre of a spectrometer based on a superconducting solenoid. The decay positron’s trajectory is measured in a magnetic field by a set of low-mass drift chambers and a scintillation counter array is used to measure its time. The photon momentum vector, interaction point and timing are measured by a homogeneous liquid xenon calorimeter located outside the magnet and covering the angular region opposite to the acceptance of the spectrometer. The total geometrical acceptance of the detector for the signal is \(\approx \)\(11\) %.

The signal can be mimicked by various processes, with the positron and photon originating either from a single radiative muon decay (RMD) (\(\mu ^+ \rightarrow \mathrm {e}^+\gamma \nu \bar{\nu }\)) or from the accidental coincidence of a positron and a photon from different processes. In the latter case, the photon can be produced by radiative muon decay or by Bremsstrahlung or positron annihilation-in-flight (AIF) (\(\mathrm {e}^+ \mathrm {e}^- \rightarrow \gamma \gamma \)). Accidental coincidences between a positron and a photon from different processes, each close in energy to their kinematic limit and with origin, direction and timing coincident within the detector resolutions are the dominant source of background.

Since the rate of accidental coincidences is proportional to the square of the \(\mu ^+\) decay rate, the signal to background ratio and data collection efficiency are optimised by using a direct-current rather than pulsed beam. Hence, the high intensity continuous surface \(\mu ^+\) beam (see Sect. 2.1) at PSI is the ideal facility for such a search.

The remainder of this paper is organised as follows. After a brief introduction to the detector and to the data acquisition system (Sect. 2), the reconstruction algorithms are presented in detail (Sect. 3), followed by an in-depth discussion of the analysis of the full MEG dataset and of the results (Sect. 4). Finally, in the conclusions, some prospects for future improvements are outlined (Sect. 5).

## 2 MEG detector

The MEG detector is briefly presented in the following, emphasising the aspects relevant to the analysis; a detailed description is available in [8]. Briefly, it consists of the \(\mu ^+\) beam, a thin stopping target, a thin-walled, superconducting magnet, a drift chamber array (DCH), scintillating timing counters (TC), and a liquid xenon calorimeter (LXe detector).

In this paper we adopt a cylindrical coordinate system \({ (r,\phi ,z)}\) with origin at the centre of the magnet (see Fig. 1). The *z*-axis is parallel to the magnet axis and directed along the \(\mu ^+\) beam. The axis defining \(\phi =90^\circ \) (the *y*-axis of the corresponding Cartesian coordinate system) is directed upwards and, as a consequence, the *x*-axis is directed opposite to the centre of the LXe detector. Positrons move along trajectories with decreasing \(\phi \)-coordinate. When required, the polar angle \(\theta \) with respect to the *z*-axis is also used. The region with \({ z}<0\) is referred to as upstream, that with \({ z}>0\) as downstream.

### 2.1 Muon beam

The requirement to stop a large number of \(\mu ^+\) in a thin target of small transverse size drives the beam requirements: high flux, small transverse size, small momentum spread and small contamination, e.g. from positrons. These goals are met by the 2.2 mA PSI proton cyclotron and \(\pi \)E5 channel in combination with the MEG beam line, which produces one of the world’s most intense continuous \(\mu ^+\) beams. It is a *surface muon beam* produced by \(\pi ^+\) decay near the surface of the production target. It can deliver more than \(10^8\) \(\mu ^+\)/s at 28 MeV/c in a momentum bite of 5–7 %. To maximise the experiment’s sensitivity, the beam is tuned to a \(\mu ^+\) stopping rate of 3\(\times 10^7\), limited by the rate capabilities of the tracking system and the rate of accidental backgrounds, given the MEG detector resolutions. The ratio of \(\mathrm {e^+}\) to \(\mu ^+\) flux in the beam is \(\approx \)8, and the positrons are efficiently removed by a combination of a Wien filter and collimator system. The muon momentum distribution at the target is optimised by a degrader system comprised of a 300 \(\upmu \)m thick mylar^{®} foil and the He-air atmosphere inside the spectrometer in front of the target. The round, Gaussian beam-spot profile has \(\sigma _{x,y} \approx \)10 mm.

The muons at the production target are produced fully polarized (\(P_{\mu ^+}=-1\)) and they reach the stopping target with a residual polarization \(P_{\mu ^+} = -0.86\, \pm \, 0.02 ~ \mathrm{(stat)} ~ { }^{+ 0.05}_{-0.06} ~ \mathrm{(syst)}\) consistent with the expectations [9].

Other beam tunes are used for calibration purposes, including a \(\pi ^-\) tune at 70.5 MeV/c used to produce monochromatic photons via pion charge exchange and a 53 MeV/c positron beam tune to produce Mott-scattered positrons close to the energy of a signal positron (Sect. 2.7).

### 2.2 Muon stopping target

^{®}frame, which is attached to the tracking system support frame and positioned with the target normal vector in the horizontal plane and at an angle \(\theta \)\(\approx \)\(70^\circ \). The target before installation in the detector is shown in Fig. 2.

### 2.3 COBRA magnet

The COBRA (constant bending radius) magnet [10] is a thin-walled, superconducting magnet with an axially graded magnetic field, ranging from 1.27 T at the centre to 0.49 T at either end of the magnet cryostat. The graded field has the advantage with respect to a uniform solenoidal field that particles produced with small longitudinal momentum have a much shorter latency time in the spectrometer, allowing stable operation in a high-rate environment. Additionally, the graded magnetic field is designed so that positrons emitted from the target follow a trajectory with almost constant projected bending radius, only weakly dependent on the emission polar angle \(\theta _{\mathrm {e^+}}\) (see Fig. 3a), even for positrons emitted with substantial longitudinal momentum.

The central part of the coil and cryostat accounts for \(0.197\,{X}_0\), thereby maintaining high transmission of signal photons to the LXe detector outside the COBRA cryostat. The COBRA magnet is also equipped with a pair of compensation coils to reduce the stray field to the level necessary to operate the photomultiplier tubes (PMTs) in the LXe detector.

*z*,

*r*and \(\phi \) in the ranges \(|z|<110\,\mathrm {cm}\), \(0^{\circ }<\phi <360^{\circ }\) and \(0< r < 29\,\mathrm {cm}\), covering most of the positron tracking volume. The probe contained three Hall sensors orthogonally aligned to measure \(B_z, B_r\) and \(B_\phi \) individually. Because the main (axial) field component is much larger than the others, even small angular misalignments of the other probes could cause large errors in \(B_r\) and \(B_\phi \). Therefore, only the measured values of \(B_z\) are used in the analysis and the secondary components \(B_r\) and \(B_\phi \) are reconstructed from the measured \(B_z\) using Maxwell’s equations as

The continuous magnetic field map used in the analysis is obtained by interpolating the reconstructed magnetic field at the measurement grid points by a B-spline fit [11].

### 2.4 Drift chamber system

### 2.5 Timing counter

The TC [13, 14] is designed to measure precisely the impact time and position of signal positrons and to infer the muon decay time by correcting for the track length from the target to the TC obtained from the DCH information.

provide full acceptance for signal positrons in the DCH acceptance matching the tight mechanical constraints dictated by the DCH system and COBRA;

ability to operate at high rate in a high and non-uniform magnetic field;

fast and approximate (\(\approx \)\(5\) cm resolution) determination of the positron impact point for the online trigger;

good (\(\approx \)\(1\) cm) positron impact point position resolution in the offline event analysis;

excellent (\(\approx \)\(50\) ps) time resolution of the positron impact point.

### 2.6 Liquid xenon detector

The LXe photon detector [15, 16] requires excellent position, time and energy resolutions to minimise the number of accidental coincidences between photons and positrons from different muon decays, which comprise the dominant background process (see Sect. 4.4.1).

It is a homogeneous calorimeter able to contain fully the shower induced by a 52.83 MeV photon and measure the photon interaction vertex, interaction time and energy with high efficiency. The photon direction is not directly measured in the LXe detector, rather it is inferred by the direction of a line between the photon interaction vertex in the LXe detector and the intercept of the positron trajectory at the stopping target.

The calibration tools of the MEG experiment

Process | Energy | Main purpose | Frequency | |
---|---|---|---|---|

Cosmic rays | \(\mu ^{\pm }\) from atmospheric showers | Wide spectrum \(\mathcal O\)(GeV) | LXe-DCH relative position | Annually |

DCH alignment | ||||

TC energy and time offset calibration | ||||

LXe purity | On demand | |||

Charge exchange | \(\pi ^- \mathrm {p} \rightarrow \pi ^0 \mathrm {n} \) | 55, 83, 129 MeV photons | LXe energy scale/resolution | Annually |

\( \pi ^0 \rightarrow \gamma \gamma \) | ||||

Radiative \(\mu -\)decay | \(\mu ^+ \rightarrow \mathrm{e}^+\gamma \nu \bar{\nu }\) | Photons \(> 40\) MeV, | LXe-TC relative timing | Continuously |

Positrons \(> 45\) MeV | Normalisation | |||

Normal \(\mu -\)decay | \(\mu ^+ \rightarrow \mathrm{e}^+ \nu \bar{\nu }\) | 52.83 MeV end-point positrons | DCH energy scale/resolution | Continuously |

DCH and target alignment | ||||

Normalisation | ||||

Mott positrons | \(\mathrm {e^+}\) target \(\rightarrow \mathrm {e^+}\) target | \(\approx \)\(50\) MeV positrons | DCH energy scale/resolution | Annually |

DCH alignment | ||||

Proton accelerator | \(^7 \mathrm{Li} (\mathrm {p}, \gamma ) ^8 \mathrm{Be}\) | 14.8, 17.6 MeV photons | LXe uniformity/purity | Weekly |

\(^{11} \mathrm{B} (\mathrm {p}, \gamma ) ^{12} \mathrm{C}\) | 4.4, 11.6, 16.1 MeV photons | TC interbar/ LXe–TC timing | Weekly | |

Neutron generator | \(^{58} \mathrm{Ni}(\mathrm {n},\gamma ) ^{59}\mathrm{Ni}\) | 9 MeV photons | LXe energy scale | Weekly |

Radioactive source | \(^{241}\mathrm{Am}(\alpha ,\gamma )^{237}\mathrm{Np}\) | 5.5 MeV \(\alpha \)’s, \(56~\mathrm{keV}\) photons | LXe PMT calibration/purity | Weekly |

Radioactive source | \(^9\mathrm {Be}(\alpha _{^{241} \mathrm{Am}}, \mathrm {n})^{12}\mathrm {C}^{\star }\) | 4.4 MeV photons | LXe energy scale | On demand |

\(^{12}\mathrm {C}^{\star }(\gamma )^{12}\mathrm {C}\) | ||||

LED | LXe PMT calibration | Continuously |

A schematic view of the LXe detector is shown in Fig. 8. It has a C-shaped structure fitting the outer radius of COBRA. The fiducial volume is \(\approx \)\(800~\ell \), covering 11 % of the solid angle viewed from the centre of the stopping target. Scintillation light is detected in 846 PMTs submerged directly in the liquid xenon. They are placed on all six faces of the detector, with different PMT coverage on different faces. The detector’s depth is 38.5 cm, corresponding to \(\approx \)\(14\) X\(_0\).

### 2.7 Calibration

Multiple calibration and monitoring tools are integrated into the experiment [17] in order to continuously check the operation of single sub-detectors (e.g. LXe photodetector gain equalisation, TC bar cross-timing, LXe and spectrometer energy scale) and multiple-detector comparisons simultaneously (e.g. relative positron-photon timing).

Data for some of the monitoring and calibration tasks are recorded during normal data taking, making use of particles coming from muon decays, for example the end-points of the positron and photon spectra to check the energy scale, or the positron-photon timing in RMD to check the LXe–TC relative timing. Additional calibrations required the installation of new tools, devices or detectors. A list of these methods is presented in Table 1 and they are briefly discussed below.

Various processes can affect the LXe detector response: xenon purity, long-term PMT gain or quantum efficiency drifts from ageing, HV variations, etc. PMT gains are tracked using 44 blue LEDs immersed in the LXe at different positions. Dedicated runs for gain measurements in which LEDs are flashed at different intensities are taken every two days. In order to monitor the PMT long-term gain and efficiency variations, flashing LED events are constantly taken (1 Hz) during physics runs. Thin tungsten wires with point-like \(^{241}\mathrm {Am}\)\(\alpha \)-sources are also installed in precisely known positions in the detector fiducial volume. They are used for monitoring the xenon purity and measuring the PMT quantum efficiencies [18].

A dedicated Cockcroft–Walton accelerator [19] placed downstream of the muon beam line is installed to produce photons of known energy by impinging sub-MeV protons on a lithium tetraborate target. The accelerator was operated twice per week to generate single photons of relatively high energy (17.6 MeV from lithium) to monitor the LXe detector energy scale, and coincident photons (4.4 and 11.6 MeV from boron) to monitor the TC scintillator bar relative timing and the TC–LXe detectors’ relative timing (see Table 1 for the relevant reactions).

A dedicated calibration run is performed annually by stopping \(\pi ^-\) in a liquid hydrogen target placed at the centre of COBRA [20]. Coincident photons from \(\pi ^0\) decays produced in the charge exchange (CEX) reaction \(\pi ^- \mathrm{p} \rightarrow \pi ^0 \mathrm{n}\) are detected simultaneously in the LXe detector and a dedicated BGO crystal detector. By appropriate relative LXe and BGO geometrical selection and BGO energy selection, a nearly monochromatic sample of 55 MeV (and 83 MeV) photons incident on the LXe are used to measure the response of the LXe detector at these energies and set the absolute energy scale at the signal photon energy.

A low-energy calibration point is provided by 4.4 MeV photons from an \(^{241}\)Am/Be source that is moved periodically in front of the LXe detector during beam-off periods.

Finally, a neutron generator exploiting the \(\left( \mathrm{n},\gamma \right) \) reaction on nickel shown in Table 1 allows an energy calibration under various detector rate conditions, in particular normal MEG and CEX data taking.

Data with Mott-scattered positrons are also acquired annually to monitor and calibrate the spectrometer with all the benefits associated with the usage of a quasi-monochromatic energy line at \(\approx \)\(53\) MeV [21].

### 2.8 Front-end electronics

The digitisation and data acquisition system for MEG uses a custom, high frequency digitiser based on the switched capacitor array technique, the Domino Ring Sampler 4 (DRS4) [22]. For each of the \(\approx \)\(3000\) read-out channels with a signal above some threshold, it records a waveform of 1024 samples. The sampling rate is 1.6 GHz for the TC and LXe detectors, matched to the precise time measurements in these detectors, and 0.8 GHz for the DCH, matched to the drift velocity and intrinsic drift resolution.

Each waveform is processed offline by applying baseline subtraction, spectral analysis, noise filtering, digital constant fraction discrimination etc. so as to optimise the extraction of the variables relevant for the measurement. Saving the full waveform provides the advantage of being able to reprocess the full waveform information offline with improved algorithms.

### 2.9 Trigger

An experiment to search for ultra-rare events within a huge background due to a high muon stopping rate needs a quick and efficient event selection, which demands the combined use of high-resolution detection techniques with fast front-end, digitising electronics and trigger. The trigger system plays an essential role in processing the detector signals in order to find the signature of \(\mu ^+ \rightarrow \mathrm{e}^+ \gamma \) events in a high-background environment [23, 24]. The trigger must strike a compromise between a high efficiency for signal event selection, high live-time and a very high background rejection rate. The trigger rate should be kept below 10 Hz so as not to overload the data acquisition (DAQ) system.

the photon energy;

the relative \(\mathrm {e}^+\gamma \) direction;

the relative \(\mathrm {e}^+\gamma \) timing.

The amplitudes of the inner-face PMT pulses are also sent to comparator stages to extract the index of the PMT collecting the highest charge, which provides a robust estimator of the photon interaction vertex in the LXe detector. The line connecting this vertex and the target centre provides an estimate of the photon direction.

On the positron side, the coordinates of the TC interaction point are the only information available online. The radial coordinate is given simply by the radial location of the TC, while, due to its segmentation along \(\phi \), this coordinate is identified by the bar index of the first hit (first bar encountered moving along the positron trajectory). The local *z*-coordinate on the hit bar is measured by the ratio of charges on the PMTs on opposite sides of the bar with a resolution \(\approx \)\(5\) cm.

On the assumption of the momentum being that of a signal event and the direction opposite to that of the photon, by means of Monte Carlo (MC) simulations, each PMT index is associated with a region of the TC. If the online TC coordinates fall into this region, the relative \(\mathrm {e}^+\gamma \) direction is compatible with the back-to-back condition.

The interaction time of the photon in the LXe detector is extracted by a fit of the leading edge of PMT pulses with a \(\approx \)\(2\) ns resolution. The same procedure allows the estimation of the time of the positron hit on the TC with a comparable resolution. The relative time is obtained from their difference; fluctuations due to the time-of-flight of each particle are within the resolutions.

### 2.10 DAQ system

The DAQ challenge is to perform the complete read-out of all detector waveforms while maintaining the system efficiency, defined as the product of the online efficiency (\(\epsilon _\mathrm{trg}\)) and the DAQ live-time fraction (\(f_\mathrm {LT}\)), as high as possible.

At the beginning of data taking, with the help of a MC simulation, a trigger configuration which maximised the DAQ efficiency was found to have \(\epsilon _\mathrm{trg}\approx 90\,\%\) and \(f_\mathrm {LT}\approx 85\,\%\) and an associated event rate \(R_\mathrm{daq}\approx 7~\mathrm {Hz}\), almost seven orders of magnitude lower than the muon stopping rate.

## 3 Reconstruction

In this section the reconstruction of high-level objects is presented. More information about low-level objects (e.g. waveform analysis, hit reconstruction) and calibration issues are available in [8].

### 3.1 Photon reconstruction

A 52.83 MeV photon interacts with LXe predominantly via the pair production process, followed by an electromagnetic shower. The major uncertainty in the reconstruction stems from the event-by-event fluctuations in the shower development. A series of algorithms provide the best estimates of the energy, the interaction vertex, and the interaction time of the incident photon and to identify and eliminate events with multiple photons in the same event.

For reconstruction inside the LXe detector, a special coordinate system (*u*, *v*, *w*) is used: *u* coincides with *z* in the MEG coordinate system; *v* is directed along the negative \(\phi \)-direction at the radius of the fiducial volume inner face (\(r_\mathrm {in} = 67.85\) cm); \(w=r-r_\mathrm {in}\), measures the depth. The fiducial volume of the LXe detector is defined as \(|u|<25\) cm, \(|v|<71\) cm, and 0 \(<w<38.5\) cm (\(|\cos \theta |<0.342\) and \(120^{\circ }<\phi <240^\circ \)) in order to ensure high resolutions, especially for energy and position measurements.

The reconstruction starts with a waveform analysis that extracts charge and time for each of the PMT waveforms. The digital-constant-fraction method is used to determine an (almost) amplitude independent pulse time, defined as the time when the signal reaches a predefined fraction (20 %) of the maximum pulse height. To minimise the effect of noise on the determination of the charge, a digital high-pass filter^{1} with a cutoff frequency of \(\approx \)\(10\) MHz, is applied.

^{2}and, thus, are continuously monitored and calibrated using the calibration sources instrumented in the LXe detector (see Sect. 2.7).

#### 3.1.1 Photon position

The 3D position of the photon interaction vertex \({\varvec{r}}_{\gamma }=(u_{\gamma },v_{\gamma },w_{\gamma })\) is determined by a \(\chi ^2\)-fit of the distribution of the numbers of scintillation photons in the PMTs (\(N_\mathrm {pho}\)), taking into account the solid angle subtended by each PMT photocathode assuming an interaction vertex, to the observed \(N_\mathrm {pho}\) distribution. To minimise the effect of shower fluctuations, only PMTs inside a radius of 3.5 times the PMT spacing for the initial estimate of the position of the interaction vertex are used in the fit. The initial estimate of the position is calculated as the amplitude weighted mean position around the PMT with the maximum signal. For events resulting in \(w_{\gamma }< 12\) cm, the fit is repeated with a further reduced number of PMTs, inside a radius of twice the PMT spacing from the first fit result. The remaining bias on the result, due to the inclined incidence of the photon onto the inner face, is corrected using results from a MC simulation. The performance of the position reconstruction is evaluated by a MC simulation and has been verified in dedicated CEX runs by placing lead collimators in front of the LXe detector. The average position resolutions along the two orthogonal inner-face coordinates (*u*, *v*) and the depth direction (*w*) are estimated to be \(\approx \)\(5\) and \(\approx \)\(6\) mm, respectively.

The position is reconstructed in the LXe detector local coordinate system. The conversion to the MEG coordinate system relies on the alignment of the LXe detector with the rest of the MEG subsystems. The LXe detector position relative to the MEG coordinate system is precisely surveyed using a laser survey device at room temperature. After the thermal shrinkage of the cryostat and of the PMT support structures at LXe temperature are taken into account, the PMT positions are calculated based on the above information. The final alignment of the LXe detector with respect to the spectrometer is described in Sect. 3.3.1.

#### 3.1.2 Photon timing

The determination of the photon emission time from the target \(t_{\mathrm {\gamma }}\) starts from the determination of the arrival time of the scintillation photons on the *i*-th PMT \(t_{\gamma ,i}^\mathrm {PMT}\) as described in Sect. 3.1. To relate this time to the photon conversion time, the propagation time of the scintillation photons must be subtracted as well as any hardware-induced time offset (e.g. due to cable length).

The propagation time of the scintillation photons is evaluated using the \(\pi ^0 \rightarrow \gamma \gamma \) events produced in CEX runs in which the time of one of the photons is measured by two plastic scintillator counters with a lead shower converter as a reference time. The primary contribution is expressed as a linear relation with the distance; the coefficient, i.e., the effective light velocity, is measured to be \(\approx \)\(8\) cm/ns. A remaining non-linear dependence is observed and an empirical function (2D function of the distance and incident angle) is calibrated from the data. This secondary effect comes from the fact that the fraction of indirect (scattered of reflected) scintillation photons increases with a larger incident angle and a larger distance. PMTs that do not directly view the interaction vertex \({\varvec{r}}_{\gamma }\), shaded by the inner face wall, are not used in the following timing reconstruction. After correcting for the scintillation photon propagation times, the remaining (constant) time offset is extracted for each PMT from the same \(\pi ^0 \rightarrow \gamma \gamma \) events by comparing the PMT hit time with the reference time.

Finally, the photon emission time from the target \(t_\gamma \) is obtained by subtracting the time-of-flight between the point on the stopping target defined by the intercept of the positron trajectory at the stopping target and the reconstructed interaction vertex in the LXe detector from \(t_\gamma ^\mathrm {LXe}\).

The timing resolution \(\sigma _{t_{\mathrm {\gamma }}}\) is evaluated as the dispersion of the time difference between the two photons from \(\pi ^0\) decay after subtracting contributions due to the uncertainty of the \(\pi ^0\) decay position and to the timing resolution of the reference counters. From measurements at 55 and 83 MeV, the energy dependence is estimated and corrected, resulting in \(\sigma _{t_{\mathrm {\gamma }}} (E_{\mathrm {\gamma }}=52.83~\mathrm {MeV}) \approx 64\) ps.

#### 3.1.3 Photon energy

^{3}\(W_i({\varvec{r}}_{\gamma })\) is a weighting factor for the PMT that is common to all PMTs on a given face and is determined by minimising the resolution in response to 55-MeV photons from CEX. \(\Omega ({\varvec{r}}_{\gamma })\) is a correction factor for the solid angle subtended by photocathodes for scintillating photons emitted at the interaction vertex; it is applied only for shallow events (\(w_{\gamma }<3\) cm) for which the light collection efficiency is very sensitive to the relative position of each PMT and the interaction vertex. \(U({\varvec{r}}_{\gamma })\) is a position dependent non-uniformity correction factor determined by the responses to the 17.6- and 55-MeV photons.

*H*(

*t*) is a correction factor for the time-varying LXe scintillation light yield and

*S*is a constant conversion factor of the energy scale, determined by the 55- and 83-MeV photons with a precision of 0.3 %.

A potential significant background is due to pile-up events with more than one photon in the detector nearly coincident in time. Approximately 15 % of triggered events suffer from pile-up at the nominal beam rate. The analysis identifies pile-up events and corrects the measured energy, thereby reducing background and increasing detection efficiency. Three methods are used to identify and extract the primary photon energy in pile-up events.

The first method identifies multiple photons with different timing using the \(\chi ^2/\mathrm {NDF}\) value in the time fit. In contrast to the time reconstruction, all the PMTs with more than 50 \(N_\mathrm {pe}\) are used to identify pile-up events.

The second method identifies pile-up events with photons at different positions by searching for spatially separated peaks in the inner and outer faces. If the event has two or more peaks whose energies cannot be determined using the third method below, a pile-up removal algorithm is applied to the PMT charge distribution. It uses a position dependent table containing the average charge of each PMT in response to 17.6-MeV photons. Once a pile-up event is identified, the energy of the primary photon is estimated by fitting the PMT charges to the table without using PMTs around the secondary photon. Then, the PMT charges around the secondary photon are replaced with the charges estimated by the fit. Finally, the energy is reconstructed as a sum of the individual PMT charges with the coefficients \(F_i\) (Eq. 1), instead of integrating the summed waveform.

*N*template waveforms is fitted to the total summed waveform, where

*N*is the number of pulses detected in this event. Figure 10 shows an example of the fitting, where three pulses are detected. Finally, the contributions of pile-up photons are subtracted and the remaining waveform is used for the primary energy estimation.

The energy response of the LXe detector is studied in the CEX runs using \(\pi ^0\) decays with an opening angle between the two photons \(170^\circ \), for which each of the photons has an intrinsic line width small compared to the detector resolution. The measured line shape is shown in Fig. 11 at two different conversion depth (\(w_{\gamma }\)) regions. The line shape is asymmetric with a low energy tail mainly for two reasons: the interaction of the photon in the material in front of the LXe detector fiducial volume, and the albedo shower leakage from the inner face. The energy resolution is evaluated from the width of the line shape on the right-hand (high-energy) side (\(\sigma _{E_{\mathrm {\gamma }}}\)) by unfolding the finite width of the incident photon energy distribution due to the imperfect back-to-back selection and a small correction for the different background conditions between the muon and pion beams. Since the response of the detector depends on the position of the photon conversion, the fitted parameters of the line shape are functions of the 3D coordinates, mainly of \(w_{\gamma }\). The average resolution is measured to be \(\sigma _{E_{\mathrm {\gamma }}} = 2.3~\%\) (\(0< w_{\gamma }< 2~\mathrm {cm}\), event fraction 42 %) and 1.6 % (\(w_{\gamma }> 2~\mathrm {cm}\), 58 %).

### 3.2 Positron reconstruction

#### 3.2.1 DCH reconstruction

The reconstruction of positron trajectories in the DCH is performed in four steps: hit reconstruction in each single cell, clustering of hits within the same chamber, track finding in the spectrometer, and track fitting.

In step one, raw waveforms from anodes and cathodes are filtered in order to remove known noise contributions of fixed frequencies. A hit is defined as a negative signal appearing in the waveform collected at each end of the anode wire, with an amplitude of at least \(-5\) mV below the baseline. This level and its uncertainty \(\sigma _\mathrm {B}\) are estimated from the waveform itself in the region around 625 ns before the trigger time. The hit time is taken from the anode signal with larger amplitude as the time of the first sample more than \(-3\sigma _\mathrm {B}\) below the baseline.

*z*-coordinate, with a resolution of about 1 cm, is obtained from charge division on the anode wire, and it allows the determination of the Vernier cycle (see Sect. 2.4) in which the hit occurred. If one or more of the four cathode pad channels is known to be defective, the

*z*-coordinate from charge division is used and is assigned a 1 cm uncertainty. Otherwise, charge integration is performed on the cathode pad waveforms and the resulting charges are combined to refine the

*z*-measurement, exploiting the Vernier pattern. The charge asymmetries between the upper and lower sections of the inner and outer cathodes are given by

*x*,

*y*) plane is given by the wire position.

Once reconstructed, hits from nearby cells with similar *z* are grouped into clusters, taking into account that the *z*-measurement can be shifted by \(\lambda \) if the wrong Vernier cycle has been selected via charge division. These clusters are then used to build track seeds.

A seed is defined as a group of three clusters in four adjacent chambers, at large radius (\(r>24\) cm) where the chamber occupancy is lower and only particles with large momentum are found. The clusters are required to satisfy appropriate proximity criteria on their *r* and *z* values. A first estimate of the track curvature and total momentum is obtained from the coordinates of the hit wires, and is used to extend the track and search for other clusters, taking advantage of the adiabatic invariant \(p_T^2/B_z\), where \(p_T\) is the positron transverse momentum, for slowly varying axial magnetic fields. Having determined the approximate trajectory, the left/right ambiguity of the hits on each wire can be resolved in most cases. A first estimate of the track time (and hence the precise position of the hit within a cell) and further improvement of the left/right solutions can be obtained by minimising the \(\chi ^2\) of a circle fit of the hit positions in the (*x*, *y*) plane.

At this stage, in order to retain high efficiency, the same hit can belong to different clusters and the same cluster to different track candidates, which can result in duplicated tracks. Only after the track fit, when the best information on the track is available, independent tracks are defined.

A precise estimate of the (*x*, *y*) positions of the hits associated with the track candidate is then extracted from the drift time, defined as the difference between the hit and track times. The position is taken from tables relating (*x*, *y*) position to drift time. These are track-angle dependent and are derived using GARFIELD software [25]. The reconstructed (*x*, *y*) position is continuously updated during the tracking process, as the track information improves.

A track fit is finally performed with the Kalman filter technique [26, 27]. The GEANE software [28] is used to account for the effect of materials in the spectrometer during the propagation of the track and to estimate the error matrix. Exploiting the results of the first track fit, hits not initially included in the track candidate are added if appropriate and hits which are inconsistent with the fitted track are removed. The track is then propagated to the TC and matched to the hits in the bars (see Sect. 3.2.7 for details). The time of the matched TC hit (corrected for propagation delay) is used to provide a more accurate estimate of the track time, and hence the drift times. A final fit is then done with this refined information. Following the fit, the track is propagated backwards to the target. The decay vertex (\(x_\mathrm {e^+}\), \(y_\mathrm {e^+}\), \(z_\mathrm {e^+}\)) and the positron decay direction (\(\phi _\mathrm {e^+}\), \(\theta _\mathrm {e^+}\)) are defined as the point of intersection of the track with the target foil and the track direction at the decay vertex. The error matrix of the track parameters at the decay vertex is computed and used in the subsequent analysis.

Among tracks sharing at least one hit, a ranking is performed based on a linear combination of five variables denoting the quality of the track (the momentum, \(\theta _\mathrm {e^+}\) and \(\phi _\mathrm {e^+}\) errors at the target, the number of hits and the reduced \(\chi ^2\)). In order to optimise the performance of the ranking procedure, the linear combination is taken as the first component of a principal component analysis of the five variables. The ranking variables are also used to select tracks, along with other quality criteria as (for instance) the request that the backward track extrapolation intercepts the target within its fiducial volume. Since the subsequent analysis uses the errors associated with the track parameters event by event, the selection criteria are kept loose in order to preserve high efficiency while removing badly reconstructed tracks for which the fit and the associated errors might be unreliable. After the selection criteria are applied, the track quality ranking is used to select only one track among the surviving duplicate candidates.

#### 3.2.2 DCH missing turn recovery

After the track reconstruction is completed, a missing first turn (MFT) recovery algorithm, developed and incorporated in the DCH reconstruction software expressly for this analysis, is used to identify and refit positron tracks with an MFT. Firstly, for each track in an event, the algorithm identifies all hits that may potentially be part of an MFT, based on the compatibility of their *z*-coordinates and wire locations in the DCH system with regard to the positron track. The vertex state vector of the track is propagated backwards to the point of closest approach with each potential MFT hit, and the hit selection is refined based on the *r* and *z* residuals between the potential MFT hits and their propagated state vector positions. Potential MFT candidates are subsequently selected if there are MFT hits in at least four DCH modules of which three are adjacent to one another, and the average signed *z*-difference between the hits and their propagated state vector positions as well as the standard deviation of the corresponding unsigned *z*-difference are smaller than 2.5 cm. A new MFT track is reconstructed using the Kalman filter technique based on the selected MFT hits and correspondingly propagated state vectors. Finally, the original positron and MFT tracks are combined and refitted using the Kalman filter technique, followed by a recalculation of the track quality ranking and the positron variables and their uncertainties at the target. An example of a multi-turn positron with a recovered MFT is shown in Fig. 12.

#### 3.2.3 DCH alignment

Accurate positron track reconstruction requires precise knowledge of the location and orientation of the anode wires and cathode pads in the DCH system. This is achieved by an alignment procedure that consists of two parts: an optical survey alignment based on reference markers, and a software alignment based on reconstructed tracks.

Each DCH module is equipped with cross hair marks on the upstream and downstream sides of the module. Each module is fastened to carbon-fibre support structures on the upstream and downstream sides of the DCH system, which accommodate individual alignment pins with an optically detectable centre. Before the start of each data-taking period an optical survey of the cross hairs and pins is performed using a theodolite. The optical survey technique was improved in 2011 by adding corner cube reflectors next to the cross hairs, which were used in conjunction with a laser tracker system. The resolution of the laser method is \(\approx \)\(0.2\) mm for each coordinate.

Two independent software alignment methods are used to cross-check and further improve the alignment precision of the DCH system. The first method is based on the Millepede algorithm [29] and uses cosmic-rays reconstructed without magnetic field. During COBRA shutdown periods, cosmic-rays are triggered using dedicated scintillation counters located around the magnet cryostat. The alignment procedure utilises the reconstructed hit positions on the DCH modules to minimise the residuals with respect to straight tracks according to the Millepede algorithm. The global alignment parameters, three positional and three rotational degrees of freedom per module, are determined with an accuracy of better than 150 \(\upmu \)m for each coordinate.

The second method is based on an iterative algorithm using reconstructed Michel positrons and aims to improve the relative radial and longitudinal alignment of the DCH modules. The radial and longitudinal differences between the track position and the corresponding hit position at each module are recorded for a large number of tracks. The average hit-track residuals of each module are used to correct the radial and longitudinal position of the modules, while keeping the average correction over all modules equal to zero. This process is repeated several times while refitting the tracks after each iteration, until the alignment corrections converge and an accuracy of better than 50 \(\upmu \)m for each coordinate is reached. The method is cross-checked by using reconstructed Mott-scattered positrons (see Sect. 2.7), resulting in very similar alignment corrections.

The exact resolution reached by each approach depends on the resolution of the optical survey used as a starting position. For a low-resolution survey, the Millepede method obtains a better resolution, while the iterative method obtains a better resolution for a high-resolution survey. Based on these points, the Millepede method is adopted for the years 2009–2011 and the iterative method is used for the years 2012–2013 for which the novel optical survey data are available; in 2011, the first year with the novel optical survey data, the resulting resolution of both approaches is comparable.

#### 3.2.4 Target alignment

Precise knowledge of the position of the target foil relative to the DCH system is crucial for an accurate determination of the muon decay vertex and positron direction at the vertex, which are calculated by propagating the reconstructed track back to the target, particularly when the trajectory of the track is far from the direction normal to the plane of the target.

*x*,

*y*,

*z*) directions. For each year, a plane fit of the cross mark measurements is used in the propagation of tracks back to the target as a first approximation of the target foil position. However, the residuals between the cross mark measurements and the plane fits indicate that the target foil has developed a gradual aplanarity over time. This is confirmed by measurements of the target aplanarity performed with a high-precision FARO 3D laser scanner [30] at the end of 2013, as shown in the top panel of Fig. 14. Therefore, the propagation of tracks back to the target is improved by using a paraboloidal approximation \(z_\mathrm {t} - z_0\)\(=\)\(c_x(x_\mathrm {t}-x_0)^2\)\(+\)\(c_y(y_\mathrm {t}-y_0)^2\) of the target foil obtained by fitting separately the cross mark measurements for each year. In this function, \((x_\mathrm {t}, y_\mathrm {t}, z_\mathrm {t})\) is the local target coordinate system (i.e. not the nominal MEG coordinate system) in which \(x_\mathrm {t}\) (\(y_\mathrm {t}\)) is the coordinate along the semi-major (semi-minor) axis of the target, and \(z_\mathrm {t}\) is the coordinate perpendicular to the target plane. The fit parameters are \((x_0, y_0, z_0)\) for the position of the paraboloid extremum, and \(c_x\) and \(c_y\) for the paraboloid curvatures in the \(x_\mathrm {t}\) and \(y_\mathrm {t}\) directions. The paraboloidal fit of the 2013 cross mark measurements, shown in the bottom panel of Fig. 14, exhibits the largest aplanarity among all years. In this fit \(c_y=-0.03\) \(\mathrm{cm}^{-1}\), which corresponds to a focal length of \(\approx \)\(8\) cm for the semi-minor axis of the target in 2013.

The effect of the non-paraboloidal deformation on the analysis and its systematic uncertainty are estimated by using a 2D map of the \(z_\mathrm {t}\)-difference between the 2013 paraboloidal fit and the FARO measurements, as a function of \(x_\mathrm {t}\) and \(y_\mathrm {t}\) (i.e. the difference between the top and bottom panels of Fig. 14). As discussed in detail in Sect. 4.5, this map is scaled by a factor \(k_\mathrm {t}\) for each year, to represent the increase of the non-paraboloidal deformation of the target over time.

#### 3.2.5 DCH performance

We developed a series of methods to extract, from data, an estimate of the resolution functions, defined for a generic observable *q* as the distribution of the errors, \(q - q_\mathrm {true}\).

A complete overview of the performance of the spectrometer can be found in [8], where the methods used to evaluate it are also described in detail. Two methods are used to extract the resolution functions for the positron parameters. The energy resolution function, including the absolute energy scale, is extracted with good accuracy from a fit to the energy spectrum of positrons from Michel decay. A core resolution of \(\sigma ^\mathrm {core}_{E_\mathrm {e^+}} \approx 330\) keV is found, with a \(\approx \)\(18\) % tail component with \(\sigma ^\mathrm {tail}_{E_\mathrm {e^+}} \approx 1.1\) MeV, with exact values depending on the data subset. The resolution functions for the positron angles and production vertex are extracted exploiting tracks that make two turns inside the spectrometer. The two turns are treated as independent tracks, and extrapolated to a prolongation of the target plane at a position between the two turns. The resulting differences in the position and direction of the two turns are then used to extract the position and angle resolutions. The same method is used to study the correlations among the variables and to cross-check the energy resolution. However, since the two-turn tracks are a biased sample with respect to the whole dataset, substantial MC-based corrections are necessary. These corrections are introduced as multiplicative factors to the width of the resolution functions, ranging from 0.75 to 1.20. Moreover, no information can be extracted about a possible reconstruction bias, which needs to be estimated from the detector alignment procedures described later in this paper. After applying the corrections, the following average resolutions are found: \(\sigma _{\theta _\mathrm {e^+}}=9.4\) mrad; \(\sigma _{\phi _\mathrm {e^+}}=8.4\) mrad; \(\sigma _{y_\mathrm {e^+}}= 1.1\) mm and \(\sigma _{z_\mathrm {e^+}}=2.5\) mm at the target.

*q*with the PDF of the corresponding

*pull*:

#### 3.2.6 TC reconstruction

Each of the timing counter (TC) bars acts as an independent detector. It exploits the fast scintillating photons released by the passage of a positron to infer the time and longitudinal position of the hit. A fraction of the scintillating photons reaches the bar ends where they are read-out by PMTs.

The signal from each TC PMT is processed with a Double Threshold Discriminator (DTD) to extract the arrival time of the scintillating photons minimising the time walk effect. A TC hit is formed when both PMTs on a single bar have signals above the higher DTD threshold. The times \(t^\mathrm {TC,in}_\mathrm {e^+}\) and \(t^\mathrm {TC,out}_\mathrm {e^+}\), measured by the two PMTs belonging to the same bar, are extracted by a template fit to a NIM waveform (square wave at level \(-0.8\) V) fired at the lower DTD threshold and digitised by a DRS.

*L*is the bar length; the

*z*-axis points along the main axis of the bar and its origin is taken in the middle of the bar. Adding the two parts of Eq. 2 the result is:

The longitudinal position resolution is \(\sigma _{z^{\mathrm {TC}}_{\mathrm {e^+}}} \approx 1.0\) cm and the time resolution is \(\sigma _{t^{\mathrm {TC}}_{\mathrm {e^+}}} \approx 65\) ps.

The TC, therefore, provides the information required to reconstruct all positron variables necessary to match a DCH track (see Sect. 3.2.7) and recover the muon decay time by extrapolating the \(t^\mathrm {TC}_\mathrm {e^+}\) along the track trajectory back to the target to obtain the positron emission time \(t_\mathrm {e^+}\).

#### 3.2.7 DCH-TC matching

The matching of DCH tracks with hits in the TC is performed as an intermediate step in the track fit procedure, in order to exploit the information from the TC in the track reconstruction.

- 1.
the TC hit belongs to the reference bar, with the longitudinal distance between the track and the hit \(\left| \Delta z^{\mathrm {TC}} \right| < 12\) cm (the track position defined as the entrance point of the track in the bar volume);

- 2.
the TC hit belongs to another bar whose extended volume is also crossed by the track, and \(\left| \Delta z^{\mathrm {TC}} \right| < 12\) cm (the track position defined as the entrance point of the track in the extended bar volume);

- 3.
the TC hit belongs to a bar whose extended volume is not crossed by the track, but where the distance of closest approach of the track to the bar axis is less than 5 cm, and \(\left| \Delta z^{\mathrm {TC}} \right| < 12\) cm (the track position defined as the point of closest approach of the track to the bar axis).

The time of the matched TC hit is assigned to the track, which is then back-propagated to the chambers in order to correct the drift time of the hits for the track length timing contribution. The Kalman filter procedure is also applied to propagate the track back to the target to get the best estimate of the decay vertex parameters at the target, including the time \(t_\mathrm {e^+}\).

#### 3.2.8 Positron AIF reconstruction

The procedure starts by building positron AIF seeds from all reconstructed clusters. An AIF seed is defined as a set of clusters on adjacent DCH modules which satisfy a number of minimum proximity criteria. A positron AIF candidate (\(\mathrm {e^+_{AIF}}\)) is reconstructed from each seed by performing a circle fit based on the *xy*-coordinates of all clusters in the seed. The circle fit is improved by considering the individual hits in all clusters. The *xy*-coordinates of hits in multi-hit clusters are refined and left/right solutions based on the initial circle fit are determined by taking into account the timing information of the individual hits, which also results in an estimate of the AIF time. The *xy*-coordinates of the AIF vertex are determined by the intersection point of the circle fit with the first DCH cathode plane after the last cluster hit. If the circle fit does not cross the next DCH cathode plane, the intersection point of the circle fit with the support structure of the next DCH module or the inner wall of COBRA is used. The *z*-coordinate of the AIF vertex is calculated by extrapolating the quadratic polynomial fit of the *xz*-positions of the last three clusters of the AIF candidate to the *x*-coordinate of the AIF vertex. The AIF candidate direction is taken as the direction tangent to both the circle fit and the quadratic polynomial fit at the AIF vertex. Figure 17 shows an example of a reconstructed AIF candidate.

### 3.3 Combined reconstruction

This section deals with variables requiring signals both in the spectrometer and in the LXe detector.

#### 3.3.1 Relative photon–positron angles

Since the LXe detector is not capable of reconstructing the direction of the incoming photons, this direction is determined by connecting the reconstructed interaction vertex of the photon in the LXe detector to the reconstructed decay vertex on target: it is defined through its azimuthal and polar angles (\(\phi _\mathrm {\gamma }\), \(\theta _\mathrm {\gamma }\)).

There are correlations among the errors in measurements of the positron observables at the target both due to the fit and also introduced by the extrapolation to the target. Additionally, the errors in the photon angles contain a contribution from the positron position error at the target. Due to the correlations, the relative angle resolutions are not the quadratic sum of the photon and positron angular resolutions.

The \(\theta _{\mathrm {e^+ \gamma }}\) resolution is evaluated as \(\sigma _{\theta _{\mathrm {e^+ \gamma }}} = (15.0 - 16.2)\) mrad depending on the year of data taking by taking into account the correlation between \(z_e\) and \(\theta _\mathrm {e^+}\). Since the true positron momentum and \(\theta _{\mathrm {e^+ \gamma }}\) of the \(\mu ^+ \rightarrow \mathrm{e}^+ \gamma \) signal are known, \(\phi _\mathrm {e^+}\) and \(y_\mathrm {e^+}\) can be corrected using the reconstructed energy of the positron and \(\theta _{\mathrm {e^+ \gamma }}\). The \(\phi _{\mathrm {e^+ \gamma }}\) resolution after correcting these correlations is evaluated as \(\sigma _{\phi _{\mathrm {e^+ \gamma }}}= (8.9-9.0)\) mrad depending on the year.

The systematic uncertainty of the positron emission angle relies on the accuracy of the relative alignment among the magnetic field, the DCH modules, and the target (see Sects. 3.2.3 and 3.2.4 for the alignment methods and the uncertainties). The position of the target (in particular the error in the position and orientation of the target plane) and any distortion of the target plane directly affect the emission angle measurement and are found to be one of the dominant sources of systematic uncertainty on the relative angles.

Positron AIF events,

Cosmic rays without the COBRA magnetic field.

#### 3.3.2 Relative photon–positron time

^{4}and shown in Fig. 18.

The centre of this distribution is used to correct the time offset between the TC and LXe detectors. The position of the RMD-peak corresponding to \(t_{\mathrm {e^+ \gamma }}= 0\) is monitored constantly during the physics data-taking period and found to be stable to within 15 ps. In order to obtain the resolution on \(t_{\mathrm {e^+ \gamma }}\) for signal events, the resolution of Fig. 18 must be corrected for the photon energy dependence as measured in the CEX calibration run and for the positron energy dependence (from a MC simulation), resulting in \(\sigma _{t_{\mathrm {e^+ \gamma }}}=122\pm 4\) ps.

The dominant contributions to the \(t_{\mathrm {e^+ \gamma }}\) resolution are the positron track length uncertainty (in timing units 75 ps), the TC intrinsic time resolution (65 ps), and the LXe detector time resolution (64 ps).

#### 3.3.3 Photon-AIF analysis

In order to determine if a photon originates from positron AIF, the following three quantities are calculated for each possible \(\mathrm {e^+_{AIF}}\)\(\gamma \)-pair from all reconstructed \(\mathrm {e^+_{AIF}}\) candidates and photons in the event: the angular differences between the AIF candidate direction and the vector connecting the photon and the AIF vertex (\(\theta _\mathrm{AIF}\) and \(\phi _\mathrm{AIF}\)), and the time difference between the photon and the AIF candidate (\(t_\mathrm{AIF}\)). If there are multiple \(\mathrm {e^+_{AIF}}\) candidates per event, a ranking of \(\mathrm {e^+_{AIF}}\)\(\gamma \)-pairs is performed by minimising the \(\chi ^2\) based on these three observables.

A plot of \(\phi _\mathrm{AIF}\) vs. \(\theta _\mathrm{AIF}\) for the highest ranked \(\mathrm {e^+_{AIF}}\)\(\gamma \)-pairs per event in a random sample of year 2011 events is shown in Fig. 19. The peak at the centre is caused by photons originating from positron AIF in the DCH. The peak has a tail in the negative \(\phi _\mathrm{AIF}\) direction since the AIF vertex is reconstructed at the first DCH cathode foil immediately after the last hit in the \(\mathrm {e^+_{AIF}}\) candidate. However, if the last hit is located in the left plane of a DCH module, it is equally likely (to first order) that the AIF occurred in the first cathode foil of the next DCH module.

## 4 Analysis

### 4.1 Analysis strategy

The \({\mu ^+ \rightarrow \mathrm{e}^+ \gamma }\) event is characterised by an \(\mathrm {e}^+\gamma \)-pair, simultaneously emitted with equal momentum magnitude and opposite directions, and with energy of \(m_{\mu }/2 = 52.83~\mathrm{MeV}\) each. The \({\mu ^+ \rightarrow \mathrm{e}^+ \gamma }\) event signature is therefore very simple and the sensitivity of the experiment is limited by the ability to reject background \(\mathrm {e}^+\gamma \)-pairs, of various origins. Positron and photon energies (\(E_\mathrm {e^+}\) and \(E_{\mathrm {\gamma }}\)), \(\mathrm {e}^+\gamma \) relative time (\(t_{\mathrm {e^+ \gamma }}\)), and relative azimuthal and polar angles \(\theta _{\mathrm {e^+ \gamma }}\) and \(\phi _{\mathrm {e^+ \gamma }}\) are the observables available to distinguish possible \({\mu ^+ \rightarrow \mathrm{e}^+ \gamma }\) candidates from background pairs. In the maximum likelihood analysis presented here, \(\theta _{\mathrm {e^+ \gamma }}\) and \(\phi _{\mathrm {e^+ \gamma }}\) are treated separately, with independent distributions, since these variables can have different experimental resolutions.

This maximum likelihood analysis is thoroughly cross-checked by an alternative independent maximum likelihood analysis where some of the methods are simplified; for example, the relative stereo angle \(\Theta _{\mathrm {e^+ \gamma }}\) is used instead of the relative polar and azimuthal angles.

### 4.2 Dataset

Data were accumulated intermittently in the years 2008–2013. Figure 20 shows the data collection period divided into each calendar year by the planned PSI winter accelerator shutdown periods of 4–5 months. Shutdown periods are used for detector maintenance, modification and repair work. The data accumulated in 2008 were presented in [5], but the quality of those data was degraded by problems with the tracking system and therefore they are not considered in this analysis.

In total, \(7.5 \times 10^{14}\) muons were stopped on target in 2009–2013. The analysis based on the \(3.6\times 10^{14}\) muons stopped on target in 2009–2011 has already been published [7]. The data from the remaining \(2.3\times 10^{14}\) muons stopped on target in 2012, and from \(1.6\times 10^{14}\) muons stopped on target in 2013 are included in this analysis, thus completing the full dataset.

In the first stage of the MEG analysis, events are pre-selected with loose requirements, requiring the presence of (at least) one positron track candidate and a time match given by \(-6.9< t_\mathrm{LXe-TC} < 4.4 ~\mathrm{ns}\), where \(t_\mathrm{LXe-TC}\) is the relative difference between the LXe time and the TC time associated with the positron candidate. The window is asymmetric to include multiple turn events. This procedure reduces our data size to \(\approx \)\( 16\) % of the recorded events. No requirements are made on photon and positron energies or relative directions. Such loose cuts ensure that even in the presence of not yet optimised calibration constants the possibility of losing a good \({\mu ^+ \rightarrow \mathrm{e}^+ \gamma }\) event is negligible.

### 4.3 Blinding

For purposes of various studies, a number of side-band regions were defined. Events with \(| t_{\mathrm {e^+ \gamma }}| > 1~\mathrm{ns}\) fall in the “timing side-bands”, the left side-band corresponding to \(t_{\mathrm {e^+ \gamma }}< -1~\mathrm{ns}\) and the right side-band to \(t_{\mathrm {e^+ \gamma }}> 1~\mathrm{ns}\), while events with arbitrary relative timing and with \(E_{\mathrm {\gamma }}< 48.0~\mathrm{MeV}\) fall into the “energy side-band”. Different photon energy windows are used for different timing side-band studies. For example, events with \(48.0< E_{\mathrm {\gamma }}< 58.0~\mathrm{MeV}\) are used when the timing side-band data are compared with the data in the analysis window, and events with \( E_{\mathrm {\gamma }}> 40.0~\mathrm{MeV}\) are used for the single photon background study. RMD events, with zero relative timing, belong to the energy side-band and, as stated in Sect. 3.3.2, are used to accurately calibrate the timing difference between LXe detector and TC. Events in the timing side-bands are very likely to be accidental events; hence, their positron and photon energy spectra and relative angle distributions are uncorrelated. We also define “angle side-bands” the regions corresponding to \(50< | \theta _{\mathrm {e^+ \gamma }}| < 150~\mathrm{mrad}\) or \(75< | \phi _{\mathrm {e^+ \gamma }}| < 225~\mathrm{mrad}\), which are used for self-consistency checks of the analysis procedure.

Side-band events are studied in detail to optimise the algorithms and analysis quality, to estimate the background in the analysis window, and to evaluate the experimental sensitivity by using toy MC simulations. At the end of the optimisation procedure, the events in the blinding box are analysed and a maximum likelihood fit is performed to extract the number of signal (\(N_\mathrm{sig}\)), RMD (\(N_\mathrm{RMD}\)) and accidental background (\(N_\mathrm{ACC}\)) events. The likelihood fit is performed on events falling in the “Analysis Window” defined by \(48.0< E_{\mathrm {\gamma }}< 58.0~\mathrm{MeV}\), \(50.0< E_\mathrm {e^+}< 56.0~\mathrm{MeV}\), \(| t_{\mathrm {e^+ \gamma }}| < 0.7~\mathrm{ns}\), \(| \theta _{\mathrm {e^+ \gamma }}| < 50~\mathrm{mrad}\) and \(| \phi _{\mathrm {e^+ \gamma }}| < 75~\mathrm{mrad}\). The projection of the analysis window in the \((t_{\mathrm {e^+ \gamma }}, E_{\mathrm {\gamma }})\) plane is also shown in Fig. 21. The size of the analysis window is chosen to be between five and twenty times the experimental resolutions of all observables in order to prevent any risk of losing good events and to restrict the number of events to be fitted at a reasonable level. The same fitting procedure is preliminarily applied to equal size regions in the timing and angle side-bands (with appropriate shifts on relative timings or angles) to verify the consistency of the calculation.

### 4.4 Background study

The background in the search for the \(\mathrm {\mu }^+ \rightarrow \mathrm {e}^+ \mathrm {\gamma }\) decay comes either from RMD or from an accidental overlap between a Michel positron and a photon from RMD or AIF. All types of background are thoroughly studied in the side-bands prior to analysing events in the analysis window.

#### 4.4.1 Accidental background

The accidental overlap between a positron with energy close to the kinematic edge of the Michel decay and an energetic photon from RMD or positron AIF is the leading source of the background.

##### 4.4.1.1 Single photon background

High energy single photon background events are mainly produced by two processes: RMD and AIF of positrons. The contribution from external Bremsstrahlung is negligibly small in our analysis window. RMD is the Michel decay with the emission of a photon, also called inner Bremsstrahlung. The integrated fraction of the spectrum of photons from RMD is roughly proportional to the square of the integration window size near the signal energy, which is usually determined by the energy resolution [32, 33]. AIF photon background events are produced when a positron from Michel decay annihilates with an electron in the material along the positron trajectory into two photons and the most energetic photon enters the LXe detector. The emission direction of the most energetic photon is closely aligned to that of the original positron and the cross section is peaked with one photon carrying most of the energy. The total number of AIF background events depends on the layout and the material budget of the detector along the positron trajectory.

The integrated photon yield per decay above *y* is plotted on the vertical axis (the maximum allowed value for *y* is slightly smaller than one for RMD and slightly larger than one for AIF, due to the electron mass). The RMD photon fraction is 55 %, and the AIF photon fraction is 45 % in the \(y>0.9\) region. From Fig. 22, AIF becomes dominant in the \(y>0.92\) region. Since the energy spectra decrease rapidly as a function of *y* near the kinematic end-point, a good energy resolution reduces steeply the single photon background.

In addition to the RMD and AIF components in the analysis window, there are contributions from pile-up photons and cosmic-ray components, totalling at most 4–6 %. The pile-up rejection methods are discussed in Sect. 3.1.3. The cosmic-ray events are rejected by using topological cuts based on the deposited charge ratio of the inner to outer face and the reconstructed depth (*w*) because these events mostly come from the outer face of the LXe detector while signal events are expected from the inner face. After applying these cuts, photon background spectra are measured directly from the timing side-band data, and the measured shape is used in the analysis window.

##### 4.4.1.2 Single positron background

The single positron background in the analysis window results from the Michel decay positrons. Although the theoretical positron energy spectrum of the Michel decay is well known [34], the measured positron spectrum is severely distorted by the design of the spectrometer which tracks only high momentum positrons, and therefore introduces a strong momentum dependence in the tracking efficiency. The resolution in the momentum reconstruction also influences the measured spectrum. The positron spectrum obtained by our detector with the resolution function and the acceptance curve are shown in [8]. There is a plateau region near the signal energy where the measurement rate of the positrons reaches its maximum, which allows us to extract the shape of the positron background precisely from the data with high statistics.

##### 4.4.1.3 Effective branching ratio

*q*. Figure 23a shows the effective branching ratio for the accidental background as a function of the lower edges of the integration regions of \(E_\mathrm {e^+}\) and \(E_{\mathrm {\gamma }}\). The same plot for the RMD background is shown in Fig. 23b, which is described in detail in the Sect. 4.4.2. It can be seen that the accidental background is much more severe than the RMD background.

#### 4.4.2 RMD background

A second background source consists of the \(\mu ^+ \rightarrow \mathrm{e}^+\gamma \nu \bar{\nu }\) RMD process, producing a time-coincident \(\mathrm {e}^+\gamma \)-pair. The RMD events fall into the analysis window when the two neutrinos have small momentum and are identical to the signal in the limit of neutrino energies equal to zero. Observation of the RMD events provides a strong internal consistency check for the \(\mu ^+ \rightarrow \mathrm{e}^+ \gamma \) analysis since it is a source of time-coincident \(\mathrm {e}^+\gamma \)-pairs.

The estimated number of RMD events in the \(\mu ^+ \rightarrow \mathrm{e}^+ \gamma \) analysis window is calculated by extrapolating the energy side-band distribution to the analysis window, giving an estimate of \(\langle N_\mathrm{RMD}\rangle = 614 \pm 34\), which is used as a statistical constraint in the likelihood analysis.

The RMD branching ratio is highly suppressed when the integration region is close to the limit of \(\mu ^+ \rightarrow \mathrm{e}^+ \gamma \) kinematics. The effective branching ratio, which is calculated by considering the detector resolution, is plotted in Fig. 23b as a function of the lower edges of the integration regions on \(E_\mathrm {e^+}\) and \(E_{\mathrm {\gamma }}\). For example, the effective branching ratio for \(52.0< E_{\mathrm {\gamma }}< 53.5\) MeV and \(52.0<E_\mathrm {e^+}<53.5\) MeV is \(3\times 10^{-14}\), more than twenty times lower than that due to the accidental background.

### 4.5 Maximum likelihood analysis

#### 4.5.1 Likelihood function

*i*-th event.

*S*, *R* and *A* are the probability density functions (PDFs) for the signal, RMD and accidental background events, respectively. \(N = N_\mathrm{sig}+ N_\mathrm{RMD}+ N_\mathrm{ACC}\) is the total number of events in the fit and \(N_\mathrm{obs}\) is the total number of detected events in the analysis window. *C* is a term for the constraints of nuisance parameters.

The expected numbers of RMD and accidental background events with their respective uncertainties are evaluated in the side-bands and are applied as Gaussian constraints on \(N_\mathrm{RMD}\) and \(N_\mathrm{ACC}\) in the *C* term in Eq. 3.

The target position parameters \(z_0\) are subject to Gaussian constraints whose widths are the year dependent systematic uncertainties; the target deformation parameters \(k_\mathrm {t}\) are constrained with uniform distributions in year dependent intervals.

#### 4.5.2 PDFs

##### 4.5.2.1 Event-by-event PDFs

The PDFs for signal, RMD and accidental background events are formed as a function of the five observables (\(E_{\mathrm {\gamma }}\), \(E_\mathrm {e^+}\), \(t_{\mathrm {e^+ \gamma }}\), \(\theta _{\mathrm {e^+ \gamma }}\), \(\phi _{\mathrm {e^+ \gamma }}\)) taking into account the correlations between them and the dependence of each of them and of their uncertainties on the photon interaction vertex, the muon decay vertex and the track reconstruction quality.

Because the detector resolutions depend on the detector conditions and the hit position in the detector, this approach uses different PDFs for each event (event-by-event PDFs). The energy response, the position resolution and the background spectrum of the LXe detector are evaluated as function of the interaction vertex. For the positron PDF, the fitting errors of the tracking variables are used to compute the resolutions; namely the resolution on the observable *q* (\(\sigma _q\)) is replaced by a product of the pull parameter (\(s_q\)) and the fitting error (\(\sigma _q^{\prime }\)). The pull parameters are extracted from the data as described in Sect. 3.2.5 and are common to all events in a given DAQ period. The correlations between observables are also treated on an event-by-event basis. For example, the errors on the momentum and the angle are correlated because the emission angle of positrons is computed by extrapolating the fitted tracks to the target plane. Since the true positron momentum of the signal is known, the mean of the signal angle PDF can be corrected as a function of the observed momentum.

Because the energies, relative timing and angles for the signal are fixed and known, the signal PDFs are described by the product of the detector response function for each observable. The correlations between the errors of the observables are implemented in the \(t_{\mathrm {e^+ \gamma }}\), \(\theta _{\mathrm {e^+ \gamma }}\) and \(\phi _{\mathrm {e^+ \gamma }}\) PDFs by shifting the centres and modifying the resolutions. The possible reconstruction bias due to errors on the target position and deformation is included in the signal PDF by shifting the centre of the \(\phi _{\mathrm {e^+ \gamma }}\) PDF by an amount computed from \({\varvec{t}}\). The amount of the shift is computed geometrically by shifting the target by \(\delta z_0 + k_\mathrm {t} \cdot (z_\mathrm {t,FARO}(x_{\mathrm {e^+}}, y_{\mathrm {e^+}}) - z_{\mathrm {t},2013}(x_{\mathrm {e^+}}, y_{\mathrm {e^+}}))\) in the \(z_\mathrm {t}\) direction, where \(\delta z_0\), \(z_\mathrm {t,FARO}\) and \(z_{\mathrm {t},2013}\) are the deviation of \(z_0\) from the nominal value and the coordinates defined by the FARO measurements and the 2013 paraboloid fit, respectively (see Fig. 14). For the \(t_{\mathrm {e^+ \gamma }}\) PDF, events are categorised by using \({\varvec{q}}_\mathrm{e^+}\), which consists of the track-fitting quality and the matching quality between the fitted track and the hit position on the TC. The resolution and the central value are extracted for each category from the observed RMD timing peak. The dependence on \(E_{\mathrm {\gamma }}\) and \(E_\mathrm {e^+}\) is taken into account. Most of the parameters used to describe the correlations are extracted from data by using the double-turn method (see Sect. 3.2.5), while a few parameters (for instance, the slope parameter for the \(\delta _{t_{\mathrm {e^+ \gamma }}}\)–\(\delta _{E_\mathrm {e^+}}\) correlation, where \(\delta _x\) is the difference between the observed and the true value of the observable *x*) are extracted from a MC simulation.

The RMD PDF is formed by the convolution of the detector response and the kinematic distribution in the parameter space, (\(E_{\mathrm {\gamma }}\), \(E_\mathrm {e^+}\), \(\theta _{\mathrm {e^+ \gamma }}\), \(\phi _{\mathrm {e^+ \gamma }}\)), expected from the Standard Model [35]. The correlations between the variables are included in the kinematic model. The PDF for \(t_{\mathrm {e^+ \gamma }}\) is almost the same as that of the signal PDF, while the correlation between \(\delta _{t_{\mathrm {e^+ \gamma }}}\) and \(E_\mathrm {e^+}\) is excluded.

The accidental background PDFs are extracted from the timing side-band data. For \(E_\mathrm {e^+}\), the spectrum, after applying the same event selection on the track reconstruction quality as for the physics analysis, is fitted with a function formed by the convolution of the Michel positron spectrum and a parameterised function describing the detector response. For \(E_{\mathrm {\gamma }}\), the energy spectra after applying the pile-up and cosmic-ray cuts and a loose selection on the \(\mathrm {e}^+\gamma \) relative angle, are fitted with a function to represent background photon, remaining cosmic-ray and the pile-up components convoluted with the detector response. The \(\theta _{\mathrm {e^+ \gamma }}\) and \(\phi _{\mathrm {e^+ \gamma }}\) PDFs are represented by polynomial functions fitted to the data after applying the same event selection except for the \(t_{\mathrm {e^+ \gamma }}\). For \(t_{\mathrm {e^+ \gamma }}\), a flat PDF is used.

##### 4.5.2.2 Constant PDFs

The event-by-event PDFs employ the entire information we have about detector responses and kinematic variable correlations. A slightly less sensitive analysis, based on an alternative set of PDFs, is used as a cross check; this approach was already implemented in [7].

In this alternative set of PDFs the events are characterised by “categories”, mainly determined by the tracking quality of positrons and by the reconstructed depth of the interaction vertex in the LXe detector for photons. A constant group of PDFs is determined year by year, one for each of the categories mentioned above; the relative stereo angle \(\Theta _{\mathrm {e^+ \gamma }}\) is treated as an observable instead of \(\theta _{\mathrm {e^+ \gamma }}\) and \(\phi _{\mathrm {e^+ \gamma }}\) separately, while the three other kinematic variables (\(E_\mathrm {e^+}\), \(E_{\mathrm {\gamma }}\) and \(t_{\mathrm {e^+ \gamma }}\)) are common to the two sets of PDFs. Correlations between kinematic variables are also taken into account with a simpler approach and the systematic uncertainties associated with the target position are included by shifting \(\Theta _{\mathrm {e^+ \gamma }}\) of each event by an appropriate amount, computed by a combination of the corresponding shifts of \(\theta _{\mathrm {e^+ \gamma }}\) and \(\phi _{\mathrm {e^+ \gamma }}\). Signal and RMD PDFs are modelled as in the event-by-event analysis by using calibration data and theoretical distributions, folded with detector response. This likelihood function is analogous to Eq. 3 with the inclusion of the Gaussian constraints on the expected number of RMD and accidental background events and of the Poissonian constraint on the expected total number of events. In what follows we refer to this set of PDFs as “constant PDFs” and to the analysis based on it as “constant PDFs’ analysis”.

#### 4.5.3 Confidence interval

The following systematic uncertainties are included in the calculation of the confidence interval: the normalisation (defined in Sect. 4.6), the alignment of the photon and the positron detectors, the alignment (position and deformation) of the muon stopping target, the photon energy scale, the positron energy bias, the centre of the signal \(t_{\mathrm {e^+ \gamma }}\) PDF, the shapes of the signal and background PDFs, and the correlations between the errors of the positron observables. The dominant systematic uncertainty is due to the target alignment as described in Sect. 4.7.1, which is included in the maximum likelihood fit by profiling the target parameters. The other uncertainties are included by randomising them in the generation of the pseudo experiments used to construct the distribution of the likelihood ratio.

### 4.6 Normalisation

Two independent methods are used to calculate \(N_{\mu }\). Since both methods use control samples measured simultaneously with a signal, they are independent of the instantaneous beam rate.

#### 4.6.1 Michel positron counting

The absolute values of the positron acceptance and efficiency cancel in the ratio \(\epsilon ^{\mathrm {e}\gamma }_{\mathrm {e}}/\epsilon ^{\mathrm {e}\nu \bar{\nu }}_{\mathrm {e}}\). Momentum dependent effects are derived from the Michel spectrum fit, resulting in \(\epsilon ^{\mathrm {e}\gamma }_{\mathrm {e}}/\epsilon ^{\mathrm {e}\nu \bar{\nu }}_{\mathrm {e}} = 1.149 \pm 0.017\).

The photon efficiency is evaluated via a MC simulation taking into account the observed event distribution. The average value is \(\epsilon ^{\mathrm {e}\gamma }_{\gamma }=0.647\). The main contribution to the photon inefficiency is from conversions before the LXe detector active volume: 14 % loss in the COBRA magnet, 7 % in the cryostat and PMTs, and 7 % in other materials. Another loss is due to shower escape from the inner face, resulting in a 6 % loss. The photon efficiency is also measured in the CEX run. By tagging an 83-MeV photon from a \(\pi ^0\) decay, the efficiency for detection of 55-MeV photons is measured to be 0.64–0.67, consistent with the evaluation from a MC simulation. With an additional selection efficiency of 0.97 resulting from the rejection of pile-up and cosmic-ray events, \(\epsilon ^{\mathrm {e}\gamma }_{\gamma }=0.625 \pm 0.023\).

The trigger efficiency consists of three components; photon energy, time coincidence, and direction match. The efficiency of photon energy is estimated from the online energy resolution and found to be \(\gtrsim 0.995\) for \(E_{\mathrm {\gamma }}>48.0\) MeV. The efficiency of the time coincidence is estimated from the online time resolution and found to be fully efficient. The direction match efficiency is evaluated, based on a MC simulation, to be \(\epsilon ^{\mathrm {e}\gamma }_{\mathrm {trg}} = 0.91 \pm 0.01\) and 0\(.96 \pm 0.01\) for the data up to and after 2011, respectively (see Fig. 9).

For \(\mathrm {e}^+\gamma \)-pairs that satisfy the selection criteria for each particle, two kinds of further selection are imposed. One is the cut for the AIF-like events described in Sect. 3.2.8, resulting in 1.1 % inefficiency for the signal events. The other is defined by the analysis window, in particular those for the relative angles and timing. The inefficiency is evaluated via a MC simulation taking into account the pile-up and detector condition. A loss of 3.2 % is due to the tails in the angular responses. Additionally, about 1.5 % of events are outside the time window, mainly due to the erroneous reconstruction of positron trajectories when one of the turns, usually the first, is missed. As a result, \(\epsilon ^{\mathrm {e}\gamma }_{\mathrm {sel}} = 0.943\pm 0.010\).

In total, the Michel positron counting method provides \(N_{\mu }\) with a 4.5 % uncertainty.

#### 4.6.2 RMD channel

We use events reconstructed in the energy side-band defined in Sect. 4.4.2, corresponding to \(\mathcal {B}^{\mathrm {e}\nu \bar{\nu }\gamma } = 4.9\times 10^{-9}\). The number of RMD events is extracted from the fit to the \(t_{\mathrm {e^+ \gamma }}\) distribution separately for each year dataset and for 12 statistically independent sub-windows, resulting in \(N^{\mathrm {e}\nu \bar{\nu }\gamma } = 29\,950\pm 527\) in total.

The momentum dependent ratio of the positron detection efficiency is extracted from the Michel spectrum fit. An additional correction for the momentum dependence of the missing turn probability is applied based on the evaluation of a MC simulation. A pre-scaled trigger with a lowered \(E_{\mathrm {\gamma }}\) threshold (by \(\approx \)\(4\) MeV) allows for a relative measurement of the energy-dependent efficiency curve of the LXe detector. The efficiency ratio of the direction match is evaluated from the distribution of accidental background. The effect of muon polarisation [9], which makes the background distribution non-flat (asymmetric) even in case of a fully efficient detector and trigger, is taken into account. Inefficiency due to the AIF-like event cuts and the tail in the time reconstruction are common to signal and RMD, and thus, only tails in the angular responses are relevant. A more detailed description of the RMD analysis is found in [38].

A \(\chi ^2\) fit is performed to extract \(N_{\mu }\) from the measured RMD spectrum. The systematic uncertainty on each factor, correlated among different windows, is accounted for in the fit. The uncertainty on \(N_{\mu }\) from the fit to the full dataset is 5.5 %.

#### 4.6.3 \(N_{\mu }\) summary

The normalisation factors calculated by the two methods are shown in Fig. 25. The two independent results are in good agreement and combined to give \(N_{\mu }\) with a 3.5 % uncertainty. The single event sensitivity for the full dataset is \(1/N_\mu = (5.84 \pm 0.21) \times 10^{-14}\).

### 4.7 Results

#### 4.7.1 Sensitivity

Best fit values of the branching ratios (\(\mathcal{B}_\mathrm {fit}\)), upper limits at 90 % C.L. (\( \mathcal{B}_{90}\)) and sensitivities (\( \mathcal{S}_{90}\))

Dataset | 2009–2011 | 2012–2013 | 2009–2013 |
---|---|---|---|

\( \mathcal{B}_\mathrm {fit}\times 10^{13}\) | \(-1.3\) | \(-5.5\) | \(-2.2\) |

\( \mathcal{B}_{90}\times 10^{13}\) | 6.1 | 7.9 | 4.2 |

\( \mathcal{S}_{90}\times 10^{13}\) | 8.0 | 8.2 | 5.3 |

The average contributions of the systematic uncertainties are evaluated by calculating the sensitivities without including them. The dominant one is found to be the uncertainty on the target alignment; it degrades the sensitivity by 13 % on average, while the total contribution of the other systematic uncertainties is less than 1 %. The sensitivity for the 2009–2011 dataset is found to be slightly worse than previously quoted in [7] due to a more conservative assignment of the systematic uncertainty on the target alignment.

#### 4.7.2 Likelihood analysis in the analysis window

A maximum likelihood analysis is performed to evaluate the number of signal events in the analysis window by the method described in Sect. 4.5. Figure 28 shows the profile-likelihood ratios as a function of the branching ratio observed for 2009–2011, 2012–2013, and 2009–2013 full dataset, which are all consistent with a null-signal hypothesis. The kinks visible in the curves (most obvious in 2012–2013) are due to the profiling of the target deformation parameters (see Sect. 4.5.1). In the positive side of the branching ratio, the estimate of the target shape parameters in the profiling is performed by looking for a positive excess of signal-like events in the \(\phi _{\mathrm {e^+ \gamma }}\) distribution. On the other hand, in the negative side, it is done by looking for a deficit of signal-like events. These parameters are therefore fitted to opposite directions (the paraboloid shape or the deformed shape defined by the FARO measurement) in the positive and the negative sides of the branching ratio. The likelihood curve shifts from one to another of the two shapes crossing 0 in the branching ratio. The best fit value on the branching ratio for the full dataset is \(-2.2\times 10^{-13}\). The upper limit of the confidence interval is calculated following the frequentist approach described in Sect. 4.5.3 to be \(4.2\times 10^{-13}\) at 90 % C.L.

The results from the maximum likelihood analysis are summarised in Table 2. The dominant systematic uncertainty is due to the target alignment uncertainty, which increases the upper limit by 5 % while the other uncertainties increase it by less than 1 % in total.

The upper limit on the branching ratio is consistent with the sensitivity under the background-only hypothesis presented in Sect. 4.7.1. This result is confirmed by following the profile of the log-likelihood curve as a function of the number of signal events, in parabolic approximation, and by independent analysis, based on a set of the constant PDFs, which will be discussed in Sect. 4.7.3.1.

A maximum likelihood fit without the constraints on \(N_\mathrm{RMD}\) and \(N_\mathrm{ACC}\) estimated in the side-bands is performed as a consistency check. The best fit values of NACC and NRMD for the combined dataset are \(7684\pm 103\) and \(663\pm 59\), respectively. They are consistent with the respective expectations of \(7744\pm 41\) and \(614\pm 34\) and also with the total number of observed events (\(N_\mathrm{obs}= 8344\)) in the analysis window.

#### 4.7.3 Discussion

##### 4.7.3.1 Constant PDFs’ analysis

The consistency of the two analyses is also checked by a set of pseudo experiments, specifically produced to be compatible with the structures of both the analyses (“common toy MCs”). The upper limits at 90 % C.L. observed in the two analyses for a sample of several hundred common toy MCs are compared in Fig. 31; the experimental result is marked by a star. There is a clear correlation between the upper limits from the two analyses with a \(\approx \)\(20\) % better sensitivity on average for the event-by-event PDFs’ analysis. By analysing the distribution of the differences between the upper limit reconstructed by the two analyses on this sample of common toy MCs, we found that the probability of obtaining a difference in the upper limit at least equal to that measured on the real data is 70 %.

##### 4.7.3.2 Comparison with previous analysis

## 5 Conclusions

A sensitive search for the lepton flavour violating muon decay mode \(\mu ^+ \rightarrow \mathrm {e}^+ \gamma \) was performed with the MEG detector in the years 2009–2013. A blind, maximum-likelihood analysis found no significant event excess compared to the expected background and established a new upper limit for the branching ratio of \( \mathcal{B} (\mu ^+ \rightarrow \mathrm{e}^+ \gamma ) < 4.2 \times 10^{-13}\) with 90 % C.L. This upper limit is the most stringent to date and provides important constraints on the existence of physics beyond the Standard Model.

The new measured upper limit improves our previous result [7] by a factor 1.5; the improvement in sensitivity amounts to a factor 1.5. Compared with the previous limit from the MEGA collaboration [39], our new upper limit represents a significant improvement by a factor 30.

An effort to upgrade the existing MEG detector is currently underway with the goal of achieving an additional improvement in the sensitivity of close to an order of magnitude [40]. The modifications are designed to increase acceptance, enable a higher muon stopping rate, and improve limiting detector resolutions. Tracking and timing detectors for measuring the positrons have been completely redesigned and other parts of the detector have been refurbished. The newly designed experiment, MEG II, will be able to use a muon decay rate twice that of MEG. The improved detector is expected to improve the branching ratio sensitivity to \(5 \times 10^{-14}\) with 3 years of data taking planned for the coming years.

*x*[] is the waveform amplitude in waveform time-bins,

*y*[] is the output signal in the same time-bins, and \(M=105\) is the number of points used in the average. This filter is based on the moving average, which is a simple and fast algorithm with a good response in time domain.

Two kinds of instability in the PMT response are observed: one is a long-term gain decrease due to decreased secondary emission mainly at the last dynode with collected charge and the other is a rate-dependent gain shift due to charge build-up on the dynodes.

## Acknowledgments

We are grateful for the support and co-operation provided by PSI as the host laboratory and to the technical and engineering staff of our institutes. This work is supported by DOE DEFG02-91ER40679 (USA), INFN (Italy), MEXT KAKENHI 22000004 and 26000004 (Japan), Schweizerischer Nationalfonds (SNF) Grant 200021_137738, the Russian Federation Ministry of Education and Science and Russian Fund for Basic Research Grant RFBR-14-22-03071.

## Copyright information

**Open Access**This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Funded by SCOAP^{3}