keV-Scale Sterile Neutrino Sensitivity Estimation with Time-Of-Flight Spectroscopy in KATRIN using Self-Consistent Approximate Monte Carlo

We investigate the sensitivity of the Karlsruhe Tritium Neutrino Experiment (KATRIN) to keV-scale sterile neutrinos, which are promising dark matter candidates. Since the active-sterile mixing would lead to a second component in the tritium $\beta$-spectrum with a weak relative intensity of order $\sin^2\theta \lesssim 1\times10^{-6}$, additional experimental strategies are required to extract this small signature and to eliminate systematics. A possible strategy is to run the experiment in an alternative time-of-flight (TOF) mode, yielding differential TOF spectra in contrast to the integrating standard mode. In order to estimate the sensitivity from a reduced sample size, a new analysis method, called self-consistent approximate Monte Carlo (SCAMC), has been developed. The simulations show that an ideal TOF mode would be able to achieve a statistical sensitivity of $\sin^2\theta \sim 5\times10^{-9}$ at one $\sigma$, improving on the standard mode by approximately a factor of two. This relative benefit grows significantly if additional exemplary systematics are considered. A possible implementation of the TOF mode with existing hardware, called gated filtering, is also investigated; it comes, however, at the price of a reduced average signal rate.


Introduction
In recent years, interest has grown in sterile neutrinos with a mass scale of a few keV [1]. They are proposed as dark matter particle candidates in cold dark matter (CDM) and especially warm dark matter (WDM) scenarios [2][3][4][5]. WDM has the potential to avoid issues regarding structure formation on small scales which are not yet solved for WIMP (weakly interacting massive particle) CDM [6][7][8][9][10][11][12]. However, the shortcomings of WIMP CDM can possibly be mitigated via baryonic feedback [13], while any sterile neutrino dark matter production mechanism needs to be fine-tuned to yield the correct DM density. Mass-dependent bounds on the sterile neutrino mixing with active neutrinos have been established by searches for sterile neutrino decay via X-ray satellites [14,15] and on the basis of theoretical considerations in order to avoid dark matter overproduction [16]; these bounds never exceed $\sin^2\theta \sim 10^{-7}$. The mass range has been constrained by the DM phase-space distribution in dwarf spheroidal galaxies [17] and gamma-ray line emission from the Galactic center region [18] to $1\,\mathrm{keV} < m_h < 50\,\mathrm{keV}$. In order to produce the existing amount of dark matter, mass and mixing angle are linked by the production mechanism, which can be nonresonant [16,19,20] or resonant [21][22][23][24]. Moreover, possible evidence of relic sterile neutrinos with mass $m_h = 7$ keV has been reported in XMM-Newton data [25][26][27].
e-mail: nicholas.steinbrink@uni-muenster.de
In principle, keV-scale sterile neutrinos can also be searched for in ground-based experiments, such as in tritium β-decay [28,29]. A promising example is the Karlsruhe Tritium Neutrino Experiment (KATRIN) [30], which is the most sensitive neutrino mass experiment currently under construction. Sterile neutrinos would be visible as a discontinuity in the β-decay spectrum if they have a sufficiently large mixing angle with electron neutrinos. In order to adapt KATRIN, which is optimized for light neutrinos of $m_l \sim \mathcal{O}(\mathrm{eV})$, for keV sterile neutrinos, different approaches are discussed with the goal of enhancing statistics and managing systematics. A suitable idea is to develop a dedicated detector measuring in differential mode [31][32][33]. As an alternative, it is worthwhile to study the performance of a Time-of-Flight (TOF) mode, which has already been shown to be promising in theory for active neutrino mass measurements [34].
In this publication the sensitivity of a keV-scale sterile neutrino search based on TOF spectroscopy with the KATRIN experiment is discussed, both for an ideal measurement method and for a possible implementation with minimal hardware modifications.
arXiv:1710.04939v3 [physics.ins-det] 13 Feb 2018
2 Sterile Neutrino Search with TOF spectroscopy

Sterile Neutrinos in Tritium β-Decay and KATRIN
There has been some previous work on sterile neutrinos in tritium β-decay in general. Most publications focus on eV-scale sterile neutrinos [35][36][37][38][39], which are proposed to address certain anomalies in oscillation experiments [40][41][42][43][44][45]. Recently, however, dedicated studies dealing with keV-scale neutrinos have also been published, such as [29,31,32], as well as studies involving more exotic models, such as [46][47][48]. We will briefly summarize the main effect of a keV-scale sterile neutrino on the tritium β-spectrum, and refer especially to [31] for deeper insights into systematics and theoretical corrections.
The tritium β-decay spectrum with a single neutrino mass eigenstate $m_i$ is given as [28,49,50]

$$\frac{\mathrm{d}\Gamma_i}{\mathrm{d}E} = N\,\frac{G_F^2}{2\pi^3\hbar^7c^5}\,\cos^2\theta_C\,|M|^2\,F(E,Z')\,p\,(E+m_ec^2)\sum_j P_j\,(E_0-V_j-E)\,\sqrt{(E_0-V_j-E)^2-m_i^2c^4}, \qquad (1)$$

where $E$ is the kinetic electron energy, $\theta_C$ the Cabibbo angle, $N$ the number of tritium atoms, $G_F$ the Fermi constant, $M$ the nuclear matrix element, $F(E,Z')$ the Fermi function with the charge of the daughter ion $Z'$, $p$ the electron momentum, $m_e$ the electron mass, $P_j$ the probability to decay to an excited electronic and rotational-vibrational state with excitation energy $V_j$ [51][52][53] and $E_0$ the beta endpoint, i.e. the maximum kinetic energy in case of $m_i = 0$. Each term of the sum contributes only for $E_0 - V_j - E \geq m_ic^2$. The electron neutrino is a superposition of multiple mass eigenstates. Since the flavor eigenstate defines the interaction, whereas the mass eigenstates describe the dynamics of the decay, the β-spectrum for the electron neutrino is an incoherent superposition of the contributions of each mass eigenstate,

$$\frac{\mathrm{d}\Gamma}{\mathrm{d}E} = \sum_i |U_{ei}|^2\,\frac{\mathrm{d}\Gamma_i}{\mathrm{d}E}. \qquad (2)$$

In case of an additional keV-scale sterile neutrino, a fourth mass state $m_4$ is introduced with a significantly lower mixing with the electron neutrino, $|U_{e4}|^2 \ll |U_{ei}|^2$ ($i \in \{1, 2, 3\}$). In the following we define the heavy or sterile neutrino mass as $m_h \equiv m_4$ and the active-sterile mixing angle as $\sin^2\theta \equiv |U_{e4}|^2 < 10^{-7}$ [15]. Since the light mass eigenstates 1, 2, 3 are not distinguishable by KATRIN [50], a light neutrino mass is defined as

$$m_l^2 = \sum_{i=1}^{3} |U_{ei}|^2\,m_i^2.$$

The combined β-spectrum with sterile and active neutrino can then be expressed as

$$\frac{\mathrm{d}\Gamma}{\mathrm{d}E} = \cos^2\theta\,\frac{\mathrm{d}\Gamma}{\mathrm{d}E}(m_l) + \sin^2\theta\,\frac{\mathrm{d}\Gamma}{\mathrm{d}E}(m_h). \qquad (3)$$

An example with exaggerated mixing is shown in Fig. 1.
In probing the absolute neutrino mass scale, the KATRIN experiment is designed to measure the light neutrino mass $m_l$ with a sensitivity of < 0.2 eV at 90% confidence level (CL) [30]. To this end it uses a windowless gaseous molecular tritium source (WGTS) [54] with an activity of $\sim 10^{11}$ Bq. The electrons from the β-decay are filtered in the main spectrometer based on the magnetic adiabatic collimation with electrostatic filter (MAC-E-Filter) principle [55].
The magnetic field in the center of the main spectrometer, the analyzing plane, is held small at $B_A = 0.3$ mT and is otherwise high, at $B_S = 3.6$ T in the source and at $B_{\max} = 6$ T at the exit of the main spectrometer just before the counting detector. Due to adiabatic conservation of the relativistic magnetic moment, electron momenta are aligned with the field in the analyzing plane. By additionally applying an electrostatic retarding potential $qU$ in the analyzing plane, the MAC-E-Filter acts as a high-pass filter with a sharp energy resolution of $\Delta E/E = B_A/B_{\max} \approx 0.9\,\mathrm{eV}/E_0$. The count rate is then measured in the focal plane detector (FPD). That way, KATRIN measures the integral β-spectrum as a function of $qU$.
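The quoted filter width can be checked with a few lines of arithmetic. The sketch below assumes an analyzing-plane field of 0.3 mT, which is the value that reproduces the stated resolution of about 0.9 eV at the endpoint; the function name is purely illustrative.

```python
E0_EV = 18575.0    # tritium endpoint energy in eV, as used later in the text
B_MAX_T = 6.0      # maximum magnetic field in tesla
B_A_T = 0.3e-3     # analyzing-plane field in tesla (assumed, see lead-in)


def mac_e_filter_width(energy_ev, b_analyzing, b_max):
    """Filter width Delta E = E * B_A / B_max of a MAC-E filter."""
    return energy_ev * b_analyzing / b_max


delta_e = mac_e_filter_width(E0_EV, B_A_T, B_MAX_T)
print(f"Delta E at the endpoint: {delta_e:.2f} eV")
```

The same function shows how the width scales if the analyzing-plane field is raised, for example to improve adiabaticity at low retarding potentials.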

Time-Of-Flight Spectroscopy
The idea of using Time-Of-Flight (TOF) spectroscopy for a measurement of the light neutrino mass is explained in detail in Ref. [34]. In the following, we will recapitulate the approach briefly and explain the motivations for investigating this technique for a keV-scale sterile neutrino search as well.
In contrast to the standard mode of operation, as described in the last section, TOF spectroscopy allows measuring not only the count rate, but a full TOF spectrum at a given retarding potential $qU$. The TOF as a function of the energy is given by integrating the reciprocal velocity along the center of motion, which we will assume for simplicity to be on the z-axis,

$$\tau(E,\vartheta) = \int_{z_{\mathrm{start}}}^{z_{\mathrm{stop}}} \frac{\mathrm{d}z}{v_\parallel(z)} = \int_{z_{\mathrm{start}}}^{z_{\mathrm{stop}}} \frac{E + m_ec^2 - q\Delta U(z)}{p_\parallel(z)\,c^2}\,\mathrm{d}z, \qquad (4)$$

where $E$ and $\vartheta$ are the initial kinetic energy and polar angle of the electron, respectively. $z_{\mathrm{start}}$ and $z_{\mathrm{stop}}$ are the positions on the beam axis between which the TOF is measured, $\Delta U(z)$ is the potential difference as a function of position $z$ and $p_\parallel(z)$ the parallel momentum. By assuming adiabatic conservation of the magnetic moment, $p_\parallel(z)$ can be expressed analytically as a function of the potential $\Delta U(z)$ and magnetic field $B(z)$ (for the derivation see Ref. [34]). If these are known, the integral in Eq. (4) can be solved numerically.
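A minimal numerical sketch of this TOF integral is given below. It assumes adiabatic conservation of $p_\perp^2/B$ to obtain the parallel momentum and uses purely hypothetical smooth profiles for the potential and the magnetic field along a 24 m axis; the real KATRIN field maps are only known numerically and are not reproduced here. Function names are illustrative.

```python
import numpy as np

M_E_C2 = 510_998.95    # electron rest energy in eV
C = 299_792_458.0      # speed of light in m/s


def parallel_momentum_c(energy, theta, q_du, b, b_start):
    """p_parallel * c in eV, assuming adiabatic conservation of p_perp^2 / B."""
    e_kin = energy - q_du                        # local kinetic energy
    p2_total = e_kin**2 + 2.0 * e_kin * M_E_C2   # (p c)^2 at position z
    p2_perp = (energy**2 + 2.0 * energy * M_E_C2) * np.sin(theta)**2 * b / b_start
    return np.sqrt(np.clip(p2_total - p2_perp, 0.0, None))


def time_of_flight(energy, theta, z, q_du, b):
    """Trapezoidal integral of 1 / v_parallel over z; inf if the electron is reflected."""
    pc = parallel_momentum_c(energy, theta, q_du, b, b[0])
    if np.any(pc <= 0.0):
        return np.inf
    inv_v = (energy - q_du + M_E_C2) / (pc * C)  # 1 / v_parallel in s/m
    return float(np.sum(0.5 * (inv_v[1:] + inv_v[:-1]) * np.diff(z)))


# Hypothetical toy profiles: potential rises and field drops towards the center.
LENGTH = 24.0
z = np.linspace(0.0, LENGTH, 4001)
shape = np.sin(np.pi * z / LENGTH)**2
qU = 18_000.0                           # retarding energy in eV
q_du = qU * shape
b = 3.6 - (3.6 - 0.3e-3) * shape        # field in tesla

for surplus in (10.0, 100.0, 500.0):
    tau = time_of_flight(qU + surplus, 0.0, z, q_du, b)
    print(f"surplus {surplus:6.1f} eV -> TOF {tau * 1e6:.2f} us")
```

With such a profile the flight time grows steeply as the surplus energy above the retarding potential shrinks, which is the behavior that makes the TOF spectrum a differential probe of the energy spectrum.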
Since the TOF is a function of the energy, the β-spectrum can be transformed into a TOF spectrum $\mathrm{d}N/\mathrm{d}\tau$, given the initial angular distribution of the β-decay electrons. A feature in the β-spectrum such as a sterile neutrino contribution would then also have a corresponding effect on the TOF spectrum if the retarding energy $qU$ is sufficiently low. Like the β-spectrum (2), the TOF spectrum can be expressed as a superposition of a component with a heavy neutrino mass $m_h$ and a component with a light neutrino mass $m_l$:

$$\frac{\mathrm{d}N}{\mathrm{d}\tau} = \cos^2\theta\,\frac{\mathrm{d}N}{\mathrm{d}\tau}(m_l) + \sin^2\theta\,\frac{\mathrm{d}N}{\mathrm{d}\tau}(m_h). \qquad (5)$$

For each of these two components, the TOF spectrum can then formally be obtained from the β-spectrum with neutrino mass $m_l$ and $m_h$, respectively, using the transformation theorem for densities [56]:

$$\frac{\mathrm{d}N}{\mathrm{d}\tau} = \int\!\!\int g(\vartheta)\,\frac{\mathrm{d}N}{\mathrm{d}E}(E,\vartheta)\,\delta\big(\tau - \tau(E,\vartheta)\big)\,\mathrm{d}\vartheta\,\mathrm{d}E, \qquad (6)$$

where $\tau(E,\vartheta)$ is the TOF of Eq. (4), $g(\vartheta)$ denotes the angular distribution and $\mathrm{d}N/\mathrm{d}E(E,\vartheta)$ the response corrected energy spectrum, which itself is a function of the β-spectrum (1) for a given neutrino mass. If angular changes from inelastic scattering processes in the tritium source are neglected, the angular distribution is approximately independent of the energy spectrum and given by isotropic emission within the angular acceptance interval,

$$g(\vartheta) \propto \sin\vartheta \quad \text{for } 0 \leq \vartheta \leq \vartheta_{\max}, \qquad (7)$$

where the default KATRIN field settings give $\vartheta_{\max} = \arcsin\sqrt{B_S/B_{\max}} = 50.77^\circ$. The response corrected energy spectrum $\mathrm{d}N/\mathrm{d}E(E,\vartheta)$ in Eq. (6) is given to good approximation by the β-spectrum (3), convolved with the inelastic energy loss function in the tritium source,

$$\frac{\mathrm{d}N}{\mathrm{d}E}(E,\vartheta) = \sum_n p_n(\vartheta)\,\Big(\frac{\mathrm{d}\Gamma}{\mathrm{d}E} \otimes f_n\Big)(E), \qquad (8)$$

where $f_n$ is the energy loss spectrum of scattering order $n$, which can be approximately defined via recursive convolution with the single scattering energy loss spectrum $f_1$.
This can be written as

$$f_n = f_{n-1} \otimes f_1. \qquad (9)$$

The probability $p_n$ that an electron is scattered $n$ times depends on the emission angle $\vartheta$ and is given by a Poisson law,

$$p_n(\vartheta) = \frac{\lambda(\vartheta)^n}{n!}\,e^{-\lambda(\vartheta)}. \qquad (10)$$

The average number of scattering processes $\lambda$ is determined by the column density $\rho d$ of the tritium source, the mean free column density $\rho d_{\mathrm{free}}$ and the scattering cross section $\sigma_{\mathrm{scat}}$ (Eq. (11)). Since the probability of n-fold scattering is a function of the emission angle (10), the response corrected energy spectrum (8) itself becomes dependent on the angle. Note that the scattering model is simplified, since angular changes in collisions are neglected and the scattering probabilities are averaged over a hypothetical uniform density profile in the source. We would like to clarify that in our actual implementation the n-fold energy loss spectra are not generated via convolution but via Monte Carlo, which, however, yields equivalent results. Furthermore, in using Eq. (4), the radial starting position is always assumed to be $r = 0$, which is not the case in KATRIN, but we do not expect significant changes in the spectral shape for outer radii. For the analysis of real experimental data a fully realistic treatment would be necessary, yet for a sensitivity study in principle these approximations are reasonable.
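The two random ingredients of the source model, the emission angle and the number of inelastic scatterings, can be sampled as sketched below. Isotropic emission within the acceptance corresponds to a uniform distribution in $\cos\vartheta$. The angular dependence of the mean scattering number is taken here as $\lambda(\vartheta) = \lambda_0/\cos\vartheta$, which is an assumption made for illustration (path length through the source scaling with $1/\cos\vartheta$), and the value of `lam0` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
B_S, B_MAX = 3.6, 6.0                        # source and maximum field in tesla
THETA_MAX = np.arcsin(np.sqrt(B_S / B_MAX))  # acceptance angle in radians


def sample_polar_angles(size):
    """Isotropic emission restricted to theta <= THETA_MAX (uniform in cos theta)."""
    cos_theta = rng.uniform(np.cos(THETA_MAX), 1.0, size)
    return np.arccos(cos_theta)


def sample_scattering_counts(theta, lam0):
    """Poisson-distributed number of inelastic scatterings for each electron.

    lam0 / cos(theta) is an assumed form of the mean scattering number.
    """
    return rng.poisson(lam0 / np.cos(theta))


theta = sample_polar_angles(100_000)
n_scat = sample_scattering_counts(theta, lam0=0.5)
print(f"theta_max = {np.degrees(THETA_MAX):.2f} deg")
print("fraction of unscattered electrons:", np.mean(n_scat == 0))
```

Sampling the scattering multiplicity per event in this way is what replaces the explicit recursive convolution of the energy loss spectra in a Monte Carlo treatment.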
The benefits of a TOF measurement can be understood from Fig. 2, where the TOF (4) as a function of $E$ for different angles is shown. It can be seen that energy differences up to some ∼ 100 eV above the retarding potential translate into significant TOF differences. Within these regions, TOF spectroscopy is thus a sensitive differential measurement of the energy spectrum. Combining multiple TOF spectra measured at different retarding energies thus allows measuring a differential equivalent of the β-spectrum throughout the whole region of interest. As already outlined in Ref. [31], a differential measurement has important benefits for a sterile neutrino search. On the one hand, it enhances the statistical sensitivity, since the sterile neutrino signature can be measured directly without any intrinsic background from higher energies as in the classic high-pass mode. On the other hand, it reduces the systematic uncertainty, since it improves the distinction between systematic effects and a real sterile neutrino signature in the spectrum.
Fig. 2 (caption, partial) By combining multiple TOF spectra with different retarding energies, the TOF method will give a differential map of the energy spectrum within the measuring interval.

TOF Measurement
As the approach is rather novel, most existing ideas for TOF measurement are still in an early development phase and have not been tested. There are ongoing efforts to develop hardware intended to detect passing electrons with minimal interference with their energy (electron tagger) [34]. Approaches include measuring tiny excitations induced in an RF cavity or detecting the weak synchrotron emission of the electrons in the magnetic field via long antennas (cf. Refs. [57,58]). While promising, there has unfortunately not yet been any breakthrough in the technical realization of such an electron tagger. Additionally, it seems unlikely that such an approach is also useful for keV sterile neutrino searches. For a sufficient sensitivity on $\sin^2\theta$ the count rate needs to be as high as possible. However, count rates much above 10 kcps would lead to ambiguities in the combination of a start signal in the electron tagger and the stop signal in the detector, given the overall TOF of order $\sim \mu$s (see Fig. 2). A method which has already been tested in the predecessor Mainz experiment [59] is a periodic blocking of the electron flux, called gated filtering (GF). If electrons are only transmitted during a short fraction of the time, the arrival time spectrum approximates the TOF spectrum. In KATRIN, this could for instance be achieved by pulsing the pre-spectrometer potential between one setting with full transmission and one setting with zero transmission (Fig. 3). The main downside of the method is that it sacrifices statistics in order to gain time information. However, it would require minimal hardware modifications, since only the capability to pulse the pre-spectrometer potential by some keV would have to be added.
Since the focal plane detector of KATRIN is optimized for low rates near the endpoint, the method could in principle also be utilized for an early keV sterile neutrino search by using a small duty cycle with sharp pulses, thereby reducing the count rate. However, in this scenario with small hardware modifications, it is unlikely that the pre-spectrometer potential can be pulsed by more than some keV. Due to the capacitance of the pre-spectrometer, there is possibly a non-vanishing ramping time involved, depending on the ramping interval. Electrons arriving within the ramping time are either accelerated or retarded, giving rise to non-isochronous background. The problem can be mitigated partly by using a voltage supply with higher power. Alternatively, a mechanical high-frequency beam shutter could be used. However, this would come at the cost of larger modifications of the set-up and a lower flexibility regarding fine-tuning of the timing parameters. We will not discuss this problem further and simply assume an ideally efficient method of periodically blocking the beam. However, we will restrict the sensitivity study of the sterile neutrino search with the GF method to a measurement region spanning only a few keV below the endpoint.

Monte Carlo Sensitivity Estimation
The TOF spectrum (6) cannot be calculated analytically, since the magnetic field $B(z)$ and electric potential $q\Delta U(z)$ are only known numerically. There are two remaining possibilities for simulating TOF spectra. The first approach is to evaluate the δ function in the TOF spectrum (6) via numerical integration. This method has been used in Ref. [34] since it generally delivers precise results and scales well. The bottleneck of this method is, however, the convolution of the β-spectrum with the n-fold energy loss spectra (8). The convolution routine is rather performance-intensive, especially for a large spectral surplus $E_0 - qU$ (as present in the case of a keV-scale sterile neutrino search), and requires complicated optimizations to work successfully. Furthermore, if further effects such as angular-changing collisions are to be added in future studies, the implementation will become more difficult.
Therefore, we chose to apply the second approach, which is to generate the TOF spectra (6) via Monte Carlo (MC) simulation. In particular, this avoids the convolution of the β-spectrum with the energy loss function (8), since the energy loss can be randomly generated for each event without additional expensive convolutions. While a MC approach is generally very flexible when it comes to the addition of more detailed effects and systematics, it is generally not as scalable in terms of the expected number of events as a purely numerical approach. KATRIN is designed for measurements near the β-endpoint with low rates on the order of several cps. The measurements for the keV-scale sterile neutrino detection, however, have to be performed over a significantly broader region of the β-spectrum, and thus count rates up to $\sim 10^{10}$ cps can be expected. For a data taking period of three years, one would thus expect up to $\sim 10^{18}$ events. If a realistic model for a sensitivity analysis is to be simulated event-by-event, the sample size needs to be significantly larger than the expected number of events. In our case, the calculation of flight times for more than $10^{18}$ events is simply not possible within a reasonable computing time.
However, we will show that, if the signal is sufficiently small compared to the total expected rate, the dominating "background part" of the model (corresponding to the $\cos^2\theta$ term in Eq. (5)) can be approximated. This works because, for a pure sensitivity study, as opposed to an analysis of real data, only the fidelity of the signal is relevant.

Self-Consistent Approximate Monte Carlo
In this section, we argue that a modified Monte Carlo strategy, from here on called self-consistent approximate Monte Carlo (SCAMC), can reduce the necessary total sample size sufficiently to address the problems mentioned above. This works if two requirements are met. These are 1. that the model can be separated into a background part and a signal part, with the latter sufficiently smaller than the former, and 2. that model and toy data are self-consistent, i.e. the toy data are sampled directly from the model.
We will first discuss this approach for a generic case. Assume that the model distribution $\Phi$ can be expressed as a linear combination of a signal contribution $c_S\Phi_S$, sampled with maximum precision, and a background contribution $c_B\Phi_B$,

$$\Phi = c_S\Phi_S + c_B\Phi_B. \qquad (12)$$

The distribution of interest is then replaced by a modified distribution

$$\tilde\Phi = c_S\Phi_S + c_B\tilde\Phi_B \qquad (13)$$

with $\tilde\Phi_B \sim \Phi_B$, where the background component is either approximated by an analytic expression or simulated by MC with a reduced sample size. We demand that $\tilde\Phi_B$ is independent of any parameter of interest $\mu$ (and of any parameter which is strongly correlated with a parameter of interest):

$$\frac{\partial\tilde\Phi_B}{\partial\mu} = 0. \qquad (14)$$

The approximate model (13) can then be used as a replacement for the real model. The sensitivity estimation can then be continued in the standard frequentist way: toy data are sampled from $\tilde\Phi$ for given parameter choices, and the confidence region for the parameter of interest $\mu$ is determined via $\chi^2$ fits.
The benefit of this strategy can be understood in the following way. Since the data have been sampled from the model, any error in the model is also passed on to the data. However, while the total approximated distribution $\tilde\Phi$ itself is inaccurate, it still contains all essential information about the sensitivity, since $\tilde\Phi - c_B\tilde\Phi_B = c_S\Phi_S$ holds exactly (Fig. 4). Since only the fidelity of the signal is relevant for the sensitivity analysis (which we ensured with condition (14)), the errors in the model and in the data approximately cancel each other in the fit. It can be shown in this case that the width of the $\chi^2$ minimum stays the same as long as the background component is at least approximately correct. A simplified proof can be found in Appendix A.
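The claim that the width of the $\chi^2$ minimum is insensitive to the background approximation can be illustrated with a toy calculation. The sketch below does not perform fits with toy data; instead it uses the Fisher information of a binned Poisson model, which gives the expected width of the minimum when the data are drawn from the same model that is fitted. All shapes, the event number and the signal fraction are hypothetical, and the free parameters are the signal fraction and an overall amplitude.

```python
import numpy as np

rng = np.random.default_rng(seed=7)
edges = np.linspace(0.0, 1.0, 51)
centers = 0.5 * (edges[1:] + edges[:-1])

# Hypothetical shapes: smooth background, signal with a kink at x = 0.6.
phi_b = np.exp(-3.0 * centers)
phi_b /= phi_b.sum()
phi_s = np.clip(0.6 - centers, 0.0, None)
phi_s /= phi_s.sum()


def sigma_signal_fraction(phi_s, phi_b, c_s, n_events):
    """Expected 1-sigma uncertainty on c_S with a free overall amplitude."""
    mu = n_events * (c_s * phi_s + (1.0 - c_s) * phi_b)  # expected counts
    d_c = n_events * (phi_s - phi_b)                     # d mu / d c_S
    d_amp = mu                                           # d mu / d amplitude at 1
    grads = np.vstack([d_c, d_amp])
    fisher = (grads / mu) @ grads.T                      # 2 x 2 Fisher matrix
    return float(np.sqrt(np.linalg.inv(fisher)[0, 0]))


n_events, c_s = 1e12, 1e-4

# Approximate background: histogram of a strongly reduced Monte Carlo sample.
counts = rng.multinomial(10**6, phi_b)
phi_b_approx = counts / counts.sum()

sigma_exact = sigma_signal_fraction(phi_s, phi_b, c_s, n_events)
sigma_approx = sigma_signal_fraction(phi_s, phi_b_approx, c_s, n_events)
print(f"sigma(c_S), exact background:       {sigma_exact:.3e}")
print(f"sigma(c_S), approximate background: {sigma_approx:.3e}")
```

In this toy setting the background sample is six orders of magnitude smaller than the expected number of events, yet the expected uncertainty on the signal fraction changes only marginally, because the signal shape enters exactly in both cases.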
We shall now apply the method to the case at hand, the keV-scale sterile neutrino search with TOF spectroscopy. As derived above, the electron TOF spectrum (6) with added sterile neutrinos can be expressed as a superposition of two TOF spectra with a light or heavy neutrino mass, $m_l$ and $m_h$, respectively. We identify the signal with the sterile neutrino component of the TOF spectrum (5) and the background with the active neutrino contribution,

$$\Phi_S = \frac{\mathrm{d}N}{\mathrm{d}\tau}(m_h), \qquad \Phi_B = \frac{\mathrm{d}N}{\mathrm{d}\tau}(m_l). \qquad (15)$$

The coefficients are then given by the active-sterile mixing,

$$c_S = \sin^2\theta, \qquad c_B = \cos^2\theta. \qquad (16)$$

It is obvious that for a small signal fraction of, e.g., $\sin^2\theta \lesssim 10^{-6}$, only a small fraction of the total expected events needs to be simulated now. However, since signal and background are always measured together and not separately, the required sample size is reduced even more. For demonstration purposes, let us define the signal expectation value in bin $i$ as

$$\lambda^S_i = n\,c_S \int_{\mathrm{bin}\,i} \Phi_S(\tau)\,\mathrm{d}\tau, \qquad (17)$$

with $n$ the total number of expected events (see Fig. 4). We will denote the number of expected events in bin $i$ as $n_i$. To approximate the necessary sample size, we require that the numerical uncertainty of $\lambda^S_i$ be smaller than the expected measurement uncertainty of the number of events in the corresponding bin, $\sigma_i$:

$$\Delta\lambda^S_i < \sigma_i. \qquad (18)$$

Assuming a Poissonian measurement uncertainty, $\sigma_i = \sqrt{n_i}$, and using $\Delta\lambda^S_i/\lambda^S_i = 1/\sqrt{N^S_i}$, where $N^S_i$ denotes the signal sample size in bin $i$, Eq. (18) gives

$$N^S_i > \frac{(\lambda^S_i)^2}{n_i}. \qquad (19)$$

We now define the total signal sample size as $N_S = \sum_i N^S_i$. If we assume the signal-to-background ratio $\lambda^S_i/n_i$ to be roughly within a constant order of magnitude, i.e. of order $c_S$, we get the required minimum signal sample size:

$$N_S \gtrsim n\,c_S^2. \qquad (20)$$

Naively, one would suppose that the signal part still needs to be sampled with full statistics, i.e. $N_S \gtrsim n\cdot c_S$. However, due to the fact that the signal part is always measured together with background, we have shown that an additional suppression factor of $c_S$ applies. Assuming $\sin^2\theta \sim 10^{-6}$ and a total event number of $n \sim 10^{18}$, we thus get

$$n\cdot c_S^2 = n\cdot\sin^4\theta \sim 10^{18}\cdot10^{-12} = 10^6. \qquad (21)$$

Note that $\sin^2\theta \sim 10^{-6}$ represents roughly the upper bound from astrophysical observations. Likewise, $n \sim 10^{18}$ is approximately the maximum number of counts, which decreases with higher retarding potentials. Thus, for lower values of either one, the necessary sample size is reduced even more according to condition (20).
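The order-of-magnitude estimate above can be written as a two-line calculation. The sketch compares the naive signal sample size, $n\cdot c_S$, with the SCAMC requirement, $n\cdot c_S^2$, for the numbers quoted in the text; the function names are illustrative.

```python
def naive_signal_sample_size(n_events, signal_fraction):
    """Signal sample size if the signal were simulated with full statistics."""
    return n_events * signal_fraction


def scamc_signal_sample_size(n_events, signal_fraction):
    """Order-of-magnitude minimum signal sample size, N_S ~ n * c_S^2."""
    return n_events * signal_fraction**2


n_total = 1e18       # expected events in three years
sin2_theta = 1e-6    # active-sterile mixing

n_naive = naive_signal_sample_size(n_total, sin2_theta)
n_scamc = scamc_signal_sample_size(n_total, sin2_theta)
print(f"naive requirement: {n_naive:.0e} signal events")
print(f"SCAMC requirement: {n_scamc:.0e} signal events")
```

Because the requirement scales with the square of the mixing, lowering $\sin^2\theta$ by one order of magnitude lowers the required signal sample size by two.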

Probabilistic model: TOF Spectra
Using a Monte Carlo algorithm, the TOF spectra given by the transformation (6) can be determined in a straightforward way. For each MC sample, first an initial energy and a starting angle are generated. The angular distribution is given by Eq. (7). For the initial energy, the electronic excited state is generated from the final state distribution in Eq. (1) and then the energy is generated from the respective β-spectrum component in Eq. (2). Given the initial energy and the starting angle, the number of inelastic scattering processes in the source is generated from Eq. (10), and for each process the energy loss is generated from Eq. (9) and subtracted from the energy. In order to further optimize the Monte Carlo method for a parametrizable heavy neutrino mass, the TOF spectra have additionally been decomposed into elements corresponding to different sterile neutrino mass phase space segments, which is explained in detail in Appendix B. The advantage of such a scheme is that already simulated Monte Carlo events can be reused for different sterile neutrino masses.
We found that a sample size of $10^8$ for each sterile subcomponent is feasible in finite calculation time and sufficient for an accurate simulation. The active neutrino component, which contains $\sim 1/\sin^2\theta$ times more counts than the total sterile component, was approximated with a sample size of $10^9$, according to the SCAMC approach. The active neutrino mass was set to $m_l = 0$ and the endpoint held constant at $E_0 = 18.575$ keV, since no correlation with the sterile neutrino parameters is expected. The bin width was chosen to be 250 ns (compared to the FPD time resolution of about 50 ns) for reasons of performance and robustness. It is in any case unlikely that any measurement method will achieve a higher resolution. A Gaussian time uncertainty of $\Delta\tau = 50$ ns was added to all spectra to account for the detector time resolution, as well as an isochronous background of $b = 10$ mcps.
Figs. 5 and 6 show exemplary simulated TOF spectra for different active-sterile mixings and heavy neutrino masses, respectively. It can be seen that the spectra show a dominating peak within the first 2 µs, which consists of the fast electrons more than some 100 eV above the retarding potential. This peak is followed by a long tail where the electrons become slower and the TOF difference per given energy difference (see Fig. 2) becomes more significant. In this region the TOF spectrum is to a good extent a differential representation of the β-spectrum, while the fast peak region consists of only a few bins, thus contributing to the sensitivity more by its integral. If the sterile neutrino mass is some 100 eV smaller than the difference of endpoint and retarding potential, the sterile neutrino signal becomes similar to that in the tritium β-spectrum. The sterile neutrino contribution appears as a discontinuity in the shape of a "kink" at a certain position in the spectrum. Since the relationship between energy and TOF is non-linear, the position of the kink allows no direct analytical conclusion about the sterile neutrino mass. However, given the retarding potential, the relation in Fig. 2 can be used for an estimation.
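The energy generation step can be sketched with simple rejection sampling. The example below is strongly simplified and is not the implementation used for the results: it omits the Fermi function and the final state distribution, works in natural units (energies and masses in eV), and samples the light and the heavy component separately within a window above a hypothetical lower energy bound `e_min`. All function names are illustrative.

```python
import numpy as np

M_E = 510_998.95   # electron rest energy in eV
E0 = 18_575.0      # endpoint energy in eV


def beta_shape(energy, m_nu):
    """Unnormalized spectrum shape; Fermi function and final states omitted."""
    eps = E0 - energy                                    # neutrino total energy
    momentum = np.sqrt(energy**2 + 2.0 * energy * M_E)   # electron momentum
    root = np.sqrt(np.clip(eps**2 - m_nu**2, 0.0, None))
    return np.where(eps >= m_nu, momentum * (energy + M_E) * eps * root, 0.0)


def sample_energies(m_nu, size, e_min, rng):
    """Rejection sampling of electron energies in [e_min, E0 - m_nu]."""
    e_max = E0 - m_nu
    grid = np.linspace(e_min, e_max, 2001)
    envelope = 1.05 * beta_shape(grid, m_nu).max()       # flat envelope with margin
    accepted = np.empty(0)
    while accepted.size < size:
        trial = rng.uniform(e_min, e_max, 2 * size)
        keep = rng.uniform(0.0, envelope, 2 * size) < beta_shape(trial, m_nu)
        accepted = np.concatenate([accepted, trial[keep]])
    return accepted[:size]


rng = np.random.default_rng(seed=3)
light = sample_energies(0.0, 20_000, 15_000.0, rng)      # active component
heavy = sample_energies(2_000.0, 20_000, 15_000.0, rng)  # sterile component
print("highest sterile-component energy:", heavy.max())
```

Sampling the two components separately matches the SCAMC idea of assigning independent sample sizes to signal and background; in a full simulation each sampled energy would then be reduced by the generated energy losses and converted into a flight time.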
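The post-processing of the simulated flight times described above (Gaussian smearing, binning, flat background) can be sketched as follows. The flight-time sample is a hypothetical stand-in for the output of the simulation, the 50 µs histogram range is an arbitrary choice, and the expected signal and background counts passed to `expected_counts` are placeholder numbers rather than values derived from the rate $b$.

```python
import numpy as np


def tof_histogram(tof_s, edges, sigma_t, rng):
    """Smear flight times with a Gaussian resolution and fill a histogram."""
    smeared = tof_s + rng.normal(0.0, sigma_t, tof_s.size)
    counts, _ = np.histogram(smeared, bins=edges)
    return counts


def expected_counts(mc_counts, n_signal, n_background):
    """Scale the MC shape to n_signal and add a background that is flat in TOF."""
    shape = mc_counts / mc_counts.sum()
    flat = np.full(mc_counts.size, 1.0 / mc_counts.size)
    return n_signal * shape + n_background * flat


rng = np.random.default_rng(seed=5)
# Hypothetical flight times in seconds: fast peak followed by a long tail.
tof = 0.5e-6 + rng.exponential(2.0e-6, 100_000)

edges = np.linspace(0.0, 50e-6, 201)             # 200 bins of 250 ns
mc_counts = tof_histogram(tof, edges, 50e-9, rng)
model = expected_counts(mc_counts, n_signal=1e9, n_background=1e3)
print("bins:", model.size, " bin width [ns]:", (edges[1] - edges[0]) * 1e9)
```

Since the bin width is five times the assumed time resolution, the smearing mainly redistributes events between neighboring bins near the steep edge of the fast peak.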

Ideal TOF mode Sensitivity
The model described in the last section was utilized to estimate the sensitivity according to the procedure described in section 3. The fits have generally been performed by $\chi^2$ minimization using MINUIT [60]. For the statistical sensitivity estimation, the mixing $\sin^2\theta$ and the overall amplitude $S$ are free fit parameters, using a range of fixed values for $m_h$. In those simulations where the uncertainty on $m_h$ is of interest, the squared heavy neutrino mass $m_h^2$ has also been included as a fit parameter. Since each fit incorporates a set of multiple measurements at different retarding potentials, the $\chi^2$ functions of all measurements are added and fitted with global fit parameters. Instead of a pure ensemble approach, the parameter uncertainties have been calculated using the module MINOS from MINUIT [60], averaged over multiple simulations, which gives an identical result in case of an approximately quadratic $\chi^2$ near the minimum.

Exemplary Systematics
In addition to the statistical sensitivity, an exemplary systematic effect has been studied: the uncertainty of inelastic scattering in the source caused by fluctuations of the column density (see Eq. (9)). This is one of two main systematics when it comes to a keV sterile neutrino search, the other being the final state distribution [51][52][53]. To incorporate the systematics, the $\chi^2$ function has been modified by an additional term:

$$\chi^2 = \chi^2_0 + \frac{(\rho d - \langle\rho d\rangle)^2}{(\Delta\rho d)^2}, \qquad (22)$$

where $\chi^2_0$ is the default binned $\chi^2$ function, $\rho d$ the fitted column density, $\langle\rho d\rangle$ its expectation value and $\Delta\rho d$ the systematic uncertainty. In order to have $\rho d$ as a free fit parameter, the complete model has additionally been separated by the number of inelastic scattering processes and weighted with the l-fold energy loss probability $p_l(\rho d)$ as given by Eq. (10), instead of randomly generating the number of inelastic scattering events,

$$\frac{\mathrm{d}N}{\mathrm{d}\tau}(\rho d) = \sum_l p_l(\rho d)\,\left.\frac{\mathrm{d}N}{\mathrm{d}\tau}\right|_l. \qquad (23)$$

To determine the influence of the uncertainty $\Delta\rho d$ on the sensitivity, the column density has been shifted for the data generation by

$$\rho d = \langle\rho d\rangle + \Delta\rho d, \qquad (24)$$

while still using the unshifted expectation value $\langle\rho d\rangle$ in Eq. (22). With this approach the MINOS error increases and a slight bias may appear on average, which is then added in quadrature to the average error bars.
To illustrate the imprint of the systematic uncertainty of $\rho d$ in the TOF spectrum, Fig. 7 shows the difference between a TOF spectrum with shifted column density, $\Phi(\rho d) = \mathrm{d}N/\mathrm{d}\tau(\langle\rho d\rangle + \Delta\rho d)$, and a TOF spectrum with mean column density, $\Phi_0 = \mathrm{d}N/\mathrm{d}\tau(\langle\rho d\rangle)$, weighted by the expected Poissonian uncertainty of the data, which is proportional to $\sqrt{\Phi_0}$. By doing so, the signature becomes visible proportionally to its impact in the $\chi^2$ function. It can be seen that the imprint of a shifted column density is present foremost at lower flight times, because the energy loss causes the count rate near the endpoint to drop. There are fluctuations at higher flight times near the retarding potential arising from the energy loss spectrum (9). However, these are weighted minimally since the differential rate in the TOF spectrum drops with higher flight times (see Fig. 5).
Fig. 7 Difference between TOF spectra with shifted $\rho d$, $\Phi(\rho d)$, and default value $\langle\rho d\rangle = 5\times10^{14}\,\mathrm{cm}^{-2}$, $\Phi_0$, weighted proportionally with the expected Poissonian uncertainty of the data $\propto \sqrt{\Phi_0}$. The imprint of a shifted column density is present foremost at lower flight times, due to missing events near the endpoint because of the energy loss. Fluctuations at higher flight times near the retarding potential are suppressed by a lower differential count rate. The spectra consist only of the active neutrino component, $\sin^2\theta = 0$, and the retarding potential is $qU = 18$ keV.
Fig. 8 shows the sensitivity for an ideal TOF mode. The results are based on three years of measurement time, distributed uniformly over the retarding potentials within an interval of [4; 18.5] keV in steps of 0.5 keV. The setting was chosen such that a 7 keV neutrino signal [25] would roughly lie in the center of the potential distribution. For the exemplary inelastic scattering systematics, an initial uncertainty of $\Delta\rho d/\rho d = 0.002$ has been assumed in accordance with Ref. [30].
The statistical sensitivity of the integral mode in this simulation is in good agreement with Ref. [31]. The statistical sensitivity of the ideal TOF mode is close to that of an ideal differential detector in the aforementioned publication. However, if the uncertainties of the column density are incorporated, the benefit of the TOF mode grows even further, since a shifted column density has a unique imprint on the TOF spectrum (see Fig. 7), which is not the case in the integral mode.
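The modified $\chi^2$ function with the additional constraint on the column density can be sketched as below. The Pearson form of the binned term is an assumption, since the exact definition of $\chi^2_0$ is not spelled out here, and the column density is expressed in units of its expectation value so that the assumed relative uncertainty of 0.002 can be used directly.

```python
import numpy as np


def chi2_binned(data, model):
    """Pearson chi-square summed over bins (one possible choice of chi2_0)."""
    return float(np.sum((data - model)**2 / model))


def chi2_with_pull(data, model, rho_d, rho_d_mean, delta_rho_d):
    """Binned chi-square plus a Gaussian penalty term for the column density."""
    pull = (rho_d - rho_d_mean) / delta_rho_d
    return chi2_binned(data, model) + pull**2


# Hypothetical three-bin example in which the model matches the data exactly.
model = np.array([1.0e6, 4.0e5, 1.0e5])
data = model.copy()

chi2_nominal = chi2_with_pull(data, model, 1.000, 1.0, 0.002)
chi2_shifted = chi2_with_pull(data, model, 1.002, 1.0, 0.002)
print("chi2 at nominal column density:", chi2_nominal)
print("chi2 at one-sigma shift:       ", chi2_shifted)
```

In an actual fit the model itself would also depend on the fitted column density through the scattering probabilities, so the penalty term competes with the change of the binned term.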
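The measurement time distribution used for these results can be made explicit in a few lines. The sketch builds the grid of retarding energies from 4 keV to 18.5 keV in 0.5 keV steps and divides three years uniformly among the points; the length of a year in seconds and the function name are choices made for this illustration.

```python
import numpy as np

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def retarding_grid(lo_kev, hi_kev, step_kev):
    """Equidistant retarding energies including both interval boundaries."""
    n_points = int(round((hi_kev - lo_kev) / step_kev)) + 1
    return np.linspace(lo_kev, hi_kev, n_points)


grid = retarding_grid(4.0, 18.5, 0.5)
time_per_point = 3.0 * SECONDS_PER_YEAR / grid.size
print(f"{grid.size} retarding potentials, "
      f"{time_per_point / 86400:.1f} days of measurement time each")
```

The narrower interval of [15; 18.5] keV with the same step size corresponds to `retarding_grid(15.0, 18.5, 0.5)`, which concentrates the same total time on eight points.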

Results
It should be noted, however, that for low retarding potentials as used in Fig. 8, adiabaticity of the electron transport is limited. Yet adiabaticity can be maintained by increasing the magnetic field in the main spectrometer. This degrades the energy resolution, since less transverse momentum is transformed into longitudinal momentum, which would manifest in a stronger angular dependence of the energy-TOF relation in Fig. 2. However, this should have no significant influence on the sensitivity, since the measurement takes place on a keV scale where the requirements for magnetic adiabatic collimation are more relaxed.
An exemplary fit is shown in Fig. 9 for a sterile neutrino with mass $m_h = 2$ keV and a mixing of $\sin^2\theta = 10^{-6}$, assuming an ideal TOF measurement and using four exemplary retarding potentials of 15, 16, 17 and 18 keV. In contrast to the pure active-sterile mixing sensitivity estimation (Fig. 8), the sterile neutrino mass has not been fixed here but used as a free fit parameter, in order to test the ability to fit the sterile neutrino mass given a sufficiently high active-sterile mixing. While it is in principle sufficient to use only one retarding potential closely below the sterile neutrino kink, in practice a multitude of retarding potentials is necessary. The reasons are that, on the one hand, the mass of the sterile neutrino is unknown and, on the other hand, multiple retarding potentials are needed to determine the other parameters. The fit shows that the method is capable of a sensitive mass determination as well, provided the mixing angle is large enough. However, since most parts of the sensitive regions of the TOF method are disfavored by X-ray satellite measurements [15], it seems unlikely that a mass fit will be possible.

SCAMC Variance
In order to show that the SCAMC method works as expected, it has been tested using different Monte Carlo sample sizes. A necessary condition is convergence of the result towards a constant value with growing sample size. As described in Appendix B, the signal itself is split into signal components defined by slices of the signal TOF spectrum in $m_h$-space (Eq. (B.6)). The total signal sample size per TOF spectrum is thus given by the number of signal components times $N_C$, where $N_C$ denotes the sample size per component. For minimum $qU$ and $m_h$ it amounts to $\sim 150 \cdot N_C$. The background has been simulated with a sample size of $10 \cdot N_C$. It can be seen that convergence is met and that with subsample sizes such as $10^4$ per component the expected result is approximated with less than 1 percent uncertainty.

Fig. 11 shows the sensitivity to $\sin^2\theta$ in a similar way as Fig. 8, but for a measurement interval of [15; 18.5] keV, roughly centered around a 2 keV neutrino, as favored in Ref. [3]. The comparison shows that there is no benefit in restricting the measurement interval to a narrow region when searching for a sterile neutrino with a given mass. This seems counter-intuitive at first, but it has to be kept in mind that the sterile neutrino signal is not localized at the kink, but instead contributes to the whole spectrum below it. In contrast to dedicated 'kink-search' methods [32], all spectral parts contribute to the sensitivity in a $\chi^2$ fit. While the relative difference made by a sterile neutrino signal might be smaller at lower retarding potentials, this drawback is balanced by a larger count rate at lower potentials.
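The statement that the sterile signal is not localized at the kink can be illustrated with a minimal sketch. It assumes a strongly simplified $\beta$-spectrum shape (phase-space factor only, no Fermi function, final states or response function) and an approximate endpoint energy; the function names and constants are illustrative and not taken from the analysis code of this work.

```python
import math

E0_KEV = 18.575    # approximate tritium endpoint energy in keV (assumption)
M_E_KEV = 510.999  # electron rest mass in keV


def beta_shape(e_kin, m_nu):
    """Simplified beta-spectrum shape for one neutrino mass state (arbitrary units).

    Only the phase-space factor is kept; all other effects are neglected.
    """
    eps = E0_KEV - e_kin  # energy available to the neutrino
    if eps <= m_nu:
        return 0.0  # kinematically forbidden above the kink at E0 - m_nu
    momentum = math.sqrt(e_kin * (e_kin + 2.0 * M_E_KEV))
    return momentum * (e_kin + M_E_KEV) * eps * math.sqrt(eps * eps - m_nu * m_nu)


def integral_rate(q_u, m_h, sin2theta, n_steps=2000):
    """Midpoint-rule integral of the mixed spectrum from qU up to the endpoint."""
    width = (E0_KEV - q_u) / n_steps
    total = 0.0
    for i in range(n_steps):
        e_kin = q_u + (i + 0.5) * width
        active = (1.0 - sin2theta) * beta_shape(e_kin, 0.0)
        sterile = sin2theta * beta_shape(e_kin, m_h)
        total += (active + sterile) * width
    return total


def relative_change(q_u, m_h, sin2theta):
    """Relative change of the integral rate caused by the sterile admixture."""
    return integral_rate(q_u, m_h, sin2theta) / integral_rate(q_u, m_h, 0.0) - 1.0


if __name__ == "__main__":
    for q_u in (10.0, 13.0, 15.0, 16.0, 17.0):
        print(q_u, relative_change(q_u, m_h=2.0, sin2theta=1e-6))
```

In this toy model the relative change is a constant $-\sin^2\theta$ for retarding potentials above the kink at $E_0 - m_h$ and varies with $qU$ at all potentials below it, so every measurement point below the kink carries part of the signal.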

Measurement Step Size
Fig. 12 shows the statistical sensitivity as a function of the spacing between different measurement points of the retarding potential $qU$. The total sample size has been kept constant. The simulations show no preference towards any particular value. That appears unintuitive, since one would expect a narrower spacing to have beneficial effects on a distinct kink search. Yet, as mentioned in the last paragraph, the sterile neutrino signal is not localized, but manifests itself in relative count rate differences between the measurement points, with a spectral feature as broad as the mass of the sterile neutrino $m_h$. Therefore, a larger step size does not weaken the sensitivity in principle, because the measurement time is distributed over fewer points. Nevertheless, it is in general recommended to use a step size smaller than the smallest possible heavy neutrino mass, since otherwise there may not be enough measurement points above the kink.
The benefit of a TOF measurement can be explained in this context as follows: TOF spectra carry extra information about the differential energy distribution closely above each measurement point. That equates to knowledge about the slope of the integral spectrum at these measurement points.

Fig. 13 shows exemplary TOF spectra using gated filtering (GF, see Fig. 3). It illustrates how GF works: without the gate (cyan points), the arrival time spectrum is isochronous. With activated gate, however, a certain portion is cut away from the isochronous spectrum. For a given repetition time $t_r$ and duty cycle $\xi$, the duration for which the gate is open is given by $t_r \cdot \xi$. Compared to the raw TOF spectrum, the GF arrival time spectrum is thus smeared with a step function of this width. Reducing the duty cycle $\xi$ makes the arrival time spectrum approximate the TOF spectrum of Fig. 9, at the cost of a loss of overall rate. Electrons with a TOF greater than the repetition time $t_r$ are wrongly attributed to a later period, which can be seen in the first few bins. However, since TOF spectra at several keV below the endpoint are rather sharp, this effect is small for repetition times of $\sim 10$ µs.

The scenario is based on the assumption that the existing focal plane detector (FPD) of KATRIN is used, which is optimized for a measurement near the endpoint of the $\beta$ spectrum and thus cannot sustain much higher count rates. The bottleneck is in particular the per-pixel rate, which should not exceed $\sim 10^3$ cps within a window of some µs. This limitation holds for the current data acquisition and might be improved in the future. In this simulation, an exemplary overall reduction of the signal rate by a factor $10^5$ has been chosen, which ensures a flux compatible with the current hardware. Since the gated filter periodically blocks the flux of electrons, the rate can be increased again with respect to the integral mode.
The actual allowed rate with the gated filter depends on the readout electronics and will effectively lie between two extreme values. In an optimistic case, a short-time excess of the rate is tolerable, while the average rate has to be at the same level as in the integral mode. In a conservative case, even a short-time excess leads to pile-up, which means that the peak rate must not exceed the constant rate of the integral mode (see Fig. 13). The repetition time has been fixed at 10 µs, which ensures coverage of the vast majority of the TOF spectrum. The measurement interval has been limited to [15; 18.5] keV, since it is not believed to be viable to pulse the pre-spectrometer by more than several keV.
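The way the gate shapes the arrival time spectrum can be sketched with a small toy Monte Carlo. The sketch assumes that electrons pass the gate at a uniformly distributed time while it is open and that the arrival time is recorded modulo the repetition time; the TOF distribution in `toy_tofs` is purely hypothetical and does not represent simulated KATRIN spectra.

```python
import random


def gated_arrival_times(tofs_us, t_rep_us, duty_cycle, rng):
    """Arrival times modulo the repetition time for a periodically opened gate.

    Assumption: each electron passes the gate at a uniformly distributed
    time within the open window of length t_rep_us * duty_cycle.
    """
    gate_open_us = t_rep_us * duty_cycle
    arrivals = []
    for tof in tofs_us:
        start = rng.uniform(0.0, gate_open_us)
        # Electrons with tof > t_rep_us wrap around into a later period.
        arrivals.append((start + tof) % t_rep_us)
    return arrivals


def histogram(values, n_bins, upper):
    """Equal-width histogram on [0, upper]; overflow goes to the last bin."""
    counts = [0] * n_bins
    width = upper / n_bins
    for value in values:
        counts[min(int(value / width), n_bins - 1)] += 1
    return counts


def toy_tofs(n_events, rng):
    """Hypothetical TOF sample: 0.5 us minimum plus an exponential tail."""
    return [0.5 + rng.expovariate(1.0 / 0.4) for _ in range(n_events)]


if __name__ == "__main__":
    rng = random.Random(42)
    tofs = toy_tofs(100000, rng)
    for duty_cycle in (1.0, 0.5, 0.1):
        arrivals = gated_arrival_times(tofs, 10.0, duty_cycle, rng)
        print(duty_cycle, histogram(arrivals, 20, 10.0))
```

With a duty cycle of 1 the gate never closes and the histogram is flat. Smaller duty cycles make it approach the underlying TOF distribution, at the price of passing only the corresponding fraction of electrons for a fixed source rate.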

Gated Filter Sensitivity
It can be seen in Fig. 14 that the gated filter beats the integral mode in the optimistic case, but not in the conservative case. This means that the loss of rate in the conservative case is too high to be compensated by the additional TOF information. If the detector readout electronics sufficiently tolerates short-time excesses, the loss of statistics by the gated filter can nevertheless be compensated, and additional TOF information is gained. Note, however, that in the scenario of an upgraded future detector which tolerates the full rate from the tritium source, the integral mode outperforms the gated filter mode, since in this case there is no way to increase the rate further.

Fig. 14 Sensitivity (1 $\sigma$) of the integral mode with the rate reduced by a factor $1 \times 10^5$ (red), compared with a conservative gated filter TOF scenario (blue) with the same peak rate as the integral mode and an optimistic gated filter TOF scenario (green) with the same total rate as the integral mode. Both statistical uncertainty (dashed lines) and combined statistical plus systematic uncertainty including a column density uncertainty $\Delta\rho d/\rho d = 0.002$ (full lines, see section 4.2) are plotted. The duty cycle is 0.1 for both gated filter scenarios. The measurement interval is [15; 18.5] keV for three years of data taking. The repetition time is $t_r = 10$ µs for all retarding potentials.

Summary and Discussion
It has been shown that TOF spectroscopy in a KATRIN context is in principle able to boost the sensitivity of the sterile neutrino search significantly. Fig. 11 suggests an improvement of up to a factor two in terms of pure statistical uncertainty, down to at best $\sin^2\theta \sim 5 \times 10^{-9}$ for a sterile neutrino of $m_h = 7$ keV at one $\sigma$. If the exemplary systematic uncertainty of the inelastic scattering cross section is considered, the sensitivity is only mildly weakened, in contrast to the integral mode, which is in that case outperformed by the TOF mode by up to a factor five. However, the practical realization of a sensitive TOF measuring method is still work in progress. Given the current hardware, which requires a reduction of the signal rate, the gated filter method might be able to realize a TOF mode with a slight sensitivity increase compared to the integral mode, under the condition that the detector tolerates short-time excesses of the rate and that it is possible to ramp the pre-spectrometer potential by some keV within $\sim 0.1$ µs. From a long-term point of view, the concept of an upgraded differential detector [31] which is capable of extreme rates up to $10^{10}$ cps is very promising. If there is sufficient progress in developing a sensitive TOF measurement method, a beneficial strategy could be a combined measurement to eliminate systematics and perform cross-checks.

Appendix A: Unchanged $\chi^2$ Properties with SCAMC
In the following it is shown that the properties of the $\chi^2$ function defining the sensitivity, namely the position and width of the minimum with respect to any parameter of interest, are independent of the choice of the background model $\Phi^B$. This works as well for a Poissonian log-likelihood, but for brevity we show it for a $\chi^2$ example. First we define the expectation value for the $i$-th bin, using the definition of the approximated model (13), and assume that the background prediction $\lambda^B_i$ is independent of the parameter of interest $\mu$,

$$\lambda_i(\mu) = \lambda^S_i(\mu) + \lambda^B_i .$$

For the proof we differentiate $\chi^2$ with respect to $\mu$ and demand that the result is approximately independent of the choice of the background model $\Phi^B$. The variable $n_i$ is Poisson distributed with mean $\lambda_i(\mu_0) = \lambda^S_i(\mu_0) + \lambda^B_i$, where $\mu_0$ is the null hypothesis for $\mu$. Due to self-consistency, $\lambda^B_i - n_i$ is approximately independent of the choice of $\Phi^B$, as long as the approximate background model agrees with the true one in order of magnitude. The latter condition ensures that the Poissonian uncertainty of $n_i$, which is given by $\sqrt{\lambda_i(\mu_0)}$, is approximately correct.
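The differentiation step can be sketched as follows. The explicit form of $\chi^2$ used here, with a per-bin variance $\sigma_i^2$ that is held fixed when differentiating, is an assumption made for illustration and may differ in detail from the definition used in the main text.

```latex
\begin{align}
\chi^2(\mu) &= \sum_i \frac{\left(\lambda^S_i(\mu) + \lambda^B_i - n_i\right)^2}{\sigma_i^2},
\qquad \sigma_i^2 \approx \lambda_i(\mu_0), \\
\frac{\partial \chi^2}{\partial \mu} &= 2 \sum_i
\frac{\lambda^S_i(\mu) + \left(\lambda^B_i - n_i\right)}{\sigma_i^2}\,
\frac{\partial \lambda^S_i(\mu)}{\partial \mu}.
\end{align}
```

In this form the background enters the derivative only through the combination $\lambda^B_i - n_i$ and through the variance $\sigma_i^2$, which are the two quantities addressed by the self-consistency argument and the order-of-magnitude condition, respectively.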
Note that the proof is only correct in the simplified case of one parameter of interest and no correlation with nuisance parameters. However, the simulation results in this paper give good reason to expect the method to work also for more complex problems, as long as there is no strong parameter correlation.

Appendix B: Sterile Neutrino Mass Decomposition of TOF Spectra
The simulation of the TOF spectra has further been optimized with the aim of being able to use the sterile neutrino mass $m_h$ as a free parameter with a minimum of computational overhead. The idea is to decompose the sterile neutrino component of the TOF spectra, $\Phi^S$, into sub-spectra $\Phi^S_k$ which can be added subsequently to obtain the signal for a given sterile neutrino mass $m_h$. That works as follows: at first a number $J$ of grid points with heavy neutrino masses $m_j$ is chosen. For each grid point $j$, the signal spectrum is given as the sum of all sub-signals from $j$ up to $J$,

$$\Phi^S(m_j) = \sum_{k=j}^{J} \Phi^S_k . \qquad \text{(B.5)}$$

The sub-signals $\Phi^S_k$ constitute the difference of two TOF spectra with adjacent sterile neutrino masses, so that the total TOF spectrum for the sterile component can be written as the corresponding sum of difference spectra (B.6). Each sub-component in the sum is sampled separately. The difference between two TOF spectra can be sampled just like any TOF spectrum, as outlined, by replacing the $\beta$-spectrum in (6) with the difference of two $\beta$-spectra corresponding to the neutrino masses $m_k$ and $m_{k+1}$. Via (B.6), that gives the sterile contribution of the TOF spectrum for each mass value $m_j$ on the grid. For sterile neutrino masses between the grid points, the resulting spectrum is calculated by cubic spline interpolation. The strategy is illustrated in Fig. 15.
In addition to the reuse of already simulated Monte Carlo events, this strategy has the possible advantage of a smoother interpolation in bins with small statistics, which can occur for high flight times $\gtrsim 40$ µs. By the decomposition and subsequent addition of the components, monotonicity between the interpolation grid points is guaranteed. However, if a sufficient overall sample size is chosen, this effect should not matter significantly.
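The bookkeeping of this decomposition can be sketched in a few lines. The sketch assumes that the sub-spectra are already available as binned, non-negative histograms on a common TOF binning; the names `spectra_on_grid` and `interpolate_spectrum` and the toy numbers are hypothetical. To keep the example free of dependencies, linear interpolation is used as a stand-in for the cubic spline interpolation described above.

```python
def spectra_on_grid(sub_spectra):
    """Signal spectrum at each grid mass m_j as the sum of sub-spectra k = j..J.

    sub_spectra[k] is the binned difference spectrum between the adjacent
    grid masses m_k and m_(k+1); the last entry belongs to the largest mass.
    """
    n_bins = len(sub_spectra[0])
    running = [0.0] * n_bins
    result = [None] * len(sub_spectra)
    # Accumulate from the largest grid mass downwards (reverse cumulative sum).
    for j in range(len(sub_spectra) - 1, -1, -1):
        running = [r + s for r, s in zip(running, sub_spectra[j])]
        result[j] = running
    return result


def interpolate_spectrum(masses, grid_spectra, m_h):
    """Bin-wise linear interpolation in mass (stand-in for a cubic spline)."""
    if not masses[0] <= m_h <= masses[-1]:
        raise ValueError("m_h lies outside the mass grid")
    for j in range(len(masses) - 1):
        if masses[j] <= m_h <= masses[j + 1]:
            weight = (m_h - masses[j]) / (masses[j + 1] - masses[j])
            return [(1.0 - weight) * low + weight * high
                    for low, high in zip(grid_spectra[j], grid_spectra[j + 1])]


# Toy input: three grid masses in keV and three sub-spectra with three TOF bins.
MASSES_KEV = [1.0, 2.0, 3.0]
SUB_SPECTRA = [[4.0, 2.0, 0.0], [3.0, 1.0, 1.0], [2.0, 0.0, 1.0]]

if __name__ == "__main__":
    grid = spectra_on_grid(SUB_SPECTRA)
    print(grid)
    print(interpolate_spectrum(MASSES_KEV, grid, 1.5))
```

Because every sub-spectrum is non-negative, each bin of the summed spectra can only decrease from one grid mass to the next larger one, which is the monotonicity property mentioned above.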