1 Introduction

A promising approach for the direct determination of the effective electron neutrino mass \(m(\nu _e)\) is the analysis of the endpoint region of a calorimetrically measured spectrum following the electron capture (EC) of \(^{163}\mathrm {Ho}\). This method was first proposed in [1] and is presently pursued by two collaborations, ECHo [2] and HOLMES [3]. The EC of \(^{163}\mathrm {Ho}\) is characterised by \(Q_\mathrm {EC} ={2.833 \pm 0.030 (\hbox {stat})\, \pm \, 0.015 (\hbox {sys})}\,\hbox {keV}\) [4], which is the available energy for the decay, and a half life of \(T_{1/2}={4570 \pm 50}\) y [5].

In the EC of \(^{163}\mathrm {Ho}\), a shell electron is captured by the nucleus, converting a proton into a neutron and emitting an electron neutrino. The daughter atom is left in an excited state \(\mathrm {^{163}Dy^*}\), which predominantly decays to the ground state \(\mathrm {^{163}Dy}\) via non-radiative processes. To a first approximation, the excited state \(\mathrm {^{163}Dy^*}\) is characterised by a vacancy of an inner shell electron and an additional electron in the 4f-shell. In this simplified model, the spectrum is given by six resonances centred at the binding energies of the orbitals of the captured electrons in the potential of the daughter nucleus: MI(\(3\mathrm {s}_{1/2}\)), MII(\(3\mathrm {p}_{1/2}\)), NI(\(4\mathrm {s}_{1/2}\)), NII(\(4\mathrm {p}_{1/2}\)), OI(\(5\mathrm {s}_{1/2}\)), OII(\(5\mathrm {p}_{1/2}\)) (footnote 1). The amplitude of each resonance is determined by the phase space factor and the probability that an electron wave function in the given state overlaps with the nucleus. The phase space factor contains the information on the neutrino mass and gives rise to a cutoff at the endpoint energy \(E = Q_\mathrm {EC}- m(\nu _e) \). The endpoint region is therefore most sensitive to a finite effective electron neutrino mass. Recently, a more accurate description of the calorimetrically measured \(^{163}\mathrm {Ho}\) spectrum has been developed. Besides the main resonances, it contains a number of structures that take into account electron scattering processes in the atom and excitations to the continuum [6, 7].

Fig. 1
Left: comparison of the pulse shape of a DC-coupled signal and an AC-coupled signal. Both traces are MI-pulses. Right: three time traces of different energies normalised to unity amplitude. MI, NI and OI refer to the corresponding peaks in the \(^{163}\mathrm {Ho}\) spectrum. The first quarter of each time trace consists of pre-trigger samples. The traces are acquired with AC-coupled signal amplification

In order to obtain sub-eV sensitivity on \(m(\nu _e)\), one must measure energies below \({3}\,\hbox {keV}\) with eV-precision. In addition, a good intrinsic time resolution of \(\sim {100}\,\hbox {ns}\) and high statistics of more than \(10^{14}\) \(^{163}\mathrm {Ho}\) events are essential [2]. For ECHo, metallic-magnetic calorimeters (MMCs) operated at temperatures below \({30}\,\hbox {mK}\) [8] inside a dry dilution refrigerator (footnote 2) are used to meet these requirements. The particular type of MMC used for ECHo is characterised by a particle absorber that encloses the high-purity \(^{163}\mathrm {Ho}\) source. If an energy \(E\) is deposited in the absorber, its temperature rises with a time constant \(\tau _\mathrm {rise} \sim {100}\,\hbox {ns}\). A paramagnetic Ag:Er temperature sensor, which is situated in an external static magnetic field and is thermally well coupled to the absorber, acts as a precise thermometer. The magnetisation of this sensor is temperature dependent. Consequently, a change in temperature causes a change of magnetic flux in a suitable pick-up coil. A flux-locked-loop dc-SQUID (direct current – superconducting quantum interference device) readout is then used to convert the change of flux into a change of voltage proportional to the initially deposited energy \(E\). A gold thermal link made of several gold films with increasing width finally connects the detector to an on-chip thermal bath so that the initial temperature is restored. At the operating temperature of \({20}\,\hbox {mK}\), the recovery time is of the order of milliseconds. The decaying part of the temperature pulse can be described by a sum of exponential functions due to the step structure of the thermal link to the on-chip thermal bath. The rising part of the pulse can be affected by a reduced readout bandwidth, which effectively increases the signal rise time to \(\tau _\mathrm {rise} \simeq {500}\,\hbox {ns}\).
For a DC-coupled signal, the time constants with their respective amplitudes fully specify the shape of a thermal pulse. AC coupling of the signal keeps the baseline offset at \({0}\,\hbox {V}\). This strongly modifies the signal shape, mainly the decaying part, as shown in Fig. 1 (left). The rising part is not affected, since the high-pass filter of the AC-coupling has a cutoff frequency of \(\sim {100}\,\hbox {Hz}\) which is orders of magnitude below the frequency associated with the rise time.

The detector geometry used for ECHo is a double meander, which corresponds to two superconducting meander structures connected in parallel with the input coil of one dc-SQUID. On top of each meander, a paramagnetic sensor is fabricated. To polarise the spins in the sensor, a constant magnetic field is generated by a persistent current in the meander structures. Simultaneously, the meander structure serves as a readout coil to detect the magnetisation changes in the sensor. In such a gradiometric setup, the signal of a common change in temperature in the two sensors cancels out, which significantly reduces noise caused by global temperature fluctuations of the chip. On top of each sensor, a gold absorber with the dimensions \({180}\,{\upmu \hbox {m}} \times {180}\,{\upmu \hbox {m}} \times {10}\,{\upmu \hbox {m}}\) is fabricated. Each set comprising meander, sensor and absorber is referred to as one pixel and one gradiometer consisting of two pixels is referred to as one detector, which is read out by a two-stage SQUID setup [9]. Thus, each detector is associated to one readout channel (detector channel). Due to the opposite polarity of the screening current in the double meander structure of the detectors, the voltage signals from the two pixels of one detector have opposite signs. The triggered signals are separated into positive and negative polarity pulses based on the voltage slope after the trigger. Thus, signals from the two pixels of one detector can be distinguished.

In the ECHo-1k high statistics measurement (run 24 and run 25), two ECHo-1k chips [10, 11] have been used. Each chip hosts 32 detectors implanted with \(^{163}\mathrm {Ho}\), two double meanders to study the properties of non-implanted pixels and two so-called temperature channels, which feature only one sensor and are therefore sensitive to temperature fluctuation of the substrate. In 7 of the 32 implanted detectors, only one pixel contains \(^{163}\mathrm {Ho}\) to allow for in-situ background measurements. Signals from a total of 68 pixels have been acquired over a period of five months, 58 of which are implanted with \(^{163}\mathrm {Ho}\). The average activity per detector is approximately \({1}{\,\hbox {Bq}}\). The data reduction scheme discussed in this work has been developed to eliminate spurious events like triggered noise or pileup from these ECHo-1k datasets.

The signals of each detector channel are amplified by a room temperature SQUID electronics (footnote 3), controlling the two-stage SQUID readout, and digitised by a 16-channel analogue-to-digital converter (ADC) with 16-bit resolution (\(\sim 13.5\) effective bits) and a maximum sampling rate of \({125}\,\hbox {MHz}\) (footnote 4) [12]. An FPGA splits each datastream in two. One datastream is typically 16-fold downsampled, leading to an effective sampling rate of about \({8}\,\hbox {MHz}\) and an effective resolution of about 15.5 bit, and written to a ring buffer to allow for the acquisition of pre-trigger samples. The other stream is fed into a bipolar trigger engine, where a constant fraction discriminator (CFD) is applied to the datastream filtered with a finite impulse response (FIR) filter (footnote 5). For events with good signal-to-noise ratio, the CFD trigger can reach a time resolution better than one period of the \({125}\,\hbox {MHz}\) clock (although the timestamps are recorded with only \(\sim {8}\,\hbox {ns}\) resolution), and the jitter of the digitised signals within the recorded time windows is minimal, i.e. less than \(\pm 0.5\) downsampled samples, which corresponds to \(\pm {64}\,\hbox {ns}\). Three examples of saved AC-coupled \(^{163}\mathrm {Ho}\) traces are shown in Fig. 1 (right). For a sampling rate of \({125}\,\hbox {MHz}\) and 16-fold downsampling, the total saved time window of each trace is \({2.09715}\,\hbox {ms}\). During this time period after a trigger, no further triggers from the same detector channel are accepted. For each trace, the timestamp of its trigger is saved. One can then calculate the time difference \(\varDelta t\) to the previously saved trace in any of the acquiring detector channels of one ECHo-1k chip, and the time difference to the previously saved trace within the same detector channel, \(\varDelta t_\mathrm {ch}\).
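The timing figures quoted above, and the two time differences \(\varDelta t\) and \(\varDelta t_\mathrm {ch}\), can be reproduced with a short Python sketch. The trace length of 16384 samples is an assumption inferred from the quoted \({2.09715}\,\hbox {ms}\) window, and the function name `time_differences` is purely illustrative.

```python
ADC_RATE_HZ = 125e6        # maximum ADC sampling rate quoted in the text
DOWNSAMPLING = 16          # typical downsampling factor quoted in the text
SAMPLES_PER_TRACE = 16384  # assumption: 2**14 samples, inferred from the 2.09715 ms window

sample_period_s = DOWNSAMPLING / ADC_RATE_HZ     # 128 ns per downsampled sample
effective_rate_hz = ADC_RATE_HZ / DOWNSAMPLING   # 7.8125 MHz, i.e. "about 8 MHz"
window_s = SAMPLES_PER_TRACE * sample_period_s   # total saved time window
jitter_s = 0.5 * sample_period_s                 # +/- 0.5 downsampled samples


def time_differences(events):
    """Compute (dt, dt_ch) for each trace.

    events: list of (timestamp_s, channel), sorted by timestamp.
    dt is the difference to the previous trace in any channel,
    dt_ch the difference to the previous trace in the same channel.
    None marks traces without a predecessor.
    """
    last_any = None
    last_by_channel = {}
    result = []
    for t, ch in events:
        dt = None if last_any is None else t - last_any
        prev = last_by_channel.get(ch)
        dt_ch = None if prev is None else t - prev
        result.append((dt, dt_ch))
        last_any = t
        last_by_channel[ch] = t
    return result
```

With these numbers, `window_s` evaluates to about 2.097 ms and `jitter_s` to 64 ns, in line with the values stated in the text.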

The information extracted from the analysis of the timestamps is used for reliable and energy-independent data reduction. This is crucial to avoid distortions of the spectrum, particularly in the endpoint region. The shapes of the filtered traces are then analysed to remove remaining spurious signals. The method used is based on the chi-squared goodness of fit measure, which is calculated for each event following a template fit.

In the first part, we present different signal families that have been identified in our data. In the second part, methods to eliminate various spurious events are discussed and the performance of these algorithms applied to a subset of ECHo-1k data is evaluated in the last part.

2 Signal families

The majority of triggered traces are \(^{163}\mathrm {Ho}\) events with an energy-independent pulse shape as shown in Figs. 1 (right) and 2a, and a statistically distributed \(\varDelta t_\mathrm {ch}\) depending on the activity of the particular detector channel. In addition, there are traces from various sources which, if not recognised and eliminated, can distort the spectrum. They can be divided into pileup originating from \(^{163}\mathrm {Ho}\), and spurious signals from external sources.

Fig. 2
a \(^{163}\mathrm {Ho}\) event, b pileup-on-tail outside the time window (\(\mathrm {POT}\)), c pileup-on-tail with both signals inside the time window (\(\mathrm {PIT}\)), d pixel-pixel coincidence event with thermal pulse shape, e GSM signal, f triggered noise. All traces are acquired with AC-coupled signal amplification

Fig. 3
Left: illustration of a pileup-on-tail outside the time window (\(\mathrm {POT}\)) event. The pulse shape of Trace 2 is distorted by the tail of Trace 1. Two traces are saved with a timestamp difference \(\varDelta t_\mathrm {ch}\). Right: schematic of the time trace of a GSM signal triggered in one detector channel. The time structure of the corresponding GSM signal is indicated above

2.1 \(^{163}\mathrm {Ho}\) pileup

The pulse shape of a \(^{163}\mathrm {Ho}\) event can be distorted if a second event in the same detector occurs within a relatively short time interval \(\varDelta t_\mathrm {ch}\). For a time difference larger than the time window, \(\varDelta t_\mathrm {ch} > {2}\,\hbox {ms}\), individual traces are triggered and saved for the two pulses as illustrated in Fig. 3 (left). The pulse shape of the second trace is distorted by the tail of the previous pulse, as shown in Fig. 2b. For \(\varDelta t_\mathrm {ch}\) smaller than a given value, which depends on the time profile of the signals, this distortion can result in an incorrect reconstruction of the amplitude. Events of this kind are referred to as “pileup-on-tail outside the time window” (\(\mathrm {POT}\)). The distortions can be avoided by selecting traces with sufficiently high values of \(\varDelta t_\mathrm {ch}\), as discussed in Sect. 3.1.1.

If \(\varDelta t_\mathrm {ch} < {2}\,\hbox {ms}\), only one trace is saved with a trigger time corresponding to the occurrence of the first event. For very short time differences \(\varDelta t_\mathrm {ch} \sim \tau _\mathrm {rise} \), the pileup of two events with energies \(E_1, E_2\) cannot be distinguished from the trace of one event with an energy of \(E \simeq E_1+E_2\). This unresolved pileup is examined in more detail in Sect. 4.2.3 and can be taken into account statistically.

Events with \(\varDelta t_\mathrm {ch} < {1.57}\,\hbox {ms}\) but \(\varDelta t_\mathrm {ch} \gtrsim \tau _\mathrm {rise} \) are referred to as “pileup-on-tail with both signals inside the time window” (\(\mathrm {PIT}\)). For these events, the tail of the first pulse deviates strongly from the regular pulse shape (see Fig. 2c). A template fit, discussed in Sect. 3.2, in which a reference pulse is scaled to the trace would provide a false amplitude. Identifying and discarding such events is possible by means of a larger \(\chi ^2_\mathrm {red}\) value of the fit, which is described in Sect. 3.2.

2.2 External spurious signals

Particle background: Natural radioactivity and cosmic muons can produce events in the energy range of the \(^{163}\mathrm {Ho}\) spectrum. Not all these events can be distinguished from \(^{163}\mathrm {Ho}\) events by means of their pulse shape (see Fig. 2d). Background suppression measures and a background model for the ECHo-1k setup are therefore of major importance to reliably analyse the endpoint region of the \(^{163}\mathrm {Ho}\) spectrum. Coincident signals could arise from secondary particles generated by muons interacting in surrounding materials or from muons passing through a pixel and the substrate. A search for coincident events among different detector channels therefore makes it possible to identify part of these muon-related events. A study of the spectrum of muons and muon-related events in the region of interest is discussed in [13].

Mobile phone signal: We observed that mobile phone signals transmitted with the Global System for Mobile communication (GSM) [14] can couple into our readout system and generate triggered traces (see Fig. 2e). The detailed underlying mechanism for this coupling is still under investigation. The time structure of a GSM signal is partitioned into time division multiple access (TDMA) frames with a duration of \({120}\,\hbox {ms}/26 \simeq {4.615}\,\hbox {ms}\). Each TDMA frame consists of eight equal time slots, each of which can contain a burst of data. Normally, a user is assigned to one of these time slots, which can cause repeating signals to be triggered with a period of \(\sim \,{4.615}\,\hbox {ms}\), as illustrated in Fig. 3 (right). In consideration of the respective guard periods one expects a burst duration of \({0.5465}\,\hbox {ms}\) for a normal burst (i.e. digitised voice data) and \({0.3210}\,\hbox {ms}\) for an access burst (i.e. communication to the base station).

Triggered noise: In addition to the sources mentioned above, miscellaneous temporary electromagnetic spurious signals can couple into the readout chain and create a false triggered signal. One example for this are small fluctuations in the power grid. Typically, those signals are characterised by a quickly repeating time signature and an anomalous shape of the trace (see Fig. 2f).

3 Data reduction

Various methods have been studied to filter spurious events in microcalorimeters. The main objectives are mitigating nonlinearity of the detector output, reconstructing single events from pile-up events, lowering the threshold for unresolved pile-up, and detecting outliers. Most approaches are based on either (modified) optimal filtering techniques [15, 16] or on principal component analysis [17,18,19,20,21]. While these methods show promising results, in this study we focus on arithmetically simple approaches as we aim for a fast online data reduction. For this, the developed algorithms have been tested offline with an available dataset first, and will be implemented into the data readout scheme in the future.

The presented offline data reduction algorithm comprises two levels as illustrated in Fig. 4. On the first level, only the trigger time information of the acquired raw traces is used to discard \(\mathrm {POT}\) events and external spurious signals in an energy-independent way. On the second level, the remaining data filtered by the first level filter are further analysed based on their time profile to discard remaining spurious signals. A template pulse is automatically generated by averaging \(^{163}\mathrm {Ho}\) traces of the MI-line. Traces that deviate strongly from the template are then discarded.

Fig. 4
Schematic of the two-level offline data reduction algorithm described in the text

3.1 Time trigger information analysis

The time information filter is defined as the logical AND of the pass decisions of four independent subfilters, each applied to the raw data. Thus, if a trace is discarded by at least one of the four subfilters, it is discarded by the time information filter. In the following, the aim and implementation of each of the subfilters are described. The holdoff and burst subfilters are applied channel by channel and analyse \(\varDelta t_\mathrm {ch}\). The coincidence and GSM subfilters are applied globally and analyse \(\varDelta t\).

3.1.1 Holdoff subfilter

The aim of the holdoff subfilter is to discard \(\mathrm {POT}\) events. For this, traces are removed that fulfil

$$\begin{aligned} \varDelta t_\mathrm {ch} < t_\mathrm {hold}. \end{aligned}$$
(1)

The holdoff time \(t_\mathrm {hold}\) is fixed based on the time profile of a typical \(^{163}\mathrm {Ho}\) signal such that the distortion of a pulse with \(\varDelta t_\mathrm {ch} = t_\mathrm {hold} \) is sufficiently small to ensure correctly reconstructed amplitudes at all energies. The holdoff time is determined in dedicated characterisation measurements prior to the actual experiment run. For this purpose, traces are acquired over a large time window up to the point where the temperature pulse recovers its initial voltage value. This is done for both AC and DC coupled signals.

This subfilter only removes the trace on the tail, i.e. the pulse occurring at a time interval \(\varDelta t_\mathrm {ch} <t_\mathrm {hold} \) after a previously triggered pulse in the same detector channel. For ECHo run 24 with AC-coupled signals, a value of \(t_\mathrm {hold}\) = \({15}\,\hbox {ms}\) was determined.
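The holdoff criterion of Eq. (1) can be sketched in a few lines of Python. The function name `holdoff_keep_mask` and the event representation as `(timestamp, channel)` tuples are assumptions made for illustration; only the value of \(t_\mathrm {hold}\) is taken from the text.

```python
T_HOLD_S = 15e-3  # holdoff time quoted for ECHo run 24 (AC-coupled signals)


def holdoff_keep_mask(events, t_hold=T_HOLD_S):
    """Return a list of booleans, True for traces that pass the holdoff subfilter.

    events: list of (timestamp_s, channel), sorted by timestamp.
    A trace is discarded if dt_ch < t_hold, where dt_ch is the time to the
    previously triggered trace in the same channel (Eq. (1)).
    """
    last_by_channel = {}
    keep = []
    for t, ch in events:
        prev = last_by_channel.get(ch)
        keep.append(prev is None or (t - prev) >= t_hold)
        # Every triggered trace updates the reference time, including
        # discarded ones, because dt_ch is defined on the raw data.
        last_by_channel[ch] = t
    return keep
```

Only the trace on the tail is flagged; the earlier pulse that causes the distortion is kept, as described above.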

3.1.2 Burst subfilter

In order to discard any traces from quickly repeating triggered noise, the burst subfilter identifies time intervals with an abnormally high trigger rate. The subfilter is applied channel by channel, since noise usually does not couple identically in all detector channels.

The timestamps of traces of each detector channel are binned with a bin width \(\varDelta t_\mathrm {bin}\). The expected number of events from \(^{163}\mathrm {Ho}\) decay per bin is then given by

$$\begin{aligned} \langle N_\mathrm {ch} \rangle = A_\mathrm {ch} \varDelta t_\mathrm {bin}, \end{aligned}$$
(2)

where \(A_\mathrm {ch}\) is the \(^{163}\mathrm {Ho}\) activity in the corresponding detector channel known from detector characterisation. A bin that contains quickly repeating triggered noise will exhibit a number of counts that strongly exceeds the expected value \(\langle N_\mathrm {ch} \rangle \) of the otherwise dominant \(^{163}\mathrm {Ho}\) events. The degree of deviation from the expected value can be expressed in terms of the statistical uncertainty of \(\langle N_\mathrm {ch} \rangle \), which for a Poissonian distribution is given by its standard deviation \(\sigma = \sqrt{\langle N_\mathrm {ch} \rangle }\). The traces within a bin are discarded if the number of counts exceeds \(\langle N_\mathrm {ch} \rangle + 4\sigma \). If a bin fulfils this criterion, it is referred to as a seed bin. For the two neighbouring bins of a seed bin, the threshold for the bins to be discarded is lowered to \(\langle N_\mathrm {ch} \rangle + 2\sigma \). This ensures that no fragments of a burst are missed due to binning.

Two complementing burst subfilters are implemented, one optimised for faster bursts and the other for slower bursts. The difference lies in the way the bin width is defined. For fast bursts it is chosen such that \(\langle N_\mathrm {ch} \rangle \) \(=1\), i.e.

$$\begin{aligned} \varDelta t_\mathrm {bin} = A_\mathrm {ch} ^{-1}. \end{aligned}$$
(3)

This corresponds to the shortest bin width that can reasonably be defined.

In order to be sensitive to noise triggered with a frequency down to \(f_\mathrm {noise}\), the bin width of the second burst subfilter is defined in a way that \(f_\mathrm {noise} \varDelta t_\mathrm {bin} = 4\sigma \). With the definition of \(\sigma \) and Eq. (2), this condition is fulfilled for

$$\begin{aligned} \varDelta t_\mathrm {bin} = 16 \frac{A_\mathrm {ch}}{f_\mathrm {noise} ^2}. \end{aligned}$$
(4)

The burst subfilter is then defined as the logical AND of the decisions made with both methods. Thus, if a trace is discarded by at least one of the two methods, it is discarded by the burst subfilter.
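A minimal sketch of the binning logic, assuming that the timestamps of one detector channel are available as a list of floats in seconds, could look as follows. All function names are hypothetical, and the handling of the measurement boundaries is a simplification.

```python
import math


def burst_discard_bins(timestamps, activity_bq, bin_width_s, t_start, t_stop):
    """Return the set of bin indices whose traces are discarded."""
    n_bins = max(1, math.ceil((t_stop - t_start) / bin_width_s))
    counts = [0] * n_bins
    for t in timestamps:
        i = int((t - t_start) / bin_width_s)
        if 0 <= i < n_bins:
            counts[i] += 1
    expected = activity_bq * bin_width_s  # Eq. (2)
    sigma = math.sqrt(expected)           # Poissonian standard deviation
    # Seed bins exceed the expectation by more than 4 sigma.
    seeds = {i for i, c in enumerate(counts) if c > expected + 4 * sigma}
    discard = set(seeds)
    # Direct neighbours of a seed bin are tested against a 2 sigma threshold.
    for i in seeds:
        for j in (i - 1, i + 1):
            if 0 <= j < n_bins and counts[j] > expected + 2 * sigma:
                discard.add(j)
    return discard


def fast_bin_width(activity_bq):
    return 1.0 / activity_bq                      # Eq. (3)


def slow_bin_width(activity_bq, f_noise_hz):
    return 16.0 * activity_bq / f_noise_hz ** 2   # Eq. (4)


def burst_keep_mask(timestamps, activity_bq, f_noise_hz, t_start, t_stop):
    """True for traces that pass both the fast and the slow burst subfilter."""
    keep = [True] * len(timestamps)
    widths = (fast_bin_width(activity_bq), slow_bin_width(activity_bq, f_noise_hz))
    for width in widths:
        bad = burst_discard_bins(timestamps, activity_bq, width, t_start, t_stop)
        for k, t in enumerate(timestamps):
            if int((t - t_start) / width) in bad:
                keep[k] = False
    return keep
```

For an activity of 1 Bq and \(f_\mathrm {noise} = {2}\,\hbox {Hz}\), the two bin widths are 1 s and 4 s. Note that regular \(^{163}\mathrm {Ho}\) events falling into a discarded bin are removed together with the noise, which is the price of an energy-independent criterion.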

Fig. 5
Histograms to illustrate the influence of the time information filter, broken down by the four independent subfilters. Left: histograms of the number of acquired traces per minute. From a two-day dataset of ECHo run 24, a six-hour sample containing high levels of noise between the timestamps 13.5 h and 15.5 h is shown. Right: histograms of the fit amplitude. In all ten panels, the raw histogram prior to any filters is plotted in grey. In the two top panels, the histograms of remaining traces after application of the time information filter (TI filter) can be seen. The panels below present the histograms of traces discarded by the individual subfilters

3.1.3 Coincidence subfilter

For an activity of \({1}\,\hbox {Bq}\) per detector channel, which is typical for ECHo-1k, coincidence among different detector channels on a microsecond timescale due to \(^{163}\mathrm {Ho}\) has a low probability of \(\sim 10^{-4}\). Muon-induced events or certain electromagnetic signals in turn often cause triggered events in multiple detector channels at the same time as discussed in Sect. 4.1.2. Thus, discarding coincident events altogether is an efficient way to reduce spurious signals. Traces that fulfil

$$\begin{aligned} \varDelta t < t_\mathrm {coinc} \end{aligned}$$
(5)

as well as the corresponding previous traces are considered coincident.

The coincidence time \(t_\mathrm {coinc}\) can be defined by the time response of the signal. In the discussed datasets, the time response is governed by the gain bandwidth product (GBP) of the amplification circuit. One usually obtains an effective time resolution \(\tau _\mathrm {rise}\) of a few hundred nanoseconds. For muon related events, \(\varDelta t < \tau _\mathrm {rise} \) holds. However, for electromagnetic signals that couple into the readout scheme of multiple detector channels, time differences \(\varDelta t\) up to a few microseconds have been observed. Therefore, a conservative coincidence time of \(t_\mathrm {coinc} = {8}\,\upmu \hbox {s}\) is used for ECHo run 24.
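The coincidence criterion of Eq. (5) can be illustrated with the following Python sketch, which operates on the time-ordered timestamps of all detector channels of one chip. The function name is an assumption; the coincidence time is the value quoted above.

```python
T_COINC_S = 8e-6  # conservative coincidence time used for ECHo run 24


def coincidence_keep_mask(timestamps, t_coinc=T_COINC_S):
    """Return True for traces that pass the coincidence subfilter.

    timestamps: trigger times in seconds of all channels of one chip,
    sorted in ascending order. A trace with dt < t_coinc is discarded
    together with the previous trace (Eq. (5)).
    """
    keep = [True] * len(timestamps)
    for i in range(1, len(timestamps)):
        if timestamps[i] - timestamps[i - 1] < t_coinc:
            keep[i] = False
            keep[i - 1] = False
    return keep
```

Chains of more than two coincident traces are removed as a whole, since each member is either closer than \(t_\mathrm {coinc}\) to its predecessor or is the predecessor of such a trace.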

3.1.4 GSM subfilter

This subfilter is implemented to specifically reject triggered GSM phone signals. For this, characteristic \(\varDelta t\) values associated with GSM signals are defined. Besides integer multiples of the duration of a TDMA frame, this includes the burst duration of a normal burst and an access burst. The burst duration can appear in the data stream when the rising and falling edges of a burst are triggered in different detector channels. Traces with a relative \(\varDelta t\) within a \({\pm 20}\,\upmu \hbox {s}\) interval around one of these characteristic \(\varDelta t\) values are discarded.

In principle, there is an infinite number of characteristic time differences \(\varDelta t\) that can be associated when considering all integer multiples. In practice however, a maximum value of \(\varDelta t\) is defined according to the total activity of the chip such that the probability that two triggered GSM signals separated by \(\varDelta t\) are not interrupted by a \(^{163}\mathrm {Ho}\) signal is \(10\%\).
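As an illustration, the list of characteristic \(\varDelta t\) values could be built as in the Python sketch below. The cutoff assumes Poissonian statistics, i.e. the probability that no \(^{163}\mathrm {Ho}\) event occurs within \(\varDelta t\) is \(\exp (-A \varDelta t)\) for a total chip activity \(A\); this reading of the 10% criterion is an assumption, as are all function names. The burst durations are the values quoted in Sect. 2.2.

```python
import math

TDMA_FRAME_S = 0.120 / 26    # duration of one TDMA frame, about 4.615 ms
NORMAL_BURST_S = 0.5465e-3   # normal burst duration quoted in the text
ACCESS_BURST_S = 0.3210e-3   # access burst duration quoted in the text
TOLERANCE_S = 20e-6          # half-width of the rejection interval


def max_gsm_dt(total_activity_bq, p_uninterrupted=0.10):
    """Largest dt considered, assuming P(no Ho event in dt) = exp(-A * dt)."""
    return -math.log(p_uninterrupted) / total_activity_bq


def characteristic_dts(total_activity_bq):
    """Burst durations plus all TDMA frame multiples up to the cutoff."""
    n_max = int(max_gsm_dt(total_activity_bq) / TDMA_FRAME_S)
    multiples = [n * TDMA_FRAME_S for n in range(1, n_max + 1)]
    return [NORMAL_BURST_S, ACCESS_BURST_S] + multiples


def is_gsm_like(dt, targets, tol=TOLERANCE_S):
    """True if dt lies within +/- tol of any characteristic value."""
    return any(abs(dt - target) <= tol for target in targets)
```

For a total activity of 25 Bq, this cutoff corresponds to about 92 ms, i.e. 19 multiples of the TDMA frame duration under the stated assumption.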

3.1.5 Application of the time information filter

The first level filter is applied to a dataset acquired with 34 implanted pixels of one ECHo-1k chip during two days of run 24 with a total activity of \(\sim \,{25}\,\hbox {Bq}\). In the presented data reduction routine, the template fit as described in Sect. 3.2.2 is only performed for data that passes the time information filter. For this two-day dataset however, amplitudes are obtained for all data in order to illustrate the working principle of the time information filter as well as to assess its efficiency (Sect. 4.1).

In Fig. 5, the number of acquired and discarded traces per minute (left) and the fit amplitudes of acquired and discarded traces (right) are shown for 18 detector channels of the two-day dataset acquired with an ECHo-1k chip. For both plots, the top panel shows the corresponding histograms after applying a time information filter consisting of all four subfilters. The lower panels show the histograms of discarded traces broken down by the individual subfilters. In all ten panels, the histogram of acquired raw traces is shown in grey for comparison. For most of the acquisition time, the number of events acquired per minute is constant, as mostly \(^{163}\mathrm {Ho}\) events are triggered. Between the timestamps of 13.5 h and 15.5 h, the number of counts increases by up to an order of magnitude. After applying the time information filter, the number of counts per minute within this time interval drops below the average undisturbed value, while the undisturbed region is barely affected. During the period of a high count rate, the number of discarded traces increases strongly for all subfilters.

Fig. 6
Representation of the signal heights of the first 10,000 raw traces of a single pixel. Left: histogram of the signal heights. Right: brightness of the traces calculated via Eq. (7). One can clearly see that the MI-line is the brightest. This holds for much higher signal heights as well, as highlighted by the 1-count limit in red. The number of bins (1000) is chosen simply for visual clarity and only slightly influences the position of the maximum

In the histograms of discarded fit amplitudes shown in Fig. 5 (right, lower panels), one can see the reconstructed amplitudes of discarded background traces as well as a component of falsely discarded \(^{163}\mathrm {Ho}\) traces. The spectrum of traces discarded by the coincidence subfilter, which can be seen in the bottom panel, is a background spectrum with only few discarded \(^{163}\mathrm {Ho}\) traces. It is characterised by a strong increase of counts towards low fit amplitudes, particularly below the NI-resonance. It is important to note that the fit amplitudes of background signals cannot necessarily be translated to an energy scale as is the case with \(^{163}\mathrm {Ho}\) signals.

3.2 Pulse shape analysis

The aim of the second level of data reduction is to recognise and eliminate \(\mathrm {PIT}\) as well as time-uncorrelated noise traces. For this, a mean trace (template) is generated for each pixel. All traces that have passed the first filter will then undergo a template fit with the obtained template. The goodness of fit parameter \(\chi ^2_\mathrm {red}\) is calculated, which provides a measure for how well each trace can be scaled to the template. This is used to define the second level filter.

Fig. 7
Histogram of \(\chi ^2_\mathrm {red}\) of a single detector channel. The plotted range contains \(\simeq {99.7}{\%}\) of all raw events. The filtered region after the pulse shape filter still encompasses \(\simeq {97.6}{\%}\) of all raw events. Also depicted is the skewed Gaussian fit from which the pulse shape filter is calculated

3.2.1 Automated template generation

It is apparent from Fig. 1 (right) that the shapes of the traces from a single energy deposition in the detector are energy independent. Therefore, it is possible to build a discrimination scheme based on the deviation of traces from the general shape called the template. In order to process the vast amount of data acquired for the ECHo-1k experiment, an automated process for generating templates has been developed. To ensure a high reliability of the final pulse shape, the template is generated by averaging a large number of individual traces belonging to the MI-line of the \(^{163}\mathrm {Ho}\) spectrum. In fact, MI-events already have a very good figure-of-merit \(FOM\) at the level of single events, defined as:

$$\begin{aligned} FOM = \frac{V}{\sigma }, \end{aligned}$$
(6)

where \(V\) and \(\sigma \) are the height of the template signal (the average of the 10 samples after the maximum) and the standard deviation of the noise of the pre-trigger samples (pre-trigger noise), respectively. A high \(FOM\) corresponds to a signal shape relatively undisturbed by noise. As an example, compare a pulse from the MI-line (\(FOM \sim 100\)–400) to a pulse from the OI-line (\(FOM \sim 10\)–40), as shown in Fig. 1.
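A possible way to evaluate Eq. (6) for a single trace is sketched below in Python. The sketch assumes a positive-polarity pulse and subtracts the mean of the pre-trigger samples before computing the height; both choices, as well as the function name, are assumptions that go beyond the definition given in the text.

```python
import statistics


def figure_of_merit(trace, n_pretrigger, n_height=10):
    """FOM = V / sigma for one positive-polarity trace (Eq. (6)).

    V: average of the n_height samples after the maximum, measured
       relative to the pre-trigger baseline (assumption).
    sigma: standard deviation of the pre-trigger samples.
    The trace must extend at least one sample beyond its maximum.
    """
    pre = trace[:n_pretrigger]
    baseline = statistics.fmean(pre)
    sigma = statistics.pstdev(pre)
    peak = max(range(len(trace)), key=lambda i: trace[i])
    after = trace[peak + 1 : peak + 1 + n_height]
    height = statistics.fmean(after) - baseline
    return height / sigma
```

For negative-polarity pulses, the trace would have to be inverted before the call.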

To reach higher \(FOM\) values, multiple traces have to be averaged to form the template. By averaging N traces, the pre-trigger noise of the resulting template is reduced by a factor of \(N ^{-1/2}\). Ideally, the traces should already have a high signal height. In order to maximise the \(FOM\) of the template, the best approach is to select a region of the available spectrum with a high fraction of \(^{163}\mathrm {Ho}\) signals compared to traces from various other sources as detailed in Sect. 2. This is the case for energies close to the main resonances of the \(^{163}\mathrm {Ho}\) spectrum. Also, selecting only traces from a single resonance as opposed to multiple ones ensures that \(FOM\) increases reliably.

The MI-resonance was found to be best suited for template generation based on the following approach: first, a histogram of the signal heights in \(\hbox {V}\) as provided by the acquisition software of the first few traces (\({\simeq 10,\!000}\)) is generated, as shown in Fig. 6 (left). Then, the brightness \(b\) is calculated per bin, defined by

$$\begin{aligned} b = V \sqrt{I}, \end{aligned}$$
(7)

where \(I\) is the number of counts in each bin. The result can be seen in Fig. 6 (right). Summing all traces of the region with the maximum \(b\) will result in an average trace with the maximum \(FOM\). For the \(^{163}\mathrm {Ho}\) spectrum, the brightest line is the MI-line. Hence, as stated above, it is the ideal candidate for template generation.
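The brightness of Eq. (7) follows from the noise reduction by averaging: a template built from \(I\) traces of height \(V\) has \(FOM \propto V \sqrt{I}\). A minimal Python sketch of the search for the brightest bin is given below. It takes a plain maximum of \(b\) instead of a dedicated peak detection, and the function name and binning details are assumptions.

```python
import math


def brightest_bin(signal_heights, n_bins=1000):
    """Return (bin centre, brightness) of the bin with maximum b = V * sqrt(I).

    signal_heights: signal heights in volts; at least two distinct values
    are required so that the bin width is non-zero.
    """
    lo, hi = min(signal_heights), max(signal_heights)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in signal_heights:
        # The maximum value is assigned to the last bin.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    centres = [lo + (i + 0.5) * width for i in range(n_bins)]
    brightness = [v * math.sqrt(n) for v, n in zip(centres, counts)]
    best = max(range(n_bins), key=lambda i: brightness[i])
    return centres[best], brightness[best]
```

A line with many counts at low signal height and a single count at very large signal height are both suppressed by this measure, which is why the MI-line stands out.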

The signal height of the MI-line is located via a peak detection algorithm based on a continuous wavelet transform as implemented in the Python package SciPy [22], performed on \(b\).

The last step is to iteratively read in traces with a signal height that is within 1% tolerance of the mode of the MI-line in small batches of 200. One can then filter those traces with \(\mathrm {PIT}\) and other defects by calculating their pairwise quadratic differences in a vectorised manner, discarding those that deviate from the median quadratic difference by a factor \(\ge 2\). The remaining traces can be averaged until an initially defined \(FOM\) for the template is reached. If the data is exhausted before reaching the intended \(FOM\), this particular dataset is discarded from further analysis.
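The rejection of defective traces within one batch could be sketched as follows. The text does not specify how the pairwise differences are condensed into one number per trace; the sketch assumes the mean squared difference to all other traces of the batch as the score and uses plain Python loops rather than a vectorised implementation. Function names are illustrative.

```python
import statistics


def reject_outlier_traces(traces, factor=2.0):
    """Keep traces whose score is below factor times the median score.

    traces: list of equally long lists of samples (at least two traces).
    score: mean over all other traces of the summed squared difference
    (assumption; the text only specifies pairwise quadratic differences).
    """
    n = len(traces)
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i != j:
                total += sum((a - b) ** 2 for a, b in zip(traces[i], traces[j]))
        scores.append(total / (n - 1))
    median = statistics.median(scores)
    # If all traces are identical, the median is zero and nothing is rejected.
    return [tr for tr, s in zip(traces, scores) if median == 0.0 or s < factor * median]


def average_trace(traces):
    """Sample-wise mean of the given traces."""
    n = len(traces)
    return [sum(column) / n for column in zip(*traces)]
```

In such a scheme, the surviving traces of each batch would be accumulated into the running average until the target \(FOM\) is reached.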

3.2.2 Template fit method

Once the average MI-signal is generated, a template fit for all the traces which survived the first level filter is performed. The measure used to determine how well the shape of the traces agrees with the template is the reduced chi-square, defined as:

$$\begin{aligned} \chi ^2_\mathrm {red} = \frac{1}{f}\sum _{i=1}^f \left( s_i - A \theta _i - O \right) ^2, \end{aligned}$$
(8)

where \(A\) and \(O\) describe the amplitude and offset of the trace \(\mathbf{s} \) respectively, \(\varvec{\theta }\) is the template, and the sum runs over all f elements of \(\mathbf{s} \) and \(\varvec{\theta }\). \(\chi ^2_\mathrm {red}\) is then minimised with respect to \(A\) and \(O\). The normalisation by 1/f (roughly the degrees of freedom of the fit) causes \({\chi ^2_\mathrm {red} \simeq 1}\) for a \(^{163}\mathrm {Ho}\) signal and facilitates further evaluation and analysis.
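Because Eq. (8) is quadratic in \(A\) and \(O\), the minimisation is a linear least-squares problem. The sketch below solves it with `numpy.linalg.lstsq`; it assumes that the trace is expressed in units of the baseline noise standard deviation, so that \(\chi ^2_\mathrm {red} \simeq 1\) for a well-described pulse. The function name `template_fit` and the toy pulse shape are illustrative.

```python
import numpy as np

def template_fit(trace, template):
    """Least-squares amplitude A, offset O and reduced chi-square of Eq. (8)."""
    design = np.column_stack([template, np.ones_like(template)])
    (amp, off), *_ = np.linalg.lstsq(design, trace, rcond=None)
    resid = trace - amp * template - off
    return amp, off, np.mean(resid**2)

# Toy example (hypothetical pulse shape), noise with unit standard deviation.
rng = np.random.default_rng(3)
t = np.arange(1000.0)
theta = np.exp(-t / 200.0) - np.exp(-t / 10.0)
s = 240.0 * theta + 3.0 + rng.standard_normal(t.size)

A, O, chi2_red = template_fit(s, theta)
```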

After all traces have been fit, a histogram of all \(\chi ^2_\mathrm {red}\) is generated as shown in Fig. 7, where in a typical measurement, \(\simeq {99.5}{\%}\) of all traces are inside the region of \({0 \le \chi ^2_\mathrm {red} \le 5}\). A skewed Gaussian distribution is fit to the histogram and the pulse shape filter is defined such that all traces which lie outside the \({99.73}{\%}\)-regionFootnote 6 of the skewed Gaussian are discarded.

4 Assessment of the data reduction algorithm

In the following, the performance of the filters defined above is evaluated for a subset of ECHo-1k data. For the first level filter, the efficiencies to retain signal and reject background are estimated based on the two-day dataset acquired with an ECHo-1k chip with a total activity of \(\sim {25}\,\hbox {Bq}\) as well as using simulated \(\varDelta t\) values. Subsequently, the energy dependence of the second level filter is analysed based on simulated \(^{163}\mathrm {Ho}\) traces.

4.1 Assessment of the time information analysis

4.1.1 \(^{163}\mathrm {Ho}\) selection efficiency

The fraction of \(^{163}\mathrm {Ho}\) events that are unaffected by the time information filter, which we will simply call signal efficiency, is estimated for each subfilter individually. For the holdoff subfilter, by definition, the signal efficiency is \(\epsilon _\mathrm {hold}^\mathrm {sig}={100}{\%}\). Even though not all discarded traces would cause a falsely reconstructed amplitude, it is crucial for them to be rejected in order to obtain an energy-independent filter. Hence, all traces discarded by this subfilter are considered a source of background.

For the GSM subfilter as well as for the coincidence subfilter, \(^{163}\mathrm {Ho}\) traces are randomly discarded if their time difference \(\varDelta t\) lies in a region that is associated with mobile phone signal or coincident events respectively. The fraction of \(^{163}\mathrm {Ho}\) events occurring within the time intervals related to GSM signals and the fraction of random coincidence of \(^{163}\mathrm {Ho}\) events are obtained by applying the subfilters on simulated data with values of \(\varDelta t\) distributed according to a total activity of \(A = {25}\,\hbox {Bq}\). One finds that the fraction of \(^{163}\mathrm {Ho}\) events removed by applying the GSM subfilter is 5% while in case of the coincidence subfilter the fraction of discarded \(^{163}\mathrm {Ho}\) traces amounts to 0.04%. This corresponds to signal efficiencies of \(\epsilon _\mathrm {GSM}^\mathrm {sig} = {95}{\%}\) and \(\epsilon _\mathrm {coinc}^\mathrm {sig} = {99.96}{\%}\) respectively. As expected, it can be seen in Fig. 5 (right) that for the two-day data set, the number of \(^{163}\mathrm {Ho}\) traces discarded by the GSM subfilter exceeds the number of \(^{163}\mathrm {Ho}\) traces discarded by the coincidence subfilter by more than two orders of magnitude.

For the burst subfilter, the effective off-time caused by the rejection of time intervals with abnormally high rates is determined for each detector channel. The number of falsely discarded \(^{163}\mathrm {Ho}\) traces can be estimated from the product of activity and effective off-time for each detector channel. For the two-day dataset, the fraction of effective off-time to acquisition time ranges from \(0.1\%\) for low-noise detector channels to \(1.5\%\) for detector channels with strong coupling of mobile phone signals. The total fraction of \(^{163}\mathrm {Ho}\) traces discarded by the burst subfilter is 0.7% and thus \(\epsilon _\mathrm {burst}^\mathrm {sig} = {99.3}{\%}\).
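For orientation, the individual signal efficiencies quoted above can be combined into a rough overall figure. The sketch below simply multiplies them, which assumes that the subfilters discard \(^{163}\mathrm {Ho}\) traces independently of one another; this assumption and the resulting combined values are not stated in the text.

```python
import math

# Signal efficiencies of the four subfilters as quoted in the text.
signal_eff = {"holdoff": 1.0, "GSM": 0.95, "coincidence": 0.9996, "burst": 0.993}

def combined(effs):
    """Product of efficiencies, valid only if the subfilters act independently."""
    return math.prod(effs)

eff_all = combined(signal_eff.values())
eff_no_gsm = combined(v for k, v in signal_eff.items() if k != "GSM")
print(f"all four subfilters: {100 * eff_all:.1f} %")
print(f"without GSM:         {100 * eff_no_gsm:.1f} %")
```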

4.1.2 Background rejection efficiency

The efficiency to reject signals which could contribute as background to the \(^{163}\mathrm {Ho}\) spectrum is defined individually for the signal families specified in Sect. 2. To reject \(\mathrm {POT}\) events, the holdoff time is conservatively chosen such that the broadening of the spectral shape due to falsely reconstructed amplitude is negligible. Thus, a background rejection efficiency for \(\mathrm {POT}\) of \(\epsilon _\mathrm {\mathrm {POT}}^\mathrm {bkg}={100}{\%}\) can be assumed.

Fig. 8

Fit amplitude of a trace versus fit amplitude of the preceding trace in the same detector channel before and after a time information filter is applied. Data points from one detector channel of the two-day dataset are shown. \(^{163}\mathrm {Ho}\) events are mostly located on lines parallel to the main axes corresponding to the individual resonances of the \(^{163}\mathrm {Ho}\) spectrum. High densities appear at the intersections of two such lines. The inset shows a magnification of the region between the N-resonances and O-resonances together with a structure of correlated amplitudes originating from mobile phone signals. The ellipse-shaped reference region is used to estimate the background rejection efficiency for mobile phone signals. Data points within the ellipse are magnified for better readability. One can see that nearly all of the signals within the ellipse are discarded by the time information filter while most of the \(^{163}\mathrm {Ho}\) signals are unaffected

To estimate the background rejection efficiency for mobile phone signals, we define a reference region in a parameter space with a particularly high background-to-signal ratio. The background rejection efficiency can then be estimated based on the fraction of events within this region that are discarded. A good way to separate GSM events from \(^{163}\mathrm {Ho}\) events is to plot the amplitude of the acquired traces against the amplitude of the preceding trace for all acquired traces in one detector channel. As indicated in Fig. 8, \(^{163}\mathrm {Ho}\) events are mainly distributed in regions parallel to the main axes. Triggered noise was found to accumulate along the diagonal through the origin with unit slope. The correlation of fit amplitudes of subsequent mobile phone signals is not surprising, since triggered mobile phone signals are typically characterised by repeating signals with the same shape. The reconstructed amplitudes of mobile phone signals are well below the M-resonances and in the continuum region between two resonances, the \(^{163}\mathrm {Ho}\) event rate is suppressed by several orders of magnitude compared to the region around the resonances. Thus, an ellipse-shaped reference region can be defined, centred between the NII- and OI-resonances, as drawn in Fig. 8, which has a particularly high background-to-signal ratio. In this way, traces contained within the ellipse can be considered a pure sample of background signals. For the two-day dataset, only 77 out of 44,602 events within the ellipse are not discarded by a time information filter consisting of all four subfilters. The efficiency is conservatively estimated by the fraction of events within the ellipse that are discarded by the time information filter, \(\epsilon _\mathrm {phone}^\mathrm {bkg}={99.826\pm 0.020}{\%}\). The error is obtained from the variance of a binomial distribution.Footnote 7 A more refined estimate can be obtained by additionally considering the \(\chi ^2_\mathrm {red}\)-value of the traces. Of the 77 events within the ellipse that are not discarded by the time information filter, only 5 satisfy \(\chi ^2_\mathrm {red} >1.3\), as expected for mobile phone signals. The other 72 traces have a \(\chi ^2_\mathrm {red}\) value close to 1, thus it can be assumed that they are \(^{163}\mathrm {Ho}\) traces. This additional selection yields \(\epsilon _\mathrm {phone}^\mathrm {bkg}={99.988\pm 0.005}{\%}\). Based on the shape of the time traces of the remaining 5 events, they can be identified as \(\mathrm {PIT}\), thus resulting in a rejection efficiency for mobile phone signals of \(\epsilon _\mathrm {phone}^\mathrm {bkg}={100}{\%}\).
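The efficiency and its binomial uncertainty can be checked with a few lines of Python. The helper name `rejection_efficiency` is illustrative; the event counts are those quoted above. The results agree with the quoted values at the level of the last digit.

```python
import math

def rejection_efficiency(n_total, n_surviving):
    """Efficiency 1 - k/n with the standard deviation of a binomial proportion."""
    eff = 1.0 - n_surviving / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err

# All events surviving the time information filter inside the ellipse.
eff_all, err_all = rejection_efficiency(44602, 77)
# Only survivors with chi2_red > 1.3 counted as mobile phone signals.
eff_chi2, err_chi2 = rejection_efficiency(44602, 5)

print(f"{100 * eff_all:.3f} +/- {100 * err_all:.3f} %")
print(f"{100 * eff_chi2:.3f} +/- {100 * err_chi2:.3f} %")
```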

As for particle background, only signals initiated by atmospheric muons can be tackled with the time information filter. Such traces can be discarded if a coincident signal in multiple pixels is produced. The efficiency of rejecting muon-induced events by means of the coincidence subfilter can be estimated from an acquisition with an active muon veto installed around the dilution refrigerator. The background rejection efficiency for muon-induced events is the ratio of pixel-pixel-veto coincidences and pixel-veto coincidences, i.e. the fraction of muon-induced signals that produce a signal in at least two pixels. In [13], a measurement with muon veto was described for 64 pixel-days. A total of \({242\pm 20}\) pixel-veto coincidences and \({194\pm 12}\) pixel-pixel-veto coincidences were measured. Thus, one can derive \(\epsilon _\mathrm {muon}^\mathrm {bkg}={80\pm 8}{\%}\).
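The quoted muon rejection efficiency follows from the ratio of the two coincidence counts. The sketch below propagates the uncertainties as if the two counts were uncorrelated, which is an assumption made here for illustration; the text does not specify how the error was obtained.

```python
import math

def ratio_with_error(num, num_err, den, den_err):
    """Ratio num/den with uncorrelated Gaussian error propagation."""
    ratio = num / den
    rel = math.sqrt((num_err / num) ** 2 + (den_err / den) ** 2)
    return ratio, ratio * rel

# Pixel-pixel-veto coincidences over pixel-veto coincidences (values from the text).
eff_muon, err_muon = ratio_with_error(194, 12, 242, 20)
print(f"{100 * eff_muon:.0f} +/- {100 * err_muon:.0f} %")
```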

4.1.3 Necessity of the individual subfilters

\(\mathrm {POT}\) events are discarded efficiently by the holdoff subfilter and muon related background is tackled by the coincidence subfilter. Since these signal families are each addressed by only one subfilter, the use of both the holdoff subfilter and the coincidence subfilter is essential. This also implies that the combination of different subfilters does not improve the respective rejection efficiencies for traces from these signal families.

To evaluate the benefit of the additional use of a burst subfilter and a GSM subfilter to specifically reject mobile phone induced noise, the influence of these subfilters on the background rejection efficiency \(\epsilon _\mathrm {phone}^\mathrm {bkg}\) is investigated. If instead of all four subfilters only the holdoff subfilter and the coincidence subfilter are applied, 226 instead of 77 out of a total of 44,602 mobile phone signals within the ellipse are not discarded. If in addition to these two subfilters the burst subfilter is applied, 81 traces remain undiscarded. In the case of using the GSM subfilter in addition to the holdoff subfilter and the coincidence subfilter, 103 traces in the ellipse are not discarded.
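To compare the subfilter combinations on a common scale, the surviving counts can be converted into rejection efficiencies, treating all events within the ellipse as background as argued in Sect. 4.1.2. The dictionary keys are labels chosen here for readability.

```python
N_ELLIPSE = 44602  # events inside the reference ellipse

# Events not discarded, for each subfilter combination quoted in the text.
surviving = {
    "holdoff + coincidence": 226,
    "holdoff + coincidence + burst": 81,
    "holdoff + coincidence + GSM": 103,
    "all four subfilters": 77,
}
rejection = {name: 1.0 - n / N_ELLIPSE for name, n in surviving.items()}
for name, eff in rejection.items():
    print(f"{name:32s} {100 * eff:.3f} %")
```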

Even though the improvement due to the additional subfilters seems to be minor for this dataset, the burst subfilter in particular should always be applied. The signal efficiency for this subfilter is already high and increases even further the lower the noise level of the acquisition. In addition, the burst subfilter is sensitive to abnormally high trigger rates that only occur in one detector channel, and even to rather low noise frequencies that cannot be resolved by any of the other subfilters. Applying a time information filter without the GSM subfilter, the background rejection efficiency \(\epsilon _\mathrm {phone}^\mathrm {bkg}\) is still above 99.8% and the signal efficiency is dominated by \(\epsilon _\mathrm {burst}^\mathrm {sig} = {99.3}{\%}\).

With a signal efficiency of only 95%, the GSM subfilter is an expensive filter in terms of discarding good data. Furthermore, the reconstructed energy of GSM signals is well below \(Q_\mathrm {EC}\) and thus will not affect the spectral shape close to the endpoint. In the two-day dataset of ECHo-1k, the additional application of this particular subfilter shows no advantage over the sole use of the other three subfilters. In future runs, the coupling of GSM signals will be reduced by improving screening of the read out components. For analyses of the low energy part of the spectrum however, where background levels increase, as well as for acquisitions with high levels of triggered mobile phone signals, this subfilter can be of relevance.

4.2 Assessment of the pulse shape analysis

Fig. 9

Simulated (a, b) and ECHo-1k data (c, d). a Fit amplitude vs. \(\chi ^2_\mathrm {red}\) scatter plot of simulated events. An exemplary path of decreasing \(\varDelta t_\mathrm {ch}\) for \(A_1=A_2=1\) and \(\varPi =+1\) is indicated in red. The location of the maximum of the arc-shaped structures is determined by the amplitude of the pulse on the tail \(A_2\). b Histogram of the fit amplitude of simulated events before and after applying the pulse shape filter. The large fraction of pileup is caused by the truncated \(\varDelta t_\mathrm {ch}\) distribution used for the simulation. The structure in the histogram of traces discarded by the pulse shape filter is discussed in the text. c Fit amplitude vs. \(\chi ^2_\mathrm {red}\) scatter plot for one detector channel of the two-day ECHo-1k dataset. The arc-shaped structures become apparent after applying the time information filter. d Histogram of the fit amplitude of the ECHo-1k dataset with 18 detector channels where the endpoint region is blinded. The slight asymmetry of spikes around the main \(^{163}\mathrm {Ho}\)-lines is caused by detector channels with an asymmetric activity in the two pixels. The spike at fit amplitude 0 corresponds to triggered baselines. Note that the scale on the x-axis is the same for all plots

To assess the pulse shape analysis, the template fit described in Sect. 3.2 is performed on a set of simulated \(^{163}\mathrm {Ho}\) data. The aim is to find the sensitivity of identifying \(\mathrm {PIT}\) events as a function of time difference and energies of two subsequent events. Traces of \(\mathrm {PIT}\) as well as regular \(^{163}\mathrm {Ho}\) events are simulated with amplitudes, timestamps and polarities randomly drawn from corresponding distributions.

4.2.1 Simulation of pileup-on-tail with both signals inside the time window

For the simulation of signals with \(\mathrm {PIT}\), \(10^7\) events are generated, each with an amplitude of the triggered pulse \(A_1\), an amplitude of the subsequent pulse \(A_2\), a time difference to the subsequent pulse \(\varDelta t_\mathrm {ch}\) and a relative sign of the polarity to the subsequent pulse \(\varPi \). The corresponding values are drawn randomly from the expected probability distributions of the parameters. For the two pulse amplitudes, this distribution is the theoretical \(^{163}\mathrm {Ho}\) spectrum [7] normalised by its area. The values of \(\varDelta t_\mathrm {ch}\) are drawn from an exponential distribution \(\propto \exp (-A \varDelta t_\mathrm {ch})\) with activity \(A={1}\,\hbox {Bq}\). Only integer multiples of \({128}\,\hbox {ns}\) are allowed, which corresponds to the time difference between two sampled data points typically used for acquiring ECHo data, as described in Sect. 1. For \(\varPi \), a discrete distribution \(P(\varPi =-1)=P(\varPi =+1)=0.5\) is employed. Normally distributed noise \(\varvec{{\mathcal {N}}}\) is generated with a constant \(FOM\) for a pulse height corresponding to the MI-line of the \(^{163}\mathrm {Ho}\) spectrum. In the following, \(FOM =300\) is used, which is a typical value for ECHo-1k data.

A simulated pileup trace \(\varvec{PU}(A_1,A_2,\varDelta t_\mathrm {ch}, \varPi )\) is then generated according to

$$\begin{aligned} \varvec{PU}(A_1,A_2,\varDelta t_\mathrm {ch}, \varPi )&= A_1 \varvec{\theta }(t_\mathrm {shift}=0) \\&\quad + \varPi A_2 \varvec{\theta }(t_\mathrm {shift}=\varDelta t_\mathrm {ch}) \\&\quad + FOM ^{-1}\varvec{{\mathcal {N}}}(\mu =0,\,\sigma ^{2}=1) \end{aligned}$$

where \(\varvec{\theta }(t_\mathrm {shift})\) is a template pulse (see Sect. 3.2.1) shifted in time by \(t_\mathrm {shift}\).
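The construction of a simulated pileup trace can be sketched as follows. The double-exponential pulse shape, its time constants, the trace length of 16384 samples and the pre-trigger length are hypothetical; they are chosen such that the post-trigger window is about \({1.57}\,\hbox {ms}\) at \({128}\,\hbox {ns}\) per sample. The actual analysis uses the averaged template of Sect. 3.2.1 instead.

```python
import numpy as np

N_SAMPLES = 16384     # hypothetical trace length
PRE_TRIGGER = 4096    # hypothetical number of pre-trigger samples
DT = 128e-9           # sampling interval in seconds (from the text)
FOM = 300.0           # figure of merit used in the simulation

def pulse_shape(t, rise=10.0, decay=3000.0):
    """Hypothetical pulse in sample units; exactly zero for t <= 0."""
    tp = np.clip(t, 0.0, None)
    return np.exp(-tp / decay) - np.exp(-tp / rise)

PEAK = pulse_shape(np.arange(N_SAMPLES, dtype=float)).max()

def template(shift_samples):
    """Unit-height template shifted by `shift_samples` after the trigger."""
    t = np.arange(N_SAMPLES, dtype=float) - PRE_TRIGGER - shift_samples
    return pulse_shape(t) / PEAK

def pileup_trace(a1, a2, dt_ch, polarity, rng):
    """Triggered pulse plus pulse on the tail plus white noise of width 1/FOM."""
    shift = int(round(dt_ch / DT))
    noise = rng.standard_normal(N_SAMPLES) / FOM
    return a1 * template(0) + polarity * a2 * template(shift) + noise

rng = np.random.default_rng(4)
trace = pileup_trace(1.0, 1.0, 0.55e-3, +1, rng)
```

With these window settings a second pulse arriving later than about \({1.57}\,\hbox {ms}\) leaves the trace untouched, since the shifted template is zero over the whole window.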

For a time difference larger than \(\varDelta t_\mathrm {ch} \simeq {1.57}\,\hbox {ms}\), the rising edge of the subsequent pulse lies outside the time window and thus the pulse shape of the initial event is not affected. For this reason, the distribution of \(\varDelta t_\mathrm {ch}\) is truncated at \(\varDelta t_\mathrm {ch} ={10}\,\hbox {ms}\) for the simulation.

In total, \(10^7\) events are generated. The number of \(\mathrm {PIT}\) events simulated with the truncated range of \(\varDelta t_\mathrm {ch}\) is equivalent to the number of \(\mathrm {PIT}\) events from \(1.01\cdot 10^9\) events if no truncation were applied. The number of simulated undisturbed \(^{163}\mathrm {Ho}\) events with \({1.57}\,\hbox {ms}\le \varDelta t_\mathrm {ch} \le {10}\,\hbox {ms}\) is \(8.42\cdot 10^6\). Note that \(\mathrm {POT}\) is not considered in this simulation, as these events are already sorted out by the holdoff subfilter as discussed in Sect. 3.1.1.
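Both numbers follow from the exponential distribution of \(\varDelta t_\mathrm {ch}\). The sketch below computes the expected values and shows one way to draw from the truncated distribution by inverse-transform sampling on a \({128}\,\hbox {ns}\) grid; the sampler and its name `sample_dt` are illustrative assumptions, not the simulation code used for the paper. The computed values are consistent with those quoted in the text.

```python
import math
import numpy as np

ACTIVITY = 1.0        # Bq, as in the simulation
T_TRUNC = 10e-3       # s, truncation of the time-difference distribution
T_WINDOW = 1.57e-3    # s, beyond this the second pulse is outside the window
N_SIM = 10**7

# Probability that an untruncated time difference falls below the truncation.
p_trunc = 1.0 - math.exp(-ACTIVITY * T_TRUNC)
n_equivalent = N_SIM / p_trunc
# Expected fraction of simulated events with the second pulse outside the window.
frac_undisturbed = (
    math.exp(-ACTIVITY * T_WINDOW) - math.exp(-ACTIVITY * T_TRUNC)
) / p_trunc
n_undisturbed = N_SIM * frac_undisturbed

def sample_dt(n, rng, step=128e-9):
    """Inverse-transform sampling of the truncated exponential on a time grid."""
    u = rng.random(n)
    dt = -np.log1p(-u * p_trunc) / ACTIVITY
    return np.round(dt / step) * step

dts = sample_dt(100_000, np.random.default_rng(2))
```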

4.2.2 Analysis of pileup-on-tail with both signals inside the time window

A template fit as described in Sect. 3.2.2 is applied to the generated traces, thereby obtaining the fit amplitude and \(\chi ^2_\mathrm {red}\) for each simulated event. The scatter plot of these fit parameters is shown in Fig. 9a. For \(\chi ^2_\mathrm {red} \sim 1\), the line structure of the \(^{163}\mathrm {Ho}\) spectrum is apparent with high abundances for fit amplitudes of \(\sim 1.0\) (MI-line), \(\sim 0.9\) (MII-line), \(\sim 0.2\) (NI-line), \(\sim 0.16\) (NII-line) and \(\sim 0.025\) (O-lines). For larger values of \(\chi ^2_\mathrm {red}\), arc-shaped structures that are centred around those amplitudes can be found. Three distinct groups of arc-shaped structures can be identified, culminating at \(\chi ^2_\mathrm {red} \sim 10\), \(\chi ^2_\mathrm {red} \sim 400\) and \(\chi ^2_\mathrm {red} \sim 10,\!000\). One finds that these groups correspond to \(\mathrm {PIT}\) with amplitude \(A_2\) corresponding to a \(^{163}\mathrm {Ho}\) event from the O-lines, N-lines and M-lines respectively, while the shift of the arc-shaped structures along the x-axis depends on the initial amplitude \(A_1\) of the pileup event.

The structures can further be understood when considering the influence of the time difference between the pulses \(\varDelta t_\mathrm {ch}\). For illustrative purposes, the path of decreasing \(\varDelta t_\mathrm {ch}\) for fixed amplitudes \(A_1=A_2=1\) and relative sign of the polarities of \(\varPi =+1\) is indicated in Fig. 9a. For \(\varDelta t_\mathrm {ch} < {1.57}\,\hbox {ms}\), the value of \(\chi ^2_\mathrm {red}\) increases with decreasing \(\varDelta t_\mathrm {ch}\) and, down to \(\varDelta t_\mathrm {ch} \sim {0.55}\,\hbox {ms}\), the true amplitude of the triggered pulse is underestimated by an increasing amount. At \(\varDelta t_\mathrm {ch} \sim {0.55}\,\hbox {ms}\), the largest value of \(\chi ^2_\mathrm {red}\) is reached. For further decreasing \(\varDelta t_\mathrm {ch}\), the fit amplitude increases while \(\chi ^2_\mathrm {red}\) decreases, down to the point where \(\varDelta t_\mathrm {ch} ={0}\,\hbox {ms}\). Here, a fit amplitude of \(A_1+\varPi A_2\simeq 2.0\) and \(\chi ^2_\mathrm {red} \sim 1\) is reached, as expected for unresolved pileup of the two pulses. A corresponding mirrored structure arises from the same amplitudes with opposite relative sign \(\varPi =-1\).

Fig. 10

Upper panel: histogram of true amplitude \(A_1\) of the simulated pileup events that are discarded by the pulse shape filter compared to the histogram of all simulated true amplitudes \(A_1\). Lower panel: ratio of the two histograms and the corresponding uncertainty bands due to the Poisson error of the number of counts in each bin. The ratio agrees well with a constant fit, which indicates that \(\mathrm {PIT}\) events are discarded by a pulse shape filter in an energy-independent way

A simplified pulse shape filter that selects fitted traces with \(\chi ^2_\mathrm {red} <1.3\) is used. Applying this filter to the simulated fit amplitudes yields a fairly clean theoretical \(^{163}\mathrm {Ho}\) spectrum (see Fig. 9b upper panel orange), apart from a few outliers with fit amplitudes above 1.5 and below 0 that will be discussed in Sect. 4.2.3. The region around the turning point in Fig. 9a at \(\varDelta t_\mathrm {ch} \sim {0.55}\,\hbox {ms}\) is densely populated. This increase in density gives rise to a spiky structure in the histogram of the fit amplitudes of traces that are discarded by the pulse shape filter (see Fig. 9b lower panel). By comparing the scatter plot (Fig. 9a) with the histogram (Fig. 9b lower panel) one can associate the peaks at fit amplitudes of \(\sim 0.9\) and \(\sim 1.1\) to \(\mathrm {PIT}\) of an MI-pulse (\(A_1\)) with an NI-pulse (\(A_2\)) for the two possible values of \(\varPi \). In the same way, the peaks at fit amplitudes \(\sim 0.7\) and \(\sim 1.3\) correspond to \(\mathrm {PIT}\) of two MI-pulses. Similar structures can be found centred around each line of the \(^{163}\mathrm {Ho}\) spectrum.

Fig. 11

\(\chi ^2_\mathrm {red}\) as a function of \(\varDelta t_\mathrm {ch}\) for simulated \(\mathrm {PIT}\) events. The value of \(\chi ^2_\mathrm {red}\) only depends on the amplitude of the pulse on the tail \(A_2\) and on \(\varDelta t_\mathrm {ch}\), but not on \(A_1\). The magnification shows that the value of \(\varDelta t_\mathrm {ch}\) below which \(\mathrm {PIT}\) events satisfy \(\chi ^2_\mathrm {red} <1.3\) depends on \(A_2\). Note the logarithmic scale of the x-axis of the inset

For the two-day dataset acquired with an ECHo-1k chip, the fit amplitude vs. \(\chi ^2_\mathrm {red}\) scatter plot of one detector channel (Fig. 9c) and the histogram of fit amplitudes discarded by a pulse shape filter for 18 detector channels (Fig. 9d lower panel) show structures that have striking similarities to the ones found in the simulated data. For better comparison, the same simplified pulse shape filter applied to the simulated data is also applied to the ECHo-1k dataset. The arc-shaped structures described above become apparent in the scatter plot for the data after applying the time information filter. These in turn result in a similar structure of the histogram of fit amplitudes of traces discarded by the pulse shape filter. The most apparent difference between the histograms in Fig. 9b, d is the larger fraction of \(\mathrm {PIT}\) events which arises from the truncated \(\varDelta t_\mathrm {ch}\) distribution used for the simulation. Furthermore, one can observe an asymmetry of spike pairs (e.g. fit amplitude of 0.9 and 1.1) in the histogram of the acquired data. For implanted detector channels, the activity in the two pixels is not identical, which leads to \(P(\varPi = +1)\ge P(\varPi = -1)\). The probability that two consecutive triggers have the same polarity becomes larger for an increasing asymmetry of activity of the pixels. The asymmetry is maximal for detector channels that have only one implanted pixel, for which \(P(\varPi = +1) = 1\) and \(P(\varPi = -1)=0\).

4.2.3 Energy dependence of a pulse shape filter

Fig. 12

Upper panel: scatter plot of fit amplitude and true amplitude \(A_1\) of each simulated unresolved pileup event. Most events are distributed around the straight line of slope 1 through the origin. Additionally, straight lines of slope 1 shifted by \(\pm 0.2\) and \(\pm 1\) are drawn to guide the eye. Lower panel: histogram of the fit amplitudes of the events above. The autoconvolution of the theoretical \(^{163}\mathrm {Ho}\) spectrum for a pileup fraction of \(f_\mathrm {pu} =3\,\times \,10^{-6}\) is superimposed for comparison. In the simulated spectrum, structures from \(\mathrm {PIT}\) with large amplitudes on the tail (e.g. at fit amplitude \(\sim 2.0\)) are more than an order of magnitude smaller than in the autoconvolution spectrum. In turn, unresolved pileup with barely altered fit amplitudes is more abundant in the simulated spectrum

In order to assess the energy dependence of a pulse shape filter, the histogram of true amplitudes \(A_1\) of traces that are discarded by the filter is compared to the theoretical spectrum (Fig. 10). The ratio of the two histograms is shown in the lower panel together with the \(1\sigma \) and \(2\sigma \) error bands due to the Poisson error of the number of counts in each bin. A constant is fit to the ratio and, apart from one deviation of \(-2.20 \sigma \) at an amplitude \(A_1=1.06\), all ratios agree with the fit within the \(2\sigma \) band. From this we can conclude that \(\mathrm {PIT}\) events are discarded by a pulse shape filter in a fairly energy independent way.
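The ratio test can be sketched as a weighted constant fit. The code below treats the two histograms as independent Poisson counts when computing the per-bin uncertainty, which is a simplifying assumption (the discarded events are a subset of all events); the function name and the toy counts are illustrative.

```python
import numpy as np

def ratio_with_constant_fit(n_discarded, n_all):
    """Per-bin ratio, its Poisson error, the weighted-mean constant and pulls."""
    mask = (n_discarded > 0) & (n_all > 0)          # skip empty bins
    d = n_discarded[mask].astype(float)
    a = n_all[mask].astype(float)
    ratio = d / a
    sigma = ratio * np.sqrt(1.0 / d + 1.0 / a)
    weights = 1.0 / sigma**2
    const = np.sum(weights * ratio) / np.sum(weights)
    pulls = (ratio - const) / sigma                 # deviation in units of sigma
    return ratio, sigma, const, pulls

# Toy histograms (hypothetical): an energy-independent 10 % discard probability.
rng = np.random.default_rng(5)
n_all = rng.poisson(20000, size=40)
n_discarded = rng.binomial(n_all, 0.1)

ratio, sigma, const, pulls = ratio_with_constant_fit(n_discarded, n_all)
```

An energy-dependent filter would show up as a systematic trend of the pulls with amplitude rather than as scatter around zero.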

On a subdominant level, an energy-dependent distortion of the final spectrum arises from unresolved pileup, which in this context are pileup events that survive the pulse shape filter. In Fig. 11, \(\chi ^2_\mathrm {red}\) is plotted as a function of \(\varDelta t_\mathrm {ch}\). The data points are coloured according to the amplitude of the pulse on the tail \(A_2\). Again, it becomes apparent that \(\chi ^2_\mathrm {red}\) only depends on \(A_2\) and \(\varDelta t_\mathrm {ch}\), but not on \(A_1\). The horizontal bands indicated in Fig. 9a correspond to the location of the plateaus of \(\chi ^2_\mathrm {red}\) found for \(\varDelta t_\mathrm {ch}\) between \(\sim {0.25}\,\hbox {ms}\) and \(\sim {1.25}\,\hbox {ms}\). For smaller \(\varDelta t_\mathrm {ch}\), the value of \(\chi ^2_\mathrm {red}\) steeply decreases towards \(\chi ^2_\mathrm {red} =1\), as expected for \(\varDelta t_\mathrm {ch} ={0}\,\hbox {ms}\). The inset in Fig. 11 shows that the value of \(\varDelta t_\mathrm {ch}\) for which the events fulfil \(\chi ^2_\mathrm {red} < 1.3\) is larger the smaller the amplitude \(A_2\). These traces are considered good \(^{163}\mathrm {Ho}\) traces by a pulse shape filter and thus correspond to unresolved pileup. For \(0.015< A_2 < 0.050\), i.e. OI-pulses on the tail, pileup is not recognised by the pulse shape filter for \(\varDelta t_\mathrm {ch} \lesssim {10}\,\upmu \hbox {s}\) while for \(0.950< A_2 < 1.500\), i.e. MI-pulses and higher on the tail, the time resolution for pileup is of the order of \(\varDelta t_\mathrm {ch} \sim {100}\,\hbox {ns}\),Footnote 8 which is of the same order as the time difference between two samples of a trace of \({128}\,\hbox {ns}\). This energy-dependent characteristic has the effect that the unresolved pileup spectrum does not simply correspond to the autoconvolution of the \(^{163}\mathrm {Ho}\) spectrum as one would naively expect. Rather, we can infer from Fig. 
11 that the majority of unresolved pileup traces will feature small amplitudes \(A_2\) and thus have a fit amplitude that deviates only slightly from their true amplitude. As a result, the acquired spectrum is only weakly distorted, mainly by means of a slight broadening of the resonances. The spectrum of the reconstructed amplitudes of unresolved pileup events is shown in the lower panel of Fig. 12. In the upper panel, a corresponding scatter plot of fit amplitude vs. true amplitude \(A_1\) is presented. As expected, the majority of events are distributed near a straight line with unit slope through the origin. These data points correspond to barely distorted traces from O-line pulses on the tail. Further accumulations can be found on the diagonals shifted by \(\sim \pm 0.2\) (NI-pulse on the tail with \(\varPi =+1\) (+) and \(\varPi =-1\) (-)) and \(\sim \pm 1\) (MI-pulse on the tail). It can be seen that the outliers mentioned in Sect. 4.2.2 with fit amplitudes above 1.5 and below 0 are concentrated near those shifted diagonals. The shape of the unresolved pileup spectrum is well understood and in particular no structure near the endpoint of the \(^{163}\mathrm {Ho}\) spectrum emerges. A total of 2801 unresolved pileup traces are found. The fraction of unresolved pileup for this simulation is \(f_\mathrm {pu} = {2.77\,\times \,10^{-6}}\), considering that the number of simulated \(\mathrm {PIT}\) events is equivalent to the number of \(\mathrm {PIT}\) events from \(1.01 \cdot 10^9\) events without truncating \(\varDelta t_\mathrm {ch}\). For comparison, the autoconvolution of the theoretical \(^{163}\mathrm {Ho}\) spectrum for a pileup fraction \(f_\mathrm {pu} = {3\,\times \,10^{-6}}\) is superimposed in the lower panel of Fig. 12. Unresolved pileup events with an OI-pulse on the tail have similar rates in both spectra. However, structures with larger amplitudes on the tail are reduced by more than an order of magnitude, while those with barely altered fit amplitudes have a higher rate in the simulated spectrum.
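The pileup fraction and the naive autoconvolution expectation can be reproduced schematically. The two-line toy spectrum below is a placeholder for the theoretical \(^{163}\mathrm {Ho}\) spectrum, and the convolution only covers pulses of equal polarity (\(\varPi =+1\)); both simplifications are assumptions of this sketch.

```python
import numpy as np

# Unresolved pileup fraction from the numbers quoted in the text.
n_unresolved = 2801
n_equivalent = 1.01e9
f_pu = n_unresolved / n_equivalent

def autoconvolution_pileup(spectrum, fraction):
    """Naive pileup spectrum: `fraction` times the autoconvolution of the pdf.

    Assumes uniform bins starting at zero amplitude, so that output bin k
    corresponds to the sum of two amplitudes adding up to k bin widths.
    """
    pdf = spectrum / spectrum.sum()
    return fraction * np.convolve(pdf, pdf)

# Toy spectrum (hypothetical): lines at amplitudes 0.2 and 1.0, bin width 0.01.
toy = np.zeros(151)
toy[20] = 3.0    # stronger low-amplitude line
toy[100] = 1.0   # weaker high-amplitude line
pu = autoconvolution_pileup(toy, f_pu)
```

In this naive picture the sum peak of two high-amplitude pulses (bin 200 in the toy example) is weighted purely by the squared line intensity, whereas the simulation shows that such events are suppressed because they are recognised by the pulse shape filter down to much smaller time differences.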

This simulation is representative of the unresolved pileup expected in the high statistics spectrum of ECHo-1k.

5 Conclusions and outlook

In the ECHo-1k high statistics measurement, 58 MMC pixels, each loaded with an average of about \({0.5}\,\hbox {Bq}\) of \(^{163}\mathrm {Ho}\), have been operated over several months in order to acquire more than \(10^8\) \(^{163}\mathrm {Ho}\) events. This will allow the effective electron neutrino mass to be tested at a level of about \({20}\,\hbox {eV}\). To reach this sensitivity, a new data reduction scheme has been developed. The aim of this scheme is to efficiently remove signals which could act as a background for the \(^{163}\mathrm {Ho}\) spectrum without sacrificing large fractions of \(^{163}\mathrm {Ho}\) events, and to precisely characterise any energy dependence of the filters.

We present a two-level data reduction scheme to obtain a clean signal from data acquired with ECHo-1k chips. The first level filter is purely based on the time information of traces. It is thus inherently energy independent. On a second level, the filtered data are further analysed by means of their deviation from a template pulse. The minor energy dependence due to unresolved pileup is well understood and can be modelled in an analysis of the \(^{163}\mathrm {Ho}\) spectrum. All implemented algorithms are designed such that they can be applied online.

After the data has been filtered by the two-level data reduction scheme, the recovered amplitudes are corrected for temperature fluctuations of the entire setup. The energies of the events are then obtained by identifying the major resonances of the \(^{163}\mathrm {Ho}\) spectrum and fitting their positions to the previously measured values with a polynomial function.

The methods discussed here can be adapted to be used for the next stages of the ECHo experiment. Future efforts will be directed towards resolving the energy of the first pulse in a \(\mathrm {PIT}\) to maximise the signal yield of the second level filter. This is particularly important for a higher implanted activity per pixel, as envisaged in future phases of the ECHo experiment.