Pulse shape analysis in Gerda Phase II

The GERmanium Detector Array (Gerda) collaboration searched for neutrinoless double-\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\beta $$\end{document}β decay in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{76}$$\end{document}76Ge using isotopically enriched high purity germanium detectors at the Laboratori Nazionali del Gran Sasso of INFN. After Phase I (2011–2013), the experiment benefited from several upgrades, including an additional active veto based on LAr instrumentation and a significant increase of mass by point-contact germanium detectors that improved the half-life sensitivity of Phase II (2015–2019) by an order of magnitude. At the core of the background mitigation strategy, the analysis of the time profile of individual pulses provides a powerful topological discrimination of signal-like and background-like events. Data from regular \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{228}$$\end{document}228Th calibrations and physics data were both considered in the evaluation of the pulse shape discrimination performance. In this work, we describe the various methods applied to the data collected in Gerda Phase II corresponding to an exposure of 103.7 kg year. These methods suppress the background by a factor of about 5 in the region of interest around \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Q_{\beta \beta }= 2039$$\end{document}Qββ=2039 keV, while preserving \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(81\pm 3)$$\end{document}(81±3)% of the signal. In addition, an exhaustive list of parameters is provided which were used in the final data analysis.


Introduction
Neutrinoless double-β (0νββ) decay is a hypothetical process in which two neutrons in a nucleus are transformed simultaneously into two protons with the emission of only two electrons. Such a process violates lepton number conservation and requires the neutrino to be its own antiparticle (Majorana particle). In combination with cosmological observations and direct neutrino mass measurement, a non-zero 0νββ decay rate would highly constrain the standard light, left-handed neutrino exchange mechanism via the effective Majorana neutrino mass or shed light on alternative processes [1].
The highest half-life sensitivity to 0νββ decay requires the experiments to achieve large target mass, high detection efficiency, good energy resolution and most complete elimination of background at the Q-value of the decay (Q ββ ). The goal of the GERmanium Detector Array (Gerda) experiment was to realize a background-free 1 experiment for the first time. Gerda was located at the underground Laboratori Nazionali del Gran Sasso (LNGS) of INFN, Italy. Gerda used up to 43 kg of high purity germanium (HPGe) detectors enriched in the candidate isotope 76 Ge up to 88%. They g Present address: Physik Department, Technische Universität München, Munich, Germany E. Bellotti: deceased. 1 Number of expected background events at full exposure in the region of interest below 1. ensure high detection efficiency, low intrinsic background and excellent energy resolution. The bare HPGe detectors were operated in liquid argon (LAr), which served as cooling medium and as active shield against environmental backgrounds at Q ββ = 2039 keV. The details of the experimental setup and its upgrade from Phase I to Phase II can be found elsewhere [2,3].
The Phase II data taking took place between December 2015 and November 2019 with an upgrade of the detector array and the surrounding LAr instrumentation in 2018. Gerda operated three types of enriched HPGe detectors arranged in an array of 7 strings: 7 semi-coaxial detectors, referred to as coaxial detectors for brevity, from Phase I with a total mass of 15.6 kg; 30 Broad Energy Germanium (BEGe) detectors (20.0 kg) [4,5]; and 5 inverted coaxial (IC) detectors (9.6 kg) [6,7], which were installed in summer 2018. The accumulated exposure, product of total detector mass and respective livetime, amounts to 103.7 kg year for Phase II. In order to avoid bias in the event selection criteria, Gerda followed a strict blinding strategy, where events within a Q ββ ± 25 keV energy window were processed only after the analysis had been finalized.
In 0νββ decay, the two electrons deposit their energy in a small volume (about 1 mm 3 [8]) of a germanium detector producing single-site events (SSEs). On the other hand, background events consisting of γ rays from natural radioactivity interact mostly via Compton scattering producing events with multiple separated energy depositions (multi-site events, MSEs). Therefore, events with energy depositions in multiple germanium detectors or in the LAr volume around the detector array are discarded as background events. The time structure of the germanium signals is used to identify MSEs in a single detector and additionally recognize events close to the detector surface, together referred to as pulse shape discrimination (PSD). In this paper, we present the PSD techniques applied to the Gerda Phase II data that allow for a background-free operation of the experiment in combination with the LAr veto [9]. The PSD methods have been improved and extended compared to Phase I [10]. They were applied consistently to the complete Phase II dataset which was split in two parts, one before and the other after the 2018 upgrade, to account for changes in the readout electronics (cables and cross-talk). Alternative methods, e.g. applied to parts of the data in intermediate releases [9,11], are summarized in the Appendix.

Signal formation, readout and processing in germanium detectors
Gerda used p-type HPGe diodes of three different geometries called coaxial, BEGe and inverted coaxial (Fig. 1). They all have a relatively thick ([0.8-2.6] mm) n + elec-trode, formed by lithium diffusion, and a thin (∼300 nm) p + electrode created by boron implantation. The p + and n + electrodes are separated by a groove with non-conducting surface. Because of their small p + electrode the BEGe and IC detectors belong to the class of point-contact detectors [12] which exhibits intrinsic performance advantages with respect to energy resolution and pulse shape discrimination. The Ge detectors are operated under reverse bias voltage such that almost the entire volume is depleted of free charge carriers. An interaction in the active volume creates a number of electron-hole pairs proportional to the deposited energy. The charge carriers drift according to the electric field created by both the positive bias voltage applied to the n + contact and the volume charge density due to the net bulk impurity concentration. The electrons are collected at the n + contact, the holes at the p + contact which is used for readout. The n + layer covers most of the crystal surface and features a large Li concentration. Exhibiting zero electric field beyond the p-n junction till the outer surface, a charge created in this layer will only experience thermal diffusion with two possible outcomes: recombine (loss) or reach the depleted volume (collection). As a result, two generic regions can be identified, the dead layer with high probability of no charge collection and a transition layer with partial charge collection.
During the drift of charge carriers, charge is induced on the readout contact as described by the Shockley-Ramo theorem [13]: where Q 0 is the total charge carried by the holes or electrons and w (r h/e (t)) is the weighting potential along the drift path of holes or electrons. The weighting potential is shown in Fig. 1 for the three different detector geometries used in Gerda also indicating the geometry of the p + and n + surfaces. The weighting potentials have been calculated using the AGATA Detector Library [14] pulse shape simulation package.
Due to their small p + contact, BEGe and IC detectors have a weighting potential distribution that is very small in most of the volume and sizable only close to the p + contact. This results in similar waveforms Q(t) from interactions in a large part of the volume. Multiple energy deposits can be treated as a superposition of single interactions. In Fig. 2, normalized example pulses from a BEGe detector are shown for a SSE, a MSE, an event close to the p + contact and an event near the dead layer of the n + contact with incomplete charge collection. As shown, surface events produce characteristic pulse shapes being fast close to the p + contact due to the strong electric field and slow near the n + contact due to the weak electric field and the transition layer.
In coaxial detectors, both the hole and electron drift play a role in the pulse formation, which result in different pulse shapes throughout the volume of the detector. In Fig. 3, sim-ulated example pulses from different parts of the detector are shown. Similarly to BEGe detectors, coaxial detectors also show special pulse shape characteristics in case of surface events. Indeed, energy deposits near the groove or the bottom of the borehole cause faster pulses.
The signals induced on the p + contact of the Gerda detectors are read out by charge sensitive amplifiers located in the LAr about 35 cm above the detector array. The signals are digitized at 25 MHz for 160 µs and at 100 MHz for 10 µs. Both traces are centered at the rising edge of the charge pulse Q(t). The offline analysis of the digitized signals follows the procedures described in [15,16]. The 25 MHz traces used for the energy reconstruction ensure the excellent energy resolution, while the 100 MHz traces are used in the PSD methods presented in the following sections. The energy estimator E is reconstructed with a zero-area cusp filter [17], whose parameters are optimized for each detector and calibration run.

Overview of event samples and discrimination methods
Weekly calibration runs with 228 Th sources are performed to determine the energy scale and resolution of the detectors [18] and to calibrate and train the PSD techniques. Figure 4 shows a calibration spectrum highlighting the different event samples used in pulse shape analysis. The most prominent feature is the full energy peak (FEP) at 2615 keV from 208 Tl decay. Its double escape peak (DEP) at 1593 keV is used as a sample of SSEs as the electron and positron from pair production deposit their energy in a small volume and both annihilation γ rays leave the detector. The FEP at 1621 keV from 212 Bi is used as a sample of MSEs that is sufficiently near in energy to the DEP in order to avoid noise dependent biases. To test the performance of MSE rejection, the FEP and single escape peak (SEP) of 208 Tl, mostly featuring MSEs are used while the Compton continuum region around Q ββ ± 35 keV (CC(Q ββ )) serves to estimate the background rejection in the 0νββ decay signal region. In physics data, the standard neutrino accompanied double-β (2νββ) decay provides another sample of signallike events that is equally distributed in the whole detector volume and used for the investigations in the following. After applying the LAr veto cut [3], about 97% of the events in the 1000-1300 keV region originate from 2νββ decays. Beside MSEs from γ rays, the physics data have a significant amount of surface events from α and β decays that can be discriminated thanks to their specific pulse shape. The physics spectrum at low energies is dominated by β decays of 39 Ar up to its Q-value of 565 keV. However, these events are not used in the pulse shape analysis due to their relatively high noise. A prominent background source at Q ββ is the β decay of 42 K, which is produced as a progeny of the long-lived 42 Ar and has a Q-value of 3525 keV. Beta particles deposit their energy in germanium within a few mm resulting in events partly in the dead and transition layers of the n + surface. Such n + surface events can induce slow pulse shapes with incomplete charge collection. Apart from possible HPGe bulk contamination, that have been shown to be insignificant [19,20], α particles can only reach the active volume of the detector at the thin p + contact or at the non-conducting groove producing pulse shapes with fast rise as shown in Figs. 2 and 3. A clean sample of α surface events is found in physics data above the 42 K Q-value. The most prominent structure at these energies is a broad peak at 5304 keV, the α energy of the 210 Po α decay ( 238 U decay chain) [19,21].
Due to their different geometries, BEGe and IC detectors are treated separately from coaxial detectors in the pulse shape analysis. In the case of the BEGe and IC detectors one parameter, A/E, is used to classify background events, where A is the maximum current amplitude as indicated in Fig. 2 and E is the energy. As MSEs and surface events at the n + contact are characterized by longer, i.e. wider current pulses, they feature a lower A/E value compared to SSEs, while surface events at the p + contact show a higher A/E value [22]. Therefore, rejecting events on both sides of the A/E distribution of SSEs enhances the signal to background ratio. The details of the A/E analysis are presented in Sect. 4. In the case of coaxial detectors an artificial neural network (ANN [23]) is used to discriminate SSEs from MSEs similar to the approach applied in Phase I [10]. To discard events close to the p + contact, a dedicated cut on the risetime of the pulses is applied. The training and optimisation of the ANN and risetime cuts for coaxial detectors is described in Sect. 5. An additional cut is applied to all detectors to remove events with slow or incomplete charge collection. These events arise from energy depositions in a non-depleted volume (n + layer or insulating groove). These events are identified through the difference between two energy estimates performed using the same digital filter but different shaping times as summarized in Sect. 6. The signal efficiency of 0νββ decay is estimated using the survival fraction of DEP and 2νββ decay events by taking into account the energy dependence of the different PSD techniques. The details of this extrapolation to Q ββ including systematic uncertainties are found in Sects. 7 and 1.

The A/E method for BEGe and IC detectors
The amplitude of the current pulse A is computed after applying 3 times a moving window average (MWA) filter with 50 ns length and interpolating the pulse down to 1 ns sampling time. This filtering procedure optimizes the high frequency noise attenuation while preserving the pulse shape information. The energy estimator E is determined by a pseudo Gaussian filter with a shaping time of 10 µs. A is then divided by E before calibration, providing the raw A/E for each pulse. The raw A/E is then corrected for time stability and energy dependence before a cut value is defined.
For each calibration run, the A/E distribution of events in the 1000-1300 keV region is fitted with a Gaussian (SSEs) and a low-side tail (MSEs) as described in [10]. The position of the Gaussian μ A/E from each calibration of the four years of data taking is used to define stable time periods where the raw A/E changes by less than its σ A/E resolution. Instabilities are mostly related to hardware changes and a few detectors show a small systematic drift of the raw A/E. Physics data between the stable periods are removed from the analysis causing a few percent exposure loss. After normalizing the raw A/E by the average μ A/E within a given time period, the data of all calibrations are merged, only separating before and after the upgrade of 2018.
The A/E of SSEs in the Compton continuum of the merged calibration data show a small linear energy dependence of a few percent per MeV, due to the larger charge cloud size at higher energies that broadens the current pulse. The energy dependence of μ A/E (E) and σ A/E (E) is described by a linear and a b + c/E 2 type of function, respectively, as shown in Fig. 5 for a BEGe (GD61A) and an inverted coaxial (IC74A) detector as examples. In addition to the correction for the energy dependence, A/E is normalized to the mean of the A/E distribution of the DEP, which lies about 0.25% above the SSE band.
The cut values for each detector are determined on the energy-independent A/E classifier defined as ζ = ( Its distribution is centered around zero and has a standard deviation of one for SSEs. The low-side A/E cut against MSEs and n + surface events is chosen to yield a DEP survival fraction of 90%. The resulting cut values range from ζ = −1.9 to −1.2 and from ζ = −1.9 to −1.7 for BEGe and IC detectors, respectively. The highside A/E cut against p + surface events is chosen at ζ = 3.0 for each detector, in order to reject all α events in physics data above 3525 keV. It has been shown that the high-side A/E cut discards events, including degraded α events, from a small volume around the p + contact [24] causing the survival fraction of events after the high-side A/E cut to be proportional to the detector mass. Figure 6 shows for BEGe and IC detectors the energy distribution of calibration data before and after the A/E cuts described above as well the corresponding survival fractions. As expected the survival fraction of events in the FEPs and SEP is much lower than in the DEP. Events in the Compton continuum are discarded with about the same probability independent of their energy but depending on the overall detector size, and more generally speaking from the detector type. IC detectors discard a higher fraction of events because    The survival fractions of events in the DEP, FEP and CC(Q ββ ) are shown in Fig. 7 for each detector before and after the upgrade. The DEP survival fractions are slightly smaller than 90% due to the high A/E cut. The rejection of MSEs shows a small dependence on the detector position in the string because of different electronic noise conditions. This effect was reduced after the upgrade. The IC detectors (detector numbers above 35) reject MSEs more efficiently than BEGe detectors.
In order to check the validity of the A/E corrections and cuts in the whole dataset, survival fractions of the usual event samples are studied for each calibration run. The average survival fraction from all BEGe detectors is shown for each calibration in Fig. 8 as a function of time. The A/E cut shows, for both BEGe and IC detector types, a stable behaviour at  Table 3 for detector numbers and types). Open (filled) symbols show the calibration dataset before (after) the upgrade. The dashed lines separate the detector strings in the array. The uncertainties are only statistical and smaller than the markers The raw A/E is corrected in the same way for physics data and the same cut values are applied as in calibration Fig. 9 A/E classifier distributions after LAr veto comparing events in the DEP from calibration data and 2νββ decay events from physics data from BEGe (left) and IC (right) detectors. The histograms are normalized to their integrals data. In order to check the validity of the corrections and the cut values, Fig. 9 shows a comparison of the A/E classifier between DEP events from calibration data and 2νββ events from physics data for BEGe and IC detectors. By construction, the A/E classifier peaks at 0 and has a width of about 1 for these SSEs. The agreement between physics and calibration data is satisfactory and confirms the applied correction procedure. For each detector, the residual difference of the average A/E, between calibration and physics data, is included in the systematic uncertainties by shifting the cut value accordingly.
The energy distributions of events before and after the A/E cut are shown in Fig. 10 for the whole physics data corresponding to 61.8 kg year exposure from BEGe (53.3 kg year) and IC (8.6 kg year) detectors. Events in the 2νββ decay region survive the cut with a high probability, while the 40 K and 42 K peaks at 1461 keV and 1525 keV, respectively, mostly featuring MSEs, are reduced significantly. High energy events above 3525 keV coming from p + surface events are all discarded by the high A/E cut by definition. In a 240 keV wide window around Q ββ , only 7 events in BEGe detectors and 1 event in IC detectors survive the LAr veto and A/E cuts. This results in the corresponding unique background indices 2 of Gerda in the region of interest of 5.5 +2.4 −1.8 · 10 −4 counts/(keV kg year) and 4.9 +7.3 −3.4 · 10 −4 counts/(keV kg year) for BEGe and IC detectors, respectively.

The ANN and risetime methods for coaxial detectors
The 100 MHz waveform traces are used to compute the artificial neural network input variables (IVs) and the risetime. They are first filtered with a MWA of 30 ns width three times. 2 The background index is evaluated in the range between 1930 and 2190 keV without the two intervals (2104 ± 5) and (2119 ± 5) keV due to known γ rays and without the signal interval (Q ββ ± 5) keV [26]. The quoted uncertainties are statistical only. They are subsequently baseline subtracted and normalized between 0 and 1 by the amplitude of the 25 MHz pole-zero corrected traces. This amplitude is provided by a trapezoidal filter with a typical precision of 0.2% at 2 MeV.
The ANN IVs are a list of the 50 times at which the resulting waveform reaches [1%, 3%, . . ., 99%] of its amplitude (see Fig. 11). An interpolation of the 10 ns wide gaps between data points allows for a more precise estimation. These IVs are computed for all physics and calibration events found in Gerda. However, given the degraded signal-to-noise ratio at low energy, only events above 1000 keV are considered in this pulse shape analysis to avoid loosely constrained energy dependence correction. Calibration runs are used to optimize the discrimination of SSEs from MSEs. The network is built on two hidden layers with 50 and 51 neurons, using the TMVA-MLPBFGS algorithm [23]. Figure 12 shows as an example the classifier distributions from the ANN training of the ANG5 detector with the events from the indicated DEP and FEP peaks. The distributions from the Compton events under the peaks are statistically subtracted using the distributions of the events in the energy side-bands of the peaks. The lower and upper side-bands are defined by selecting events falling in the [−9σ ,−4.5σ ] and [4.5σ ,9σ ] energy regions where 2.355 · σ is the full width at half maximum used to quantify the energy resolution of the Ge detectors. The indicated ANN cut keeps 90% of the events in the DEP peak.
Due to the significant change in hardware, the data taken before and after the 2018 upgrade periods have been trained separately. Finer splittings of the data have yielded signal The risetime is estimated after interpolating the waveform with a 1 ns time step. From studies on the rejection of α particles [25], the [10-90%] amplitude signal risetime was selected (see Fig. 11). This parameter is used to reject α events on the p + contact that develop faster signals (see Fig. 3). The cut definition relies on the maximization of the following figure of merit: where ε 2νββ (x) is the 2νββ survival fraction at risetime cut x and ε α (x) is the corresponding survival fraction of α events.
Only physics data after ANN-MSE and LAr veto, to increase purity of samples, are used for this figure of merit that allows to reject most of the α particles while preserving a high 2νββ decay signal survival fraction. Figure 13 depicts an example of such a cut definition. On average, about 90% of the high energy α particles are rejected. The calibration energy spectrum of all coaxial detectors before and after applying the PSD cuts is shown in Fig. 14. As targeted, the ANN cut removes preferentially the regions highly populated by MSEs (FEPs and SEP in particular) and preserves 90% of the 208 Tl DEP. On the contrary, the risetime cut deployed to reject events with fast risetime is insensitive to these types of events and hence has a high survival fraction for both SSEs and MSEs.
Various survival fractions for all coaxial detectors, after applying both ANN and risetime cuts for the two data taking periods, are depicted in Fig. 15. In addition, the stabilities of these cuts in three calibration energy regions are plotted in Figs. 16 and 17. Apart from a slight improvement of the ANN   rejection of the Compton continuum events at Q ββ after the 2018 upgrade due to an improved signal cable management, an overall 3% level stability in PSD performance is observed over the course of Gerda Phase II. Compared to Gerda Phase I [10], on average a 7% relative worsening is observed, mostly attributed to the different electronics scheme, hence different noise. The risetime cut also shows a very stable behavior during this period.  The result of the pulse shape analysis applied on the full 41.8 kg·yr physics data exposure, after the LAr veto cut, is summarized in Fig. 18. The ANN is preserving 80.5(3)% of the 2νββ decay event sample while removing 62(1)% of the 40 K and 42 K lines. The reduction of the 214 Bi FEPs at 1806 keV and 2204 keV is also visible. However, it is in general inefficient at rejecting fast and degraded α events from about 3500 keV to 5500 keV, as 58.4(8)% of them survive. After applying also the risetime cut, 67.8(4)% of 2νββ decay events remain. The α background is suppressed by a

Events with incomplete charge collection
Events from the n + layer or the groove featuring slow or incomplete charge collection (see Fig. 2) have an uncertainty on energy reconstruction because the ZAC filter [17] is optimized for the FEP resolution with a relatively short integration time. These particular events, especially for coaxial detectors, can survive the ANN and risetime cuts. In order to discard these events with uncertain ZAC energy reconstruction, an additional classification based on energies reconstructed with pseudo Gaussian filters with different integration times is performed. The energy is reconstructed with an integration time of 4 μs (E short ) and 20 μs (E long ). The ratio E short /E long is then normalized to its average observed in Fig. 19 Distribution of the δ E classifier for calibration events in the 208 Tl FEP and in the Compton region around Q ββ as well as for physics events in the 2νββ region for the coaxial ANG2 detector. The histograms are normalized to their integrals events from the 2615 keV line in calibration data. The classifier is defined as: where E is the default ZAC-reconstructed energy for each event. With this definition, the classifier δ E has an average of 0 keV. The E short /E long distribution normalization is performed time dependently in order to account for possible instabilities of the readout electronics: for each calibration run the mean of the fitted Gaussian to the FEP distribution is used as a normalization factor for the following physics events. Figure 19 shows the resulting δ E distributions for calibration events in the 208 Tl FEP and in the CC(Q ββ ) as well as for physics events in the 2νββ region for the ANG2 detector as an example. Events in the Compton continuum have a higher fraction with large negative δ E values. This is due to the higher fraction of pulses with incomplete charge collection that is not present in a peak where the whole energy has been collected. Physics data in the 2νββ region shows a similar Gaussian distribution as calibration data with no significant energy dependence.
A cut value is applied to the lower side of the δ E distribution and it is set for each detector separately 3σ away from zero. 3 The cut value is loose enough that more than 99% of signal events are kept whereas those with uncertain energy (significant difference between energies reconstructed by the ZAC and Gauss algorithms) are rejected.
For the BEGe and IC detectors, most of the 2νββ decay events rejected by the δ E cut are also rejected by the A/E cut. For BEGe detectors out of 3477 events cut by either method only 7 are rejected by the δ E cut (for energies above 1000 keV 12 events in total are cut by δ E). For the IC detectors none of the 521 2νββ decay events is removed by the δ E cut alone. For the coaxial detectors the correlation between ANN, risetime and δ E cuts is weaker. Analyzing the 2νββ decay region for the full Phase II dataset, one gets 4970 events (out of 15,433) removed by either method, 1 event cut by all three methods and 57 by δ E only. 83 events from coaxial detectors with E > 1000 keV are cut by δ E only. Many of these events show slow pulses that could originate from the detector surface but are not cut by ANN or risetime cuts.
The survival fractions for different energy regions of physics and calibration data have been studied for each detector separately. Survival fractions of 2νββ decay events for the different detector types are presented in Table 1 before and after the other PSD cuts. Values for DEP, FEP, CC(Q ββ ) and 2νββ decay events for the two datasets before and after the 2018 upgrade are very close to 100%. The Compton region as well as the 2νββ region show lower acceptance because of the contribution of slow pulses.
For BEGe and IC detectors the impact of the δ E cut is negligible, while for coaxial detectors a small correction of the efficiency has to be taken into account.

PSD detection efficiencies at Q ββ
In the absence of signal proxies at Q ββ in calibration and physics data and due to a not sufficiently accurate modeling of the pulse shape response in simulation, ε PSD is estimated for point contact and coaxial detectors from the extrapolation of the survival fractions of the 208 Tl DEP and of the 2νββ decay events at 1593 keV and 1150 keV to Q ββ (see circles and squares in Fig. 20).
For that purpose, sets of down-scaled waveforms are used to evaluate the energy dependence of the various PSD methods. This is based on the observation that the energy dependence of PSD classifiers is dominated by the electronic noise in Gerda. Two samples from calibration data have been considered, namely 208 Tl DEP and CC (Q ββ ) events (see Fig. 4). The waveform ω (k, E) at energy E is produced by downscaling the waveform ω(k, E S ), of true energy E S , from one of these two samples and superimposing it to a baseline sam- Fig. 20 Extrapolation of reference survival fractions (circles) to Q ββ (squares) using the energy dependence deduced from indicated two samples of down-scaled waveforms (see text). Examples are given for BEGe (top), coaxial (middle) and IC detectors (bottom) for the A/E, ANN and δ E methods, respectively. As to the BEGe-A/E example on top: the reference survival fraction is corrected for Compton events below the 208 Tl DEP while for the rescaled distribution (red) this correction is missing ple b(k) obtained from random triggers: where k is the index running from 0 μs to 160 μs and from 0 to 10 µs for the low and high frequency traces introduced in Sect. 2, respectively. By applying the complete set of digital processing steps and analysis methods described earlier, such a procedure allows to estimate the contribution of the energy dependence, induced by the signal-to-noise ratio variation, on the PSD methods results (for more details see [25]). Figure 20 shows the survival fractions of the 208 Tl DEP and CC(Q ββ ) datasets as a function of the down-scaled energy for the A/E, ANN and δ E methods. 4 Each plot depicts the energy dependence of a given detector type, as a result of a weighted average of the individual detector responses, to illustrate the general trends. The reference efficiency is extrapolated to Q ββ using the average energy dependence of 208 Tl DEP and CC(Q ββ ) datasets. 5 Table 2 shows how the different PSD procedures contribute to the resulting average detection efficiencies ε PSD of the various detector types for each data taking period. The efficiencies of the point contact detectors are almost 30% larger than those of the coaxial detectors.
The analysis of the final Gerda results [26] is based on time-dependent and detector-wise datasets. Table 3 (Appendix A) reports the overall PSD efficiency for each detector separately. Since the central values of the individual detector efficiencies exhibit significant shifts, the average overall PSD efficiencies ε PSD of Table 2 have been obtained, in fact, from an exposure weighted Monte-Carlo sampling, and not from a simple average.
The systematic uncertainty of the extrapolation is estimated from the difference of the slopes of the two downscaled waveform samples. It is on average about 0.5% and 1.3% for point contact and coaxial detectors, respectively, the latter detector type being more sensitive to the noise due to its larger p + contact capacitance.
In addition, two other systematic effects have been taken into account: (1) the difference between the calibration and physics data and (2) the difference between signal proxies and 0νββ decay events. The former applies only to point contact detectors as the 208 Tl DEP from calibration data is used as signal proxy. Shifting the A/E cut, for each detector, by the A/E distribution bias observed between the 208 Tl DEP and 2νββ decay (Fig. 9) leads to an average relative systematic uncertainty of 1.9%. The latter makes use of pulse shape simulation (see Appendix B) to quantify the PSD performance bias between 2νββ decay and 208 Tl DEP events and those coming from 0νββ decay. Indeed, the signal proxies feature much lower energies hence different Bremsstrahlung contribution. In addition, 208 Tl DEP events have a higher probability to happen close to the detector surface while 0νββ decays would occur homogeneously throughout the detector bulk. A 2.3% and 1.5% absolute uncertainty has been estimated for BEGe and IC detectors, respectively, while it amounts to 4% for coaxial detectors. This last estimate is larger due to the difficulty to match the ANN training performance in simulation with the data. Table 2 Average signal detection efficiencies at Q ββ of individual PSD methods for the different detector types and data taking periods. The corresponding total PSD detection efficiencies ε PSD and their systematic uncertainties are shown in bold; they are estimated via a MC sampling of individual values (see Table 3) and are thus different from the product of the individual PSD methods efficiencies reported here

Summary
Nowadays, running a background-free 0νββ decay experiment is essential to boost the T 0ν 1/2 sensitivity on a reasonable time scale. Over the past years, the Gerda collaboration demonstrated the feasibility of such a program by upgrading its initial setup with additional point contact detectors (BEGe and IC), a LAr veto instrumentation and lower mass holders. As a consequence, the sensitivity linearly increased with the exposure. The interplay between passive and active shielding techniques has proven to be highly effective. In this paper, we focused on the ability offered by germanium detectors to analyze the topological structure of the recorded events. This topology, distinct for signal-like and background-like events at a 100 ns time scale, is best scrutinized with the high frequency based data acquisition system of Gerda. Using the Phase II dataset, we confirmed the superior discriminating power of point contact detectors (BEGe and IC) over the historical coaxial detectors thanks to the simple A/E parameter. Their 0νββ decay PSD efficiency is 89% and 69%, respectively for a similar background index of about 5 · 10 −4 counts/(keV kg year) after all cuts. We also demonstrated the high and stable performance in LAr of the newly produced enriched inverted-coaxial point contact germanium detectors. The LEGEND collaboration will deploy these IC detectors for its future 76 Ge 0νββ decay search program.
The total detection efficiencies for 0νββ decay are discussed and listed for each detector in Appendix A. thank the UZH for the Postdoc and Candoc Forschungskredit fellowships respectively. The institutions acknowledge also internal financial support. The Gerda collaboration thanks the directors and the staff of LNGS for their continuous strong support of the Gerda experiment.

Data Availability Statement
This manuscript has no associated data or the data will not be deposited. [Authors' comment: All relevant numerical results are collected in Tables 1 to 4

.]
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/. Funded by SCOAP 3 .

Appendix A: Efficiency tables per detector
The final results of Gerda on the search for 0νββ decay [26] have been obtained from data partitions, defined detectorwise by stable data taking periods, that notably include the stability of PSD performance. Table 3 lists for each germanium detector, before and after the 2018 upgrade, the PSD cut efficiencies ε PSD as well as the total efficiencies ε 0νββ of detecting 0νββ decays the latter entering Eq. (1) of [26]: For the coaxial and BEGe detectors, Table S1 of the Supplementary Material of [11] provides the 76 Ge enrichment fractions f 76 and the active volume fractions f av . The electron containment efficiencies ε fep of the period before the upgrade are also provided there while a new computation has been used for the period after the upgrade that is reported in Table 3. Table 4 shows the active volume fractions for Table 3 Detector-wise PSD cut efficiencies ε PSD and the total detection efficiencies ε 0νββ for 0νββ decays used in the final analysis of [26] and corresponding exposures E. For the period after the 2018 upgrade also the electron containment efficiencies ε fep are listed (see text). Quoted uncertainties account for statistics and systematics. The channels with empty entries were used in anti-coincidence only  the IC detectors. The efficiencies ε LAr of the LAr veto cut for the two data taking periods are 0.977(1) and 0.982(1), respectively [26]. Equation 5 deliberately neglects the efficiencies of muon veto and quality cuts both being larger than 0.999. The uncertainty of all efficiencies are incorporated via a Monte-Carlo sampling from which we retrieve the ε 0νββ central value and its standard deviation.

Appendix B: Pulse shape simulation
The analysis of Gerda is data-driven for all PSD methods thanks to the selection of relevant energy regions for signal proxies. However, this comes at the expense of neglecting the specific decay dynamics at Q ββ . For instance, the amount of Bremsstrahlung at 2039 keV is larger than at 1593 keV, hence a potential 0νββ decay event has a different A/E ratio compared to 208 Tl DEP. Also, 208 Tl DEP events do happen on average closer to the detector surface while 0νββ decay are homogeneously distributed all over the detector bulk (see Fig. 21). This results in a larger event fraction of the latter population above the p + contact with higher A/E values (see Fig. 1). These effects are best studied with a Monte-Carlo simulation of the Gerda experiment and subsequent pulse shape simulation (PSS) of individual events occurring in the germanium detectors. For this purpose, the Geant4 based simulation package (MaGe [28]) and detector configuration used for the background model of Gerda [19] was employed to generate 208 Tl and 212 Bi decays in order to reproduce the calibration energy spectrum of each germanium detector. In addition, 2νββ decays were simulated homogeneously in each detector, to cross-check the PSD efficiencies in simulation with the data, as well as 0νββ decays. The energy spectrum of the two electrons emitted in the double-β decay was sampled according to the distribution given in reference [29] implemented in Decay0 [30].
The individual hit positions and deposited energies are used to calculate the corresponding induced charge flow in the detector by means of electrostatic simulation software (ADL [14] and Fieldgen [31]). Each generated pulse is then convoluted by an optimized electronics response model (bi-quad filter [32]) of the Gerda setup before taking into account a realistic gain and offset. Subsequently, waveforms from baseline events, recorded during the physics data acquisition to estimate the random coincidence probability, are randomly picked-up and added to the convoluted pulses to emulate the physical signal-to-noise ratio. Finally, the postprocessing is identical to the PSD methods described in Sects. 4 and 5 to retrieve the A/E, ANN and risetime classifiers.
In Fig. 22, the PSD classifier distribution of the simulation is compared to the data. For the particular case of coaxial detectors, two approaches have been studied for the ANN analysis. First, the training and cut values from the data were applied to the simulation. Second, the simulated pulses were fed into the ANN for its training and the cuts were computed such that 90% of simulated 208 Tl DEP events survive. Due to the not accurate enough modeling of the Gerda electronics response, the latter method was found to give results in better agreement with the data.
For IC detectors, the final systematic uncertainty from these data and simulation comparison is estimated by computing the survival fraction difference between 208 Tl DEP and 0νββ decay for each individual detector and then average it per detector type. For coaxial detectors, the agreement of the 2νββ decay survival fractions between data and simulation is used to quantify the systematic uncertainty on the 0νββ signal efficiency. The corresponding BEGe detectors systematic uncertainty was estimated in previous studies [33].

Appendix C: ANN for α event rejection
In order to mitigate the high energy α background, an analysis based on a second ANN was developed at the beginning of Gerda Phase II [9,34]. Other than in the ANN analysis described in Sect. 5, the neural network applied to Signal proxy and 0νββ decay events PSD classifier distributions for an inverted coaxial (top) and a coaxial (bottom) detector. The pulse shape simulation results (colored lines) are superimposed on the data (grey area). The histograms are normalized to their integrals α events was trained for different datasets using the highest possible statistics available [35]. For the training, the 2νββ decay events were used as signal-like sample, while αinduced events did serve as a background-like sample. From the training point of view, all detectors had poor statistics for high energy events. Therefore, the training of the ANN-α has been performed with the whole available background dataset to optimize the selection. Later, two changes have been introduced into the α cut: (i) in order to avoid any conflict with the determination of the survival fraction for the 2νββ decay events, calibration events at Q ββ ± 10 keV, were provided as a signal-like sample instead, (ii) input variables between 10% and 90% of the original trace were used for the supervised learning, what helped to reduce energy dependence of the ANN-α cut.
Even though the statistics in the training data sets was much smaller compared to the MSE-based training, a very efficient -and even better -separation from the signal event class has been achieved. The discriminating parameter was set such that for physics data with E > 3500 keV a survival fraction of 10% yielded, while retaining, on average, ∼95% of 208 Tl DEP events in the calibration data. Only two detectors, RG1 and ANG4, featured slightly smaller 208 Tl DEP survival fractions of about 92%. The corresponding statistical uncertainty on the 10% cut on α events was in the range of (2-3)%.
For reliability and technical reasons, the original TMVA algorithm used in [9,34,35] called TMlpANN (own ROOT ANN TMultiLayerPercetron class) was replaced by the recommended and supported MLPBFGS (ANN multilayer perceptron class) in 2018, both using the same Broyden-Fletcher-Goldfarb-Shanno algorithm. Two existing caveats persisted in the analysis. First, the low statistics of the background sample (∼ 50-100 events), was not sufficient to test the ANN from the beginning. The new algorithm required to split the data in training and testing samples hence an even lower background statistics, which led to non-converging training in some detectors. Second, few events with very fast signals, i.e. short risetime, were found to survive the ANN-α cut in the region of interest while they were thought to be α events with high confidence level. This put in the question of reliability and robustness of this analysis approach, hence the decision to switch to the simpler risetime cut at the cost of a ∼ 10% loss in signal efficiency.

Appendix D: Projective likehood analysis
An alternative analysis strategy was pursued for the coaxial detectors using a projective likelihood (PL) to remove both MSEs and α background at once. In this analysis the high-frequency waveform from the preamplifier output was smoothed by averaging every 5 adjacent samples, differentiated and then the maximum current value was found along with its corresponding time (t 0 ). For the analysis, 15 amplitudes before and 15 after t 0 were extracted from the original trace (31 in total). In order to further reduce the number of input variables sums of 4 neighboring amplitudes were calculated establishing 5 new input parameters ( 1 -5 , 5 is a sum of 3 amplitudes) -see details in [32]. In order to avoid energy dependence, after the baseline subtraction, the pulses were normalized with respect to their energies.
For training and definition of the cuts, calibration data were used. The Compton edge events of the 2615 keV peak (region between 2350 keV and 2375 keV) were chosen as SSEs (signal), and multiple Compton scattered events (region between 2450 keV and 2550 keV) as MSEs (background). The cut was defined requesting 80% survival fraction for the DEP events. This allows to compare the present analysis with the corresponding one from the Gerda Phase I data. No dedicated cut for high energy events was defined.
The stability of the input data was monitored using the time distribution of 2 . If instabilities (abrupt change of the input variable distribution) were observed, the data in the affected channels was divided into different (stable) subperiods and analyzed separately. As a consequence, the cut for a given survival probability, was time dependent. Such a change for ANG5 has been observed between some runs, otherwise the distribution of 2 was stable within less than 4%.
The survival probability of the 0νββ decay events estimated from the simple cut based on the PL classifier is (65.5± 13.3)%, while the background in the region of interest is reduced by 56%. Although, the signal efficiency is relatively low with conservatively estimated systematic uncertainty, PL allows also to eliminate efficiently the high energy events, about 87% of them are rejected. An advantage of PL is the training performed only on γ -rays from the calibration runs.
This method was not retained for the final 0νββ decay search analysis due to lower signal efficiency and higher background in the region of interest.