1 Introduction

Through different astrophysical observations on a wide range of cosmological scales, it is well established that \({\sim }27\,\%\) of the energy density in the Universe is made up of an unknown dark matter [2]. A well-motivated class of particles proposed to solve the dark matter problem are weakly interacting massive particles (WIMPs) with masses of the order of \(\mathrm{GeV}/c^2\) to \(\mathrm{TeV}/c^2\) and an extremely low scattering cross section with ordinary matter. Direct detection experiments search for the elastic scattering of WIMPs from the galactic dark matter halo in Earth-based detectors. The nuclear recoils from such interactions would have an exponentially falling energy spectrum up to a few keV, depending on the WIMP mass \(m_\chi \). In addition, the expected rate is smaller than one interaction per kg of target material per year. To minimize the background for this rare event search, the EDELWEISS experiment is located in the Modane underground laboratory (LSM) in the French-Italian Alps, where a rock overburden of \(4800\,\mathrm{m\ w.e.}\) reduces the cosmic muon flux down to \(5\,\mathrm {muons/m^2/day}\). Remaining muons are tagged with an active muon veto system surrounding the experiment [3], followed by \(50\,\mathrm{cm}\) of polyethylene and \(20\,\mathrm{cm}\) of lead to suppress neutrons and gammas. Inside these layers of shielding, a cryostat made of ultra-pure copper houses germanium monocrystals which are cooled down to \(18\,\mathrm{mK}\). A simultaneous measurement of the heat and ionization energies produced in a recoil makes it possible to discriminate between the dominant electron recoils (ER) from radioactivity and nuclear recoils (NR), which at low energies are only caused by neutrons and the expected WIMP signal.

Other direct detection dark matter experiments use similar approaches based on the same principle to discriminate between different backgrounds and a possible signal from WIMPs. Exclusion limits on the WIMP-nucleon spin-independent scattering cross section from LUX [4] and SuperCDMS [5] are in strong tension with favoured parameter regions based on observations by DAMA/LIBRA [6], CoGeNT [7] and CDMSII-Si [8].

Almost all existing signal claims for low-mass WIMPs can be excluded at 90 % C.L. with the improved limits that were recently published by the EDELWEISS-III collaboration [1] under standard assumptions about the WIMP-nucleus interaction and the galactic halo model. Data from a 10-month WIMP search run were analysed in terms of low-mass WIMPs with masses \(m_\chi \in [4, 30]\,\mathrm{GeV/}c^2\) using a method based on Boosted Decision Trees (BDT). No statistically significant excess of events was observed for eight selected detectors, resulting in exclusion limits up to a factor 40 stronger at \(m_\chi = 7\,\mathrm{GeV/}c^2\), compared to results from previous EDELWEISS-II [9] low-energy data. Such a cut-based analysis performs well when the separation of signal and background is sufficient, as is the case for higher WIMP masses. However, at low energy, the finite resolutions of the detectors cause the electron and nuclear recoils to have overlapping populations in the distributions of the variables that serve as discriminators. A separation thus requires a cut at lower energy, resulting in a severely reduced efficiency. To overcome this problem, the analysis presented here uses an alternative approach which is based on the maximum likelihood, similar to e.g. [4, 10]. It is an unblinded analysis performed on a similar data sample that was recorded with the same detectors as in [1]. With its completely different analysis approach it improves the sensitivity for low-mass WIMPs and allows a cross-check of the results of the BDT-based analysis. Instead of extracting limits without background subtraction from a smaller signal region with optimized signal-to-background ratio, the maximum likelihood method is used to model and fit the data in the entire region of interest (RoI). Thus, the remaining WIMP signal after detector efficiency corrections is not further reduced, while expected backgrounds are fitted and can be subtracted. 
The systematic uncertainties of the background predictions are taken into account by constraints in the likelihood fit and the calculation of exclusion limits.

The operating principle of the EDELWEISS-III detectors and the selection criteria for the analysed data are detailed in Sect. 2, while a description of the different background components is presented in Sect. 3. The formalism of the likelihood model for the analysis is explained in Sect. 4, both for fitting the data to individual detectors, as well as for a combined fit of a common signal to all detectors. We also detail how the exclusion limit is set using a hypothesis test based on the profile likelihood test statistics. A discussion of the fit results and a comparison with the result achieved with the BDT method follows in Sect. 5.

2 EDELWEISS-III detectors and selection of data

The detectors used in EDELWEISS-III are of the full inter-digit FID800 type [11]. These are high-purity germanium bolometers in a cylindrical shape of \(7\,\text {cm}\) diameter and \(4\,\text {cm}\) height with masses ranging from 820 to \(890\,\mathrm{g}\) due to small variations both in diameter and height. Aluminium electrodes cover all sides of the detector in concentric rings to collect the charge carriers produced in a particle recoil. A neutron transmutation doped (NTD) Ge-sensor is glued on each of the top and bottom surfaces and measures the micro-Kelvin temperature increase due to the energy deposit. The measurement of both heat and ionization signals on an event-by-event basis allows the discrimination of rare NR events from the dominating ER events. The latter are mostly due to radioactive background outside the detector which produces \(\gamma \)-radiation. Their ionization yield Q, i.e. the ratio of ionization over total recoil energy, is defined as \(Q_{\mathrm{ER}} = 1\) by the energy calibration with a \(^{133}\text {Ba}\) \(\gamma \)-source. The ionization yield \(Q_{\mathrm{NR}}\) for nuclear recoils from neutrons and WIMPs is quenched and thus \({\sim }3\) times smaller. It depends on the recoil energy \(E_{\mathrm{r}}\) and can be parameterized as \(Q_{\mathrm{NR}}(E_{\mathrm{r}}) = 0.16 \cdot \left( E_{\mathrm{r}}/\mathrm{keV}\right) ^{0.18}\) for EDELWEISS Ge-detectors, which is consistent with the Lindhard theory [12]. The separation of particle types due to their different Q-value is only applicable if the produced charges are properly collected. For recoils close to the detector surface, where charge trapping is important, this is not guaranteed. Therefore, FID detectors are designed to discriminate between surface events and events originating in the bulk of the detector. 
Interleaved electrode rings on the detector surface are wired together, and the resulting groups are biased with \(+4\,\mathrm{V}\) and \(-1.5\,\mathrm{V}\) on the top half and with \(-4\,\mathrm{V}\) and \(+1.5\,\mathrm{V}\) on the bottom half. The electric field configuration divides the detector into a fiducial volume from which charges are drifted to the so-called fiducial electrodes (at \(\pm 4\,\mathrm{V}\)) and a near-surface volume, for which a signal is also seen on the veto electrodes (at \({\mp } 1.5\,\mathrm{V}\)). This fiducialization is used to significantly suppress backgrounds from surface interactions and select only events with full charge collection efficiency. In this analysis, we consider the heat signal as the resolution-weighted average heat energy of the two NTDs and the ionization signal as the average of the ionization energies measured on the two fiducial electrodes. These two observables, named \(E_{\mathrm{heat}}\) and \(E_{\mathrm{ion}}\) in the following, are both in units of \(\mathrm{keV}_{\mathrm{ee}}\), as they have been calibrated to fiducial electron recoils from a \(^{133}\mathrm{Ba}\) \(\gamma \)-source. A signal on one of the two veto electrodes, i.e. \(E_{\mathrm{veto}}\), is only used to reject surface events.

The data analysed in this work was taken between July 2014 and April 2015, when 24 FID800 detectors with full read-out were installed in the cryostat. WIMP search data was recorded for a total of 161 live days, together with an additional 25 days of calibration data from \(^{133}\text {Ba}\) \(\gamma \)- and AmBe neutron-sources. As we are searching for WIMP signals at very low energies, a proper understanding of the trigger for such events is crucial. Out of the 24 installed detectors, eight were selected [1] because of their low trigger thresholds and good noise conditions. Only hourly periods satisfying requirements on the FWHM baseline resolution of the heat and ionization channels (\(\mathrm{FWHM}_{\mathrm{heat}} < 1\,\mathrm{keV}_{\mathrm{ee}}\) and \(\mathrm{FWHM}_{\mathrm{ion}} < 0.7\,\mathrm{keV}_{\mathrm{ee}}\)) are considered, leading to a live-time ranging from 95 to 133 days per detector. More details on the trigger condition and the applied quality cuts, as well as the corresponding efficiencies, are given in [1]. The RoI for WIMP search is chosen such that the recoil spectrum for WIMPs with masses up to \(m_\chi \approx 30\,\mathrm{GeV/}c^2\) is included with good efficiency of up to 60 %. We define the RoI by requiring the ionization energy to be \(0 \le E_{\mathrm{ion}} \le 15\,\mathrm{keV}_{\mathrm{ee}}\) and the heat energy of events to be below \(15\,\mathrm{keV}_{\mathrm{ee}}\). The analysis threshold in heat energy \(E_\text {heat}^\text {min}\) varies from detector to detector and depends on the efficiency of the online trigger. The EDELWEISS DAQ-system triggers events on each of the two heat channels, with a trigger threshold adapted on the scale of a few minutes to the instantaneous noise conditions. To ensure a high signal efficiency while minimizing systematic effects we define the analysis threshold in heat as the corresponding energy, for which the livetime-averaged trigger efficiency for each individual detector is above 80 %. 
This leads to values ranging from \(E_\text {heat}^\text {min} = 0.9\,\mathrm{keV}_{\mathrm{ee}}\) for the best detector up to \(E_\text {heat}^\text {min} = 1.5\,\mathrm{keV}_{\mathrm{ee}}\) for the one with lowest efficiency.

Fig. 1

WIMP search data in the RoI accumulated in eight selected detectors with a fiducial exposure of \(496\,\text {kg-days}\) in ionization vs. heat energy (black markers). Events before the fiducial cut and in the extended energy range are shown as gray points. Coloured lines indicate the detector-averaged positions that are expected for different background components depending on their ionization yields and collection voltage biases (see text). From top to bottom electron recoils from tritium decay as well as Compton and cosmogenic gammas in the fiducial volume (blue), surface gammas (dashed blue), nuclear recoils from neutron scattering (magenta), surface betas (dashed green) and \(^{206}\text {Pb}\)-recoils (dashed brown). Heat-only events have only noise on the ionization channels and no ionization signal on average (red). The coloured contour indicates an \(m_\chi = 10\,\mathrm{GeV/}c^2\) WIMP signal

The last cut applied to the data is the selection of fiducial events, which is of particular importance for this analysis. The two observables considered in the likelihood analysis are the heat and ionization energies. To select unambiguous fiducial events only, we require the signal on each of the two veto electrodes to be within \(\pm 1.64\,\sigma _{\mathrm{veto}}\) of the energy-dependent Gaussian noise, where \(\sigma _{\mathrm{veto}}\) increases with increasing fiducial ionization energy \(E_{\mathrm{ion}}\) as described in [13]. The cut with a total acceptance of 81 % was chosen after initial tests on a subsample of data and combines a strong discrimination of surface events with an acceptable signal efficiency loss compared to the BDT analysis [1]. The fiducial efficiency was determined for each detector from the homogeneously distributed decay of cosmogenically activated isotopes in the crystal: the number of K-shell electron capture (EC) events giving a peak triplet at \(10\,\mathrm{keV}\) was fitted with and without fiducial cut. The resulting effective fiducial masses (footnote 1) found with this method vary between 508 and \(562\,\mathrm{g}\) for the eight detectors. The data after all cuts on data quality and noise periods, before and after the application of the fiducial cut, is shown in Fig. 1. The total fiducial exposure is \(496\,\text {kg-days}\). Different event populations can be observed and, at higher energies, are distinguishable via their ionization yield Q. Surface events from \(\gamma \)- and \(\beta \)-radiation as well as \(^{206}\text {Pb}\)-recoils can be efficiently rejected with the fiducial cut. Remaining events between the electron recoil and heat-only populations are natural WIMP candidates. A likelihood analysis can assess the probability of these events to be of signal or background origin. As input to the likelihood model, a probability density function (PDF) describing each different component is required.

3 Modelling of the signal and background components

Thanks to the effective shielding scheme of the EDELWEISS experiment, the residual background mainly originates from radioactive materials inside the cryostat such as connectors, holding structure and detector copper casings as well as from decays of cosmogenically activated isotopes within the detectors [14, 15]. Each background component is modelled with a data-driven approach: unblinded data from outside the region of interest (sideband data), acquired in the same WIMP run, are fitted and extrapolated to the low-energy RoI considered in the analysis. In order to construct a likelihood model describing the data for each of the eight detectors, a PDF is calculated for each different background component i. This PDF \({\mathcal {P}}_{\mathrm{i}}\) describes a recoil spectrum \(\rho _{\mathrm{i}}(E_{\mathrm{r}})\) in the two observables heat and ionization energy. It takes into account the ionization yield \(Q_{\mathrm{i}}\) for each background, the efficiency of the trigger on the heat channel \(\varepsilon (E_{\mathrm{heat}})\) and the efficiency of the fiducial cut \(\varepsilon ^\mathrm{fid}(E_{\mathrm{r}})\) as well as a Gaussian smearing due to the degraded, energy-dependent resolutions \(\sigma _{\mathrm{heat}}\) and \(\sigma _{\mathrm{ion}}\) of a given detector. In the energy range covered by this analysis, the intrinsic widths of the \(Q_{\mathrm{i}}\)-distributions of the different populations are small compared to the effect of \(\sigma _{\mathrm{heat}}\) and \(\sigma _{\mathrm{ion}}\), and are neglected. Before normalization, the PDF can be written as:

$$\begin{aligned}&{\mathcal {P}}_{\mathrm{i}} (E_{\mathrm{heat}}, E_{\mathrm{ion}}) = \frac{ \varepsilon (E_{\mathrm{heat}}) }{ 2 \pi \sigma _{\mathrm{heat}} \sigma _{\mathrm{ion}} } \int _{0}^{\infty } \mathrm{d}E_{\mathrm{r}} \, \rho _{\mathrm{i}}(E_{\mathrm{r}}) \, \varepsilon _{\mathrm{i}}^\mathrm{fid}(E_{\mathrm{r}}) \nonumber \\&\quad \times \;\exp \left[ - \frac{(E_{\mathrm{heat}}- f_{\mathrm{i}}(E_{\mathrm{r}}))^2 }{2 \sigma _{\mathrm{heat}}^2} - \frac{ (E_{\mathrm{ion}}-Q_{\mathrm{i}} \cdot E_{\mathrm{r}} )^2 }{2 \sigma _{\mathrm{ion}}^2}\right] \end{aligned}$$
(1)

where the function \(f_{\mathrm{i}}(E_{\mathrm{r}})\) gives the observed heat signal for a given recoil energy. It includes the additional heating via the Neganov–Luke effect [16, 17], produced by the scattering of charges which are collected by electrodes with a differential voltage U (in volts):

$$\begin{aligned} f_{\mathrm{i}}(E_{\mathrm{r}}) = \frac{1 + Q_{\mathrm{i}}(E_{\mathrm{r}}) \frac{U_{\mathrm{i}}}{3} }{1 + \frac{U_{\mathrm{ref}}}{3}} \cdot E_{\mathrm{r}} \end{aligned}$$
(2)
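As a minimal numerical sketch of Eq. 2, the following Python snippet evaluates the heat signal of a 10 keV recoil for three cases: a bulk electron recoil, a near-surface electron recoil, and a bulk nuclear recoil using the \(Q_{\mathrm{NR}}\) parametrization of Sect. 2. The function and variable names are illustrative and not taken from the analysis code. The bias values of 8 V (fiducial) and 5.5 V (fiducial to veto) are those quoted in the text, and the near-surface case assumes \(Q = 1\) purely for illustration.

```python
U_REF = 8.0  # V, bias between the fiducial electrodes (calibration reference)


def q_nr(e_r):
    """Ionization yield for nuclear recoils, Q_NR = 0.16 * (E_r / keV)^0.18."""
    return 0.16 * e_r ** 0.18


def heat_signal(e_r, q, u):
    """Eq. 2: heat energy (keVee) for recoil energy e_r (keV), yield q, bias u (V)."""
    return (1.0 + q * u / 3.0) / (1.0 + U_REF / 3.0) * e_r


bulk_er = heat_signal(10.0, 1.0, 8.0)          # equals e_r by construction of the calibration
surface_er = heat_signal(10.0, 1.0, 5.5)       # illustration only: assumes Q = 1 at 5.5 V
bulk_nr = heat_signal(10.0, q_nr(10.0), 8.0)   # quenched nuclear recoil in the bulk
print(f"{bulk_er:.2f} {surface_er:.2f} {bulk_nr:.2f}")
```

Under these assumptions a 10 keV electron recoil collected at 5.5 V gives a heat energy of about \(7.7\,\mathrm{keV}_{\mathrm{ee}}\), while a 10 keV nuclear recoil in the bulk gives less than half of its recoil energy in \(\mathrm{keV}_{\mathrm{ee}}\).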

The selected detectors have an electric potential of \(U_{\mathrm{ref}} = 8\,\mathrm{V}\) between the fiducial electrodes and bulk ER-events were used to calibrate the energy scale of all heat and fiducial ionization channels. For charges created in the near-surface volume, the Neganov–Luke contribution to the heat energy is smaller, due to the reduced potential of only \(5.5\,\mathrm{V}\) between fiducial and veto electrodes. It reduces the measured heat energy for surface events, as can be seen in Fig. 1 for the group of 10 keV cosmogenic peaks at the surface which are observed at \(E_{\mathrm{heat}} \approx 7.7\,\mathrm{keV}_{\mathrm{ee}}\). The measured average value of \(Q_{\mathrm{i}}\) for those events is 0.9. For surface backgrounds from \(\beta \)’s and \(^{206}\mathrm{Pb}\)-recoils as well as so-called heat-only events, the spectrum in heat energy was directly extracted from sideband data. For those components the smearing in heat energy is already included and the PDF before normalization can be directly expressed as:

$$\begin{aligned} {\mathcal {P}}_{\mathrm{i}} (E_{\mathrm{heat}}, E_{\mathrm{ion}})= & {} \frac{ \varepsilon (E_{\mathrm{heat}}) }{ \sqrt{2 \pi } \sigma _{\mathrm{ion}} } \, \rho _{\mathrm{i}}(E_{\mathrm{heat}}) \, \varepsilon _{\mathrm{i}}^\mathrm{fid} \left( f_{\mathrm{i}}^{-1}\left( E_{\mathrm{heat}} \right) \right) \nonumber \\&\times \; \exp \left[ - \frac{ (E_{\mathrm{ion}}-Q_{\mathrm{i}} \cdot f_{\mathrm{i}}^{-1}\left( E_{\mathrm{heat}} \right) )^2 }{2 \sigma _{\mathrm{ion}}^2} \right] \nonumber \\ \end{aligned}$$
(3)

where the average measured values of \(Q_{\mathrm{i}}\) for surface \(\beta \)’s and \(^{206}\mathrm{Pb}\)-recoils are 0.4 and 0.1, respectively. The suppression of surface events via the fiducial cut decreases at low energies due to the finite resolution of the ionization channels: the veto energy of a surface event can be smaller than the noise on the veto electrode, so that the event is not rejected. To a small extent, our data selection is therefore polluted by surface events with heat energies just above the analysis threshold. To build the PDF for these events we take into consideration the efficiency \(\varepsilon ^\mathrm{fid} (E_{\mathrm{r}})\) as a function of recoil energy. For surface events the survival probability after the fiducial cut is highly reduced, as is shown in Fig. 2 for different background components. It is calculated for each of the detector sides considering the baseline resolution \(\sigma _{\mathrm{veto}}\) and the measured energy \(E_{\mathrm{veto}}\) of the corresponding veto electrode:

$$\begin{aligned} \varepsilon ^\mathrm{fid}_{\mathrm{surf}} (E_{\mathrm{r}})= & {} \frac{1}{ \sqrt{2 \pi } \sigma _{\mathrm{veto}} } \int _{-1.64 \sigma _{\mathrm{veto}}}^{+1.64 \sigma _{\mathrm{veto}}} \text {d}E_{\mathrm{veto}} \nonumber \\&\times \; \exp \left[ - \frac{(E_{\mathrm{veto}} - Q_{\mathrm{i}} E_{\mathrm{r}})^2}{2 \sigma _{\mathrm{veto}}^2} \right] \end{aligned}$$
(4)
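Because the integrand of Eq. 4 is a Gaussian, the integral can be written in closed form with the error function. The sketch below does this in Python. It assumes a constant, hypothetical veto resolution `SIGMA_VETO` (in the analysis \(\sigma _{\mathrm{veto}}\) is detector-specific) and uses the average \(Q_{\mathrm{i}}\) values quoted in the text; all names are illustrative.

```python
import math


def survival_probability(e_r, q, sigma_veto, n_sigma=1.64):
    """Eq. 4: probability that a Gaussian-smeared veto signal with mean q * e_r
    falls inside the +/- n_sigma * sigma_veto acceptance window."""
    mean = q * e_r
    scale = math.sqrt(2.0) * sigma_veto
    upper = (n_sigma * sigma_veto - mean) / scale
    lower = (-n_sigma * sigma_veto - mean) / scale
    return 0.5 * (math.erf(upper) - math.erf(lower))


SIGMA_VETO = 0.25  # keVee, hypothetical veto baseline resolution

noise_only = survival_probability(0.0, 1.0, SIGMA_VETO)  # no veto signal, one electrode
both_sides = noise_only ** 2  # assumes independent noise on the two veto electrodes

for label, q in (("gamma", 0.9), ("beta", 0.4), ("Pb-206", 0.1)):
    curve = [round(survival_probability(e, q, SIGMA_VETO), 3) for e in (1.0, 3.0, 10.0)]
    print(label, curve)
print(round(noise_only, 3), round(both_sides, 3))
```

For a vanishing veto signal, a single \(\pm 1.64\,\sigma \) window accepts about 90 % of events. If the noise on the two veto electrodes is treated as independent, requiring both gives about 81 %, in line with the total acceptance of the fiducial cut quoted in Sect. 2. The lower the ionization yield, the higher the recoil energy up to which surface events survive the cut.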

For events originating in the bulk of the crystal, no signal is measured on the veto electrodes and only noise is reconstructed. The efficiency of the fiducial cut is \(\varepsilon ^\mathrm{fid}_{\mathrm{bulk}} = 81\,\%\) as described in Sect. 2. The fraction of surface nuclear recoils leaking into the acceptance below \(5\,\mathrm {keV}\) (Fig. 2), which would further increase the WIMP efficiency, is neglected. With the definition of the PDF mentioned above, the WIMP signal and the following background components can be fully described.
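To illustrate how Eq. 1 can be evaluated in practice, the following Python sketch integrates over the recoil energy with a simple midpoint rule. It is not the analysis implementation: the constant resolutions, the error-function trigger efficiency, the flat toy spectrum and the truncation of the integral at 30 keV are all assumptions made for the example, and every name is hypothetical.

```python
import math

U_REF = 8.0          # V, fiducial bias used for the energy calibration
SIGMA_HEAT = 0.3     # keVee, hypothetical constant heat resolution
SIGMA_ION = 0.25     # keVee, hypothetical constant ionization resolution
EPS_FID_BULK = 0.81  # bulk efficiency of the fiducial cut (Sect. 2)


def trigger_efficiency(e_heat, e_half=0.7, width=0.2):
    """Hypothetical error-function model of the heat trigger efficiency."""
    return 0.5 * (1.0 + math.erf((e_heat - e_half) / (math.sqrt(2.0) * width)))


def heat_signal(e_r, q, u=U_REF):
    """Eq. 2: expected heat energy (keVee) for recoil energy e_r (keV)."""
    return (1.0 + q * u / 3.0) / (1.0 + U_REF / 3.0) * e_r


def pdf_unnormalized(e_heat, e_ion, spectrum, yield_fn, e_max=30.0, steps=3000):
    """Eq. 1 before normalization, with the integral truncated at e_max."""
    de = e_max / steps
    total = 0.0
    for k in range(steps):
        e_r = (k + 0.5) * de  # midpoint of the k-th recoil-energy bin
        q = yield_fn(e_r)
        arg = ((e_heat - heat_signal(e_r, q)) ** 2 / (2.0 * SIGMA_HEAT ** 2)
               + (e_ion - q * e_r) ** 2 / (2.0 * SIGMA_ION ** 2))
        total += spectrum(e_r) * EPS_FID_BULK * math.exp(-arg) * de
    prefactor = trigger_efficiency(e_heat) / (2.0 * math.pi * SIGMA_HEAT * SIGMA_ION)
    return prefactor * total


def flat_spectrum(e_r):
    """Toy flat recoil spectrum, standing in for a Compton-like component."""
    return 1.0


def yield_er(e_r):
    """Ionization yield of bulk electron recoils."""
    return 1.0


on_band = pdf_unnormalized(5.0, 5.0, flat_spectrum, yield_er)
off_band = pdf_unnormalized(5.0, 2.0, flat_spectrum, yield_er)
print(on_band, off_band)
```

For bulk electron recoils the density is concentrated along \(E_{\mathrm{ion}} = E_{\mathrm{heat}}\) and falls off steeply away from it. Replacing `yield_er` by the \(Q_{\mathrm{NR}}\) parametrization moves the band to lower ionization energies.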

Fig. 2

Survival probability for different surface background components on the top side of detector FID824 as a function of initial recoil energy: electron recoils (blue) from Compton and cosmogenic \(\gamma \)’s as well as \(\beta \)’s from the tritium beta decay, nuclear recoils from neutrons or WIMPs (violet), \(\beta \)’s (green) and \(^{206}\mathrm{Pb}\)-recoils (brown) from external radioactivity. The dashed part of the efficiency curves is below the analysis threshold \(E_{\mathrm{heat}}^\mathrm{min} = 0.9\,\mathrm{keV}_{\mathrm{ee}}\) in heat energy (see Eq. 2). Only surface events from \(\beta \)’s and \(^{206}\mathrm{Pb}\)-recoils are considered in the analysis as they have a significant contribution after all cuts. For bulk events, the survival probability (or efficiency for the WIMP signal) is considered as approximately constant (black line), neglecting the leakage of surface WIMPs into the acceptance

3.1 WIMP signal

A signal PDF is constructed for each WIMP mass \(m_\chi \) independently, using Eq. 1. The parametrization for the ionization yield \(Q_{\mathrm{NR}}\) for nuclear recoils has been validated to a precision of 5 % using neutron calibration data taken during the same run. In the description of the signal PDF, \(Q_{\mathrm{NR}}\) is a nuisance parameter and constrained with its systematic uncertainty. The recoil spectrum for the scattering of WIMPs on natural germanium with an average of \(A = 72.6\) nucleons is calculated following [18]. For all astrophysical parameters we use values corresponding to the Standard Halo Model (SHM), i.e. \(\rho _{\mathrm{DM}}^\mathrm{local} = 0.3\,\mathrm{GeV}/\mathrm{c}^2/\mathrm{cm}^3\), \(v_{\mathrm{0}} = 220\,\mathrm{km/s}\), \(v_{\mathrm{earth}} = 230\,\mathrm{km/s}\) and \(v_{\mathrm{esc}} = 544\,\mathrm{km/s}\). With the cuts described in Sect. 2 a potential WIMP signal is reduced to \({\sim }60\,\%\) for \(m_\chi = 30\,\mathrm{GeV/}c^2\). Detector FID824 has the highest sensitivity for a WIMP signal due to its good baseline of the heat channel and the resulting low heat threshold \(E_{\mathrm{heat}}^\mathrm{min} = 0.9\,\mathrm{keV}_{\mathrm{ee}}\). For this detector, the signal fraction after cuts decreases to \(2 \cdot 10^{-4}\) for a \(m_\chi = 4\,\mathrm{GeV/}c^2\) signal but is above 1 % for masses \(m_\chi > 5\,\mathrm{GeV/}c^2\).
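The strong drop of the signal fraction towards low masses follows from elastic-scattering kinematics. The short Python sketch below computes the maximum recoil energy \(E_{\mathrm{r}}^\mathrm{max} = 2 \mu ^2 v_\mathrm{max}^2 / m_N\) for the SHM parameters above. It is only a kinematic estimate, not the spectrum calculation of [18]. It assumes that the nuclear mass is \(A\) atomic mass units and that the largest WIMP speed in the laboratory frame is \(v_{\mathrm{esc}} + v_{\mathrm{earth}}\); the function name is illustrative.

```python
AMU_GEV = 0.9315     # GeV/c^2 per atomic mass unit
C_KM_S = 299792.458  # speed of light in km/s


def max_recoil_energy_kev(m_chi, a=72.6, v_earth=230.0, v_esc=544.0):
    """Maximum elastic recoil energy (keV) for a WIMP of mass m_chi (GeV/c^2)."""
    m_n = a * AMU_GEV                  # nuclear mass in GeV/c^2 (approximation)
    mu = m_chi * m_n / (m_chi + m_n)   # WIMP-nucleus reduced mass
    beta = (v_earth + v_esc) / C_KM_S  # assumed largest lab-frame speed, in units of c
    return 2.0 * mu ** 2 * beta ** 2 / m_n * 1.0e6  # convert GeV to keV


for m_chi in (4.0, 10.0, 30.0):
    print(m_chi, round(max_recoil_energy_kev(m_chi), 2))
```

Under these assumptions a \(4\,\mathrm{GeV/}c^2\) WIMP produces recoils of at most about 2.8 keV. Converted with Eq. 2 and \(Q_{\mathrm{NR}}\), this corresponds to roughly \(1.2\,\mathrm{keV}_{\mathrm{ee}}\) in heat energy, i.e. close to the analysis thresholds, so that only the upper tail of the spectrum can pass the cuts.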

Fig. 3

Heat energy spectrum of events with negative ionization energy (“heat-only”) for detector FID824 after quality cuts. Below \(E_{\mathrm{heat}} \approx 1\,\mathrm{keV}_{\mathrm{ee}}\) the data is dominated by random triggers on noise fluctuations; the efficiency of the heat trigger is given for reference (gray). The spectrum is modelled with a kernel density estimation function (orange). Of the 63,400 events in the sideband, 5386 are above the 80 % efficiency analysis threshold of \(E_{\mathrm{heat}}^\mathrm{min} = 0.9\,\mathrm{keV}_{\mathrm{ee}}\) for this detector (dashed line). Inset: ionization energy spectrum of events above the analysis threshold fitted with a Gaussian function of fixed width. The fitted mean of \(0.021 \pm 0.005\,\mathrm{keV}_{\mathrm{ee}}\) results in a systematic error of \(14.9\,\%\) for the expected number of events in the RoI

3.2 Heat-only events

The dominant background in the EDELWEISS-III low-energy data are heat-only events. They are present in all detectors with different intensity and constitute between 85 and 95 % of the events in the RoI after all cuts. For those events the data acquisition was triggered by a clear signal on one or both NTD heat sensors, while only noise can be seen on each of the four ionization channels, and the signals of the two NTDs are compatible. The heat energy spectrum of those events shows an exponential decrease (e.g. Fig. 3) for all detectors and overlaps with randomly triggered noise fluctuations near the heat threshold of a detector. The variation of the heat-only event rate shows a common behaviour for all detectors: a simultaneous burst of the rate which coincides with a period of unstable operating conditions due to the cryogenic system followed by an exponential decay with a time constant of around 20 days which is not compatible with any of the known radioactive isotopes in the setup. A particle origin, e.g. from \(^{206}\mathrm{Pb}\)-recoils absorbed in one of the electrodes and producing no ionization signal, can be excluded due to the high rate and temporal behaviour. Internal radiation within the NTD heat sensors is rejected by a cut requiring a coincident signal in both NTDs described in [1]. The source of heat-only events is yet unknown, but possible explanations are the creation of phonons from friction of the detector with the holders, or stress near the NTD gluing spot. Several strategies are pursued to identify the origin of those events and to significantly reduce them in future runs. We use the sideband with negative ionization energy to model heat-only events in the RoI. In the absence of a theory to describe the shape of the heat-only energy spectrum, we use a Kernel density estimation (KDE) function of the data in this sideband to model this background. 
The ionization energy spectrum has a gaussian shape with a width given by the average baseline noise for the ionization channels. Fitting the distribution of sideband data in \(E_{\mathrm{ion}}\) with a gaussian indicates a small possible shift of the mean with respect to \(E_{\mathrm{ion}} = 0\,\mathrm{keV}_{\mathrm{ee}}\). That shift is only statistically significant for some of the detectors and is related to a small fraction of \({<}1\,\%\) uncorrected cross-talk between heat and fiducial ionization channels. The effect of a possible shift on the number of expected events for this background is taken as a systematic uncertainty and ranges between 0.4 and 14.9 % and is considered in the constraint for this background. For the most sensitive detector, FID824, Fig. 3 shows heat and ionization energy spectra of the sideband data with the respective models. In principle, the heat-only sideband can be contaminated by underfluctuations of the ionization energy from low-energy event populations with small ionization yield, such as \(^{206}\mathrm{Pb}\)-recoils and \(\beta \)-particles. Considering the low number of expected events for these components (\(\mathcal {O}(10)\) events above the analysis threshold per detector) compared to the high rate of heat-only events, the effect on the extracted spectrum is negligible. It was also checked that the number of events for a possible WIMP signal of mass \(m_\chi \) in the heat-only sideband is negligible for the cross section excluded in the following. The heat-only sideband data (\(E_{\mathrm{ion}} < 0\)) and modelled PDF in the RoI (\(E_{\mathrm{ion}} > 0\)) are shown in Fig. 4, together with WIMP signals for two different masses.
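A kernel density estimate of the kind used for the heat-only spectrum can be sketched in a few lines of Python. The example below is a generic Gaussian KDE and not the implementation used in the analysis: the bandwidth follows Silverman's rule of thumb (the bandwidth actually used is not stated here), and the sideband sample is synthetic, drawn from an exponential with a hypothetical slope of \(0.5\,\mathrm{keV}_{\mathrm{ee}}\).

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density of `samples` with Gaussian kernels."""
    n = len(samples)
    if bandwidth is None:
        # Silverman's rule of thumb; an assumption, not the bandwidth of the analysis
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1.0 / 5.0)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


random.seed(1)
E_MIN = 0.9  # keVee, analysis threshold of FID824
# Synthetic stand-in for sideband heat energies: exponential tail above threshold
sideband = [E_MIN + random.expovariate(1.0 / 0.5) for _ in range(2000)]

rho_heat_only = gaussian_kde(sideband)
print(rho_heat_only(1.5), rho_heat_only(3.0))
```

The returned function plays the role of \(\rho _{\mathrm{i}}(E_{\mathrm{heat}})\) in Eq. 3. A KDE needs no assumption on the spectral shape, at the price of smoothing sharp features such as the edge at the threshold.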

3.3 Electron recoils from Gammas and Betas

The energy spectrum of electron recoils in the fiducial volume up to \(15\,\mathrm{keV}_{\mathrm{ee}}\) consists of a set of peaks on top of a continuous component. This component is due to the Compton scattering of gamma rays from external radioactive sources and to betas from the decay of \(^{3}\mathrm{H}\) inside the detectors [19]. The observed peaks are produced by mono-energetic gammas from electron capture reactions within the crystal and result from the activation of different isotopes due to cosmic rays or neutron calibration. The intensity of these peaks is different for each detector and depends on its age and exposure to cosmic rays before installation underground. In the energy range between \(5\,\mathrm{keV}_{\mathrm{ee}}\) and \(7.7\,\mathrm{keV}_{\mathrm{ee}}\) X-rays from the K-shell EC of the isotopes \(^{49}\mathrm{V}\) (\(E = 4.97\,\mathrm{keV}\)), \(^{51}\mathrm{Cr}\) (\(5.46\,\mathrm{keV}\)), \(^{54}\mathrm{Mn}\) (\(5.99\,\mathrm{keV}\)), \(^{55}\mathrm{Fe}\) (\(6.54\,\mathrm{keV}\)), \(^{56,57,58}\mathrm{Co}\) (\(7.11\,\mathrm{keV}\)) and \(^{56}\mathrm{Ni}\) (\(7.71\,\mathrm{keV}\)) are included in the fit as potential peaks. Around \(10\,\mathrm{keV}_{\mathrm{ee}}\) a triplet of \(^{65}\mathrm{Zn}\) (\(8.98\,\mathrm{keV}\)), \(^{68}\,\mathrm{Ga}\) (\(9.66\,\mathrm{keV}\)) and \(^{68}\mathrm{Ge}\) (\(10.37\,\mathrm{keV}\)) can be resolved, which has corresponding L-shell peaks at 1.10, 1.19 and \(1.30\,\mathrm{keV}\) (Fig. 5, light blue). While the K-shell peaks are well separated from a WIMP signal in the analysis parameter space, the 3 L-shell peaks can have significant overlap with a signal for the lowest WIMP masses probed. Depending on the analysis threshold \(E_\text {heat}^\mathrm{min}\) of each detector, the fraction of those peaks in the RoI can vary significantly, from almost full coverage to only a tail of the gaussian peak. 
With the known L/K-shell ratio of 11 % [20] and the calculated peak fraction above threshold we extrapolate the rate of L-shell X-ray events. For this we perform a sideband fit of fiducial events in the electron recoil band \(3\,\mathrm{keV}_{\mathrm{ee}}< E_{\mathrm{heat}}, E_{\mathrm{ion}} < 30\,\mathrm{keV}_{\mathrm{ee}}\) with a separate likelihood model including all K-shell peaks, Compton gammas and tritium \(\beta \)-events. We find the extrapolated rate of tritium decay for each detector to be in agreement within uncertainties with the rates found in [19]. Systematic uncertainties for all ER-components in the RoI are propagated from the errors of this sideband fit and are typically \(\mathcal {O}(30\,\%)\).
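The extrapolation from K-shell to L-shell rates can be illustrated with a short Python sketch. It scales a fitted K-shell count by the L/K ratio of 11 % and by the fraction of the Gaussian L-shell peak lying above the heat threshold. The K-shell count and the heat resolution are hypothetical numbers, the trigger efficiency is ignored, and the function names are illustrative.

```python
import math

L_OVER_K = 0.11  # L/K-shell ratio quoted in the text


def fraction_above(threshold, peak, sigma):
    """Fraction of a Gaussian peak (mean `peak`, width `sigma`) above `threshold`."""
    return 0.5 * math.erfc((threshold - peak) / (math.sqrt(2.0) * sigma))


def expected_l_shell(n_k, peak_l, threshold, sigma_heat):
    """Expected L-shell counts above threshold for a fitted K-shell count n_k."""
    return n_k * L_OVER_K * fraction_above(threshold, peak_l, sigma_heat)


N_K_GE68 = 200.0  # hypothetical fitted K-shell count for the 68Ge line
SIGMA_HEAT = 0.2  # keVee, hypothetical heat resolution
for thr in (0.9, 1.5):
    print(thr, round(expected_l_shell(N_K_GE68, 1.30, thr, SIGMA_HEAT), 2))
```

With these illustrative numbers, almost the full 1.30 keV peak lies above a threshold of \(0.9\,\mathrm{keV}_{\mathrm{ee}}\), whereas only the upper tail survives a threshold of \(1.5\,\mathrm{keV}_{\mathrm{ee}}\).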

Fig. 4

Sideband data (\(E_{\mathrm{ion}} < 0\,\mathrm{keV}_{\mathrm{ee}}\), data bins) and modelled heat-only PDF in the RoI (\(E_{\mathrm{ion}} > 0\,\mathrm{keV}_{\mathrm{ee}}\)) for detector FID824. The regions containing 10 %/90 % of the signal density for a WIMP with \(m_{\chi } = 5\,\mathrm{GeV/}c^2\) and \(m_{\chi } = 10\,\mathrm{GeV/}c^2\) are shown as dashed purple and solid magenta contours, respectively. The dashed black line indicates the analysis threshold in heat energy for this detector. Any signal contribution to the sideband can be considered negligible for the cross sections probed in this analysis

3.4 Unrejected surface events

At higher energies, the fiducial cut allows the rejection of all surface events, as they would induce a clear signal \(E_{\mathrm{veto}}\) on one of the two veto electrodes. For low ionization energies however, the rejection can fail. If the ionization energy of a surface event is low enough, so that \(E_{\mathrm{veto}} < 1.64 \, \sigma _{\mathrm{veto}}\), the event passes the cut. For particle types with low ionization yield \(Q_{\mathrm{i}}\), the produced ionization energy is smaller, and therefore less charge is collected on the veto electrodes to reject surface events. The surface events in this analysis are mostly \(^{206}\mathrm{Pb}\)-recoils and \(\beta \)-particles originating from the \(^{238}\mathrm{U}\) decay chain of surrounding materials such as \(^{222}\mathrm{Rn}\) daughter isotopes [15]. Those particles have a small penetration depth or even scatter on the crystal surface. Another possible component would originate from the electron recoils described in Sect. 3.3, which are also produced in the near-surface volume. However, due to their high ionization yield of \(Q_{\mathrm{ER}} \approx 1\), the rejection of these surface events above the heat threshold \(E_{\mathrm{heat}}^\mathrm{min}\) is very efficient: the expected number of events in the RoI after the applied fiducial cut was calculated to be well below \(10^{-2}\) for all detectors and these events are therefore not considered in the analysis. For both \(\beta \)’s and \(^{206}\mathrm{Pb}\)-recoils, the spectrum in heat energy is extracted from a clear selection of surface events with energies \(E_{\mathrm{veto}} > 5\,\sigma _{\mathrm{veto}}\) and then extrapolated to the lower heat threshold within the RoI. The ionization yield of the events is fitted from the same sideband data. We do not include any uncertainty on the fitted \(Q_i\) as it is negligible with respect to the smearing due to the energy resolutions. 
Both energy spectra and ionization yield are determined for top and bottom surface of each detector independently.

Fig. 5

Energy spectra in the two observables \(E_{\mathrm{heat}}\), \(E_{\mathrm{ion}}\) for single-scattering events in the fiducial volume of detector FID824 passing all quality cuts. The projection of the best fit PDF for an individual fit to this data is shown in orange. The fitted signal component for \(m_{\chi } = 4\,\mathrm{GeV/}c^2\) is zero. All background components are scaled to their corresponding fit values: heat-only (red), Compton gammas (dark blue), tritium \(\beta \)’s (turquoise), cosmogenic K and combined L-shell peaks (light blue), \(\beta \)-events (green) and \(^{206}\text {Pb}\)-recoils (brown). Top distribution in heat energy, which is dominated by the exponential heat-only spectrum at energies near the analysis threshold (dashed black line). Bottom distribution in ionization energy showing a clear separation between the Gaussian heat-only noise around \(0\,\mathrm{keV}_{\mathrm{ee}}\) and the electron recoil background

3.5 Nuclear recoils from neutrons

Neutron background can mimic a WIMP signal, as neutrons can produce single-scatter nuclear recoils with the same ionization yield \(Q_{\mathrm{NR}}\) as WIMPs, following an exponential energy spectrum. We distinguish between two different sources of neutrons in our detectors: muon-induced and radiogenic neutrons. Simulations showed that in the energy range of this analysis, the number of single-scatter neutrons induced by muons is compatible with zero after vetoing [21]. For radiogenic neutrons produced by (\(\alpha \), n) reactions and spontaneous fission within the cryostat, Monte Carlo simulations have been performed with all known sources to derive their energy spectrum down to the lowest energies. The spectral shape of the radiogenic neutron background shows little dependence on the exact location of individual sources and can be parametrized by a double exponential law in the energy range of 2–20 \(\mathrm{keV}_{\mathrm{nr}}\), calibrated for nuclear recoil interactions. The normalization of the spectrum is derived from data taken with 17 detectors during the same EDELWEISS-III physics run. In the energy range of 10–100 \(\mathrm{keV}_{\mathrm{nr}}\), nine multiple-scattering events are found in the \(90\,\%\,\mathrm{C.L.}\) nuclear recoil band for a fiducial exposure of 1309 kg-days. This number cannot be reproduced with the simulation of all known sources and hints at an additional neutron source in the experiment. The Monte Carlo simulation is, however, able to reproduce the measured single-over-multiple ratio within uncertainties. We derive the normalized neutron spectrum for each detector by weighting it with the corresponding exposure in the present data set and with the single-over-multiple ratio of 0.45 from simulations.
After all cuts and efficiency corrections, the expected background from single scatter neutrons in the RoI is similar for all detectors and has an average value of \(\mu ^\mathrm{exp}_{\mathrm{neutron}} = 0.20 \pm 0.07\,\mathrm{events}\) (Table 1). Expected rates for individual detectors have a combined uncertainty of \(45\,\%\) coming from the single-over-multiple ratio uncertainty and the statistical error from the measurement of multiples.
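
The size of the quoted uncertainty can be made plausible with a short calculation. The sketch below assumes that the statistical error on the nine measured multiple-scattering events is Poissonian and that it is combined in quadrature with the uncertainty on the single-over-multiple ratio. The 30 % relative uncertainty on the ratio is a hypothetical value, chosen only because it reproduces a combined uncertainty of about 45 % under these assumptions; it is not quoted in the text.

```python
import math

def combined_relative_error(*relative_errors):
    """Combine independent relative uncertainties in quadrature."""
    return math.sqrt(sum(r * r for r in relative_errors))

n_multiples = 9                          # multiple-scattering events quoted in the text
stat_rel = 1.0 / math.sqrt(n_multiples)  # Poisson relative error on nine counts

# HYPOTHETICAL relative uncertainty on the single-over-multiple ratio.
ratio_rel_assumed = 0.30

total_rel = combined_relative_error(stat_rel, ratio_rel_assumed)
print(f"statistical: {stat_rel:.1%}, combined: {total_rel:.1%}")
```

The statistical term alone already amounts to about 33 %, so under these assumptions the measurement of multiples is the larger of the two contributions.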

Table 1 Rate of expected events for different types of backgrounds for detector FID824 and all detectors combined. Event rates for components of the same type have been summed up for demonstration purposes only with propagated systematic errors. During fitting all components are considered as separate PDFs with individual constraints. The background model is clearly dominated by heat-only events

With respect to the BDT-based analysis [1], most of the background components listed above are identical. Deviations are mainly related to the different fiducial cut and the resulting survival probability of background components. The preselection applied before the BDT analysis accepts more surface beta and gamma events than the present, stricter fiducial cut, leaving the BDT a larger population of these events to optimize its multi-parametric selection. The present fiducial cut effectively removes most of them. For the same reason we do not include so-called triple events with a signal on both fiducial electrodes and one veto electrode. Lastly, we intentionally treat bulk events from Compton \(\gamma \)’s and tritium \(\beta \)’s as two separate components in the likelihood analysis, although their energy spectra are approximately degenerate in the RoI. An overview of the expected event rates summarized for different types of backgrounds is given in Table 1. The total background for both detector FID824 and all detectors combined agrees with the observed number of events to within 1–2 %.

4 Profile likelihood analysis

With the model of the different background components i as PDFs \({\mathcal {P}}_{\mathrm{i}}(E_{\mathrm{heat}}, E_{\mathrm{ion}})\), we can now define for each detector a total PDF \({\mathcal {P}}_{\mathrm{tot}}\) as the sum of all backgrounds and the signal for a given WIMP mass \(m_\chi \):

$$\begin{aligned} {\mathcal {P}}_{\mathrm{tot}} (\sigma , \mathbf {\mu } \mid m_\chi ) = \frac{1}{\nu } \left[ \mu _{\chi } {\mathcal {P}}_{\chi }(m_\chi ) + \sum _{\mathrm{i}} \mu _{\mathrm{i}} {\mathcal {P}}_{\mathrm{i}} \right] \end{aligned}$$
(5)

where the combined fitted rate of all components is \(\nu = \mu _\chi + \sum _{\mathrm{i}} \mu _{\mathrm{i}}\), while the number of observed events is N. The rate of WIMP events \(\mu _\chi (\sigma )\) is a function of the parameter of interest, the WIMP-nucleon cross section \(\sigma \), and is proportional to the integrated signal in the RoI. All other fit parameters are the event rates \(\mathbf {\mu }\) for the different background components i. The shape of all PDFs is fixed and given by the detector resolutions of heat and ionization energies, as well as the ionization yield and energy spectra for each component. Each of the background event rates \(\mu _{\mathrm{i}}\) is a nuisance parameter and is constrained in the fit with a Gaussian constraint term. The expected rate \(\mu _{\mathrm{i}}^\mathrm{exp}\) for each background is calculated by integrating its unnormalized PDF, while the width \(\sigma _{\mathrm{i}}\) of the constraint term is given by the combined statistical and systematic error on this value and differs between backgrounds and detectors. With this information, we can construct for each detector the extended likelihood function in heat and ionization energies:

$$\begin{aligned}&\mathcal{L} \left( \sigma , \mathbf {\mu } \mid m_\chi \right) = \prod _{\mathrm{n=1}}^{N} \mathcal {P}_{\mathrm{tot}} \left( E_{\mathrm{heat}}^\mathrm{n}, E_{\mathrm{ion}}^\mathrm{n} \right) \nonumber \\&\quad \times \; \prod _{\mathrm{i}} \mathrm{Gauss} \left( \mu _{\mathrm{i}} \mid \mu _{\mathrm{i}}^\mathrm{exp}, \sigma _{\mathrm{i}} \right) \times \mathrm{Poisson} \left( N \mid \nu \right) \end{aligned}$$
(6)
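
The structure of Eq. 6 can be sketched in a few lines of code. The example below evaluates the negative logarithm of the extended likelihood for one detector, given callables for the normalized component PDFs. It is a minimal sketch under stated assumptions and not the RooFit implementation used in the analysis: the function names, the tuple layout of the background list and the flat toy PDF are hypothetical, and the signal rate is passed directly instead of being computed from \(\sigma \).

```python
import math

def gauss_logpdf(x, mean, sigma):
    """Logarithm of a normalized Gaussian density."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def poisson_logpmf(n, nu):
    """Logarithm of the Poisson probability of n events for expectation nu > 0."""
    return n * math.log(nu) - nu - math.lgamma(n + 1)

def nll(events, mu_signal, signal_pdf, backgrounds):
    """Negative log of the extended likelihood of Eq. 6 for one detector.

    events      : list of (E_heat, E_ion) pairs
    mu_signal   : signal rate mu_chi (fixed by sigma in the real analysis)
    signal_pdf  : normalized signal PDF, called as signal_pdf(e_heat, e_ion)
    backgrounds : list of (mu_i, pdf_i, mu_exp_i, sigma_i) tuples (hypothetical layout)
    The summed density must be positive at every event.
    """
    nu = mu_signal + sum(mu for mu, _, _, _ in backgrounds)
    log_l = poisson_logpmf(len(events), nu)          # Poisson(N | nu)
    for e_heat, e_ion in events:                     # product over P_tot
        density = mu_signal * signal_pdf(e_heat, e_ion)
        density += sum(mu * pdf(e_heat, e_ion) for mu, pdf, _, _ in backgrounds)
        log_l += math.log(density / nu)
    for mu, _, mu_exp, sigma in backgrounds:         # Gaussian constraint terms
        log_l += gauss_logpdf(mu, mu_exp, sigma)
    return -log_l

# Toy example: one flat background on the unit square, no signal.
def flat_pdf(e_heat, e_ion):
    return 1.0

toy_events = [(0.2, 0.5), (0.7, 0.1), (0.9, 0.9)]
toy_backgrounds = [(3.0, flat_pdf, 3.0, 0.5)]
print(nll(toy_events, 0.0, flat_pdf, toy_backgrounds))
```

Working with the negative log-likelihood turns the products of Eq. 6 into sums, which is numerically more stable and is the quantity a minimizer would act on.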

From these likelihood functions with detector specific PDFs and constraint terms, we can construct a joint likelihood function describing the data for all eight detectors combined:

$$\begin{aligned} \mathcal{L}_{\mathrm{comb}} (\sigma , \mathbf {\mu }_{1},\ldots ,\mathbf {\mu }_{8} \mid m_\chi ) = \prod _{\mathrm{j=1}}^{8} \mathcal{L}_{\mathrm{j}} (\sigma , \mathbf {\mu }_{\mathrm{j}} \mid m_\chi ) \end{aligned}$$
(7)
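
In terms of negative log-likelihoods, the product in Eq. 7 becomes a sum over detectors in which only \(\sigma \) is shared. The following sketch shows this bookkeeping with hypothetical names; the two quadratic toy functions stand in for the per-detector likelihoods and carry no physical meaning.

```python
def combined_nll(sigma, nuisances, detector_nlls):
    """Negative log of Eq. 7: sum of per-detector terms sharing one sigma.

    nuisances     : one nuisance-parameter list mu_j per detector
    detector_nlls : one callable nll_j(sigma, mu_j) per detector
    """
    if len(nuisances) != len(detector_nlls):
        raise ValueError("need exactly one nuisance set per detector")
    return sum(nll_j(sigma, mu_j) for nll_j, mu_j in zip(detector_nlls, nuisances))

# Toy stand-ins for two detectors (hypothetical, for illustration only).
toy_detector_nlls = [
    lambda sigma, mu: (sigma - 1.0) ** 2 + (mu[0] - 2.0) ** 2,
    lambda sigma, mu: (sigma - 3.0) ** 2 + (mu[0] - 5.0) ** 2,
]
toy_nuisances = [[2.0], [5.0]]
print(combined_nll(2.0, toy_nuisances, toy_detector_nlls))
```

In this toy case each detector alone prefers a different value of \(\sigma \), and the combined function is minimized in between, which mirrors how a common cross section is constrained by all detectors at once.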

Here, each detector has its own, independent background and signal PDFs, as well as nuisance parameters and constraint terms. The only common fit parameter shared by all likelihood terms is the WIMP-nucleon scattering cross section \(\sigma \). If the fit for a given WIMP mass results in a maximum likelihood estimator (MLE) for the cross section \({\hat{\sigma }}\) compatible with zero within errors, we can set an exclusion limit. For this we follow [22] and perform a hypothesis test based on the profile likelihood ratio \(\lambda (\sigma )\):

$$\begin{aligned} \lambda (\sigma ) = \frac{\mathcal{L} (\sigma , {\hat{\mathbf {\mu }}}^\prime )}{\mathcal{L} ({\hat{\sigma }}, {\hat{\mathbf {\mu }}})} \end{aligned}$$
(8)

where \({\hat{\mathbf {\mu }}}^\prime \) are the MLEs of the nuisance parameters obtained when maximizing the likelihood for a fixed value of the cross section \(\sigma \). The test statistic \(q_\sigma \), used to reduce the data to a single value, is defined as:

$$\begin{aligned} q_\sigma = {\left\{ \begin{array}{ll} {-2}\ln \lambda (\sigma ) &{} \sigma \ge {\hat{\sigma }} \\ 0 &{} \sigma < {\hat{\sigma }} \end{array}\right. } \end{aligned}$$
(9)
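
Since \(-2\ln \lambda (\sigma )\) is twice the difference between the conditional and the global minimum of the negative log-likelihood, Eq. 9 translates directly into code. The sketch below uses hypothetical function names and a parabolic toy profile in place of a real fit.

```python
def q_sigma(sigma, sigma_hat, nll_conditional, nll_global):
    """One-sided test statistic of Eq. 9.

    nll_conditional : minimum of -ln L over the nuisance parameters at fixed sigma
    nll_global      : minimum of -ln L over sigma and the nuisance parameters
    """
    if sigma < sigma_hat:
        return 0.0
    # Clamp at zero to absorb small negative values from numerical noise.
    return max(0.0, 2.0 * (nll_conditional - nll_global))

def toy_profile_nll(sigma, sigma_hat=1.0, err=0.5):
    """Parabolic toy profile of -ln L around sigma_hat (illustration only)."""
    return 0.5 * ((sigma - sigma_hat) / err) ** 2

best_fit = 1.0
for tested in (0.5, 1.0, 2.0):
    q = q_sigma(tested, best_fit, toy_profile_nll(tested), toy_profile_nll(best_fit))
    print(f"sigma = {tested:.1f} -> q = {q:.1f}")
```

Setting \(q_\sigma = 0\) for \(\sigma < {\hat{\sigma }}\) ensures that an upward fluctuation of the data is not counted as evidence against the tested signal hypothesis when deriving an upper limit.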

The probability density functions \(f(q_\sigma {\mid } H_\sigma )\) and \(f(q_\sigma {\mid } H_0)\) under the signal hypothesis \(H_\sigma \) and the background-only hypothesis \(H_0\) are used to find the cross section \(\sigma \) for which \(H_\sigma \) can be excluded at 90 % C.L., while correcting for downward fluctuations of the background following the prescription in [23]. Parametrising \(f(q_\sigma {\mid } H_\sigma )\) and \(f(q_\sigma {\mid } H_0)\) with the approximation described in [22] was found to yield limits with a C.L. of less than 90 % for some of the probed WIMP masses. We therefore use Monte Carlo generated datasets to derive all upper limits. Calculations of limits are performed with the RooStats package [24], based on the RooFit [25] framework, with which all PDFs and likelihood functions are constructed.
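
How toy distributions of \(q_\sigma \) enter the exclusion decision can be sketched as follows. The example assumes that the correction for downward background fluctuations takes the form of a ratio of tail probabilities under \(H_\sigma \) and \(H_0\) (a CLs-style ratio); whether this matches the prescription in [23] in detail is an assumption made here. The toy values of \(q_\sigma \) are generated on a fixed grid for illustration and do not come from any fit.

```python
def tail_fraction(toys, q_obs):
    """Fraction of toy experiments with a test statistic of at least q_obs."""
    return sum(1 for q in toys if q >= q_obs) / len(toys)

def corrected_p_value(q_obs, toys_signal, toys_background):
    """Ratio of tail probabilities under H_sigma and H_0 (ASSUMED CLs-style form)."""
    p_signal = tail_fraction(toys_signal, q_obs)
    tail_background = tail_fraction(toys_background, q_obs)
    if tail_background == 0.0:
        raise ValueError("no background-only toys beyond q_obs; generate more toys")
    return p_signal / tail_background

# Hypothetical toy distributions of q_sigma (fixed grids, illustration only).
toys_h_sigma = [0.10 * k for k in range(20)]
toys_h_0 = [0.15 * k for k in range(20)]
q_observed = 1.85

p_sigma = tail_fraction(toys_h_sigma, q_observed)
p_corrected = corrected_p_value(q_observed, toys_h_sigma, toys_h_0)
print(f"uncorrected p-value: {p_sigma:.3f}, corrected: {p_corrected:.3f}")
print("excluded at 90 % C.L.:", p_corrected < 0.10)
```

In this toy case the uncorrected p-value of 0.05 alone would exclude the signal hypothesis at 90 % C.L., whereas the corrected value of about 0.14 does not. This illustrates how the correction prevents exclusions that are driven by a downward fluctuation rather than by sensitivity to the signal.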

5 Results

The likelihood fit of the selected data after all cuts is a two-step process. First, an unbinned fit is performed independently for each of the eight detectors, using the constrained likelihood functions \(\mathcal{L}_{\mathrm{j}}\) described in Eq. 6. This yields the best fit values of all nuisance parameters describing the different backgrounds for each detector. With these values as a starting point, a combined fit over all detectors is performed, in which the individual signal PDFs \({\mathcal {P}}_\chi (m_\chi )\) share a common WIMP-nucleon scattering cross section \(\sigma \). The result of the individual fit of a WIMP signal with \(m_\chi = 4\,\mathrm{GeV/}c^2\) to data of detector FID824 is shown in Fig. 5. Fitted signal rates for three different WIMP masses are given in Table 2. For FID824, which is the most sensitive detector due to its low energy threshold \(E_{\mathrm{heat}}^\mathrm{min} = 0.9\,\mathrm{keV}_{\mathrm{ee}}\), no signal is fitted for any of the probed WIMP masses \(m_{\chi } \in [4, 30]\,\mathrm{GeV/}c^2\). The same is valid for detector FID839 for all WIMP masses. For several other detectors (FID827, FID838, FID841 and FID842) a strong degeneracy between the WIMP signal and heat-only events is observed. This degeneracy is only present up to WIMP masses of around \(10\,\mathrm{GeV/}c^2\). As a consequence, a signal is fitted for these detectors while the rate of heat-only events is underestimated by the same magnitude. However, the resulting signals are always associated with large uncertainties, and they are compatible with zero within \(2\,\sigma \). Three detectors (FID825, FID837 and FID838) have events above \(E_\text {heat} = 2\,\mathrm{keV}_{\mathrm{ee}}\) which are close to the ionization yield \(Q_{\mathrm{NR}}\) expected for nuclear recoils, as shown in Fig. 1. For these events a degeneracy between the WIMP signal and the neutron component is observed, depending on the WIMP mass \(m_\chi \).
For masses above \(m_\chi \approx 10\,\mathrm{GeV/}c^2\) those events are better described by the signal component, as the expected rate of neutrons is constrained to much lower average values. Depending on the detector, between 1 and 3 events are attributed to the signal, which is in all cases compatible with no signal. The best fit rates for all other backgrounds are in good agreement with the values expected from their constraint terms, independent of the probed WIMP mass. In the combined fit over all detectors with a common signal cross section \(\sigma \), the degeneracies for individual detectors are alleviated and no signal is fitted up to a mass of \(m_\chi = 6\,\mathrm{GeV/}c^2\). For higher masses, the aforementioned nuclear recoil candidate events lead to a positive but not significant signal, which peaks at \(m_\chi = 7\,\mathrm{GeV/}c^2\) and then decreases again. Due to the similar exposure and the small influence of the different heat thresholds \(E_{\mathrm{heat}}^\mathrm{min}\) for larger WIMP masses, the fitted cross section corresponds to a fairly similar signal rate in all eight detectors. Again, all of the fitted signals in the combined fit are compatible with zero.

Table 2 Analysis threshold \(E_{\mathrm{heat}}^\mathrm{min}\) (in \(\mathrm{keV}_{\mathrm{ee}}\)) for all detectors together with the fitted rate \(\mu _\chi \) of signal events for the individual fit of data from single detectors and three different WIMP signal masses. For the combined fit over all detectors with a common signal the total signal rate is given together with the corresponding best fit cross section \(\sigma \)

Fig. 6 Calculated \(90\,\%\,\mathrm{C.L.}\) exclusion limit on the spin-independent WIMP-nucleon scattering cross section \(\sigma \) as a function of WIMP mass \(m_\chi \) for the combined fit over all detectors (solid red). The green and yellow bands represent the \(1\,\sigma \) and \(2\,\sigma \) confidence bands of the expected median sensitivity (dashed black). Shown for comparison is the result of the BDT-based analysis [1] (dashed red). Contours show possible signals from CDMS-II (Si) [8] (blue), DAMA [26] (brown), CRESST-II [27] (pink) and CoGeNT [7] (orange). Other existing exclusion limits are from EDELWEISS-II [9] (small red dashes), CoGeNT [28] (orange), CRESST [29] (pink), SuperCDMS [5] (purple), XENON100 [30] (black), CDMSlite [31] (dashed violet) and LUX [4] (green)

In the absence of a statistically significant signal for any of the probed WIMP masses, we set 90 % C.L. upper limits on the WIMP-nucleon cross section \(\sigma \) using the profile likelihood based test statistic described in Eq. 9. The resulting exclusion limit is shown in Fig. 6. For masses below \(m_\chi = 6\,\mathrm{GeV/}c^2\) the observed limit is better than the expected median sensitivity due to an underfluctuation of background in the most sensitive detectors. For masses above \(m_\chi = 10\,\mathrm{GeV/}c^2\) the limit is 2–3\(\,\sigma \) above the expected sensitivity, due to the presence of NR candidate events in multiple detectors which are in excess with respect to the expected neutron background. This excess is in good agreement with the observations quoted in [1], as both data sets contain these events. However, the 90 % C.L. limit of the analysis presented here is a factor of seven stronger for \(m_\chi = 4\,\mathrm{GeV/}c^2\), due to the absence of cuts other than the fiducial selection and the resulting higher signal efficiency, as well as a subtraction of the expected backgrounds. For higher masses above \({\sim }15\,\mathrm{GeV/}c^2\), limits from the BDT and likelihood approaches are in very good agreement. For these masses the nuclear recoil spectrum of WIMPs extends to high energies, where it can be well discriminated from the dominating heat-only background and the electron recoil component. In this case, the BDT analysis can easily separate a signal with high efficiency and there is no gain in performing a likelihood analysis. The \({\sim }15\,\%\) lower exposure due to the stricter fiducial cut with a total acceptance of \(81\,\%\) is compensated by a higher signal efficiency, and the resulting limits are very similar. We also find good agreement between the ratios of observed and expected exclusion limits for the two analyses, which we consider to be an inherent property of the data and a confirmation of the validity of this approach.

6 Conclusion

We have presented a search for low-mass WIMPs with the EDELWEISS-III experiment, using eight selected detectors and data taken with a total fiducial exposure of 496 kg-days after all cuts. A data-driven approach was used to model relevant backgrounds from sideband data. For each detector a likelihood function describing the data in heat and fiducial ionization energies was constructed, with constraint terms for each of the nuisance parameters taking into account systematic uncertainties. No statistically significant signal was found, neither for the fit of data from single detectors, nor for a combined fit over all detectors with a common signal cross section. Exclusion limits were set with a hypothesis test using a profile likelihood based test statistic, including corrections for under-fluctuations of the background. At 90 % C.L. we exclude spin-independent WIMP-nucleon scattering cross sections above \(\sigma = 1.6 \times 10^{-39}\,\text {cm}^2\) (\(6.9 \times 10^{-44}\,\text {cm}^2\)) for a WIMP mass of \(m_\chi = 4\,\mathrm{GeV/}c^2\) (\(m_\chi = 30\,\mathrm{GeV/}c^2\)). Thanks to the higher signal efficiency and a subtraction of the expected backgrounds, the likelihood analysis shows an improvement of a factor of \({\sim }7\) for \(4\,\mathrm{GeV/}c^2\) WIMPs compared to a BDT-based analysis, while reproducing the limit at \(15\,\mathrm{GeV/}c^2\) and above. The results and achieved sensitivity underline the power of a maximum likelihood analysis based on detailed background models.