Background modeling for dark matter search with 1.7 years of COSINE-100 data

We present a background model for dark matter searches using an array of NaI(Tl) crystals in the COSINE-100 experiment that is located in the Yangyang underground laboratory. The model includes background contributions from both internal and external sources, including cosmogenic radionuclides and surface $^{210}$Pb contamination. To build the model in the low energy region, with a threshold of 1 keV, we used a depth profile of $^{210}$Pb contamination in the surface of the NaI(Tl) crystals determined in a comparison between measured and simulated spectra. We also considered the effect of the energy scale errors propagated from the statistical uncertainties and the nonlinear detector response at low energies. The 1.7 years COSINE-100 data taken between October 21, 2016 and July 18, 2018 were used for this analysis. Our Monte Carlo simulation provides a non-Gaussian peak around 50 keV originating from beta decays of bulk $^{210}$Pb in a good agreement with the measured background. This model estimates that the activities of bulk $^{210}$Pb and $^{3}$H are dominating the background rate that amounts to an average level of 2.85$\pm$0.15 counts/day/keV/kg in the energy region of (1-6) keV, using COSINE-100 data with a total exposure of 97.7 kg$\cdot$years.

Abstract We present a background model for dark matter searches using an array of NaI(Tl) crystals in the COSINE-100 experiment that is located in the Yangyang underground laboratory. The model includes background contributions from both internal and external sources, including cosmogenic radionuclides and surface 210 Pb contamination. To build the model in the low energy region, with a threshold of 1 keV, we used a depth profile of 210 Pb contamination in the surface of the NaI(Tl) crystals determined in a comparison between measured and simulated spectra. We also considered the effect of the energy scale errors propagated from the statistical uncertainties and the nonlinear detector response at a e-mail: ejjeon@ibs.re.kr b e-mail: yjko@ibs.re.kr low energies. The 1.7 years COSINE-100 data taken between October 21, 2016 and July 18, 2018 were used for this analysis. Our Monte Carlo simulation provides a non-Gaussian peak around 50 keV originating from beta decays of bulk 210 Pb in a good agreement with the measured background. This model estimates that the activities of bulk 210 Pb and 3 H are dominating the background rate that amounts to an average level of 2.85±0.15 counts/day/keV/kg in the energy region of (1-6) keV, using COSINE-100 data with a total exposure of 97.7 kg·years.

Introduction
COSINE-100 is an NaI-based experiment for the direct detection of dark matter particles, with an array of 106 kg NaI(Tl) crystals. It has been operating at the Yangyang underground laboratory (Y2L) since September 2016 [1,2,3]. One of the COSINE-100 goals is to test DAMA/LIBRA's assertion of an observation of annual modulation signal [4,5]. The DAMA/LIBRA collaboration claimed that their results with the release of Phase-II data and 1 keV energy threshold einforces the annual modulation signature at 9.5 σ C.L. in the energy region of (1-6) keV [5]. There are several groups, such as ANAIS [6], PICOLON [7], and SABRE [8], using the same low-background NaI(Tl) crystals with the goal of reproducing the DAMA/LIBRA results.
To verify the DAMA/LIBRA modulation signal, a complete understanding of the background energy spectrum is required. We have developed a background model by performing Monte Carlo (MC) simulations using the Geant4 toolkit [9]. We used 1.7 years of COSINE-100 data taken from October 21, 2016 to July 18, 2018 with a 106 kg array of low background NaI(Tl) crystals. We used the measured spectrum obtained with a threshold of 1 keV electron equivalent energy [10]. To build a complete background model we investigated the low-energy contribution from the surface 210 Pb contamination in the NaI(Tl) crystals [11], in addition to background simulations of radioactive contaminants, such as natural radioisotopes and cosmogenically activated isotopes inside NaI(Tl) and background sources from the exterior of the crystals.

The COSINE-100 experiment
The main detector of COSINE-100 is a 106 kg array of eight ultra-pure NaI(Tl) crystals (named as C1-C8) stacked in two layers. Each crystal's lateral surfaces are wrapped in roughly 10 layers of 250 µm-thick PTFE reflective sheets (Teflon) and they are hermetically encased in copper tubes with wall thickness of 1.5 mm and quartz windows (12.0 mm thick) at each end. The encapsulated NaI(Tl) crystal assembly is equipped with two 3-inch Hamamatsu R12669SEL photomultiplier tubes (PMTs) that are each configured to generate two readouts, a high-gain signal from the anode and a low-gain signal from the 5 th -stage dynode that are acquired in independent channels. This is because R12669SEL PMTs suffer from non-linear behavior when the signal energy is higher than about 1 MeV and, thus, the dynode signal is amplified with a linear response for energies up to 3 MeV [12]. The crystals array is immersed in a 2200liter liquid scintillator (LS) that serves both as an active veto and as a passive shield. Four shielding layers exist comprising plastic scintillator panels, a lead-brick castle, a copper box, and the tank of LS. Figure 1 shows the detector geometry and the details of the experimental setup is described in Ref. [1].
We present here the background modeling to represent 1.7 years of COSINE-100 data with a total effective mass of 61.3 kg from crystals C2, C3, C4, C6 and C7, excluding three crystals due to a high noise rate and low light yields. The five crystals used in this analysis have light yields of about 15 photoelectrons/keV and allowed an energy threshold of 2 keV in the previous analysis [1]. We consider crystal signals when corresponding to at least four photoelectrons and LS signals when exceeding a 20 keV energy threshold [13]. Crystal events are classified as multiple-hit if they are accompanied by a concomitant signal in one or more other crystals or in the LS. Events for which none of the two occurred are classified as single-hit.

Energy calibration
To determine the light characteristics of the crystals, including light yields, energy scales, and energy resolutions, a calibration has been performed using internal β-and γ-ray peaks from radioactive contaminants in the crystals, as well as using γ-ray sources. Regarding two readouts from the PMT, the high-gain anode signals are used for low energy events up to 70 keV, while the 5 th -stage dynode signals are used above this energy. The peaks at 238 keV from 212 Pb, 295 keV and 352 keV from 214 Pb, 1173 keV from 60 Co, 1462 keV from 40 K, 2614 keV from 208 Tl, and 609, 1764, and 2204 keV from 214 Bi are used for the high-energy calibration. The peaks at 0.9 keV from 22 Na, 3.2 keV from 40 K, 25.5 keV from 109 Cd, 30.5 keV from 121 Te, 49 keV from 210 Pb, and 67.8 keV from 125 I are used for the low-energy calibration; peaks from short-lived cosmogenic isotopes (e.g. 125 I and 121 Te) are obtained using the 59.5 day data.
In addition, a nonlinear detector response of NaI(Tl) crystals in the low energy region, as reported in Ref. [14], is studied and modeled empirically as a function of energy, E, as following: where p i=0,1,2,3 are free-floating parameters. The chargeto-energy ratios of the calibration data points for five NaI(Tl) crystals are fitted by Eq. 1, as shown in Fig. 2. Relative Ratio

Background simulations
We use the Geant4-based simulation framework developed for modeling the background spectra of the first 59.5-day of COSINE-100 [15], although we now adopted G4 version 10.4.2. The newer version better describes the non-Gaussian peak around 50 keV from the beta decay of 210 Pb; it is attributed to 46.5 keV emissions of conversion/Auger electrons and γ/X-ray together with about 4 keV mean energy of beta electrons from the decay to the excited state of 210 Bi, which results in a non-Gaussian peak. It was not well reproduced by the simulations using G4 version 9.6.2.
Each simulated event records all energies deposited in the crystals within an event window of 10 µs from the time when a decay is generated, to account for the conditions in the data acquisition system (DAQ) of the experimental setup [7]. Consecutive decays occurring in a short time, such as 212 Bi-212 Po decays with 212 Po's half-life of 300 ns, appear together in a 10 µs time window, resulting in pileup events. They are treated as a single event in the simulation. Based on this framework, we carried out MC simulations for all the possible background sources to build a complete model of the background measurements with 1 keV energy threshold.
The simulated spectrum was convolved with the energy resolution as a function of energy obtained during the calibration, described in Sect. 2.1.

Internal and external backgrounds
With the first 59.5 days of data, the detector background was investigated with simulated background spectra. We simulated full decay chains of 238 U and 232 Th inside the eight NaI(Tl) crystals assuming a chain equilibrium [15]. However, the background levels may vary over the time if the internal activities are not in a chain equilibrium. We indeed found an evident increase in the 228 Th background level during the 1.7 years of data when compared with the 59.5 day data. Consequently, the 238 U and 232 Th decay chains are treated as broken at the long-lived parts of the chain in the simulations, i. e. into five groups and three groups, respectively. The activities of each group are free parameters in the fit to the data spectrum.
We simulated external background sources in the COSINE-100 experiment configuration: PMTs, greases, copper cases, bolts, cables, acrylic supports, liquid scintillator, copper box, and a steel structure that supports the lead block housing. The background due to the 235 U chain from the PMTs, which was also reported by [16] is included. The contribution of γs from the 208 Tl decay in 232 Th decay chains in materials outside the shielding as an environmental background is included to identify the peak at 2.614 MeV. Table 1 lists all the cosmogenic radioactive isotopes produced in the NaI(Tl) crystals in COSINE-100, as reported in Ref. [17], with their half-lives; short-lived isotopes, for which half-lives are less than a year, are 125 I, 121 Te, 121m Te, 123m Te, 125m Te, 127m Te, and 113 Sn and long-lived isotopes are 109 Cd, 22 Na, 3 H, and 129 I. There are three long-lived nuclides namely 3 H, 22 Na, and 109 Cd, which have low energy deposits and are, therefore, potentially troublesome. The beta-decay spectrum of tritium has an endpoint energy of 18 keV. The electron capture decay of 22 Na produces 0.87 keV emissions, and the electron capture decay of 109 Cd contributes peaks at 25.5 keV and around 3.5 keV which are at the binding energies of the Ag K-shell and L-shell electrons. In addition, the electron capture decays of 113 Sn and 121m Te produce an X-ray peak at the L-shell energy of 3 keV. These short-lived (T 1/2 < 1 year) cosmogenic isotopes are not expected to contribute significantly to the crystals in the long term. However, some backgrounds from them are still expected as averaged activities during 1.7 years. It is thus essential to understand their background contributions to the low energy spectra regions, especially in the (1-6) keV dark matter signal region of interest (ROI). We, therefore, simulated backgrounds from cosmogenic radioactive isotopes, listed in Table 1. The simulated background spectral shapes are used in Table 1 Cosmogenic radionuclides in the NaI(Tl) crystals identified in other studies and considered here [17]. We list the contributing cosmogenic isotopes with their half lives.

Cosmogenic
Half-life [ the data fitting, while the background rates are left free in the fit. The fitted results are compared with the measurements reported in Ref. [17] and the details of these comparisons are discussed in Sect. 4.
As the presence of cosmogenic 129 I was introduced by DAMA/LIBRA with the estimated concentration of 129 I/ nat I = (1.7±0.1)×10 −13 [21], we included it in our background fitting model, by treating it as a free parameter. The beta decay of 129 I to 129 Xe * is followed by 129 Xe * transitioning to the stable 129 Xe isotope via the emission of a 39.6 keV γ-ray and the resulted spectral feature has a distribution starting around ∼45 keV; this energy region is dominated by bulk 210 Pb that was found to be a main contributor in ROI in the previous analysis with the 59.5 days data. It is thus necessary to distinguish these background contributions, quantitatively. Figure 3 shows the fitted simulation spectra (a) not including 129 I and (b) including 129 I (green solid line). The inclusion of the contribution from 129 I in Crystal 6, as shown in Fig 3(b), improves the adherence of our background model to the data around 30 to 70 keV. The details of the modeling, including 129 I, for each NaI(Tl) crystal are discussed in Sect. 4.

Surface 210 Pb
The Q value of the beta decay of 210 Pb is 63.5 keV; it decays to the stable 210 Bi with a branching ratio of 16% and decays to the excited state of 210 Bi at 46.5 keV with a branching ratio of 84%. Because the de-excitation of the 210 Bi excited state associates with low-energy emissions of electrons and γ/X-rays, it leaves the full energy deposition with the peak at ∼50 keV if it is positioned deep enough in the crystal, while the spectral features of these events depend on the depth distribution of 210 Pb within the crystal surface. It has also been suggested that the surface 210 Pb are attributed to the 222 Rn contamination that occurred anytime during the powder-and/or crystal-processing stages. To understand the energy spectra from the beta decays of 210 Pb in the crystal surface, we simulated them by generating 210 Pb at random locations within the surface thickness of 1 µm in the crystal. The simulated spectra are depicted in Fig. 4. As expected, there are spectral features in the energy below 40 keV, which are presumably contributed by conversion electrons depending on the depth position of 210 Pb. Therefore, the depth profile of the surface 210 Pb contamination should be taken into account for the detector background in the low-energy region of (1-6) keV.
We have studied the surface 210 Pb contamination with a test setup at Y2L using a NaI(Tl) crystal from the same ingot as C6 and C7 [11]. We measured its depth profile by using the measured spectra from both beta decay of 210 Pb and alpha decay of 210 Po of the decay sequence of the surface 210 Pb contamination that is obtained using a 222 Rn-contaminated crystal, as reported in Ref. [11]. Using this study, it was found that the low-energy spectrum of the surface 210 Pb contamination is primarily attributed to depth profiles of 210 Pb exponentially distributed within a shallow surface with a mean depth of (0.107±0.003) µm, as well as a deep surface with a mean depth of (1.39±0.02) µm. We thus simulated beta decays of 210 Pb according to these two depth profiles and included the shapes in the fit to the data. The rates of the two components are left free and independent in the fit as their relative weight might depend on the 222 Rn exposure. Figure 5 shows the simulated energy spectra of the two surface components for Crystal 7.

Energy scale
We have improved the background modeling in the low energy region by precisely studying the low energy contributions from the background sources such as the surface 210 Pb contamination and long-lived cosmogenic isotopes. However, there is still a little mis-matching between data and MC spectra at low energies. It is presumably because the energy scales are set separately for the anode readout and the dynode readout, based on linear fits of calibration data points. They have errors propagated from the statistical uncertainty, as well as the nonlinear detector response, as described in Sect. 2.
We, thus, consider an adjustment coefficient in the MC spectrum for the energy scale errors.
The energy E in the MC spectrum corresponding to the scaled energy of the dynode readout is adjusted as where is a coefficient that represents a change in energy. The i th bin content of the MC spectrum, B i , can be approximated as where we use a numerical approach to obtain the derivative as where δ represents a very small change in and E i denotes the central value of the i th energy bin. A linear interpolation of the MC spectrum is used for the small variation of δ . The coefficient in Eq. 3 is determined by fitting the MC spectrum to the data. Since there is the nonlinear detector response modeled by the empirical function obtained in Sect. 2, at low energies, adjusting the energy in the MC spectrum corresponding to the scaled energy of the anode readout follows a different procedure from that of the dynode readout, and is expressed by where f (E) is the empirical function defined in Eq. 1 and C is a coefficient for the linear component determined by fitting the MC spectrum to the data. Figure 6 shows the results considering the adjustment coefficient (blue line) and without considering the adjustment coefficient (red line) in the background modeling fit. It is shown that the background modeling has been improved with the adjustment coefficient.

Background modeling and results
In order to model the measured energy spectrum ranged from 1 keV quantitatively, we have performed Geant4 MC simulations for the background spectra, as described in Sect. 3, which are fitted to the measured data to quantify their background rates. We use a binned likelihood method with the following formula [22], where λ( α) is the likelihood ratio in terms of the rates of the MC components α = (α 1 , α 2 , · · · , α Ncomponents ), D i is the number of events in the i th energy bin of the data histogram and B ij is the number of events in the i th bin of the j th simulation component. The last term denotes a penalty for the rate α j of the j th component and is only active if there is an independent measurement of this component; m j and σ j are the measured value and the error, respectively. As mentioned in Sect. 3.4, low-and high-energy data are taken through anode and dynode channels, respectively, and they have different energy resolutions. Thus, we perform a four-channel simultaneous fit: singlehit low-energy, single-hit high-energy, multiple-hit lowenergy, and multiple-hit high-energy spectra. The fitting ranges of low-energy spectra are [6,70] and [1,70] keV for single-and multiple-hit events, respectively, while that of high-energy spectra is [70, 3000] keV for both single-and multiple-hit events. The lower bound of the energy for multiple-hit events is extended to 1 keV based on the study of lowering the energy threshold, reported in Ref. [10]. The lower bound for the single-hit events is Fig. 7 The energy spectra of single-hit (top) and multiple-hit (bottom) events in Crystal 7. The MC was carried out to fit the measured data. The shaded area in the energy spectra of single-hit events is excluded from the data fitting. set to 6 keV in order not to bias the WIMP signal in the ROI. Figure 7 shows the measured and simulated background spectra of Crystal 7 in both low and high energy regions. The spectra for single-hits and multiple-hits are shown in the top and the bottom, respectively. One can see that the 1.7-year data is well reproduced overall except for the energy region higher than ∼2.7 MeV for single-hit events, while it is well reproduced in multiplehit events. This issue is presumed to be due to the absence of one or more components that could better account for the energy range above 2.7 MeV, and the analysis will continue to figure out the issue. The agreement between the measured and fitted background spectra for both single-and multiple-hit events of Crystal-2, 3, 4, and 6 in both low and high energy regions is as good as shown for Crystal 7.
In the modeling fit, to distinguish between surface and bulk 210 Pb components, we used the depth profiles of the surface 210 Pb, obtained in Ref. [11]; there are two depth profiles for shallow and deep depth distributions of 210 Pb in the crystal surface and their fitted results are listed in Table 2.
In Fig. 8(a) and (b), we compared the fitted activities of internal 40 K and 210 Pb to their measured levels that are determined by the 1.7 years of data for the five crystals with an agreement at the ∼20% level. The fitted activities of long-lived cosmogenic isotopes: 3 H, 22 Na, and 109 Cd for the five crystals are compared with the measured ones, reported in Ref [17], as shown in Fig. 8(c),(d),(e); these values are in reasonable agreement. Figure 8(f) also shows the fitted activities of 113 Sn compared with the measured ones for five crystals.
Based on the background model, we found the background levels for the five NaI(Tl) detectors in unit of counts/day/keV/kg in (1-6) keV as listed in Table 2; an averaged background level for the five crystals is estimated to be 2.85±0. 15  energy region of (1-6) keV. The dominant background contributions are from 210 Pb and 3 H. Figure 9 shows the low-energy spectra of single-hit events averaged for the five crystals in the (1-20) keV energy region. The range of 1 to 6 keV in the MC spectrum is extrapolated from the modeling. The measured energy spectrum after corrections for event selection efficiency [23] is compared with the total of the simulations and is shown to be in a good agreement.

Conclusion
COSINE-100 has been taking data at Y2L from October 21, 2016. We present the background model for the  Fig. 9 The low-energy spectra of single-hit events averaged for the five crystals. The measured energy spectrum after efficiency corrections [23] is compared with the total of the simulations. The range of 1 to 6 keV in the MC spectrum is extrapolated from the modeling.
WIMP search during the first 1.7 years of COSINE-100 data with a total exposure of 97.7 kg·years. Our previous analysis with 59.5-day data showed that 210 Pb and 3 H produce the dominant contributions in the energy region of (2-6) keV. As we lowered the threshold to 1 keV, the background modeling was carried out accordingly. The model includes background contributions from both internal and external sources, including cosmogenic radionuclides and surface 210 Pb contamination. To build the background model with the energy threshold as low as 1 keV, we used a depth profile of the surface 210 Pb contamination that is provided by the measurement with a test setup at Y2L. We also considered the effect of the energy scale errors propagated from the statistical uncertainty and the nonlinear detector response for the simulated spectrum at low energy. This improved background model well matches the measured data not only for single-hit events but also for multiple-hit events. Extrapolating our background model into the Dark matter ROI, we estimate an average background level of 2.85±0.15 counts/day/keV/kg in the energy region of (1-6) keV for the five crystals dominated by 210 Pb and 3 H.