The background model of the CUPID-Mo 0 νββ experiment

CUPID-Mo, located in the Laboratoire Souter-rain de Modane (France), was a demonstrator for the next generation 0 νββ decay experiment, CUPID. It consisted of an array of 20 enriched Li 2100 MoO 4 bolometers and 20 Ge light detectors and has demonstrated that the technol-a


Introduction
Neutrinoless double beta decay (0) is a hypothetical nuclear transition that would occur if the neutrino is its own antiparticle, or a Majorana particle.It consists in the transformation of an even-even nucleus into a lighter isobar containing two more protons accompanied by the emission of two electrons and no other particles, with a change of the total lepton number by two units.The detection of this "mattercreating" process would represent the observation of a new phenomenon beyond the Standard Model [1].Current best limits for 0 half-life are of the order of 10 24 -10 26 yr [2][3][4][5][6][7][8].
The Standard Model process, two-neutrino double beta decay, 2, includes also the emission of two ν and conserves lepton number.Unlike 0 decay, 2 has a continuous energy spectrum and has been observed in more than ten nuclei with half-lives in the range of 10 18 -10 24 yr [9].
One of the largest challenges in 0 decay experiments is the control of the radioactive background, that may produce events in the signal energy region.These could mimic the very rare 0 signal reducing the experimental sensitivity.
During the last 10 years the scintillating bolometer technology has proved that bolometers based on lithium molibdate (Li 2 MoO 4 ), are very promising detectors for next generation 0 searches [10,11].Scintillating bolometers were developed to reduce the background observed in the current leading 0 bolometric experiment, CUORE [12].In CUORE, the background in the region of interest is dominated by surface 's emitted from the copper structure holding the detectors [13].The array of 988 TeO 2 bolometers, installed at the Laboratori Nazionali del Gran Sasso, Italy, has observed a background in the 130 Te region of interest (Q  = 2527 keV) of (1.49 ± 0.04) × 10 −2 counts/keV/kg/yr [4,14,15].The next generation experiment CUPID (Cuore Upgrade with Particle IDentification) will drastically reduce the background thanks to the simultaneous readout of heat and light signals.The capability to discriminate / from  particles with scintillating bolometers relies on the fact that the light emitted in the Li 2 100 MoO 4 by  particles is about a factor 5 smaller compared to the light emitted by /'s of the same energy [10,16].
In addition to the particle discrimination, the CUPID strategy to reduce the background relies on the radiopurity of the scintillating crystals and the minimisation of the passive materials [17].Another key point is that the Q  energy value of 100 Mo (3034 keV) is higher than 2615 keV, implying a signal located above the majority of  lines from natural radioactivity.
The CUPID-Mo experiment [11], located in the Laboratoire Souterrain de Modane (LSM) in France, under an overburden of 4800 metres water equivalent, was built as a demonstrator experiment for CUPID.It consisted of 20 Li 2 100 MoO 4 (LMO) scintillating bolometers and 20 Ge light detectors (LDs) for a simultaneous read-out of heat and light.One of the aims of CUPID-Mo was to validate the background predictions for CUPID, in particular the LMO crystal radiopurity and residual  background.
In this paper we present the background model which describes the background sources in the CUPID-Mo experiment.This model is based on fitting the CUPID-Mo data to detailed Monte Carlo simulations.We show that the residual  background contribution and the radiopurity of the LMO crystals are sufficient to meet the CUPID background goal.
CUPID-Mo was also an important experiment in its own right.In particular, it has set the world-leading limits on the half-life of 0 decay of 100 Mo to both ground and excited states [6,11,18].The detailed study of the experimental backgrounds in the CUPID-Mo experiment enables a high precision measurement of the 2 decay rate and allows to disentangle between the Single State Dominance (SSD) or High State Dominance (HSD) mechanisms [19,20] of the 2 decay process in 100 Mo.It also provides the basis to study new physics processes outside the Standard Model, which could distort the spectral shape of the 2 spectrum, such as 0 decay with emisson of Majoron(s), 2 decay with emisson of Bosonic neutrinos, Lorentz invariance violation or sterile neutrinos [21][22][23][24][25][26][27][28][29][30].

The CUPID-Mo experiment
CUPID-Mo was installed in the EDELWEISS cryogenic setup [31] at LSM.The experiment was in operation between March 2019 and June 2020.
CUPID-Mo used 100 Mo-enriched LMO crystals, where the 100 Mo, the double beta isotope, has been enriched at ∼ 97%.The basic detection modules are the crystals coupled to thermal sensors, consisting of a Neutron Transmutation Doped Ge thermistor, NTD.The top and bottom of the crystals are facing light detectors fabricated from Ge wafers, also instrumented with NTDs to readout the scintillation light signal.The crystals are housed in cylindrical copper holders and supported by PTFE pieces, as shown in Fig. 1.A reflective foil (3M Vikuiti®) is installed around the crystals, inside the copper holder, to increase the light collection.The average weight of the CUPID-Mo crystals is 210 g and the total mass is 4.158 kg corresponding to 2.264 kg of 100 Mo.
The array of 20 bolometers is arranged in five towers with four modules each, as shown in Fig. 2. Each tower is suspended by stainless steel springs to mitigate the vibrational noise of the set-up.The signal from the CUPID-Mo detectors are readout with NOMEX® cables, copper and constantan wires on Kapton pads.Situated in the same cryostat, the EDELWEISS detectors, visible behind the five CUPID-Mo towers in Fig. 2, are equipped with Kapton® pads and Mill-Max® connectors.The detector chamber consists of four copper plates made of NOSV® grade copper to support the bolometers, and is able to accommodate 12 detector towers.
The cryostat involves five thermal copper screens, typically referred to as the 10 mK, 1 K, 50 K, 100 K and 300 K stages respectively.The cryostat screens are made of NOSV and CUC2 grade copper.An internal polyethylene shield, to shield against neutrons produced in the set-up components by (,n) reactions or induced by muons [32], is mounted between the detectors and the internal lead shield, and has a temperature of ∼1 K.An internal lead shield of 14 cm Roman lead [33] is installed inside the cryostat at 1 K, between the detector chamber and the dilution unit (see Fig. 3).Its main purpose is to shield the detectors from radioactive Copper of 99.9975% purity, produced by Aurubis, Hamburg, Germany.
Copper of > 99.990 % purity background of the warm electronics, the cold electronics and the connectors and cables at the 1K stage.
The external shielding closest to the cryostat consists of 20 cm thick lead, with the innermost 2 cm made of Roman lead.The empty space between the lead shield and the outermost thermal screen of the cryostat is flushed with radon depleted air from a radon trapping facility.The average radon level in the air supplied by the facility is 20 mBq/m 3 [34].Following the external lead shield, a 50 cm thick polyethylene shield is used to moderate the radiogenic neutron flux.A plastic scintillator based active muon veto system surrounds the whole experiment for muon tagging [35] (see Fig. 4).

Performances
CUPID-Mo has shown excellent detector performances in terms of energy resolution (7.4±0.4)keV FWHM at 3034 keV [6] and  particle rejection > 99.9% [36], demonstrating that the CUPID requirements are within reach.Further details on the CUPID-Mo set-up and performances are given in [36].

Experimental data
The aim of our data processing is to convert the raw data stream into three calibrated energy spectra: / like events with energy deposits in a single crystal (M 1,/ ), events with energy deposits in two crystals (M 2 ) and of -like events (M 1,  ).These spectra will then be used in a simultaneous fit to extract radioactive contamination values and describe the observed spectra.The algorithms used for the data processing are described in detail in [6] but we will give a summary of the most important steps in the following.We also estimate the detector response parameters (energy resolution, energy bias, efficiencies, light yield) which are needed for postprocessing the Monte Carlo spectral shapes.

Data taking
In this paper, we use the same dataset as in [6] with an exposure of 2.71 kg × yr of LMO corresponding to 1.47 kg × yr of 100 Mo.Our data is acquired as a continuous time-stream and digitized at 500 Hz by the EDELWEISS DAQ [36] and stored at both CC-IN2P3 (France) and NERSC (USA) for offline analysis.We acquire runs, periods of around 10 − 100 hours of stable data taking, of both physics and calibration data, where a calibration source was placed in the vicinity of the experiment.We use regular calibrations with a 232 Th/ 238 U source to calibrate the LMO detectors and a high activity 60 Co source, which generates 100 Mo X-rays in the detectors, to calibrate the LDs.We divide the data into twelve periods  of ∼ 1 month of stable data taking, called datasets.We discard three short periods of data (≈ 1 week each) due to the inability to accurately calibrate this data.

Data processing
We process our data using the C++ softwares Apollo and Diana [37,38], first developed for the CUORE experiment and also used by CUORICINO, CUPID-0 and CUPID [39].
A complete description of the data processing can be found in [6].We identify physics events using an optimal trigger, also used for previous CUPID-Mo analysis.This triggering is used for both the LDs and the LMO bolometers.We then store a 3 s waveform for both LDs and the LMO channels for each triggered events.For each LMO we associate (up to) two light detectors, called side-channels, which correspond to the LDs facing this LMO detector.These are numbered S1/S2 where S1 is the LD with the better detector performance (lower noise and higher detector light yield).
We estimate the amplitude of peaks using an optimal filter, which maximises the signal to noise ratio based on inputs of the known signal shape and spectral noise power density.This is done for all LMO events and also the corresponding LD events on the side-channels.
Next we correct for thermal gain changes and calibrate our data using dedicated 232 Th/ 238 U calibration measurements.This calibration is accurate to around < 1 keV [6] which is sufficient for the binned background model fits.The LDs are calibrated using the dedicated 60 Co calibrations which produces ∼ 17 keV Mo X-rays.

Multiplicity
We define coincidences between physics events, where multiple detectors are triggered simultaneously.This provides useful information since events of 0 decay or 2 decay to the ground state are very likely to deposit energy in only one crystal.However, background events in particular from 's are likely to deposit energy in multiple crystals simultaneously, for example due to Compton scattering in one crystal, or multiple 's from the same decay.We estimate the multiplicity of an event as the number of pulses in different LMO detectors above our analysis energy threshold (set at 40 keV) within a ± 10 ms time window.

Data selection
Several cuts are used to remove non-physical events (for example noise spikes and cross-talk) or coincidences of two or more pulses generated by events very close in time within the same crystal, called pile-up events.We require that there is only one trigger in the 3 s LMO waveform.We then define a pulse shape discrimination (PSD) cut, described in detail in [6], using a principal component analysis method (PCA).We also define a cut on the pulse rise time and optimal filter based PSD variables which help to cut pile-up like events.Details of the choice of the selection cuts is given in [6].
The optimal filter test values or the  2 for rising and falling edges and for the pulse baseline.

Particle Identification
Since CUPID-Mo is a dual readout experiment we can discriminate  from / particles.The use of light detectors also allows us to remove background events in which a particle deposits energy on our LDs.We select / candidate events using the LD signal as following.We normalise the measured LD signals by defining the variable n, as the difference between the measured LD energy,   , and the mean expected / LD energy , normalized by the light band width.We compute n for M 1 events.As each LD has different characteristics, the calculation is done for each channel , each dataset  and for both side channels , i.e.: with  the measured LMO energy.The parameter n ,, has a distribution expected to be centered at zero for /'s and at a value different from zero for  particles.For details on the determination of the mean expected LD energies and its uncertainty, see [6].
For events with two LDs we expect the 2D distribution of  ,1 against  ,2 to be a bivariate Gaussian.As we observe no clear correlation we place a radial cut on the variable: If only one LD is available the cut is instead placed just on this n ,, .We chose a cut of  < 4 to select / events and call this data spectrum M 1,/ .We also construct a spectrum of M 1,  events comprised of high energy M 1 events,  > 3 MeV, with no light selection applied.The same events are obtained with a selection cut  > 4, thus, for simplification we have chosen only the energy cut to select  events.Unlike most other analysis of scintillating bolometers we also develop a light selection cut for M 2 events as described in detail in [18].For a M 2 event the scintillation light recorded can be the sum of that from the crystal above and below a given LD.We use the modeling described in [6] to compute the expected light detector energy for each physics event accounting for multiple contributions to the light yield.From this we can define the normalised LD energy for each pulse in a M > 1 event.We require that each normalised LD energy (for each channel and side channel) is between −10 and 10 , for all but one LD.In this particular LD we observe an accidental contamination of 60 Co.Therefore we generally observe  events in the LMO and  events in the LD with very large energy compared to scintillation light.For the two LMOs adjacent to this LD we make a cut of −10 to 3 , to take into account the energy directly deposited in the light detector.For more details see [18].In addition, to further suppress events from this localized 60 Co source we make a global LD anti-coincidence cut to remove the  background originating from this LD.We remove any events (on non-adjacent LMOs) with a trigger on this LD with energy > 2 keV within a 5 ms window.

Muon veto anti-coincidence
Despite the large rock overburden at LSM, which suppresses most muon events, they still form a possible background source.The EDELWEISS cryostat has a muon-veto system to remove these events, as shown in Fig. 4. We remove events, in each of the M 1,/ , M 2 and M 1,  spectra, with a trigger in the veto system within a 5 ms window.With 98% geometric coverage and the operation voltage adjusted for the aging of the scintillator we expect an O(90%) tagging efficiency of muons with a minimal impact on the / acceptance [35].Since this background was already subdominant and is strongly suppressed by the veto cut we do not include muons in our background model.

Delayed coincidences
Radioactivity from the 232 Th and 238 U decay chains in the LMO crystals could be a significant background in our data.Similar to other analyses of scintillating bolometers [5,40], we can exploit the time correlation of these decay chain events to reduce our experimental backgrounds.In particular, we veto events from the lower part of both chains where there are backgrounds from 214 Bi ( 238 U chain) and 208 Tl ( 232 Th chain).For 208 Tl we veto events in 10 ×  1/2 (1830 s) following a suspected 212 Bi -decay.This time window contains > 99.9% of the 208 Tl decays.
The very low CUPID-Mo radioactivity also enables a novel delayed coincidence cut removing 214 Bi candidate events.The 222 Rn decay chain proceeds as follows: ( We can therefore tag the event based on the 222 Rn or 218 Po  events and a fairly long dead time.We use energy cuts of 5985 − 6145 keV for 218 Po and 5460 − 5620 keV for 222 Rn to tag  candidates.For either type of  candidate events we then veto events within the same crystal within a time window containing 99% of events which is evaluated with MC sampling as 13860 s for 222 Rn and 13620 s for 218 Po.The two possible cuts on 222 Rn or 218 Po improve the rejection power for surface backgrounds.This cut has a small inefficiency (see section 3.9), despite the long veto time.

Data spectra
Based on these cuts we construct the three data spectra used in our analysis: -M 1,/ : Events in one detector identified as /, -M 2 : Events in coincidence between 2 crystals, the two energies deposited in each crystal are summed, -M 1,  : Events in one detector with alpha energy scale (> 3 MeV).
Because of the relatively fast half-life of 2 in 100 Mo (∼ 7 × 10 18 yr) and extremely low levels of contamination, relatively few peaks are observed in the M 1,/ spectrum, where the spectrum of 2 decays of 100 Mo is the dominant feature.The secondary datasets, M 2 and M 1,  however contain a lower fraction of 2 events and therefore provide useful information to determine the location of radioactive contaminations.The experimental spectra after all cuts are shown in Fig. 5.

Data selection efficiencies
We evaluate the efficiency of our cuts and correct the MC simulations by these values.In particular we use events in  peaks from M 2 and M 1,/ spectra to evaluate the efficiency of the PSD (section 3.4), light yield and rise time cuts (section 3.5).We do not observe that the cuts have any energy dependence in the range of the utilised  peaks (236-2615 keV).For cuts where the inefficiency can be considered as a dead time, the multiplicity, muon veto, delayed coincidence and LD anti-coincidence, we evaluate the efficiency using the 210 Po peak.We evaluate the pile-up efficiency, the probability a pulse will be superimposed with another in a 3 s window, using random noise triggers.More details on each of these calculations can be found in [6], and the results are summarised in Table 1.

Energy scale and resolution
We use the observed  peaks in both background and calibration data to predict the energy linearity and resolution.Each LMO detector in each dataset has a distinct energy resolution.
As in [6,11] we perform a fit of the 2615 keV peak in calibration data to extract the resolution of each detector-dataset pair.We use these resolutions to build a function including a common scale factor () which will be determined for the peaks in background data.For our Monte Carlo simulations for each event we sample from a Gaussian with mean  and width  ×  , , where  , is the energy resolution in crystal  and dataset .This energy calibration is discussed in detail in [6].

Features of data spectra
We observe in Fig. 5 that the spectrum of 2 decays of 100 Mo dominates the M 1,/ data, whereas the M 2 spectrum has significant contributions from natural radioactivity, shown by prominent  peaks.These consist predominantly of decays from the 238 U and 232 Th decay chains, however we also observe contributions from 40 K and cosmogenic activation products 60 Co and 57 Co.We also observe a short lived peak of 99 Mo, present for ∼ 1 dataset, from neutron activation after a calibration with an AmBe neutron source.The spectrum M 1,  is dominated by  decays from components very close to the detectors.As shown in Fig. 5, in our data we observe a large contribution of 210 Po,   = 5303 keV, with both a large -value and -energy peak and peaks from several other nuclides in the U/Th chains.
During  decay the energy released is shared between the -particle and recoiling nucleus (NR), with energy O(100 keV).In LMO crystals the range of  particles is about 10 m and a few nm for nuclear recoils.Therefore we expect to observe a peak at the Q-value of the decay for a LMO bulk event.For surface activity the energy spectrum depends on the implantation depth.For shallow contribution O(nm) in the crystal the  or recoil could escape, or both could be contained in the crystal.We therefore expect peaks at the NR energy, at the  energy, and at the -value, with a relatively low flat continuum from partial contained 's or NR.The ratio of these peaks depends on the depth of radioactive contamination.For a deeper contribution O (m) the NR is almost always contained but the 's can still escape after depositing some of its energy, giving rise to a continuum extending from low energies up to the -value.
Similarly, for materials facing the crystals we expect a dependence on the implantation depth: at shallow depths the spectrum will be characterised by peaks at the -energy and NR energy, for a deep contribution this will become a flat spectrum from low energy up to the alpha energy.We note from Fig. 5 that we generally do not observe clear -energy peaks in our data.However due to the limited statistics the data is still compatible with a full surface contamination.The lack of clear -energy peaks creates a challenge for assessing the surface contamination.

Background sources
The background in our experiment is expected mainly from the natural radioactivity in the whole experimental setup, including the detectors.Other contributions from muons, neutrons and environmental gammas are expected to be subdominant, as explained in section 4.1.To minimize the background, all the materials used to build the experiment have been carefully selected in terms of radiopurity.To this end, the daughthers of 238 U and 232 Th decay chains, 40 K, and cosmogenic radionuclides have been measured by High Purity Ge -ray spectroscopy and ICPMS (Inductively Coupled Plasma Mass Spectromery).The CUPID-Mo materials were chosen to minimize the 226 Ra and 228 Th contaminations, as the most critical radioactive backgrounds in the 3 MeV region relevant to 0 decay searches arise from 214 Bi and 208 Tl decays.Table 2 reports the radioactivity in the CUPID-Mo detector components resulting from CUPID-Mo and CUORE measurement campaigns [15].The materials which are directly facing the crystals (all but the springs from Table 2) are referred to as close components in the following.The material choice in the EDELWEISS cryostat was done to minimize the contaminants at lower energies, O (100 keV), which is the region of interest in dark matter searches.Table 3 shows the radioactivity in the EDELWEISS cryostat materials .The NOSV copper is used for the CUPID-Mo detector holders, all the copper parts in the detector chamber and the cryostat screens (with the exception of the 1 K screen).
We identify the most significant contributions to our experimental background using the screening measurements and the analysis of experimental data from section 3.11.
We can broadly categorise our background sources into four groups: -Close source: Radioactivity in the LMO crystal, reflective foils, LDs, PTFE clamps and NTDs, directly facing the crystals; In Table 5, the Kapton connectors, MillMax connectors and Cu Kapton cables belong to the EDELWEISS readout system, while the NOMEX cables are used for the CUPID-Mo readout.-10 mK source: Sources of activity in the 10 mK stage of the cryostat but not directly facing the LMO crystals (springs, cables, connectors, copper plates for bolometer support), as shown in Fig. 2; -Infrastructure source: The copper cryostat screens and the internal shieldings, see Fig. 3; -External: Activity originating from outside the 300 K Cu shield.

Other contributions -muons, neutrons and environmental gammas
The muon flux at the LSM is 5 muons/m 2 /day [35].Muons would generally deposit energy in multiple detectors and be strongly suppressed by anti-coincidence with the muon veto detector (see section 3.6), therefore we do not include them in the background model.
Neutrons may induce background in the 0 region of interest (ROI) if they are captured in the materials of the setup, producing high energy gammas.The thermal neutron flux in LSM has been measured as (3.6 ± 0.05 (stat.)± 0.27 (syst.))× 10 −6 neutrons/s/cm 2 [43] and the ambient neutron flux (fast plus thermal) has been estimated ∼10 −5 neutrons/s/cm 2 in [43,44].Previous work [45] showed that 48 cm of polyethylene reduces the neutron flux by a factor 2 × 10 6 .Taking into account the surface of the CUPID-Mo detectors, we get that the neutron flux expected is less than 1 neutron/year.Thus, ambient neutrons are not taken into account among our background sources.
The gamma flux at LSM has been measured with a portable Ge detector at several locations in the laboratory.At the place where the EDELWEISS set-up is installed, the flux of 2.6 MeV photons was measured as 5.1 ± 0.2 (stat.)× 10 −2 /s/cm 2 [46].Considering that 20 cm of lead reduce the flux by about a factor 10 4 , then the contribution of environmental gammas may not be negligible.We expect about 6 photons of 2.6 MeV on the detectors surface during the course of all data taking.We take them into account by generating decays at the level of the outermost cryogenic thermal shield, as the spectral shapes measured in the detector from a source generated outside the external lead and outside the outermost cryogenic thermal differ slightly only below 500 keV.

Monte Carlo simulations
The Monte Carlo simulation is developed in GEANT4 and implemented with version 10.04 [47].The MC simulation program developed by the EDELWEISS collaboration [32], has been adapted to include the CUPID-Mo detectors and to include the features described below.
We sample the 2 decay spectra of 100 Mo and we consider both the HSD and SSD mechanisms with their parametrization from [48,49].
The radioactive decays in the components of the experiment are generated using both Decay0 [50] and GEANT4.For decay chains in close sources we use the GEANT4 class G4RadioactiveDecay.This allows to generate sub-chains, for example 226 Ra to 210 Pb.We store the final position of the nuclear recoil, and use it as the initial condition for the next decay, along with the time difference.This allows for the accurate consideration of pile-up from events out of the same decay chain and of the delayed coincidence cuts (see section 3.7).From the simulations we then store the energy deposited in both the LMO and LDs.For the 232 Th decay chain, we generate separately 232 Th, 228 Ra to 228 Th and 228 Th to 208 Pb, since 228 Ra and 228 Th have long half-life and so secular equilibrium cannot be assumed.Similarly for the 238 U chain we generate separately 238 U to 234 U, 234 U, 230 Th, 226 Ra to 210 Pb and 210 Pb to 206 Pb.
We use Decay0 for most external sources, not directly facing the crystals, where pile-up events in the same crystal from subsequent decays in a chain are unlikely, and delayed coincidence cuts through the tagging of  events is impossible.For the 238 U decay chain we generate the  emitters 214 Pb and 214 Bi.Since they are in secular equilibrium, we combine their spectra to reduce the number of components in the background model fit.We also generate in some components 210 Bi which is not assumed to be in equilibrium.For the 232 Th decay chain we generate 212 Pb, 212 Bi and 208 Tl out of the 228 Th sub-chain and combine them into one spectrum.We also generate 228 Ac which is not assumed to be in equilibrium.
In addition to the 238 U and 232 Th chains, we simulate 40 K contamination in the springs and the outermost cryogenic thermal shield.We have also considered 60 Co from cosmogenic activation in copper as well as 87 Rb and 90 Sr+ 90 Y in the crystals.A 99 Mo contribution due to neutron activation in the first days of data taking is also simulated in the crystals.
The decays are generated in the bulk of the components and also the surface for close sources, where surface contaminants can produce a distinct energy spectrum compared to bulk contamination.Surface contaminations are modelled with an exponential density profile e −/ , where  is a variable depth parameter.
The particles are propagated through the experimental geometry using the Livermore low energy physics models [51].We use production cut lengths of 1 m for  − / + and 10 m for 's.For LMO these correspond to 1 keV energy thresholds for both  − / + and 's.This choice is based on a study of the impact of the production thresholds on the detected spectra.Thresholds of 1 keV and 250 eV for LMO give comparable spectra, while the computing time is significantly reduced for a 1 keV cut length.

Geometry
We implement a detailed geometry of the CUPID-Mo towers in the MC simulations.In particular, we reproduce the size of each LMO crystal [36] on an individual basis to take into account variations between the crystals.We also include the Ge LDs, the PTFE clamps, the reflective foils surrounding the crystals and the copper holders which are implemented as accurately as possible.Fig. 2 shows the GEANT4 rendering of the simulation geometry of the 10 mK chamber, with the five towers of CUPID-Mo in the front.We included the readout cables, the springs, the EDELWEISS Ge detectors and their connectors.The copper structure supporting the crystals, composed of four copper plates and three copper columns made of NOSV copper, is also included in the simulated geometry.The four copper plates are held by brass Production cuts apply to the production of secondaries.Below the cut, the primary is tracked down to zero energy using a continuous energy loss.
screws with a relatively high mass (see Table 3) which have been modeled as well.
Fig. 3 shows the simulated geometry of cryostat and electronics.The 10 mK, 1 K, 50 K, 100 K and 300 K thermal screens are included individually.The internal polytehylene shielding and lead shielding are also implemented in the geometry.We also note that the geometry extends and includes far components below the 1 K lead shield that are less radiopure, like the dilution unit, the 300 K electronics, the pumps and the He reservoir that is expected to be important for neutron simulations.

Detector response model
To compare the simulated spectra to the measured data we need to account for the finite energy resolution and response of the detectors.The following features are accounted for through a post processing of the MC simulation spectra: -Energy resolution; -Energy threshold of 40 keV; -Event multiplicity; -Scintillation light and LD resolution; -Cut efficiencies; -Inactive periods of detectors; -Pile-up and delayed coincidences in decay chains.
We compute the energy resolution per detector-dataset pair as explained in section 3.10.In particular, for a pulse with energy    in channel c and dataset d we sample from a Gaussian with mean    and standard deviation  ×  , .As is done in experimental data we discard pulses below the energy threshold, < 40 keV, and compute the multiplicity as the number of detectors with  > 40 keV for each simulated event.
We also reproduce the signals measured in the light detectors.We have parameterised the scintillation light energy measured by the LD in data as a function of LMO energy as a second order polynomial, for each LMO and side LD channel.We also parameterise the LD energy resolution as a Gaussian with standard deviation √︃  2 0 +  1 .We use this parameterisation to generate a random scintillation light yield for each event which is summed with the energy deposited in MC from direct particle interactions.We use these light detector energies to reproduce the light yield cuts in the same way as in experimental data in section 3.
To account for inactive detectors we assign a random timestamp from the data taking period to each simulated MC event.This allows us to apply the same cuts to the simulated data and remove events from detectors considered inactive and to account for the reduction of event multiplicity in these periods.

Simulated background sources
Some components produce indistinguishable spectra of energy deposits in the crystal, or in other words, they exhibit degenerate spectral shapes.In this case, we either group them, or generate the radioactive decays in only one, which accounts for all elements with degenerate spectra.This simplification reduces the number of free parameters in the fit of the simulations to the data, however, we need to keep in mind that the posterior distributions account for the sum of the grouped elements.
The reflective foils, the PTFE clamps, and all other passive elements directly facing the crystals produce degenerate spectra.For this reason we have generated radioactive decays only in the Reflectors.We simulated radioactive decays separately in the connectors, the cables and the springs in the detector chamber at 10 mK shown in Fig. 2, right.All the copper elements made of NOSV at the 10 mK stage (holders, four plates, three columns and 10 mK cryostat screen) have been grouped in one background source, which we refer to as Copper supports.
We have found that all thermal screens exhibit degenerate spectra, thus we group the screens made of NOSV copper and refer to as Cryostat screens.We have also found that the internal polyethylene shielding spectrum is degenerate with the one from internal lead shielding.Thus, we have chosen to include only the internal polyethylene shielding contribution in the fit.This element takes into account all contributions from background sources below the internal lead shielding, as the elements of the dilution unit or the 300 K electronics also shown in Fig. 3.
In addition we have considered as a source the outer cryostat screen, called screen 300 K.This volume also includes the contribution from radon present in the air between the 300 K screen and the external lead shielding.

Background model
The goal of the background model is to describe the data (section 3) with the MC simulations (section 5).The parameters of the model then tell us the radioactive contamination of various components of the experiment.We use a binned simultaneous maximum likelihood fit, performed in a Bayesian framework with a Markov Chain Monte Carlo (MCMC) approach [52], developed by CUORE and further optimized by CUPID-0 collaborations [15,42] using the JAGS software [53,54].We model the data in spectra  (M 1,/ , M 2 and M 1,  ) and energy bin  as: The sum  runs over the simulated MC sources,   is a scaling factor for each source (shared by all three spectra) and ( Here the sum  is over the three data spectra and  goes over the bins in each spectrum.JAGS samples the full posterior probability distribution ( ì  |D) given by Bayes theorem: using MCMC.The prior probabilities, ( ì ) are discussed in section 6.3.For each parameter we also extract the marginalised posterior distribution by integrating over the parameter space Ω (excluding the parameter of interest): We choose the mode of the marginalised distribution as our point estimate of the parameter and we compute, by integrating, the smallest 68% Bayesian credible intervals, c.i., around the mode.If the lowest 68% includes the value zero, we give an upper limit at 90% c.i.

MC simulation of 56 Co calibration source
We have performed a calibration with a 56 Co source to validate the energy calibration and resolution of the CUPID-Mo detectors in the 0 ROI, at ∼3 MeV.The measurement is also useful to test and validate the implementation of the Monte Carlo simulations.Two 56 Co sources with an activity of 41 ± 8 Bq, measured with HPGe  spectroscopy inmediately after the calibration, were placed on the outer cryostat screen, inside the external shielding.The configuration was chosen to achieve the highest uniform counting rate in the ROI for all detectors with a total rate below 0.125 Hz as an upper limit on the tolerable pile-up.
We have performed a fit of the calibration data to a MC simulation of the 56 Co sources summed with a background component (detailed later in this section) and pile-up, with only uniform priors.We describe in section 6.3.1 how the spectral shape of the pile-up events is obtained.The fit has thus three parameters: the normalization factor of the background, the one of the 56 Co sources, and the one of the pile-up events.We know the normalization of the background from the background model fit.Comparing it to the normalization factor of the background in the calibration data, we obtain the efficiency of the cuts in the calibration data (68.7±1.4)%.From the normalization factor of the 56 Co sources we obtain the activity of the sources without the efficiency correction.Using the estimated efficiency, we derive the final activity of the 56 Co source of (50 ± 1) Bq, which is in good agreement with the measured activities.
The model shows good agreement with the data in the whole energy range of the fit 200 -4000 keV, as shown in Fig. 6.In Fig. 7 we present the region above 2800 keV with 2 keV binning, where the comparison in this region can be better appreciated.This fit shows that the MC implementation of the set-up is accurate and that the MC is able to describe well the data.

Background sources list
The background model fit includes 41 background sources associated to the bulk volume of the components identified in section 5.3.We included 2 additional sources of surface contamination: the LMO crystal and the Reflective foil (representing all sources facing the crystals).For a detailed list of the sources associated to radioactive contaminants we refer to Tables 4 and 5.The complete list of the background sources in our fit is: -2 decay of 100 Mo to 100 Ru ground state, -2 decay of 100 Mo to 100 Ru 0 + 1 excited state, pile-up (random coincidence of 2 events in the same crystal happening so close in time that the signal is equivalent to that of the sum of the two events), -99 Mo, -12 bulk sources of natural radioactivity detailed in Table 4, -8 sources associated to surface contamination, listed in Table 4, -210 Pb surface contamination with 1 nm and 1 m implantation depth.-Reflectors: -3 sources associated to bulk contaminations, Table 5, -8 sources associated to surface contaminations,  -Close sources, 10 mK and infrastructure: 27 sources associated to the bulk volume of these components, listed in Table 5, -Random coincidence of 2 events in two different crystals called accidentals.
A total of 67 sources are included in the fit.As mentioned in section 5 we have modelled surface contaminations with an exponential density profile e −/ .We have simulated surface contaminations with  = 10 nm and 10 m for all radionuclides in the U and Th chains.The choice 10 nm is driven by the fact the typical range of recoiling nuclei is of the order of some nm for  decays in the U and Th chains.The choice of 10 m considers that mechanical crystal preparation including cutting, cleaning and polishing can lead to deep surface damage and implantation depths of ∼ m, however depths > 10 m would be effectively equivalent to bulk contaminations.We observed that the component corresponding to the shallow surface contamination of the crystal, with  = 10 nm, is needed to properly fit our data.Surface contaminations with  = 10 m give activities which are compatible with zero.Thus, for simplification to minimize the number of degrees of freedom, we choose to include in the fit only the crystal surface contaminations with  = 10 nm implantation depth.Due to the small thickness (70 m) and low density of the Reflectors, surface contaminations with  = 10 m are degenerate with bulk contaminations.Both produce continuous spectra due to the partial energy loss of  particles in the Reflector and the detection of the remaining kinetic energy in the crystal.We have therefore chosen to include only the surface contaminations with  = 10 nm for the Reflectors in the background model fit.
To simulate surface and bulk contaminations in the crystal and the Reflectors, we have generated the decay chains to take into account time correlations and exploit the delay coincidences (see section 5), as done for the data.We observed that the bulk contamination in the Reflectors produce a flat spectra independent of the specific part of the radioactive decay chain and our fits showed the activities of the various subchains (excluding 210 Pb to 206 Pb) were compatible.We hence simplify the fit model by assuming that the entirety of the U/Th (excluding 210 Pb to 206 Pb) chain is in secular equilibrium for the Reflectors.
In addition to the U/Th chains, other contaminations have been included in the crystals.In particular, 40 K can be found as a result of an initial contamination of the powder used to grow the crystals [31].Some anthropogenic radionuclides due to fall out can also be found in enriched crystals [55], thus, we have included the bulk contaminations 87 Rb and 90 Sr+ 90 Y, which are pure -emitters. 99Mo was produced by activation during a neutron calibration with an AmBe source.
For all sources that aren't facing the crystals we have simulated decays of the daughters of 226 Ra and 228 Th.We identify in Table 3 a large content of 210 Pb in the brass screws holding the detector plates.We have simulated 210 Bi in this component and use it to account for this contamination in all 10 mK and infrastructure sources.Cobalt isotopes are expected to be primarily the result of cosmogenic activation in copper.We have therefore chosen to locate 60 Co and 57 Co in the Copper supports and use it to account for this contamination in all 10 mK and infrastructure sources.In all the components we use 228 Ac  emitter with three main  peaks clearly visible in the data, not in equilibrium with 228 Th.In doing so, we have observed that for the 300 K screen the two values from the fit are compatible with equilibrium thus, to reduce the number of components in the fit we have combined 228 Ac and 228 Th.

Choice of priors
We consider informative priors both from screening measurements (section 4) and from other independent measurements.We have informative priors on the contribution of the: -2 100 Mo to 100 Ru 0 + 1 excited state, which has been taken as  1/2 = (6.7 ± 0.5) × 10 20 yr (average value from [9]), -Stainless steel springs included in the set-up to mitigate vibrational noise.These are modelled with high accuracy in the MC, and the values measured by HPGe and used as priors are given in Table 2, -Random coincidence (pile-up and accidentals) events, determined from the rate of single events and from a measurement with a calibration source (see below).

Random coincidence events
Energy deposition in either one or two crystals randomly coinciding in time can cause non-negligible contributions to the high energy region both in M 1,/ and even more so in M 2 .This is a particular concern for 100 Mo, due to the fast rate of 2 decay of  1/2 ∼ 7 × 10 18 yr [56], or around 2 mHz per detector.The events in two different detectors are referred to as accidentals and contribute to M 2 spectrum.The random coincidences in the same detector are referred to as pile-up and contribute to the M 1 spectrum.We predict the spectral shape of these events by convolving the experimental M 1,/ spectrum with itself, i.e. by selecting randomly two energies in the experimental M 1,/ spectrum and summing them.The M 1,/ and the resulting random coincidences spectra are shown in Fig. 8.
The expected number of accidentals is then given by: where  is the total number of M 1 events, Δ/ is the ratio of the width of the coincidence time window, Δ = ±10 ms, to the total measurement time and  LMO is the number of LMO detectors.For the accidental random coincidences we place a prior as a Gaussian function with mean Nacc and  √︁ Nacc .We have used  = 1.2 × 10 6 , for a total measurement time of 2.2 × 10 7 s.We include this contribution only in the M 2 spectrum.
The rate of pile-up events is generally lower than the rate of accidentals, but it is also less well constrained as the coincidence time, or effective time resolution, Δ eff , is unknown a-priori and determined by the effectiveness of the pulse shape cuts used in the analysis.In general this time resolution will also be dependent on the energy of both the primary and secondary pulse as well as on their separation.However, since we are only interested in events in a narrow range of a high energy region (∼ 3 MeV) we can treat this to a good approximation, as energy independent and simplify Eq. 8 to: For thermal detectors typically the timing resolution is between the inverse sampling frequency and detector rise time.
In CUPID-Mo the inverse of the sampling rate is 2 ms and the median value of the rise-time is 24 ms, with 8 ms spread [36].
Similar LMO crystals to CUPID-Mo, tested at LNGS have achieved 1 − 2 ms effective timing resolution [57] also using PSD only on the LMO channel (as we have done in CUPID-Mo).However, this test has pulses with a higher sampling frequency with respect to CUPID-Mo, different operation temperature and noise conditions and slightly different rise time, so cannot be directly extrapolated.
For the prior on pile-up events in our fit we use a measurement with Th/U calibration sources.We consider all events between 4 -5.5 MeV, and from Eq. 9 we can obtain a value for Δ eff .As there are no events in the selected region, we obtain a prior for Δ eff < 7 ms, at 90% c.i.As zero events are obtained, the corresponding probability density function is an exponential, that we use to place a prior on the pile-up rate.We include this contribution only in the M 1,/ spectrum.

Binning and choice of energy intervals
The energy range of the fit is 100 -4000 keV for M 1,/ , 400 -4000 keV for M 2 and 3000 -10000 keV for M 1,  .We Fig. 8 Experimental M 1,/ and random coincidences (obtained by convolution of M 1,/ , arbitrary normalization) spectral shapes.We observe that the random coincidences distribution is shifted to higher energies (as expected [58,59]) and could cause a background at the ROI.use a variable binning for the three spectra to have enough counts in each bin to minimize the effect of statistical fluctuations.We choose a minimum bin size of 15 keV for M 1,/ and M 2 , and 20 keV for M 1,  .We set the minimum number of counts in each bin to be 50 for M 1,/ and 30 for M 2 and M 1,  .We choose each peak to be fully contained in one bin, to minimize the systematic effect of the detector response on our results.

Results
The result of the simultaneous fit of M 1,/ , M 2 , and M 1,  to the CUPID-Mo data with 2.71 kg × yr exposure is shown in Fig. 9. Tables 4 and 5 show the fit results, discussed in section 7.2.We find that our background model is able to reconstruct well the 3 data spectra.On each spectrum the data over model ratio is shown, where the colors correspond respectively, to ±1, ±2, and ±3  with: where  model,i,b is the predicted number of counts in bin  and spectrum  and  data,, is the standard deviation of the data in this bin, √  data,, .To investigate the goodness of the fit we generate pseudoexperiments, or toy Monte Carlo simulations.We sample randomly according to a Poisson distribution the events in each energy bin of the background model best fit reproduction.We fit independently each pseudo experiment and obtain the likehood L (D | ì ), i.e. the probability of the experimental data D given the set of parameters ì  of our model.We show in Fig. 10 the result for M 1,/ , M 2 and M 1,  .The mean of the distributions of M 1,/ and M 2 agree well with the value of the data.For M 1,  the result demonstrate a modest incompatibility between the data and the model probably arising from an incomplete modelling of  detector response or an  miscalibration.This effect is visible for E>6 MeV in Fig. 9 bottom panels.This modest incompatibility has driven the choice of a systematic in our model and we have thus performed a fit with an energy range 3000 -6360 keV for M 1,  .We detail in subsection 7.4 the results.The p value obtained are  = 0.38,  = 0.04 and  ∼ 0 for M 1,/ , M 2 and M 1,  respectively.

SSD and HSD 2𝜈𝛽𝛽 decay mechanisms
The transition between the ground states of 100 Mo and 100 Ru, with spin parity 0 + , is realized via virtual  transitions through 1 + states of the intermediate nucleus 100 Tc.Nuclear theory does not predict a-priori whether this transition is realized dominantly through the 1 + ground state (SSD hypothesis) or through higher excited states of 100 Tc (HSD hypothesis) [60].
We have found that the SSD mechanism of 2 decay to 100 Ru ground state reproduces fairly well the data with a  = 0.38, while the HSD model does not,  ∼ 0. Since our data clearly favours SSD over HSD mechanism for 2, we have used the SSD model in our final fit.

LMO crystal contaminations
The M 1,  spectrum is populated by  decays occurring in the crystals and in the elements directly facing them.As described in section 3.11 we included bulk and surface contaminations in the crystals in our fit.We show in Fig. 11, the resulting components.Since we do not observe clear -energy (NR escape) peaks due to the very low levels of contaminations and thus limited statistics, bulk and surface contaminations are anticorrelated.We performed studies concerning the effect of the location of the contaminations on bulk or surface in the fit results, which we discuss later.
The largest peak in the  region is the 210 Po peak.This peak is largely described by the -value component of the crystal bulk.For the 210 Po in order to fit as much as possible the particular shape peak in addition to 10 nm, implantation depths of 1 m and 1 nm are used in the crystals, and implantation depths of 100 nm and 1 m are used in the Reflectors.In Fig. 11 the left tail of the 210 Po peak is described by the surface contamination on the Reflectors.
The summary of the crystal activities extracted from the fit is presented in Table 4.The LMO crystal contaminations by radionuclides from the 238 U and 232 Th chains are all below 1 Bq/kg.As a study of the effect of the bulk versus surface Fig. 9 Experimental data and background model simultaneous fit reconstruction of the 3 CUPID-Mo data spectra.Two upper panels: M 1,/ , /'s spectrum with energy deposits in only one detector.Middle panels: M 2 , multiplicity 2 events, histogram of the 2 summed energies.Two bottom panels: M 1,  , multiplicity 1 events in  energy region.For each one, the lower panel shows the ratio between experimental counts and reconstruction counts for each bin.The colors indicate the uncertainties at ±1, ±2, and ±3 .The peaks in the spectrum are described by the radioimpurities in the crystal and the continuum by the ones in the bulk of the Reflectors.The small contribution from 10 mK sources corresponds to the holders.
location, we performed a fit without surface contaminations.These results show that, even under this extreme assumption, the results on the bulk activities do not vary significantly.As shown in Fig. 5 the peak at 4.8 MeV contains 234 U and 226 Ra alpha decays.In this analysis this peak is ascribed to 234 U, with a significant uncertainty (reported in Table 4) in the resulting contamination due to the anticorrelation with the 226 Ra contribution.Additionally, in this peak we could have a contribution from the neutron capture in 6 Li [31].Neutrons captured in 6 Li produce an alpha particle plus tritium, 6 Li(n,) 3 H, with a total energy 4.782 MeV.We note also that the level of 228 Ra is not constrained by any  peak.
There is clearly a larger 210 Po contribution than the rest of the 238 U chain, at the level of 96 Bq/kg, possibly introduced during the purification of the enriched material [61].There are also traces of 190 Pt, caused by the crystal growth in a platinum crucible [62] and we find 40 K and 90 Sr+ 90 Y at the level of some hundreds of Bq/kg.We note that 210 Pb, 87 Rb, 90 Sr+ 90 Y and 40 K do not represent a potential background for 0 search, as the   of these radioisotopes is much lower than the 0 ROI at 3 MeV.
We show in Table 4 (bottom) the surface contaminations of the crystals derived from the fit.We studied the effect of including also a contribution with a depth parameter of 10m (i.e., including surface contaminations with  = 10 nm and 10 m) and the decay activitiy is shown in the third column of the table.The results are compatible with the fit including only 10 nm contributions.We observe clear anti-correlation for a given decay chain between the bulk and the surface contaminants in the crystal, but also with the surface of the Reflectors.These anti-correlations are taken into account in the uncertainties given in Table 4.

Radioactive contaminations of the setup components
A list of sources included in the fit and their resulting activities obtained from the marginalised mode and 68% c.i. are shown in Table 5.
The derived activities for the component called Reflectors are mainly constrained by the fit of the continuum in the  region.The values are larger than the measured radioactivities of the reflectors themselves, in particular, in 226 Ra.We remind that this component takes into account all elements directly facing the crystals: PTFE, NTDs, LDs, bonding wires.A contamination of the reflecting foils introduced during the detector assembly could be conceived, explaining the activities obtained in the fit.
Concerning the surface activity on the reflecting foils, we performed a measurement with the BiPo-3 detector [41] which measures 214 Bi and 208 Tl levels through delay coincidences in the Bi-Po cascades.We can also convert the ICPMS results of the bulk measurement by assigning all the contamination to the surface.The surface activity of the Reflectors derived from the fit agrees well within uncertainties with both measurements.
The derived activities in the Kapton cables, the Connectors, the Brass Screws and the Copper supports agree well with the measured values.For the Cryostat Screens the activities obtained in the fit are higher than the measured levels from the raw copper.This points out to an additional contamination introduced during the fabrication of the screens for example due to the weldings.In particular, we have identified from the experimental data that the detectors facing the weldings in the cryostat screens have higher rates in the 2615 keV peak of 208 Tl.
The Screen 300 K accounts for the residual environmental 's and the radon present in the gap between the outermost cryostat screen and the external lead shielding.The 226 Ra contamination derived from the fit shown in Table 5 can be translated into a radon level concentration resulting in (22 ± 3) mBq/m 3 , which is in good agreement with measurements of 20 mBq/m 3 provided by the radon mitigation system in the LSM [34].
Figure 12 shows the breakdown of the components in the fit of M 1,/ .In the region 0.8 − 3 MeV the dominant contribution is the 2 from 100 Mo and the most important contributions from the radioactivity in the materials are the cryostat and shields.We discuss in the next section the main sources in the 0 region.

Background index in the 100 Mo 0𝜈𝛽𝛽 ROI
We use our simultaneous fit to reconstruct the background index in the 0 region of interest.We chose to calculate the background index in the region ±15 keV around 3034 keV.We sample directly the full posterior distribution produced by JAGS for each step  in the Markov Chain by computing: Here   is the background index in the 0 region of interest,   is the integral of the spectrum of MC source j in the ROI,  ,  is the weight for source j in step i, Δ is the width of the ROI and  is the experimental exposure.
The sum goes over all background sources.The MC simulations are themselves the result of a stochastic process they have a statistical uncertainty, this is accounted for by Poisson smearing the MC ROI integrals per step of the Markov Chain.We then use the distribution of   for the full Markov Chain to estimate the marginalised posterior distribution of the background index.We perform this calculation for our maximal model with all parameters, we therefore naturally marginalise over all possible combinations of activities (for example surface or bulk radio-purity) consistent with our experimental data accounting for the systematic uncertainty due to source localisation.
Next we reconstruct the contributions to the experimental background.We divide sources into five categories: -Crystals U/Th; -Pile-up; -Reflectors; -10 mK sources; -Cryostat and shields.
We emphasise that only the first three sources are relevant to CUPID.In the CUPID baseline the reflective foil is removed to improve background rejection.However, as it was noted before Reflectors include all the elements directly facing the crystals, PTFE, bonding wires, heaters.These elements will remain in CUPID.The final two are caused by materials in the EDELWEISS cryostat which is optimised for a dark matter rather than 0 decay search.The posterior distributions of background index from each source are shown in Fig. 14.We derive the background index for each of the sources in the same way as for the full posterior.We find that the crystals give the smallest contribution, with a background index: As shown in Fig. 14, the posterior probability for pile-up allows us to set an upper limit for its background index, < 1.4 × 10 −3 counts/keV/kg/yr (90% c.i.).This is potentially the main background contribution, in particular due to the low CUPID-Mo sampling frequency (500 Hz) and lack of optimised cuts to remove pile-up.In CUPID, heat and light signals will be exploited together with optimised algorithms to remove pile-up events (see for example [63]).Figure 15 gives the background index extracted from Fig. 14 for each of the grouped components.They are obtained from the mode and the smallest 68.3% interval.For the pile-up the smallest 68.3% interval is compatible with zero, thus an upper limit is presented.

Systematics
To check the stability of the model and the systematic uncertainties, we perform a series of different fits.To take into account the systematic uncertainty due to MC statistics, we where,    ,, is the number of MC events in bin  of source  in spectra , and N   , is the expected number.These nuisance parameters added to the model account for the integer Poisson fluctuations in the MC.We find that the fit remains largely unchanged with only a small change in the value of the background index.
To check the stability of the fit, we perform different fits varying the binning, the energy fit region, the choice of background sources and, in particular, the bulk and surface contaminations in the crystals, as follow: -Binning: we repeat the fit with 1, 2 and 20 keV fixed binning in M 1,/ and M 2 .In all cases, the overall goodness of the fit remains, and the value of the background index is compatible within uncertainties to that of the reference fit, as shown in Table 6.We did not repeat the fit with 1, 2 and 20 keV on M 1,  due to the low statistics in each bin of the data; -Fit energy region: our reference fit extends from 100 keV to 4 MeV for M 1,/ spectrum.We vary the energy threshold to 200 keV and find that the background index only varies slightly; -Choice of background sources: our calculation of the background index is naturally marginalising over this uncertainty (see section 7.3).However, as an additional check we perform the fit without including the crystal bulk contribution for the U and Th chains.The values of the activities of the sources change, but the goodness of the fit remains very similar and the value of the background index remains almost unchanged.We then remove the crystal surface contamination and still obtain a background index compatible within uncertainties to that of the reference fit; -Energy region of M 1,  fit: our reference fit extends from [3000 -10000] keV.As described at the beginning of section 7 the M 1,  fit shows a modest incompatibility between the data and the model, mainly in the region E>6 MeV.We thus performed a fit in [3000 -6360] keV to account for this incompatibility as a systematic uncertainty in our model.In doing so, the U and Th contributions in the crystal get more degenerated, resulting in an increase of the Th contamination assigned in the fit.Still the background index is compatible, within uncertainties, to that of the reference fit.
The results of these tests are summarized in Table 6.As argued above, the result given in Eq. 17 is naturally marginalising over the uncertainty on the choice of the background sources.Considering all tests in Table 6 as a systematic uncertainty (with 2 keV fixed binning) and adding them in quadrature, the background index in (3034 ± 15) keV results in: or: B = 3.7 +0.9 −0.8 (stat) +1.5 −0.7 (syst) × 10 −3 counts/Δ FWHM /mol iso /yr (17) We also verified that the reconstructed background index is not biased, by comparing the distribution of background indexes in toy Monte Carlo simulations to that of the reference fit.15 Background index for the various groups of sources.The values are extracted from the mode of each distribution of Fig. 14, with their respective uncertainties.The green bars correspond to the smallest 68.3% interval around the mode, and the yellow bars to the smallest 90% interval around the mode.For the pile-up the distribution is compatible with zero, thus we give an upper limit to 68.3% c.i. in green and 90% c.i. in yellow.

Residual alpha background
Due to our  particle rejection, background events in the ROI from 226 Ra and 228 Th subchains in the bulk and the surface of the crystals generally arise only from energy depositions of  or  particles.However, 238 U, 234 U, 230 Th, 210 Po and 232 Th could also produce events in the ROI through energy deposits of  particles.Even if we apply a light yield cut to remove  background, it is still possible that some  events pass this selection cut.
From the background index distribution of the crystals, one can separate the background from / decays from that coming from  decays, as shown in Fig. 16.One can observe that a non-negligible part of the crystal background index is coming from 's that passes the light yield cut.This  background is coming from surface contamination of the crystals.It corresponds to an  particle that deposits energy in the crystal and where the nuclear recoil deposits energy in the LD.This kind of events can pass the light yield cut mainly for the crystals that face only one LD.We show in Fig. 17, left, the experimental M 1,/ spectrum including all crystals, and the resulting spectrum selecting only the crystals that face two LDs.Such cut remove all the remaining alphas around 5.8 MeV.This effect is also visible in the background model.Figure 17, right, shows the reconstruction of the crystal component of M 1,/ spectrum, and the resulting spectrum selecting only the crystals that face two LDs.We remind that in CUPID-Mo 5 of the 20 LMOs are facing only one LD due to being in the bottom floor of the towers, one LD was not operational and one had a poor performance affecting a further 4 LMOs.We expect that in the case where all the crystals face two LDs, as in CUPID, the  background contribution should be negligible.

Conclusion
In this work we present the development of a background model capable of describing very accurately the CUPID-Mo experimental data with 2.71 kg × yr exposure.We have performed a simultaneous fit of three data spectra, M 1,/ , M 2 and M 1,  , to detailed Monte Carlo simulations.The model is performed in a Bayesian framework with a MCMC approach.We have shown by a fit to a 56 Co calibration source that the MC implementation is accurate and that the MC is able to describe well the data.We used a total of 67 background sources including the bulk and surface radioactive contaminations in the crystal and the components of the set-up.We have performed systematic checks varying the binning, the energy fit region and the choice of background sources that showed the stability of the model.
We have found that the radiopurity of the Li 2 100 MoO 4 crystals is sufficient to reach the goals of the future 0 experiment CUPID.The radiopurity levels of 226 Ra and 228 Th are below 0.5 Bq/kg.We obtain a background index in the region of interest of 3.7 +0.9 −0.8 (stat) +1.5 −0.7 (syst) × 10 −3 counts/Δ FWHM /mol iso /yr, the lowest in a bolometric 0 decay experiment.
The detailing of the background achieved in this work enables promising further studies.We can obtain the 2 decay rate of 100 Mo with high precision.It also allows for studies on various process which could distort the spectral shape, like Bosonic neutrinos, CP violation or 0 with Majoron(s) emission.
The help of the technical staff of the Laboratoire Souterrain de Modane and of the other participant laboratories is gratefully acknowledged.We thank the mechanical workshops of LAL (now ĲCLab) for the detector holders fabrication and CEA/SPEC for their valuable contribution in the detector conception.For this reason, the CUPID-Mo collaboration is particularly sensitive to the current situation in Ukraine.The position of the collaboration leadership on this matter, approved by majority, is expressed at https://cupid-mo.mit.edu/collaboration#statement. Majority of the work described here was completed before February 24, 2022.

Fig. 1
Fig. 1 Left: An individual CUPID-Mo bolometer showing the transparent Li 2 100 MoO 4 crystal, the copper holder, the NTD-Ge thermometer and the Teflon clamps.Right: View of the opposite side of the detector, showing the black light detector, fabricated from Ge wafers [6].

Fig. 2 Fig. 3
Fig. 2 Left: The CUPID-Mo experiment in the EDELWEISS cryostat.The five towers on the front contain the CUPID-Mo detectors and the EDELWEISS detectors can be seen behind.Right: GEANT4 rendering of the detector chamber in the Monte Carlo simulation geometry.Inside the five towers are placed the LMO crystals, the light detectors, the clamps and the reflective foils (not seen).The readout cables and the structure supporting the towers are indicated.

Fig. 4
Fig.4 Visualisation of the EDELWEISS cryostat and shielding as implemented in our MC simulations, we show the cryostat surrounded by the lead shield, the external polyethylene shielding and the muon veto panels.The muon panels are free to move to give a full geometric coverage.

Fig. 5
Fig.5CUPID-Mo experimental data.Left: M 1,/ : events in one detector identified as /.M 2 : events in coincidence between 2 crystals, the two energies deposited in the crystals are summed.Right: M 1,  : events in one detector with  energy scale.

Fig. 7
Fig. 7 Comparison between the 56 Co calibration data and MC simulations, M 1 data in the region of interest.

Fig. 10 Fig. 11 Experimental M 1 ,
Fig. 10 Distribution of − ln L ( D | ì  ) from the toys for the M 1,/ (left), M 2 (middle) and M 1,  (right) spectra.The red line shows the − ln L ( D | ì  ) of the reference fit for each of the spectra.

Fig. 12
Fig.12 Background sources reconstructing the experimental M 1,/ spectrum, grouped by source location.In blue, 2 is the dominant contribution in [350-3000] keV.The most important contribution from the materials, below 3 MeV, are the cryostat and shields, shown in magenta.

Fig. 14
Fig.14 Posterior distributions of background index of the several background sources grouped by source location.Also shown is the full posterior distribution.

Fig. 16
Fig.16 Posterior distribution of background index of the crystal from  and / contamination.
F.A. Danevich, V.V. Kobychev, V.I.Tretyak and M.M. Zarytskyy were supported in part by the National Research Foundation of Ukraine Grant No. 2020.02/0011.A.S. Barabash, S.I.Konovalov, I.M. Makarov, V.N.Shlegel and V.I.Umatov were supported by the Russian Science Foundation under grant No. 18-12-00003.J. Kotila is supported by Academy of Finland (Grant Nos.314733, 320062, 345869).Additionally the work is supported by the Istituto Nazionale di Fisica Nucleare (INFN) and by the EU Horizon2020 research and innovation program under the Marie Sklodowska-Curie Grant Agreement No. 754496.This work is also based on support by the US Department of Energy (DOE) Office of Science under Contract Nos.DE-AC02-05CH11231, and by the DOE Office of Science, Office of Nuclear Physics under Contract Nos.DE-FG02-08ER41551, DE-SC0011091.This research used resources of the National Energy Research Scientific Computing Center (NERSC) and the IN2P3 Computing Centre.This work makes use of the Diana data analysis software and the background model based on JAGS, developed by the CUORICINO, CUORE, LUCIFER, and CUPID-0 Collaborations.Russian and Ukrainian scientists have given and give crucial contributions to CUPID-Mo.

Table 1
Efficiencies for the cuts used on CUPID-Mo data.The PSD and Light Distance cut efficiencies are evaluated using  peaks, and show no energy dependence in the range of the fit.

Table 2
[41]oactivity values of the components of the CUPID-Mo detectors.All measurements have been made by ICPMS, with the exception of the Springs, measured by -ray spectroscopy and the surface contamination of the Vikuiti ™ reflective foil, measured with the BiPo-3 detector[41].

Table 3
Radioactivity of the components in the EDELWEISS setup.All measurements have been made by HPGe -spectroscopy.The MillMax connectors have also been measured by ICPMS.

Table 5 ,
-210Pb surface contamination with 100 nm and 1 m implantation depth.Top: Comparison between the56Co calibration data and MC simulations, M 1 data, with a variable binning in the region between 200 -4000 keV.Bottom: Bin by bin ratio of data and MC.Most of the values are within 3 , with discrepancies below 20%.

Table 4
Radioactive contaminations of the LMO crystals derived from the background model of the CUPID-Mo data, with 2.71 kg × yr exposure.The upper table shows the bulk activites.We report also the results under the assumption of no surface contaminations, to study the effect in the fit of the anticorrelation between bulk and surface activities.The lower table shows the surface activities, we give the activities with MC simulation with 10 nm implantation depth (see text).The effect of including a contribution with a depth parameter of 10 m is shown on the last column.

Table 6
Background Index in ROI for different fits.The tests allow to check the stability of the model and assess the systematic uncertainties.