Results from 730 kg days of the CRESST-II Dark Matter search

The CRESST-II cryogenic Dark Matter search, aiming at detection of WIMPs via elastic scattering off nuclei in CaWO 4 crystals, completed 730 kg days of data


Introduction
The nature of Dark Matter remains one of the outstanding questions of present-day physics. There is convincing evia e-mail: proebst@mpp.mpg.de b e-mail: jschmale@mpp.mpg.de dence for its existence on all astrophysical scales and many theories predict particle candidates that may be able to explain its composition. However, in spite of numerous attempts, Dark Matter particles have not been unambiguously detected so far.
Several experiments currently aim for direct detection of Dark Matter, mostly focusing on a particular class of particles, the so-called WIMPs (Weakly Interacting Massive Particles). WIMPs nowadays are among the most investigated and best motivated candidates to explain Dark Matter. If they exist, they could be present in our galaxy in the form of a halo, constituting the majority of the galactic mass. Rare interactions with ordinary matter would then possibly be detectable in earthbound experiments.
One such project is CRESST-II (Cryogenic Rare Event Search with Superconducting Thermometers), located at the Laboratori Nazionali del Gran Sasso in Italy. In this experiment we aim for detection of the WIMPs via their scattering off nuclei. The main challenges of this kind of measurement are to detect the tiny amounts of recoil energy transferred to the nucleus (O(10 keV)), and to achieve sufficient background suppression to be sensitive to the extremely low rate of anticipated interactions (not more than a few tens of events per kilogram and year).
To meet these requirements, CRESST uses cryogenic detectors in a low-background environment. Furthermore, the scintillating target material CaWO 4 allows for a discrimination of the type of interacting particle. In this way, potential rare WIMP interactions can be distinguished from events which were induced by the dominant radioactive backgrounds.
In this article, we report on the latest data obtained with the CRESST setup. They comprise a total net exposure of about 730 kg d, collected between 2009 and 2011. The structure of the paper is as follows: Sect. 2 starts with a brief introduction to the experimental setup. Section 3 gives the new aspects of the present run and summarizes the observations. The choice of an appropriate acceptance region and the possible backgrounds relevant in this region are discussed. In Sect. 4, we give a qualitative description and estimation of the backgrounds. Section 5 then describes the quantitative treatment in the framework of a maximum likelihood analysis. The results of this analysis are then discussed in Sect. 6. Finally, Sect. 7 gives the outlook for future runs of the experiment.

Experimental setup
A detailed description of the CRESST-II setup was the subject of earlier publications [1,2]. Here, we restrict ourselves to a few key aspects relevant for the discussion of the new results.

Target material
As a target for WIMP interactions, CRESST uses scintillating CaWO 4 crystals. They have a cylindrical shape (40 mm in diameter and height) and weigh about 300 g each. The current experimental setup can accommodate up to 33 of these crystals, constituting a maximum target mass of about 10 kg.
Under the usual assumption of coherent WIMP scatterings off nuclei, the scattering cross section contains an A 2 enhancement (with the mass number A of the target nucleus). In this case one expects that the total scattering rate in CaWO 4 is dominated by interactions with the heavy tungsten nuclei. However, due to kinematics a heavier nucleus tends to receive a smaller recoil energy, and in a detector with a finite energy threshold the other constituents can also become relevant despite the coherence effect. To illustrate this point, Fig. 1 shows, as a function of the WIMP mass, the contributions of the three elements in CaWO 4 to the total rate of WIMP interactions in the energy interval 12 to 40 keV, the range typical of the CRESST detectors.
For low WIMP masses, up to about 12 GeV, the scatterings off tungsten are completely below threshold, and oxygen and calcium recoils give the only possible Dark Matter signal. On the other hand, above WIMP masses of about 30 GeV, tungsten completely dominates. When looking for possible WIMP interactions we therefore consider recoils of all three types of nuclei in CaWO 4 , in order to be sensitive to the largest possible range of WIMP masses. Fig. 1 Contributions of the three types of nuclei present in a CaWO 4 target to the total rate of WIMP interactions, as a function of the WIMP mass and for a cross-section of 1 pb, assuming coherent ∼A 2 interactions. The calculation assumes a detector with a sensitive energy range between 12 and 40 keV

Phonon and light detectors
In order to be able to detect the low energy nuclear recoils the target crystals are operated as cryogenic calorimeters at temperatures of about 10 mK. The energy deposited by an interacting particle is mainly converted into phonons, which are then detected with a transition edge sensor (TES). We thus denote the target crystals with their TES as phonon detectors.
The TES is a thin tungsten film evaporated onto the crystal, with the temperature stabilized in the transition from the normal to the superconducting state. The tiny change of the film temperature (O(µK)) induced by the absorption of the phonons leads to a measurable change in resistance. This signal is read out by SQUID-based electronics. The amplitude of the signal is a precise measure of the deposited energy. After the interaction, the crystal temperature relaxes back to the equilibrium state via a weak thermal coupling to the heat bath.
In addition to the phonon signal, a small fraction of the energy deposited in the target crystal is converted into scintillation light. Each crystal is paired with a separate cryogenic light detector in order to detect this light signal. Most of the light detectors are made from a silicon-on-sapphire wafer (a sapphire wafer of 40 mm diameter and 0.46 mm thickness, with a 1 µm silicon layer on one side, which acts as photon absorber). As an alternative, some light detectors consist of pure silicon wafers of the same size. Similar to the target crystals, each light detector is equipped with a tungsten transition edge sensor to determine the energy deposited by the absorption of scintillation photons.
A crystal and the corresponding light detector form a socalled detector module as shown in Fig. 2. Both detectors of a module are held by thin, silver-coated bronze clamps and are enclosed in a common, highly light reflective housing in order to collect as much scintillation light as possible. The reflector is a polymeric foil which also scintillates. This will be discussed in detail in Sect. 2.4.

Quenching factors and background discrimination
For each particle interaction, a detector module yields two coincident signals (one from the phonon and one from the light detector). While the phonon channel provides a sensitive measurement of the total energy deposition in the target (approximately independent of the type of interacting particle), the light signal can be used to discriminate different types of interactions. To this end, we define the light yield of an event as the ratio of energy measured with the light detector divided by the energy measured with the phonon detector. We normalize the energy scale of the light channel such that 122 keV γ 's from a 57 Co calibration source have a light yield of unity. With this normalization electron recoils induced either by β sources or by gamma interactions, generally have a light yield of about 1. On the other hand, α's and nuclear recoils are found to have a lower light yield. We quantify this reduction by assigning a quenching factor (QF) to each type of interaction. The QF describes the light output expressed as a percentage of the light output for a γ of the same deposited energy.
Some quenching factors can be directly determined from CRESST data. For example, neutrons detectably scatter mainly off the oxygen nuclei in CaWO 4 . The QF for oxygen can thus be determined from a neutron calibration run which took place during the data taking discussed here. The result is Moreover, the quenching factor for low energy α's can be found to be about 22 %, using α-events in the current data set. Similarly, the value for lead can be inferred to be around 1.4 %. Both measurements will be discussed below.
Other types of interactions (in particular calcium and tungsten recoils) are not observed with sufficient statistics in CRESST, and their quenching factors must be determined in dedicated experiments [3]: Corresponding to these different values, there will be characteristic "bands" for the different particles or recoils in the light yield-energy plane. This allows for an excellent discrimination between potential signal events (expected to show up as nuclear recoils) and the dominant radioactive backgrounds (mainly e/γ -events). Furthermore, it is even partially possible to determine which type of nucleus is recoiling. Such a discrimination is possible to the extent to which the different nuclear recoil bands in the light yield-energy plane can be separated within the resolution of the light channel. This then allows a study of potential WIMP interactions with different target nuclei, in parallel in the same setup. Such a possibility can be particularly relevant for the verification of a positive WIMP signal, and is a distinctive feature of CRESST.

Scintillating housing
As mentioned above, the housing of the detector modules consists mainly of a reflecting and scintillating polymeric foil. Making all surfaces in the vicinity of the detectors scintillating is important in discriminating background events due to contamination of surfaces with α-emitters. The basic mechanism of this background is illustrated in Fig. 3.
The most important isotope in this context is 210 Po, a decay product of the gas 222 Rn. It can be present on or implanted in the surfaces of the detectors and surrounding material. The 210 Po nuclei decay to 206 Pb, giving a 5.3 MeV α-particle and a 103 keV recoiling lead nucleus. It can happen that the lead nucleus hits the target crystal and deposits its energy there, while the α-particle escapes. Due to its low quenching factor, the lead nucleus can often not be distinguished from a tungsten recoil and thus can mimic a WIMP interaction.
However, if the polonium mother nucleus was located on the surface of the target crystal or implanted in it (the upper case in Fig. 3), the full 103 keV of the daughter nucleus plus a possible contribution from the escaping α-particle will be deposited in the target and the event will lie above the energy range relevant for the WIMP search.
Another situation arises when the polonium atom was implanted in a surrounding surface. Then the daughter Pb nucleus can lose part of its energy on the way to the target crystal and appear in the energy range of interest (the lower case in Fig. 3). This possibly dangerous background can be rendered harmless if the surrounding surfaces are scintillating; in this case the escaping α-particle produces additional scintillation light when hitting those surfaces and the event will appear as high-light event in distinction to the low-light nuclear recoils.
Hence the scintillation of the complete surroundings of the target crystals plays an important role. With the scintillating foil used as a module housing, currently the only non-scintillating surfaces inside the detector modules are the small clamps which hold the target crystals. In earlier runs, attempts were made to cover these clamps with scintillating layers as well, but these layers appeared to give rise to thermal relaxation events. The current module design therefore avoids any scintillating (plastic) material in direct contact with the crystals. The price for this measure, however, is the presence of several Pb recoil events with energies of 103 keV and below in the data set, as expected from the above discussion. This background must therefore be taken into account in our analysis. Figure 4 shows a schematic drawing of the whole CRESST setup, with the detector modules in the very center. The low temperatures are provided by a 3 He-4 He dilution refrigerator and transferred to the detectors via a 1.3 m long copper cold finger. The detector volume is surrounded by several layers of shielding against the main types of background radiation: layers of highly pure copper and lead shield against γ -rays, while polyethylene serves as a moderator for neutrons. The inner layers of shielding are contained in a gas tight box to prevent radon from penetrating them. In addition, an active muon veto using plastic scintillator panels is installed to tag muons. The veto surrounds the lead and copper shielding and covers 98.7 % of the solid angle around the detectors, a small hole on top is necessary to leave space for the cryostat. Schematic drawing of the CRESST setup. A cold finger (CF) links the cryostat (CR) to the experimental volume, where the detectors are arranged in a common support structure, the so-called carousel (CA). This volume is surrounded by layers of shielding from copper (CU), lead (PB), and polyethylene (PE). The copper and lead shieldings are additionally enclosed in a radon box (RB). An active muon veto (MV) tags events which are induced by cosmic radiation

Data analysis
We apply just a few basic quality cuts to the raw data in order to ensure that only valid events, with well-reconstructed energies in the phonon and light channel, are considered for further analysis. In particular, we require that both the phonon and light detector of a given module were fully operational and running stably at their respective operating points at the time of an event. Data acquisition, readout, and the procedures for monitoring detector stability, trigger efficiency as well as reconstructing the deposited energy from the measured pulses are described elsewhere [1]. All cuts were adjusted on a small subset of the data (about 10 % of the data set, taken from the beginning of the run) and then blindly applied to the full data set.
For the final data set, we reject events coincident with a signal in the muon veto as well as those events with coincident signals in more than one detector module, since multiple scatterings are excluded for WIMPs in view of their rare interactions.
An important aspect of the analysis concerns the bands in the light yield-energy plane. The e/γ -band is highly populated due to the relatively high rate of common backgrounds. This allows us to extract the position and, in particular, the energy-dependent width of this band directly from the measured data. The observed width is dominated by the light channel resolution compared to which the resolution of the phonon channel is much superior. This is understandable in view of the small fraction of the deposited energy appearing as light.
We extract the resolution of the light channel as a function of light energy by fitting the e/γ -band with a Gaussian of energy dependent center and width. We note that, although the production of scintillation light is governed by Poisson statistics, the Gaussian model is a very good approximation in our regions of interest. This is because the e/γ -events produce a sufficiently large number of photons for the Poisson distribution to be well approximated by a Gaussian. On the other hand, for the quenched bands with low light yields, the Gaussian baseline noise of the light detector determines the resolution.
The position and width of the bands other than the e/γband can be calculated based on the known quenching factors discussed above and using the light channel resolutions obtained from the fit to the e/γ -band. In order to get the width of a quenched band at a certain energy the light channel resolution for the actual light energy is used.
To validate this calculation for quenched bands, we use the data from a calibration measurement with an AmBe neutron source placed outside the Pb/Cu shielding. We expect the neutrons to mainly induce oxygen recoils. Figure 5 shows the data obtained by one detector module in this measurement, together with the calculated central 80 % band for oxygen recoils (10 % of the events are expected above the upper and 10 % below the lower boundary). At very small recoil energies, where the light signal is smaller than the baseline noise, the standard event fit of the light signal may deliver negative pulse amplitudes, resulting in the negative light yields in Fig. 5.
The measured energy spectrum of nuclear recoil events is shown in Fig. 6 (the grey histogram). Recoil energies up to about 300 keV are observed, with the spectrum falling off quickly towards high energies. Thereby, the highest energies are possible for oxygen recoils, since in neutronnucleus elastic scattering the recoil energy of the nucleus is inversely proportional to its mass. Simply from the ratio of the mass numbers we then expect the highest energy of calcium recoils to be around 100 keV. This kinematic argument is nicely confirmed by a Monte Carlo simulation of the  The energy spectrum of nuclear recoils induced by an AmBe neutron source mounted outside the lead/copper shielding. The measured spectrum (grey) is well matched by the Monte Carlo simulation (black). The simulation also yields the individual contributions from neutron scatterings off oxygen (red), calcium (green), and tungsten (blue) nuclei. Oxygen largely dominates at all energies and is the only relevant contribution above 100 keV (Color figure online) neutron calibration run, the results of which are also given in Fig. 6. Plotted are the total simulated energy spectrum (black) which closely follows the measured data, as well as the contributions from oxygen (red), calcium (green), and tungsten (blue) recoils.
As expected, above 100 keV we observe almost purely oxygen recoils. Looking at Fig. 5, their distribution fits well into the calculated oxygen band. Towards lower energies, the observed events are still in agreement with the prediction, although an increasing contribution from calcium recoils slightly shifts the center of the observed event distribution to lower light yields.

Data set
The latest run of CRESST took place between June 2009 and April 2011. It included a neutron test and γ -calibrations with 57 Co and 232 Th sources. In total, 18 detector modules were installed in the cryostat, out of which ten were fully operated. The remaining modules cannot be employed for a Dark Matter analysis, principally due to difficulties in cooling the light detectors. However, seven additional individual detectors (six phonon and one light detector) were still operated in order to tag coincident events (with signals in more than one module).
One of the ten operational modules was equipped with a test ZnWO 4 crystal and we do not include it in this analysis because of uncertainties in the quenching factors in this material. Another operational detector module had unusually poor energy resolution, with practically no sensitivity in the WIMP signal region, and was therefore excluded from the analysis. The data discussed in this paper were thus collected by eight detector modules, between July 2009 and March 2011. They correspond to a total net exposure (after cuts) of about 730 kg days, distributed between the detector modules as shown in Table 1. Figure 7 shows an example of the data obtained by one detector module, presented in the light yield-energy plane. The e/γ -events are observed around a light yield of 1. The calculated bands for α's, oxygen recoils, and tungsten recoils are shown. 1 The spread of a band at each energy is chosen so that it contains 80 % of the events, that is 10 % of the events are expected above the upper boundary and 10 % of the events are expected below the lower boundary. This convention will be used throughout the following discussion whenever we refer to events being inside or outside of a band.

Observed event classes
Beside the dominant e/γ -background, we identify several other classes of events: Firstly, we observe low energy α's with energies of 100 keV and less. They can be understood as a consequence of an α-contamination in the non-scintillating clamps holding the crystals. If the α-particle has lost most of its energy in the clamp before reaching the target crystal, it can appear at low energy. The rate of such α-events differs by some factor of two among the detector modules (see Sect. 4.2).
Secondly, Fig. 7 shows a characteristic event population in and below the tungsten band around 100 keV. This is present in all detector modules, albeit the number of events varies. This population can be attributed to the lead nuclei from 210 Po α-decays on the holding clamps (see Sect. 2.4). The distribution of these events exhibits a low-energy tail, with decreasing density towards lower energies. In spite of this decrease, there are detector modules (the ones with a Fig. 7 The data of one detector module (Ch20), shown in the light yield vs. recoil energy plane. The large number of events in the band around a light yield of 1 is due to electron and gamma background events. The black line is the boundary below which 0.1 % of these e/γ -events are expected at each energy. The shaded areas indicate the bands, where alpha (yellow), oxygen (violet), and tungsten (gray) recoil events are expected. Additionally highlighted are the acceptance region used in this work (orange), the reference region in the α-band (blue), as well as the events observed in these two regions. See text for discussion (Color figure online) high population of such lead events) in which the tail visibly reaches down to energies as low as a few tens of keV.
Events without scintillation light might result from relaxation processes connected with an improper mounting of the CaWO 4 crystals. The mounting in the present run was designed to avoid such relaxation processes and there is no indication that such events are present in the data. Possible mechanisms for the creation of such relaxation events and our measures of preventing them are discussed in Appendix B.
Finally, low energy events are present in the oxygen, (calcium,) and tungsten bands at energies up to a few tens of keV, i.e. in the region of interest for the WIMP search. These events will be the main focus of our discussion in the following. We start by defining the acceptance region on which the discussion will be based.

Acceptance region
Depending on the mass of a possible WIMP, any of the nuclei in CaWO 4 can be a relevant target for WIMP scattering as discussed above. We therefore choose our acceptance region such that it includes all three kinds of nuclear recoils: it is located between the upper boundary of the oxygen band and the lower boundary of the tungsten band. This selection automatically includes the calcium band.
We restrict the accepted recoil energies to below 40 keV, since as a result of the incoming WIMP velocities and nuclear form factors, no significant WIMP signal is expected at higher energies. On the other hand, towards low energies the finite detector resolution leads to an increasing leakage of e/γ -events into the nuclear recoil bands. Figure 7 illustrates this fact by showing the boundary (black line) below which 0.1 % of the e/γ -events at the respective energy are expected. We limit this background in the acceptance region by imposing a lower energy bound E min acc in each detector module, chosen such that the total expected e/γ -leakage into the acceptance region of this module is one event in the whole data set. The leakage calculation is based on the fit to the observed e/γ -distribution in the Dark Matter data set as described in Sect. 5.2. Due to the different resolutions and levels of e/γ -background in the crystals, each module is characterized by an individual value of E min acc . Table 1 lists the values of E min acc for all modules. An example of the resulting acceptance region is shown (orange) in Fig. 7 and the events observed therein are highlighted. In the sum over all eight detector modules, we then find 67 accepted events, the origin of which we will discuss in the following. Table 1 shows the distribution of these events among the different detector modules. Since E min acc as well as the width of the bands are module-dependent, different modules have different sized acceptance regions and thus different expectations with respect to background and signal contributions.

Backgrounds in the acceptance region
With the above choice of the acceptance region, four sources of background events can be identified: 1. leakage of e/γ -events at low energies, 2. α-events due to overlap with the α-band, 3. neutron scatterings which mainly induce oxygen recoils in the energy range of interest, and 4. lead recoils from α-decays at the surface of the clamps, degraded to low energy.
In the following, we estimate the contribution of each of the four sources of background listed before and finally investigate a possible excess above this expectation. When present, such an excess may be the result of WIMP scatterings in our detectors, or of course an unsuspected background.
We estimate the backgrounds and the contribution of a possible WIMP signal in the framework of a maximum likelihood fit that allows a treatment with all relevant parameters and their uncertainties simultaneously. However, before introducing this rather abstract formalism in Sect. 5, we give a qualitative discussion of the backgrounds, with the aim of clarifying the basic arguments and assumptions.

e/γ -Background
The lower energy bound of the acceptance region is chosen such that we expect a leakage of one background e/γ -event per detector module. These events are expected to appear mainly at the low energy boundary of the acceptance region where the overlap between the bands is largest and towards the upper boundary, closer to the e/γ -band. The quantitative modeling is discussed in Sect. 5.2.

α-Background
Since the α-band has some overlap with the acceptance region, some low energy α-events may be misidentified as oxygen or even calcium or tungsten recoils. This will lead to a certain expectation n α acc of background events in the acceptance region of each module.
The energy spectrum dN α /dE of the low energy events in the α-band appears to be flat within the available statistics. In order to estimate dN α /dE, we first select a reference region of the α-band which is free of overlap to any other band.
An example of such a reference region is highlighted (blue) in Fig. 7. To obtain reasonable statistics, we have chosen the relatively large energy range of 100 keV for the reference region. The low energy limit E min ref of the reference region is chosen as low as possible, while taking into account the increasing e/γ -leakage into the α-band towards low energies. Low E min ref are desirable, since the reference region should naturally be close to the acceptance region to minimize extrapolation errors. On the other hand, influences of the e/γ -background on the α-estimate should be minimized and the value of E min ref was thus chosen such that only 0.1 counts of e/γ -leakage are expected in the whole reference region for each detector module. Since the width of the bands varies from module to module, each module has its own value of E min ref . These values are listed in Table 2. For our first rough estimate of the alpha background in the acceptance region, we simply count events in the reference region. We then assume constant dN α /dE and calculate the ratio of α-events expected in the acceptance region to those expected in the reference region. Scaling the observed number of events in the reference region by this ratio, we arrive at an estimate of the α-background in the acceptance region. Table 2 summarizes the observed alpha counts n α ref in the reference region and the resulting estimates of the alpha background n α acc in the acceptance region of each module. This results in a total expected α-background of about 9.2 events.
In the likelihood analysis of Sect. 5, we will relax the assumption of constant dN α /dE and also allow for a linear term in the α-energy spectrum. We will, however, see that the simple estimate given here is quite close to the one obtained with the more sophisticated analysis.

Introduction
Throughout our discussion, we distinguish two different classes of neutron production mechanisms: Firstly, free neutrons can be emitted by radioactive processes, in particular spontaneous fission (s.f.) of heavy elements or (α, n) reactions on light nuclei. Such neutrons typically have energies up to a few MeV, for which the polyethylene shielding provides a very efficient moderator. Monte Carlo simulations suggest that neutrons from s.f. and (α, n) reactions in the rock outside the experiment thus only constitute a negligible background at the level of 10 −5 events per kg d [4]. Therefore, such neutrons are a possibly relevant background only if they are emitted inside the neutron shielding, e.g. by s.f. of 238 U in the lead shielding, or by (α, n) reactions or s.f. in the copper shielding. We consider such neutrons here, even though Monte Carlo simulations predict only a negligible background in the acceptance region due to these processes, at a level of 10 −3 events per kg d [5].
Secondly, neutrons can also be produced by muons, either in the lead or copper shielding or in the surrounding rock. In the former case, the muon will mostly be tagged by the muon veto enclosing the Pb/Cu shielding. However, there is a small probability that the muon is missed by the veto because of the hole on top of the setup, as mentioned above (cf. Fig. 4). Such a muon may create a shower of neutrons inside the PE shielding which then reach the detectors. On the other hand, muon-induced neutrons created outside the neutron shielding may penetrate the polyethylene layers if they are energetic enough. Such high energy neutrons then have a high probability to scatter inelastically in the Pb/Cu shielding and to create secondary neutrons and gammas. Ultimately, this leads to events with similar characteristics as when the shower is directly induced by a muon inside the PE shielding.

Experimental information
It is a characteristic feature of neutrons that they can lead to coincident events in more than one detector module at the same time, with at least one module registering a nuclear recoil. A given neutron source will thereby lead to events with a characteristic ratio between single and coincident scatterings. If this ratio is known, the observed coincidences can be exploited to estimate the expected number of single neutron (background) events. This is the basic concept pursued in the following.
We shall base our discussion on two sources of information. One is the results of the neutron test with an AmBe neutron source already mentioned above. The second is the examination of events in coincidence with incoming muons, i.e. muon veto triggers accompanied with a signal in the acceptance region of a module. These two "calibration measurements" are used to infer the properties of neutron backgrounds originating from ambient neutrons and from muons escaping the veto and interacting in the apparatus. We emphasize that Monte Carlo or other external information is helpful, but does not enter into our quantitative estimates.
For each type of neutron mechanism, we use the corresponding calibration to infer the typical structure of the events with respect to their multiplicity, defined as the number of detector modules triggering at the same time (within a time window of 5 ms). For muon-induced neutrons, we observe 40 events coincident with a muon veto trigger and with at least one detector module having a signal in its acceptance region. The multiplicities of these events are shown in the top histogram of Fig. 8. On the other hand, the multiplicities of the events induced by the AmBe neutron source (placed at various positions outside the Pb/Cu shielding but inside the neutron shielding) are given in the bottom histogram of Fig. 8.
Events due to muon-induced neutrons show an obviously higher average multiplicity than events from the neutron test source. This evidently results from muon-induced cascades, leading to neutrons and gammas in different detector modules at the same time.
With this information at hand, we turn back to the Dark Matter data set. In addition to the 67 accepted events, these data contain three events in which several detector modules triggered in coincidence, with at least one module registering an event in its acceptance region. Two of these events have a multiplicity of three (i.e. three modules triggered), while in the third event five modules triggered in coincidence. Such coincidences which include at least one nuclear recoil must involve a neutron, and perhaps γ 's. Accidental coincidences may be neglected in view of the low overall counting rate.
Based on these three coincident events we can use the calibration information given above to scale up to the number of expected single scatters for each type of source.
For a first rough estimate of the level of neutron background, we start with the three observed coincidences and neglect their precise multiplicities. From the histograms in Fig. 8, we find that muon-induced neutrons are characterized by a ratio of single to coincident scatterings of about 0.5, while the neutron source test gives a ratio of about 3.8. Consequently, if we assume that only muon-induced neutrons are present in the experiment, this would lead to an estimated number of single scatterings of 3 · 0.5 = 1.5. On the other hand, in case the neutron background purely comes from a radioactive source, the same estimate gives a background expectation of about 3 · 3.8 = 11.4 single events.
In reality, both types of neutrons may be present, and the above limiting cases show the extremes between which the expected neutron background can lie according to this estimate. Our likelihood analysis of Sect. 5 takes into account the more detailed information of the individual event multiplicities in order to clarify the contributions of the two types of neutron sources to the total background. We will, however, see that the result is compatible with the simple estimates of the limiting cases given here.
An independent aspect of the neutron background concerns the corresponding recoil energy spectrum. Within our narrow accepted energy range, the energy spectra induced by the two types of neutron events are found to be very similar, according to the calibration data discussed above. The spectrum can be parametrized by a simple exponential dN n /dE ∝ exp (−E/E dec ). We determine the parameter E dec from a fit to the spectrum obtained in the AmBe neutron calibration run. In the energy range between 12 keV to 40 keV we obtain E dec = (23.54 ± 0.92) keV.
This similarity in the spectra induced by neutrons from the two quite different sources (in agreement with Monte Carlo results [5]) indicates how the Pb/Cu shielding surrounding the detectors will moderate an incoming neutron flux regardless of its origin. The primary spectrum of the neutrons is washed out by inelastic scatterings in the shielding. This finding supports our use of the results of the neu- tron calibration to estimate the effects of a general neutron background. The only exception to this argument might be a neutron-producing contamination in close vicinity of the detectors. In this case, we would expect a recoil spectrum reaching to much higher energies and some fewer singles for a given number of coincidences. In this case, the application of our above calibration results would lead to a conservative neutron background estimate.

Lead recoil background
To illustrate the lead recoil background from 210 Po decay, Fig. 9 displays the data set of a different detector module as in Fig. 7. Compared to Fig. 7, a more prominent population of 206 Pb recoils below the tungsten band is visible, with a rather long tail extending down to the acceptance region. Since the lead band and the acceptance region overlap considerably, a leakage of some 206 Pb events into the acceptance region cannot be excluded.
For an estimate of this background, we follow a similar strategy as for the α-background. We define a reference region for each detector module which contains predominantly 206 Pb recoils, and model the spectral energy density dN Pb /dE in this region. This model is then extrapolated into the energy range of the acceptance region.
As a reference region, we choose the lead recoil band at energies above the acceptance region, where a possible WIMP signal cannot contribute. In some detector modules with wider bands, the lead band still overlaps with the oxygen band around the lower edge of this energy range. In this case, we additionally restrict the reference region to the lower part of the lead band without overlap with the oxygen band in order to be independent of possible neutron-induced events on oxygen. The event distribution of the Pb recoils peaks at the full lead recoil energy of 103 keV and the upper boundary of the reference region is set at 90 keV so that it covers the low energy tail. An example of the resulting reference region is highlighted in green in Fig. 9. Table 3 summarizes the counts n Pb ref observed in the reference region of each detector module. Figure 10 presents the energy spectrum of the events found in the 206 Pb reference regions of all detector modules, but includes also lead recoils with higher energies to illustrate the peak at the full nominal recoil energy of 103 keV. In the energy range of the reference region (below 90 keV), the tail of the distribution can be modeled by an exponential decay on top of a constant contribution: For a first rough estimate of the recoil background, we simply fit such a function to the spectrum of Fig. 10. The red line shows the result of this fit with the parameters A Pb = 4.53 counts/keV, C Pb = 0.13 and E Pb decay = 13.72 keV. This model then needs to be extrapolated into the energy range of the acceptance region.
To check the validity of this extrapolation we used the SRIM package [6] to simulate the energy spectra expected for three different depth distributions of the 210 Po mother nucleus. These distributions were: an exponential profile with 3 nm decay length peaking at the surface, a uniform distribution in the volume, and finally the depth distribution resulting from implantation due to preceding alpha decays. For the latter case the implantation profile was also calculated with SRIM, assuming that 222 Rn is first adsorbed on the surface of the clamps holding the crystals, followed by two subsequent alpha decays to 210 Po. The results of the three simulations are shown in Fig. 11.
An important result of the simulation is that none of the calculated spectra of the Pb recoils rises significantly to-  wards low energies within the range of our acceptance regions. The simulated spectra for the uniform as well as the implantation profile are rather flat between 40 and 90 keV compared to the data in Fig. 10. However, the energy spectrum from the distribution peaking at the surface has a tail similar to that observed in the data. The curve in Fig. 11 is the result of a fit of Eq. (1) to this spectrum in the energy range of the reference region between 40 and 90 keV. Its ex-ponential decay of E Pb decay = 13.6 keV agrees within errors with the value obtained from the fit of the data in Fig. 10. Below 40 keV the curve in Fig. 11 shows the extrapolation of this function into the accepted energy range. The very good agreement of extrapolation and simulated data in the energy range of the acceptance region justifies the use of this type of extrapolation for our estimate of the Pb background.
For a rough first result, we take a typical energy range for the acceptance region of 12 to 40 keV and estimate the number of Pb recoils in it. From the extrapolation of the fit function in Fig. 10 we calculate about 17 events of 206 Pb background in the acceptance region. Of course, this simple estimate neglects small differences in the overlap of the lead band with the acceptance region in different detector modules, as well as the acceptance reduction in the reference region due to partial overlap of lead and oxygen bands. Nevertheless, the result is very close to the final value that we will obtain from the full likelihood analysis, which performs the background estimate module-wise and hence takes such differences into account.

Maximum likelihood analysis
In the previous section, the qualitative principles of our background estimates were discussed. This section explains how we formulate these concepts quantitatively. We choose the framework of a likelihood analysis to estimate the unknown parameters of our backgrounds and as well as the ones of a possible signal. This formalism also allows us to take into account and propagate the corresponding uncertainties. We first give a general overview of the formalism before focusing in detail on the treatment of the different backgrounds in the current measurement.

General concepts
The maximum likelihood fit is based on a parameterized model of the backgrounds and a possible signal. The parameters are then varied to find the values for which the model most likely reproduces the data.
As we will see, this procedure allows one to take into account the actual position of the events in the light yieldenergy plane, and not merely their presence in a broad acceptance region. Also, measurements from reference regions, outside the acceptance regions, may be introduced to help determine some of the parameters. Furthermore one may, in the framework of the method, give a quantitative estimate of how well a best set of parameters is determined.
The maximum likelihood method itself however gives no direct indication of the quality of the resulting fit. To deal with this question, we shall attempt to give a characterization of the quality-of-fit using the so-called p-value.

Likelihood function
The basic input to the analysis is a model for each source X that may contribute to our accepted events, be it background or a possible signal. Such a model describes the expected distribution of events in the (E, y) plane (where E denotes the recoil energy and y the light yield).
We formulate such a model in terms of a density ρ X (E, y | p X ) in the (E, y) plane. The number of events expected from source X in any subspace of the (E, y) plane is then simply given by the integral of ρ X over this region. ρ X depends on a set of unknown parameters summarized in the vector p X . These are the parameters to be varied.
In our case, each such density function is the product of two components: the expected recoil energy spectrum dN X /dE and a function that describes the expected event distribution in the light yield coordinate at each recoil energy E.
As discussed above, we use a Gaussian distribution for the light yield, where the center of the distribution is given by the respective quenching factor and the width is essentially determined by the resolution of the light channel.
Since different detector modules have different resolutions, the densities ρ X need to be defined for each module individually. We therefore add an additional index d for "detector module" to each density. Assuming that we have considered all significant sources of events, the total density for module d is where we have taken into account the four backgrounds discussed above, plus a possible WIMP signal "χ ". The vector p summarizes all the unknown parameters. We now have observed a set of events in the acceptance region of each module, located at the positions (E i , y i ). The functions ρ d can be fitted to the event distribution in the respective module and, from this the most likely values of the unknown parameters can, in principle, be determined. The normalizations of the ρ depend on the variable parameters and give the total number of expected events N via where the integral runs over the acceptance region of module d and thus yields the total expected number of accepted events in this module. The ρ d are thus no probability densities, but they can be used in the so-called extended maximum likelihood formalism [7] to formulate the likelihood function which then takes the form with the outer product running over the detector modules and the inner one over all accepted events in the respective module. The exponential factor takes into account the normalization of the ρ d . We emphasize that, in (4), we directly evaluate ρ d at the position of each observed event and no binning is involved. By fitting the density functions to the observed event coordinates, one makes use of the full available information from the two-channel measurement. Using the experimental data, the fit then finds the most likely values of the parameters, including those for the backgrounds and a possible signal.

Reference regions
In practice, some of the unknown parameters in p are not sufficiently constrained by the observation in the acceptance region alone. In such cases, additional observations ("reference measurements") can be exploited which are made either in other regions of the measurement parameter space (for example in the α-reference region introduced in the previous section), or in completely different experiments or measurement channels. Each such reference measurement yields a separate likelihood function which constrains a subset p of p, and, assuming all measurements are independent, the total likelihood is simply given by the product This concept can also be exploited in order to take into account the effects of uncertainties on otherwise fixed quantities, like e.g. the quenching factors of O, Ca and W, which enter the analysis and have been measured previously. To this end, these quantities are added to p as free parameters and, at the same time, an additional likelihood term is included for each of them which models the uncertainty of the quantity in question (e.g. a Gaussian around the best estimate, with the width given by the error on this estimate).
Finally, maximizing the total likelihood function L tot (p) leads to estimates for all unknown parameters p. Most of the time, however, one is only interested in a subset of these values, the others just being nuisance parameters required to construct the likelihood. In our case, the only parameters of interest are the WIMP-nucleon cross section σ WN and possibly (if we find a clear signal) the WIMP mass m χ . Our aim is to derive confidence intervals for these relevant quantities, taking into account the nuisance parameters and their uncertainties.

WIMP parameters
In a first step, we will thereby only concentrate on σ WN and ask whether our measurement indicates σ WN > 0 significantly. All other parameters shall be summarized in the vector p . This question corresponds to a test of the nullhypothesis σ WN = 0 with the model.
As a convenient test statistic, we employ the likelihood ratio Here,σ WN and p in the denominator are the maximum likelihood estimators of σ WN and p , respectively, and p is the conditional maximum likelihood estimator under the condition σ WN = 0. Λ is hence a measure of how "signal-like" our observation is, in the sense that small values indicate a considerably better description of the data with an allowed WIMP contribution than without. According to Wilk's theorem [8], if σ WN = 0 were the true assumption, then the quantity would have a χ 2 -probability distribution for one degree of freedom in the limit of large statistics. We have verified with a Monte Carlo simulation that this approximation holds well for the case of our analysis. This allows for a simple calculation of the statistical significance S, with which we can reject the null hypothesis σ WN = 0 when having observed a certain value q obs : A higher value of S obviously implies a smaller probability that a statistical fluctuation of the backgrounds may produce a data set as "signal-like" as the observed one, although there is no true signal present.
In a second step, if σ WN > 0 can be established with sufficient significance, the above approach can be generalized and m χ can be treated as a second parameter of interest. The aim is then to calculate a confidence region in the (m χ , σ WN ) plane. Such confidence regions are given by the contours on which the likelihood has decreased by a certain factor δL with respect to the maximum, provided that all parameters other than m χ and σ WN are refitted in each point. The value of δL that yields the desired confidence level can thereby be obtained from a χ 2 -distribution for two degrees of freedom. In fact, the same procedure can be applied to calculate confidence regions for any combination of n ≥ 1 parameters, using the χ 2 -distribution for n degrees of freedom. The one-dimensional case is commonly known from its implementation in the MINOS program of the MINUIT software package [9].

p-Value
As mentioned above, the significance resulting from the above likelihood ratio test is, in itself, not a good measure of the quality of the fit to the data. High significances do not automatically imply a good agreement between the fitted model and the data, so that the quality-of-fit has to be determined in an independent step. One possibility of doing so is described in [10] and we adopt an analogous procedure here: The (E, y)-plane (or a subspace of it) is divided into twodimensional bins, separately for each detector module. In each bin i, the expected number e i of events can be calculated by integrating the density ρ d . To go with this expectation, we have certain numbers of observed events in the bins. We can then determine the total probability of this observation by assuming an independent Poisson process with expectation e i in each bin i and multiplying all corresponding Poisson probabilities.
Let our real observation have such a probability P obs . Independently, we can then generate artificial data sets where, in each bin i, the generated number of events is Poisson distributed with the expectation e i . We employ a Markov Chain Monte Carlo for a fast generation of these data sets. Each generated set s can then again be characterized by a certain probability P s . From the distribution of the P s , we finally determine the probability to get a data set with probability less or equal to P obs . This is called the p-value and can be used to describe the quality-of-fit.
Often, as in our case, the model used to generate the artificial data sets is the result of a previous fit to the real observed data, where parameter values have been tuned for an optimal description of this observation. The p-value as obtained above needs to be corrected for this bias and a procedure of doing so is suggested in [10]. We perform this correction, but we note that its influence is naturally small when the number of free fit parameters is small compared to the number of bins considered.
Having now established the formal framework of our analysis, we will focus in the following on the concrete construction of the densities as well as the required reference measurements for each considered source of events.

e/γ -Background
The distribution of the e/γ -events in the (E, y) plane is observable directly in the Dark Matter data set. Due to the large number of such events, statistical fluctuations are negligible and we use the observed distribution directly to fix the densities ρ d γ (E, y). The fit of the energy-dependent Gaussian width of the e/γ -band as outlined above thereby gives the parametrization of the light yield distribution of the events. On the other hand, we use a histogram of the observed recoil energies (bin width 0.1 keV) as the expected energy spectrum and interpolate it to arrive at a continuous function. In the energy range of interest, the statistical uncertainties of the bin contents of this histograms are below 10 −2 . The resulting densities ρ d γ can therefore be treated as fixed contributions to the total density in each module, with no free fit parameters.
This estimate of the leakage of e/γ -events into the acceptance regions relies on the assumption of a Gaussian light yield distribution of the events in the e/γ -band. This assumption is confirmed by the data and is discussed in more detail in Appendix A.

α-Background
As discussed above, the observation of events in overlapfree regions of the α-band indicates an approximately flat energy spectrum of the α-background in our data. Nevertheless, to account for a possible small systematic energy dependence, we model the energy spectrum in each detector module d as introducing the fit parameters c d α and m α . c d α is modulespecific and describes the level of α-activity observed in the respective module, while we assume the possible slope m α to be common to all modules.
Obviously, the above fit parameters are only weakly constrained by the events observed in the acceptance region, where other sources contribute as well. Therefore, we additionally consider the α-reference region for each detector module as introduced in Sect. 4. It is this region that mainly influences the estimates of the α-parameters. Technically, we simply extend the inner product in Eq. (4) to also run over the events observed in the reference region and modify the integral in Eq. (3) accordingly. Since the contributions from other sources are small in the reference regions, this modification mainly leads to additional constraints on the α-parameters.
A second benefit of using these reference regions is that the low-energy quenching factor for α-particles can be left free in the fit. This is particularly important, since our observation of alphas in this data set is the best measurement available for this quantity. The quenching factor is automatically constrained by the events observed in the reference region and the uncertainties of this measurement are directly built into the likelihood. Of course, in order to define the reference region, a reasonable starting assumption for the quenching factor is required. We obtain it simply from the one-dimensional light yield distribution of the low-energy α-events. If this starting assumption is not too far from the true value, the likelihood fit can give a good estimate for the quenching factor, even though it is based on the events previously selected under the assumption of a slightly different starting value. We find that the result is indeed very robust against reasonable changes in the starting value.
We also confirmed that the result is very robust against some reasonable changes of the energy range of the reference regions. A selection of the same energy range for all detector modules of 35 to 135 keV, or selecting a 150 keV wide reference region with the lower energy limits of Table 2 did not change results in any relevant way.

Neutron background
For the neutron background, we consider the two types of neutron creation mechanisms discussed in the previous section. We have seen that both, a radioactive neutron source and muon-induced neutrons can be described by the same exponential energy spectrum of the corresponding recoil events. Our model is hence with E dec = 23.54 keV as given in the previous section. The amplitude A n is a function of the total expected number of neutrons, i.e. of the sum n rad + n muon of accepted single neutron scatterings due to radioactive sources and muoninduced neutrons, respectively. Both these numbers are unknown parameters. These parameters shall here be defined as the sum of expected events in all detector modules. Nevertheless, (10) describes the expected energy spectrum of neutron-induced events in each module. Although neutrons will scatter off all three types of target nuclei in CaWO 4 , Monte Carlo simulations show that more than 90 % of all scatterings detected in the energy range of interest happen off oxygen. In our background model, we hence treat neutron events as oxygen recoils and use the corresponding quenching factor to describe their light yield distribution. The uncertainty of the oxygen quenching factor as given in Sect. 2.3 is included in the likelihood, assuming a Gaussian error.
With also other sources of events contributing to our acceptance regions in the oxygen band, the unknown parameters n rad and n muon of the neutron background need to be constrained by an additional reference measurement. As outlined above, we exploit the ability to cause coincident events in more than one module to estimate the contribution from neutrons.
Each of the two neutron creation mechanisms considered here has a certain expected ratio of coincident events with multiplicity m > 1 to single scatterings: The multiplicity histograms of Fig. 8 represent our measurement of these ratios.
On the other hand, we have observed coincident events in our background data as outlined above and we will use them to derive constraints on the expected number of single neutron scatterings. To this end, we treat the occurrence of events with each multiplicity m as an independent Poisson process. This yields a likelihood factor of the form L m neutron (n rad , n muon ) = Pois N m obs | r m rad · n source + r m muon · n muon (12) for each multiplicity m, where Pois(x | y) denotes the Poisson probability to observe x events when y are expected, and N m obs is the number of observed events with multiplicity m. In our case, we have N 3 obs = 2, N 5 obs = 1, and N m obs = 0 for all other m > 1.
Each number N m obs can be partly due to neutrons from a radioactive source and to muon-induced neutrons. We emphasize that, since the multiplicity spectra for the two types of neutron mechanisms are significantly different, a fit can distinguish the two contributions based on the observed multiplicities. It will ultimately choose the distribution that fits best to the data.
For neutrons from radioactive sources, the measurement of the coincidence rate has sufficient statistics (Fig. 8) and we obtain the ratios r m rad directly by dividing the contents of the respective bins. On the other hand, for muon-induced neutrons the statistics of coincident events is low and statistical uncertainties of the histogram in Fig. 8 need to be taken into account. We accomplish this by fitting the observed multiplicity spectrum with a (purely heuristic) exponentially decaying function. This function is then evaluated to obtain the ratios r m muon . We directly add the likelihood of this fit to our total likelihood function and determine the parameters of the exponential simultaneously with all the other estimates. This way, their statistical uncertainty is automatically considered.
While an exponential fit to the multiplicity spectrum is clearly not the most general form, it is evident that the abundance of coincidences decreases towards higher multiplicities. Our fit models this behavior and provides a reasonable approximation to the observed spectrum, in particular at the low multiplicities which are most relevant here.

Lead recoil background
In analogy to the α-background, we study the lead recoil background in a region where the Pb-band is free of other sources of events. As discussed above, this indicates that the energy spectrum of this background has a decreasing tail towards lower energies which we model by an exponential starting at 90 keV (the upper energy bound of the lead reference region), on top of a constant contribution: In contrast to the simple fit of this function discussed in Sect. 4.4, we take into account the differences between the detector modules here: The free parameter A d Pb may be module-dependent and describes the different recoil background rates in the individual detector modules, while we use the same spectral decay length E Pb decay and constant background term C Pb for all modules. The latter quantities are characteristic of the implantation profile of α-emitters in the clamps and can thus be assumed to be universal if the underlying implantation mechanism is the same for all clamps.
The unknown parameters of this background model are constrained by including the reference region in the lead band as introduced in Sect. 4.4. The technical realization is identical to the one discussed for the α-background. Since the other known sources of events play only a negligible role in this region, this is purely a reference measurement for the parameters of the lead recoil background. Moreover, the inclusion of the reference region again allows to treat the quenching factor of lead as a free fit parameter, which is automatically constrained by the observed reference events. The corresponding uncertainty is thus directly built into the likelihood function.

WIMP signal
The density ρ χ (E, y | m χ , σ WN ) of a possible signal due to coherent WIMP-nucleon scatterings is the sum of three components, one for each possible recoiling nucleus in CaWO 4 . We calculate the expected recoil energy spectrum for each component as a function of the WIMP parameters, using the usual standard assumption of an isothermal WIMP halo of density 0.3 GeV/cm 3 , with a galactic escape velocity of v esc = 544 km/s and an asymptotic velocity of v ∞ = 220 km/s. Effects of the nuclear substructure are taken into account by the Helm form factor as given in [11], and the resulting energy spectra are finally convolved with the resolution of the phonon detectors to obtain the ultimately measured spectrum.
In the expected light yield distribution, we take into account the uncertainties of the quenching factors for all three nuclei, approximating the errors given in Sect. 2.3 by Gaussians. We note that the inclusion of these uncertainties has only a minor influence onto our results and even doubling the errors has no relevant effect. In the same way, varying the galactic escape velocity within its uncertainties is found to have negligibly small effects.

Results and discussion
In this section, we summarize the results obtained from the maximum likelihood fit as introduced above. Table 4 Results of the maximum likelihood fit. Shown are the expected total contributions from the backgrounds considered as well as from a possible WIMP signal, for the parameter values of the two likelihood maxima. The small statistical error given for the e/γbackground reflects the large number of observed events in the e/γband. The other errors correspond to a 1σ confidence interval as determined by MINOS (see Sect. 5

Resulting fit parameters
We find that the total likelihood function has two maxima in the parameter space, which we denote M1 and M2, respectively. M1 is the global maximum, but M2 is only slightly disfavored with respect to M1. We will hence discuss both solutions in the following. Table 4 shows the expected contributions of the backgrounds and of a possible WIMP signal in the two likelihood maxima. The background contributions are very similar for M1 and M2: The expected e/γ -background is one event per module according to the choice of the acceptance region, with a negligible statistical uncertainty due to the large number of events in the e/γ -band. The lead recoil and the α-background are similar to our simple estimates given in Sect. 4. Both these backgrounds are slightly larger than the contribution from neutron scatterings. In the context of the latter, the fit assigns roughly half of the coincident events to neutrons from a radioactive source and to muon-induced neutrons, respectively. This translates into about 10 % of the single neutron background being muon-induced.
In both likelihood maxima the largest contribution is assigned to a possible WIMP signal. The main difference between the two likelihood maxima concerns the best-fit WIMP mass and the corresponding cross section, with m χ = 25.3 GeV in case of M1 and m χ = 11.6 GeV for the case M2. The possibility of two different solutions for the WIMP mass can be understood as a consequence of the different nuclei present in our target material. The given shape of the observed energy spectrum can be explained by two sets of WIMP parameters: in the case of M1, the WIMPs are heavy enough to detectably scatter off tungsten nuclei (cp. Fig. 1), about 69 % of the recoils are on tungsten, ∼25 % on calcium and ∼7 % on oxygen, while in M2, oxygen (52 %) and calcium recoils (48 %) constitute the observed signal and lead to a similar spectral distribution in terms of the recoil energy. The two possibilities can, in principle, be discriminated by the light yield distribution of the signal events. However, at the low recoil energies in question, there is considerable overlap between the oxygen, calcium, and tungsten bands, so that we can currently not completely resolve the ambiguity. This may, however, change in a future run of the experiment. Figure 12 illustrates the fit result, showing an energy spectrum of all accepted events together with the expected contributions of backgrounds and WIMP signal. The solid lines correspond to the likelihood maximum M1, while the dashed lines belong to M2. The complicated shape of the expectations is the result of taking into account the energydependent detector acceptances. In particular, the different energy thresholds of the individual detector modules lead to a steep increase of the expectations when an additional module sets in.
We note that neither the expected α-or lead recoil backgrounds nor a possible neutron background resemble a WIMP signal in terms of the shape of their energy spectrum. Even if our analysis severely underestimated one of these backgrounds, this could therefore hardly be the explanation of the observed event excess.
On the other hand, the leakage of e/γ -events rises steeply towards low energies and one may be tempted to consider a strongly underestimated e/γ -background as the source of the observation. However, in addition to the energy spectrum, also the distribution in the light yield parameter needs to be taken into account. Figure 13 shows the corresponding light yield spectrum of the accepted events, together with the expectations from all considered sources. Again, the shape of the expectations is the result of the individual detector acceptances being considered. As expected, the contributions from the e/γ -and also from the α-background quickly decrease towards lower light yields and thus differ significantly from the expected distribution of a WIMP signal.
In order to check the quality of the likelihood fit, we calculate a p-value according to the procedure summarized in Sect. 5.1. We divide the energy-light yield plane into bins of 1 keV and 0.02, respectively, and include the acceptance region of each module as well as the alpha-and Pb recoil reference regions in the calculation. The two likelihood maxima are found to give very similar results, with p-values of about 0.36 and 0.35, respectively. This not very small value for p indicates an acceptable description by our background-andsignal model.

Significance of a signal
As described in Sect. 5.1, the likelihood function can be used to infer whether our observation can be statistically explained by the assumed backgrounds alone. To this end, we employ the likelihood ratio test. The result of this test naturally depends on the best fit point in parameter space, and we thus perform the test for both likelihood maxima discussed above. The resulting statistical significances, at which we can reject the background-only hypothesis, are for M1: 4.7σ for M2: 4.2σ.
In the light of this result it seems unlikely that the backgrounds which have been considered can explain the data, and an additional source of events is indicated. Dark Matter particles, in the form of coherently scattering WIMPs, would be a source with suitable properties. We note, however, that the background contributions are still relatively large. A reduction of the overall background level will reduce remaining uncertainties in modeling these backgrounds and is planned for the next run of CRESST (see Sect. 7).

WIMP parameter space
In spite of this uncertainty, it is interesting to study the WIMP parameter space which would be compatible with our observations. Figure 14 shows the location of the two likelihood maxima in the (m χ , σ WN )-plane, together with the 1σ and 2σ confidence regions derived as described in Sect. 5.1. The contours have been calculated with respect to the global likelihood maximum M1.
The new result is found to be consistent with the data of an earlier run of CRESST [1] published in 2009. Figure 14 reproduces the corresponding exclusion limit from [1] (the dashed red curve). It was calculated from the tungsten recoil bands of two detector modules, taking into account three events observed in an exposure of 47.9 kg d. As a good (and conservative) approximation for high WIMP masses, only tungsten nuclei were considered as a possible target for WIMP scatterings at that time. However, since additional oxygen and calcium recoils will increase the sensitivity for light WIMPs, we also have reanalysed the 2009 data under this aspect. In contrast to a similar analysis done earlier [19], we thereby do not restrict the acceptance region to the tungsten band (thus accepting only a fraction of the oxygen and calcium recoils), but consistently consider all three nuclear recoil bands, in the same energy range as used in [1] and [19] (10-40 keV). This extension leads to five additional accepted events. The resulting exclusion limit (calculated with the usual maximum gap method [20]) is shown in Fig. 14 and found to lie completely above the 2σ contour obtained in the present work.
Independently, we note that the parameters compatible with our present observation are in considerable tension with the exclusion limits of other experiments. Moreover, the parameter regions compatible with the observation of DAMA/LIBRA (regions taken from [18]) and CoGeNT [17] are located somewhat outside the CRESST region.

Future developments
Several detector improvements aimed at a reduction of the overall background level are currently being implemented. The most important one addresses the reduction of the alpha and lead recoil backgrounds. The bronze clamps holding the target crystals were identified as the source of these two types of backgrounds. They will be replaced by clamps with a substantially lower level of contamination. A significant reduction of this background would evidently reduce Fig. 14 The WIMP parameter space compatible with the CRESST results discussed here, using the background model described in the text. The contours have been calculated with respect to the global likelihood maximum M1. Additionally shown are exclusion limits from CDMS-II [12,13], XENON100 [14], the low-threshold analysis of XENON10 [15], and EDELWEISS-II [16], as well as the 90 % con-fidence regions favored by CoGeNT [17] and DAMA/LIBRA [18] (without and with ion channeling). For comparison, we also show the CRESST limit obtained in an earlier run in 2009 [1] and the result of a reanalysis of the 2009 data, taking into account all three nuclei in our target material (see text) the overall uncertainties of our background models and allow for a much more reliable identification of the properties of a possible signal.
Another modification addresses the neutron background. An additional layer of polyethylene shielding (PE), installed inside the vacuum can of the cryostat, will complement the present neutron PE shielding which is located outside the lead and copper shieldings.
The last background discussed in this work is the leakage from the e/γ -band. Most of these background events are due to internal contaminations of the target crystals so that the search for alternative, cleaner materials and/or production procedures is of high importance. The material ZnWO 4 , already tested in this run, is a promising candidate in this respect.

Summary
With an exposure of 730 kg d, the CRESST Dark Matter search has observed in the latest run a total of 67 events in an acceptance region of low energy nuclear recoils. Possible background contributions to this number include leakage of e/γ -events, events from neutrons, from alpha-particles, and from recoiling nuclei in α-decays. We have estimated these four backgrounds and have found using a likelihood ratio test that, at a significance larger than 4σ , these backgrounds are not sufficient to explain all the observed events. Scatterings of WIMPs may be the origin of this effect and, under this assumption, we have derived the corresponding WIMP parameters. Finally, we have presented the plans for the next run of the experiment, in which we aim at a further clarification of this hypothesis. physics data set discussed in this work, in which the collected number of e/γ -events by far exceeds any of the performed calibration runs. Figure 15 (top) shows the energy spectrum of one exemplary detector module (Ch45). The spectrum is dominated by a feature around 50 keV which is due to a contamination of the crystal with 210 Pb. This isotope β-decays to 210 Bi, mostly to an excited state which quickly decays to the ground state under the emission of a 46.5 keV γ . Since we observe the sum of the γ -and the β-energy in the feature with the sharp left shoulder at 46.5 keV, we conclude that the 210 Pb contamination is located in the volume of the crystal. The spectral density of events in this feature is higher than in the energy range of the acceptance region by a factor of 10 to 20. Most of the background in the e/γ -band below 40 keV is due to 227 Ac and 90 Sr impurities in the volume of the crystal. Figure 15 (bottom) shows a light yield histogram of the events observed inside the above feature between 47 keV and 53 keV, together with a Gaussian fit to this distribution. Given the rather wide energy window, the Gaussian provides a very reasonable description of the data. We can also confirm the absence of a noticeable tail of the distribution towards lower light yields. There is only one event at a yield of 0.38, somewhat outside the distribution, which is very likely an alpha event from the relatively wide alpha band of this detector. The number of observed outliers is thus below one in 10 5 e/γ -events.

Appendix B: Relaxation events
In early stages of the CRESST experiment with sapphire crystals, an unexpected high rate up to ∼1 s −1 of signal pulses appeared, which were not caused by particle interactions [21]. Instead, they were originating from microscopic fractures in the sapphire crystal caused by the very tight clamping of the detectors. Such fracture processes create high-frequency phonon pulses, similar to particle events, and thus result in identical shapes of the recorded temperature pulses. At that time, the crystals were held by tiny sapphire balls. The excessive pressure due to the extremely small contact area between balls and crystal, together with the high forces used to press these balls against the crystal, was responsible for the formation of microscopic cracks in the sapphire crystals. After dismounting, these cracks were clearly visible under suitable illumination with an optical microscope. The energy spectrum of these events extended up to high energies, well above 100 keV, with a characteristic spectrum following a power law dN/dE ∝ E −β [21]. These signals had a clear tendency to cluster in time: When one signal was observed, the chance that other ones occurred was strongly enhanced. Due to this experience we increased the contact area between crystal and support and thus reduced forces to a minimum. Thereafter no such crack-type events with the characteristic energy spectrum and clustering in time have been observed.
In a previous run of CRESST-II with CaWO 4 crystals, we had covered the bronze clamps holding the crystals with scintillating plastic to allow efficient vetoing of the 206 Pb recoil background. Unfortunately, relaxation processes in the plastic at the contact area with the crystal were found to cause slow signals in the phonon channel without any associated light. The time constants of these signals varied from pulse to pulse, presumably depending on how far the energy had to be transported in the plastic before reaching the crystal. The quasi exponential energy spectrum of these events also extended well above 100 keV and the rate was found to decrease with time and disappeared after about half a year after the cool-down. A reference detector with uncovered bronze clamps, operated in the same run, did not show these Table 5 The WIMP parameters resulting from the likelihood fit when the overlap with the no-light band is removed from the acceptance region. The corresponding likelihood function exhibits two maxima, M1 and M2, similarly to the analysis of the full data set relaxation type events. Due to the varying time constants of these relaxation pulses it was difficult to exclude with high confidence a rare misidentification of such completely dark signals at the lowest energies. For this reason we used bare metallic clamps and avoided any contact of the crystals with plastic in the present run. In other cryogenic experiments detectors are typically in contact with plastic (the crystal has to be insulated to allow charge collection in Ge for example). This is very likely the reason for the wide-spread opinion in the field that relaxation type of events are unavoidable in cryogenic detectors. 206 Pb recoil events have very little light and a good fraction of them well below 100 keV is practically dark in the present data. The low energy tail of the 103 keV 206 Pb peak in the present data decreases rather than rises towards low energies which would be characteristic for relaxation type of events. We also do not observe the typical clustering in time of low-light yield events characteristic for relaxationtype events.
Despite of the arguments given above we can nevertheless analyze the data by removing the overlap with the no-light band from the acceptance region. The shaded histogram in Fig.16 illustrates which of the accepted events are found in the band where no-light events are expected. Removing the no-light band from the acceptance region eliminates most of the accepted 206 Pb background events but also a considerable fraction of the tungsten as well as some of the oxygen events and thus substantially reduces the number of accepted events from 67 to 37.
After removal of the no-light band from the acceptance region, still the same two maxima M1 and M2 of the likelihood function are found with similar parameters for WIMP mass and scattering cross section, but of course with considerably reduced significance of 2.3σ for both maxima, see Table 5.

Appendix C: Data plots
In Fig. 7 and Fig. 9 we have shown the data of two exemplary detector modules in the light yield-energy plane. For reference, Fig. 17 gives the same type of plot for the remaining six detector modules considered in our analysis. The data of Ch45 and Ch47 contain a prominent population of events around 100 keV between the e/γ -and the α-band. These events are due to surface α-decays from 210 Po, where the escaping α-particle produced additional light when hitting the scintillating housing of the module (see the discussion in Sect. 2.4). This class of events is normally present in all detector modules, but in most modules it can be removed by a dedicated cut which exploits the different pulse shape of the respective light signals. Due to the poorer quality of these signals, this cut is not fully efficient for Ch45 and Ch47 and was thus not performed for these two modules. In all other cases, the cut was checked not to remove any of the events relevant for the analysis.
Variations in the exact energy dependence of the e/γ light yield lead to slight differences in light yield scales of different detectors, if the light yield of e/γ -events is always set to one for the 122 keV line of the 57 Co calibration source. That explains why the quenched bands in the figures are centered at slightly different light yields even if the quenching factors are universal for all detectors. These variations affect only the presentation of the data and have no influence on the results of the analysis.
There is a population of events in the oxygen band above 40 keV in the detectors, which is difficult to explain by the neutron background discussed so far. However, the actual neutron spectrum may be harder than assumed in our analysis and oxygen recoils may extend to higher energies and explain these events. This idea is supported by the observation of a relatively large number of 5 coincident oxygen recoils in the data in the energy range from 40 keV to 300 keV, compared to the 3 coincident recoil events in the original acceptance region. Such a hard neutron background might for example be created in the clamps of the detectors or in the relatively large amounts of plastic material in connectors Fig. 17 The data of the six remaining detector modules considered in the analysis. The highlighted bands are (from top to bottom) for α-particles, oxygen, and tungsten recoils. Also highlighted are the acceptance regions and the events observed therein and cabling close to the detectors. The 210 Pb in the clamps was very likely introduced by graphite which served to cover the melt and prevent oxidation while pulling a wire, which was the starting material for the production of the clamps. Under the microscope tiny carbon enclosures can be seen in the material after etching the surface. If most of the 210 Pb-activity is still in the carbon, some (∼20) neutrons may have been produced by α-n reactions on 13 C. We have performed a Monte Carlo simulation of the recoil energy spectrum in CaWO 4 produced by neutrons from α-n reactions on 13 C.
To study the influence of events at energies above 40 keV on the results, we increase the upper energy limit of the ac-ceptance region to 68 keV. This approximately doubles the energy range of the original acceptance region. In this extended energy range the simulated energy spectrum of oxygen recoils can be approximated by a single exponential with a decay energy of E neut = 56.8 keV.
In a first step we repeat the fit in this wider acceptance region with the original shape of the neutron background. The number of expected neutron singles obtained from the best fit almost triples (27.4 ± 7) in this acceptance region of twice the original width, and the significance of the WIMP signal slightly reduces to 3.6σ , other parameters are very similar (M χ = 11.7 GeV, σ WN = 4.6 · 10 −5 pb).
If we allow now for the harder spectral shape of the neutron background, the expected neutron singles still come out high (26.6 ± 6.7), but the visual fit quality of the energy spectrum improves and the significance of the signal increases to 4.3σ , slightly above the original value. WIMP mass and scattering cross section (M χ = (11.2 ± 2.1) GeV, σ WN = 4.3 · 10 −5 pb) come out very close to the original values and the energy spectrum and yield distribution reveal a fit quality similar to the original one. The high mass solution also yields fit results, apart from expected neutron singles, close to the original ones (significance = 4.6σ , expected neutron singles = 25.1 ± 6.8, M χ = (25.7 ± 5.0) GeV, σ WN = 1.4 · 10 −6 pb).