Identifying Heated Rocks Through Feldspar Luminescence Analysis (pIRIR290) and a Critical Evaluation of Macroscopic Assessment

Throughout (pre)history, non-flint rocks have been used to structure fireplaces, to retain heat, to boil liquids, and to cook food. Thus far, the identification of heated non-flint rocks in archaeological contexts largely depends on a visual (macroscopic) assessment using criteria thought to be diagnostic for thermal alteration. However, visual identification can be subject to observer bias, and some heat-induced traces can be quite difficult to distinguish from other types of weathering or discolouration. In this paper, we present feldspar luminescence analysis as an independent, objective way to identify heated non-flint rocks and to evaluate the results against the established visual macroscopic method for the identification of such pieces. This is done by submitting manuported rocks with and without inferred macroscopic characteristics of heating, originating from the Last Interglacial, Middle Palaeolithic site Neumark-Nord 2/2 (Germany), to feldspar luminescence analysis (pIRIR290). Results of the feldspar luminescence analysis are compared with the visual assessments. This proof of concept study demonstrates the potential of luminescence analyses as an independent, quantitative method for the identification of heated rocks—and their prehistoric applications like hot-stone cooking, specifically for cases where macroscopic assessment cannot provide reliable determinations.


Introduction
The timing and nature of the controlled use of fire and its functions are highly contested issues in the field of human origins, with important implications for the understanding of various aspects of hominin behaviour and cognition, including the use of fire for cooking and for thermoregulation during colonisation of low-temperature environments (Alperson-Afil & Goren-Inbar, 2006; Macdonald, 2018; Wadley, 2013; Wrangham, 2009). Age estimates for controlled use of fire vary from an early Pleistocene origin (c. 2 million years ago; Wrangham, 2009) to 300-400,000 years ago (Roebroeks & Villa, 2011; Shimelmitz et al., 2014; Sorensen, 2017). Some workers suggest that even the latest Neandertals (60-40,000 years ago) did not know how to produce fire and only used it opportunistically, when natural fires made it available (e.g. through lightning strikes, spontaneous combustion of organics, or volcanic action) (Dibble et al., 2018; Sandgathe et al., 2011; contra Sorensen, 2017; Zilhao & Angelucci, 2018).
Hominin fire use is identified in the archaeological record based on the presence and nature (i.e. the way materials are affected by heat) of fire remains, such as charcoal, heated bone, ash, heated sediments, and heated lithics. These materials may have become heated accidentally, e.g. by being situated on a surface on which a fire was built (see Sorensen & Scherjon, 2018 for a model of such processes), or may have been intentionally heated, e.g. bone through use as fuel or through cooking (e.g. Costamagno et al., 2005; Théry-Parisot, 2002) and lithics through heat treatment (e.g. Brown et al., 2009; Schmidt et al., 2012, 2013). From the Upper Palaeolithic onwards, non-flint stones with signs of heating, so-called fire-cracked rocks, show up in the record, sometimes in relatively high quantities. They are usually interpreted as cooking or boiling stones (Bicho et al., 2003; Gao et al., 2014; Manne et al., 2006; Nakazawa et al., 2009; Thoms, 2009), although alternative uses as e.g. heat retainers (e.g. Black & Thoms, 2014; [...] Holdaway 1987), this method is unsuitable for most other rock types, mainly due to the absence of suitable luminescence-sensitive quartz minerals. This is also the case for optically stimulated luminescence (OSL) on quartz, especially in plutonic and metamorphic bedrock samples (Guralnik et al., 2015). Apart from issues with the availability of luminescence-sensitive quartz, conventional methods targeting quartz also have age constraints, limiting their application to the Late Pleistocene (Timar et al., 2010).
As an alternative, the analysis of infrared-stimulated luminescence signals from feldspar is a promising avenue for three main reasons: (1) most plutonic and metamorphic rock samples provide suitable feldspar luminescence (Guralnik et al., 2015; Sohbati et al., 2013), (2) the age range is significantly extended to MIS 5 and beyond (Sohbati et al., 2012a), and (3) recent advances have resolved methodological drawbacks related to feldspar-specific signal instability through the use of a post-infrared infrared (pIRIR) luminescence signal (Thomsen et al., 2008). One earlier study (Rapp et al., 1999) used the conventional IR50 signal to assess heating of granite, but this signal is known to suffer from instability (fading).
Here, we present and evaluate the potential of the pIRIR feldspar luminescence method, and in particular the use of the pIRIR signal measured at 290 °C (termed pIRIR290), as a way to establish past heating of rocks by applying it to granite, diorite, porphyry, quartzite, and vein quartz (from here on referred to as 'quartz') material from an archaeological context. Although quartzite and quartz rocks predominantly consist of quartz minerals, which are insensitive to post-IR stimulation, those rocks may contain small traces of feldspar, so it is worth checking their suitability for pIRIR290 luminescence analyses. The pIRIR290 method in general has proven to be successful in dating pre-MIS 5 sedimentary features (e.g. Buylaert et al., 2012; Kars et al., 2012) and was also successfully tailored to dating rock surfaces (Sohbati, 2013). The archaeological stones featured in this study were excavated from fine-grained silty deposits of the Last Interglacial, Middle Palaeolithic basin site Neumark-Nord 2/2 (Saxony-Anhalt, Germany) (Gaudzinski-Windheuser &  and interpreted as manuports (Pop et al., 2018). Of these, particularly the granites showed clear visual characteristics that were potentially related to heating and provided the impetus for further systematic research of all stones excavated from this site. The aims of this study were twofold: (1) to unambiguously distinguish heated from unheated stones and (2) to test visual assessment as a method to identify heated rock, by comparing commonly used macroscopic traits of thermal alteration with our feldspar luminescence results. The reliability, applicability (mineral content, age limits), and practicality of these methods are discussed here and suggestions are made for further quantitative research avenues.

Materials
The archaeological material used in this study was excavated from the basin site Neumark-Nord 2/2, which can be attributed to the Last Interglacial and was dated with TL to 126 ± 6 ka (Richter & Krbetschek, 2014; Sier et al., 2011). This site, situated 170 km southwest of Berlin (Germany), is located in a former lignite quarry that was in use until the early 1990s. These large-scale quarrying activities exposed a small basin structure within the underlying (Saalian) glacial till deposits filled with fine-grained silt loams, as well as several archaeological layers with lithic artefacts and faunal remains. Excavations were carried out between 2004 and 2008 by the Landesamt für Denkmalpflege und Archäologie Sachsen-Anhalt (Germany), the Römisch-Germanisches Zentralmuseum (Germany), and Leiden University (the Netherlands). The most find-rich horizon, NN2/2, yielded ca. 120,000 faunal remains and 18,689 lithics (Pop, 2014), including 504 gravel to cobble-sized stones that were sourced by hominins from local exposures of Saalian glacial till deposits. Apart from being transported, 63 of the gravel to cobble-sized stones show traces of hominin involvement in the form of use as percussive tools (Pop et al., 2018). Other pieces show modifications that are frequently connected to thermal alteration (e.g. fragmentation, discolouration, and cracking) but can alternatively be the result of post-depositional surface modifications (PDSM). The co-occurrence of these finds with unambiguous evidence for the use of fire at the site (Pop et al., 2016) raises the question whether the modifications on the stones are indeed the result of thermal alteration and, if this is the case, what the function of these stones could have been.
To address these site-related issues and the methodological questions regarding the identification of heated stones in the Palaeolithic record, all stones have been inspected for macroscopic signs of thermal alterations using the criteria outlined below ("Methods: Macroscopic Assessment of Thermal Alteration" section), while a subsample has subsequently been subjected to feldspar luminescence analysis ("Methods: Feldspar Luminescence Analysis" section).
In addition, five granite control samples (four experimentally heated and one unheated) were added to the feldspar luminescence analysis to test the general applicability of the approach. These control samples derive from a granite rock sample that was taken from quarry 'Domelaar' (Markelo, NL), which exploits Lower to Middle Pleistocene ice-pushed sands that also contain Saalian glacial erratics. Four fragments of this granite sample (cut with a water-cooled diamond saw) were heated in an oven for 2 h to 400, 600, 800, and 1000 °C, respectively (see Online Resource 2: Figure A for appearance of samples heated to 400 and 600 °C). One fragment remained unheated. Of the four heated fragments, the two pieces heated to 800 and 1000 °C were heavily fragmented during heating and could therefore not be analysed further.

Methods: Macroscopic Assessment of Thermal Alteration
The way stones are affected by heat is governed by their thermal conductivity (related to mineral content, grain size, porosity, pore fluid, and anisotropy), heating rate, temperature, exposure time, and mode of cooling (Backhouse & Johnson, 2007;Deal, 2012;Homand-Etienne & Troalen, 1984;Schön, 2011). This means that different rock types exposed to the same heating and cooling conditions may display different characteristics and that different rock types displaying the same characteristics may not have been exposed to the same conditions. Furthermore, experimental material from the Laboratory for Artefact Studies (Faculty of Archaeology, Leiden University) studied by the authors shows that stones from the same rock type do not always react in the same way when exposed to the same conditions. A review of the literature (Online Resource 1: Table A) shows that there is no uniform nomenclature to describe heated stones (although attempts are made to change this; e.g. Neubauer, 2018) and that descriptions of macroscopic traces of heating are usually very general and often lack images to support descriptions. In addition, it is not always clear whether descriptions are based on experiments or assumptions about archaeological material based on circumstantial evidence. Furthermore, experiments usually focus on one or two specific rock types (relevant to a specific case study) and commonly do not account for all relevant variables, making comparisons between studies difficult. Nevertheless, the consulted literature and the experimental reference collection of the Laboratory for Artefact Studies (Leiden University, the Netherlands) (Online Resource 1: Table B) allow for the identification of a set of common traits: discolouration (usually reddening), cracking, and fragmentation (curvilinear or angular), but taking into consideration their particular appearance on specific raw material types.
For the description of the NN2/2 stones, the study of Pop et al. (2018) has been used as a basis, providing data on archaeological provenance, rock type, size, and weight. To this, macroscopic features potentially related to thermal alteration have been added, as outlined in Table 1. Based on the recorded variables, the likelihood of potential heating was assessed (Table 1). Based on size, rock type, and the results of the macroscopic thermal alteration assessment (see "Macroscopic Assessment of Thermal Alteration" section), a selection for feldspar luminescence analysis was made (see Table 2).

Methods: Feldspar Luminescence Analysis
The luminescence property of minerals, in our case feldspars, is associated with trapped charge that accumulates in defects in the crystal lattice. The accumulation of this trapped charge is caused by a constant flux of ionising background radiation in the sample or sample surroundings, i.e. it is a time-dependent process. The trapped charge can be depleted either by exposure to heat (thermal resetting) or by daylight exposure (optical resetting or bleaching). After zeroing by either heat or daylight, the luminescence clock in the minerals is reset and, if thereafter shielded from both light and heat, the trapped charge starts to accumulate again. In the luminescence laboratory, we can measure the luminescence intensity that is proportional to the accumulated trapped charge. The natural luminescence intensity approaches a saturation limit after which the rate of net accumulation is zero (see also Aitken, 1998; Preusser et al., 2008 for the general principles of luminescence analysis). This saturation limit is sample and mineral dependent. For feldspars, saturation is typically reached 300 to 500 ka after the last thermal or optical resetting event. In this study, we aim to use the luminescence intensity of feldspar minerals from archaeological rock samples to decide whether the rock sample was considerably heated during the last 300 to 500 ka (natural luminescence intensity below the saturation limit) or not (natural luminescence intensity at or above the saturation limit).
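The approach to saturation described above can be illustrated with a minimal sketch. It assumes that signal growth follows a single saturating exponential, which is the functional form used for the laboratory dose response curves in this paper; applying it to natural growth is a simplification. The characteristic dose and the dose rate below are hypothetical values chosen for illustration only and are not measurements from this study.

```python
import math

def signal_fraction(dose_gy, d0_gy):
    """Fraction of the saturation level reached after absorbing dose_gy,
    assuming single saturating exponential growth."""
    return 1.0 - math.exp(-dose_gy / d0_gy)

# Hypothetical values for illustration only (assumptions, not study data).
D0_GY = 350.0               # characteristic dose of the growth curve
DOSE_RATE_GY_PER_KA = 2.3   # environmental dose rate

for age_ka in (0, 50, 126, 300, 500, 1000):
    dose = age_ka * DOSE_RATE_GY_PER_KA
    fraction = signal_fraction(dose, D0_GY)
    print(f"{age_ka:5d} ka  dose = {dose:7.1f} Gy  fraction of saturation = {fraction:.3f}")
```

Under these assumed values, the signal grows quickly at first and then flattens, so that resetting events older than a few hundred thousand years become hard to tell apart from a signal that was never reset.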
Sample preparation for the luminescence analysis of the rock samples took place under subdued orange light conditions at the Netherlands Centre for Luminescence dating (NCL) in Wageningen and the Nordic Laboratory for Luminescence dating (NLL) in Risø (Denmark). The stones were mounted in a vice and cores were made using a water-cooled drill mounted on a drill press (cf. Sohbati et al., 2011). Initially, the aim was to establish luminescence depth profiles, by cutting the cores into 1 mm thick slices (using a water-cooled diamond wafer blade). However, this proved problematic as heavily weathered stones (due to either heating or other processes) fragmented during coring. The same was the case for experimental pieces heated to 600 °C and higher (see Online Resource 2: Figure A). Therefore, for most cores, only one subsample from the innermost part of the stone was isolated for measurement (Fig. 1a), to only identify sustained heating at a threshold temperature that can thermally reset the luminescence signals of the inner-rock material (as opposed to superficial heating or optical resetting). Prior to luminescence measurement, subsamples were gently crushed and loaded into aluminium cups. No further chemical or mechanical treatment was carried out.
The luminescence measurements on the rock slices were carried out using Risø TL/OSL readers (DA-OSL-15 and -20) equipped with infrared stimulation (870 ± 40 nm stimulation wavelength) and photon detection through LOT D410 interference filters centred around the ~ 410 nm (i.e. blue) emission. The palaeodose received by the rock samples since the last thermal or optical resetting was obtained by an elevated temperature pIRIR290 single aliquot measurement protocol, applying a stimulation temperature of 290 °C (pIRIR290, Buylaert et al., 2012). This pIRIR290 luminescence signal detected in the blue emission dominantly arises from the K-rich orthoclase feldspars in the rock samples (Baril & Huntley, 2003; Prescott & Fox, 1993) and is regarded as not being affected by anomalous fading (e.g. Buylaert et al., 2012; Kars et al. 2012). Anomalous fading describes an a-thermal loss of luminescence signal through time and is a typical property of conventional low-temperature infrared-stimulated luminescence (IRSL) signals from feldspars (Spooner, 1994). However, luminescence signal stability is an important prerequisite for using luminescence signals as a proxy for past heating, which is presumably the reason why previous applications of conventional and thus unstable IRSL to study the heating history of rocks (Rapp et al., 1999) were never fully embraced by either the luminescence or the archaeology community. The signal stability (or absence of fading) of the elevated temperature pIRIR290 feldspar luminescence signal was confirmed by determining field saturation for unheated samples (Fig. 1b; for detailed explanation, see Kars et al., 2012).

Fig. 1 a Luminescence history of stones that were heated in the past and the application of feldspar luminescence analysis to verify this in a quantitative way. b The analytical luminescence procedure: the graph on the left shows a typical natural pIRIR290 luminescence signal (LN1) from a single aliquot of granitic material. Subsequent to the read-out of the natural signal, a dose response curve is produced for each individual aliquot by stepwise irradiating the aliquot with larger laboratory doses (i.e. regenerating a dose) and measuring the corresponding growth of the pIRIR290 luminescence dose response (Li). Every measured signal is normalised and corrected for sensitivity changes by monitoring the response of each natural and regenerative signal to a constant laboratory test dose and by dividing LN or Li by its test dose response (LX/TX). This test dose corrected dose response of the natural or regenerated signals is plotted as a function of the applied laboratory dose (right hand graph) and fitted by a single saturating exponential function (dose response curve). The onset of luminescence signal saturation is reached when the natural signal attains 85% of full signal saturation (= no luminescence signal growth with increasing laboratory dose). This onset of saturation is referred to as the 2D0 threshold. For details regarding this single aliquot regenerative dose protocol, the reader is referred to the review provided in Wintle and Murray (2006). In this paper, we use the 2D0 threshold to decide whether the tested material was heated during the recent past (last ~ 300 ka) or not. If the test dose corrected natural signal plots below the 2D0 threshold, the analysed material is characterised as heated (e.g. LN1/TN1, right graph). If the test dose corrected natural signal plots above the 2D0 threshold, the analysed material is characterised as unheated (e.g. LN2/TN2, right graph). This analysis is repeated on at least six aliquots per sample
Also, note that considerable thermal treatment likely resets the pIRIR290 signals from feldspars taken from the innermost part of the rock samples, while optical resetting only affects the outer part of the rock (Sohbati et al., 2012b). Hence, to establish the heating history, we need to focus on the rock slices taken from the inner part of the stone (Fig. 1a).
To establish whether the inner part of a rock was heated at some point in the past, we compare the sensitivity corrected natural pIRIR290 luminescence (LN/TN pIRIR290) to a sensitivity corrected pIRIR290 dose response curve. We used a single saturating exponential fit of the dose response curve to calculate D0. If the natural pIRIR290 signal clearly plots below luminescence signal saturation, defined by the 2D0 criterion equivalent to ~ 85% of full saturation (Wintle & Murray, 2006) of the corresponding dose response curve, we regard this sample as having been heated in the past. If the natural pIRIR290 signal clearly plots above this threshold, it is regarded as unheated. The procedure is illustrated in Fig. 1. This saturation threshold typically corresponds to a dose of ~ 700 Gy, implying that heating events older than ~ 300 ka (depending on the dose rate) cannot be reliably detected by this approach. Note that the main goal of this study is to establish whether stones were heated and not the exact timing of the heating events. The latter would require complex dose rate consideration, which is beyond the scope of this paper.
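The decision rule described above can be sketched in a few lines. The example below is an illustration under stated assumptions and not the analysis software used in this study: the function names are hypothetical, the dose points are synthetic and noise-free, and the fit is a simple one-dimensional scan over D0 rather than a full non-linear regression. It also ignores measurement uncertainties, whereas the text requires the natural signal to plot clearly below or above the threshold.

```python
import math

def fit_saturating_exponential(doses, signals, d0_min=10.0, d0_max=2000.0, steps=1990):
    """Least-squares fit of L/T = a * (1 - exp(-D / D0)) by scanning D0.

    For each trial D0 the amplitude a has a closed-form least-squares solution.
    Doses must be positive. Returns (a, D0).
    """
    best = None
    for i in range(steps + 1):
        d0 = d0_min + (d0_max - d0_min) * i / steps
        g = [1.0 - math.exp(-d / d0) for d in doses]
        a = sum(y * gi for y, gi in zip(signals, g)) / sum(gi * gi for gi in g)
        sse = sum((y - a * gi) ** 2 for y, gi in zip(signals, g))
        if best is None or sse < best[0]:
            best = (sse, a, d0)
    return best[1], best[2]

def classify(natural_lt, a, d0):
    """Compare a sensitivity corrected natural signal with the 2*D0 threshold."""
    threshold = a * (1.0 - math.exp(-2.0))  # curve value at a dose of 2*D0
    label = "heated" if natural_lt < threshold else "unheated"
    return label, threshold

# Synthetic, noise-free dose response (assumed a = 10, D0 = 350 Gy, so 2*D0 = 700 Gy).
doses = [50.0, 100.0, 200.0, 400.0, 800.0, 1600.0]
signals = [10.0 * (1.0 - math.exp(-d / 350.0)) for d in doses]

a_fit, d0_fit = fit_saturating_exponential(doses, signals)
print(f"fitted a = {a_fit:.2f}, D0 = {d0_fit:.0f} Gy, 2*D0 = {2 * d0_fit:.0f} Gy")
for natural in (4.0, 9.5):  # two hypothetical natural L_N/T_N values
    label, threshold = classify(natural, a_fit, d0_fit)
    print(f"natural L/T = {natural:.1f}, threshold = {threshold:.2f} -> {label}")
```

For a single saturating exponential, the signal at a dose of 2D0 is 1 - exp(-2) of the maximum, i.e. about 86%, which corresponds to the ~ 85% quoted in the text. Dividing the ~ 700 Gy threshold by the ~ 300 ka limit implies a dose rate on the order of 2.3 Gy/ka; this figure is back-calculated here for illustration only.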

Macroscopic Assessment of Thermal Alteration
Of the 380 NN2/2 (archaeological) stones analysed, visual inspection suggested that 315 are most likely unheated (82.9%; see Online Resource 4 for complete dataset). The other 65 pieces (17.1%) show potential signs of thermal alteration, of which 40 (10.5%) are in the low confidence, 19 (5.0%) in the medium, and 6 (1.6%) in the high confidence category (Fig. 2). Of the potentially heated stones, most are in the granit(oid) category, followed by gneiss(oid), porphyry, quartz, and quartzite (Fig. 3). Although limestone can be used in anthropogenic fire practices (e.g. Ellwood et al., 2013), limestone (n = 22) was excluded from further analysis. These pieces are expected to contain little feldspar, and their small size and weathered state make sampling (drilling) for feldspar luminescence difficult.
Of the 43 potentially heated pieces excluding limestone, 32 are fragmented (74.4%), of which most (n = 30) are individual pieces (i.e. other fragments are missing), but in two cases, two or more fragments were found together. Notable is e.g. 'Granite 1', which was fragmented into 30 larger fragments (Fig. 4). Of the fragmented pieces, most show curvilinear breaks (n = 3), sharp edges (n = 7), or, most commonly, a combination of both (n = 10, e.g. Granite 2, Porphyry 1, Fig. 4). Probably related to fragmentation, but described separately, is cracking; of [...] Fig. 4), a single or few cracks (e.g. Porphyry 1, Fig. 4), and polygonal cracking (e.g. Granite A, Granite 1, Fig. 4, interpretation based on the breakage pattern of the many fragments). Cracking can also be observed in thin section for 'Granite B', but not for 'Granite A' (see Online Resource 2: Figure B). Discolouration has been observed on most of the 35 stones, primarily involving yellow to brown iron-like staining that is not necessarily related to thermal alteration. Exceptions are present, like the pink discolouration visible on 6 pieces (e.g. Quartzite 2, Fig. 4), which may be related to heating. A difference in colour can be observed for the two granites in thin section: Granite A shows lighter grey feldspar crystals than Granite B (see Online Resource 2: Figure B). Residues were present on 29 of the 43 stones, but primarily concern varying proportions of (calcareous) sediment and iron, most likely not related to thermal alteration. For each of the rock types (excluding limestone), two pieces of sufficient size for coring were selected for feldspar luminescence analysis: one piece of the highest possible heating category (macroscopic assessment) and one that was most likely unheated (category 0). Based on the high (preliminary) expectations for the granites, two additional pieces of this rock type were added, which are both in the possibly heated category. This makes for a total of 14 stones to be submitted to feldspar luminescence analysis (Table 2, Fig. 4).

Feldspar Luminescence Analysis
To test the general applicability of the approach, we prepared five experimental control samples derived from a granite rock sample (see "Materials" section). The two heated control samples that did not fragment heavily during heating and could therefore be analysed (400 and 600 °C) show luminescence resetting (luminescence signal clearly below saturation threshold) of the inner-rock material, whereas the inner material of the unheated control sample shows feldspar luminescence signals well above the saturation threshold (Fig. 5a, see Online Resource 5: Figure A for decay and dose response curves of unheated sample).
In Fig. 5b, we show the feldspar luminescence versus depth results for two exemplary archaeological samples. Sample 'Granite A' shows non-saturated pIRIR290 luminescence signals of the inner-rock material indicating thermal resetting of the rock (see Online Resource 5: Figure B for decay and dose response curves), whereas sample 'Granite B' shows saturated pIRIR290 luminescence signals of the inner material suggesting that the luminescence clock was not thermally reset. The results of the feldspar luminescence analysis of all archaeological rock samples are listed in Table 3. From eleven samples, it was possible to obtain a measurable pIRIR290 feldspar luminescence signal. Neither the Granite 1 nor the quartz 1 sample emitted sufficient feldspar luminescence, presumably because of the absence of luminescence-sensitive K-rich feldspar grains in the analysed rock matrix. For those two samples, it was not possible to evaluate the heating history using feldspar luminescence. Because of the absence of a feldspar luminescence signal for quartz 1, quartz 2 was not further analysed. Of the eleven luminescence-emitting samples, two (Granite A and Quartzite 2) show sensitivity corrected natural pIRIR290 signals clearly below the saturation threshold (see "Methods: Feldspar Luminescence Analysis" section) of the corresponding dose response curves. For those two samples, it is very likely that they were heated during the past ~ 300 ka (upper dating limit of feldspar pIRIR290 in this context). For the remaining nine samples, natural signals plot clearly above the saturation threshold, strongly suggesting that those stones were not heated, at least not for the last ~ 300 ka, which is far beyond the well-constrained age of the NN2/2 deposits (Sier et al., 2011 and references therein).

Fig. 4 Overview of the 14 stones sampled for feldspar luminescence analysis. Noted are the results of the macroscopic assessment, and the level of correspondence between the macroscopic assessment and the feldspar luminescence analysis (green checkmark = match; red cross = no match; orange dash = no signal/not analysed)

Match Between Macroscopic Assessment and Feldspar Luminescence Analysis
Of the eleven archaeological stones submitted to macroscopic assessment and yielding a measurable pIRIR290 signal, seven showed a match in observations (63.6%, Table 4). Of these, five are unheated according to feldspar luminescence analysis (Granite B, Diorite 2, Quartzite 1, Porphyry 2, Gneiss 2) and two are heated; the latter also had a medium to high confidence level in the macroscopic assessment (Quartzite 2 and Granite A, respectively). Of the pieces that do not match (n = 4), all are unheated according to feldspar luminescence analysis, but were marked as heated with low to medium confidence by macroscopic assessment. Pieces marked as unheated were never heated according to feldspar luminescence analysis.
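The counts reported above fully determine a two-by-two comparison table, from which the agreement rates follow directly. The short sketch below recomputes them; the variable names are chosen here for illustration, and the calculation treats the feldspar luminescence result as the reference against which the macroscopic assessment is scored.

```python
# Counts for the eleven stones with a measurable signal, as reported in the text.
both_heated = 2         # macroscopic: heated,   luminescence: heated
macro_only_heated = 4   # macroscopic: heated,   luminescence: unheated
lum_only_heated = 0     # macroscopic: unheated, luminescence: heated
both_unheated = 5       # macroscopic: unheated, luminescence: unheated

total = both_heated + macro_only_heated + lum_only_heated + both_unheated

# Overall agreement between the two methods.
agreement = (both_heated + both_unheated) / total
# Share of macroscopically "heated" stones confirmed by luminescence.
heated_confirmed = both_heated / (both_heated + macro_only_heated)
# Share of macroscopically "unheated" stones confirmed by luminescence.
unheated_confirmed = both_unheated / (both_unheated + lum_only_heated)

print(f"stones compared:       {total}")
print(f"overall agreement:     {agreement:.1%}")
print(f"'heated' confirmed:    {heated_confirmed:.1%}")
print(f"'unheated' confirmed:  {unheated_confirmed:.1%}")
```

With such small counts, these percentages are descriptive only and should not be read as general error rates of macroscopic assessment.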

Comparison Macroscopic Assessment and Feldspar Luminescence Analysis
All the stones without macroscopic signs of heating were shown to be unheated with feldspar luminescence analysis (5/5, 100%). Although a small sample, this suggests that macroscopic assessment does not misidentify heated pieces as unheated. However, some unheated pieces will likely be misinterpreted as heated by macroscopic assessment, e.g. due to a certain level of similarity between traces produced by heating and post-depositional surface modifications. The analysis shows that of the eight stones that were macroscopically assessed as heated, two yielded no signal or were not further tested due to their mineral composition, and only two are definitely heated. This low success rate (2/6, 33.3%) of macroscopic assessment is especially striking given the fact that for the potentially heated category, the more promising samples were subjected to feldspar luminescence analysis. This suggests that macroscopic assessment can overestimate the number of heated stones in an assemblage, possibly to the point where heated pieces are macroscopically identified in an assemblage where there are none. These results show the need for quantifiable, independent methods like feldspar luminescence analysis for the identification of heated stones in the archaeological record and warrant further development of such methods (see "Feldspar Luminescence Analysis" and "Other Quantitative Identification Methods" sections).

[...] Table 3) such that the corresponding dash-dotted lines plot on top of each other. For dose points that plot above the y-axis break, and thus well beyond the 2D0 threshold, it was not possible to determine an equivalent dose

Macroscopic Assessment
In light of the limited overlap (n = 2) in the results of the macroscopic assessment and the feldspar luminescence analysis for heated material, it is not possible to provide a set of macroscopic features that unambiguously characterise heated stones (not to mention for each individual raw material category): fragmentation and cracking are observed not only on both heated pieces, but also on pieces without confirmed thermal alteration. Furthermore, in the case of the heated granite, the cracking is polygonal and present throughout, while it is very limited in the heated quartzite. Curvilinear breaks with sharp edges are often assumed to be typical of fire-cracked rock but cannot be observed on the heated pieces identified in this study, yet they do appear on the unheated material. The heated granite seems to show yellow discolouration, but this can be identified as iron staining and is also visible on other, unheated pieces. The heated quartzite has a pink colour, but it is unclear whether this is discolouration through heating or the original colour of the piece. This study clearly illustrates the problems with macroscopic assessments of heated stones. One important problem is that there is a large degree of equifinality in the features caused by thermal alteration and those caused by weathering, e.g. through thermal stress (both cold and hot, but unrelated to fire), or pressure release, as well as chemical and biological weathering (Backhouse & Johnson, 2007; Deal, 2012). In particular, the till deposits from which the NN2/2 stones were sourced have undergone many temperature and pressure changes related to glacial processes, before the stones were transported to the site by Middle Palaeolithic hominins. Apart from glacial transport, weathering processes may have also worked on the material after deposition. Weathering before and after deposition may act as a catalyst in the expression and amount of thermal damage observed, e.g. a rock whose integrity is severely affected by weathering is significantly more prone to cracking and fragmentation than an unweathered one. A further complicating factor is the fact that rock types respond differently to both weathering and thermal stress, the latter being the product of both temperature and heating duration. A flaming fire generally reaches temperatures around 600-900 °C, while a smouldering fire generally has a temperature of around 500 °C (Bentsen, 2013; Rein, 2009). Any stones that are placed within these fires, and are therefore in direct contact with the heat source, are expected to reach similar temperatures. However, the heating rate and final temperature of the stones will vary based on exposure time, rock type, and the size of the stone. It should be noted that stones placed on the outside of a fire will reach far lower temperatures than the fire itself. In the future, it would be worthwhile to better study the exact temperatures at various depths in the rocks through controlled experiments with thermocouples. This would also establish the ideal depth for sampling and ease the interpretation of results obtained by independent, quantitative methods like the feldspar luminescence analysis employed here.
Accounting for all the factors involved in the expression of thermal alterations and weathering processes is very difficult. In the case of NN2/2, with its large variety in rock types and weathering largely related to the source of the material, macroscopic assessment remains highly problematic. Depending on the characteristics of an assemblage, quantitative methods are therefore either important to verify the macroscopic observations and address the inherent subjectivity of this type of analysis, or an absolute necessity to draw any meaningful conclusions about the heating of stones by fire at archaeological sites. The advantages and limitations of such methods, including the feldspar luminescence method that has been used in this study, will be discussed in the following two paragraphs.

Feldspar Luminescence Analysis
The feldspar luminescence analysis using the pIRIR290 signal, successfully applied here, has clear advantages in being independent and less subjective, but it also has limitations. First of all, the method is poorly suited, or not suited at all, to rock types with low feldspar content, K-rich feldspars in particular (e.g. quartz and basalt). Even though quartz can contain small traces of feldspar suitable for analysis, the quartz samples from NN2/2 did not yield a sufficient feldspar luminescence signal. On the other hand, the (K-rich) feldspar content of both quartzite samples was high enough to obtain a sufficient pIRIR290 luminescence signal. From luminescence sediment dating, it is well known that quartz-rich sandy deposits that have not been pre-processed (i.e. density separated and HF etched) typically show strong feldspar luminescence signals, likely originating from very small amounts of K-rich feldspars in the sediment (or rock) matrix. To our surprise, one granite sample ('Granite 1') yielded no feldspar luminescence signal. This may be because the analysed sample of this heterogeneous rock type contained predominantly non-feldspar minerals (e.g. quartz, mica, or amphibole) or feldspars that did not provide a sufficient luminescence signal. Overall, we would expect sufficient feldspar luminescence signals in the majority of plutonic, metamorphic, and sedimentary rocks. Only volcanic rocks, or sedimentary rocks that are predominantly made from volcanic material, may be problematic in terms of their feldspar luminescence (e.g. Tsukamoto et al., 2011).
Secondly, rocks of sufficient size are required to make sure that the luminescence signal of the sampled material has not been optically reset. The depth to which the optical bleaching front penetrates the rock depends both on the translucency of the rock matrix itself and on bleaching time (e.g. Meyer et al., 2018). Typically, optical resetting affects the outer 1 to 10 mm of the rock (Sohbati et al., 2012b; Sellwood et al., 2019); thus, it seems advisable to take the inner-rock samples from well below this depth. In practice, that might be problematic in some cases, as heated stones are likely to fragment into smaller pieces, either during heating (as happened, for example, during our experimental heating), during cooling, or due to post-depositional modifications such as various weathering processes that can easily act upon small cracks formed by heating.
The sampling method chosen here, coring with a diamond drill 1 cm in diameter, also requires at least fist-sized rocks (ca. 8 cm in diameter) that can better withstand the forces exerted on them. Nevertheless, not all sufficiently sized rocks survive this treatment, as they fragment depending on their friability. Either way, the method can be considered fairly destructive. Furthermore, the handling of the stones and the application of cooling water make subsequent use-wear/residue analysis impossible. A better understanding of the depth of light penetration in various rock types, and of different sampling methods, could significantly reduce both the size constraints and the destructiveness of the method. Currently, a promising new generation of luminescence imaging techniques is under development, which will be less destructive and more tailor-made for this kind of application (Sellwood et al., 2019). These new techniques are based on the newly discovered infrared photoluminescence (IRPL) signal from feldspars, which can image the trapped charge without depleting the dosimetric information (Prasad et al., 2017). However, this method is not yet available to our research team and will be a task for future research.
Despite the simplified protocol applied here for feldspar luminescence analysis, i.e. one focused on whether a rock was heated (yes/no) rather than on obtaining ages or the degree of heating, analyses are still relatively time-consuming and require a well-equipped luminescence laboratory. It is possible to use this simplified protocol to analyse a larger assemblage of rocks (e.g. n = 50), but this would still require approximately 3 months of dedicated luminescence analyses, which clearly limits the application possibilities in standard archaeological research. However, our data also demonstrate that this luminescence analysis can help in four ways: (1) it can establish a meaningful ratio of heated vs. unheated rocks for a representative sub-set of an assemblage, which can then be used to project the total number of heated rocks within that assemblage; (2) in the same way, it could shed more light on lithological or spatial patterns within an assemblage, which will help to establish stronger links between the rock artefacts and the archaeological context; (3) rocks identified as unheated in the macroscopic assessment match well with the luminescence assessment ("Match Between Macroscopic Assessment and Feldspar Luminescence Analysis" section), which means that the macroscopic method can be used to pre-select the assemblage and so minimise the number of samples for the subsequent luminescence analyses; and (4) it may in the future be possible to improve macroscopic assessment of potentially heated rocks through an independent cross-calibration using feldspar luminescence.
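As an illustration of point (1), projecting from a tested sub-set to a whole assemblage can be sketched as follows. This is a minimal sketch and not the procedure used in this study: it assumes that the tested rocks are a simple random sample of the assemblage, it uses a Wilson score interval to express the uncertainty (our choice), and the function names and the counts in the usage example are hypothetical rather than the NN2/2 data.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (z=1.96 ~ 95 %)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def project_heated(k_heated, n_tested, n_total):
    """Scale the heated fraction of a tested sub-set up to n_total rocks.

    Returns (point estimate, lower bound, upper bound) as expected counts.
    Assumes the sub-set is a simple random sample of the assemblage.
    """
    low, high = wilson_interval(k_heated, n_tested)
    point = k_heated / n_tested * n_total
    return point, low * n_total, high * n_total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not the study's data).
    point, low, high = project_heated(k_heated=2, n_tested=10, n_total=100)
    print(f"projected heated rocks: {point:.0f} (range {low:.0f} to {high:.0f})")
```

With sub-sets as small as those that are practical for luminescence work, the interval is wide, which is one reason to treat any projected total as an order-of-magnitude indication.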

Other Quantitative Identification Methods
Whereas the application of feldspar luminescence analysis is a step in the right direction to identify heated stones in a less subjective, quantitative way, there are still several constraints and limitations that illustrate the need for other methods that are: (1) lower-cost, (2) potentially applicable to rocks of smaller size, (3) applicable to rocks with different mineralogical compositions, and (4) less destructive.
The OSL dating method applied to quartz minerals offers an alternative but is age-limited to less than 100 ka (e.g. Timar et al., 2010), especially in settings with dose rates well above 1.5 Gy ka⁻¹. Furthermore, bedrock quartz often shows poor luminescence properties (e.g. Guralnik et al., 2015). Quartz OSL is therefore unsuitable for material from Early and Middle Pleistocene sites (like the Last Interglacial case study site Neumark-Nord 2/2 presented here) but could potentially be well-suited to later Late Pleistocene localities, especially to sandstone or quartzite, which are typically the preferred bedrock lithologies for quartz OSL. An interesting alternative avenue in this respect could be the correlation of quartz OSL sensitivity with heating of the rock matrix. It is well established in the OSL literature that quartz OSL sensitivity, which is a measure of the intensity of the OSL signal per unit radiation and per subsample, changes significantly through heating (e.g. Bøtter-Jensen et al., 1995; Poolton et al., 2000). Especially at temperatures between around 500 and 870 °C, the UV emission of quartz, which is the emission typically used for quartz OSL, seems to increase (e.g. Schilles et al., 2001). From our limited experimental data, we conclude that the feldspar luminescence-based method is sensitive to heating below 400 °C (see Fig. 5a); thus, observing quartz OSL sensitivity could complement feldspar luminescence analyses by helping to establish whether a rock was moderately heated or extensively heated to temperatures above 500 °C. However, in terms of practical constraints such as minimum physical sample size, destructiveness, availability of the lab facilities, and labour intensity, quartz OSL is similar to feldspar luminescence methods. An additional challenge for quartz OSL methods is to extract a clean, quartz-dominated luminescence signal in the presence of feldspars.
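The dependence of the quartz OSL age limit on dose rate follows from the standard luminescence age equation, in which age equals the absorbed (equivalent) dose divided by the environmental dose rate. The short sketch below makes this explicit. The saturation dose of 150 Gy is an illustrative assumption, chosen only because it reproduces a limit of about 100 ka at 1.5 Gy ka⁻¹; it is not a value reported in this study, and the function name is hypothetical.

```python
def max_datable_age_ka(saturation_dose_gy, dose_rate_gy_per_ka):
    """Upper age limit (ka) from the age equation: age = dose / dose rate.

    saturation_dose_gy: highest dose the signal can still resolve (Gy).
    dose_rate_gy_per_ka: environmental dose rate (Gy per thousand years).
    """
    if saturation_dose_gy <= 0 or dose_rate_gy_per_ka <= 0:
        raise ValueError("dose and dose rate must be positive")
    return saturation_dose_gy / dose_rate_gy_per_ka


if __name__ == "__main__":
    ASSUMED_SATURATION_GY = 150.0  # illustrative assumption, not a measured value
    for rate in (1.5, 3.0):
        limit = max_datable_age_ka(ASSUMED_SATURATION_GY, rate)
        print(f"dose rate {rate} Gy/ka -> age limit about {limit:.0f} ka")
```

Under this assumption, doubling the dose rate from 1.5 to 3.0 Gy ka⁻¹ halves the age limit from 100 ka to 50 ka, which is why high dose-rate settings are more restrictive.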
Rockmagnetic measurements and analysis can be used to identify heated stones by detecting new ferromagnetic minerals that formed as a result of heating. This method, if successful, distinguishes heated from unheated rocks in archaeological contexts. The rockmagnetic methods used for this are low-field magnetic susceptibility, isothermal remanent magnetisation (IRM) acquisition, thermomagnetic curves, and hysteresis loops (see e.g. Carrancho et al., 2014). Rockmagnetic analysis uses smaller samples than the luminescence analyses and is less labour- and time-intensive. Preliminary studies on a small sample of experimental granites (MJS and A. Carrancho, unpublished work) are (so far) inconclusive, most likely due to the large variation of ferromagnetic minerals within this rock type.

Archaeological Implications and Future Perspective for NN2/2
If one extrapolates the ratio of heated/unheated pieces identified with feldspar luminescence analysis to the total number of stones from Neumark-Nord 2/2 with macroscopic signs of heating, one can expect a total of only 14 stones to be heated. Since none of the stones without macroscopic signs of heating were heated according to feldspar luminescence analysis, we expect no heated stones among them. Given the fire-proxy rich setting of Neumark-Nord 2/2, we conclude that a low number of stones were heated at NN2/2, either accidentally by anthropogenic fires (see Pop et al., 2016) or intentionally. In the latter case, this practice would not have been systematically performed within the excavated area.
Alternatively, heated stones may have been used regularly, but their low numbers are the result of substantial fragmentation due to exposure to high temperatures or frequent reuse. An example of such heavy fragmentation could be Granite 1, which consists of more than 30 fragments found together. Most smaller fragments were excluded from the analysis here because they are unsuitable for luminescence analysis.
Although not all small, non-flint fragments were recovered during fieldwork, concentrations of small crystalline rocks were documented in the north-eastern sector of the site, where many highly fragmented pieces of bone were also found. The fragmentation patterns on these bones indicate predominantly fresh breaks (Kindler et al., 2014), but the degree of fragmentation is beyond the requirements of marrow extraction and may therefore point to a form of resource intensification in which boiling stones could have played a role (Manne et al., 2006; Nakazawa et al., 2009).

Conclusions
Heated stones constitute an important find category for the reconstruction of past fire practices, potentially providing insights into aspects of fire-related human behaviour that cannot be obtained from other proxies. Practices like hot-stone cooking, the use of stones as heat retainers or to structure hearths, are relevant for the study of resource use/intensification, thermoregulation, and spatial behaviour and hence for our knowledge of the development of the human niche.
Recognising heated stones through macroscopic assessment proves to be problematic, as it is subjective, difficult to reproduce, and traces are often difficult to distinguish from weathering processes. Therefore, a robust, quantitative method of identification is needed. For this purpose, the use of feldspar luminescence analysis was explored. Although results show good agreement between macroscopic traces and luminescence signals for the (presumed) unheated pieces, visual assessment of heat alteration overestimated the actual amount of heated non-flint rocks. This illustrates that visual assessment results in a high rate of false positives.
While feldspar luminescence seems to offer reliable identification of heated stones, the time- and labour-intensive nature of the existing methods makes the analysis of a statistically representative sample challenging in the case of large (heterogeneous) assemblages like that of Neumark-Nord 2/2. Furthermore, the state-of-the-art method requires sufficiently large samples and is destructive. Some of these limitations also affect other quantitative methods, but current methodological improvements (e.g. IRPL) may provide more tailor-made solutions for future research. Rockmagnetic analyses present an additional research avenue, but may be limited in their application to specific rock types.
Feldspar luminescence analysis has so far allowed for the unambiguous identification of two heated non-flint rocks at Neumark-Nord 2/2, from a context that yields evidence for fire use, but lacks in situ features like hearths. Together with other contextual evidence (e.g. highly fragmented bones), this evidence may point to the use of heated stones in resource intensification practices such as grease rendering. To test this hypothesis, further research is needed to identify heating of small stone fragments (which may have been fragmented during heating and subsequent weathering) by e.g. using magnetism and to detect low-temperature thermal alteration of bones.
This proof-of-concept study demonstrates the potential of quantitative methods for identifying heated rocks and their prehistoric applications, such as hot-stone cooking, and warns against relying on macroscopic assessment and inferred diagnostic criteria without further independent confirmation.