Defining and Characterising Clusters in Palaeolithic Sites: a Review of Methods and Constraints

Spatial analysis studies in Palaeolithic archaeology arise as indispensable research tools for understanding archaeopalaeontological sites. In general terms, spatial studies have been specialised in the description of the distribution of materials and in the definition of accumulation areas, with the aim of distinguishing intentional activities or studying postdepositional processes. In recent decades, the development of GIS tools has enabled huge strides forward in the field of spatial archaeology research, such as spatial inferential statistics. These tools are particularly useful in the identification and location of clustering from statistical criteria, facilitating the subsequent analysis of accumulations through other archaeological, taphonomic and spatial techniques, such as fabric analysis or directional distribution. The cluster analysis, and its contextualisation considering all the archaeological and stratigraphical variables, allows the inference of some of the processes and factors that could have taken part in the accumulation of materials, as well as assessing how this affected the composition and preservation of the archaeological assemblage. The present article reviews the more traditional and innovative methods for studying horizontal distribution patterns and the objective definition of clusters, highlighting the parameters, uses and limitations of these techniques. We present an application of these methods to different Palaeolithic sites, going through different scenarios, such as location (open-air vs. cave), context, scale (large vs. small area), excavation methodology and spatial record methods.


Introduction
Spatial horizontal studies focussing on defining clusters and distributional patterns are coming to the fore again as helpful tools to interpret archaeopalaeontological assemblages. These methods, combined with a prior characterisation of the site stratigraphy, provide spatial and statistical criteria to identify, characterise and differentiate activity areas and the influence of postdepositional processes in the formation of these distributions. The first spatial studies applied to Palaeolithic archaeology can be traced to interrogating human behaviour through studying the location of the materials found at a site, and the idea that the position of the archaeological materials could reflect a past moment frozen in time (Laplace & Méroc, 1954a, 1954bLeroi-Gourhan, 1950). This idea evolved and increased the interest in the identification and interpretation of accumulation areas, giving rise to some articles focussing on understanding the spatial arrangement of materials and its possible meaning in human evolution. These publications were accompanied by plans and drawings that helped analysis and interpretation of the spatial disposition of the excavated materials (Baker, 1977;Carandini, 1979Carandini, , 1981Harris, 1979;Isaac et al., 1971), a trend that has continued to the present day and has turned out to be a very useful tool for spatial analysis.
Ethnographic analogies were also decisive in spatial archaeology studies, as well as in the interpretation of Palaeolithic sites (Yellen, 1977a(Yellen, , 1977bBinford, 1978Binford, , 1983O'Connell, 1987, O'Connell et al., 1988Enloe et al., 1994). The results obtained in such work have been relevant to refining and tackling the most suitable spatial analyses for getting an approximation to the ways of life of human groups during the Palaeolithic. More recent work has applied spatial statistics and geostatistics in ethnoarchaeology (Lancelotti et al., 2017), revealing these techniques to be powerful tools to get for extracting as much information as possible in ethnoarchaeological contexts (Biagetti et al., 2016;Carrer, 2017;Maximiano, 2012;Negre, 2015), and therefore aiding better comprehension of Palaeolithic activities and their social organisation. The interpretation and analysis of the spatial arrangement of materials and anthropic structures have allowed organisational patterns to be investigated, which may furnish clues about important aspects of social organisation, like the structuration of activities in the occupation space, identifying zones of processing and exploitation of resources (Alperson-Afil et al., 2009;Blasco et al., 2016;Goren-Inbar et al., 2004;Martínez & Rando, 2001;Mora et al., 2020;Sánchez-Romero et al., 2020), as well as domestic or dumping areas (Bourguignon et al., 2002;Vaquero et al., 2004;Vaquero & Pastó, 2001). Additionally, these analyses allow evaluating the impact of other agents on the archaeological record, like the action of carnivores (Arilla et al., 2020;Camarós et al., 2013) or the degree of influence of postdepositional processes (Benito-Calvo & De la Torre, 2011;García-Moreno et al., 2016;Jia et al., 2019;Sánchez-Romero et al., 2016).
The first works focussing on the study of spatial patterns emerged in the 1970s, with mainly visual approaches (Dacey, 1963;Davis, 1975;Hietala & Stevens, 1977;Hodder & Orton, 1976;Whallon, 1973Whallon, , 1974, and the application of quantitative methods that were more commonly used to analyse spatial patterning in ecology and botany (Clark & Evans, 1954;Thompson, 1958;Kershaw, 1961;Morisita, 1962;Pielou, 1969;inter alia). These first approaches to the identification of spatial distribution patterns were applied in Palaeolithic archaeology through innovative techniques, like nearest neighbour analysis (Dacey, 1963;Whallon, 1974), the analysis of distributional patterns by squares or quadrats (Davis, 1975;Haberman, 1974) or the application of chisquare tests to discriminate accumulation zones (Carbonell et al., 1980). However, density analysis started to be applied in Palaeolithic archaeology from the 1950s (Clark & Evans, 1954;Clarke, 1968;Davis, 1975), but was a manual and laborious process that restricted its application and interpretation. Later, besides density analysis, other studies also related to horizontal distribution of materials and unsupervised classification methods appeared, such as k-means. The first applications of this method in archaeology date to the 1980s, reaching their zenith with the use of k-means, a method widely used for classifying and grouping archaeological assemblages (Kintigh & Ammerman, 1982;Ammerman et al., 1983;Simek & Larick, 1983;Simek, 1984;Vaquero, 1999;Vaquero & Pastó, 2001;Mitchell et al., 2006;inter alia). Although these methods provide a useful and quantitative base, and are therefore reproducible, they are somewhat limited by subjective parameters, like setting up the number of groups in k-means or the search radius and classification in density analysis. Besides, these methods do not give statistical significance values for discriminating discrete areas. Nevertheless, recent years have borne witness to a renewed interest in spatial horizontal patterns in Palaeolithic contexts, which includes more detailed studies of the identification of material clustering and anthropic structures, a point being addressed through the application of Geographic Information Systems (GIS) tools. Current trends in spatial distribution analysis in archaeology incorporate tools that identify main accumulations of materials according to spatial statistical criteria, the so-called hotspots methods (Caruana et al., 2014;De la Torre et al., 2020;Domínguez-Rodrigo & Cobo-Sánchez, 2017a, 2017bMora et al., 2020;Sánchez-Romero et al., 2016Shipton et al., 2018). These methods identify statistically significant accumulations according to a given quantitative variable and the spatial relationship among data. Thus, it is possible to discriminate concentrations not only by the clustering ratio (distance between items) but also according to the characteristics of the materials, such as length or weight. Discretising horizontal patterns not only provides new criteria for interpreting sites, but also increases the resolution of other spatial techniques through their application to specific accumulations (Sánchez-Romero et al., 2016.
The present work reviews these different techniques applied up to now in the horizontal identification of distributional patterns of materials, comparing methods and highlighting their limitations in relation to Palaeolithic data directly and indirectly derived from cave and open-air sites. Methodological novelties are also presented in terms of using and combining specific tools for performing a more efficient and solid analysis of horizontal distribution patterns.

Materials
The data employed for this work, which will serve to illustrate the applications, advantages and limitations of each of the different tests and analytical methods, come from the sites of Ambrona, Amalda I and Aranbaltza II (Fig. 1). These three sites are very different in terms of context, scale, chronology and excavation method (Table 1), and turn out to be an excellent case set against which to test the feasibility, limits and advantages of the wide range of tools used to perform a complete spatial analysis. The open-air site of Ambrona was systematically excavated for the first time in the 1960s, with discontinuous excavation projects up to today (Howell, 1963(Howell, , 1965Howell et al., 1995;Sánchez-Romero et al., 2016;Santonja et al., 2018). This site was dated to 350 ka BP (AS6) (Falguères et al., 2006) and contains a large number of Palaeoloxodon antiquus bones and Acheulean lithic industry (Units AS1-AS5), as well as Equus remains and Early Middle Palaeolithic lithic industry pieces (Unit AS6) Santonja et al., 2005). The excavation methods used at this site have been varied, from plan drawings to data collection with total station. In this way, all the registration methods were tested in order to evaluate their effectiveness and limitations in terms of post-excavation data processing and to perform a complete spatial analysis. Another of the sites is the Amalda I cave, where its Level VII was recently dated between 44.5 and 42.6 ka uncal BP (Marín-Arroyo et al., 2018), and on which the present study is focussed. This site was excavated in the 1980s (Altuna, 1990) using measuring tape and a theodolite as a recording method for larger materials. Thus, we had to combine this XYZ information with that obtained after applying random XY coordinates according to information about the square of provenance and excavation spit (Rios-Garaizar, 2012;Sánchez-Romero et al., 2020). Lastly, the Châtelperronian open-air site of Aranbaltza II has allowed us to work with more advanced recording methods, since this site was excavated between 2013 and 2016 and the materials were recorded exclusively by total station. This site has provided a large amount of lithic materials in a relatively reduced area, where there is no presence of bone remains.
Thus, the fact that these three sites are very different has allowed us to compare results considering several constraints, such as different scales, contexts, techniques and excavation methodologies, size and type of materials or data collection methods. This work presents a review of the effectiveness, limitations and possible errors and difficulties that can be found when some methods and analytical tools are applied, as well as the type of results obtained.
The input data have been classified by categories, according to whether they proceed from direct data collection in the field, or are indirect data from secondary or derived sources: -Category 1: Direct data with coordinates collected in the field, either with total station or according to the grid/square. In this case, we have handled data from the sites of Amalda I and Aranbaltza II.  Blasco et al., 2016;Sánchez-Romero et al., 2020). By applying these methods, all the entities without spatial references can be located precisely, whether they are from raster (such as drawings or photographs) or discrete information (points, lines or polygons). The data for this category are from Ambrona and also partially from Amalda I and Aranbaltza II. On this point, it is important to remark that most sites combine both categories, since recording all the remains is quite complicated, especially at those sites where there is a high density of small or very small size remains. Thus, these methods that we proceed to describe allow the recovery, incorporation and interpretation of all remains, even those that are usually discarded from this kind of studies due to their size, recording method or because they were recovered during older excavations.

Methods of identification and classification
Defining general distributional patterns The step preceding the identification of singular clusters should be a definition of horizontal distribution patterns, through evaluation of the general pattern of the assemblage by defining whether the materials are dispersed, clustered or randomly distributed. The type of distribution is measured by the quotient D = variance (S2) / average (m). A random distribution shows a variance higher than the average, where there are no patterns, but this is due to chance and there are no elements that determine the position of the materials. In a clustered distribution, the variance is equal to the average, and here the items are grouped due to a factor that has notionally caused that grouping.
In spatial analysis, we study which factors could be responsible for that clustering of materials, since there are several anthropic and postdepositional processes that can produce these patterns. Lastly, in dispersed distributions, the variance is null, showing a uniform distribution of items within the study area. Most of the methods devoted to categorisation of the general pattern are based on the identification of a null hypothesis, but depending on the data in question, different methods and approaches can be applied. In this regard, several methods are available according to the type of data: -Applicable methods by quadrats: These methods evaluate the distributional patterns according to the number of points (items) contained in each square and their distribution along the surface (Quadrat Method, Lee & Wong, 2000). The most widely used statistical tests for quadrat sampling are chi-square (Pearson, 1900) and Kolmogorov-Smirnov (K-S) (Daniel, 1990;Conover, 1999;Senger, 2013), which allow us to infer whether the materials are clustered, dispersed or randomly distributed (Krauth, 1993;Lee & Wong, 2000;Sánchez-Romero et al., 2020). -Applicable methods based on distance between remains: These methods use the XY coordinates of each item to calculate the distance and relation between them, evaluating their distributional patterns. Among these methods are Average Nearest Neighbour (ANN), Ripley's K Function (De la Torre et al., 2018;Giusti et al., 2018;Sánchez-Romero et al., 2020;Spagnolo et al., 2019Spagnolo et al., , 2020, General G, Getis-Ord Gi*, Global Moran's or Anselin Local Moran's I (Sánchez-Romero et al., 2016Shipton et al., 2018).
Regarding the methods that can be applied to points with XY data, these can be divided into global and local methods. The former (Table 2) are useful to evaluate whether the general pattern is dispersed, clustered or random. This is a preliminary step before the application of additional local methods, since it determines whether the general pattern is indeed clustered. General methods identify clustering patterns in relation to a given quantitative variable. However, these analyses do not detect specificities in heterogeneous scenarios. Thus, they do not provide the identification of subzones where there are clustering or dispersion phenomena (Siabato & Guzmán-Manrique, 2019). The evaluation of general associations within the assemblage is the main characteristic that distinguishes global analyses from local ones.

Identifying and defining local clusters
Density analysis is probably the most common method used in spatial archaeology (Sañudo et al., 2012;Sánchez-Romero et al., 2016Blasco et al., 2016;Pop et al., 2016;Spagnolo et al., 2016Spagnolo et al., , 2020Alperson-Afil, 2017;Villaverde et al., 2017;Giusti et al., 2018;Coil et al., 2020; inter alia) since it shows graphically where the entities are more or less concentrated. The use of this kind of map was introduced in the analysis of archaeological data in the 1980s (Carbonell et al., 1980;Hodder & Orton, 1976;Hodder, 1988)-drawn by hand-but it was not until the late 1990s that it experienced an increase in its application. This change could be due to the availability of this method as a tool in GIS, very useful for two-dimensional estimation of density in spatial patterns (Baxter, 2003). One of the most common methods is kernel density estimation (KDE), which differs from normal density estimation in the application of a kernel function to calculate the magnitude per unit area of line or point features. KDE requires an input value corresponding to the search radius that varies depending on the Measuring the distance between each element centroid and the location of its closest neighbour centroid, calculating the average of all these closest or nearest neighbour distances (Getis, 1964). This ratio is calculated dividing the observed average distance by the expected average distance.

Ripley's K Function
Calculating the average distance between elements within determined distance bands, dividing this value by the average density of elements over the whole of study area (Ripley, 1976). This function shows how the spatial dispersion or clustering of the remains changes when the neighbourhood changes, and is very useful when we want to know whether the clustering or dispersion of the remains change at different scales of analysis.
Global Moran's I Measuring the spatial autocorrelation between the entities' locations and the attribute values using the Global Moran's statistic. This method of analysis, given a set of entities and an associated attribute, calculates the z-score and p-value in order to indicate whether the null hypothesis should be rejected or not. As in the case of ANN, the null hypothesis establishes that the values associated to the entities are randomly distributed.

Getis-Ord General
Evaluating the pattern and the general trend of the data. As an inferential statistic, the results of this analysis have to be interpreted within the context of the null hypothesis, as in the case of the Moran's I. The null hypothesis of General G states that there is no spatial clustering of feature values. This method is most appropriate for testing the general spatial pattern for the whole area (Getis & Ord, 1992).
analysed surface and the amount and size of the materials under study, and this parameter determines the detail and accuracy of the density analysis. Graphical density representations provide a continuous map of the materials' concentrations, which on many occasions need to be segmented or classified to identify and study homogeneous groups. In order to delimit the zones of highest concentration objectively, statistical classifications can be performed. This task can be undertaken using unsupervised classification methods, such as natural breaks classification (Jenks, 1967). This unsupervised method is based on the natural groupings inherent in the data, maximising the differences between classes, without intermediate classifications and establishing the limits where the differences among the data values become more significant (De la Sánchez-Romero et al., 2016. A widely used unsupervised method to classify horizontal distributions is k-means, where the characteristics of the classes are initially unknown (Forgy, 1965;Hartigan & Wong, 1979;Kintigh & Ammerman, 1982;Kintigh, 1990;Rios-Garaizar, 2012;Sánchez-Romero et al., 2016). In this method, the classification is automatic, with practically no intervention by the user, forcing the classification of the dataset into a selectable number of groups. The number of classes or groups must be set at the beginning and the results do not furnish statistical significance for the clustering. K-means has been largely applied in the first studies related to the intra-site spatial archaeology (Blankholm, 1991;Kintigh, 1990;Lemke, 2013;Simek, 1984;Vaquero, 1999), providing a fast method to classify the whole assemblage, but lately this is being superseded by other methods. On the other hand, unsupervised classification methods considering training areas and statistical values of classes have not been tested on Palaeolithic maps, as far as we know.
The different methods explained above show limitations in the identification of isolated clusters, mainly derived from a lack of objectivity or statistical criteria to identify and establish the boundaries of the clusters. These limitations can be addressed through different local analyses that have been little applied in spatial archaeology so far. These methods (Table 3) allow the individual identification of levels of clustering, random or dispersion distribution of each variable in relation to the rest of the units through neighbourhood criteria. During recent years, these methods are starting to be applied to confer statistical significance on the cluster mapping Giusti et al., 2018;Mora et al., 2020;Sánchez-Romero et al., 2016Spagnolo et al., 2019Spagnolo et al., , 2020, with the aim of getting a more objective identification of the clusters in clustered distributions. These techniques can be applied to any quantitative variable, in order to assess different factors that could have boosted the concentration of materials at this point of the site. These methods are mainly represented by the Getis-Ord Gi* and Anselin Local Moran's I spatial statistical tests (Table 3), which identify statistically significant concentrations from the spatial relationship among individual items, which are defined by a quantitative variable (length, weight or frequency). However, the parameters and limitations controlling hotspots methods have scarcely been tested to date.

Characterising clusters
The identification of clusters opens up a new research avenue into the spatial distribution patterns of Palaeolithic sites. Through the spatial and archaeological study of each of the clusters individually, and their relationship with the rest of the identified clusters, new questions can be addressed from different perspectives in relation to the possible causes of material clustering. The clusters have to be analysed individually in order to know their composition and features, such as raw materials, technology, typology, function, taxa, skeletal parts, cutmarks, toothmarks or digested bones, among others. In addition, information like the shape, length and fabrics of remains (Sneed & Folk, 1958;Schiffer, 1987;Benn, 1994;Bertran & Texier, 1995;Bertran et al. 1997;Dibble et al., 1997;Ringrose & Benn, 1997;Lenoble et al., 2000;Benn & Ringrose, 2001;Bertran & Lenoble, 2002;Lenoble & Bertran, 2004;McPherron, 2005 The cluster definition also allows evaluating the position, shape, size and orientation of the concentrations. These methods have been applied to use-wear characterisation (De la Torre et al., 2013), but they are also appropriate to be performed in spatial analyses, to assess the cluster distribution in relation to the site structure and its local environment (Sánchez-Romero et al., 2020). One of these techniques is the directional distribution, which, through the standard deviational ellipse (Mitchell, 2005)

Discussion
Application and limits of the methods and their suitability according to the available data The spatial analysis of Palaeolithic sites allows a vast amount of information to be put together, which needs to be homogenised, classified and spatially correlated. In this work, we aim to review the different analytical methods and go one step further in the identification of clusters, as a way of segmenting horizontal palimpsests, identifying, characterising and interpreting the archaeological accumulations found at Palaeolithic sites.
To this end, we work with spatial databases from different contexts that require a good understanding of the data handled, their origin, limitations and the methodology applied to collect them. Working with data taken directly in the field (or category 1) is preferable, since this provides detailed information about the spatial position of the materials under study (Dibble, 1987;Dibble & McPherron, 1988;McPherron, 2005;McPherron et al., 2005) and confers great versatility upon the analyses. As an example, the work performed at Aranbaltza II, where materials are recorded with total station, offered great precision in tackling spatial analysis of the distributional patterns (Fig. 1). In this particular case, because it is a small area (around 18 m 2 ), the precision in data recording gains even more importance, since the margin is much tighter than for larger sites, such as Ambrona or Amalda I (Fig. 1). To do this, accuracy in bringing all the data together is critical to detecting significant archaeological clusters. As an example, the area of the main clusters in Ambrona varies from 10 to 45 m 2 approximately (Sánchez-Romero et al., 2016), while in Amalda they are 1-6 m 2 (Sánchez-Romero et al., 2020) and < 1m 2 at Aranbaltza II.
However, not all sites present these characteristics in their data, especially for old excavations, when these techniques were not available or not in wide use. In addition, as mentioned above, most sites cannot offer a complete field record with XYZ data for all excavated materials. In these cases, indirect data sources (or category 2) are a good alternative, since they allow a large amount of information Sánchez-Romero et al., 2016) to be recovered and give these data a second life, as otherwise they would have been passed over and probably not fully taken into consideration. The Ambrona site, where data from the 1960s to the 1980s excavations were combined with data from the 2000s (Sánchez-Romero et al., 2016), is a good example of recovering information that would have not been used (Fig. 2), since it was unpublished or serving as merely as illustrations (Howell, 1963(Howell, , 1965. The recovery of this type of information implies the application of techniques such as georeferencing of plans (Benito-Calvo & De la Torre, 2011;Sánchez-Romero et al., 2016), which allows combining and scaling maps with the aim of obtaining complete information about the excavated materials and their position, not only individually but also in relation to the site and the remaining materials. This kind of work with preexisting plans allows the spatial analysis of large pieces that were drawn in the field (Boschian & Saccà, 2010;Sánchez-Romero et al., 2016;Walter & Trauth, 2013). Unfortunately, these maps usually show the larger materials, not lithic pieces, which are smaller and normally represented by points. Spatial distribution analysis in drawings where items are represented by both points and polygons requires special considerations, since the transformation of all geometries to points could produce an underestimation of the distribution pattern at the study scale. Thus, the representation of large pieces with points at bigger scales does not provide the real occupation of the space by that piece, causing subsequent errors when analysing density, clusters or in other spatial analysis.
After working with a large number of spatial datasets, from different methods of data collection and excavation methodologies, we can establish that the most important point is the systematisation of the method, both in the field and the data treatment at the lab. It is essential that the data handling protocol is properly followed and maintained, with the aim of minimising the errors and uncertainties to the extent possible (Martínez-Moreno et al., 2016). Errors are inevitable, but with this systematisation they can be reduced or their accumulation avoided, which is critical, especially in those cases in which the data quality and accuracy are unknown. When the data from older excavations is dealt with, it is important to highlight that a new database is being created combining old data (such as drawings) with new data (such as the coordinate method where the material is being integrated) (Benito-Calvo & De la Torre, 2011;Sánchez-Romero et al., 2016). In this way, a new lease of life can be given to old documents or information, thus providing valuable new information for the interpretation of site formation processes and the activities performed at the site (Fig. 2).
To recover the spatial position of materials only represented by points, especially when the information is incomplete or comes from old excavations (Blasco et al., 2016;Rios-Garaizar, 2012), spatial modelling through random models has proved to be a useful resource. In the case of Amalda I, Rios-Garaizar (2012) modelled the spatial position of the pieces without XYZ data, using their positions in the square and a random distribution model (Rios-Garaizar, 2012;Sánchez-Romero et al., 2020). This XY coordinate assignment was evaluated using the unsupervised k-means method, which allowed the present authors to evaluate whether the same clusters persisted, and this was checked for each modelling performed. Random techniques are also applied to distribute the smallest pieces (e.g. debris) that are not usually coordinated during the excavation process. In the case of Aranbaltza II, random coordinates were generated for the smallest pieces according to the number of pieces and a neighbourhood determined by the collection radius (Blasco et al., 2016;Rios-Garaizar, 2012;Sánchez-Romero et al., 2020) (Fig. 3).
The generation of random coordinates in the Amalda I and Aranbaltza II sites has allowed incorporating information that would otherwise have been relegated to the background, or even excluded from a spatial analysis study. In both cases, we were able to verify that there were no variations of any kind in the number of clusters, their extension or location (Rios-Garaizar, 2012; Sánchez-Romero et al., 2020). The possible variations that could exist in terms of the location of the remains within the radius do not influence the cluster definition at all. However, taking this into consideration, it is important to highlight that if the area were larger and the material collection radius were also larger, the interpretation of the site might be affected. In the latter case, there could be variations in the location of the clusters and their extension, which could affect in some ways interpretation of the factors that could have affected the formation of the site and the accumulation of materials. Thus, and taking the cases of Amalda I and Aranbaltza II as contrasted examples, when a spatial study is undertaken with random coordinates, it is important to keep in mind the following: (1) defining the radius of collection of materials exactly and accurately is essential, and (2) awareness of the area where this method is being applied, since the greater the area and the greater the radius where we apply this random distribution of remains, the greater the error range and, Fig. 3 Modelling of pieces without XY information considering the radius where they were collected (a). Distribution resulting from this modelling (b), and together with the rest of the materials with XY information (c and d) therefore, the lower the precision. Above all, as we previously mentioned, adhering to exhaustive criteria in data collection is key, since the variations that may exist in the method will be added as accumulated errors, affecting the data postprocessing, and therefore rendering spatial study and interpretation of the site even more difficult.
Data preparation determines the accuracy and limitations of the spatial analysis. These techniques were tested on the sites of Ambrona, Amalda I and Aranbaltza II using different databases at different scales, and hence verifying their effectiveness in diverse scenarios (Sánchez-Romero et al., 2016. In all cases, the distributional general pattern could be inferred through chi-square and Kolmogorov-Smirnov tests (Fig. 4), pinpointing the clustered nature of all assemblages. These procedures allow an approximation to the general distribution pattern of the site, although recently these tests have been applied for predicting archaeological information from unexcavated areas (Domínguez-Rodrigo et al., 2017). Besides, these techniques are also applicable to providing general relationships between materials, such as, for example, the relationship between the raw materials and the type of tools. In this way, these tests are applied not only to learn the distribution type of the materials, but also to evaluate any correlation between them. This last use is the most common in archaeology, in order to reveal the correlation between variables and the statistical significance of the data handled (Simek & Leslie, 1983;Brantingham et al., 2007;Sisk & Shea, 2008;Bernatchez, 2010;Vaquero et al., 2017;Giusti et al., 2018;inter alia). Because these tests, at spatial analysis scale, only indicate the random, clustered or dispersed general pattern of the materials, other analyses are necessary to locate the clusters, and hence to assess them more thoroughly.
The widespread use of computers and spatial analysis software, such as GIS, permitted density analyses to be broadly applied in archaeology. This technique requires a search radius whose use depends on the data and site extension, determining the accuracy and representativeness of the density results (Fig. 5). In general, the Fig. 4 Example of Amalda distribution plan and the frequency table elaborated from the number of quadrats, number of points and number of points contained in each quadrat. With these data and table, it is possible to calculate the chi-square and Kolmogorov-Smirnov values, which allow the distribution of the materials to be known selection of this parameter is not usually tested or assessed, which could affect the interpretation of the results. As an example, at Ambrona, several radii were tested, with a search radius of 2 m being selected due to the dimensions of the analysis area and the size of the materials (Sánchez-Romero et al., 2016). However, for Amalda I, the selected search radius was 0.5 m, while at Aranbaltza II it was 0.30 m (Sánchez-Romero et al., 2020). In these cases, the areas are much smaller than Ambrona, as is the size of the analysed materials, so in consequence the selected radius is smaller than the one chosen for larger areas. In all cases, density results were tested with other statistical methods (such as hotspots), in order to provide a cross-validation of the results and to add statistical criteria to the cluster identification (Fig. 6). Density analyses are applicable to point and line entities, generating a continuous raster distribution, but do not provide any segmentation of the space for separating clusters. This method is usually used to observe the occurrence of materials, although other quantitative variables can also be selected, such as size or weight.
In this sense, this method has been recently combined with other unsupervised classification methods, such as the Jenks method (Sánchez-Romero et al., 2016. For Ambrona, the identification and comparison of large accumulations of materials were performed using these techniques, highlighting the clusters of faunal remains (Sánchez-Romero et al., 2016), which were also cross-validated with hotspot methods (Fig. 6). The identification of these clusters and their individual study, together with their correlation with the stratigraphic units, facilitated the identification of processes related to different sedimentological areas in a wetland environment. In Amalda, the unsupervised classification method of k-means allowed the verification of clustering patterns through modelling the spatial distribution of random points (Rios-Garaizar, 2012). However, this method has several limitations that can skew the identification of clusters if the analysis is solely founded upon it. The classification of k-means is mainly based on the number of groups, chosen previously by the user, and normally selected without objective criteria. Besides, the classification is forced, not supported by the criteria of significance, something that also is not detected by this method. Through the application of k-means, we cannot obtain statistical significance for the clustering, but rather the grouping of materials according to their proximity, and the number of groups will vary with the number determined by the user. The step preceding the classification of point distributions should be an objective definition of the number of groups, either through a previous plot with sum-of-squares errors that helps the number of clusters to be understood (Kintigh & Ammerman, 1982), or through the Ward hierarchical classification, which enables appraising the number of groups into which the distribution seems to be organised. In general, unsupervised methods ease the identification of uncertain clustering areas, especially by means of the limits established between groups. We tested some of these methods in the studied cases, like Ambrona (Fig. 6) or Aranbaltza II (Fig. 7), good examples of very different open-air sites.
More recently, these studies are being complemented with techniques that enable more thorough investigations into material distribution patterns, providing data reinforced by statistical significance. These methods are proving themselves very useful for estimating distributional patterns, but they are also applied as complementary tests to verify the results obtained by previous analyses. De la Torre and Wehr (2018) used them to discover the artifact density and clustering pattern using this function as a complementary analysis to ANN analysis, while Giusti et al. (2018) also used ANN to estimate the cumulative distribution, with Ripley's K function as a correction to reduce the edge effect bias. However, Spagnolo et al. (2019) applied Ripley's K function to Fig. 6 Comparison between different methods of classification and identification of clusters at Ambrona: a hotspots (Getis-Ord Gi*) by points, b hotspots (Getis-Ord Gi*) by quadrats, c K-means and d Kernel density analysis verify the variations of clustering or dispersion rate, pinpointing the sensitivity of this method to study-area variations. This having said, other global methods, like General G or Global Moran's I, have barely been used in comparison with the application of others, like Ripley's K function, for example. In the case of Amalda I, we applied different global methods, such as ANN, Global Moran's I, incremental spatial autocorrelation and the Ripley's K function (Sánchez-Romero et al., 2020) to evaluate the clustering nature of the distribution pattern and its characteristics, like the statistically significant clustering or dispersion according to the distance range or the distance where the maximum clustering of materials was found.
However, global methods cannot identify the position, extent and statistical significance of the clusters. Nevertheless, such characteristics are provided by local methods, which identify the clusters and their statistical significance in the context of the whole assemblage. The first work using these methods in Palaeolithic archaeology applied them to use-wear analysis of lithic tools (Caruana et al., 2014) and intra-site spatial analysis (Sánchez-Romero et al., 2016), revealing their effectiveness at detecting statistically significant clusters of high and low values (De la Torre et al., 2020;Domínguez-Rodrigo & Cobo-Sánchez, 2017a, 2017bMora et al., 2020;Sánchez-Romero et al., 2016;Shipton et al., 2018). The application of these methods to the analysis of distributional patterns in Ambrona defined the main accumulation areas (Sánchez-Romero et al., 2016), both at individual level and by squares. These tools were also applied in Amalda I to assess the spatial organisation of the activities and the use of space by Neanderthals, as well as the accumulations caused by carnivores in inner and more protected zones (Sánchez-Romero et al., 2020). In this case, this method allowed testing the number of elements per square according to different variables (burnt remains, cutmarks, toothmarks, etc.), which yielded complete information about the distribution and clustering patterns of Amalda I with variables that are not themselves quantitative. These techniques produce a valuable picture and assessment of clustering, but they are controlled by several parameters which have been little tested so far. Hotspots are mainly controlled by the quantitative variable and the spatial relationship employed to perform the analysis. Thus, data characteristics and the area of the site are essential features for selecting the spatial relationships (fixed, inverse and Fig. 7 Comparison between some of the same methods used in Ambrona but in the smaller area of Aranbaltza II: a Kernel density analysis and b K-means inverse squared), since the results vary with these. We can verify the importance of the spatial relationships and the differences between the obtained results (Sánchez-Romero et al., 2020) (Fig. 8), highlighting the suitability of one spatial relationship or another according to the data in question. All these spatial relationships were tested, in order to evaluate the suitability of each according to the data. The application of Getis-Ord Gi* and Anselin Local Moran's I to Amalda I was conducted using the length of the lithic pieces and faunal remains. The fixed relationship enabled a solid definition of clusters, while the inverse and inverse squared distance relationships only detected some scattered points.
In the case of Aranbaltza II, this method was applied according to the length of the lithic pieces and some clusters composed of few materials were detected, both high and low values (Fig. 9). Although the distribution of the assemblage is clearly clustered, the clusters detected were dispersed and composed of just a few elements, which provide no clear evidence for explaining the site formation processes. This happens because the tool detects the clustering of high or low values (in this case, length), but if there are no such clusters, the distribution turns out to be "not significant" in terms of statistically significant clustering. This happens for all the spatial relationships applied to the Aranbaltza II assemblage. However, as we noted in the rest of the cases analysed, the fixed relationship usually provides well-defined clusters, which is due to the fact that the distance band employed by this relation ensures that each element has at least one neighbour. In the inverse and inverse squared distances, all elements influence all others, but the further away they are, the smaller the impact. Apart from this, the weight Fig. 8 Application of Getis-Ord Gi* to lithic materials of Amalda II by length, and comparison between the different spatial relationships: a fixed, b inverse and c inverse squared. Same comparison but considering the number of remains per quadrat: d fixed, e inverse and f inverse squared of the distance bands is critical, as the definition of a correct distance will determine the influence of neighbours, and hence the results obtained. In the cases analysed, we observed that the extent of the study area influences the application of one spatial relationship or another, and therefore the interpretation, which enables the processes that took part in the formation of the site to be inferred, is also affected.
Additionally, it is important to consider the application of the FDR correction, which adjusts the statistical significance by reducing the error thresholds of p-values. This yields more solid detection of the statistically significant clusters but can also eliminate or reduce relevant values for the study, and hence bias the results (Fig. 9). The single use of the FDR correction should be applied cautiously and always as a complement to Fig. 9 Samples of application of FDR in Getis-Ord Gi* analyses according to the length of lithic materials (a, b) and to the number of remains per quadrat (c, d) in Aranbaltza II other analyses, since it may lead to the loss of relevant information. The functioning of this parameter was tested in Amalda I and Aranbaltza II, and we were able to verify that the results of this correction are conditioned by the type of distribution and number of elements, as well as the size of the analysed area. For Amalda I and Aranbaltza II, where the studied areas are smaller and more enclosed, its application simplifies the results excessively. In the case of Amalda I, the application of FDR reduced the statistical significance of the clusters when the fixed spatial relationship was applied, making delimitation of the clusters more difficult. However, in this case, this correction did not entail a great change with respect to not applying it (Sánchez-Romero et al., 2020). On the contrary, at Aranbaltza II, the FDR correction eliminates the highsignificance spots of high and low values (Fig. 9a, b), blocking any identification and definition of clusters. In the case of its application to the frequency by squares (Fig.  9c, d), FDR reduces the extent of hotspot clusters and eliminates the coldspot ones. The FDR correction indicates which clusters are more robust according to their statistical significance, but the results should not be relied on alone for the final identification of the most significant clusters.
Regarding the application of local methods, Anselin Local Moran's I and Getis-Ord Gi* are the most common ones. The main difference between them is that Anselin Local Moran's I detects atypical values, while Getis-Ord Gi* allows the clustering degree (statistical significance) of low and high values to be known, excluding the analysis of atypical values (Getis & Ord, 1992;Ord & Getis, 1995;Anselin, 1995Anselin, , 2019Siabato & Guzmán-Manrique, 2019). In the case of spatial studies in archaeology, we have tested both methods in the identification and definition of clusters, verifying the advantages and disadvantages of each (Fig. 10). In these two examples (Fig. 10), both methods recognise the same clusters, identifying accumulations of low and high values. However, for spatial analysis in archaeology, knowing the statistical significance of the clusters proves a more useful resource than the presence or absence of atypical values, since it enables the clusters of materials to be delimited more accurately. A good example is observed with Aranbaltza II (Fig. 10c, d), where if only the clusters detected by Anselin Local Moran's I are considered, numerous small concentrations of materials would be classified as statistically significant, although they are composed of just a few pieces. In reality, though, these clusters do not provide information that helps elucidate the formation processes or the presence of, for example, certain activity areas, since these concentrations are formed only of a few elements and, in most of the cases, are of low statistical significance. At this point, and as we have been indicating throughout this work, it is important to highlight that these methods should be supported by other analyses and we should not base all the spatial analysis results only on statistical data. It is necessary to evaluate to what extent these clusters made up of just a few pieces are decisive, since these pieces could be important to shedding light on the performance of a specific activity.
Getis-Ord Gi* is the most widely used method (Sánchez-Romero et al., 2016Spagnolo et al., 2019Spagnolo et al., , 2020Mora et al., 2020;Giusti et al., 2018;inter alia). In spatial archaeology, we mainly work on analysing the distribution of materials in relatively small and limited areas (in comparison with other surfaces, such as those dealt with by geography or ecology), trying to understand the relationships of the different elements that make up the site as a whole, not only considering lithic and bone assemblages, but also structures (like hearths) or natural elements (like large blocks), and the space itself, such as cave walls. Therefore, the evaluation of the distribution patterns and the clustering degree of the analysed variable provided by Getis-Ord Gi* acquires greater relevance for interpretations at the work scale and the studied distributions. However, when the analysis is performed at smaller scales and where great detail is required for knowing the clustering and distributional patterns of values, Anselin Local Moran's I enables the variation among values and the possible existence of atypical values in relatively homogeneous distributions to be observed. Regarding spatial analysis in Palaeolithic archaeology, the same clusters are identified by both methods, although the concentration degree and its statistical significance become more useful in terms of identifying and pinpointing clusters of high and low values, rather than the presence or not of atypical values. Thus, the combination of data and analytical methods is key to defining clusters or accumulations robustly, beyond mere visual identification. As we have previously seen, the methods and techniques for the identification analysis of clusters are numerous, so their use and combination prove to be a good method for correct identification of the main concentrations of materials. In the case of hotspots, although they are a useful and powerful tool, it is important to underline that these methods are also determined by parameters whose selection should be based on previous analysis and supported by other kind of identification methods. In order to achieve a solid identification of clusters, the combination of techniques and disciplines shows itself to be the best method, since this entails a more complete understanding of the site spatial distribution.

From clusters to the interpretation of site formation processes and human activity patterns
The cluster definition is the basic first step that allows the analysis of the processes that caused them to accumulate, as well as evaluation and understanding of the distributional patterns detected at Palaeolithic sites. From the spatial analysis point of view, several techniques are currently applied to explain the existence and location of clusters. Indeed, studies such as palaeogeographic reconstructions (Oliver, 1990) can turn out decisive for estimating the possible causes of material accumulations and their location, since the same process that modelled the palaeosurface could also be responsible for the material distribution. The possibility of gaining an approximate idea of the context in which materials were deposited is important, since it will allow carrying out a more exhaustive study evaluating other kinds of elements that could have been also involved in the accumulation process. Although intra-site palaeogeographic reconstructions are starting to be seen in the last few years (De la Torre & Wehr, 2018; Giusti et al., 2018;Bargalló et al., 2020;Sánchez-Romero et al., 2020), they are still scarce, since in many cases there is not enough information to infer the palaeosurfaces on which the excavated materials were deposited. At Amalda I, the palaeogeographic reconstruction made it possible to observe that the main accumulations were independent of the palaeotopography of the level, and the lithic and faunal remains clusters were located in regular areas with a gentle slope (Sánchez-Romero et al., 2020).
When it comes to characterising the identified clusters, there are some widely used spatial techniques that enable the processes that could have affected the materials contained in each to be evaluated, such as analyses of orientation, slope and 3Dfabric (Benn, 1994;Bertran & Lenoble, 2002;Lenoble & Bertran, 2004;McPherron, 2005McPherron, , 2018Benito-Calvo & De la Torre, 2011;Benito-Calvo et al., 2009;García-Moreno et al., 2016;Sánchez-Romero et al., 2016;Spagnolo et al., 2020;inter alia). The isolation of the main clusters and their analysis permits the clustering features to be detailed, and therefore zooming in on the possible processes that could generate these accumulations. The characterisation of the main clusters at Ambrona made it possible to establish the correlation between the georeferenced materials and their stratigraphic attribution, even when data about the units were not available in the plans (Sánchez-Romero et al., 2016). In this case, the data extracted from the identified clusters were orientation and size patterns, which were cross-referenced with the information obtained from stratigraphic studies of each Ambrona unit. There are other examples, similar to Ambrona, such as Neumark Nord , where these techniques have also been also applied to understand formation processes.
The directional patterns allow testing whether the clusters are conditioned by the space where they are located, such as cave walls or the limits of an open-air excavation, or if they are independent of this type of factor. If the clusters are distributed according to the limits of the site, it is possible that this accumulation could have been constrained at the moment of the deposition of the materials, the geometry and the walls of the cave being the conditioning factors for the direction of the flow. In the case of Amalda I, the identified clusters showed independent patterns, different from the pattern identified for the whole assemblage. Thus, through the application of the directional distribution method, together with the other analyses performed, it was inferred that the accumulation patterns of the main clusters did not seem to reflect natural accumulation processes (Sánchez-Romero et al., 2020). However, other factors should be considered that might not be dependent on the limits of the site, such as random accumulations distributed according to the needs or activities at that time. Certain activities can generate accumulations that are also somewhat constrained, especially if the intention is to concentrate the materials at specific points of the site (dumping areas) or to use the site's own structural elements, such as boulders or cave walls, for certain activities (Leroi-Gourhan & Brézillon, 1972;Martínez & Rando, 2001;Meignen, 1994;Sañudo et al., 2012;Vaquero et al., 2015;Vaquero, Chacón, Cuartero, et al., 2012a;Vaquero, Chacón, García-Antón, et al., 2012b). Thus, directional distribution ellipses can be helpful for offering deeper understanding of the accumulation dynamics and spatial relationship in the context of the whole site, not just in terms of spatial location but also regarding the site boundaries.
Throughout this article, different analytical methods for the detection of clusters and their subsequent interpretation have been mentioned, showing the strengths and weaknesses of each. To do this, we presented different sites, from a large open-air site to a small cave, with different data, excavation and data collection methods. As has been seen, the effectiveness of the methods mostly depends on the data employed and the question to be answered. Additionally, and as was mentioned above, the use of these methods in isolation will not enable inference of what happened on the site, since other disciplines are necessary for interpreting the results. Methods such as Getis-Ord Gi* and Anselin Local Moran's I allow significant accumulations of materials according to quantitative variables over the whole site to be detected, but the relevance of these accumulations needs to be explained from an archaeological and stratigraphic point of view. This means it is necessary to evaluate whether, considering the number of remains, composition, context and extension of the site, these accumulations are really significant, or not, like in the case of those detected at Aranbaltza II. On the other hand, palaeotopographic reconstructions of the study levels where the spatial analysis is being performed may be essential to understanding the presence of certain accumulations and characteristics of the materials, such as orientation and/or slope patterns. This is a very useful resource for all types of sites, but its use depends on the availability of the data and how the data collection was compiled. The incorporation of data from old excavations, their amalgamation with new data and the subsequent spatial analysis depend on (1) whether such data are available and (2) the type of information about them to hand. Therefore, and trying to answer the question of which methods work better for one or another site, or which are more suitable for different types of sites, we cannot give a definitive response since it all depends on the data employed and available, and the question the researcher wants to answer.

Conclusions
This work is intended to offer an overview of the most widespread methods for identifying horizontal distribution patterns, and the tools that have emerged in recent years. It establishes that there is not just one tool valid for cluster detection, identification and analysis, but rather the combination of several analytical methods, thorough control over the data and exchanging data with other disciplines are key to conducting a complete study of the distributional patterns in Palaeolithic spatial archaeology.
The application of all the methods described here to data from different origins (plans, total station, random modelling) and sites (context, area, chronology, materials) has allowed us to test the spatial analysis method on a wide range of possibilities. The results, both at individual level and when compared mutually, have brought out the advantages and limitations of each of the methods, and how these limitations can be overcome by combination and application of other analytical techniques and methods. Thus, the detection of clusters using hotspots may provide objectivity and accuracy, helping to infer the possible processes responsible for these accumulations. However, this method is also dependent on testing and on existing detailed knowledge of the site and the materials composing the assemblage.
Cluster identification is a key step in segmenting horizontal accumulations and analysing their formation, so the application of different and varied methods and techniques acquires even greater importance. This approach offers the opportunity of testing the agency of these concentrations of material to rule out naturally driven processes, as a mandatory procedure prior to interpreting a given spatial distribution in terms of human activity. This article has demonstrated that practically all the available data can be useful for spatial studies, bringing order and putting everything together for deeper investigation of what happened at the site.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.