Introduction

Inferring behavior with potentially adaptive significance from the stone artifact record has been a major challenge in our field from its inception. This challenge remains despite the current resurgence of raw material and use-wear studies, holistic assemblage-scale analyses, and advanced experimental and quantitative approaches. One of the major reasons behind this challenge is the persistent focus on artifact manufacture and form, that is, on both the technological and morphological appearance of stone artifacts (Dibble et al. 2017; Holdaway and Douglass 2012). While more than a century of research has moved us a long way towards understanding how artifacts were manufactured and how they were immediately used, these are just two aspects of the complexity of stone use, and as such are only part of the broader array of potential data sources for developing inferences about past hominin lifeways. By stone use, we mean any kind of action involving stone in either its natural or humanly modified form: knapping, transporting, selecting, (re)using, recycling, etc., where any one of these kinds of actions over time makes up a behavioral process.

The conventional focus on artifacts’ technological and morphological appearance is directly related to the current conception of what assemblages of stone artifacts represent and how the patterns and variability among those assemblages emerge. The emphasis on the roles of selection, recycling, and transport in this formation has recently increased (see papers in Barkai et al. 2015; Douglass et al. 2016; Lin 2018; Peresani et al. 2015; Turq et al. 2013; Venditti et al. 2019), yet the predominant perspective is that assemblages are long-term products of what happened at a site reflecting the central tendencies in stone use during the time of their formation. While the current understanding does acknowledge the long-term aspect of artifact accumulations, interpretations usually remain unimodal, being related to a single (long-term) underlying system (e.g., demographic, cultural-normative, or adaptive). In our view, this largely continues the nineteenth and early twentieth century interpretation of the past material culture.

The conception of assemblage and its implication for interpretation remains a long-standing topic in the discussion about the formation of the archaeological record (Bailey 2007; Binford and Binford 1966; Foley and Lahr 2003; Gould 1980; Holdaway and Wandsnider 2006; Isaac 1972; Lucas 2012; Shott 2008; Stern 1994). This discussion has been providing a critique of the ways we interpret stone artifacts to study past behavior and adaptation, focusing on the lack of separation between what we see archaeologically—the artifacts in their present sedimentary context—and inferences from these artifacts. The recognition of such a separation would imply that the stone artifact record and stone artifact assemblages are not direct measures of adaptation to selective pressures of the local socio-natural environments. Instead, “behavioral processes” deduced directly from assemblages of artifacts may in fact reflect geomorphology rather than behavior (Davies et al. 2016; Holdaway and Fanning 2008; Kirkby and Kirkby 1976; Waters and Kuehn 1996). If this is the case, even less adaptive meaning can be attributed to these direct behavioral inferences. We think that to counter this, the inferential path from the archaeological record to past behaviors should go through a more rigorous procedure than is asserted by approaches that focus on the variability (either discrete or continuous) in techno-morphological characteristics of stone artifacts and assemblages.

Here we build on this discussion and move it forward by introducing first the notion of aggregates to expose additional issues behind the concept of stone artifact assemblage. We think these issues have not been confronted to the extent that is needed to fully grasp the connotations of record formation for the interpretation of stone artifact accumulations. We explain how the notion of aggregates provides a more meaningful perspective for investigating this formation and making analytical constructions. With the intent to convey more effectively our view of how patterns and variability emerge from these accumulations, we then introduce the notion of formational emergence. Properties of the record can and often do emerge in ways that are not directly translatable to long-term behavioral patterns. For inferring the behavior that could be used to inform on past lifeways, we suggest a shift from the analysis of technological and morphological appearance of assemblages using imposed categories of predefined meaning to an analysis that focuses on modeled practice of stone use. This practice is the interaction of different behavioral processes and on different scales of time and space that might be interpreted using stone artifacts as a proxy for those processes rather than as their direct manifestation. Finally, we discuss the potential implications of these notions for a number of theoretical and methodological aspects within stone artifact archaeology. These concepts provide a perspective that steers away from studying artifacts as direct and explicit sources of behavior to one that requires a review of the way we sample the stone artifact record and how we construct knowledge about the past.

With this focus on practice, our goal is to advocate an approach of stone artifact analysis that investigates the formation of the record including the emergent patterns and variability in record measurements in order to infer meaning from combinations of processes involving stone that varied through time and space. By situating practice within ecological and social contexts, we seek to infer its relevance for adaptation and through this, its evolutionary significance.

Aggregates

As archaeologists, we are aware that the formation of the archaeological record has no discrete end (Ammerman and Feldman 1974; Lucas 2012). Just as stone objects themselves may be subjected to further re-modification, the stone artifact record is re-formed as objects are added or removed as seen at different spatiotemporal scales (Carr and Bradbury 2018; Dibble et al. 2017; Holdaway and Davies 2019; Isaac 1981; Turq et al. 2013). Research on various questions about the past requires us to take out and use groups of stone artifacts, rather than individual finds, from this spatiotemporal continuity. These stone artifact assemblages are usually defined based on a depositional unit. However, because of interpretive implications that have pervaded the traditional concept of an “assemblage,” we think that investigating past human behavior by relying on this concept has run its course. During the past five decades, discussions about the formation of the archaeological record have repeatedly brought up issues with the basic assumptions and implications of this concept. The most common concerns are about the temporal relation of artifacts within an assemblage and preservation of assemblage “original” inventory (Ascher 1968; Bailey 1983, 2007, 2008; Binford 1978, 1980, 1981a; Isaac 1981; Lucas 2012; Murray 1999; Schiffer 1972, 1983, 1985, 1987; Shott 1998; Stern 1993, 2008; see also Dibble et al. 2017; Holdaway and Wandsnider 2006). We think that stone artifact archaeology needs to account for these and other concerns alike by improving further the understanding of how stone artifacts accumulate and how the emergent properties in those accumulations may be interpreted.

Some of these concerns have led researchers to avoid reference to the term “assemblages.” Kleindienst (2006), for example, suggested that an empirical collection of artifacts from the same depositional context should be called an “aggregate” or a “sample,” that is “whatever does not denote any ‘group of people’ assumed to be related in any social or biological sense. It only refers to content, or the material or observations of evidence interpreted to be humanly produced” (2006: pp. 17–18) (see also Clark et al. 1973; Dunnell 1992; Hull 2005). Along this line, we follow Kleindienst (2006) in advocating the concept of the “aggregate” for any grouping of stone artifacts sampled from the stone artifact record.

We do not consider the term “aggregate” to be a word replacement for “assemblage.” Rather we see it as a concept that emphasizes the process of record formation and subsequent sampling by archaeologists. As such, we embrace the role of the analyst and the research undertaken when constructing units of grouped artifacts. This contrasts with an approach that sees assemblages as part of the discovery of natural units of people, culture, behavior, or adaptation. We believe that the aggregate notion allows for the articulation of entanglements (Hodder 2012) or associations between individual stone artifacts that arise due to the fusion of behavioral and natural processes taking part in their accumulation through time and space. This meaning is, of course, different from aggregates in the sense of aggregation based on provenience information during excavation (Lewarch and O’Brien 1981). We also want to make a distinction with “aggregate analysis” of groups of artifacts which is usually contrasted to the analysis of individual artifacts (see papers in Hall and Larson 2004), as well as to distinguish the notion of aggregates from the use of this word to describe geological layers and grouped sedimentary packages (Marean 2014; Smith et al. 2018).

Below, we develop the notion of aggregates in relation to three assumptions that are implicit in the concept of an assemblage. These assumptions have not been challenged to the same extent as the assumptions around contemporaneity, integrity, and post-depositional effects. They are as follows: (1) an assemblage is a cumulative outcome of deliberate human actions over time; (2) the inventory of the assemblage retains structure and composition according to an underlying and long-term cultural-normative, functional, or adaptive system; and (3) an assemblage can be discovered and identified as an inherently meaningful unit during archaeological fieldwork.

Assemblages and Cumulative Production

Stone artifact studies in the nineteenth century used replication to demonstrate that stone objects found in prehistoric sites were not produced naturally but were of anthropogenic origin (Evans 1897; Johnson 1978; Van Riper 1993). If the form of artifacts was found to be produced by humans, it followed that the assemblage of artifacts, that is, its structure and composition, was a by-product of deliberate human actions. With on-the-ground proximity of artifacts, this reasoning resulted in perceiving assemblages and their inventories as cumulative outcomes of production, transport, use, and discard over a short or long term (Inizan et al. 1999). This perception often implies that, even if accumulated over a long term, these groups represent artifacts of shared life history and, therefore, they are inherently meaningful collectives (Binford 1981b).

In contrast to the assemblage, the notion of an aggregate emphasizes the possibility that artifacts found together in one place may have arrived there as the result of different life histories (Shott and Sillitoe 2005). Artifacts functioned together at particular times and places but arguably such an association cannot be presumed from a group of artifacts found together in a deposit. Selection, flaking, use, and transport events took place through a number of different spatiotemporal associations, and processes of erosion and sedimentation acted on artifact visibility and distribution, making them differentially available for new use events (Bailey 2007; Davies et al. 2016, 2018; Dibble et al. 2017; Foley 1981; Gould 1980; Holdaway and Davies 2019; Holdaway and Douglass 2015; Stern 1994; Waters and Kuehn 1996). Like flakes found and reused today by Maale in southwest Ethiopia during their forays, or spolia—stones taken from Roman ruins and used in construction during the later times—many stone artifacts in the deeper past have been re-associated into different collectives based on their availability and/or utilities in the changing social and environmental settings (Carr and Bradbury 2018; MacCalman and Grobbelaar 1965; McDonald 1991; Whyte 2014; see also Weedman Arthur 2018). During the history of their transpiring associations, stone artifacts can acquire a variety of attributes. Their current archaeological context, or provenience, is just one of these attributes (Dunnell 1992: p. 34). Therefore, just as the direct relationship between deliberate human action and the form of many individual stone artifacts has been contested (Davidson and Noble 1993; Dibble 1987; Dibble et al. 2017; Hiscock 2004; Shott 1989), the notion of aggregates brings the same skepticism when it comes to the structure and composition of assemblages of artifacts.

Assemblages as Long-term Cultural-Normative Units or Models of Adaptation

The assumption of assemblages as cumulative outcomes of associated human actions has influenced methods of archaeological classification and systematics. If an assemblage from Late Pleistocene deposits in Southwest France, for instance, is composed of truncated-faceted pieces, Kombewa elements, and small-sized Levallois cores, it would be classified as Asinipodian (Bordes 1975; Dibble and McPherron 2006), while an assemblage containing side-scrapers, cores reduced with a recurrent Levallois method, and low amounts of stepped retouch, would be referred to phase 7 of the regional Middle Paleolithic (Guibert et al. 2008; Morin et al. 2014). Units such as these serve as a means of organizing groups of artifacts and denoting their inventories to enable comparisons. Thus, we know what Tabun-D Levantine Mousterian signifies in empirical terms: blades and elongated points derived from Levallois and non-Levallois unipolar convergent and bi-polar cores, etc. (Bar-Yosef 1998). However, when we a priori assume that these empirical units have an ideational meaning beyond their inventory, such as that they directly reflect particular socio-cultural norms or a model of adaptation, at least two related issues arise (Shea 2014; see also Dunnell 1986; O’Brien and Lyman 2002; Ramenofsky and Steffen 1998). First, such conflation of the empirical with the ideational perpetuates the view that an assemblage, throughout the time of its formation, retained technological and morphological composition in accordance with that ideational meaning. Second, once derived, this meaning is usually reserved only for the assemblages with those specific inventories. Thus, and leaving aside potential disagreements over exact industrial definitions, the ideational Clovis (if it did exist with the meaning of a culture and/or adaptation) (Bradley et al. 2010; Goebel et al. 2008; Haynes 2002), for instance, must at all times have contained a fluted spear point. An assemblage without such a point could not be considered to represent ideational Clovis. Nor could ideational Stillbay exist without bifacial foliate points (Goodwin and Van Riet Lowe 1929; Henshilwood 2012) or ideational Nubian without so-called Nubian cores or points (Goder-Goldberger et al. 2016; Guichard and Guichard 1965; Olszewski et al. 2010; Van Peer 2001), and so forth. Even within a single denoting unit (commonly referred to as “techno-complex” or “industry” or “facies” [see Clark et al. 1966]), if assemblages classified into that unit differ in relative proportions of the same techno-morphological attributes (e.g., one Tabun-C assemblage has more unidirectional than centripetal cores, while another has more centripetal cores than unidirectional), then these differences are often taken to represent the complexity within an ideational meaning (most often in terms of demography) within that unit. The problem with this approach is that, in order to be viable, it has to implicitly assert that hominins of a particular socio-cultural group discarded their various kinds of stone objects in exactly the same proportions at every place they used in the landscape.

The same issues occur when assemblages are interpreted in functional terms as models of adaptation over the long term. Here, assemblage structure and composition are thought of as leftovers from an iterated strategy, a “problem-solving process” (Nelson 1991: p. 58) of mobility, technology, and subsistence (Holdaway and Wandsnider 2006). Functional assemblage interpretations occur when complexity and curation (Andrefsky 2009; Binford 1979) in the forms of artifacts are equated with ethnographically derived models of site function (residential base, short-term camp, etc.) and mobility (Bamforth 1988; Binford 1979; Parry and Kelly 1987; see Holdaway and Davies 2019). One example comes from the interpretation of landscape use during the Middle Stone Age in the Western Desert near Abydos in Egypt. Here, artifact assemblage composition was approached as a direct indicator of adaptation, with a settlement system characterized as “circulating” (Chiotti et al. 2007; Olszewski et al. 2010). As Binford (1978) showed many years ago, however, what is found at one location need not relate to the primary function for which people congregated in one place (see also Hayden 1979). Artifact deposition is probabilistic in the sense that more curated (sensu Shott and Sillitoe 2005) items have a lower chance of being discarded than those that are less curated (Douglass et al. 2018; Holdaway et al. 2004; Shott 1989). An object does not get deposited as often if its use-life exceeds the duration of an occupation (Ammerman and Feldman 1974). Thus, items that were more utilized or important to past people may, in fact, have a higher likelihood of being absent from sites (Schiffer 1975b; Schlanger 1990; Surovell 2012; Varien and Potter 1997). If, in contrast, groups of stone artifacts are considered as part of a perpetual transformation (an aggregate), formed through the process of archaeological sampling (see below), these groups would be difficult to interpret as normative or functional packages.
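The probabilistic character of discard described above can be illustrated with a minimal sketch. It assumes that each item in service is replaced as soon as it wears out and that wear is constant, so that the expected number of discards per item during an occupation is simply the occupation length divided by the use-life. The tool names, use-lives, and occupation length are hypothetical values chosen for illustration and are not taken from any of the cited studies.

```python
def expected_discards(use_life_days, occupation_days):
    """Expected number of worn-out items left behind per item kept in service.

    Assumes constant wear and immediate replacement (an illustrative
    simplification, not a model from the cited literature).
    """
    return occupation_days / use_life_days


# Hypothetical use-lives in days; one item of each kind is in service.
use_life_days = {"expedient flake": 1, "scraper": 10, "hafted point": 60}
occupation_days = 5

discards = {
    name: expected_discards(life, occupation_days)
    for name, life in use_life_days.items()
}
total = sum(discards.values())

# Share of each kind among the expected discards at this location.
discard_share = {name: count / total for name, count in discards.items()}

for name, share in discard_share.items():
    print(f"{name}: {share:.1%} of expected discards")
```

Under these assumptions, the hafted point makes up a third of the toolkit in use but under 2% of the expected discards, which is the sense in which heavily used items may be the least likely to appear in an aggregate.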

In contrast to assemblages, the notion of aggregates makes it clear that one cannot rely on forms (the composition of techno-morphological attributes) of groups of artifacts to create ideational units. All the examples discussed above, Asinipodian, Clovis, Stillbay, etc., as well as all other known industries or techno-complexes, should be understood primarily as units of description. If the Aterian, as a descriptive unit, for instance, is defined by the presence of stemmed pieces (Caton-Thompson 1946; Reygasse 1921), then there is no Aterian without stemmed pieces (Dibble et al. 2013). Searching for other markers of this unit, even if this search does not rely on a single tool type, would presume the existence of the Aterian as a real cultural or functional meaning. Changing the technological definition of this descriptive unit does not in and of itself make that unit any more real in the past. The unit would not be defined better, just differently. Unquestionably, there are regularities in distributions of artifacts of particular techno-morphological attributes in time and space, and these regularities are invaluable for endeavors such as relative estimates of the age of deposits that have no other geochronological information. These broad-scale patterns may also provide insights for documenting trajectories in the changing capabilities of hominin stone knapping (Ambrose 2001; Perreault et al. 2013; Shea 2013; Stout 2011; Stout et al. 2011). Shared ideational meaning, however, cannot be presumed to reside in shared artifact forms and assemblage compositions. We come back to this discussion later, but approaching the record through the notion of aggregates makes it harder to conflate empirical groups of stone artifacts and their description with ideational units and interpretation, and implies that such conflation does not inform us about the past groups and their adaptation in the ways assumed.

Assemblages as Discovered Collections

The concepts of assemblage and site encompass the implicit idea that assemblages and sites are naturally occurring phenomena and exist independently of archaeological reasoning. This is based on a perspective that archaeologists’ perception of these concepts has nothing to do with how assemblages and sites are physically defined (see Schlanger 1992). Whether or not based on geomorphological and/or topographical features such as caves, terraces, and beds, or quantitative categories such as high-density scatters, assemblages and sites defined by these physical spaces are very much approached as having an intrinsic boundary. Dunnell (1992: p. 34), however, argued that the “archaeological record is more or less continuous distribution of artifacts on or near the surface of the planet, not a collection of sites waiting to be found.” Foley (1981) additionally observed that because record formation is a never-ending process, all sites are defined by the act of our own observation at a particular point in the present time. The observable unevenness in continuous distribution and visibility of stone artifacts has the consequent effect of pushing archaeologists to obtain assemblages at places where there is something to sample (e.g., the targeting of appreciable artifact densities). This sampling relies on aspects of the specific geomorphology of certain places. As a result, we convert the continuous distribution of artifacts into discrete artifact depositional associations. In open-air contexts, this is often dictated by the extent of deflation or erosion. In sediment traps, such as caves and rock-shelters, the sampling may rely on interfaces (contacts) of geological layers.

However, the definition of these layers and artifact associations is a matter of scale of observation. From a micromorphology perspective, for example, there are geological interfaces that are not observable from our usual macro-scale observation (Goldberg and Berna 2010). In Pech de l’Azé IV in Southwest France, for instance, the predominant coarse sediment fraction throughout the sequence contains small amounts of moisture-retaining particles of clay and silt, making the sediments in this cave less prone to freeze-thaw effects. This obscures all but the coarsest signatures of different sedimentation regimes when excavating and looking at the profiles (Turq et al. 2011). Furthermore, certain behaviors can result in post-depositional processes that have impacts transcending these stratigraphic boundaries. Recent experiments on the use of fire (e.g., Aldeias et al. 2016; March et al. 2014; Sievers and Wadley 2008) indicate that combustion features have considerable downstream impacts on underlying deposits. In this regard, it becomes clear that our grouping of stone artifacts based on strata and surface visibility is based on those aspects of geomorphology that are visible in the present day, but may not reflect boundaries that are more relevant for behavioral analysis.

In addition, once discovered, or rather defined, by an archaeologist, an artifact’s current depositional association may not be final. Artifacts’ use-lives continue long after their recovery during surveys and excavation, as they circulate within museums and research labs (Harris et al. 2019; Lucas 2012). The current Middle Paleolithic Combe-Grenal museum collection, for example, showed substantial differences between the stored assemblage inventories and descriptions of those same assemblages made during Bordes’ excavations (Dibble et al. 2009). Clearly then, just like sites, assemblages are not entities that relate solely to human action in the past. This so-called sample bias is a well-known phenomenon in the study of the paleontological record (Peres 2010), but is rarely acknowledged and investigated in the archaeology of presedentary societies. The physical and temporal boundaries of assemblages and sites are defined by us due to how we sample and not by the past makers and users of those objects and those places.

To sum up, in contrast to the concept of assemblages, the notion of aggregates aims to make apparent that any group of stone artifacts, however defined empirically, is not a natural collection that is discovered. Instead, it is an empirical unit that is defined based on adopted research and fieldwork design. Since the stone artifact record is in a state of perpetual formation, this means that the structure and composition of groups of stone artifacts cannot represent some distorted past reality, but instead allow inferences to be made about formation itself (Bailey 1981, 1983, 2007; Binford 1981a; Foley 1981; Holdaway and Davies 2019; Holdaway and Wandsnider 2006, 2008; Hull 2005; Lucas 2012; Murray 1999). If aggregates of stone artifacts are not inherent normative or functional packages, but are accumulations of artifacts that have been made visible due to sedimentation processes and affected by how we construct our samples, the question that follows is one of their interpretations.

Formational Emergence

Most archaeologists today are aware that the time and space dimensions underpinning the ontology of the stone artifact record far exceed the scale of human behavior that we can observe ethnographically. Because of this awareness, a common approach to groups of stone artifacts now invokes the time-averaging concept and the “palimpsest” label (Bailey 2007; Barton and Riel-Salvatore 2014; Bunn et al. 2010; Lucas 2012; Malinsky-Buller et al. 2011; see papers in Mallol and Hernández 2016; Stern 1994; Vaquero et al. 2012; Wandsnider 1992). But, because of the lack of clearly stated or explored connotations of time-averaging and palimpsests in specific case studies, invoking them casually and without further explanation can incite major misdirection for interpreting the measured properties. In order to explain what we think is a more appropriate interpretative platform for aggregates of stone artifacts, we first briefly raise some issues with time-averaging and palimpsests as used commonly in stone artifact studies.

Time-Averaging and Palimpsests

“Time-averaging” was borrowed from invertebrate paleontology where it was defined as “… the process by which events that happened at different times appear to be synchronous in the geological record” (Kowalewski 1996: p. 318). It can also be defined as “... the process by which organic remains from different time-intervals come to be preserved together” (Kidwell and Behrensmeyer 1993: p. 4; see also Behrensmeyer and Schindel 1983). The application of this concept to groups of asynchronously deposited stone artifacts thus seems most reasonable. In interpretation, however, the use of the term “time-averaging” in our field often results in a misleading understanding that a group of stone artifacts represents a typical behavior or strategy (in stone tool making, mobility, place use, etc.) over the time it took for the group to accumulate. Understanding the measured properties of an assemblage as reflecting a central tendency in behavior a priori precludes us from thinking about those properties as an emergent effect of variability in behavior and natural agents. This was noted by Stern (2008: p. 135; see also Stern 1993, 1994) who argued that groups of stone artifacts cannot be an average representation of the behavioral events that led to their formation, because objects were likely not deposited in proportions matching the frequency and the durations of those events. She also noted that assemblage properties do not reflect an average over time because it is unlikely that the structure of the record at any place remained the same over the time span of its formation or that the only alteration to the record that occurred over time was the addition of new materials (Stern 1994: pp. 101–102). Implicit in this perspective is that the rate of accumulation of artifacts may differ from the rates of accumulation of other components of the record (such as fauna and sediments) in the same archaeological deposit (Stern 1993, 1994, 2008). This means that interpretation of observed patterns in stone artifact aggregates should be inseparable from understanding the formation of their deposits.

Unfortunately, the use of the “palimpsest” label by itself does not avoid these misleading understandings either when it comes to interpretation. Although this term characterizes groups of stone artifacts as long-term or multiple-occupation accumulations, it does little to help in conceptually interpreting the observed patterns. The formative nature of the archaeological record at any place falls somewhere along the continuum between a “true palimpsest,” where all traces of previous occupation were removed during the deposition of new materials (the “processes of erasure” by Lucas 2012), and a “true stratigraphy,” where all depositional events are sealed by subsequent sedimentation and so are preserved and visible (the “processes of inscription” [Lucas 2012]). Anything less than “true stratigraphy,” therefore, results in overprinting and thus the creation of palimpsests at some level (Bailey 2007; Binford 1981a). In addition, the palimpsest label is often used to emphasize low temporal resolution or disturbance/distortion of meaningful archaeological patterning. This negative emphasis carries a tacit implication that palimpsests need to be disentangled to be of any inferential value and that there exists an ideal type of deposit on which inferences should be built. The result is the unfortunate preference for sites with presumably short-term events or occupations (with both “short-term” and “event”/“occupation” lacking unambiguous definitions) (Anderson and Burke 2008; see papers in Cascalheira and Picin 2020; Machado et al. 2013; Malinsky-Buller et al. 2011; see papers in Mallol and Hernández 2016; Romagnoli et al. 2018; Vaquero 2008).

However, even single-event episodes of occupation and sealed deposits such as burials, caches, hearth features, and occupation floors are temporal palimpsests in their own right (Bailey 2007; Binford 1981a; Lucas 2012). A scatter of tools abandoned by a Hadza hunter at a particular moment in the present reflects a palimpsest of events that include the manufacture, use, movement, discard, and deposition of those tools through time and social space (Ingold 1993; Olivier 1999). The iconic Roman city of Pompeii is often presented as the epitome of a sealed event in the past. It is true that many objects were “frozen in time” at Pompeii, an event denoted by the well-known reverse casts of individuals sealed in the falling volcanic ash. However, the behavioral record that is found at that site is not a single day in 79 C.E. Rather, the materials from Pompeii represent an accumulation of materials by multiple agents that differ in their spatiotemporal scale (e.g., from plate tectonics triggering the eruption to taphonomy of human bodies buried in ash) (Murray 1999). The patterns in the record apparent at Pompeii are not directly reflective of human behavior at the moment of the volcanic eruption, but they emerge as an outcome of the life histories of human individuals, the objects and spaces that they used, the social groups they were members of, and the natural processes that make these accumulations visible to us today. This is the notion of formational emergence.

Formational Emergence

The notion of formational emergence refers to some extent to the concept of emergence as it is used in the theory of complex systems within the natural and social sciences and in philosophy (e.g., see papers in Clayton and Davies 2008; Holland 1998; Johnson 2001; Kim 2006; Van Gulick 2001; for applications in archaeology, see Bentley and Maschner 2008; Kohler 2012; Ur 2014). It refers to the fact that complex systems (both physical and adaptive) often exhibit patterns that are not exhibited by their parts themselves but emerge as compound effects of variously scaled interactions between their parts (Goldstein 1999; Holland 2014: p. 38). Classic examples that could be used as analogies include the emergence of snowflake patterns or of v-shapes in flocks of birds. In these particular examples, the parts are not only the ice crystals and birds, respectively, but also the variations in their situational contexts comprised of temperature, atmospheric pressure, aspects of the wind, etc. Our contention here is that, similar to these examples, the variability in the artifact composition of aggregates, and the properties measured therein, do not derive from an average or a sum of related actions of human individuals. Instead, these emerge through the interaction of a variety of anthropogenic and natural processes operating over different times and spaces (see Hodder 2012; Latour 2005; Martin 2013).

In their agent-based simulation, Davies et al. (2016) assessed the relationship between the effects of erosion and deposition on the archaeological record of Western New South Wales, Australia (see also Holdaway et al. 2017). Specifically, they modeled how these processes affect the temporal distribution of chronometric ages obtained from hearth features. They demonstrated that gaps in that chronology do not necessarily represent changes in human demography, occupation intensity, or strategies of landscape use (which are common interpretations of gaps in archaeological chronologies). Rather, these gaps in the record can be produced as emergent phenomena through an interaction between sedimentation and erosion which affects the preservation of hearths, the distribution of these hearths, and their composition and visibility. As absolute dating is related to these formational processes, the sequence of absolute dates reflects this history of formation and not necessarily any behavioral pattern. These hearth features themselves were products of human behavior, but the emergent patterns of this record are a compound outcome of both anthropogenic and natural processes.
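The logic of this argument can be illustrated with a toy simulation. The sketch below is not the model of Davies et al. (2016). It is a minimal illustration under assumed, hypothetical parameters: hearths are built at a constant rate (so behavior never changes), and randomly timed erosion events each strip away the sediment, and the hearths, that accumulated over a preceding window of years. The function names and all parameter values are ours and are chosen only for illustration.

```python
import random


def simulate_hearth_record(n_years=2000, hearths_per_year=1,
                           erosion_prob=0.005, erosion_window=150, seed=1):
    """Return (surviving construction years, years of erosion events).

    Assumptions (hypothetical): constant hearth construction, and each
    erosion event destroys every hearth built in the last
    `erosion_window` years, including the year of the event itself.
    """
    rng = random.Random(seed)
    surviving = []       # construction years of hearths still preserved
    erosion_years = []   # years in which an erosion event occurred
    for year in range(n_years):
        # Behavior is constant: the same number of hearths every year.
        surviving.extend([year] * hearths_per_year)
        # Natural process: occasional erosion removes recent deposits.
        if rng.random() < erosion_prob:
            erosion_years.append(year)
            surviving = [y for y in surviving if y <= year - erosion_window]
    return surviving, erosion_years


def longest_gap(surviving_years, n_years):
    """Length of the longest run of years with no surviving hearth."""
    present = set(surviving_years)
    longest = current = 0
    for year in range(n_years):
        if year in present:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


surviving, erosion_years = simulate_hearth_record()
print("hearths built:", 2000, "hearths surviving:", len(surviving))
print("erosion events:", len(erosion_years))
print("longest gap in the surviving chronology (years):",
      longest_gap(surviving, 2000))
```

In this sketch, any gap in the surviving sequence of hearth ages is produced by erosion alone, since the rate of hearth construction never varies. A gap of this kind would therefore be a formational pattern, not a demographic or behavioral one.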

Davies et al. (2016) provided one of the first formal models of formational emergence, but implicit examples abound in the recent literature (see also Davies et al. 2018 for a stone artifact example). Turq et al. (2013), for instance, showed for the Middle Paleolithic in Southwest France that production and use of stone objects was segmented in temporal, spatial, and social domains, such that artifacts constantly formed new associations. Their analysis of stone artifact provenance and refitting indicates complex recycling behaviors that imply that the emergent patterns in the record at several open-air and rock-shelter locations could have resulted from a variety of independent production, transport, re-use, and discard events over the landscape (see also Gravina and Discamps 2015; Lin 2018).

A similar landscape-scale example of formational emergence comes from the study of tool-use behavior among wild chimpanzees in the Taï National Park in Ivory Coast. Luncz et al. (2016) detected a distance-decay effect in that record, where the weight of hammerstones used for nut-cracking decreased with increasing distance from raw material sources over a range of more than 2 km. In the early hominin record, this distance-decay pattern is often thought to reflect higher planning abilities, a cognitive capacity to mitigate the risk coming from the particular distribution of resources, and greater knowledge of the relationship between the physical properties of stone and its potential extractive use (Blumenschine et al. 2008; Braun et al. 2008; Potts 1994; Stout et al. 2010). In the record of this group of wild chimpanzees, however, Luncz and colleagues demonstrated that such a pattern is a net effect of many multi-directed short-term and short-distance hammerstone transport and re-use actions by different chimpanzee individuals. There were no deliberate, direct movements of these objects from the raw material sources to the location where they can now be recovered. Insofar as the record of living chimpanzees can be used as a reference for interpreting early hominin behavior, inferences about hominins planning long-distance transport could be an overinterpretation of an emergent pattern in their record that was neither under the control of those hominin tool-users nor produced by their behavior alone (Brantingham 2003; Davies et al. 2018; Pop 2015).

Another example of formational emergence is from the Neanderthal record in Southwest France. Figure 1 (Rezek et al. 2018) shows twenty-six aggregates of stone artifacts that were sampled from three locations: the caves of Roc de Marsal and Pech de l’Azé IV, and the rock-shelter of Combe-Capelle Bas, with ages ranging from about 95 to about 45 ka (Dibble et al. 2018; Dibble and Lenoir 1995; Goldberg et al. 2012; Guérin et al. 2012; McPherron et al. 2012; Richter et al. 2013; Turq et al. 2011; Valladas et al. 2003). These aggregates represent circulation and accumulation of stone artifacts through the stone-use behaviors and geomorphological processes that took place in this landscape over time.

Fig. 1
Emergent pattern in the record of Neanderthals in Southwest France from 95 until 45 ka. The aggregates are from Roc de Marsal, layers 2, 3, 4, 5, 6, 7, 8, and 9; Combe-Capelle Bas, layers I-2B, I-2A, I-1E, I-1D1, I-1D, II-4A, II-4B, II-4C, and II-4E; and Pech de l’Azé IV, layers 8, 6B, 6A, 5B, 5A, 4C, 4A, 3B, and 3A. The pattern indicates a relationship between movement of stone, re-use, and geometry of stone artifacts. This emergent pattern should be approached as primarily formational and not as directly behavioral (The relative surface area of flakes was calculated for all complete and unmodified flakes as flake length × flake width/squared flake thickness. The relative amount of retouched artifacts was calculated as the ratio between the amount of modified (resharpened or retouched) artifacts (complete blanks and fragments with platform) and the amount of unmodified flakes (complete blanks and fragments with platform). The absolute difference between the amounts of observed and estimated cortex for each aggregate was calculated as the absolute value of (1 – cortex ratio). (DFFit = 0.78; DFBeta “relative amount of retouched artifacts” orig. − 0.106, min. − 0.13, max. − 0.085; DFBeta “difference between the amounts of observed and estimated cortex” orig. − 0.158, min. − 0.179, max. − 0.135; Cook’s distance 0.17; leverage 0.24 (threshold 0.31); VIF 1.27; “relative amount of retouched artifacts” estimate = − 0.105, SE = 0.052, t = − 2.019, p = 0.055, lower CI = − 0.214, upper CI = 0.003; “difference between the amounts of observed and estimated cortex” estimate = − 0.158, SE = 0.052, t = − 3.029, p = 0.006, lower CI = − 0.266, upper CI = − 0.05; adjusted R-squared 0.47; F = 11.94, df = 2,23, p < 0.001; R-squared for “relative amount of retouched artifacts” 0.15, R-squared for “difference between the amounts of observed and estimated cortex” 0.29). Data are available in Rezek et al. (2018))
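The three measurements defined in the caption of Fig. 1 can be written out as a short computation. The following is a minimal sketch that follows the caption's definitions. The function names, the input layout, and the example values are hypothetical and are not taken from the dataset of Rezek et al. (2018). Taking the median of the flake values per aggregate is an assumption based on the description of this measurement in the main text.

```python
from statistics import median


def relative_surface_area(length, width, thickness):
    """Flake length x flake width / squared flake thickness."""
    return (length * width) / thickness ** 2


def aggregate_measures(flakes, n_modified, n_unmodified, cortex_ratio):
    """Compute the three aggregate-level measurements.

    flakes       -- (length, width, thickness) tuples for complete,
                    unmodified flakes, all in the same unit
    n_modified   -- count of resharpened or retouched artifacts
    n_unmodified -- count of unmodified flakes
    cortex_ratio -- observed cortex divided by estimated cortex
    """
    areas = [relative_surface_area(*flake) for flake in flakes]
    return {
        "median_relative_surface_area": median(areas),
        "relative_retouched": n_modified / n_unmodified,
        "cortex_difference": abs(1 - cortex_ratio),
    }


# Hypothetical aggregate, for illustration only.
example = aggregate_measures(
    flakes=[(40, 30, 10), (50, 20, 5), (30, 30, 10)],
    n_modified=20,
    n_unmodified=80,
    cortex_ratio=0.6,
)
print(example)
```

One row of such values per aggregate would then serve as input to a regression of the kind summarized in the caption.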

In this region, within this 50,000-year period, the association between the three measurements (the median surface area of flakes relative to their thickness, the relative number of retouched artifacts, and the absolute difference between the amounts of observed and estimated cortex) is patterned. This pattern could indicate that Neanderthals of this region had a long-term strategy of intense movement, re-use, and maintenance of thick flakes (Delagnes and Rendu 2011; Hiscock et al. 2009; Rolland and Dibble 1990; Turq 1992). However, a clear distinction first has to be made between the emergent pattern that we model based on our measurements and the behaviors that we try to infer (see Gifford-Gonzalez 1991). This means that, instead of immediately regarding the emergent properties in archaeological deposits as direct and exclusive results of behavior, it is crucial to approach this emergent property as, first and foremost, formational rather than behavioral (Binford 1980; Davies et al. 2016; Holdaway and Davies 2019; Holdaway and Fanning 2008; Kirkby and Kirkby 1976; Waters and Kuehn 1996). This requires us to consider ways of integrating the geomorphological history of the same deposits (and the overall landscape) with a variety of potential factors of record formation and with our measurements of that record prior to the development of any behavioral interpretation. One factor of record formation, for example, could be that over time Neanderthals targeted already existing stone artifact accumulations to acquire stone. They may have stopped visiting such locations at times of increased sedimentation, when the visibility of stone artifacts was becoming low. Micromorphological analysis could provide us with insights about past erosion and sedimentation rates that could then be used in agent-based modeling of how changes in geomorphology may have acted on the distribution of artifacts locally and regionally in the past, making them visible at certain times and places to potentially be selected. Ultimately, the interpretations of behavior behind the emergent pattern would be evaluated based on their probabilities and confidence intervals derived by considering various kinds of factors of formation and geological agents, rather than based directly and solely on the emergent (measured) properties of aggregates used. Again, it is not behavior itself that is emergent in the record, but the patterns and variability in measurements of that record.

Emergent patterns and variability are also a result of sampling, so the pattern in Fig. 1 remains to be tested with other aggregates from Southwest France dating to this time range. Since these patterns emerge as complex outcomes from the intersection of past behaviors, geomorphology, and sampling, it could be that the spatiotemporal scale of formational emergence needed to be sampled to enable us to study the role of stone in the past lifeways depends on ecological and geomorphological histories of a particular landscape. This counters the common assumption that only deposits of a fine temporal resolution can be vantage points for studying the relationship between stone use and adaptation. It is, however, possible that many interesting emergent patterns would not be detected if, instead of longer-term aggregates, we focused on idealistic, single occupation surfaces. To understand the complexity of stone-use behavior within human-environment interactions requires repeated observations of the past on different temporal and spatial scales, both small and big. The formation of the archaeological record with its emergent properties provides this opportunity (Bailey 2008; Binford 1981a). As aggregates of stone artifacts represent a multitude of differently scaled contexts, we have, in effect, unrestricted access to multiple places over many times (Ascher 1968; Bailey 1983; Hull 2005; Murray 1997, 1999).

Embracing the dual notions of aggregates and formational emergence implies that our efforts should not be directed at attempts to isolate behavior by correcting for post-depositional effects (i.e., stripping away of “transformations” [Schiffer 1972, 1987]). Rather, to follow Binford (1981a), it may be more productive to focus on finding ways of inferring behavior from the nature of the data, which are a product of both behavior and the constant formation of archaeological deposits themselves. This perspective is certainly reminiscent of some of the arguments raised in the classic “behavioral archaeology” debates (Binford 1981a; Schiffer 1975a, 1976, 1985, 1987; Shott 1998). However, we feel that we have not been particularly successful in implementing the lessons learned from these debates and that these lessons require re-consideration given the increasing understanding of the complexity of record formation. But modeling this formation by incorporating natural processes into the interpretation of emergent properties is only one step towards inferring evolutionarily significant behavior from the stone artifact record. We also need to shift our major focus of analysis towards modeling the practice of stone use. By this practice, we mean the combination and interrelationship of selecting, flaking, transporting, reusing, maintaining, discarding, etc., inferred from emergent properties at different scales of time and space.

A Focus on Practice

The conceptualization of how stone artifacts accumulate and how patterns and variability emerge as outlined above incites the need to rethink the connection between the current appearance of the stone artifact record and behavioral evolution. The evolutionary significance of a behavior relates to how it affords adaptation to the selective pressures of the local socio-natural environment and to naturally or humanly induced changes in those pressures (see Bettinger 2009; Laland et al. 2015). In this respect, aggregates and formational emergence call for a change in the focus of analysis. This change is from the conventional approach that is more interested in manufacture and the current techno-morphological form of artifacts and assemblages to a focus on the practice of stone use. In this section, we first briefly discuss some of the major drawbacks that focusing on form and its manufacture has for studying past adaptation. We then make a case for the focus on stone-use practice.

Conventional Focus

Evolutionary studies of stone artifacts have taken two distinct but somewhat complementary directions. One of these is that different forms of artifacts represent different peoples, a long-standing framework in archaeology and implicitly related to cultural transmission and dual inheritance theory (Boyd and Richerson 1985; Cavalli-Sforza and Feldman 1981; Dunnell 1980; Eerkens and Lipo 2005; Henrich 2015; McElreath and Henrich 2009; O’Brien and Lyman 2003; Shennan 2008, 2011; Tëmkin and Eldredge 2007). This framework uses the presence and frequencies of morphological and technological attributes of stone artifacts as extended phenotypes of socio-cultural groups and reflections of population history (e.g., Goder-Goldberger et al. 2016; Groucutt et al. 2019; Lycett and von Cramon-Taubadel 2015; Scerri et al. 2014; Schillinger et al. 2017; Tostevin 2012). A different direction, and one more relevant here because we are interested in ways stone artifacts can help us to infer past lifeways (rather than units of demography), is related to human behavioral ecology and optimal foraging theory (Odling-Smee et al. 2003; Smith 1983; Winterhalder and Smith 1981). In stone artifact studies, this direction has rested largely on arguments of economic efficiency, which are used to interpret “technological organization” (Bamforth 1986; Beck et al. 2002; Binford 1979, 1980; Carr and Bradbury 2011; Elston 1990; Elston and Brantingham 2012; see papers in Goodale and Andrefsky 2015; Kuhn 1994; Nelson 1991; Odell 1996; Torrence 1983, 1989). Stone artifacts are seen as mechanisms to optimally negotiate environmental and social challenges, with their use and manufacture considered to be acting directly on the inclusive fitness of individuals. Because of this, artifacts and assemblages are often perceived as representing adaptation in and of itself.

What needs to be underlined is that the foundation of both of these directions (i.e., artifacts as discrete social groups and artifacts as adaptation) identifies the techno-morphological form of stone artifacts and assemblages and their manufacture as the unit of analysis. By “form” we mean artifact morphologies, dimensions, reduction sequences, nominal techniques of their manufacture (Levallois, discoidal, bifacial, etc.), and the structure and composition of techno-morphological attributes in artifact assemblages. As a result, the focus on form and its manufacture has become the major avenue for investigating behavioral complexity and behavioral evolution (Ambrose 2001; Foley and Lahr 2003; Muller et al. 2017; Perreault et al. 2013; Shea 2013). The theoretical justification of using form as a variable for measuring behavioral complexity and evolution, however, is rarely made explicit (Andersson et al. 2014; Holdaway and Douglass 2012; Shea 2011, 2014). Instead, certain artifact forms are assumed to represent universal and direct currencies of an efficient economy. One such example is elongated blanks, which have more working edge per mass of raw material (Bar-Yosef and Kuhn 1999; Braun 2005; Eren and Lycett 2012; Muller and Clarkson 2016). Implicit in these investigations is that certain values in the measured unit of analysis (more edge per unit mass) translate to efficiency in the use of stone resources across different times and places (but see Eren et al. 2008; Iovita 2014; Lin et al. 2013; Rezek et al. 2018). Similar assumptions are related to small flake production (Dibble and McPherron 2006), manufacture of microliths (Elston and Brantingham 2012; see papers in Elston and Kuhn 2002), and bipolar flaking (e.g., Pargeter and Eren 2017). The obvious issue with such assumptions is that they inadvertently strip artifacts of their local socio-natural contexts (Holdaway and Douglass 2012). This issue became particularly apparent in the debates about tool size efficiency (Kuhn 1994; Morrow 1996) and group mobility based on differently maintained or curated artifacts (Binford 1979; Nelson 1991; Parry and Kelly 1987). In short, these debates highlighted difficulties in interpretation and showed how the same values in the analyzed aspect of form can relate to both higher and lower tool efficiency (in the first debate) and more intensive and less intensive mobility (in the second debate).

Nonetheless, the analysis of form has more recently come to include more than discrete nominal types, techniques, and their frequencies. Indeed, an overemphasis on form also exists with quantitative methods such as geometric morphometrics and high-level statistical procedures, as well as with holistic analysis of attributes on the assemblage level. By replacing discrete categories with clusters of similarity in continuous variables of shape and dimension, the application of these quantitative methods provides a computable and refined means for (re-)defining descriptive units of classification. The application of these quantitative methods by itself, however, does not overcome the theoretical issue of ascribing a direct meaning to those units. Even highly detailed analyses of form contextualized within an understanding of artifact manufacture may ultimately result in linking the outputs of such analyses directly to cultural traditions or adaptations. A more potent inquiry into the role of stone use in ancient lifeways requires that we concentrate more on understanding how the record formed rather than on quantifying how artifacts and assemblages look and how they were manufactured. There were undoubtedly many places where, during certain times, manufacture may have had little or no role in record formation. Together with geomorphic processes, the record there may have formed primarily due to selection and transport (Hiscock 2004; Holdaway and Davies 2019; Holdaway and Douglass 2012; Sahle et al. 2012).

A Focus on Practice

Figure 1 is an example of a focus on practice. As described already, it features the interrelationship of flake relative surface area, relative number of retouched items, and difference in cortex as an emergent property of the Neanderthal record at particular temporal and spatial scales. The units of measurement used for modeling this interrelation are grounded in some of the fundamental principles of fracture mechanics, mathematics, and solid geometry. The means for implementing these fundamental principles into the interpretation of the emergent properties in the record can be provided by controlled experimentation (Eren et al. 2016; Lin et al. 2018; Rezek et al. 2016; Carr and Bradbury 2010) and exploratory agent-based modeling and simulations (e.g., Clevis et al. 2006; Davies et al. 2016, 2018; O’Sullivan and Perry 2013; Premo 2007, 2010; Wainwright 1994). Combined, these approaches lead to the development of:

(1) Units of measurement on a common scale between aggregates of different geographical and temporal contexts

(2) Formal models that can serve as a basis for inferring the possible proximate causation behind the values of those aggregate measurements

(3) Integration of models of possible proximate causation that can lead to the assessment of a range of factors, and their interactions, behind the emergence of patterns and variability in those measurements

We highlight the cortex ratio (Dibble et al. 2005) as an example of these units and models. Experimenting with how principles of solid geometry relate to the amount of cortical surface of stone can produce (1) a universal unit of measurement (cortex ratio) that can measure differences between observed and estimated cortex in sampled aggregates. It also provides (2) a formal model that allows us to investigate the possible proximate (immediate) link between these measured differences and behaviors such as carrying stone in and out of samples during the formation (Dibble et al. 2005; Douglass et al. 2008; Lin et al. 2015). The mechanism of formation of any emergent property that forms at a temporal scale of thousands of years is necessarily complex. Therefore, the interpretation of patterning in cortex ratio from different archaeological samples and under varied natural contexts should benefit from (3) integration with models and simulations of other behavioral (e.g., selection and discard), as well as natural, processes of formation. This allows assessment of the intersection or interaction of a range of factors that could have led to the emergence of the cortex ratio measurement in the sampled aggregates. The assessment of the intersection of different factors can reveal the likelihood of each factor as contributing to the emergent cortex ratio (Davies et al. 2018; Ditchfield et al. 2014). As another example, models of possible proximate causation developed through experimentation on how certain fundamental principles of fracture mechanics relate to stone raw material and edge angle can be integrated to assess the intersection between tool use and trampling behind emergent patterns in stone artifact edge modifications (e.g., McPherron et al. 2014). Similar reliance on fundamental principles in generating units of measurements has been advocated also for interpreting emergent properties of the faunal record (Blumenschine et al. 1994; Domínguez-Rodrigo 2008; Marean and Assefa 1999).
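To make the role of solid geometry concrete, a cortex ratio can be sketched as observed cortical surface area divided by the cortical surface area expected from the volume of stone in a sample. The sketch below is only an illustration and should not be read as the published procedure of Dibble et al. (2005). It assumes, for simplicity, that the original nodules were spheres of a known average volume, and the function names and inputs are hypothetical.

```python
import math


def sphere_surface_from_volume(volume):
    """Surface area of a sphere with the given volume."""
    radius = (3 * volume / (4 * math.pi)) ** (1 / 3)
    return 4 * math.pi * radius ** 2


def cortex_ratio(observed_cortex_area, total_volume, nodule_volume):
    """Observed cortex divided by the cortex expected for the sample.

    Assumption (hypothetical): every nodule was a sphere of volume
    `nodule_volume`, so the sample's total stone volume implies
    total_volume / nodule_volume nodules, each fully covered in cortex.
    """
    n_nodules = total_volume / nodule_volume
    expected_cortex_area = n_nodules * sphere_surface_from_volume(nodule_volume)
    return observed_cortex_area / expected_cortex_area


# Hypothetical sample: stone equivalent to two nodules of 500 cm^3 each,
# with 450 cm^2 of cortex observed on the artifacts.
ratio = cortex_ratio(observed_cortex_area=450.0,
                     total_volume=1000.0,
                     nodule_volume=500.0)
print(round(ratio, 3), round(abs(1 - ratio), 3))
```

Under these assumptions, a value near 1 is consistent with all products of flaking remaining in the sample, while departures from 1 are consistent with stone having been carried in or out. As argued in the text, such departures can also arise from other behavioral and natural factors of formation.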

There are three important points that need to be underlined when applying models of proximate causation to infer human actions behind archaeological data. First, as explained in the “Formational Emergence” section, we have to find ways to integrate these models with models that would account for geomorphology of both the individual deposits and the regional landscape from where these aggregates were sampled. The emergent property in Fig. 1 could be the outcome of the interrelationship between flaking, maintenance, and transport of stone, but we cannot presume that hominin actions alone account for the emergent patterns in archaeological deposits that we observe. Of special importance in this regard are formal models that can recreate coupled processes of artifact discard and sediment movement (Davies et al. 2016; Kirkby and Kirkby 1976; Waters and Kuehn 1996). The aim in integrating sedimentological processes into the analysis is not to “strip away” the transformation of the record to reveal a more “pristine” behavioral signature. Rather, we emphasize the integration of geomorphological processes with behavior as these are intertwined in the formation of the record itself (Binford 1981a). In some instances, the geomorphological context is so interrelated with the behaviors that those behaviors are hardly understandable independently of the depositional settings and the formational history of the record (Foley et al. 2017; Holdaway et al. 2012).

Second, a proximate causation is just one possible factor. It cannot be used to infer behavior directly from the emergent properties of archaeological data as these emerge from the intersection of behavior, geomorphology, and the local ecological context. Davies and colleagues (2018), for instance, provide an example of how modeled cortex ratio values can be influenced by the natural distribution of stone raw materials. Aspects related to transport of stone (like import and export of cortex and non-cortical volume) and the natural distribution of stone should be treated as separate but interacting factors. As such, an investigation into stone movement using the cortex ratio model (Dibble et al. 2005; Douglass et al. 2008) would be more accurate if done together with modeling of the natural distribution of stone. This complexity in the mechanisms of formation suggests that we develop more advanced methods for integrating different models of proximate causation to account for a more exhaustive list of behavioral and non-behavioral factors behind record formation.

Third, the models of possible proximate causation do not provide answers about the meaning of that causation. The causation is, in the theory of the model, only proximate to the measurement. The exterior platform angle vs. platform depth model (Lin et al. 2013) can inform us about ways of manipulating core platform variables and about whether flaking was such that it produced high relative amounts of working edge (this production being the proximate causation behind low platform depth and high exterior platform angle values). However, this model cannot tell us if the production of high amounts of working edge was related to economical and efficient use of stone as a resource (Rezek et al. 2018). As argued many times already (e.g., Carr and Bradbury 2001; Chatters 1987; Holdaway and Douglass 2012), the production of thick flakes, which represents relatively low amounts of working edge to raw material volume, can also represent economization and efficient use of stone, just in a different way and where this is the optimal solution to particular constraints. Again, we need to find and contextualize ways of integrating different models to start an investigation of the intersection of a range of behaviors and natural processes involved in creating the observed variability.

Therefore, we emphasize that there is no singular model of possible (behavioral) proximate causation that can serve as an instrument for a direct interpretation of emergent properties in the archaeological record. Even more critically, no single model can be used to arrive at inferences about adaptation. Instead, in interpreting emergent properties, we would use and add different data to move back and forth between this interpretation and different models of possible proximate causations. Then on another level, by situating this interpretation within the data and models on subsistence, paleoenvironment, and (theoretical) social background, we would move back and forth between inferences about adaptation and emergent properties. These alternating movements simultaneously allow (re)assessment of the complexity of the intersection between different processes that formed the record and (re)evaluation of stone use within the local ecological contingencies. Such a hermeneutic procedure (Hodder 1991, 1992: pp. 183–215; Johnsen and Olsen 1992; see Vella 2000: p. 30) resting on quantitative modeling would generate and regenerate likelihoods and confidence intervals used for abductive reasoning (McKaughan 2008; Niiniluoto 1999; Walton 2005) in explaining formational emergence. On the second level, this procedure would create contextual grounds for inferences related to past lifeways.

The intersection between different kinds of processes involved in formational emergence is where we can get insights into the practice of stone use. By building inferences about adaptation in the complexity of embeddedness of this practice within environmental and social background, the basis for the kind of evolutionary understanding we seek to make is not identical to the theoretical foundations that involve individual models of possible proximate causations for the measured data. Following Wylie (1985, 1989), this separation greatly strengthens the ensuing inferences, because they can be supported and validated (or not) by quite different sets of theories (see also Bettinger 1987; Binford 1987; Gifford-Gonzalez 1989, 1991; Wylie 1988). This also makes a clear distinction between the units of measurement and the inference, and prevents conflation of stone tool technology with adaptation. In our example from Fig. 1, the emergent pattern comes from three independent units of measurement. Once models of other behavioral and natural processes are integrated, inferences about adaptation will not rely on the presence or absence of the production of certain morphological or technological elements or on the classification of those aggregates based on their descriptive attributes. This alleviates the risk of making inferential leaps and circular inferences where the conclusions would be predetermined by our units of analysis. To study behavioral variability, complexity, and evolution, we call for further development of the methodology and theoretical platforms that would facilitate this focus on practice.

Implications of Aggregates, Formational Emergence, and the Focus on Practice for Stone Artifact Archaeology

There are direct and indirect implications of thinking about the groups of stone artifacts through the notion of aggregates, viewing the patterns and variability in the record as cases of formational emergence, and focusing on stone-use practice. We divide these implications into two groups: those that relate to the sampling of the record and those that affect the construction of knowledge about past adaptations, behavioral evolution, and culture.

Sampling the Stone Artifact Record

First, the notion of aggregates, as described here, is less prone to assumptions that groups of artifacts reflect the sum of activities performed over certain time periods within the boundaries of sites (Dunnell 1992; Wandsnider 1992). Both aggregates and formational emergence make it clear that patterns that we see at particular places are net effects of past users of stone from different times over a landscape scale, natural processes affecting their distribution and visibility, and our own sampling. This means that these emergent properties are not free of the ramifications of how the record is sampled. Freeman (1994), for example, reported that excavating the same layer in Cueva Morín in Cantabria in two separate excavation areas revealed different emergent patterns that could be related to the intensity of tool re-use. Similarly, in Contrebandiers Cave in Morocco, two aggregates, one from the middle area and another further back in the cave, revealed emergent properties in the same geological layer which potentially arose due to different actions of stone selection and movement (Dibble et al. 2012).

Examples like these demonstrate that the variability in the intersection between different behavioral and natural processes forming the record is independent of space-time dimensions captured by our excavation and survey units following geological and topographic features. They also imply that the rate of change in emergent properties is—not surprisingly—uneven through space and time (Bailey 1983, 1987; McGlade 1995, 1999; McGlade and Van der Leeuw 1997; Murray 1997, 1999; Olivier 2011). This also means that aggregates and sites are archaeological constructs with space and time boundaries definable in different ways. In order to investigate multiple scales of formational emergence, alternate sampling methods, independent of the current boundaries of visible geological and topographical features, should be employed (Dunnell 1992; Murray 1999, 2008). This is especially important since, as already said, the spatiotemporal scale of the record needed to be sampled for studying the role of stone in the past lifeways may depend on geomorphological and ecological histories of the particular landscape. Viewing groups of stone artifacts as aggregates and archaeological deposits through formational emergence potentially could stimulate our thinking about how flexibility in sampling strategies can be used to detect and study the variability of stone-use practice within different scales of ecological contexts.

Second, the notions of aggregates and formational emergence make us less inclined to discriminate locations on the basis of their temporal resolution, preservation, length of stratified sequence, and density of finds. For example, locations with high densities of artifacts where the historical integrity of deposits is described as not being compromised by mixture and disturbance are generally thought to have higher information potential and tend to be preferred for investigation (see papers in Gamble and Porr 2005). Yet, low-density artifact concentrations can emerge due to the intensive and recurrent use of these places as sources of raw material and stone artifact provisioning in the past when they contained more stone artifacts than they do today (Camilli and Ebert 1992; Dibble et al. 2017; Gravina and Discamps 2015; Isaac et al. 1981; Lin 2018; Schick 1987a). Exploration of such low-density places would itself be critical for reconstructing landscape use. Because all artifacts have aggregated as a consequence of a variety of factors, artifact accumulations of whatever density offer some information. The significance of that information depends on the research question. Furthermore, disturbance is not a discrete but a continuous variable, and virtually all deposits have gone through at least some degree of disturbance, if only because of the differential preservation of stone, faunal, and other kinds of remains (Schick 1986, 1987a, 1987b; Schiffer 1972). As Schick (1986: p. 88) pointed out, “since disturbance can occur to varying degrees, it would obviously be desirable to go beyond either stamping a site with a 'primary context' seal of approval or else relegating it to the hinterlands of 'secondary or derived context.'” If such relegations were made consistently and rigorously, given the antiquity of Pleistocene deposits and relatively high number of processes affecting those deposits, virtually no location with deposits from that epoch would be worth investigating (Binford 1981a; Gifford 1981; Stiner 1991). The notions of aggregates and emergence imply that all locations are significant and that, because of this, the existence of genuinely “key” or “flagship” sites (Gamble 1999: p. 68) that speak for the cultural or behavioral trajectories of the region is highly arguable at the very least (see for example the discussion around the use of the Tabun Cave sequence as a pan-Levantine model for the archaeology of the Middle and Late Pleistocene in the region [Culley et al. 2013; Hauck 2011; Hovers 2009; Marks and Rose 2014; Meignen 2011; Mustafa and Clark 2007; Shea 2014]).

Knowledge Construction

In using aggregates and formational emergence, as well as focusing on practice, our knowledge about past adaptations and cultural dynamics could be constructed in fundamentally different ways. We now discuss the implications for this knowledge construction, which, as a complex matter, warrants a multifaceted discussion. Below we divide this discussion into three parts, which address some of the shortcomings in traditional approaches in modeling past adaptations and cultures and provide certain new directions.

Deconstructing the “Knowns” in Past Behavior and Adaptation

Since the archaeological record itself does not come with explicit instructions for its interpretation, our knowledge of past human behavior, adaptation, and socio-cultural dynamics is constructed by applying external theories (from population dynamics, sociology, behavioral ecology, ethnography, etc.) to archaeological data. In addition, archaeologists use their daily experiences as primary analogues for making sense of archaeological remains (Holdaway and Wandsnider 2006; Plog 1974). All these sources of referential knowledge relate to behavioral phenomena that are believed, albeit most of the time implicitly, to be substantively uniform across time, space, and social contexts (Bailey 1983, 2008). This analogical approach in archaeological thinking (see Binford 1981a, 1987; Clarke 1968; Gifford-Gonzalez 1989, 1991; Wylie 1985, 1988, 1989) explains why we frequently conflate the definition of descriptive units with the interpretations these units were designed to make possible. Moreover, since these references for behavior and adaptation are rarely made on an experience, a model, or an analogy surpassing the lifetime of a generation, it is easy to understand the archaeologists’ preference for sites and assemblages of the finest temporal resolution.

But there is another problem with our (often unconscious) use of external references as substantively uniformitarian in archaeological interpretations. This problem is part of the epistemological dilemma traceable to the so-called Meno’s Paradox in Socratic dialogues that asks: how can we acquire new knowledge if the “knowns” are already predetermined by the imposed external references? The near universal use of a reference for hunter-gatherer land use based on the ethnographic work of Binford (1979, 1980) is perhaps the best example of this paradox in our field (even though Binford 1980 himself did not advocate his model to be applied universally). Admittedly, today, the two mobility and settlement strategies from this reference, forager/residential and collector/logistical, are applied more as two endpoints of a bi-directional continuum rather than as completely exclusive conditions (e.g., Barton and Riel-Salvatore 2014; Clark and Barton 2017; Close 2000; Perreault and Brantingham 2011). Because we depend on theories when interpreting the past, using this external model of settlement strategy as a theoretical heuristic is naturally acceptable. At the same time, however, by projecting it top-down onto archaeological data, there is a high risk of circular reasoning in trying to fit the data to these predetermined categories and labels simply because they are known (Murray 2001). This hampers recognition of other, unknown and multi-directional, continuums in past landscape use during the two and a half million years of hominin survival in numerous and different habitats and social contexts. The likely outcome is masking the range of potential factors of formation that are unfamiliar to us as researchers, because the conclusions are bound to be predetermined, or at least limited, by the preexisting theory.
This predetermination of inferred outcome is the major limitation of the epistemology offered by the Middle-Range Theory (Binford 1977, 1982), an inferential link between the record and the past living systems, when such theory relies on external references that are assumed to provide a direct analogy to past behavioral phenomena in a substantively uniformitarian sense (Kosso 1991; Raab and Goodyear 1998; Tschauner 1996).

We think that the solution to this problem requires two changes. The first is a change in perspective: instead of using categories defined from the form of the current material world to impose interpretations on the archaeological record, we think we need to take a ground-up perspective on the interplay between different processes of record formation leading to interpretation through the focus on practice as described in the previous section. The second change is to (1) use external references from a theory that is uniformitarian in a methodological (as opposed to substantive) sense (Bailey 1983, 2008; Gould 1965) and (2) apply such external references only for assessing the possible proximate causations behind the emergent properties in aggregates rather than for making the final inference about the use of stone in adaptation. These references are those that are built on fundamental principles of mathematics, physics, and chemistry, and, through experiments and agent-based modeling, they can generate models of proximate causation in the studied processes (see the section above). To make final inferences, as mentioned before, we would alternate between the interpretation of emergent properties in the record and different models of possible proximate causations to (re)assess their intersection in forming the record, and, based on this, (re)evaluate the practice of stone use within the modeled socio-environmental context. Unlike in the case when external references based on substantive uniformitarianism serve as sources of hypotheses, in the approach suggested here, where external references are based on fundamental physical principles, they serve as a means to assess the hypotheses and lower their likelihoods. 
This is done by updating the modeled distribution of probabilities of different scenarios each time we move back and forth between the interpretation of emergent patterns and different models of possible proximate causations, and, on another level, between inference related to a past lifeway and this interpretation. In this way, the references used (now based on methodologically-uniform phenomena) and the process of inference are intellectually independent from what we wish to investigate (Amick and Mauldin 1989; Binford 1981b; Dunnell 1971: pp. 50–51), which is something that constitutes the merit of scientific reasoning (Bettinger 1987; see also Hempel 1965; Kosso 2011).
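
The back-and-forth updating of scenario probabilities described above can be illustrated with a minimal sketch. It assumes a simple Bayesian scheme, which is one possible formalization and not necessarily the one intended here. The scenario names and likelihood values are hypothetical placeholders; in practice the likelihoods would come from experiments or agent-based models of proximate causation.

```python
def update_scenarios(prior, likelihoods):
    """Return updated scenario probabilities after one emergent pattern is assessed.

    prior       -- dict mapping scenario name to its current probability
    likelihoods -- dict mapping scenario name to the probability of the
                   observed pattern under that scenario (assumed to come from
                   an experiment or simulation; hypothetical values here)
    """
    unnormalized = {s: prior[s] * likelihoods[s] for s in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("the pattern is impossible under every scenario")
    return {s: w / total for s, w in unnormalized.items()}


# Hypothetical scenarios of record formation, initially equally probable.
prior = {"provisioning": 1 / 3, "recycling": 1 / 3, "fluvial_winnowing": 1 / 3}

# Hypothetical likelihoods of two successively assessed emergent patterns.
observations = [
    {"provisioning": 0.6, "recycling": 0.3, "fluvial_winnowing": 0.1},
    {"provisioning": 0.5, "recycling": 0.4, "fluvial_winnowing": 0.1},
]

posterior = prior
for pattern_likelihoods in observations:
    posterior = update_scenarios(posterior, pattern_likelihoods)

for scenario, probability in posterior.items():
    print(f"{scenario}: {probability:.3f}")
```

No scenario is accepted or rejected outright in this sketch. Each pass only redistributes probability among the candidates, which is the sense in which the external references lower likelihoods and do not supply the hypotheses themselves.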

To sum up this point, when the record is conceived as comprised of assemblages and when inference is built on external references based on substantive uniformitarianism, the effects of sampling the record can produce either arguments that are potentially tautological (because the “knowns” are already known) or outliers considered to be “errors” (because they deviate from the “knowns”) (see Murray 2001). In contrast, with the ground-up perspective inherent in the notions of aggregates and formational emergence, and with the focus on practice, the effects of sampling the record generate variability. Given that this approach allows the methodology of inference to be independent of the inference itself, the generated variability is used not only for assessing hypotheses, but also, at the same time, and as a probabilistic outcome, allows for exploratory discoveries about behavior in the past.

Inferring Past Ideational Meaning and Culture

We have already discussed how the manufacture and techno-morphological appearance of stone artifacts and assemblages have been the foundation for searching for and defining social units and cultural meanings. Qualitatively observed or quantified cases of spatial and temporal patterning in this form have been interpreted to reflect socially transmitted ways of flaking stone and making tools, with the implicit assertion that other cultural norms were shared as well. In this way, patterning in form is used both to search for those socio-cultural identities and, at the same time, to validate their existence, thus making form both the start- and the end-point of archaeological systematics. There are two issues related to this approach that necessitate a re-consideration of using form and its manufacture as a starting point in the search for ideational units. The first obvious issue is methodological: how to determine if the measured techno-morphological attributes were indeed socially transmitted and if their quantified variability indeed reflects this transmission? While application of neo-Darwinian models of trait transmission can help to explain this spatial and temporal patterning as a result of stabilizing processes due to evolutionary forces such as drift and selection (Brantingham and Perreault 2010; Collard and Shennan 2008; Eerkens et al. 2017; Eerkens and Lipo 2007; Jordan and Mace 2017; Lycett and von Cramon-Taubadel 2015; Mesoudi 2016; O’Brien and Lyman 2003), it has to be experimentally demonstrated (Morgan et al. 2015; Nonaka et al. 2010; Stout et al. 2014; see also Mesoudi and Whiten 2008; Tostevin 2012), rather than presumed, that the measured attributes and their values required high-fidelity social transmission (Corbey 2016; Tennie et al. 2017).

The second obvious issue is theoretical and well-known: the implicit belief that similarity in form equates to similarity in meaning. This is most apparent when we sample similar-looking artifacts (most notably bifaces, pointed shapes, Levallois products, microliths, cores, etc.), but it relates also to assemblages that share the same techno-morphological structure and composition. These groups of shared appearance are then used to define causes for the temporal and geographical variation in that appearance and most often the resulting interpretations hinge on the existence of distinct socio-cultural entities. But, according to the notion of formational emergence, we cannot ignore the possibility of natural processes contributing to the variability in emergent patterning in form within and between groups of artifacts, at least on some scale of time and space. Besides ignoring the possibility of equifinality in these cases, assuming that similar forms have similar meaning in effect denies the role of context (Martin 2013: pp. 112–122). The existence of the same meaning of any kind would entail that it had to be independent from local behavioral, social, and natural contingencies. Yet, the innumerable ways these contingencies could be intertwined (see Hodder 2012; Latour 2001, 2005) with hominin technological inventiveness and skills, demographic variables such as population size, age structure, and social networks affecting transmission, diffusion, and accumulation of techno-morphological traits, together with processes of selection and recycling of existing objects, could have produced differently scaled autocorrelation effects where aggregates that are near each other in time and space appear more similar in their form than those that are distant (see the First Law of Geography in Tobler 1970).
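
The autocorrelation effect mentioned above can be made concrete with a minimal sketch of Moran's I, a standard index of spatial autocorrelation. The sketch assumes binary neighbor weights within a distance threshold; the function name, the `max_dist` parameter, the coordinates, and the attribute values are all hypothetical and chosen only for illustration.

```python
import math


def morans_i(values, coords, max_dist):
    """Moran's I with binary weights: w_ij = 1 if locations i and j lie
    within max_dist of each other (i != j), else 0."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    cross_products = 0.0  # sum of w_ij * dev_i * dev_j over ordered pairs
    weight_sum = 0.0      # sum of w_ij over ordered pairs
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= max_dist:
                cross_products += deviations[i] * deviations[j]
                weight_sum += 1.0
    variance_sum = sum(d * d for d in deviations)
    if weight_sum == 0 or variance_sum == 0:
        raise ValueError("no neighboring pairs or no variation in values")
    return (n / weight_sum) * (cross_products / variance_sum)


# Four hypothetical aggregates spaced one unit apart along a transect.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]

# A formal attribute that changes gradually along the transect.
clustered = morans_i([1, 2, 3, 4], coords, max_dist=1.0)

# The same attribute alternating between neighboring aggregates.
alternating = morans_i([1, 0, 1, 0], coords, max_dist=1.0)

print(clustered, alternating)
```

A positive index indicates that nearby aggregates resemble each other more than distant ones, and a negative index indicates the opposite. Neither value on its own identifies the behavioral, demographic, or natural process that produced the pattern, which is the equifinality problem raised in the text.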

Indeed, the meaning of the stone artifact record does not reside in artifacts themselves nor in their form. In our opinion, focusing on practice of stone use and situating it within its context provides a ground-up path for inferring this meaning. This of course requires an integration of the connotations of formational emergence described here. Since formational processes, including behavior, operate and unfold over different spatial and temporal scales (Bailey 1983; Holdaway and Wandsnider 2006; Hull 2005; Wandsnider 1992), the inferred meaning is not independent of the scale of observation. In addition, when the meaning is inferred in cultural terms, we have to bear in mind that culture is a dynamic phenomenon that is produced and reproduced constantly and non-linearly (Bourdieu 1977; Geertz 1973), and, as Hull (2005: p. 356) put it, “any cultural ‘present’—be it archaeological or ethnographic—is but a moment amid such dynamism” (see also Gell 1992; McGlade 1999; McGlade and Van der Leeuw 1997). To the extent that culture is reflected and can be studied from the stone artifact record, the visibility, span, and detail of any such moment in the past are a composite outcome of changes in practice, technology (affected by a range of factors such as inventiveness, skill, and raw material) (Bettinger et al. 2006; Boone and Smith 1998; Fitzhugh and Trusler 2009; Kolodny et al. 2016; Ugan et al. 2003), demography (affecting transmission and diffusion of ideas) (Henrich 2004; Kuhn 2012; Premo 2012; Shennan 2001), processes of accumulation, sedimentation, and erosion (Davies et al. 2016, 2018; Miller-Atkins and Premo 2018), and, lastly, our sampling design and the precision of our dating methods to measure the rate of those changes (see Hull 2005). 
As explained, with the focus on practice, the increasing amount of available paleo-ecological data, and the principle of parsimony, we can employ mathematical simulations and agent-based modeling of social and natural processes to propose different kinds of meaning, not as unconditional scenarios but as probabilistic outcomes, as well as estimate them with ranges of uncertainty (see Lake 1996).

Investigating Behavioral Efficiency and Evolution

As implied by the focus on practice, after inference is made from the emergent properties of the record in integration with a variety of environmental variables, we can then assess the efficiency and significance that stone artifacts may have had for adaptation. This has certain negative implications for the emphasis on complexity in technological and morphological attributes in modeling hominin behavioral evolution (e.g., Ambrose 2001; Foley and Lahr 2003; Perreault et al. 2013; Shea 2013) where “formal” technological aspects and advanced tool forms would be used as universal measures of efficient and complex behaviors. It also vitiates the common routine of ascribing a (direct) adaptive meaning to emergent patterns and investigating those patterns with hypotheses that rest on a presumption of an efficient and strategic response (see Andersson et al. 2014; Stiner and Kuhn 2016). This is especially the case if this response is presumed in terms of asymmetrical evolution, with human groups always finding a way to behaviorally adapt to naturally induced environmental changes (but see niche construction in Fuentes 2016; Laland et al. 2015; Odling-Smee et al. 2003; Smith 2016), where every assemblage is considered to be a winnowed outcome of natural selection. Such a teleological view of stone artifact assemblages, in which their purpose was to allow or facilitate adaptation, has been extremely overexploited. Just because stone artifacts remain to study does not mean that they were the main conduit through which adaptations were achieved.
Ethnographic studies of contemporary foragers (e.g., Hayden 1979; Kelly 2013; Marlowe 2010; Weedman 2006; White and Thomas 1972) and the debates around prestige hunting, costly signaling, and the agency of non-individual social entities (e.g., Bliege Bird and Smith 2005; Codding and Jones 2010; McGuire and Hildebrandt 2005; Quinn 2015, 2019; Speth 2010) show that hunter-gatherer lifeways do not always comprise optimal stone use (Boone and Smith 1998), as well as that interaction with the environment and adaptation to it by means of technology are two separate and distinct phenomena. For the stone artifact record, it is without doubt that focusing on flaking and manufacture (on any or all of its aspects: complexity, esthetics, shaping, technological skill, edge-per-mass efficiency, microlithization, compositing, etc.) on its own cannot be the answer to assessing efficiency, complexity, and evolutionary significance of the use of stone (Andrefsky 2009; Bamforth and Bleed 1997; Binford 1979; Carr and Bradbury 2001; Chatters 1987; Holdaway and Douglass 2012; Kelly 1988; Kuhn 1995; Nelson 1991; Rezek et al. 2018; Shott 1986; Torrence 1983; White 1977). If we assume that within particular ecologies in the past hominin use of stone was indeed subject to selective pressures (see Blumenschine and Pobiner 2007; Dominguez-Rodrigo and Pickering 2017) and if selection on behavior did affect stone artifacts, aggregates and formational emergence compel us to consider the inferred practice of stone use rather than continue to characterize behavioral efficiency and adaptation based on the level of formal complexity of that record. Depending on the social context (Bliege Bird et al. 2016; Bliege Bird and Smith 2005; McGuire and Hildebrandt 2005; Weedman Arthur 2018) and the local environment (Bamforth 1988; Klein et al. 2007; Stiner and Munro 2011), this practice could have been a strategic reaction or a random response, and it could have been adaptive or not. 
Unquestionably, behavioral and cultural evolution is in the relationships between people, environments, and objects, not the objects themselves (Hodder 2018; Martin 2013; Olsen 2010). The approach towards inference building as outlined in this paper could potentially expose, in probabilistic terms, the randomness, efficiency, and adaptive importance of particular practice of stone use.

Conclusion

The increasing use of an array of advanced quantitative methods and techniques of analysis in stone artifact archaeology today may give an impression that, in terms of interpretation, the study of stone artifacts has progressed significantly from the culture-historical paradigm that dominated this field several decades ago. It is true that our methods of analysis have advanced considerably and that the analyses are now more holistic and take into account multiple aspects of the record. However, and despite the fact that the question of how stone artifacts accumulate has been raised continuously for more than half a century (Bailey 1983, 1987; Binford 1981a; Dibble et al. 2017; Foley 1981; Gould 1980; Holdaway and Wandsnider 2006; Isaac 1972, 1981; Lucas 2012; Murray 1997; Schiffer 1972, 1985), the problems are still in the perception of the ontology of the material under analysis and in the presumption that their shared form carries the same ideational meaning. This stands regardless of the scale of analysis and if this form is captured by using discrete types or by quantifying continuous variability. As a result, the classic disputes (e.g., style vs. function) about the source of many parts of this variability continue to be revived. Our discipline has hardly managed to move beyond such polemics, which we see as a symptom of a flawed understanding of how stone artifacts accumulate and how patterns emerge from those accumulations. Through either short-term functionalist interpretations or misinterpretations of the long-term time-averaging concept, archaeologists continue to seek normative central tendencies in stone object production and take the stone artifact record as a direct representation of behavior and adaptation. We advocate a different route in studying past adaptations from this record, one that keys into the variability in the practice of stone use over temporal and spatial scales. 
This is about studying behavior as multi-scalar and multi-dimensional dynamics rather than a composite of directional advances.

Having in mind the dimensions of time and space along which individual stone objects circulate forming the record (Ingold 1993; Lucas 2005; Murray 1997; Olsen 2010; Weedman Arthur 2018), the geological processes involved in this formation and in the visibility of artifacts (Davies et al. 2016; Davies and Holdaway 2017; Holdaway and Davies 2019), and effects of our sampling procedures, we believe that the understanding of the relationship between stone artifacts as outlined with the notion of aggregates should be taken as default. Together with formational emergence, this can change some of the persisting attitudes in approaching the record—such as that post-depositional formation processes are noise that has to be removed—and, like Binford (1981a) argued, offer a perspective on the record that is more potent for inferences of behavioral evolution (see also Bailey 2008). In this perspective, behavior is not emergent directly from the properties of the record but it has to be inferred from its formation. It is exactly these formational changes that the record goes through that allow us to infer the non-linearity (McGlade 1999; McGlade and Van der Leeuw 1997) of behavioral processes over time and their potential evolutionary significance. Following Binford (1981a), this perspective underscores the exceptionality of aggregates and the record of any time and place, rather than emphasizing their palimpsest nature to undervalue them with categories such as disturbance, mixture, and incompleteness. In other words, it views the record not as an impoverished material imprint reflecting the past, but as, to borrow from Freeman (1994), a kaleidoscope, where, just as different patterns appear with every rotation capturing different assortments of colored particles, new patterns and variability will emerge with shifts and flexibility in the sampling of its spatiotemporal dimensions.

Thus, this perspective builds on the long-term aspect of formational emergence. However, the intent is not to adopt an environmentally deterministic stance that would discredit the role of social agency in structuring the archaeological record (Shanks and Tilley 1987; see also Bailey 2008). As a solution to this particular problem, we would need to work on developing formal models that would account for social agency in record formation over long timescales and on finding ways of implementing these models into our inference building. When it comes to modeling past adaptations, the long-term aspect of formational emergence can be the true potential that the stone artifact record has (Bailey 2008; Murray 1997, 2002). Current approaches focusing on the form of stone artifact assemblages rather than on a long-term practice of stone use, in our opinion, fail to leverage this potential.

Furthermore, and in relation to this potential, investigating behavioral efficiency by focusing on practice situated within the local landscape could make it possible to endorse a different view in some of the long-standing questions in our field. Some of those questions concern the relationship between the apparent technological stasis during the Early Stone Age (Roche et al. 1999; Semaw 2000; Stout et al. 2010) and behavioral dynamics and flexibility of early groups of Homo; then, the extent of changes in adaptation potentially related to observable and variously scaled “transitions” in the form of the record (see papers in Hovers and Kuhn 2006; Monnier 2006); and perhaps an alternative to the “behavioral modernity” concept in studying behavioral similarities and differences between our and other hominin species (Henshilwood and Marean 2003; McBrearty and Brooks 2000; Shea 2011). In essence, focusing on practice could bring us closer to knowing the past cultures and adaptations as they were rather than as portrayed by a predefined socio-cultural label or an ethnographically created category. Unfortunately, the “tyranny of familiar things” (Plog 1974; Wobst 1978) has created a standpoint that past human behavior should mirror our contemporary one. As a result, if the data do not fit the known models of explanation, the problem is most often ascribed to the data and even to the record (its “integrity”), but not to the models, whereas, and as noted by Hodder (1992: p. 214), when the models of explanation make all data fit, then the data “make sense.” For too long, and without much-needed scrutiny, the explanatory references from social and evolutionary theories have been flowing into our field instead of being built within it (Murray 2002, 2008; see also Clark and Barton 1997).
With aggregates, formational emergence, and the focus on practice, we want to use stone artifacts as a medium for investigating past behavior that was potentially of evolutionary significance, rather than to take them as a direct measure of adaptation and behavioral evolution on the grounds of presumed substantively uniformitarian phenomena. We see this shift as an opportunity to overcome Meno’s Paradox in our field and to find out what we do not already know about (past) human conditions.

Centering on form to study behavior, adaptation, and culture has a considerable historical precedent (e.g., Commont 1909; Garrod and Bate 1937; Goodwin and Van Riet Lowe 1929). Looking beyond our field, perhaps this centering came as a consequence of prioritizing the physical form of the material world over action in the early modernist approaches to culture and society (Latour 1993; Martin 2013). The reasons may also be more practical: unlike analyzing the form of artifacts and the record, focusing on the practice of stone use is much less straightforward. Such a commitment requires experimentation, simulation, and mathematical modeling to develop the appropriate units of measurement and formal models that will account for both the behavioral and natural processes of record formation (which involves conducting another project in the lab and in silico, in addition to the one of archaeological data acquisition). This in turn requires more time, more funding, and a thorough deliberation upon employed experimental and simulation designs. It also requires much greater attention to the formation of individual deposits from which inferences are drawn, particularly to assess congruity between the scales and mechanisms of their formation and the parameters of social and ecological models that are potentially used in interpretation (Murray 2002). But at the same time, these requirements and their costs could increase collaborations and data sharing; incite further developments of standards in fieldwork, analysis, and data publication; and necessitate new initiatives related to experimental and simulation protocols and transparency of inferential procedure (e.g., registering the research design before obtaining the data in order to avoid malpractices such as p-hacking).
The rewards of being able to ask questions that to a large degree are different from those asked in current culture-historical and trait-transmission approaches, and to look beyond the form of the record for the behaviors that may have been truly affording adaptation, however, would greatly surpass these investments.