Introduction

Human culture, i.e., the entirety of human behavior and material items, changes over time. This change is fueled by different factors, such as genetically induced increased brain size and neural configuration (Bolhuis et al., 2014; Suzuki et al., 2018; Pinson et al., 2022), environmental change (Maier et al., 2021), population dynamics (Shennan, 2001), and not least the transmission mechanisms of cultural traits themselves (Eerkens & Lipo, 2007). The process of culture change exhibits many aspects that mark it as evolutionary in nature. First and foremost of these are the transmission and heritability of cultural traits (Cavalli-Sforza, 1986), the introduction of variation through copying errors and innovations (Perry et al., 2021), and the differential sorting of these traits in the next generation by either stochastic phenomena, such as drift, or selection (Neiman, 1995). It is a topic of debate to what extent human intention introduces a certain directedness in this process, and whether it is therefore Darwinian or Lamarckian in nature (Hodgson, 2001; Mesoudi et al., 2004), or whether it should be called evolution at all (Gould, 1978). It is beyond the scope of this contribution to repeat and weigh all the arguments, and it thus must suffice to state that we side with those who deem cultural evolution an appropriate term and a useful framework for analyzing and explaining the archaeological record (Boyd et al., 2013). Importantly, given the analytical and temporal scope of this paper (assemblage compositions throughout millennia), the individual only plays a minor role, and the processes under study are emergent properties at the level of groups, usually referred to as macroevolution (Prentiss et al., 2009).

The transmission of cultural traits may follow different algorithms, increasing, decreasing, or maintaining variation (Eerkens & Lipo, 2005). Selective sorting occurs when individuals or groups can choose between two or more variants with differential payoffs, while stochastic sorting occurs when differences in variants are uncorrelated with payoff. Distinguishing between stochastic and selective mechanisms as agents in the formation of the archaeological record is not trivial. Usually, the archaeologically observable patterns are compared with those produced by models of neutral stochastic cultural transmission. A good fit is considered indicative of stochastic processes, while stronger deviations should indicate selective processes (Steele et al., 2010). However, stochastic and selective processes may and likely do operate simultaneously, and approaches exist to estimate their respective influence (Brantingham & Perrault, 2010). Selective processes usually exhibit a certain directedness, resulting from selective pressure. Selective pressure occurs when there is variability in the ability to increase payoffs in the natural or social environment, usually acting against variants with lower payoffs. If, for instance, two variants of projectile morphology differ in their ability to create payoff (either in the form of being more efficient in hunting endeavors or being socially favored), the variant with the lower payoff value would decrease under selective pressure.

There are many examples from different periods and regions where studies conducted under cultural evolutionary theory have considerably advanced our understanding of past processes (e.g., Kandler & Shennan, 2015; O’Brien et al., 2016; Crema et al., 2016). The applicability of cultural evolution as an analytical and explanatory framework for the archaeological record, however, hinges on the data we can gather. Here, problems arise not only from the patchy nature of the record and its sometimes poor chronological control, especially for the material culture of the early periods. At least equally problematic is the way archaeological data is traditionally recorded. While it is indispensable to gather quantitative data that can be included in statistical analyses, most information on Paleolithic tool morphology is recorded in typologies. A fundamental question is thus how lithic tools can be analyzed in such a way that their traits can be meaningfully included in studies of cultural evolution.

Typologies, in general, are systematics for recording information and as such have been and still are a mainstay of Paleolithic research. They are established standard procedures for recording information on tool morphology and, thanks to their wide distribution, are helpful to exchange such information between researchers both nationally and internationally. By design, typologies are made for capturing the generic ideas of tools, such as “hand-axe,” “scraper,” or “borer,” rather than their individual properties. At least for functionally distinctive Upper Paleolithic tool types, use wear analyses have often confirmed that scrapers were indeed used for scraping and borers for boring (e.g., Sano, 2012). The fact that types subsume repeatedly re-occurring morphological traits that coincide with a specific use pattern, manufacture process, and/or life cycle indicates that tool types are not arbitrarily forced upon the archaeological record but—despite being undeniably etic categories—do indeed capture generic ideas which seem to have had some emic significance in the past (Maier, 2015). It is therefore explicitly not the intention of this paper to criticize typologically oriented research as such. To the contrary, typology is a permissible and valuable tool of Paleolithic research.

Having said this, it also needs to be acknowledged that typologies, like all other approaches, have their strengths and weaknesses and thus cannot be a one-size-fits-all solution. Instead, while a typological analysis is a powerful tool for answering some questions, it can be inappropriate for answering others. The quality of performance (i.e., recording archaeological information for subsequent analyses and answering specific questions) depends on the extent to which the logical design of a typology matches the demands of the analytical methods. As is often the case, what is a strength in some situations is a weakness in others. What enables typologies to capture generic ideas, for instance, prevents them from recording individual traits, because no systematics can be both flexible and rigorous at the same time. To illustrate this aspect, it is necessary to take a look at diversity and variability in the archaeological record and the systematics used to record them.

Two kinds of multiplicity are relevant when studying artifacts: diversity and variability. While diversity accounts for the number of units within a systematics, i.e., the number of object classes (e.g., points, scrapers, borers), variability accounts for multiplicity within a single object class (symmetric points, backed points with retouched base, etc.). Diversity and variability are thus not different in nature but describe multiplicity at different levels of analytical scale. Distinguishing between diversity and variability and recording both kinds of multiplicities separately is pivotal for research on cultural evolution, since they may follow independent trends or inversely correlate with the same external factor, e.g., temporal or geographic distance. It is possible, for instance, that declining population numbers correlate with a decrease in the diversity of objects but with an increase in their variability (Maier et al., 2021).
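The distinction between the two levels of multiplicity can be illustrated with a minimal sketch. It assumes that diversity is counted as the number of distinct classes and that variability is measured as the sample standard deviation of a single trait within each class; both choices, as well as the invented records, are illustrative assumptions and not the measures used in this study.

```python
from collections import defaultdict
from statistics import stdev

# Hypothetical records: (object class, length in mm). All values are invented.
records = [
    ("point", 30.0), ("point", 32.0), ("point", 34.0),
    ("scraper", 20.0), ("scraper", 20.0), ("scraper", 20.0),
]

def diversity(records):
    """Diversity: number of distinct object classes in the systematics."""
    return len({cls for cls, _ in records})

def variability(records):
    """Variability: spread of one trait within each class (sample standard deviation)."""
    by_class = defaultdict(list)
    for cls, value in records:
        by_class[cls].append(value)
    return {cls: stdev(values) for cls, values in by_class.items()}

n_classes = diversity(records)   # multiplicity between classes
spread = variability(records)    # multiplicity within each class
```

In this toy data set, the number of classes and the within-class spread are computed independently of one another, which is why the two measures can follow different trends.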

Generally, there are two kinds of systematics for recording information: classifications, aiming at capturing the individual object with all its specifics, and categorizations, aiming at capturing the generic idea of groups of objects, such as “hand-axe,” “scraper,” or “borer.” While classifications usually rely on monothetic algorithms to construct classes (each element must show all defining attributes of its class), categorizations usually rely on polythetic algorithms to construct categories (elements do not need to show all defining attributes of their category).
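The difference between the two membership rules can be sketched in a few lines. The attribute names and the threshold `min_shared` are hypothetical and serve only to illustrate the logic; real polythetic categories are rarely defined by an explicit numerical threshold.

```python
# Hypothetical attribute sets; the names are illustrative and not taken from any typology.

def is_member_monothetic(traits: set, defining: set) -> bool:
    """Monothetic rule: the object must show ALL defining attributes of the class."""
    return defining <= traits

def is_member_polythetic(traits: set, defining: set, min_shared: int) -> bool:
    """Polythetic rule: sharing at least `min_shared` defining attributes suffices.

    The explicit threshold is an assumption made for this sketch.
    """
    return len(traits & defining) >= min_shared

backed_point = {"pointed_end", "backed_edge", "retouched_base"}
artifact = {"pointed_end", "backed_edge"}  # the base is not retouched

mono = is_member_monothetic(artifact, backed_point)
poly = is_member_polythetic(artifact, backed_point, min_shared=2)
```

The same artifact is rejected by the monothetic rule but accepted by the polythetic one, which is the flexibility (and the loss of rigor) discussed in the text.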

Typologies (e.g., Spaulding, 1953; Sonneville-Bordes & Perrot, 1954, 1955, 1956a, b; Bordes, 1967; Isaac, 1977; Klejn, 1982; Debénath & Dibble, 1994) are categorical, polythetic recording systems. As such, they are highly flexible regarding the membership of objects within their categories. This has the advantage that smaller differences between individual artifacts can be neglected in favor of overarching similarities. Typological flexibility, however, comes at the expense of clearly defined boundaries between types and a neglect of individual morphological differences. A systematic and structured recording of diversity is thus problematic, since there are no explicit rules for delimitating types against one another. Even within the same typological system, delimitations between types often do not follow consistent criteria but are constituted by changing combinations of morphological, technological, and techno-functional attributes (Ioviţă, 2009). Thus, the recording of diversity is an unsystematic byproduct of typological sorting rather than an explicit methodical goal. For variability, the situation is even worse. Aiming at generic ideas, typologies treat artifact morphology approximatively and holistically. In-group variability, i.e., morphological variability within a type, is thus treated as statistical noise (instead of valuable information), consequently not recorded, and eventually not available for analyses. Typologies, therefore, capture morphological information only coarsely. Moreover, because types are generic, unchanging morphological categories, typologies are incapable of providing information to perform high-resolution analyses of change in artifact morphology through time. In addition, typological information is usually represented as a categorical (nominal) variable, and its statistical applicability is therefore limited. 
This limitation is all the more relevant because types are mostly based on shape information, whereas geometric morphometrics (see below) can measure shape differences directly. Many questions regarding cultural evolution are therefore out of reach for studies that rely on typological data recorded only as a categorical variable.

In contrast to typologies, monothetic classificatory approaches are rigid regarding group membership in classes. They do not aim at generic concepts but focus instead on individual properties of individual artifacts. Therefore, they are usually trait-based. Trait-based recordings have been commonplace in archaeological analyses of technology for quite a while (e.g., Auffermann et al., 1990; Tostevin, 2003; Nigst, 2012; Simon & Moreau, 2012). For lithic tools, the chronologically and chorologically most sensitive part of the lithic artifact spectrum, a strict trait-based systematics is not available to date. Attempts to make morphological information compatible with evolutionary approaches by creating trait-based recording systems for tools, such as Laplace’s “Typologie analytique” (Laplace, 1974) with its “typometry” approach (Plutniak, 2022), often led to hybrid systems, where information on technological aspects is recorded according to the different manifestations of specific traits (e.g., bulb present/absent), while morphological information is recorded typologically. Over the past two decades or so, a number of studies have addressed this issue. A common approach is using geometric morphometrics (e.g., Archer et al., 2016; Delpiano & Uthmeier, 2020; Leplongeon et al., 2020; Weiss, 2020). Analyses with geometric morphometrics often focus on shape, while leaving aside information on dimension and on the position and kind of retouch. However, a range of studies have also included such data (e.g., Ioviţă & McPherron, 2011; Morales et al., 2015; Weiss et al., 2018). In sum, the diversity and individual variability of lithic tools are a current topic of research. Most notably, O’Brien et al. (2014) have applied a trait-based recording system for Paleoindian lithic points in North America. 
A system that integrates information on shape, size, and retouch of artifacts and allows for a detailed analysis of small incremental change through time is still missing for the European record.

To overcome this limitation, we present PyREnArA (Python-R-Environment for Artifact Analysis), a half-automated, trait-based recording system for lithic tools. PyREnArA provides traditional and new information on artifact morphology, customized for the implemented statistical methods. These methods allow for quantitative analyses of morphological change and for statistically determining the amount of variation that correlates with the progress of time. In doing so, the system provides insight into material culture evolution far beyond traditional typology. However, as a completely independent analytical system, it is not intended to substitute typological research, but rather to complement it and to provide new perspectives on the archaeological record, which are otherwise invisible. We use photographs as well as drawings obtained from the literature. The latter are a treasure trove largely untouched by current research. Our approach is thus also an attempt to valorize this dormant potential and to make old, well-sketched artifact drawings relevant for new research.

Material and Methods

Material

Throughout the European Upper Paleolithic, a number of tool types, such as end-scrapers or burins, occur in virtually all assemblages. Here, they constitute the so-called fond commun (the common background) with little morphological variability. Points and other possible projectile implements, in contrast, show a comparatively high regional and chronological variability. As part of the hunting gear, they are also crucial in the acquisition of food. Among the tools, they are thus particularly suited for studies with a focus on the evolution of material culture.

For this study, we selected 12 assemblages from 8 sites in Lower Austria and Moravia dated to between 32 and 27 ka cal BP (Suppl. 1). The two neighboring areas are a hotspot for hunter-gatherer activities in Europe during this period. In traditional taxonomy, they are attributed to the Pavlovian, Willendorf-Kostenkian, or the Gravettian sensu lato. Judging from traditional typological analyses, the selected assemblages cover a period when diversity in lithic projectiles—as assessed from the number of types reported in the literature—changes from comparatively high levels between 32 and 29 ka to comparatively low levels between 29 and 27 ka (Maier et al., 2021). Thus, pronounced temporal trends in artifact morphology are expected. The chronological attribution of the assemblages is based on weighted averages of the available 14C-dates (Suppl. 1). Most of the recorded information has been obtained from drawings in the archaeological literature. For Grub/Kranawetberg, unpublished photos have been used (References in Suppl. 11).
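How a weighted average of several dates can be formed is sketched below. The text does not specify the weighting scheme, so the inverse-variance weights used here are an assumption, and the two dates are invented for illustration; the actual values are given in Suppl. 1.

```python
import math

def weighted_mean_age(ages, sigmas):
    """Inverse-variance weighted mean of dates and its standard error.

    Assumption: each date is weighted by 1 / sigma**2. The weighting actually
    used for the assemblages is not stated in the text.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    mean = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
    error = math.sqrt(1.0 / sum(weights))
    return mean, error

# Hypothetical dates for one assemblage (years cal BP) with their uncertainties.
ages = [30000.0, 30400.0]
sigmas = [100.0, 200.0]

mean_age, mean_error = weighted_mean_age(ages, sigmas)
```

Under this scheme, the more precise date pulls the average towards itself, so the result lies closer to 30,000 than to 30,400.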

Relying on photos and particularly drawings from different decades and publications with different conventions and skill levels might introduce errors to the data set. However, as long as these drawings have been made with the best possible compliance to the original artifact (something we generally assume for publications in our subject), minimal deviations and unsystematic errors are considered no more pronounced than those arising from measurements of artifact dimensions or other quantitative data. Testing for different performance in using drawings and photos, however, is also part of this study. For the selection of artifacts, their morphology as discernible from the drawings and photos has been the sole criterion. Only complete artifacts with a projectile-like morphology, i.e., at least one pointed end and a generally slender appearance, have been considered, irrespective of whether they are considered to be hafted frontally or laterally. After selecting these artifacts, their typological attribution as given in the respective publications has been recorded. In total, information on 267 individual artifacts attributed to 21 types has been recorded and analyzed (Table 1).

Table 1 The three shape clusters in relation to typology (left) and assemblages (right) sorted by chronological succession from old (bottom) to young (top). Types occurring in clusters are shaded in gray

The decision to incorporate only complete artifacts excludes a large part of the projectile spectrum. Also, fragmentation, depending on the fracture type, is often a sign of usage, indicating that these pieces have been selected for their properties after manufacturing. Complete pieces, in contrast, might have never been used for various reasons. Analyzing only complete pieces might thus introduce a bias to the data set. However, we consider this potential bias tolerable, since broken and complete pieces usually show highly similar morphologies. Ultimately, we consider the uncertainties arising from the incorporation of fragmented pieces more distorting than a potential bias introduced by incorporating complete pieces only, particularly given that the presented approach is new and experience with reading its results is limited.

Method

Before presenting the method, it should be stated that its individual parts are not new but well established in archaeological research. Rather, it is the internal logic of linking consecutive steps in a half-automated procedure that may be of value.

PyREnArA—Trait-Based Recordings of Tools to Capture Variability

The artifacts are recorded using the half-automated system PyREnArA (Python-R-Environment for Artifact Analysis; Suppl. 2), which has been developed over several semesters in seminars with students from the Friedrich-Alexander-University Erlangen-Nürnberg and the University of Cologne. The coding (RJ, FL, and GR) has also been part of student projects. PyREnArA can be accessed at https://doi.org/10.5281/zenodo.7778071 (John et al. 2023). In the first step, artifact drawings and photos are scanned and oriented according to a standard protocol to minimize measurement errors. At this point, the program extracts traditional as well as new traits of object morphology and size, such as dimensions, the geometric center, the outline-to-area ratio, angles, symmetry, the position of the longest extension (e.g., longitudinal or transversal), and others (Suppl. 2). While most recorded traits are geometric and metric properties of the objects, we also compute the fractal dimension (FD; Seidel, 2018) of the artifact outline using the box-counting method. The FD is a measure of the filling of the image, in our case the roughness of the outline, and is thus informative about the simplicity or complexity of the course of the edges. Since the FD is insensitive to scale (Viengkham et al., 2019), the influence of image resolution on the results is considered negligible. The individual traits are compared and displayed in a pairsplot showing Pearson correlation coefficients (McKinney, 2010; Suppl. 3). In a second step, the standard-oriented images are given as output files with lateral bars for each artifact. These bars are used by the researcher to color-code information on retouch. Subsequently, the thus-prepared images are uploaded again, and information on the length, position, and kind of retouch is extracted (Suppl. 2). 
Third, information on site location (in decimal degrees), dating (in ka cal BP), and the traditional typological categorization of the artifacts is recorded. As a result, each artifact is recorded with quantitative information on its chronological and spatial position as well as its individual morphological configuration. This means that the present variability in morphology and tool design is recorded at the level of individual artifacts, instead of the level of types. Finally, the obtained data can be analyzed using descriptive as well as multivariate statistics.
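The box-counting estimate of the fractal dimension mentioned above can be sketched as follows. This is not the PyREnArA implementation: the function names, the representation of the outline as integer pixel coordinates, the chosen box sizes, and the least-squares fit are all assumptions made for illustration.

```python
import math

def box_count(pixels, size):
    """Number of boxes with edge length `size` that contain at least one pixel."""
    return len({(x // size, y // size) for x, y in pixels})

def fractal_dimension(pixels, sizes=(1, 2, 4, 8, 16)):
    """Slope of log N(s) against log(1/s), fitted by ordinary least squares."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Two reference shapes with known dimension: a straight line and a filled square.
line = [(i, 0) for i in range(256)]
square = [(x, y) for x in range(64) for y in range(64)]

fd_line = fractal_dimension(line)
fd_square = fractal_dimension(square)
```

A straight outline yields a value of 1 and a completely filled area a value of 2; a rougher, more convoluted outline would be expected to fall between these two reference values.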

Shape Analysis—Assessing Diversity

The recorded contours are analyzed with regard to their shape. We use an elliptic Fourier analysis (EFA), which is a well-established procedure in geometric morphometrics (Claude, 2008; Bonhomme et al., 2014) (Suppl. 2.5) and has proven its potential for lithic analysis (e.g., Leplongeon et al., 2020; Matzig et al., 2021). The resulting outline EFA data are then submitted to a between-group PCA (Cardini & Polly, 2020). The PCA scores are used to distinguish an optimal number of clusters, determined by a post hoc analysis using the silhouette criterion, i.e., an internal cluster validity criterion (Schmidt et al., 2022), in a hierarchical agglomerative clustering algorithm (Murtagh & Legendre, 2014). Besides classical applications of geometric morphometrics, the classes separable by their shape are used in order to assess the diversity present within the dataset.
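The selection of the number of clusters by the silhouette criterion can be sketched as follows. The sketch is a naive pure-Python illustration and not the code used in this study: it assumes Ward's minimum-variance merging rule, uses invented two-dimensional points in place of the actual PCA scores, and all function names are hypothetical.

```python
import math

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares when clusters a and b are merged."""
    d2 = sum((x - y) ** 2 for x, y in zip(centroid(a), centroid(b)))
    return len(a) * len(b) / (len(a) + len(b)) * d2

def agglomerate(points, k):
    """Merge the cheapest pair of clusters until only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def mean_silhouette(clusters):
    """Average silhouette width over all points (singleton clusters score 0)."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            a = sum(math.dist(p, q) for q in cluster) / (len(cluster) - 1)
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Invented scores forming three compact groups (stand-ins for PCA scores).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 9), (5, 10), (6, 9), (6, 10)]

scores = {k: mean_silhouette(agglomerate(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
```

For these toy points, the mean silhouette width peaks at three clusters, because merging two groups or splitting one of them lowers the contrast between within-cluster and between-cluster distances.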

Analyzing Cultural Evolutionary Trends

A main goal of this study is to analyze possible cultural evolutionary trends in the data set. An evolutionary trend should be visible in traits that show a directed development over a longer period of time due to selective pressure. Stochastically changing traits, in contrast, indicate the absence of selective pressure and thus a development without evolutionary trends. In order to identify possible evolutionary trends, we perform a redundancy analysis (RDA; Legendre & Legendre, 2012; Roth, 2022) with the chronological position of the tools as a canonical variable (Suppl. 2.6). Of course, other criteria (e.g., the geographic location) can also be used as canonical variables to assess broader spatial trends.

Results

In the following section, we start by presenting statistical parameters on the relation between the analyzed parameters for an initial assessment. This is followed by the results of the shape analysis. Subsequently, we discuss the results of the redundancy analysis for the entire data set as well as two chronological subsets. Eventually, observations on the kind and position of retouch are presented.

Before reporting the results, it should be stressed that PyREnArA performs equally well with photos and drawings. This approach therefore allows the ample data available in artifact drawings to be used fruitfully in future research.

Pearson Correlation Coefficients

The Pearson correlation coefficients for the analyzed metrical variables give a first impression of the general structure of the data set (Suppl. 3). Only 10 out of 36 analyzed parameter combinations show correlation coefficients of more than 0.5, indicating that most parameters are rather uncorrelated. The ratio of positively and negatively correlated parameters is more or less balanced. The strongest correlation can be observed between the length of an artifact and the length of its contour (r = 1). Such a close correlation is not a necessity, since an increase in length coinciding with a reduction of width might leave the contour length unchanged. Evolutionary trends towards a lengthening of cutting edges are discussed as general evolutionary trajectories (e.g., Leroi-Gourhan, 1964; Tactikos, 2003) and may also occur in projectiles. The color-coded elements of each assemblage indicate, at least for the observable older part, a trend towards increasing contour length. Similar observations can be made for the contour length and the surface area of the artifacts. Interestingly, the fractal dimension does not correlate with any of the other parameters (all values range between 0.05 and −0.21), indicating its analytical value as an independent trait.
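The pairwise comparison behind these figures can be sketched as follows. The trait values are invented for illustration, and the function is a plain implementation of the Pearson coefficient rather than the routine used for the pairsplot. The sketch also shows that nine variables yield 36 pairwise combinations, which is consistent with the number of combinations reported above, assuming that the nine metric variables of the redundancy analysis are the ones compared.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

# Invented trait table (values in mm); the contour is exactly proportional to length.
traits = {
    "length": [20.0, 30.0, 40.0, 50.0],
    "contour_length": [44.0, 66.0, 88.0, 110.0],
    "width": [8.0, 6.0, 9.0, 7.0],
}

# One coefficient per unordered pair of traits.
r = {(a, b): pearson(traits[a], traits[b]) for a, b in combinations(traits, 2)}

# Number of unordered pairs among nine variables.
n_pairs = len(list(combinations(range(9), 2)))
```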

Shape Analysis

The analysis of the shape parameters suggests a separation of the lithic artifacts into three clusters, which capture 35% of the observable variation in outline shape (Fig. 1; Suppl. 4). Cluster 1 contains pieces which are relatively wide (located to the left of the x-axis) and a lateralization with backing preferentially on the left (located to the lower part of the y-axis). Cluster 2 comprises pieces which are relatively narrow (located to the right of the x-axis) and slender. At the same time, specimens in this cluster are more axisymmetric (small extent on y-axis) and—despite being more numerous—are less scattered and thus appear morphologically more standardized. Specimens in cluster 3 are again widely scattered, relatively large, and rather non-axisymmetric, but in contrast to cluster 1 exhibit a lateralization with backing only on the right side (left upper part of the graph). It is noteworthy that the younger specimens of clusters 2 and 3 are located towards the center of the plot, indicating a tendency towards less lateralized and more balanced forms.

Fig. 1

Results of the shape analysis. The data set is separated into three clusters. Cluster 1 (green dots) contains wider pieces with backing rather on the left side, while cluster 2 (orange dots) contains slenderer and rather axisymmetric pieces. Cluster 3 (purple dots) contains wide pieces with backing on the right side. Note the younger specimens of clusters 2 and 3 (indicated with a y) are located towards the center of the plot

Redundancy Analysis

For the redundancy analysis (RDA), we use the following ratio-scaled (metric) variables as dependent variables: length, width, length-width ratio, area, percent area, contour length, the y-offset between the geometrical center (MP) and the center of mass (CM) (MP-CM-y-offset), the corresponding x-offset (MP-CM-x-offset), and fractal dimension. These exhibit a reasonable correlation structure, as can be seen in the pairsplot (Suppl. 3). As the independent (canonical) variable of the RDA, we selected time as captured in the 14C-dates of the assemblages. The RDA was performed using the function “rda()” of the R package vegan (Oksanen et al., 2020) in R 4.2 (R Core Team, 2022).
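What the explanatory value of time means in such an analysis can be illustrated with a minimal sketch for the special case of a single canonical variable. It is not a reimplementation of vegan's “rda()”: it assumes that all traits are standardized before the analysis (the text does not state whether scaling was applied), and the data are invented. Under these assumptions, the share of variance related to time equals the mean squared correlation between time and each trait.

```python
import math

def standardize(column):
    """Center a column and scale it to unit sample variance."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def constrained_fraction(time, traits):
    """Share of total trait variance linearly related to one explanatory variable.

    Assumption: standardized traits, so each trait contributes a variance of 1
    and the explained part of each trait is its squared correlation with time.
    """
    t = standardize(time)
    n = len(t)
    explained = 0.0
    for column in traits:
        y = standardize(column)
        r = sum(a * b for a, b in zip(t, y)) / (n - 1)
        explained += r ** 2
    return explained / len(traits)

# Invented example: one trait changes linearly with time, the other not at all.
time = [32.0, 31.0, 30.0, 29.0, 28.0]   # ka cal BP
trended = [2.0, 2.5, 3.0, 3.5, 4.0]     # e.g., a length-width ratio
untrended = [1.0, -2.0, 2.0, -2.0, 1.0]

fraction = constrained_fraction(time, [trended, untrended])
```

Here, time accounts for half of the total variance: all of the variance of the first trait and none of the second.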

Entire Investigated Period

In a first step, we analyzed the whole dataset as well as all three clusters for the entire investigated period, before dividing them chronologically in a second step (see below). The permutation test of the RDA for the whole dataset (p = 0.057) is slightly above the threshold for significance, and only cluster 2 returned significant results (p = 0.011) (Suppl. 5, Tab. 5.1). However, at just 2.7%, the explanatory value of the factor time for the analyzed features is rather low (Suppl. 5, Tab. 5.2). Put simply, only a small share of the differences between the artifacts regarding length, width, fractal dimension, etc. can be related to the differences between the assemblage dates. Hence, morphological differences related to the chronological position of the assemblages exist, but other differences between the artifacts, not related to time, are much more pronounced. This is highlighted by the fact that around 47% of the observed variance in the entire dataset is explained by the first PCA axis (Suppl. 5, Tab. 5.2), which runs perpendicular to the RDA-axis and is thus unrelated to time. Figure 2 shows the RDA results for cluster 2 over the entire investigated period. The younger the projectiles, the lower their percent area values, indicating that the objects fill less of the bounding box by becoming pointier, and the lower the MP-CM-x-offset, indicating a decrease in asymmetric lateralization. In parallel, the projectiles show higher values for the length-width-ratio, thus becoming slenderer. To see if the comparably weak correlation of change with time is a result of trends and countertrends canceling each other out, we also analyzed the older (> 30 ka cal BP) and younger part (≤ 30 ka cal BP) of the assemblages separately.
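The logic of a permutation test for such a result can be sketched as follows. The sketch is a simplified stand-in for the test used in this study: it uses the mean squared correlation between time and the traits as test statistic (an assumption that is reasonable only for a single explanatory variable), 999 permutations, a fixed random seed, and invented data.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def explained_fraction(time, traits):
    """Mean squared correlation between time and each trait (test statistic)."""
    return sum(r_squared(time, column) for column in traits) / len(traits)

def permutation_p(time, traits, n_perm=999, seed=0):
    """Share of shuffled data sets that explain at least as much as the observed one."""
    rng = random.Random(seed)
    observed = explained_fraction(time, traits)
    shuffled = list(time)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between dates and morphology
        if explained_fraction(shuffled, traits) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Invented data: two traits that change steadily with time, plus small noise.
time = [32.0 - 0.4 * i for i in range(12)]
slenderness = [3.0 + 0.1 * i + (0.03 if i % 2 else -0.03) for i in range(12)]
offset = [1.5 - 0.08 * i + (0.02 if i % 3 == 0 else -0.01) for i in range(12)]

p_value = permutation_p(time, [slenderness, offset])
```

A small p-value states only that the link between morphology and time is unlikely to arise from a random assignment of dates. It says nothing about the size of the effect, which is why a significant result can coincide with a low explanatory value.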

Fig. 2

RDA of cluster 2 over the entire investigated period. X-axis: RDA, y-axis: first PCA-axis. Triplot shows 49% of total variance (see Suppl. 5, Tab. 5.2). Time is getting younger from left to right and explains 2.7% of the total variance. (Note that the x-axis of the plot can be mirror-inverted, but this is not changing the results, see Roth, 2022)

> 30 ka cal BP

In contrast to the entire data set, the projectiles older than 30 ka cal BP returned significant results in the ANOVAs for the evaluation of all clusters (1+2+3) (p = 0.001) and cluster 2 (p = 0.001), while cluster 1 (p = 0.14) and cluster 3 (p = 0.056) are above the threshold for significance (Suppl. 5, Tab. 5.1). For those clusters returning significant results, the explanatory value of the factor time for the analyzed features is considerably higher. For an evaluation of all clusters (1+2+3), it is 7.3%, and for cluster 2, it is 17% (Suppl. 5, Tab. 5.2). When all projectiles (clusters 1+2+3) are analyzed jointly (Fig. 3), the length-width index shows a reverse trend towards stouter forms in the younger pieces. However, this does not imply that pieces are getting shorter. On the contrary, they are getting both wider and longer. But since their width increases faster than their length, the overall shape trend is towards stouter forms. We also see a strong decrease in the MP-CM-y-offset, while the percent area only plays a minor role. This indicates a trend towards a balanced ratio between the upper and lower part of the projectiles and only a weak trend of pieces becoming pointier.

Fig. 3

RDA of all projectiles (clusters 1+2+3) > 30 ka cal BP. X-axis: RDA, y-axis: first PCA-axis. Triplot shows 51.7% of total variance (see Suppl. 5, Tab. 5.2). Time is getting younger from right to left and explains 7.3% of the total variance

Cluster 2 of the projectiles > 30 ka cal BP shows trends very similar to those of all clusters evaluated jointly (Fig. 4), with the difference that the explained variance is substantially higher. In other words, the overall trend described above is more pronounced in this subset. The most salient difference, however, is the decreased importance of change in the length-width index, leaving the tendency towards more balanced MP-CM offsets (i.e., less asymmetry) for both the x and the y axis as the most relevant trend for this data subset.

Fig. 4

RDA of cluster 2 > 30 ka cal BP. X-axis: RDA, y-axis: first PCA-axis. Triplot shows 48.9% of total variance (see Suppl. 5, Tab. 5.2). Time is getting younger from right to left and explains 17% of the total variance

≤ 30 ka cal BP

For the projectiles younger than 30 ka cal BP, the ANOVAs for cluster 1 (p = 0.02) and cluster 2 (p = 0.03) returned significant results (Suppl. 5, Tab. 5.1). The explanatory value of the factor time for the analyzed features is comparatively high for the former at 9.1%, but rather low for the latter at 3.2% (Suppl. 5, Tab. 5.2). The RDA shows that in cluster 1, the length-width index declines with time, showing the same trend to stouter forms as the joint evaluation of all clusters for the period > 30 ka cal BP (Fig. 5). In cluster 2, in contrast (Fig. 6), the factor percent area gains in importance, indicating a trend towards pointier forms. While the decreasing MP-CM-y-offset is now more strongly related to the progress of time, the length-width index does not show relevant trends.

Fig. 5

RDA of cluster 1 ≤ 30 ka cal BP. X-axis: RDA, y-axis: first PCA-axis. Triplot shows 51% of total variance (see Suppl. 5, Tab. 5.2). Time is getting younger from left to right and explains 9.1% of the total variance

Fig. 6

RDA of cluster 2 ≤ 30 ka cal BP. X-axis: RDA, y-axis: first PCA-axis. Triplot shows 49% of total variance (see Suppl. 5, Tab. 5.2). Time is getting younger from left to right and explains 3.2% of the total variance

Retouch

Regarding different aspects of retouch, there is no discernable chronological trend that would cover the entire investigated period. Indeed, most of the analyzed traits, such as retouch position (dorsal, ventral, or both faces), lateralization (left/right), location on the artifact (tip, base, medial), or the mean absolute length of retouched edges, behave stochastically in relation to chronology (Suppl. 6). However, the percentage values of lateral retouch, backing retouch, and the mean relative length of retouched edges show a trending behavior (Fig. 7). Since in the observed projectiles, lateral retouch covers the largest parts of the contour, both values are strongly coupled. Initially, between 32 and 29 ka cal BP, the amount of lateral retouch declines. At around 29 ka, this trend stops. Although the sharp peak (Milovice) certainly exaggerates the general trend, it seems that we might observe a rise in the frequency of lateral retouch followed by a stochastic phase. Backing retouch, on the other hand, shows a stochastic behavior before 29 ka but a declining trend thereafter.

Fig. 7

Percentage values of lateral retouch (blue), mean relative length of retouch (orange), and backing retouch (gray) from old (left) to young (right)

Discussion

The trait-based analysis of projectile points in Lower Austria and Moravia between 32 and 27 ka cal BP with PyREnArA gave a number of new results. These results provide a complementary view of traditional analysis of tools using typological information and—more importantly—open completely new ways of approaching the analysis of evolutionary trends in the development of material culture. Both aspects will be addressed in the following.

Typological and Trait-Based Recording of Tools

By design, typological and trait-based recording systems show fundamental differences and thus serve different epistemological goals. Therefore, results obtained by one of these systems cannot be falsified by the other. Rather, these systems ideally complement one another. To be sure, there is no approach that is per se better for lithic analysis in general, but selecting one or the other should depend on the question at hand. Research interested in a high-resolution analysis of morphological change through time is likely better off with a trait-based recording system because variability is captured in the first place, diversity is captured more systematically, and the recorded information is compatible with statistical analysis.

Typological analyses—not differentiating between diversity and variability—have subdivided the recorded set of projectiles into 21 traditional types (Table 1). When taken as a canonical variable, however, this typological attribution accounts only for 5.6% of the observed variances in the entire data set (Suppl. 7). The complementary view of our study suggests the existence of three morphological clusters on the level of tool diversity. These three clusters are separated on the basis of a joint, multivariate evaluation of their properties and cannot be reduced to simple morphological description. However, it can be stated that clusters 1 and 3 contain more non-axisymmetric and, in relation to cluster 2, rather stout pieces with complementary lateralization, while cluster 2 contains more axisymmetric pieces with, in relation to clusters 1 and 3, slenderer forms.

Regarding variability, it can be stated that this level of variance is strongest in cluster 1, less pronounced in cluster 2, and least pronounced in cluster 3. In other words, cluster 3 is morphologically most consistent, while cluster 1 comprises pieces which are morphologically more diverse (Suppl. 9).

Taking these three clusters, we see five groups of traditional types (Table 1): those related exclusively to cluster 1 (n = 3), cluster 2 (n = 5), or cluster 3 (n = 2), and those occurring in two (n = 7) or all three clusters (n = 4). It thus seems that half of the types do not reflect the main statistical criteria for their morphological distinction. This becomes even more important considering that the 4 types occurring in all clusters encompass 69% of all recorded objects (n = 183). Given that types exclusively attributed to a single cluster usually occur in small numbers, it can be assumed that with increasing numbers, the probability rises that types are attributed to more than one cluster. Moreover, only two assemblages comprise specimens exclusively attributed to a single cluster. This pattern does not seem to be linked to their geographic location. The rather low influence of the geographic location of the analyzed assemblages either in Lower Austria or Moravia is also reflected in the RDA with location as a canonical variable, where 5% of the observed variance is explained by the geographic position of the sites (Suppl. 8). This finding is unexpected since well-discernable spatial signals in the composition of lithic tool assemblages have been observed for Magdalenian assemblages in central Europe (Maier, 2015) as well as the middle and final Gravettian in France (Klaric et al., 2009; Vignoles et al., 2021). Chronologically, however, projectiles of cluster 3 occur predominantly in the older assemblages > 30 ka cal BP (77%), while about two-thirds of specimens from cluster 2 occur in the younger assemblages < 30 ka cal BP (63%). Cluster 1, however, shows a rather even ratio of 54% in the older and 46% in the younger assemblages (Table 1).

A question arising from the high number of 21 traditional types is whether all recorded objects can indeed be regarded as projectiles. Since it is not always possible to unequivocally infer function from form, we examine the tip angle as a functional element of artifacts to see whether they could serve, in principle, as projectiles. Measuring angles on artifacts is not trivial. If the tapering is not constant, as is often the case in lithic artifacts, the outcome is influenced by the points chosen to measure the angle. In these cases, it makes a difference if the angle is measured from the position where an artifact starts getting narrower or from two points closer to the tip. Since deviations from constant tapering in lithic projectiles are usually towards more convex shapes (rather than towards concave shapes as observable for borers), measuring the angle at two points towards the tip returns systematically higher values, thus reducing the risk of underestimating tip angles. For this reason, we opted for angle measurements between the tip and its two closest measurement points (see Suppl. 2.3). We use traditional arithmetic means of angles (circular data) for reasons of comparison to older data sets instead of other statistical tools for circular data (e.g., Pewsey et al., 2013).
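The angle measurement and the two ways of averaging angles can be sketched as follows. The coordinates and the list of angles are hypothetical, the function names are ours, and the sketch does not reproduce the measurement routine described in Suppl. 2.3. It only shows the geometry of an angle spanned by the tip and two neighboring contour points, and how an arithmetic mean compares with a circular mean.

```python
import math

def tip_angle(tip, left, right):
    """Angle in degrees at `tip` between the rays towards `left` and `right`."""
    ax, ay = left[0] - tip[0], left[1] - tip[1]
    bx, by = right[0] - tip[0], right[1] - tip[1]
    cos_theta = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    # Clamp to [-1, 1] to guard against floating-point overshoot
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def arithmetic_mean(angles):
    """Plain average of angles given in degrees."""
    return sum(angles) / len(angles)

def circular_mean(angles):
    """Direction of the summed unit vectors, in degrees within [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0

# Hypothetical contour points (arbitrary units): tip on top, two neighbors below
angle = tip_angle(tip=(0.0, 10.0), left=(-1.0, 8.0), right=(1.0, 8.0))
print(round(angle, 2))  # 53.13

# Hypothetical tip angles of four specimens
angles = [40.0, 47.0, 51.0, 70.0]
print(arithmetic_mean(angles), circular_mean(angles))
```

For angles concentrated well below 180°, as is the case for tip angles, the two means differ only marginally, so the arithmetic mean remains comparable with older data sets.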

For projectiles to be functional, an upper limit of 55° for the tip angle has been suggested (Beckhoff, 1966). With tip angles at around 51° and 47°, respectively, artifacts in both clusters are below the threshold of 55°. The tip angles fall below the mean tip angles recorded for early Late Pleistocene point assemblages from Ethiopia, ranging between 53° and 64° (Sahle & Brooks, 2019). They prove to be below Middle Paleolithic points from Bouheben, France, at 63.8° as well as late Middle Stone Age points from South Africa ranging from 61.9° to 68.3° (Villa & Lenoir, 2006). Moreover, the tip angles overlap with the values of projectiles from the French Upper Paleolithic (Peterkin, 1993). They are, however, clearly above values between 30° and 40° considered to work best for projectiles (Friis-Hansen, 1990). They are also above mean values found for Late Paleolithic projectiles of the Federmesser-Gruppen and Ahrensburgian of about 39° (Dev & Riede, 2012) or the Swiderian willow leaf points (40°) and tanged points (33°) (Serwatka, 2018). Since measuring the angle between the tip and the widest point of the pieces would reduce the angles significantly, we conclude that we are well within the spectrum of functional projectiles.

Diachronic Development of Material Culture and the Identification of Evolutionary Trends

The course of cultural evolution involves stochastic change and directed trends. A directed trend, in this sense, is everything that departs from stochastic change, therefore including also stabilizing effects and periods of unaltered stasis in traits, and is usually seen as the result of selection (Brantingham & Perreault, 2010). Importantly, selection in this regard must not be confused with individual decision-making or active choice. Rather, it is a macroevolutionary process during which two or more alternative variants with unequal performance come under selective pressure from the social or natural environment, leading to differential reproduction of these variants within a community of cultural transmission throughout generations. This pressure, however, is neither evenly distributed over all material culture traits, nor is it constant through time, but varies according to changes in the natural and social environment. If selective pressure is strong, traits should show directed trends or stasis, while under low selective pressure, traits are free to develop more randomly and thus should show no or only weakly directed trends. Changes in the trending behavior of morphological traits might thus allow us to infer shifts in selective pressure (Dunnell, 1978; Neiman, 1995; Rogers & Ehrlich, 2008; Brantingham & Perreault, 2010; Creanza et al., 2017).

The natural environment can exert selective pressure, for instance, via declining temperatures, causing unaltered clothing techniques to be a less efficient protection against the cold, or via shifts in prey species which render existing ballistic properties of projectiles less efficient. A possible source of selective pressure from the social environment can be, e.g., conformist or prestige-biased transmission, leading to differential copying rates in a population (e.g., Whiten et al., 2003; Kohler et al., 2004). Since social sources of selective pressure are likely to operate on shorter periods of time due to the lifespan of individuals and changing fashions, such trends might be too short-lived to be visible in our study (Perreault, 2012). The absence of evidence for socially exerted selective pressure should therefore not be taken as evidence of absence. Because the identification of trends is not an explanation of cultural evolution in itself, linking trends to selective processes requires functional reasoning (Boyd et al., 2013), i.e., comparing potential aspects of fitness or adaptation with the trait’s supposed functional differences, particularly when complex traits are involved.

In the presented study, some morphological traits of projectile points in Lower Austria and Moravia between 32 and 27 ka cal BP show directed trends associated with the passage of time. Interestingly, most of the analyzed traits do not show any discernable trends but change rather randomly within these 5000 years (Suppl. 6). We thus do not find evidence for selective pressure on the position of retouch on the dorsal, ventral, or both faces as well as their lateralization on the left or right side. It thus rather seems that they are subject to individual preferences. Whether these preferences are an expression of random choice or some kind of biased transmission cannot be meaningfully addressed because the resolution of our data is too coarse to make inferences about transmission processes over a few generations. Maybe a bit more surprising is the absence of such evidence for the location of retouch at the tip, base, or medial part of the projectile and the mean absolute length of the retouch. The shaping of the hafting part (usually base and lateral parts) can be considered a crucial element of projectile design, as is the tip, where retouch can alter the ability to penetrate prey skin. In these cases, it might be inferred that all available options observable in our dataset performed rather similarly so that no preference was given to any of these options in the long run.

The only traits related to retouch that show identifiable trends through time are the percentage values of lateral retouch and—alongside—the mean relative length of retouched edges as well as the percentage values of backing retouch (Fig. 7). Lateral retouch and mean relative lengths both show a slow but constant decline between 32 and 29 ka cal BP, resulting in a lengthening of unretouched cutting edges. This observation might be an expression of a selective pressure towards longer cutting edges, improving the penetrating properties of projectiles. However, it also might point towards optimizing processes in blank production, already affording blanks with favorable morphological properties that require less reworking to obtain the desired shape. This trend, however, stops between 30 and 29 ka cal BP, when it seems like selective pressure shifts away from lateral retouch of the edges and towards backing, which changes from a rather stochastic behavior in the previous millennia towards a declining trend. Backing is important for lateral hafting modes, since it provides a larger area for adhesives to connect the implement with the shaft. The decline in backing retouch thus might indicate a shift from rather laterally to frontally hafted implements.

This shift from laterally to frontally hafted implements is also supported by the observable trends in projectile shape. Over the entire period of investigation, we observe a shift towards less lateralized and more axisymmetric forms. Projectiles of the axisymmetric cluster 2 occur predominantly in the younger assemblages (Table 1), and the respective younger specimens of clusters 1 and 3 show a more balanced shape (Fig. 1). Moreover, the strongest observable trend of all RDAs, the one for the older specimens of cluster 2, also shows a trend towards more balanced ratios both with regard to the left and right but also the upper and lower part of the projectiles. The latter trend is still observable in the younger specimens of cluster 2 together with a tendency to become pointier.

It thus seems that, despite some divergences within the subsets of clusters 1, 2, and 3, there is an overarching long-term development visible in the entire dataset. This might indicate a certain regression to a mean shape, which, after a pronounced diversification in projectiles at around 33 ka cal BP (Maier et al., 2021), is approximated independently and in a complementary way in all three clusters. This finding might be interpreted as a process of more or less conscious standardization. However, given that the same trend is visible in all clusters and occurs over the course of several millennia, it seems more likely that we are observing the results of selective pressure towards an optimization of physical ballistic properties. In this sense, a shift towards frontally hafted and pointier implements likely enhances the ability to penetrate prey skin while keeping tradeoffs regarding the stability of composite projectiles minimal.

In an attempt to provide a functional reasoning and meaningful reading of the observed developments, we propose the following interpretation (Fig. 8). At around 33 ka cal BP, non-axisymmetric straight-backed lithic implements, often with an obliquely retouched base and—most salient—a pointed tip (typologically labeled as Gravette points) became widespread in western and central Europe (Noiret, 2013; Pesesse, 2013). This design is in clear contrast with previously dominant laterally retouched Dufour bladelets of the Aurignacian that show no special modification of their ends and thus likely marks the beginning of a major shift in hunting gear technology. The period between 33 and 29 ka cal BP saw a pronounced increase in the morphological spectrum of pointed lithic implements of various shapes, followed by a marked decrease after 29 ka cal BP, around the time when shouldered points made their appearance (Maier et al., 2021). Our analysis shows that this development is accompanied by a trend towards more axisymmetric pieces with the center of mass in the basal part. From a functional point of view, we thus probably observe a directed trend from laterally hafted backed implements without points to latero-frontally hafted and pointed implements. This development required changes in several morphological aspects of the implements, particularly, in backing, symmetry and center of mass. During this process, we observe three major transitions:

  1. Integrating and streamlining

Fig. 8

Development of traits under selective pressure in the analyzed case study

Between 43 and 33 ka cal BP, small, roughly rectangular, straight retouched and probably laterally hafted implements (Dufour bladelets) characterize the assemblages. Due to their axisymmetric morphology and unmodified ends, the transition from the pointed shaft to the implements was abrupt. At around 33 ka cal BP, these bladelets disappear and larger, straight-backed but now non-axisymmetric and probably still laterally hafted implements (Gravette points) emerge. These implements integrate the length of the cutting edge of two or more bladelets into a single piece and, through their tip at the end of the backed lateral and a corresponding basal modification, streamline the slightly curved cutting edge, thus smoothing the transitions from the shaft.

  2. Moving some implements to the front

The second transition probably involved a shift of some of the implements from the lateral to the frontal part of the shaft. This involves a change to more axisymmetric, curved-backed, or only laterally retouched implements and a shift of the center of mass from the center to the base. The exact timing of this process is currently not discernable but probably started around 31.5 ka cal BP at the latest. In typological terms, it probably corresponds to the advent of Vachon points (Simonet, 2011). Additionally, axisymmetric curved-backed points and non-axisymmetric straight-backed points continue to be used in parallel. This indicates that the laterally hafted implements were used together with frontally hafted ones.

  3. Integrating frontally and laterally hafted implements into a single item

The third step in this development was again a synergetic integration, this time of two functionally separated parts into a single, latero-frontally hafted object, the shouldered point. Shouldered points with their distinct morphological structure can be described as latero-frontally hafted implements. While the tip is located frontally, the lateral edge opposite the shoulder can protrude from the shaft (Lund, 1993; Weber, 2009; Kufel-Diakowska et al., 2016). Shouldered points, thus hafted as inserts of hunting weapons, incorporate the properties and functions of laterally and frontally hafted implements in a single piece. They would thus allow for a decreased production of lateral implements of various shapes, matching the general impoverishment of the tool corpus after 29 ka cal BP. The shift from laterally to frontally or latero-frontally hafted pieces might also explain the end of the declining trend in relative retouched edge length around that time. It is likely that blank production was not optimally adapted to the altered requirements for the end-products that likely served as blanks for the manufacture of shouldered points. As a consequence, blanks probably needed to be transformed more intensively, leading to an increase in the relative retouched edge length. The rather stochastic behavior of these traits during the period after 30 ka cal BP might be an expression of an experimental phase with low standardization in the production and morphology of shouldered points. This period thus may have provided a socio-economic context of change, promoting the testing of new ideas and the appearance of inventions. Invention predominantly operates through progressive modification or recombination of internal elements (Lemonnier, 1993; Arthur, 2009; Roux, 2010) and is often triggered by incidental circumstances, such as copying errors of information or transmission from pre-existing technologies (de Beaune, 2004; Kuhn, 2012; Porraz et al., 2021).

Projectiles from two assemblages, Krems-Wachtberg AH 4.4 and Jarošov-Podvršťa, had initially been recorded but, because of their low numbers of complete projectiles, were not incorporated in the analysis. Hence, these assemblages might serve as a kind of test for a more general explanatory power of our results. Both assemblages are dated to around 31 ka cal BP (Suppl. 1). According to our analysis, we would expect projectiles in assemblages of that age to have a balanced to bottom-heavy, slightly left-skewed to axisymmetric outline with straight backing preferentially on the right side and rather little retouch on the side opposed to the back. We find that these expectations are in good accordance with the observable specimens (Suppl. 10). In both assemblages, backing is preferentially on the right side, with little retouch on the opposite side, and one specimen in each assemblage shows intensive retouch on both sides, giving it an axisymmetric shape. If a tip is preserved, it is rather centered on the central axis, and even specimens with a straight back do not exhibit a strong asymmetry.

Eventually, it might be that we observe evolutionary trends at different levels of conscious awareness, that is, with different levels of human intent. The very slow development towards more pointed and axisymmetric projectiles likely reflects a low but discernable selective pressure towards better penetration of prey skin rather than conscious optimization processes. Given the long duration and incremental change, only observable at the scale of millennia, this has probably not been consciously noticed by the knappers themselves. Here, we thus might observe a selective process completely decoupled from individual decision-making and thus human intention (cf. O’Brien & Lyman, 2000). The inferred shift in hafting technique in our data set also takes place over a long period of time and thus might likewise be an expression of unconscious adaptation. However, it might also reflect a chain of conscious alterations in design from laterally hafted, asymmetrical points with a straight back (Gravette/Microgravette points) towards potentially frontally hafted, symmetrical points with a curved back (Vachon points), eventually leading to their synergetic incorporation in shouldered points. In either case, functional data, archaeological contextual information concerning bone technology, hunting prey, and paleoenvironmental data must be consulted in the future to strengthen our interpretations on temporal trends in tool design and, moreover, to determine the most likely evolutionary scenario (see for example Vignoles et al., 2021; Taipale et al., 2022).

Conclusion—and New Questions

The presented study is a first step towards a trait-based analysis of lithic tool morphology in the European Upper Paleolithic record. As such, it allows for some careful conclusions but—above all—generates a number of new questions and hypotheses to be addressed in future studies.

Generally, trait-based analyses of lithic tools can meaningfully complement those from typology and are particularly suited to address questions of diachronic developments and cultural evolution. The advantages of a trait-based recording in comparison to traditional typological data are that it

  • provides a complementary view to typological recordings,

  • enhances the resolution of archaeological data,

  • standardizes the morphological grouping of artifacts, making it less subjective or sensitive to research traditions,

  • is compatible with multivariate statistics,

  • allows for a trait-specific analysis of morphological change through time and thus for the detection of trending and stochastically changing morphological traits,

  • and opens new fields of archaeological inquiry, where questions of the evolution of material culture can meaningfully be addressed.

The full potential of a trait-based recording, however, will only become apparent when other assemblages are analyzed in the same way to see how the observations of our case study compare to other periods and areas.

Depending on the natural and social environment, traits can shift in and out of selective pressure, and only some traits are under such pressure at a given period and area, while others are subject to individual preferences. The visibility of evolutionary trends and their observable strength varies with the temporal and spatial scale of observation. Analyses at different scales might even produce seemingly contradictory results. Trends and countertrends affecting the same trait (e.g., an initial lengthening and subsequent shortening) can cancel each other out. Change correlated to time can thus seem less pronounced in longer periods than in shorter ones (Perreault, 2012). Also, what looks like a gradual development on a larger scale might appear in the form of a punctuated equilibrium at smaller scales (Kolodny et al., 2015). Supposed evolutionary leaps, owing to innovations and conscious design decisions such as the invention of shouldered points, are followed by periods of relative evolutionary stasis where projectile morphology oscillates around a mean without moving in specific directions.
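The cancelling of a trend and its countertrend can be made concrete with a small numerical sketch. The trait values are invented for illustration (a linear lengthening followed by a symmetric shortening), and the function name is hypothetical. The sketch compares the share of variance explained by a linear time trend over the whole sequence with that in each half.

```python
import numpy as np

def r_squared_vs_time(time, trait):
    """Share of trait variance explained by a linear trend over time."""
    r = np.corrcoef(time, trait)[0, 1]
    return r ** 2

# Hypothetical sequence: 101 time steps, oldest first
time = np.arange(0, 101)
# Invented trait: lengthens until step 50, then shortens at the same rate
trait = np.where(time <= 50, time, 100 - time).astype(float)

full = r_squared_vs_time(time, trait)
first_half = r_squared_vs_time(time[:51], trait[:51])
second_half = r_squared_vs_time(time[50:], trait[50:])

print(f"whole sequence: {full:.3f}")
print(f"first half:     {first_half:.3f}")
print(f"second half:    {second_half:.3f}")
```

In this idealized case, time explains all of the variance within each half but essentially none over the whole sequence, although the trait changed continuously throughout.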

For the investigated case study, we see a slow trend towards more pointed and axisymmetric projectiles, likely decoupled from human intent, and a shift from laterally to latero-frontally hafted implements, possibly as a result of an optimizing process towards the synergetic incorporation of different functions into a single implement, the shouldered point. Most of the analyzed traits appear not to experience selective pressure, and most of the observable variance does not seem to be related to the passage of time but is potentially caused by spatial signals or differences in lithic raw materials, individual knapping skills, or weapon preferences. Assessing the observed rates of change against those expected from drift alone might then help to address questions of different modes of cultural transmission that might affect cultural evolution, such as unbiased or conformist-biased transmission.
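One way such an assessment against drift could look is sketched below. It uses a simple unbiased-copying model in which the frequency of a variant is resampled binomially in every generation. All parameter values (starting frequency, population size, number of generations, the observed change) are hypothetical and not derived from our data, and the function name is ours. The model is a generic neutral baseline, not a method applied in this study.

```python
import numpy as np

def drift_changes(p0, pop_size, generations, replicates, seed=0):
    """Net change in variant frequency under unbiased copying alone."""
    rng = np.random.default_rng(seed)
    p = np.full(replicates, p0, dtype=float)
    for _ in range(generations):
        # Each generation copies variants at random from the previous one
        p = rng.binomial(pop_size, p) / pop_size
    return p - p0

# Hypothetical parameters, chosen only for illustration
changes = drift_changes(p0=0.5, pop_size=200, generations=20, replicates=5000)
low, high = np.quantile(changes, [0.025, 0.975])

observed_change = 0.45  # invented value standing in for an archaeological observation
outside = not (low <= observed_change <= high)

print(f"95% of drift-only changes fall between {low:.2f} and {high:.2f}")
print("observed change exceeds drift expectation:", outside)
```

A change inside the envelope is compatible with drift, whereas a change outside it would call for an additional explanation such as biased transmission or selection. The envelope widens with smaller populations and longer time spans, so any conclusion depends strongly on these assumed parameters.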

In the absence of comparable data, it is difficult to say whether the observed amount of variance attributable to time in our case study (between 3 and 17%) is representative for evolutionary trends during the (Upper) Paleolithic. Future studies will have to show whether these numbers are average or exceptionally high or low, ideally including more assemblages with good chronological control and available contextual information to refine our understanding of specific and general patterns of cultural evolution.