Introduction

The time slice of roughly 300–30 ka (i.e. the Middle Stone Age [MSA] in Africa and Middle Palaeolithic [MP] in Eurasia), identified for this special volume, represents the coming of age of Homo sapiens. During this time, our ancestors also shared the Old World with other human groups such as the Neanderthals and Denisovans (see Galway-Witham et al. 2019 for the suite of concurrent hominins), and perhaps also Homo naledi (Dirks et al. 2017), all of whom ultimately disappeared as distinct populations or became absorbed into a single surviving H. sapiens population.

Being uber-social and being known for our technical proclivity make H. sapiens part of long evolutionary trajectories in the hominin clade, whereas being defined as sapient (based on our cognition) currently sets us apart from all other living creatures on earth. Some have suggested that this latter development occurred with a single sudden mutation or novel gene constellations at post-50 ka (e.g. Klein 2000, 2019). It is our take, however, that human cognitive evolution is a long, incremental, multi-faceted, and continuous process. The genetic and fossil records place the emergence of H. sapiens at ~ 300 ka, with a focus on sub-Saharan Africa (see Mounier and Lahr 2019 for synthesis; also Schlebusch et al. 2020). Work on the archaeological record over the last two decades and a re-assessment of the behavioural modernity thesis demonstrate that complex cognition can be traced back to at least 100 ka in H. sapiens populations from the region (e.g. Henshilwood et al. 2011; Wadley 2015, 2021; Davies 2019). Thus, the perceived boundaries relating to our ‘origins’, and a possible disjuncture between physical and cogni-behavioural evolution, sometimes referred to as the ‘sapient paradox’ (e.g. Renfrew 1996, 2008), are not static. Instead, they are continuously adjusted as/when new knowledge becomes available.

Today, our sapient characteristics allow us to do some things differently from other living animals:

  1. Cognitively: We think about and find solutions for old and new problems (real and imagined, tangibly and intangibly) (e.g. Fuentes 2014).

  2. Biologically: We experience and process the world through our uniquely evolved bodies (internally and externally) (e.g. Zihlman and Bolter 2015).

  3. Technically: We cannot survive without technology; we are obligatory tool users (e.g. Shea 2017).

  4. Socially: Our technologies are embedded in our culture/s as a result of social learning, teaching, and negotiation (e.g. Boyd et al. 2011; Högberg and Lombard 2021).

  5. Ecologically: We are the only surviving hominin that inhabits, changes and/or dominates most ecological niches on earth, aided by technology (e.g. Heyes 2012; Roberts and Stewart 2018).

In their synthesis of the last million years of human evolution, Galway-Witham et al. (2019) remind us that behavioural complexity is not unique to H. sapiens and that we still do not understand the relationships between the different MSA/MP human groups, their biology, behaviours, and cognition. To them, finding answers to such questions “is important because independent behavioural similarity could suggest that these other human populations should be considered as ‘modern’ as we are, on a cultural and cognitive level, while fully acknowledging their biological distinctness. In contrast, divergence in the archaeological record would reinforce a biological, behavioural and cognitive difference between us and the other human species with which we shared the Pleistocene world” (Galway-Witham et al. 2019: 9). Zilhāo (2019) suggests, however, that it is necessary to apply the same criteria of inference to all contemporary human groups, instead of using present-day H. sapiens as the standard.

We agree with Galway-Witham et al. (2019) that gaining increasing resolution on the relationships between the different MSA/MP human groups, their biology, behaviours, and cognition is a worthwhile endeavour. A complicating factor, however, is that several phases of contact between H. sapiens and archaic humans existed, and populations mixed multiple times across the Eurasian continent, resulting in fertile offspring (e.g. Green et al. 2010; Posth et al. 2017; Wolf and Akey 2018; Gokcumen 2019; Villanea and Schraiber 2019; Chen et al. 2020). Admixture events took place, for example, at ~ 120 ka in the Near East (Hovers 2006; Kuhlwilm et al. 2016), most likely in western Asia at ~ 60–50 ka (Nielsen et al. 2017), and as late as 42–37 ka in present-day Romania (Fu et al. 2015; Gokcumen 2019). Currently, all non-sub-Saharan African human genomes carry ~ 2–4% Neanderthal ancestry, and some an additional 2–5% Denisovan. Present-day levels of ‘archaic’ DNA do not, however, reflect levels of prehistoric admixture, which could have been 2–5 times higher (Wolf and Akey 2018, also see Villanea and Schraiber 2019). Hence, a reasonable assumption is that some late MP Eurasian populations were ‘mixed’ before H. sapiens DNA became dominant in all surviving groups. A strict cogni-behavioural and/or techno-cultural separation between such later groups is therefore perhaps unfeasible. This complex topic cannot be teased apart here. Instead, we focus our exploration of human cognitive evolution on the period before ~ 50 ka, looking at cognitive variation in Neanderthals and H. sapiens, the two groups most represented in MSA/MP archaeology. (Note: We reject any form of scientific racism or social Darwinist interpretation of our work [for discussion see Dennis 1995; Høiris 2016].)

A Four-Field Co-evolutionary Model

We suggest that a useful way to understand cognitive evolution holistically is by looking at relevant aspects of biology, technology, society, and ecology through a four-field co-evolutionary feedback loop (Fig. 1). Mitleton-Kelly and Davy (2013: 44) describe biological co-evolution as the dynamic reciprocal influences of two or more populations on each other, wherein each entity works as a selective force on the other, resulting in causal influence on each other’s evolution. Co-evolution thus involves give-and-take influences, changing “the behaviour of the interacting entities within a social ecosystem”. In the context of gene-culture co-evolution theory, it is not only two or more populations that are part of this dynamic. Instead, aspects associated with culture (social learning) and the socio-economy of a population, such as diet, and technology are also included (e.g. Kendal et al. 2011; O’Brien and Laland 2012; Causadias et al. 2018). Hence, the way we understand or think about the world (our cognition) is continuously being shaped and re-shaped by our technologies, our biology, social interaction with others, and our ecological niche/s, whereas our technologies, our biology, social interaction with others, and our ecological niche/s are continuously being shaped and re-shaped by how we understand or think about the world (Fig. 1).

Fig. 1 The four-field co-evolutionary model for human cognition. The examples are general and not specific to the time slice discussed in this contribution. Figure copyright is held by the authors

Thus far, several models have contributed to discussion about the evolution of so-called modern cognition—none more so than the enhanced working-memory model (see Coolidge 2019 for most recent synthesis). Other useful frameworks include expert cognition (e.g. Wynn and Coolidge 2004), material engagement theory and metaplasticity (e.g. Malafouris 2013, 2015; Roberts 2016), theory of mind (ToM, aka mindreading/time travel) (e.g. Gärdenfors 2006; Dere et al. 2019), and cognitive task-structuring strategies (e.g. Fairlie and Barham 2016). Galway-Witham et al. (2019) use orders of intentionality (as variation of ToM) to differentiate levels of cognition (also, see e.g. Dunbar 1998).

As an inclusive cognitive framework that does not start from the perspective of the ‘modern mind’, we have been exploring causal cognition for the evolution of human thinking since our split from the chimpanzees (e.g. Lombard and Gärdenfors 2017; Gärdenfors and Lombard 2018, 2020). The resulting model includes seven grades of causal understanding, each operating increasingly detached in time and space from the other (see Table 1). We emphasise, however, that although they are able to inform on each other, they do not necessarily follow a unilinear evolutionary trajectory. Each grade may within itself comprise several levels of understanding (e.g. basic, average or enhanced, or in the case of ToM increasing orders of intentionality) that evolved through time in various circumstances.

Table 1 The 7-grade causal cognition model and its relation to the archaeological record (adapted from Lombard and Gärdenfors 2017; Gärdenfors and Lombard 2018)

Arguments for the model are based on comparisons between the capacities for causal cognition relying on the methodological principle of cognitive parsimony (Gärdenfors and Lombard 2020). This approach implies that if the cognitive capacities required for an activity or technique A are a subset of those required for an activity or technique B, then A is evolutionarily prior to B. Even though this principle does not say anything about dating, it makes it possible to argue that one type of activity is evolutionarily older than another.
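The subset logic of cognitive parsimony can be illustrated with a minimal sketch. The capacity labels and the function name are hypothetical placeholders and are not taken from Table 1 or from the cited work. The sketch also assumes that ‘subset’ means a strict (proper) subset, which the principle as stated here does not specify.

```python
def evolutionarily_prior(capacities_a, capacities_b):
    """Return True if A's required capacities are a proper subset of B's.

    Assumption: 'subset' is read as a strict subset, so two activities
    with identical requirements are not ordered relative to each other.
    """
    return set(capacities_a) < set(capacities_b)


# Hypothetical placeholder labels, not taken from Table 1.
technique_a = {"capacity_1", "capacity_2"}
technique_b = {"capacity_1", "capacity_2", "capacity_3"}
technique_c = {"capacity_1", "capacity_4"}

print(evolutionarily_prior(technique_a, technique_b))  # A's needs are contained in B's
print(evolutionarily_prior(technique_b, technique_a))  # the reverse comparison
print(evolutionarily_prior(technique_c, technique_b))  # overlapping but not nested
```

Where neither set of capacities contains the other, as in the third comparison, the principle yields no ordering at all. Any ordering it does yield is relative and carries no information about dates.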

We use the 7-grade causal cognition model, because it combines a variety of relevant cognition types. For example, it includes aspects of working memory (Bauer and Booth 2019), episodic memory (Suddendorf 2017; Brinums et al. 2018), analogical reasoning (Krzemien et al. 2017), intentionality (Sloman et al. 2012), general ToM (Barrett et al. 2010), relational complexity (Halford et al. 2010), and social cognition (Rochat et al. 2004). Causal cognition is also integral to tool use (e.g. Wolpert 2003; McCormack et al. 2011; Osiurak and Reynaud 2020), making it important in terms of later hominins evolving into obligatory tool users (Shea 2017). In addition, it plays a central role in niche theories that attempt to bridge evolutionary biology, philosophy, cognitive science, and anthropology (e.g. Bertolotti and Magnani 2017), with Dunbar (2002: 205) arguing that “causal reasoning must be one of the fundamental bases on which all cognitive processes operate”.

In our model, we group aspects such as DNA, brain, and body under biology and differentiate between the technological, social, and ecological environments or settings (Fig. 1). What distinguishes our model from others (e.g. Bruner and Iriki 2016) is that we explicitly place cognition at the centre, relating it directly to becoming H. sapiens during the MSA/MP and as a defining aspect of the sole surviving hominin population. Also new is that, compared to similar models, we deal more explicitly with the human body (e.g. Creanza and Feldman 2016), both internally and externally, as cognitive processor and socio-technical interface. Throughout our discussion of biology, society, and ecology as three co-evolutionary fields, we weave the thread of technology as the fourth field and archaeological anchor. We focus only on selected aspects within each field to illustrate the dynamic interaction between them and cognition. By doing so, we do not exclude any other aspects or bodies of work that may be equally relevant. This article serves two purposes:

  1. As an example of a revised co-evolution concept model (the four-field co-evolutionary model for human cognition) from which to theorise about the evolution of the human mind during the MSA/MP

  2. To enrich previous reference points (e.g. Galway-Witham et al. 2019) for the MSA/MP time slice by testing the four-field co-evolutionary model against a set of Neanderthal and H. sapiens cases

Human Biology and Becoming Sapient

In this section, we present aspects of the human body/biology and how it interacts with cognition. We start with exploring biology as internal cognitive processor and then discuss aspects of the body as external socio-technical interface.

Internal Cognitive Processing, Brain-Selective Genetic Variants

Many topics could feed into the discussion of internal cognitive processing, such as the volume of blood flow/oxygen to the brain (e.g. Seymour et al. 2016), or the influence of gut microbiota on neuro-development and social behaviour (e.g. Proctor et al. 2017; Hooks et al. 2019; Sherwin et al. 2019). Here, we touch only on a few brain-selective genetic variants in H. sapiens compared to the Neanderthals as an example of how DNA may contribute to variation in cognition. Importantly, these differences do not imply ‘superiority’ in either instance; they merely indicate alternative developments in the context of a long, shared evolutionary history between the populations (e.g. Kuhlwilm and Boeckx 2019).

Drawing from medical science and comparing the DNA of contemporary humans with that of Neanderthals, Schlebusch et al. (2012) highlighted some brain-selective genetic variants in H. sapiens. These include gene regions medically associated with brain development, skull shape, and plasticity (e.g. SULF2 and RUNX2), a range of cognitive and memory regulators (e.g. DYRK1A, NRG3, CADPS2, AUTS2, SDCCAG8 and LRAT), and aspects of language inheritance (e.g. ROR2, which is up-regulated by FOXP2). Neubauer et al. (2018) emphasise a further set of genes with positive selection in H. sapiens compared to the Neanderthals. These genes are important for the development of the nervous system such as those involved in axonal and dendritic growth and synaptic transmission (e.g. NOVA1, SLITRK1, KATNA1, LUZP1, ARHGAP32, ADSL, HTR2B, and CNTNAP2).

Human-accelerated regions for aspects of cognition and social behaviour (e.g. CUX1, PTBP2, GPC4, and CDKL5) were also identified by Doan et al. (2016). The work of McCoy et al. (2017) shows the pronounced downregulation of Neanderthal alleles controlling the cerebellum and basal ganglia, regions that are not only often associated with motor control and perception but that also regulate aspects of cognitive function and language processing. Kuhlwilm and Boeckx (2019) recently compiled a revised and extended catalogue of 647 single-nucleotide changes in 571 genes that potentially distinguish H. sapiens from the Neanderthals and Denisovans, requiring further experimental assessment. Some of the genes in their list affect neurons, brain growth and cognition, and one of them (ADSL) has recently been identified as selected for in H. sapiens after our divergence from Neanderthals (Stepanova et al. 2020). Based on the genomes of living and ancient Khoe-San populations from southern Africa, Schlebusch et al. (2020) found new variations in early H. sapiens subsequent to their split from the Neanderthals, but before their split from other African populations at more than 300 ka. These findings include a brain-related region of the LPHN3 gene on chromosome 4, which plays an important role in determining the connectivity rates between the principal neurons of the cortex.

What we highlight above is only the proverbial tip of the iceberg (see Kuhlwilm and Boeckx 2019). An explicit study on how the mentioned gene regions, or the effects of them in various combinations and their cognitive expressions, may affect the evolution of causal and/or technical thinking has not yet been attempted. However, based on the particular nature of the genetic distinctions summarised above, and notwithstanding possible overlaps or similarities in cognition, the notion that Neanderthals as a population were cognitively ‘no different’ from H. sapiens before their admixture is unlikely (contra Zilhāo et al. 2010).

Internal Cognitive Processing and the Brain

One way in which brain-selective genetic variants may have contributed to MSA/MP human variation is in the size, shape, and functional connectivity of the brain (e.g. Oldham et al. 2006; Bruner 2021). Galway-Witham et al. (2019: 2) graphically illustrate volume/brain size overlaps between MSA/MP populations, showing the general increase in human cranial volume through time, and the substantial overlap in brain size between H. sapiens and the Neanderthals since at least 250 ka.

Whereas an increase in brain size probably played a role in early hominin evolution in terms of the social brain hypothesis (e.g. Dunbar 1998; Pérez-Barbería et al. 2007), amongst other things, it is clear from the data presented in the Galway-Witham et al. (2019) graph that by the MSA/MP, this aspect was less distinct between Neanderthals and H. sapiens. This may well indicate that in terms of some aspects in our social behaviour, these two populations were similar (Zilhāo 2019). Such an interpretation would explain the propensity for genetic admixture between these groups (e.g. Nielsen et al. 2017), similarities in symbolic behaviour (Galway-Witham et al. 2019), and probably also for material culture exchanges (e.g. Flas 2011; Higham et al. 2014). On the other hand, Pearce (2018) argues that: “The archaeological record suggests that compared to Neanderthals, contemporary modern humans maintained social ties between greater numbers of individuals over greater distances. […S]uch differences would have influenced neural development, driving differences in brain structure and the degree of social complexity that each taxon could sustain cognitively”. This indicates that brain structure, and not just size/volume, influenced social behaviour.

Developments in palaeoneurology highlight both similarities and distinctions between Neanderthal and H. sapiens brain regions. As hominin brain size increased, the parietal areas generally display a relative flattening in non-modern humans, but Neanderthals display wider superior parietal lobules, with H. sapiens having an even larger parietal lobe expansion resulting in the bulging of the parietal profile and a more rounded skull shape, partly caused by the enlargement of the precuneus (e.g. Bruner et al. 2018). Here we highlight the precuneus, because it has been identified as developing to its current morphometric dimensions only in H. sapiens, becoming visible in the fossil record from ~ 100 ka (e.g. Bruner 2021). Bruner’s extensive body of work on the precuneus (summarised in Bruner 2018) strongly associates this brain region with the “functional integration of body and vision, bridging somatosensory and occipital signals” (Bruner 2018: 143). It is a region that plays a central role in modern human brain organization. For example, it forms part of the default-mode network, amongst other networks (Margulies et al. 2009), that facilitates ‘mind-wandering’ or imagination, contributes to external task performance, and is active when a person thinks about themselves or other people, remembers the past, or plans for the future (e.g. Sormaz et al. 2018).

The precuneus has been connected to higher-grade causal reasoning, the ToM impulse to actively teach, and techno-behaviours that represent such cognition, for example bow hunting or archery (e.g. Chang et al. 2011; Gärdenfors and Lombard 2018; Lombard 2019). Independent brain-scanning studies further demonstrate that the precuneus is associated with analogical reasoning (Wu et al. 2016), deductive and inductive reasoning (Alfred et al. 2018), and hypothesis-understanding (Wertheim and Ragni 2017). In their enhanced forms, all of these types of reasoning need to come together to facilitate grade 7 causal cognition (Table 1).

Medical science indicates that the precuneus, with its extensive network activity and structural connectivity, has strong association with several enhanced cognitive traits including:

  • Episodic memory (Chen et al. 2016)

  • Working memory, planning, and response inhibition (Stokes et al. 2011; El-Hage et al. 2013)

  • Language network reorganization and verbal fluency (Cruchaga et al. 2009; Krug et al. 2010)

  • Event processing and attention-related neural activity (Greenwood et al. 2012; Breckel et al. 2015)

  • Age-associated executive function (El-Hage et al. 2013; Miranda et al. 2019)

  • Brain connectivity during associative emotional learning and emotional processing (Ćurčić-Blake et al. 2012; Tozzi et al. 2016)

  • Empathy, ToM, and social cognition (Laursen et al. 2014)

This list is not exhaustive, and the references therein were chosen because they also provide genetic markers for the brain regions associated with the cognitive traits—their selection history will be the focus of a future study. For the purposes of this paper, we suggest that the list provides interesting clues to some of the behavioural and cognitive aspects that may have varied between H. sapiens and the Neanderthals resulting from dissimilarities in how the precuneus developed before their admixture.

Bruner et al. (2018) explain how the frontal region of the precuneus mostly facilitates body cognition, whereas its posterior region deals with visual cognition, and the middle portion serves as integrative hub for signals received both bodily and visually in a process called ‘visuospatial integration’. This process helps to facilitate understanding the ‘body-environment physical coordination’, as well as assimilating visual images with conscious, self-centred episodic memory recall that provides the necessary scaffolding for mental experiments, thus for imagination (also see Fletcher et al. 1995). As previously highlighted, the most advanced form of causal network understanding is hypothetical reasoning, which necessitates the imagining of different possible outcomes in the future based on contemplating past experience. Bruner and Lozano Ruiz (2014, 2015) have argued that based on variation in the precuneus and the archaeological and fossil records, the process of Neanderthal visuospatial integration may not have developed in the same way as that of H. sapiens, potentially impacting their use of technology.

Similarly, Wenderoth et al. (2005) suggest that the coordination effort during bimanually coordinated behaviour (such as shooting a bow or playing a piano) is facilitated by regions such as the precuneus, contributing to higher order performances as an interface between action and cognition. Thus, even though it seems that the left inferior parietal lobe may govern straightforward day-to-day tool use scenarios (e.g. Osiurak et al. 2020), we argue that the functional interconnectivity and associated cognitive traits of the precuneus are key to technical innovation in terms of attention, planning, and coordinated bimanual tool engagement (e.g. Vingerhoets et al. 2009; Rossit et al. 2013).

Other brain regions may also point to variation in cognition between H. sapiens and the Neanderthals. For example, the work of Neubauer et al. (2018) and Sereno et al. (2020) highlights the cerebellum as an area that is enlarged, rounded, and more folded in current H. sapiens compared to Neanderthals. Although the cerebellum is mostly known for its association with motor-related functions such as movement coordination and balance, it also facilitates aspects of spatial processing, working memory, language, social cognition, and emotional processing (also see Sokolov et al. 2017). Because there is evidence for positive selection in genes associated with the cerebellum in H. sapiens such as BRNCHA (see McCoy et al. 2017), this is a good example of co-evolution between cognition, brain, and DNA in this population. The thalamus is a further brain region implicated in the globularity of the H. sapiens brain and is linked to language readiness. In this context, Boeckx and Benítez-Burraco (2014) sought to identify gene candidates that play some role in brain growth, regionalization, and/or neural interconnectedness. They arrived at a tentative candidate list that included several genes (e.g. USF1, RUNX2, DLX1, DLX2, DLX5, DLX6, BMP2, BMP7, and DISP1), finding solid evidence of a selective sweep in RUNX2 subsequent to the split between H. sapiens and Neanderthals (Green et al. 2010).

Whereas the current precuneus, cerebellum, and thalamus morphologies may be exclusive to H. sapiens, Pearce et al. (2013) point out that the large Neanderthal brains (sometimes surpassing those of H. sapiens in volume) and variation in parietal lobe shape indicate that the Neanderthals may have had cognitive and/or neurological specialisations that were not equally developed in H. sapiens. The parietal expansion in both populations that includes the temporal region, in combination with a prefrontal cortex that is modern-like already in H. heidelbergensis (e.g. Robson and Wood 2008), may account for similarities in symbolic material culture as observed in the archaeological records of both species (Zilhāo 2019; also see Henshilwood and Dubreuil 2011).

Notwithstanding parallels between human populations and the plasticity and interconnectivity of the modern human brain (e.g. Gómez-Robles et al. 2013), both medical science and DNA studies provide insight into cognitive and behavioural aspects that may have been shaped by differences in brain morphologies during the MSA/MP. These include variation in memory processing, verbal fluency, brain connectivity, emotional and social processing, visuospatial integration, and bimanual technical engagement. Thus, the current understanding of at least three brain regions predicts that the same (not excluding similar) behaviour and cognition in Neanderthals and H. sapiens are unlikely. Of course, as admixing between the species and their descendants continued, variations would conceivably become less distinct until only the H. sapiens population was left.

The Body as External Socio-technical Interface

Whereas the brain may be seen as internal cognitive processor, our human body (shape, size, sex, and function) and sensory organs represent each person’s interface with the world (e.g. Bruner and Iriki 2016). Our bodies have evolved differently from those of other animals in some aspects. Amongst other things, our hominin physiology in combination with our cognition allowed us to produce and use tools to save or extract energy, to shape and change other materials, and to control or manage a range of social and environmental strains (e.g. Shea 2017). These strategies were so successful that later on in our evolution, our bodies became increasingly gracile, compared to those of other predators, “a body form adapted to a technology-dependent life” (Zilhāo 2019: 3). Shea’s (2017) analysis of the lithic evidence shows that from ~ 300 ka, most human groups were obligatory stone tool users (but see Dusseldorp and Lombard 2021 on H. naledi).

Hominins, and other animal tool users, extend their bodies, and by implication, their minds, through the ‘prosthetic’ use of technology (Malafouris 2013). Malafouris’ work provides a framework for the interactions between brains, bodies, and things through notions of the extended self and tectonoetic awareness, the brain-artefact interface, and material engagement theory (e.g. Ihde and Malafouris 2019), culminating in what he describes as metaplasticity (e.g. Malafouris 2015, 2021). At the core of his reasoning lies the co-evolution of the brain, body, and material culture or technology, and the idea that humans shape their own evolutionary trajectories by inventing, producing, and adapting the technologies through which our minds and bodies interact with the world.

For example, the bodily adaptation of humans to throw accurately and forcefully probably had a long evolutionary history. Adaptations that enable elastic energy storage and release at the shoulder first appear in their ‘modern’ configuration in H. erectus 2 million years ago (e.g. Roach et al. 2013). Throwing spears represent the indirect transmission of arm force. The thrusting effect of the spear is detached in space from the thrower. This entails that the mapping between cause and effect must be inferred from the behaviour of the animal that is hit. In learning such mapping, some representation of force transmission and therefore grade 6 causal reasoning is required (Table 1; Gärdenfors and Lombard 2018, 2020). Whether Neanderthal physiology indicates thrusting or throwing spears remains a point of debate (Churchill 1998; Kortlandt 2002). Churchill and Rhodes (2009) even question whether spear throwing was habitually the case for MSA/MP H. sapiens, although they admit that their osteological results are ambiguous. An experimental study by Milks et al. (2019) shows that hunting in the form of throwing spears was likely within the range of the Neanderthals. This is consistent with aspects of the European archaeological record such as the wooden weapons from Schöningen in Germany dating to ~ 300 ka (Thieme 2000; Schoch et al. 2015). If Neanderthals used throwing weapons, this would indicate at least basic levels of grade 6 causal cognition (Table 1), which is also associated with the hafting of stone tools, a practice that was well within their technical range (e.g. Koller et al. 2001; Sykes 2015; Niekus et al. 2019; Gärdenfors and Lombard 2020; but see Schmidt et al. 2019).

Bruner and Iriki (2016: 98) highlight how body posture and locomotion affect the senses and thus behavioural relationships with the environment, and that “primates, are strongly dependent on the eye-hand system, and coordinated by processes of visuospatial integration”. This could reflect differences in the ways that H. sapiens and Neanderthals were able to integrate their inner perceptions with their external worlds through the body interface. To them (Bruner and Iriki 2016), changes in visuospatial integration functions and their associated parietal brain areas, such as the precuneus and the intraparietal sulcus, may have been key in developing capacity for embodiment and spatial cognition required for the technical patterns and ranges we see in humans today. Whereas a reasonable level of visuospatial integration is required for spear throwing, enhanced integration is necessary for bow hunting where multiple factors need to be compensated for, and the flight of the arrow envisioned in the mind of the hunter before a successful shot (e.g. Lombard and Haidle 2012; Williams et al. 2014; Coolidge et al. 2016).

Considering the external body’s interaction with MSA/MP cognitive evolution is not clear-cut. Adaptation to throwing, for example, was probably grounded in the common ancestor of the Neanderthals and H. sapiens, as are the associated techno-behaviours of spear hunting and hafting. These traits and their associated cognition therefore did not develop independently in the two populations. As a result, similar behaviour and cognition are indicated by spear throwing. Variation in visuospatial integration and evidence of bimanual bow hunting for H. sapiens during the later stages of the MSA/MP show divergence in terms of biology, technical range, and cognition since at least about 70 ka.

Society as Conduit for Knowledge Transfer and Technical Innovation

Despite the fact that our current cognition, technical inventiveness, and range of material culture engagement clearly differentiate us from other animals, Von Hippel and Suddendorf (2018) note that relatively few people today can self-identify as inventors of new products. To address this discrepancy, they propose a social innovation hypothesis wherein our minds evolved with a social rather than a technical orientation, because social relations are more rewarding for most people compared to the satisfaction of invention. This is in line with de Waal’s (2016: 258) discussion of social learning, where he emphasises that for chimpanzees as well as humans the ‘social’ is as important as the ‘learning’ part of social learning: “social learning is more about fitting in and acting like others” than learning a skill. Hence, acceptance from others is a deeply rooted emotional function that enhances the motivation to be successful in learning (Yu et al. 2018).

Whilst we agree that social rewards are a key component of cognitive evolution, we draw attention to the fact that whereas many other animal species have highly complex social structures, none has yet become an obligatory tool user or a serial inventor and innovator. We therefore take a more moderate position, suggesting that a single explanatory framework may not suffice, because at different times throughout our long evolutionary history there may have been different emphases on a range of push and pull factors at play.

What is more, there are clear co-evolutionary links between social cooperation, causal cognition and ToM (e.g. Barrett et al. 2010). The 7-grade causal cognition model (Table 1) was developed in an archaeological context. Yet, its basic premises hold equally true for social cognition where causal reasoning allows for (a) predicting social outcomes based on thinking about observations made during past social contexts; (b) affecting and controlling current and imagined future social events; and (c) predicting a range of possible/plausible social causes from effects, even if the causes are not perceivable. Thus, even if causal understanding may not be ‘necessary’ for some forms of ‘culturally evolving technology’ (e.g. Derex et al. 2019), we argue that it is critical in terms of the social cognition and resulting social systems that facilitate the techno-behaviours on which we became dependent. In evolutionary terms, we therefore see an inextricable link between human technical and social cognition (also see Heyes 2012; Lombard and Gärdenfors submitted).

In this section, we draw on previous work (e.g. Högberg 2009; Högberg and Lombard 2016a, 2021; Lombard and Högberg 2016; Gärdenfors and Högberg 2017, in press; Lombard et al. 2019b) to exemplify how technical inventions during the MSA/MP became human culture or embedded in society through highly socialised systems such as intentional teaching and social negotiation. With this focus, we do not intend to downplay other social approaches to cognitive evolution, but to demonstrate some ways in which an archaeological approach to teaching and socio-technical frameworks can contribute to such discussion.

Human Teaching: an Archaeological Perspective

Whilst many animals, especially primates, engage in advanced forms of social learning (de Waal 2016; Boesch et al. 2019), processes of intentional teaching through verbal instructions, sounds and gestures, together with pro-social acts of feedback, are unique to humans (Kline et al. 2013; Kline 2015; Gärdenfors and Högberg 2017). In some cultures, it is limited to a few specific instructive tasks; in others, it dominates the way children learn (Tomasello et al. 1993; Strauss et al. 2002; Csibra and Gergely 2009, 2011; Lancy 2016). Several studies emphasise intentional teaching as important in cultural transmission when learning complex, cognitively opaque skills such as the making of elaborate stone tools (d'Errico and Banks 2015; Gärdenfors and Högberg 2017). Cross-cultural research underlines that it is indeed in economically and culturally highly valued domains where teaching is emphasized (Kline et al. 2013).

Leadbeater et al. (2006) suggest that in the context of the evolution of teaching behaviour, ‘teaching’ should be reserved for not only the transfer of skills but also the transfer of concepts, rules, and strategies (for discussion on divergences in what is meant by teaching see Gärdenfors and Högberg 2017). When intentional teaching is understood in this way, it not only conveys knowledge, but “also facilitates creative problem solving and provides the scaffolding for reorganizing and ‘playing’ with ideas until they [learners] produce unexpected or novel outcomes and innovations” (Riede et al. 2018: 47). In its most advanced form, such playing with ideas equals the hypothetical thinking associated with grade 7 causal cognition (Table 1) and has been associated with the precuneus as part of the brain’s default-mode network.

Discussing the evolution of intentional teaching as a gradual process, Gärdenfors and Högberg (2017, in press) present six levels of teaching in the context of an ‘archaeology of teaching’ (Table 2). For all six levels, it is assumed that the teacher has an intention that the learner learns something that she/he would not learn without the intervention of the teacher. Gärdenfors and Högberg (2017, in press) suggest that these six levels imply a long evolutionary history of intentional teaching. Cogni-behavioural requirements such as intentionality, ToM, and communication capacities increase successively with each of the six levels. Whereas indexical or iconic gesturing may be sufficient for levels 1–3, level 4 requires displaced communication and levels 5 and 6 require a symbolic language (Table 2). Below, we first discuss stone tool use and knapping technologies to illustrate this for levels 1–4. For level 5, spear technology and bow-and-arrow technology are used.

Table 2 Six levels of intentional teaching following Gärdenfors and Högberg (2017, in press). Intentional teaching is influenced by the social environment (Di Paolo et al. 2018) and can be set up in various formal and informal learnership connections. For example, (a) one to one, where a teacher teaches a learner; (b) one to many, where a group is taught by a teacher; (c) many to one, where a person is taught by a group of people; or (d) many to many, where a group is taught by another group (d'Errico and Banks 2015; Morgan et al. 2015; Högberg and Lombard 2016a)

Previously, we showed that levels 1 and 2 teaching preceded the genus Homo (Lombard et al. 2019a), level 3 teaching was probably needed to master core maintenance as part of Oldowan stone-flake production (Gärdenfors and Högberg 2017; but see Stout et al. 2019), and level 4 teaching was key to understanding the concept and hierarchy of platform preparation in the reduction sequence of late Acheulean hand-axe production (Gärdenfors and Högberg 2017). Consequently, the cognitive ability for intentional teaching by evaluative feedback, drawing attention, and demonstrating and communicating abstract concepts (Table 2) was in place well before the onset of the MSA/MP.

There are knapped technologies in use during MSA/MP that require level 3 and/or 4 teaching (Table 2). One example is variations of the Levallois technology found in many places and over a long time during the MSA/MP, used by both Neanderthals and H. sapiens (Bordes 1980; Debénath and Dibble 1994; Dibble and Bar-Yosef 1995). Schlanger (1996: 231) shows that the course of action involved in the technology “was structured and goal-oriented”, and included a “generative interplay between the mental and material activities of the […] flintknapper”. Wynn and Coolidge (2012: 70) write about the importance of the ‘distal convexity of a core’ as a focal point in Levallois technology, highlighting it as an important feature for the learner to understand. In terms of inanimate causal cognition (grade 6, Table 1), this indicates that Levallois knappers understood how a core needed to be set up to ‘behave correctly’ in future. We have not yet analysed the level of teaching needed to perform Levallois knapping, and it is unclear whether level 3 or 4 teaching is needed here. Another example of MP technology is provided by Ruebens (2013) who discusses bifacial hand-axes from ~ 100 ka (also see Brenet et al. 2017). Because they were made by using platform preparation, Gärdenfors and Högberg (2017) describe these artefacts as evidence of level 4 teaching for Neanderthals.

Teaching by explaining relationships between abstract concepts (level 5) and narrating (level 6) both presume displaced communication, i.e. that the teacher uses a symbolic language (spoken or gestured, Table 2). Gärdenfors and Högberg (2017: 195) explain that in this level of teaching, causal relationships can be seen as “patterns in time […] if cause C is present, then effect E will follow, where C and E are concepts”. One example of this would be teaching someone how to produce birch tar adhesive and then how to use it to fix a stone tip to a wooden shaft to make a spear that will eventually be used in a hunting campaign to kill an animal. In this example ‘birch tar’, ‘adhesive’, ‘fixing’, ‘hunting campaign’ and ‘kill’ are all concepts. The use of adhesives is well known from MSA assemblages (e.g. Wadley et al. 2004; Rots et al. 2011, 2017; Charrié-Duhaut et al. 2013) and with increasing frequency from Neanderthal contexts (see Niekus et al. 2019 for summary). Birch tar production by Neanderthals has been discussed by Koller et al. (2001, but see Schmidt et al. 2019 for critical comments). Niekus et al. (2019) present results from the analysis of birch tar found on a flint flake dated to ~ 50 ka. They conclude, “this find provides evidence on the technological capabilities of Neandertals” to produce adhesives and multi-component tools (Niekus et al. 2019: 1).

An example of a more advanced form of level 5 teaching would be explaining that if you take a branch from this bush, its wood is elastic enough to make a bow using a string with enough tensile strength to span the bow and launch an arrow with a small, brittle stone tip that will break off in the body of the prey animal that will then leave a blood spoor for you to track and kill. Here, abstractions of the concepts ‘branch’, ‘elastic’, ‘bow’, ‘string’, ‘tensile strength’, ‘span’, ‘arrow’, ‘brittle’, ‘stone tip’, ‘prey animal’, ‘blood spoor’, ‘track’, and ‘kill’ are combined in one technology, the bow-and-arrow set, with the purpose of explaining causal relationships of what will happen. In southern Africa, some bone and stone arrow tips pre-date 60 ka (e.g. Backwell et al. 2008, 2018; Wadley and Mohapi 2008; Lombard 2011), where bow hunting could have been practiced since ~ 70 ka (Lombard 2020).

We suggest that multi-component machines such as bow-and-arrow technology, for which evidence appears during the later phases of the MSA, and which require coordinated bimanual manipulation, hierarchical composition, and the understanding of abstract properties such as elasticity, tensile strength, and brittleness, indicate at minimum level 5 teaching, which implies relatively high orders of ToM as well as grade 7 causal cognition.

Narrating (level 6 teaching) implies even more developed abilities to model the mental state of others, as well as the many aspects of materials involved in stories told. Lewis et al. (2017: 1069) demonstrate that “mentalising tasks [e.g., high orders of intentionality and ToM] are cognitively more demanding than factual tasks, and […] that there is a significant parametric effect in the brain regions involved as a function of the intentionality level at which subjects are required to work (higher order tasks require the recruitment of disproportionately more neural effort compared to equivalent non-mentalising tasks)”. Hence, there are meaningful differences in teaching factual tasks by explaining relationships between abstract concepts (level 5) and narratives (level 6) about social rules and alliance networks materialised in material culture, with the purpose for learners to make inferences from the stories that will influence their behaviour in the future.

The communicative aspects and the forms of advanced capacity for ToM involved in level 6 teaching require the well-developed ability to understand the emotions, attention, desires, intentions, and beliefs of others (Gärdenfors 2007). Kawamichi et al. (2014) speculate that this capacity may have developed differently in Neanderthals compared to H. sapiens (also see Nakahashi 2015). Variation in the genomes and brains between the two species associated with aspects of social cognition and language, as discussed above, may be a further indication of such different development. Yet, whereas teaching through narration is deeply imbedded in most recent hunter-gatherer socio-economies, we have not yet identified material culture that can convincingly demonstrate level 6 teaching during the MSA/MP before ~ 50 ka, so that we do not know what role it played at the time.

Collectively, levels 4 to 6 involve teaching for understanding—if learners can perceive the patterns involved in the concepts or the relations between them, they will be able to generalize their behaviour to new variations. Depending on how teaching is organised socially and culturally, it leads to variation in creative flexibility that may be transferred to domains of activities that are beyond those originally taught. In its most evolved form, teaching allows for the learner to abstract the content from what is being taught, and in multiple situations and in multiple ways flexibly change established styles and ways of problem solving by applying abstract concepts to new situations (Bateson 1972; Greenfield 2004). This involves the cognitively demanding high order of intentionality and ToM understanding that Lewis et al. (2017) discuss, and at least basic levels of grade 7 causal cognition (see above and Table 1). Omura (2014) suggests that these abilities developed differently in H. sapiens compared to other populations, and we suggest that the relatively narrow social networks inferred for Neanderthals by Pearce (2018) might have influenced how teaching was organized and consequently how teaching for causal understanding developed in them (also see Wynn and Coolidge 2012).

Galway-Witham et al. (2019: 16) stress that “while there are certainly biological differences between Neanderthals and H. sapiens, the behavioural gap has narrowed to a point where there seems to be little difference between the two, even at the point when Neanderthals were becoming physically extinct” (also see Villa and Roebroeks 2014; Roebroeks and Soressi 2016, but see Schmidt et al. 2019 for critical discussion). Our unpacking of aspects of intentional teaching within the context of a four-field co-evolution approach to human cognitive evolution suggests the same. Based on what some of their technologies reveal, both Neanderthals and H. sapiens practiced relatively high levels of teaching, requiring at minimum communicating abstract concepts (level 4), and explaining relationships between these concepts (level 5). We speculate that this overlap in teaching abilities facilitated cultural exchange and learning between the two populations during opportunities of social engagement, as well as aspects of symbolic behaviour. Certain gene and brain regions selected for in H. sapiens, as discussed above, would have fast-tracked effective teaching and learning, which could have played a role in the relatively quick down-regulation of Neanderthal genetic variants transferred into H. sapiens (Wolf and Akey 2018).

For example, there seems to be an intimate link between teaching and learning and precuneus activation, and even an increase in its volume. On a social level, Watanabe (2013) found that the implicit interpersonal synchronisation that forms an essential part of teaching enhanced activity in the precuneus of both teacher and learner, whilst Delazer et al. (2005) showed that learning by strategy (hypothetical causal reasoning) also activated the precuneus. In a situation where students received 2 weeks of cognitive training, they showed a significant volume increase in the dorsomedial frontal cortex, the orbitofrontal cortex, and the precuneus compared to students who were not involved in any teaching activity (Ceccarelli et al. 2009). In a study involving expert archers and non-archers (Chang et al. 2011), it was found that cortical regions that show greater activation in non-archers mainly involved the parietal cortex, including the precuneus. This observation was interpreted to reflect the fact that non-archers who have not yet mastered the complex task of accurate archery had to strain to integrate the relevant sensory and cognitive information needed to learn how to plan a successful shot. Such strain on the precuneus could indicate how the social processes of teaching and learning techno-behaviours might have contributed to the expansion of this brain region in H. sapiens (e.g. Lombard 2019; also see Bruner 2021).

Notwithstanding biological and some technological differences, teaching behaviours in Neanderthal and H. sapiens populations may have been similar, and variation may have been by degree rather than kind, so that the populations were probably able to learn from each other once they met. The association between the precuneus and teaching and learning in both technical and social settings indicates that H. sapiens brains, cognition, and social structures may have co-evolved to accommodate effective knowledge-transfer systems.

Technical Inventions Become Human Culture Through Social Negotiation

Hovers and Belfer-Cohen (2006: 299) argued that the emergence of H. sapiens culture may appear erratic because techno-behaviours become visible in the archaeological record “only after appropriate cues in the social and physical environments have triggered the passage from latent potential [invention or individual causal understanding], to actualised behaviour to prevalent norms [innovation or society’s understanding of the practical and social impact of a techno-behaviour]. Once the initial trigger kicks in, the particular behaviour appears. However, in order for such a behaviour to persist [their emphasis], the pertinent knowledge must be retained and transferred down the generations” (also see e.g. Tomasello 1999; Tostevin 2003, 2012; Omura 2014). Consequently, the key to understanding human cognitive evolution (in terms of how it reflects in technology) lies not only in studies of an individual’s capabilities for tool invention, production, and use but also in the analysis of knowledge-transfer systems that emerge when technologies become socially accepted and transmitted between generations and groups (e.g. Högberg and Lombard 2016a, 2021).

When thinking about how new technologies became persistent in a social setting (and as such visible and prevalent in the archaeological record, representing the cumulative causal understanding of a population) and to understand what could have led to their demise, it is useful to focus on socio-technical framework structures (Geselowitz 1993). Such structures interrelate amongst members of social groups and shape their thinking and acting towards a technological development (Bijker 1995). Discussing the concepts of introduction, closure, stabilisation, and destabilisation, we here elaborate on aspects of how new technologies can become socially accepted and consistently performed throughout generations within a group and/or among different groups (Högberg 2009; Högberg and Lombard 2016a, 2021) (Table 3).

Table 3 Phases in a socio-technical framework, modified from Högberg and Lombard (2021, see also Bijker 1995)

Many invented technologies are never accepted outside of their creation, nor become widely used despite their ingenuity. For a new technology to become accepted and part of everyday life, it needs introduction beyond the context of its ‘invention’, ‘discovery’, or ‘improvement’ (Renfrew 1978). As soon as it is introduced, it becomes a focus for negotiation in a wider social setting. Within the social and material networks involved in the negotiation, a give-and-take between people and technologies emerges (Latour 1992, 2007). These negotiations are not solely bound to technology-specific properties, such as their efficacy, productivity, proficiency, or individual preferences. They also relate to social norms and practices. If the new technology is not rejected by broader society, a technological ‘closure’ emerges across groups within a society or across societies. Here, an acknowledged social understanding of what constitutes the technology is established. Any previous social flexibility is markedly reduced or disappears. The technology is now in a state of stabilisation, which means that negotiation about its place and meaning in society is no longer needed. This technological stabilisation can vary between different social groups (Bijker 1995). If a stabilisation process lasts for a long time, that is, when intergenerational knowledge transfer happens for generation after generation without any considerable change, it becomes an institutionalized techno-behaviour, a social norm within a group and across a social landscape (see Högberg and Lombard 2021).

Once in closure, technology continues to develop and/or change in interaction with the social network in which it functions. This interaction is not just about introduction, closure, and stabilisation; technologies are also destabilised by being rejected or copied, and are then dismissed or brought to new closures. In other words, through time, society gives new meanings to technologies that lie outside the consensus of previous closures. Destabilisation means that meaning, significance and values formerly ascribed to a technology by stabilisation no longer apply (Table 3). Innovations are developed, expanding on original intents, starting the destabilisation process. This results in either the disappearance of the original technology or a new closure process. New and old perceptions about the technology are re-aligned with social negotiation that successively moves towards stabilisation. This can in turn lead to further development or change to adapt the technology to suit new demands entailed by destabilisation (Bijker 1995). Here, we use a selection of stone tool technologies to elaborate on possible MSA/MP socio-technical frameworks, with the purpose of adding to the discussion on relevant cogni-behaviours.

As mentioned above, the Levallois reduction approach was wide-spread and used in Africa and Europe by both Neanderthals and H. sapiens to produce flakes, points, and various retouched implements (Debénath and Dibble 1994; Tryon et al. 2005; Lycett et al. 2010; Delagnes and Rendu 2011; Shea 2013; Wurz 2013; Schmidt et al. 2019). Notable spatio-temporal variation exists in how it was performed (Locht et al. 2016). Delagnes and Rendu (2011: 1771) show, for example, how it varies “significantly in terms of duration of reduction sequences”, and Mathias et al. (2020) report on gradual changes in the lithic record from the Early MP and onward. For the Neanderthals, Richter (2000) also reports on variation in regional technological introduction, closure, and destabilisation of tool production and tool curation traditions between ~ 280 and 60 ka (see also Dibble and Bar-Yosef 1995; Uthmeier 2000; Ruebens 2013; Brenet et al. 2017).

Notwithstanding spatio-temporal variation, the Levallois reduction technology as a general approach to produce specific kinds of flakes and points persisted throughout the whole MSA/MP. In the context of our discussion on socio-technical frameworks, Levallois reduction technology can be seen as an example of a long-lasting MSA/MP socio-technical stabilisation spanning numerous generations and multiple populations for thousands of years. Consequently, long-lasting traditions (e.g. Levallois reduction approach) as well as regional technological variation seem to have been working simultaneously in Neanderthal populations. Galway-Witham et al. (2019: 14) conclude that “Neanderthals of the late Middle Palaeolithic (~115-39 ka) had a developed sense of distinct cultural identities expressed through the lithic record” and that “[…] this support[s] the notion of distinct regionalized cultural behavior among late Neanderthal groups in Western Europe as defined through the lithic record” (also see Soressi 2005; Morin 2012). These outcomes imply variation in Neanderthal technology and closure processes that formed in regional settings. However, the variation is based on a general socio-technical stabilisation (i.e. changes referring to existing reduction approaches), not on the introduction and closure of new technological innovations (i.e. newly invented reduction systems). Hence, stabilisation seems to have been strong in Neanderthal techno-behaviours, and destabilisation in terms of knapping norms slow or non-existent.

The MSA in sub-Saharan Africa is also the period during which stone tool technologies started to indicate regional variability and group identity (e.g. Clark 1988). Whilst the record is not yet understood fully, broad overviews of its cultural sequence indicate that socio-technical processes of introduction, closure, stabilisation, and destabilisation were relatively dynamic. In southern Africa, some of the technical turnarounds seem to reflect the introduction and stabilisation of innovations such as the use of pressure flaking and/or heat treatment to produce thin bifacially flaked Still Bay points from ~ 80 ka (Henshilwood et al. 2001; Högberg and Lombard 2016b; Schmidt and Högberg 2018). The Still Bay industry was destabilised and replaced by ~ 66 ka with a bladelet-based microlithic industry known as the Howiesons Poort lasting to ~ 58 ka (e.g. Henshilwood et al. 2011; Wadley 2015). During this time, we also see the introduction of bow hunting in the region. Both of these industries seem to have spread relatively quickly across the landscape subsequent to their introduction (Högberg and Lombard 2016a, 2021; Lombard et al. 2019b). The maintenance of extended spatio-temporal social ties could explain the relatively rapid spread of technological concepts such as the making of Still Bay points in southern Africa, as well as its replacement with a distinctly different industry across the sub-continent. These socio-technical expansions were probably facilitated through chains of short-distance interaction in foraging ranges and during aggregation and dispersal phases (e.g. Wadley 1987; Pearce 2018), similar to that of ethno-historically known foragers (Yellen 1997; Wiessner 2002).

As mentioned earlier, Pearce (2018) stresses that H. sapiens maintained social ties between larger numbers of individuals and over greater distances than Neanderthals did. Such variation in the spread and structuring of social networks would have also influenced cognitive and neural adaptation. For example, to be able to keep tabs on several social partners in their absence in the context of dispersed networks, the capacity to create a ‘virtual inner reality’ is needed (e.g. Bruner 2010). This implies hypothetical reasoning (grade 7 causal reasoning) or imagining alternative solutions and the necessary neural architecture for such thinking.

Once again, the precuneus is one of the brain regions that helps to facilitate this type of social cognition (Pearce 2018; also see Freton et al. 2013 for independent neurological study). Emonds and colleagues (Emonds et al. 2014) further found that cooperating pro-social individuals show significantly more activation in the precuneus compared to individuals who are less cooperative or more ‘pro self’. Thus, whereas the unique evolutionary trajectory of the precuneus in H. sapiens is strongly linked to aspects of technical cognition (e.g. Bruner 2021), it is equally linked to social cognition, most likely resulting from a co-evolutionary process. Variation between the morphology and connectivity of the precuneus in Neanderthals and H. sapiens would therefore indicate variation in socio-technical behaviours between the two populations.

Looking at the cases presented above, we see that they indicate socio-technical frameworks of stabilisation as well as of destabilisation and new introduction. The Levallois concept is, notwithstanding its regional variations, an example of general long-term stabilisation and of wide-spread use of this technology by both Neanderthals and H. sapiens. At the same time, we also see innovation, stabilisation, and destabilisation in H. sapiens technology on a scope not comparable to what we see in Neanderthal technology, consequently indicating different cogni-behavioural development from a socio-technical perspective.

Ecology

Galway-Witham et al. (2019: 2) highlight the waxing and waning of the African desert landscapes as a key factor in the geographic control of human populations and cite suggestions for climate change as a driver in the dispersal of hominins into southern Europe (also, see Stewart and Stringer 2012 on ecological refugia and climate change). The ultimate loss of side branches to our evolution, such as H. naledi during the MSA/MP, may have been “the result of the cumulative effect of differential resilience to both climate change and inter-group competition” (Mounier and Lahr 2019: 10). Populations focusing on diverse aspects of an ecological niche will cause selection for different capacities (see Dusseldorp and Lombard this volume). This precludes straightforward or uni-modal interpretations for the evolution of primate or human cognitive and social behaviours (e.g. Foley and Gamble 2009; Boesch 2012). Rosati (2017: 691) suggests an integrative theory that is able to “account for how humans are unique in both our sociality and our ecology”; similarly, Malafouris (2015, 2019, 2021) proposes that by inventing technologies that allow mind and body to interact with the world, humans form their own evolutionary trajectories.

At the beginning of this paper, we highlighted that one of the things that differentiates humans from all other animals is that, aided by technology, we became the only surviving hominin inhabiting, changing, and dominating most ecological niches on earth. Consequently, nature does not pre-set human environments; instead, human endeavour creates, enriches, maintains, and modifies them. The result is that H. sapiens occupy a socio-cognitive niche that constantly and seamlessly interacts with ecological settings and technological domains.

Much has been written about the physiological and/or behavioural adaptation of Neanderthals to Late Pleistocene Eurasian climates (e.g. Trinkaus 1981; Holliday 1997; Richter 2006; Finlayson and Carrión 2007; Smith 2015; Stewart et al. 2019), but little is known about the cognitive effects of these adaptations. From a technological point of view, Bocquet-Appel and Tuffreau (2009: 287) demonstrate that between ~ 240 and 40 ka Neanderthals in general show “inertia in the development and use of lithic tools for 200,000 years, despite the four cool to cold macroclimatic periods they experienced”. This indicates a socio-technical stabilisation, well adapted to function in various climates. On the other hand, the physiological adaptations of Neanderthals to their climatic conditions were probably more energetically expensive, compared to H. sapiens (e.g. Froehle and Churchill 2009). This hypothesis submits that Neanderthal populations may have been more specialised and would have been at greater risk in changing or calorie-scarce ecologies, which may have also affected their reproduction (e.g. Nielsen et al. 2017; Pontzer 2017; Hajdinjak et al. 2018).

Again, we draw on spear vs. bow hunting. Shea and Sisk (2010) consider behaviours such as bow hunting to be ecological niche-broadening compared to spear hunting. It can be easily and globally adapted to prey type, season, socio-economy, and ecology, providing techno-behavioural flexibility that dramatically increases the scope and potential for successful hunting forays (Lombard 2016). We suggest that bow hunting also provided H. sapiens with other evolutionary advantages. For example, it allows a single hunter to achieve what can only be accomplished in a group with spear hunting, but is equally adapted to group hunting, allowing for increased demographic plasticity on the landscape. Bow hunting is therefore not only niche-broadening in terms of prey type and/or landscape (Shea and Sisk 2010), nor is it necessarily aimed at specialisation (e.g. Dusseldorp 2012); it also increases the fitness profile of a single person or a small group during times of change. Examples of such change may include shifts in climate, migration into territories with different flora and fauna, or strain on food packages for either ecological or socio-political reasons or both. The main evolutionary advantage of technologies such as a bow-and-arrow set is the amplification of conceptual, technological, and behavioural modularization (Lombard and Haidle 2012; see Bassett et al. 2011 for neurological equivalent), and the resulting socio-economic and ecological flexibility, which is achieved through analogical reasoning and grade 7 causal understanding (Table 1). Thus, a population’s cognitive and technical ability to incorporate bow hunting in their arsenal was probably part of a suite of similarly flexible techno-behaviours that equipped our ancestors to deal successfully with a wide range of ecological and socio-political challenges, either in thriving in their existing terrains or when moving onto new landscapes.

During MIS 5 (~ 130–71 ka), H. sapiens hunter-gatherers in southern Africa hunted with spears (Villa and Soriano 2010). The subsequent MIS 4 (~ 71–57 ka) demarcates general climatic/ecological change in the region on several fronts. For example, it is a relatively cold period with temperatures of 1.5–2 °C cooler than the Last Glacial Maximum (LGM) at ~ 22 ka, moisture/water levels were generally higher than before, and there is an increase in evergreen woodlands and an expansion of Afromontane vegetation (e.g. Scott and Neumann 2018). By the middle of MIS 4 at ~ 64 ka, we see the earliest multi-stranded evidence for H. sapiens bow hunting in southern Africa. By MIS 3 (starting at ~ 57 ka), when the landscape opened up to hunting large grassland animals, people seemingly took up spear hunting again as the preferred meat-getting strategy (Clark and Plug 2008; Wadley et al. 2008; Parsons and Lombard 2011). This technical flexibility is similar to what Riede (2008) described for bow hunting in northern Europe by ~ 13 ka associated with changing socio-ecology after the Laacher See eruption.

In the meantime, during MIS 4–3 at Amud Cave in the Levant, Neanderthals were also changing their hunting strategies to adapt to the varying ecology (Hartman et al. 2015; also see Stewart et al. 2019 for broad ecological overview). But instead of pursuing new prey types or inventing new hunting technologies, they opted to change their hunting range for gazelle from high elevations further from the site during the drier earlier phase, to more diverse and low elevations closer to home during the later phase (Hartman et al. 2015). This may also reflect the finding of Stewart et al. (2019), based on genetic studies, that Neanderthals probably had, on average, more power-associated alleles per individual than H. sapiens, and therefore may have evolved to be more powerful. They link this development with ecologies more suited to encounter or ambush hunting strategies that require powerful modes of locomotion (such as sprinting), as opposed to long-distance endurance running associated with pursuit hunting. The latter has a long evolutionary history on the African grasslands (e.g. Liebenberg 2006; Lieberman et al. 2009) and is often associated with more recent bow hunting.

Collectively, these case studies indicate change in landscape use and physiological/genetic adaptation in Neanderthals to hunt effectively within their ecological settings. This is in contrast with the technological and associated cognitive flexibility in responding to socio-ecological change as reflected in the innovation of, for example, bow hunting by H. sapiens in southern Africa at the same time.

Concluding Discussion

Here, we have explored a four-field co-evolutionary model as a tool to improve understanding of human cognitive evolution in the context of feedback loops between biology, technology, society, and ecology. Applying the model to the pre-50 ka MSA/MP time slice enabled us to discuss variation and overlap between populations in terms of their cognition. The result of our endeavour is a comprehensive and varied theoretical account of the development of the MSA/MP human mind. By placing cognition in the centre of our model, we were able to show that brain-selective genetic variants almost certainly contributed to variation in human brain size, shape, and connectivity. Especially, dissimilarities in the precuneus, cerebellum, and thalamus connect to how higher-grade causal reasoning developed differently in H. sapiens and Neanderthals. This includes variation in memory processing, verbal fluency, brain connectivity, and emotional and social processing.

Variations and alternative development between the two groups are also evident in how socio-technical change and persistence developed through time. Our analysis shows that we may predict similar levels of symbolic behaviour and intentional teaching in H. sapiens and Neanderthals, and we speculate that such similarities may have facilitated teaching and learning as well as cultural exchange between the populations when they encountered each other. However, divergences in the range and structuring of social networks would have influenced cognitive and neural adaptation in different ways, linking the unique evolutionary trajectory of the precuneus in H. sapiens to developed socio-technical engagement and grade 7 causal cognition. We also see differences in the innovation, stabilisation, and destabilisation of stone point technologies. Notwithstanding spatio-temporal variations, the Levallois reduction approach illustrates a general long-term, stable technology used by both Neanderthals and H. sapiens. Yet, whilst Neanderthals did not develop this technology into something completely different, stone point technology changed radically with later H. sapiens groups in present-day sub-Saharan Africa.

The external body’s impact on MSA/MP cognitive evolution is not clear-cut. Adaptation to throwing, for example, was probably grounded in the common ancestor of both Neanderthals and H. sapiens, who are also associated with spear hunting and hafting (Lombard and Haidle 2012). In relation to these techno-behaviours and their linked grades of cognition, we do not see any traces of independent development in the two populations. As a result, spear throwing indicates similar behaviour and cognition. Evidence of bow hunting for H. sapiens during the later stages of the MSA/MP, however, shows divergence in terms of biology, technical range, and cognition. This also relates to how techno-behaviours evolved in relation to changes in ecology. Whilst Neanderthals in the present-day Levant adapted to ecological change by adjusting their hunting range, still using the same technology to hunt the same prey, H. sapiens invented bow-and-arrow technology, allowing them to vary group size and prey range, and increasing the success rate of individual hunters, thereby dramatically escalating their fitness profile. Consequently, socio-technical invention in combination with cognitive flexibility in response to ecological change made H. sapiens “less restricted in the landscapes which they could inhabit” (Galway-Witham et al. 2019: 8).

Galway-Witham et al. (2019) draw our attention to significant gaps in the understanding of the relationships between different MSA/MP populations, and how variation in biology, socio-technical development and ecological adaptation plays out over time and space. Whereas they conclude that “the behavioural gap has narrowed to a point where there seems to be little difference between the two” (Galway-Witham et al. 2019: 16), our exploration in the context of a four-field co-evolutionary model provides more nuanced insight into probable variation and similarities between the pre-50 ka Neanderthal and H. sapiens populations. This does not imply that aspects of Neanderthal cognition were inferior to those of H. sapiens, or vice versa. Instead, it shows contextual adaptive variation and that Neanderthal-specific cognitive developments deserve more attention, instead of simply being equated with the H. sapiens mind when they display complex cognition. For post-50 ka populations, we argue that what is represented in the Eurasian archaeological record is probably the product of mixed populations (to a larger or lesser extent), instead of belonging solely to either the Neanderthals or to H. sapiens.

Klein (2019) suggests that novel gene constellations post-50 ka gave rise to ‘fully modern’ people and the Later Stone Age in Africa. However, the fact that the precuneus in its current form is already visible in the palaeoneurological record from ~ 100 ka during the MSA reveals a different scenario. Also, initial work on the genomes of living Khoe-San and ancient hunter-gatherer populations from southern Africa indicates selection for brain-selective gene regions associated with neuron connectivity at more than 300 ka (Schlebusch et al. 2020). Based on what we presented here as part of our four-field co-evolutionary exploration, we suggest that the current evidence supports a scenario of long-term, incremental physiological, neurological, genetic, and socio-technical developments that gave rise to the sapient mind since the H. sapiens-Neanderthal split several hundred thousand years ago, instead of an abrupt appearance post-50 ka. We were able to highlight only a few aspects within each of the four fields of our model, and much work remains. Our current perspective, however, is that the MP Neanderthals shared some ways of thinking with H. sapiens but that there were also differences in how the two populations understood and interacted with the worlds they lived in before encountering each other.