Language is biocultural behaviour (Darwin 1871; Sapir 1927; White 1940; Deacon 1997; Tomasello 2005; Christiansen et al. 2009; Fitch 2010; Arbib 2018); thus, research into its origins is necessarily an interdisciplinary exercise. Models of language origins typically integrate social, cognitive, anatomical and genetic data as well as broad comparative perspectives drawn from ethology (Tallerman and Gibson 2012). Archaeology provides the critical time depth for model building. Although there is broad agreement that symbols are crucial to language, there is profound disagreement on what constitutes language, when it evolved and on the interpretation of the material evidence (e.g. Noble and Davidson 1996; Deacon 1997; Corballis 2002; Hauser et al. 2002; Everett 2017; Fitch 2017; Boë et al. 2019).

We take a uniformitarian approach which assumes language evolved by natural selection from a primate heritage of vocal and gestural communication. Our theoretical foundation combines Peirce’s semiotics (1977), which distinguishes between index, icon and symbol, with ethnolinguistic data which challenge preconceptions about the inherent grammatical complexity of language (Everett 2005; Jackendoff and Wittenberg 2014) (see Footnote 1). Both sources enable us to broaden the search for the beginnings of language beyond the current consensus among archaeologists on what constitutes evidence of symbol use (e.g. Klein 2017). Comparative ethnographic and anatomical evidence also shows that language, defined here as communication based on symbols, does not depend on either a broad vocal repertoire or a fully modern vocal tract (Boë et al. 2017; Fitch 2018). We use these data to offer a model for a simple grammatical structure in the earliest language, with recursive grammar a later and non-essential component of language.

Sociological, ethnographic and ethological observations provide evidence of a central role for tools in the construction of society (Pfaffenberger 1992; Latour 1992; Hodder 1994, 2012; Gosden and Marshall 1999; Ingold 2001; Skibo and Schiffer 2008). Contemporary societies have names for tools and conventions for their making, and they carry expressive meaning beyond their utilitarian ends (Arthur 2018). In Peirce’s semiotic scheme, names are symbols, and by implication, the earliest evidence of symbols lies in conventional tool forms and the strategies for making them. Our summary in this paper of Peirce’s scheme has a secondary aim which is to reintroduce the study of signs to evolutionary cognitive archaeology as a complement to current models drawn from cognitive science (Wynn 2017). We do not set out to offer an entirely new theory of the origin of language, but rather a new perspective on the evidence base that supports the thesis that Homo erectus had language.

We begin with a brief review of the philosophical and historical context of the current debate over language origins, highlighting the contrast between punctuated and gradualist models. The hypothesis of a recent and rapid appearance of language, as defined by symbols organised in complex nested grammatical structures (recursion), continues to dominate interpretations of the archaeological record (e.g. Bolhuis et al. 2014; Klein 2008, 2017). This non-Darwinian perspective on language origins is founded on the work of the linguist Chomsky (1955, 1965). Proponents of gradualist hypotheses tend to posit a protolanguage phase which precedes the emergence of recursion-based language (e.g. Donald 1991; Corballis 2002; Bickerton 2014). We highlight previous applications of Peirce’s theory of signs to the issue of language evolution (Deacon 1997, 2010; Cousins 2014; Everett 2017). Our approach differs in accepting symbol use with a simple grammar as sufficient evidence for the existence of language with no need for a protolanguage. A three-part evolutionary typology of grammars lies at the foundation of Everett’s model (2017), in which symbols arose as a distinctive form of communication based on arbitrary conventions of meaning generated in cultural contexts (Everett 2016).

We then outline a theoretical foundation that defines symbols and considers the social contexts of symbol use in relation to technology. First is Peirce’s theory of signs and his concept of a semiotic progression from icon to index to symbol (Peirce 1998; Fisch 1986). Second, we draw generalisations about tools as symbols from observations by sociologists and anthropologists of contemporary and pre-industrial societies. These observations highlight the social construction of the meaning of tools and how decisions about production methods reflect social conventions (e.g. Latour 1992; Killick 2001). This section concludes with an assessment of the non-human primate capacity to generate perceptual and conceptual categories of objects (Grüber et al. 2015) as evidence of a deep evolutionary foundation for constructing symbols. Modern humans are distinctive among animals for using tools as symbols.

We then examine the early archaeological record for evidence of socially constructed conventions (symbols) with a focus on the Acheulean of Africa and Eurasia from about one million years ago onwards when conventional tool forms become a recurrent feature of the archaeological record. The evidence takes the form of regional and chronological changes in approaches to making large bifaces (cleavers, hand-axes), and in the life history of these technologies which demonstrate spatially extended chaînes opératoires including the caching of tools in the landscape (Preysler et al. 2018). Multiple ways of achieving similar ends (equifinality) become evident in core preparation strategies at this time (Sharon 2009; Galloti and Mussi 2017) which we interpret as evidence of culturally governed choices among viable alternatives (Latour 1992; Pfaffenberger 2001). Semantic scaffolds (words or gestures as labels) would have eased the cognitive demands created by some core strategies which involved nested hierarchies of steps in blank production (Herzlinger et al. 2017). Language (speech and gesture based) would also have facilitated the teaching of such complex routines to novices (Morgan et al. 2015; Gärdenfors and Högberg 2017:201). Evidence in the Acheulean for the caching of hand-axes is indicative of extended future planning, and arguably for abstract thought which is the foundation of symbol construction (Gärdenfors 2004). Language without complex grammar was sufficient for the transmission of all these aspects of Acheulean technological behaviours.

Additional support exists for an early emergence of language in the settlement of island Southeast Asia by hominins ~ 800,000 years ago (Bednarik 1997, 2014; van den Bergh et al. 2016; Ingicco et al. 2018). Early sea crossings arguably involved levels of coordinated planning and action that exceed the communicative capacity of gestures alone.

The structure of our argument, building on Peirce, addresses five questions raised by Ingold (1993:337) and others since (Noble and Davidson 1996; Corbey et al. 2016; Tennie et al. 2016; Shea 2017) on the utility of hand-axes as evidence for early language: (1) can the longevity of the hand-axe (and cleaver) as forms be evidence of cultural norms given there is no modern analogue for such persistence; (2) does such persistence necessitate cultural transmission; (3) did the objects conform to a representation in the mind of the maker; (4) do they tell us anything about hominin sociality; and (5) might they have had “communicative or semiotic as well as technical functions?” We return to these questions in the discussion and conclude with the implications of attributing language to Homo erectus and erectus-like species.

From Plato to Chomsky: Epistemologies of Language Origins

A fundamental division characterises current research on how and when language began. The split lies along deep philosophical fault lines that separate Platonists—who believe in universal or innate ideas shared by all humans (Defez 2013)—and the Aristotelian view of language as an inherently cultural phenomenon, learned in social contexts from a young age (Corballis 2002; Tomasello 2005, 2014; Everett 2016, 247ff) and based on neurobiological capacities for acquiring language (see Tallerman and Gibson 2012 for a summary of a debate on language specific vs. generalised biological structures for language learning).

These contrasting positions formed the basis of discussions on the origins of language in the eighteenth and nineteenth centuries. Plato’s perspective of language as an innately human faculty was transformed into a theological position of human exceptionalism explained by the divine origin of reason (Müller 1864). The Société Linguistique de Paris, in 1866, famously decreed that it would no longer discuss the issue at its meetings as it was an insoluble metaphysical problem (Defez 2013). Darwin (1871) took a more broadly comparative approach to the problem of language origins, finding continuity between human and non-human forms of communication. Natural and sexual selection supplanted, in his view, essentialism as mechanisms for understanding how language evolved. Darwin’s gradualist view of language origins follows from his view of evolution as an accumulative process that can produce complexity. New traits emerge from existing traits, and abilities related to human language will be found in other species, and particularly among primates.

Platonism returned in force in the mid-twentieth century with the work of Chomsky (e.g. 1956, 1965, 1959, 1995). In his Transformational-Generative Grammar (or Minimalism), language is a grammatical system above all else. Chomsky’s embrace of Cartesian dualism leads him to reject Darwin’s idea that we might find the precursors of human language in other species (Berwick and Chomsky 2016). Indeed Chomsky and his followers have argued explicitly against Darwinism (e.g. Piatelli-Palmarini 2010), in favour of the position of Alfred Wallace that language could not result from Darwinian evolution. Bickerton (2014) refers to this as “Wallace’s Problem” (see Footnote 2).

In the late twentieth century, the case for language as a product of gradual natural selection was articulated by Pinker and Bloom (1990). More recently, the evolution of language has been framed in the context of more holistic approaches to cultural evolution which recognise the importance of social learning in the acquisition of language (Richerson and Boyd 2005; Tomasello 2005), and in the gradual development of linguistic structures (e.g. Christiansen and Kirby 2003; Steels 2012; Hurford 2004, 2014).

Models of Language Origins and the Interpretation of the Archaeological Record

Given Chomsky’s enormous influence in linguistics and related disciplines, a philosophical divide continues between supporters of a recent punctuated origin of language and those who maintain a gradualist evolutionary position (summarised in Tallerman and Gibson 2012; Haspelmath 2020). The material evidence used by both camps incorporates both the archaeological and fossil record, with inferences drawn about the need for language (symbols) in relation to the hierarchical complexity of a task (Wynn 2002), and from the fossil record in relation to the capacity to produce speech as a component of language (e.g. Lieberman 2007). We start with the essentialist position of Chomsky and illustrate its lasting impact on archaeological theory and method. The gradualist position lacks a figurehead and instead manifests itself in a variety of accretionary hypotheses including our semiotics-based position presented here.

A Punctuated Origin

The most enduring model developed since the 1950s is that of Chomsky, in which human language is distinguished from other forms of communication by the presence of hierarchical recursive grammar generated by a computational system in the brain, independently of cultural context (Chomsky 1965; Chomsky and Schützenberger 1963; Hauser et al. 2002). Recursion involves embedding sub-phrases into phrases of similar type, and in theory enabling an unlimited range of sentences (and meanings) to be constructed from a limited range of sounds. According to this innatist view, all modern humans are born with this uniquely human faculty for producing language with recursion (universal grammar) which arose suddenly in Homo sapiens from a genetic mutation in the brain sometime between 70,000 and 50,000 years ago (Bolhuis et al. 2014). The most relevant archaeological evidence for language takes the form of proxies for symbol use because “language is interdependent with symbolic thought” (Bolhuis et al. 2014:3). Botha (2010:202) adds the requirement of a bridging theory between claimed evidence for symbol use and fully syntactical language (or recursion). Such a theory should incorporate testable hypotheses, such as those drawn from neuroscience, marshal factual evidence and not be ad hoc. At the core of this approach is a computational model of the mind in which the language mutation represents a marked increase in information processing capacity, independent of cultural context.

The proposition that recursion is the essence of language has never been fully accepted by all linguists (see Tallerman and Gibson 2012), but it entered the mainstream of archaeological interpretation in the 1970s in a regional analysis of the Middle to Upper Palaeolithic transition in southwestern France (Mellars 1973). Stark contrasts were drawn between the two behavioural records produced by two different species, Neanderthals and Homo sapiens respectively. These became the unintended foundation of the concept of a more general “Human Revolution” (Mellars 1989, 2005) in which symbol use and complex (recursive) language marked the emergence of behavioural modernity (Henshilwood and Marean 2003).

The human faculty for producing recursive grammar, or its equivalent “fully syntactical language”, features consistently as the key advantage that Homo sapiens possessed over other hominins, especially in relation to Neanderthals. The Middle to Upper Palaeolithic transition reflects this underlying difference in communicative superiority, with anatomically modern humans able to produce a range of behaviours far beyond the capacity of Neanderthals (Mellars 1973, 1989, 2005). Complex language enabled the development of new kinds of standardised stone tools (blades), organic artefacts, long-distance transport of materials, new subsistence behaviours and objects bearing symbolic value as well as the capacity to innovate quickly. Symbolic value was recognised to reside in abstractions such as cave and portable art as well as personal jewellery and the act of burial with grave goods.

The relatively abrupt transition from the Middle to Upper Palaeolithic marked a symbolic explosion which “must reflect the existence of relatively complicated and highly structured forms of language” associated with H. sapiens (Mellars 1989:359). Similar interpretations were made of this transition in the 1980s and 1990s (Chase and Dibble 1987; Davidson and Noble 1989; Byers 1994) with the more recent addition of demographic superiority as a consequence of the human capacity for innovation founded on fully syntactic language (Mellars and French 2011).

Elements of the “Revolution” have since been found in the African Middle Stone Age (from 300,000 years ago with regionally variable end dates) associated with Homo sapiens, supporting arguments for an earlier development of symbol use in Africa than in Europe (McBrearty and Brooks 2000; Henshilwood and Marean 2003; Barham and Mitchell 2008; Wadley 2015). This evidence has been incorporated into the essentialist paradigm as evidence of the language mutation occurring as early as 70,000 years ago with Homo sapiens in Africa (Bolhuis et al. 2014), or even later once there is consistent rather than episodic evidence of symbolic behaviours in the African record (Klein 2008, 2017; see Fisher 2017 for a critique of the genetic evidence).

The latter interpretation takes an absolutist position that Dawkins, in a blogpost (2011), calls the “tyranny of the discontinuous mind” which is “blind to intermediaries”. Clear discontinuities should exist, in this extreme view, between the modern human capacity for recursion-based language and the more limited linguistic capacities of other hominins (Zilhão 2019). Recent discoveries of evidence for the capacity of Neanderthals to create a range of symbolic objects appear to give this hominin membership in the once exclusive club of symbol makers (e.g. d’Errico and Stringer 2011; Finlayson et al. 2012; Aubert et al. 2014; Villa and Roebroeks 2014; Jaubert et al. 2016; Hoffmann et al. 2018). There have been challenges to the claims of Neanderthal authorship of rock art based on issues of contamination with the dating, and similarly with the early dates attributed to some personal ornaments (White et al. 2019; Pons-Branchu et al. 2020).

An extended assessment of the evidence for Neanderthal symbol use and language concludes that to organise the hunting of large game they had to refer to abstractions of space and time in the planning (i.e. not here, not now). To do so required the capacity to construct “arbitrary Saussurean linguistic signs” (Botha 2020:155) which in Peirce’s semiotics (below) would be symbols. Botha concludes that they lacked the necessary brain structures to produce complex grammar (recursion), but may have had the capacity to string together simple sentences. In the gradualist model developed in this paper, the capacity to create symbols is sufficient for language with no need for complex grammar to communicate complex thought. If we attribute this capacity to Neanderthals, then parsimony points to an earlier origin of language with the common ancestor of H. sapiens and Neanderthals (Deacon and Wurz 2001), now thought to have existed at least 600,000 years ago (Martinon-Torres et al. 2018; Welker et al. 2020), or to convergence through separate, independent evolution. The first position opens the door to the roots of language with Homo erectus or its descendants, and the second suggests the foundations for symbol making were widespread among other hominins, with the possibility that language evolved independently more than once.

A Gradual Evolution of Language

Gradualist models have a long pedigree (Darwin 1871), but placed in the time frame of Chomsky’s influence, a variety of approaches have emerged that vary in emphasis on the biological or cultural factors influencing the origin of language, and in their interpretation of the archaeological record (e.g. Donald 1991; Dunbar 1996; Noble and Davidson 1996; Mithen 1996; Power 2009; Corballis 2002; Bickerton 1990, 2014; Coolidge and Wynn 2009; Rossano 2010; Lombard and Gärdenfors 2017). Deacon (1989; 2010), Cousins (2014) and Everett (2017) stand apart from other gradualists in using Peirce’s theory of signs. None is an archaeologist, which is noteworthy given the rarity of engagement with Peirce by Palaeolithic archaeologists (Iliopoulos 2016; Wynn 2017; Ruck and Uomini in press). This reluctance by archaeologists to apply semiotics to the deep past may reflect unfamiliarity with Peirce’s work, or resistance to it because of its association in recent decades with structuralism and the post-structuralist critique of positivist science (Preucel 2006). In this context, the work of evolutionary biologist Deacon (1997) marks a key development in using Peirce’s triad of signs (icon, index and symbols) as a framework for the evolution of human consciousness. He argues that only humans represent or give meaning to experience through arbitrary symbols (language) and that Homo erectus had the capacity to form language-based societies, but lacked the anatomical ability to produce articulate speech, citing Lieberman’s reconstruction of the anatomical constraints of the pre-sapiens larynx (Lieberman 1984). These societies communicated using a mix of limited sounds that carried symbolic meaning coupled with gesture, and over time, a linguistic niche evolved (though cf. Everett 2016, 170ff for a critique of “niche construction theory”). The coevolution of an extended childhood and articulate language followed a Baldwinian trajectory which favours the selection for traits which facilitate social learning (Deacon 2010).

Deacon’s characterisation of the limited capacity for articulate speech with H. erectus plays a critical role in his gradualist model of a developing language niche. The status of the vocal tract as critical to articulate speech production has since been challenged (see Laitman 1984; Boë et al. 2013; de Boer 2017; Fitch 2018; Boë et al. 2019 for syntheses of human and non-human primate evidence and Dediu et al. 2017 for variability of the vocal tract in modern human populations). The fossil evidence now indicates that modern-like speech and auditory capacities had evolved by at least 430,000 years ago in the ancestor of Neanderthals and Denisovans (Martínez et al. 2004, 2008; Gómez-Olivencia et al. 2007; Dediu and Levinson 2013; Steele et al. 2013; Aboitiz 2018). The neurological control of breathing to produce articulate speech may have evolved as early as 1.8 Ma with Homo erectus, but was not present in australopithecines (Meyer 2016; Meyer and Haeusler 2015; cf. MacLarnon and Hewitt 2004).

Comparative linguistic data provide additional support for the observation that only a few sounds are needed to produce language (Newbrand 1951; Firchow and Firchow 1969; Everett 1979), and the majority of the world’s languages (60–70%) employ tones to distinguish words (Yip 2002) along with other prosodic features that rely on laryngeal features that do not implicate the vocal apparatus directly (Everett 2012). Homo erectus, and other hominins, could have used tones to supplement a small phonemic inventory to clarify, as all tone languages do, words that might otherwise sound alike.

Cousins (2014:163), a cultural psychologist, uses Peirce’s framework to argue for a “semiotic coevolution” of the capacity for meaning-making with supportive cognitive, social and vocal structures. Agreed meaning is only adaptive in the context of “culturally grounded knowledge about the world – conventions, narrative, beliefs” (Cousins 2014:164). In this model, cultural knowledge emerged from tool-making, starting with the Oldowan, as a physical nexus for cooperation between individuals. Tool-making, language and social learning co-evolved, creating a distinctive cultural niche. As with Deacon, Cousins (2014:164) posits an initial protolanguage based on a few words (symbols) which gradually evolves through Baldwinian selection into a more grammatically complex language.

Everett (2008, 2016, 2017) applies his perspective as an ethnolinguist, with a long experience working among South American hunter-gatherers and horticulturalists, to developing a model of language evolution that draws directly on Peirce’s theory of signs. Underlying Everett’s approach is a three-stage typology of grammatical complexity that recognises the variability observed among contemporary languages, including those lacking recursion, as found in some small-scale societies (Jackendoff 1999; Everett 2005; Gil 2009; Jackendoff and Wittenberg 2014). A meta-analysis of the morphological and syntactical structures of > 2000 languages has shown a significant correlation between group size and language structure (Lupyan and Dale 2010). Speakers of languages in small societies use fewer words, but more inflection to express meaning than speakers of languages in large groups who typically rely on increased word content and grammatical complexity to convey meaning.

In Everett’s typology, the most basic grammar, referred to as G1, has a linear word order (subject-verb-object) that conveys meaning (Fig. 1). G2 languages have hierarchical structures but no recursion (Fig. 2), and G3 languages have recursion (Fig. 3) (Everett 2017: Chapter 9). In this hierarchy of grammars, there is no need for a protolanguage in language evolution; a G1 language is sufficient to convey nuanced, abstract meaning. G1 languages evolved first, with recursion a late and unnecessary expectation for early languages (Karlsson 2009; Everett 2012). G1–G3 coexist today with G1 and G2 languages found in societies without written languages (Everett 2005; Gil 2009).

Fig. 1 a–c Three diagrams illustrating the linear sentence structures enabled by G1 languages

Fig. 2 An example of the hierarchical nesting of sub-phrases in a G2 language

Fig. 3 Diagram of the embedded structure of a G3 language with recursion

The empirical differences in these three grammars are illustrated diagrammatically using sentences 1–3, in Fig. 1a–c:

  1. John came in the room. John sat. John slept.

  2. John entered the room by the garden. John slept.

  3. John came in the room, sat, and slept.

The illustrations in Fig. 1a–c conform to a G1 grammar.

In these diagrams, there are no category labels, e.g. “noun” or “verb”, and no phrase labels, such as “verb phrase”. The simplest grammatical structure would be a linear arrangement of words as a proposition/sentence. There are modern languages represented by G1 grammars, for example, Pirahã (see also Futrell et al. 2016; Everett and Gibson 2019) but also Warlpiri, Wargamay, Hixkaryána, Kayardild, Gavião and Amele among others (Pullum 2020).

A G2 grammar would allow the structure in Fig. 2 which shows hierarchical nesting of sub-phrases.

A G3 grammar would allow structures such as that shown in Fig. 3.

Two sentences are contained in or “dominated by” the highest sentence making this a grammar without constraints on recursion.
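
The contrast between the three grammars can also be sketched computationally. The Python fragment below is a minimal illustration under stated assumptions and is not part of Everett's model: the `Node` class, the labels "S" (sentence) and "P" (sub-phrase) and the `classify` function are hypothetical names introduced here, and the G2 and G3 example sentences are invented for illustration rather than taken from Figs. 2 and 3. As a simplification, recursion is treated as a sentence embedded within another sentence.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A unit of structure: 'S' for a sentence, 'P' for a sub-phrase (hypothetical labels)."""
    kind: str
    children: List[Union["Node", str]] = field(default_factory=list)


def has_embedded_sentence(node: Union[Node, str], is_root: bool = True) -> bool:
    """True if a sentence node occurs anywhere below the root sentence."""
    if isinstance(node, str):
        return False
    if node.kind == "S" and not is_root:
        return True
    return any(has_embedded_sentence(child, False) for child in node.children)


def classify(sentence: Node) -> str:
    """Assign a sentence structure to G1, G2 or G3 (assumed, simplified criteria)."""
    if has_embedded_sentence(sentence):
        return "G3"  # a sentence dominated by another sentence: recursion
    if any(isinstance(child, Node) for child in sentence.children):
        return "G2"  # nested sub-phrases, but no embedded sentence
    return "G1"      # a purely linear string of words


# Sentence 1 from the text: a flat, linear arrangement of words.
g1 = Node("S", ["John", "came", "in", "the", "room"])

# Invented example: sub-phrases nested inside one another.
g2 = Node("S", ["John", Node("P", ["sat", Node("P", ["in", "the", "room"])])])

# Invented example: one sentence embedded in another.
g3 = Node("S", ["Bill", "said", Node("S", ["John", "slept"])])

for example in (g1, g2, g3):
    print(classify(example))
```

The three calls classify the examples as G1, G2 and G3 respectively. The point of the sketch is only that each level adds one structural possibility to the one before it; the flat G1 structure already carries a complete proposition.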

Everett (2017) uses Peirce’s theory of signs (below) to outline an evolutionary pathway to symbol-based language based on speech and gestures. The archaeological record of Homo erectus provides the material evidence for concluding that this hominin used symbols and at least a G1 level of language to transmit complex cultural knowledge (Everett 2016). We develop that evidence in detail here.

Defining and Recognising Symbols; Peirce’s Semiotics

Between the late 1800s and his death in 1914, Peirce developed one of the most comprehensive philosophical programs since Aristotle. Semiotics, the theory of signs, was Peirce’s focus and touchstone (Peirce 1992, 1998). His symbolic system was the result of neither nature nor nurture, but was constrained by logic (as it in turn constrained logic), a theory opposed to Cartesian dualism, introspection and intuition, all of which Peirce considered deeply unscientific. Perhaps because of the popularity of the simpler, dyadic semiotic system of Saussure (1916 [1983]), those unfamiliar with the triadic Peircean system might be excused for confusing signs and symbols. Whereas Saussure postulated only a dyadic sign-form-meaning composite, Peirce postulates a triadic theory of signs.

Peirce contended that all living systems communicate with their surroundings by responding to visual, acoustic and chemical cues (signs); a founding principle of biosemiotics (Barbieri 2008) and zoosemiotics (see Delahaye 2019 for an overview of these fields). In this framework, signs communicate an object to an interpreter, and the response by the interpreter is called the interpretant (Peirce 1998). Most signs (indexes and icons, below) do not require conventions to understand and respond to the cues, but humans in particular generate meaning from signs based on socially learned conventions (symbols).

The ability to use symbols exists among non-human primates as in the case of the bonobo, Kanzi, who was taught by humans to communicate using visual symbols (Gibson 2002; Savage-Rumbaugh et al. 2004). Vocal symbols also exist among some primates, as in the case of vervet monkeys which learn over time how to respond to the group’s alarm calls linked to specific external threats (Ribiero et al. 2006). Vervet symbol use, however, differs from the human faculty for using symbols to generate a potentially infinite number of new combinations and meanings (Piantadosi and Fedorenko 2017).

Peirce’s theory of signs encompasses a wide empirical range, and we discuss only five key components needed for understanding our claim that H. erectus possessed a symbolic system and language: icon, index, symbol, object and interpretant.

Icons resemble their referents (objects). They are not merely reflections, photos or drawings and can be anything which resembles “in some way”. For example, ground moisture level can be a cue or icon, “telling” an earthworm to surface. When an earthworm “decides” the amount of water that passes its threshold, the amount of water is an icon of maximum tolerable exposure. A human face’s reflection in the water is an icon of the face (and other faces generally). In grammar, examples of iconicity can be seen in the fact that prepositions with more content (“before”, “towards”) tend to be longer than prepositions with less content (“to”, “in”).

Indexes signal a spatial, temporal or other physical relationship with the object. A mouse rustling in grass is an acoustic index-sign to a cat. Humans also use indexes (smells, footprints, sounds) and images, and natural tolerances, such as temperature, taste and texture, but use more complex versions of these signs. Indexes may be pronouns like “here”, “there”, or simply pointing to something where the line from the pointing appendage to the object is an imaginary connection.

A symbol is in general any sign by which the form signals its meaning by a conventional cultural interpretation, linking object, interpretant and the sign. The symbol “dog” means Canis familiaris in English because the culture from which “dog” emerged valued this concept and agreed (by practice) to link the phonetic form, i.e. oral sign, [dɔg] with the object, a specific dog or the class of dogs, via a culturally agreed interpretation.

Indexes and icons in language function only because their forms and relations are conventional, that is they are simultaneously symbolic and indexical, symbols-as-icons and symbols-as-indexes. This multiplicity of meaning also applies to material objects, such as a steel butter knife which operates simultaneously as an icon of the category of knife, an index of the metal, its properties and intended function/spreading movement, and as a symbol of the process of preparing food or the habitual time of use, such as breakfast. These multiple functions coexist in the object, and as habituated users we are unaware of these learned associations and the range of interpretations they represent. Humans and animals overlap in using indexes and icons and needing to interpret them; they differ in that humans use and create symbols habitually, and no known non-human systems require or manifest culturally productive symbols (Hurford 2004; Piantadosi and Fedorenko 2017). Yet no human language lacks symbols (Everett 2016), and we have the socio-cognitive foundations for creating symbols (Callaghan 2020).
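
The coexistence of several sign functions in one object can be laid out as a simple data structure. The Python sketch below is offered only as an illustration: `Mode`, `SignReading`, `needs_convention` and `modes_of` are hypothetical names introduced here, and the rule that only the symbolic reading depends on convention is a simplification that applies to material objects, not to icons and indexes within language.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Set


class Mode(Enum):
    ICON = "icon"      # the sign resembles its object
    INDEX = "index"    # the sign has a spatial, temporal or physical link to its object
    SYMBOL = "symbol"  # the sign is linked to its object by cultural convention


@dataclass(frozen=True)
class SignReading:
    """One interpretation of a sign: which object it stands for, and in what mode."""
    sign: str
    mode: Mode
    obj: str

    @property
    def needs_convention(self) -> bool:
        # Simplifying assumption: only symbolic readings rest on learned convention.
        return self.mode is Mode.SYMBOL


# The butter knife example from the text, as three coexisting readings.
butter_knife = [
    SignReading("steel butter knife", Mode.ICON, "the category of knife"),
    SignReading("steel butter knife", Mode.INDEX, "the metal, its properties and spreading movement"),
    SignReading("steel butter knife", Mode.SYMBOL, "preparing food, or breakfast as habitual time of use"),
]


def modes_of(readings: Iterable[SignReading]) -> Set[Mode]:
    """Collect the distinct sign modes under which an object is read."""
    return {reading.mode for reading in readings}


for reading in butter_knife:
    print(reading.mode.value, "->", reading.obj, "| convention:", reading.needs_convention)
```

The same object appears in all three rows; what changes is the relation between sign and object, and only the last relation has to be learned socially.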

Once symbols have arisen through convention (e.g. recognising a tool as more than an icon and an index, but also a symbol of craftsmanship, cultural purpose and personal identity), how does this new set of conventional signs acquire a grammar? Bates and Goodman (1999), Goldberg (2019) and Fedorenko et al. (2012), inter alia, offer a valuable clue. Symbols (what these authors refer to as words and “constructions”) are claimed to be not only logically prior to grammar, as Peirce would claim, but also psychologically foundational for grammar (Bates and Goodman 1999) and neurologically more significant than grammar per se (Fedorenko et al. 2012). The grammar of symbols becomes, in this view, the “choice” of how to arrange the symbols of a particular culture (Everett 2012, 2017). This arrangement can be complicated as in many modern languages, but given the variation found in the world’s languages, there is no one model of complexity required for the first languages, contra Chomsky (1995). Everett’s G1 is the simplest option for communicating meaning, and logically the earliest in a gradualist model of language evolution.

Chase (1991) considers stone tools as iconic objects created as a result of an understanding of the cause and effect relationship of the properties of stone in relation to the laws of physics. But as Cousins (2014:179) observes, there is nothing inherent in the stone that leads to an awareness of the variables to be managed in order to strike a flake from a core with consistency. The physical properties of the core, the hammer, and the control of the angle and force of blow are not inherent in the materials; they are interpretations made of the materials as part of a process of meaning-making. This is a semiotic perspective which then raises issues of the context of learning: is it shared intentionally through teaching (e.g. Morgan et al. 2015; Lombao et al. 2017) or learned individually by trial and error (Tennie et al. 2016)?

Wynn (1993:402) acknowledges that certain elaborated tools, like hand-axes, can be indexes of the hierarchical process of making the object and come to represent the maker. If the object represents an activity and the maker, and does so through repetition rather than shared intention, then in Wynn’s perspective, the hand-axe is an index. When shared intention is involved, then the object becomes a symbol. The question becomes how do archaeologists, as observers of the objects separated by deep time from the social contexts of makers and users, recognize shared intention in the Palaeolithic record? The question is not new (see Holloway 1969), and we incorporate the two criteria, restated by Davidson (2002:181), of Noble and Davidson (1996) into our analysis: “the manufacture of tools of preconceived form, produced outside the immediate context of use, must entail a representation of intention, something that we may consider indicative of language as communication using symbols”.

The difficulty of distinguishing between icon and symbol in objects which are unfamiliar to us is one reason archaeologists have focused on representational images in cave art as markers of symbol use (e.g. Mellars 1973, 1989, 2005). These images show contemplation and attention to meaning, but in the absence of other contextual data, representational (depictive) art is not symbolic; it is only iconic. Non-representational images, by contrast, such as the abundant dots and grids in Upper Palaeolithic cave art (Bahn and Vertut 1997), have potential symbolic content given they are arbitrary, repeated forms.

Symbols can originate in many ways, exploiting the different senses, including visually, as with tools, and orally. Orally, symbols arise through sound symbolism, such as onomatopoetic words like “crash”, “bang” and “boom”. We can also see sound symbolism in clusters of sounds in words with similar meanings, such as gleam, glow, glitter and glisten. It can be seen in particular sounds that show intensity, such as tamp vs. tap, stomp vs. step. Sound symbolism is common across the world’s languages (Sapir 1915; Urban 1988; Everett 1979). Each sign needs a physical form, and vocal sounds are the best solution to providing form for signs (Everett 2012).

An interpretant is necessary for the arbitrary content of symbols to be meaningful to a viewer or listener. A bridging component, the interpretant, can take the form of other signs and meaningful conventions: “In a world without interpretants a sickle and hammer would only mean a sickle crossed with a hammer. And Leonardo’s Last Supper would only be a very gloomy dinner or a meeting of thirteen unshaven men” (Eco 1976:1467). With material objects, interpretants may become part of the learned cultural knowledge, signalling aspects of the object that the viewer will recognize implicitly as meaningful. This meaning is ephemeral and context specific, as in the case of the butter knife. It is not accessible to a viewer separated in time, space and culture from this implicit knowledge, but as with icons we can infer that interpretants existed when we find repeated (conventional) artefact forms and selection among a range of strategies for making these objects.

In summary, symbols are both necessary and sufficient conditions for language. Complex recursive grammar is not the point of origin for all human languages (contra Hauser et al. 2002; Berwick and Chomsky 2016), and grammatical structure alone is not sufficient for language; for any human syntax, each node in a syntactic tree must be labelled (e.g. noun phrase, verb phrase; Murphy 2015:715). Labels are symbols in the Peircean sense—conventional, categorising generalisations across different units of linguistic representation.

Tools as Social Conventions and Symbols

To support a claim that tools of the Lower Palaeolithic carried symbolic meaning, this section draws generalisations from sociological, ethnographic and ethological research about tool-making as socially learned, conventionalised knowledge. It starts with contexts of meaning generation and discusses the distinction between utilitarian and symbolic objects as a potential obstacle to a uniformitarian approach. A comparative assessment follows of the social contexts of tool use among non-human primates with a focus on chimpanzees as our closest genetic relatives. Their cognitive capacity to discriminate between kinds of tools is relevant in the evolution of the capacity to create symbols.

Tool use is widespread in the animal kingdom (Lefebvre et al. 2002; Beck 1980; Aunger 2010; Bentley-Condit and Smith 2010; Shumaker et al. 2011), but tool-making as the deliberate modification of an object is relatively rare among animals (Biro et al. 2013). The creation and sharing of tools in the human context differs from that of other animals in that it combines the material with the ideational. Human technologies materialise and sustain worldviews, identities, social relations and life-ways (Guindon 2015:79–80). Perhaps the most unusual aspect of tool use for humans is that tools become symbols, as well as functioning as indexes and icons (Pfaffenberger 2001).

The symbolic aspect of technology is well theorised and empirically supported in sociological studies of technologies in contemporary and historical contexts and in archaeological contexts with diverse and chronologically well-constrained data (e.g. Hodder 1982, 2012; Kopytoff 1986; Pinch and Bijker 1984; Latour 1992; Ingold 1993; Gosden 2005; Wallis 2013). The obvious limitation of this approach for archaeologists working with early to mid-Pleistocene material is that we do not have access to texts or verbal accounts that enrich sociological analyses. Nor do we have the broader range of material culture found in some later Pleistocene contexts with which to distinguish indexes and icons as well as a range of tool-making conventions, and we must contend with a discontinuous and often poorly dated record (Shea 2017). We can, however, draw inferences about the past existence of meaning-making in a semiotic sense from the judicious use of human and non-human analogues, recognising their inherent limitations (e.g. Wobst 1978; McGrew 2010), combined with experimental archaeology with direct application to the archaeological record (Stout et al. 2019). The latter generates observations on the social and cognitive processes involved in interactions with objects (Gärdenfors and Högberg 2017). Research in cognitive archaeology adds to the understanding of tool-making and use as embodied biocultural behaviours integrating perception and action within wider physical and social environments (Leroi-Gourhan 1993; Stout 2002; Stout et al. 2019; Malafouris 2013; Uomini and Meyer 2013; Fairlie and Barham 2016; Overmann and Wynn 2019).

Creating Meaning with Tools: Inferences from Social Constructionism

Social constructionists working cross-culturally among pre-industrial societies, and with an eye to the archaeological record, provide useful generalisations on symbol use applicable to the past. Killick (2004:573–4) outlines three basic differences between pre-industrial and industrial societies in relation to the social transmission of technologies, and the ideational roles of tools and technologies. The learning of technical skills takes place using a combination of language, gesture, imitation and guided intervention or teaching in what Csibra and Gergely (2011) call “natural pedagogy” (e.g. Draper 1976:210, learning leather-work among Ju/’hoansi children, Botswana). Technology shapes the social persona and world view of the individual, as among Nuer pastoralists of the Sudan (Evans-Pritchard 1976:89 [1940]) for whom their limited material culture serves as “chains along which social relationships run, and the simpler is a material culture the more numerous are the relationships expressed through it.” Theories of technology (ontologies) in pre-industrial societies are often linked to social processes and natural phenomena (Stout 2002). Gamo horticultural communities (Ethiopia) are among the few remaining makers of stone tools, and perceive their tool-stone as a named living and social being with a life history that mirrors that of the tool-maker (Arthur 2018).

Among recent and historical hunter-gatherers, the cultural act of attributing symbolic value to raw materials is widespread (e.g. Gould et al. 1971, Australia; Tayanin and Lindell 2012, Southeast Asia; Brandišauskas 2016, Siberia; Guindon 2015, Canadian subarctic; and papers in Boivin and Owoc 2004 for cultural perceptions of soils and minerals). Objects also carry meaning as arbitrary conventions linking the object to social personas. The sharing of object names with social persona and personal identity is seen with the woman’s kaross among the Ju/’hoansi (chi!kan) which doubles as a colloquial term for “women” (Lee 1979:124); in the names of tools among the Netsilik (Canada) which are selected as personal names for individuals as protection from misfortune (Balikçi 1970:199–200); and among the Pirahã (Brazil), where the hunting bow (hóií) is used by men only, but the bowstring (hóií hoí) is made by the man’s wife, with the complete bow symbolising their union (Everett 2016). These examples show raw materials and tools operating simultaneously across the semiotic range with their material properties integrated into making and transforming systems of meaning (Wallis 2013:209).

Creating Meaning with Tools

As Pfaffenberger (2001:77–78) observes, tool-related activities are contexts for learning from others, for creating and maintaining relationships, for reinforcing world views; they are not passive settings limited to functional ends. Tools as symbols, icons and indexes bear multiple kinds of meaning and values depending on where they are made, used and seen. From almost the start of their lives, children learn the social value of objects, including tools, from adults who act as “symbol maker” with the child by pointing to things to make intentions clear, using objects in conventional socially agreed ways and talking to the child (Rodríguez and Moro 2008:111; Tomasello 2005; West 2018). The learning process is intimate, interactive, embodied and cumulative, starting with perceptual categories and moving to higher-level conceptual categories (symbols) (Sloutsky 2010; Trevarthen and Delafield-Butt 2013). The physical relation between infant and parent (intersubjectivity) and the joint attention given to an object are both critical to word (symbol) learning (Studdert-Kennedy and Herbert 2017). The cooperation involved in infant learning has parallels with a novice learning to make tools from an expert, with words (speech and gestures) used to convey conceptually opaque actions and their consequences (Csibra and Gergely 2011; Barham 2013; Herzlinger et al. 2017). Simple utterances of just a few words, as in a G1 grammar (“hit there”; “turn it over”), can greatly enhance knowledge transfer (Laland 2017).

The study of social learning among hunter-gatherers provides insight into processes operating in recent small-scale, non-hierarchical societies and offers analogues of relevance here for the deeper evolutionary past (Marlowe 2005). Comparative studies show that at the community level, the transmission of knowledge and know-how is affected by demographic variables including size of age cohorts, rates of interaction between generations and with non-kin (Migliano et al. 2017). For example, among the egalitarian Aka foragers (Central African Republic), most early learning (80%) takes place between parent and child, and this form of vertical transmission promotes stability while allowing for some individual variation (Hewlett and Cavalli-Sforza 1986:932). From middle childhood on into adolescence, more learning takes place from peers and unrelated adults (Hewlett 2016). Cross-cultural data show that learning to make tools is similar to the pattern seen among the Aka, namely transmission of knowledge from parents and older children to the novice (MacDonald 2007), with increased teaching (by verbal instruction, demonstration, pointing) in early adolescence related to more complex technologies and demanding activities such as big game hunting (Lew-Levy et al. 2017, 2018).

At the population level, quantitative modelling of social learning from an evolutionary perspective predicts that the intensity of interaction between individuals and groups is more important for the transmission of information than is population size alone (Powell et al. 2009; Grove 2016). As the scale of analysis broadens to include social learning among Acheulean tool-makers, issues of habitat instability, population isolation and local extinctions add to the list of factors that disrupt cumulative learning (Hopkinson et al. 2013).

Utilitarian or Symbolic?

Archaeologists have long recognised the difficulty of distinguishing style from function and by implication symbolic intent from functional design (Rouse 1960; Sackett 1982, 1986; Dibble 1987; Dibble et al. 2016; Davidson and Noble 1993; McPherron 2000). Standardisation of tool forms may indicate symbolic content, but only if not imposed by functional constraints (Gowlett 1996) or by selective bias imposed by archaeological typologies (Davidson 2002; Shea 2017). More problematical for a semiotic approach is the argument that artefacts can have “a practical function without having any symbolic significance whatever” (Chase and Dibble 1992:48).

From a social constructionist point of view, the distinction between symbol and function is a false dichotomy. The underlying source of this distinction is a dominant ideology in Western industrial society that leads us to expect that all behaviour should be goal-oriented, with a function that is a means to an end (Hodder 1982:164). Utilitarianism permeates our dark matter (our unconscious, culturally articulated personal knowledge; Everett 2016), and archaeologists tend to be more comfortable equating symbol use with behaviours that do not have immediate functional value, such as ritual (Hawkes 1954). Utility and symbolic value, however, are inseparable from social conventions (Hodder 1982, 2012). A utilitarian purpose is a social construct (Skibo and Schiffer 2008), and “…even the most technical and mundane of acts implicates social aspects of life” (Hodder 1994:385). From the perspective of Peirce’s semiotics, every article produced by a human society has the potential to carry conventional meaning, such as the humble butter knife which carries meaning as an index, icon and symbol depending on the context in which it is seen. The challenge for archaeologists is to generate sufficient contextual information to identify levels of intention that reflect the use of symbols (Davidson 2002).

The extraordinary longevity of Lower Palaeolithic tool technologies poses a potential problem to the constructionist and semiotic perspectives as we have no modern frame of reference for such enduring conventions (Ingold 1993). Hodder (1994:385), however, suggests that the “continuity and stability of form indicates Lower and Middle Palaeolithic handaxes clearly were made using rules” and the rules were social constructs even if they were implicit from social conditioning. As discussed below, there is an enduring set of ergonomic principles embedded in the making of hand-axes and cleavers (Gowlett 2006). They may become implicit through experience or perhaps explicit as categorical concepts with semantic labels (Herzlinger et al. 2017).

Rules apply also to short-term “end-goal” technologies such as scrapers. The life history of scrapers from manufacture to discard reflects social conventions related to function, but also to ontologies of technology (e.g. Arthur 2018). At a practical level, lithic analysts can measure the variables that affect the effectiveness of a tool for a particular task (e.g. morphology, edge angle, use traces), and draw inferences on decisions made during the life history of the object (Preysler et al. 2018). Decision points identified by lithic analysts are etic observations, and though they can be independently verified, they do not reflect the meanings once held by their makers. Those meanings are context specific and lost to us, but the existence of some level of meaning or signification (icon, index or symbol) can be inferred from (1) conventions in tool forms, (2) selection among equally effective tool-making strategies and (3) in the choice to store (cache) tools for future use (below). Symbolic content resides in each of these contexts given they are arbitrary social constructs.

Conventions and Categories Among Non-human Primates

Conventions for tool-use also exist among non-human primates, and most relevant here are longitudinal studies of chimpanzees which form the basis of recognising local socially learned traditions or “cultures” (Whiten 2005). Byrne (2007:582) identifies signals of “culturally guided acquisition” in behaviours that are both intricate in complexity (multiple steps involved) and near uniform in a population. Among chimpanzees, the basic contexts in which tool use takes place include feeding, hygiene maintenance, threat displays, weapon use and amusement (Goodall 1986). The widest range of tool forms is associated with feeding. Local traditions are recognised in central and west Africa including in similar habitats, which minimises the role of adaptation as an explanation for variability (Whiten et al. 1999). Learning of tool use takes place in social contexts by imitation and emulation of others, by individual trial and error (Whiten et al. 2009; van Schaik and Burkart 2011; Sanz and Morgan 2013) and by teaching using active intervention and provisioning of tools, typically from mother to offspring (Musgrave et al. 2020). Teaching appears to be more common where the technology is relatively complex with multiple steps in its making (Musgrave et al. 2020), an observation of relevance when considering the complexities of making hand-axes and cleavers (see below).

Chimpanzees and other non-human primates, however, do not meet Davidson’s (2002) criteria for symbol-based tool use. Although there are local traditions, tool forms are made with minimal elaboration when compared with human tools (Goodall 1986), and are task oriented, context specific and intended for immediate use (Gowlett 2015; Wynn and Gowlett 2018:25). Despite these limitations, there is evidence for the capacity to conceptualise objects not just in terms of their physical properties, but also as more general categories such as “tool” and types of tools (Goodall 1986). This level of conceptualisation is involved in human communication when establishing shared meaning for names, nouns and adverbs (Gärdenfors 2003; Medin and Rips 2005). Shared concepts are also essential for reaching understanding about objects or events not in the immediate environment, or of immediate experience. Symbols, whether vocal or visual, externalize these shared understandings. Bonobos and chimpanzees, trained to use symbols under controlled conditions, do use their training to communicate future intention, with one possible observation of symbol use in a natural context (Savage-Rumbaugh et al. 2004; Lyn et al. 2011). Non-human primates in the wild and in captivity can recognize perceptual categories of objects, and may form more abstract conceptual categories (based on kind, such as food, predators) (e.g. Queiroz and Ribeiro 2002; Seyfarth and Cheney 2003; Pedersen 2012; Vonk et al. 2013; Slocombe and Zuberbühler 2005). Chimpanzees in their natural habitats do seem to recognize the differing properties of objects used as tools and can apply that understanding to other settings (Grüber et al. 2015:7).

As well as socially learned traditions of tool use, chimpanzees (and bonobos) have evolved multimodal forms of communication that integrate gestures, vocalisations and facial signals (Gillespie-Lynch et al. 2014). Gestural traditions of communication appear to be more variable in form than their range of vocalisations (Pollick and de Waal 2007). From the perspective of quantitative linguistics, the structure of chimpanzee gestures follows mathematical laws seen in the transmission of information in human language linked to frequency of word/gesture use (Heesen et al. 2019). The similarities in structure point to commonalities in primate communication that have great evolutionary depth (Boë et al. 2019).

Chimpanzee vocal repertoires are often characterised as context-specific impulsive (emotional) responses with a limited range or intention, but there is increasing evidence of variation in response to social context (Hopkins et al. 2007), to food types (Slocombe and Zuberbühler 2005; Kalan et al. 2015) and awareness of the perspectives of others (intentionality) (Crockford et al. 2017). The learning of new grunts for a particular food (apples) was recorded among chimpanzees transferred to a new zoo where the resident chimpanzee group had a different grunt for the same food (Watson et al. 2015). The incomers gradually learned the existing referential grunt, but only after social bonds were developed between the groups. This is evidence of the capacity for vocalisations linked to objects and learned collectively which lies at the root of symbol generation through constructing words.

Words in Peirce’s semiotics are symbols, and the labelling of objects is so entrenched in our learning of language that we take for granted this facility to categorise and focus attention on a class of objects (Clark 2011). Labels—not syntax—are at the core of language (even for some minimalist linguists, e.g. Murphy 2015), and at some stage in the gradual evolution of language, the transition from visual to verbal labelling took place (Corballis 2002; Gentilucci and Corballis 2006). If categorisation is emergent in non-human primates and ubiquitous among modern humans, then parsimony points to the evolution of symbol use—and language—long before Homo sapiens. Pedersen (2012) concludes, following a study of the ability of captive bonobos to acquire visual and auditory symbols, that language evolved from deep-rooted semantic and conceptual abilities in the last common ancestor of chimpanzees and hominins, some six million years ago, and in recent work, it is argued that the neural, auditory pathway for language evolved at least 25 million years ago among monkeys (Balezeau et al. 2020). The shared inheritance is based on biological and cognitive similarities in how humans and apes experience the world through their bodies and senses (Lakoff and Johnson 1999).

Lower Palaeolithic Tools as Symbols

Stone tool working constitutes the longest record of hominin technology, with the earliest evidence from 3.3 million years ago (Ma) in East Africa, pre-dating the emergence of the genus Homo (Harmand et al. 2015). Preservation biases favour stone over organic materials in the archaeological record with bone and horn core use found in South African cave deposits after 1.8 Ma in association with more than one hominin (Barham and Mitchell 2008). In East Africa, the earliest evidence of bone use comes from Olduvai Gorge between 1.8 and 1.6 Ma, probably associated with Homo erectus, and in the form of bone hammers and a bone hand-axe (Backwell and d’Errico 2005). The earliest evidence of woodworking takes the form of plant residues on 2.0 Ma tools from Kanjera South (Kenya) (Lemorini et al. 2014), but the oldest probable wooden artefact is substantially later (~ 780 ka) in association with the Acheulean site of Gesher Benot Ya’aqov (Israel) (Belitzky et al. 1991), which also has early evidence for the control of fire (Alperson-Afil et al. 2017).

These non-stone technologies are relevant in the context of language evolution and semiotics because they provide evidence for the extension of the range of cultural choices for tool use to other materials. Our focus, however, is early lithic technology as it is the most widespread evidence base. The evidence includes conventions of tool forms, choice of manufacturing strategy and stages in the life history of a tool that indicate the concept of displacement or detached thought (Hockett 1960). Complementary sources of data drawn from evolutionary cognitive archaeology are incorporated into this section where relevant.

Icons to Symbols in the Archaeological Record

The archaeological record before 1 Ma is reviewed briefly here to set the context for the evolution of symbol use and language. Using Peirce’s triad of signs, a tentative claim can be made for the early use of icons in the Pliocene which overlaps with the oldest evidence for stone-tool-making. The Oldowan Industry of the Early Pleistocene provides the backdrop of behaviours elaborated later in the Acheulean. These include strategies of raw material selection, learned techniques of core reduction and tool-making. Our attention then divides into two strands: one on evidence for regionally variable strategies for biface making after 1 Ma, and another on the growing evidence for sea travel in Southeast Asia. Both behavioural complexes reflect, at a minimum, the use of G1 languages.

The earliest possible evidence of an intentionally interpreted and contemplated icon is associated with Australopithecus africanus at the site of Makapansgat Cave, South Africa. The deposits are dated to between 4.12 and 2.16 million years old (Herries 2003). A red cobble was found in the deposits and was probably brought to the site by an australopithecine rather than by natural processes (Bednarik 1998; Berlant and Wynn 2018). The cobble has erosional marks on both surfaces that resemble a primate face with eyes and mouth (Bednarik 1998). The physical resemblance to a face qualifies this object as an icon in our eyes, and presumably in the eyes of the hominin beholders. Other icons resembling human forms or elements of anatomy occur considerably later, after 800 ka in the North African and Southwest Asian records (Bednarik 1997, 2003; Marshack 1997).

The Makapansgat pebble is roughly coeval with the earliest stone working technology currently known. The site of Lomekwi 3, West Turkana, Kenya (Harmand et al. 2015) preserves evidence of the deliberate detachment of large basalt flakes using a block-on-block technique. Using the reasoning of Chase (1991), these flakes are iconic objects created as a result of an understanding of the cause-and-effect relationship of striking a block of basalt against a stone anvil. In Cousins’s (2014) semiotic coevolution, the process of making these flakes, which involves selecting the raw materials and applying force, is an act of interpretation (of physical properties) to create something new, and to do so more than once. In his Baldwinian model of the coevolution of language and technology, Lomekwi 3 marks an early emergence of a social learning niche among hominins.

For the time being, there is a gap of 700,000 years between the flakes and cores at Lomekwi 3 and the earliest Oldowan at 2.6 Ma (Stout 2011). The early Oldowan arguably marks the beginning of cumulative, learned culture, a contention supported by experimental replication of core reduction strategies that indicates learning by copying (Morgan et al. 2015; Stout et al. 2019). By 2.0 Ma, Oldowan-like assemblages of flakes, cores and a limited range of small retouched tools (scrapers, notches, denticulates) are found in Southwest and Central Asia, India and China (Barsky et al. 2018). Standardised tool forms are rare, but other behaviours relevant to the development of symbol use are evident. The site of Kanjera South, Kenya (2.0 Ma) provides the first evidence for the selection and transport of raw materials up to 13 km to a central locality where a range of activities took place including stone tool-making, butchery of small antelopes (possibly hunted), working of wood and processing soft plant matter including underground storage organs (Braun et al. 2009; Ferraro et al. 2013; Lemorini et al. 2014).

The selection and transport of raw materials some distance from the intended place of use have cognitive implications in terms of foresight (planning, long-term memory). They may also indicate that a social value (meaning) was placed on these materials. There is evidence from earlier in the Oldowan of the selection of raw materials and the carrying of artefacts across landscapes to favoured localities (Potts 1991; Kroll 1997; Stout et al. 2005). The broader social interpretation of the Kanjera locality is that it was repeatedly used by tool-dependent cooperative groups (Plummer and Bishop 2016). The pragmatics of symbol development and learning involve individuals interacting face to face in contexts associated with tools and their use (Gärdenfors 2004; Tomasello 2005; Rodríguez and Moro 2008). Kanjera South offers an early example of the kind of setting conducive to social learning that predates the evolution of Homo erectus.

The earliest evidence of large retouched tool forms marks the beginning of the Acheulean Technocomplex 1.75 million years ago in Africa, and the subsequent spread of its distinctive tools made on large flakes (> 10 cm) and blocks of stone into Southwest Asia, Europe, South Asia and parts of East Asia (de la Torre 2016; Barsky et al. 2018). The characteristic retouched tool forms include hand-axes, cleavers, picks and knives (Fig. 4a–c). Their making requires additional steps in planning compared with Oldowan cores and flakes, with greater spatial and temporal separation of stages of making and use (Muller et al. 2017). The hand-axe and cleaver are distinguished from Oldowan tools by their large size (> 10 cm), but particularly by their bilateral and plan form symmetry (Roe 1968; Crompton and Gowlett 1993; Shipton et al. 2018). Symmetrical hand-axes occur early in the Acheulean, at 1.7 Ma, marking an elaborated attention to form over function which distinguishes these tools from Oldowan retouched tools (Diez-Martín et al. 2019). This focus on form becomes more widespread from ~ 1.2 Ma with some regional trends towards greater refinement (Shipton et al. 2018), but not in all parts of the Acheulean range (e.g. McNabb and Cole 2015). A broader range of small tools also occurs in the Acheulean, some of which appear to be conventional forms such as awls, denticulates and scrapers (Isaac 1997; de la Torre and Mora 2005; Dominguez-Rodrigo et al. 2009), but our interest lies in the large retouched forms and their extended production sequences as evidence of early symbol use.

Fig. 4 Late Acheulean large tools: a hand-axe (silcrete), Victoria Falls, Zambia; b cleaver (quartzite), Kalambo Falls, Zambia; c pick (quartzite), Kalambo Falls, Zambia (images copyright Chris Scott)

Homo erectus (sensu lato) is the hominin generally associated with the Acheulean up to 1.0 Ma (Antón et al. 2014), after which other taxa continued the tradition in Africa, Eurasia and South Asia (Moncel and Schreve 2016; Moncel et al. 2018). In Africa, hand-axes and cleavers were made as recently as 212 ka and possibly by Homo sapiens (Benito-Calvo et al. 2014; de la Torre et al. 2014). In Europe, hand-axes appear sporadically in contexts associated with late Middle Pleistocene Neanderthals (de Lumley et al. 2004; Preysler et al. 2018). In north central India, bifaces were still being made as recently as 100 ka (Shipton et al. 2013), and presumably by H. sapiens.

The stability of hand-axes and cleavers as symmetrical tool forms across the long span and wide geographical distribution of the Acheulean has sparked decades of speculation about their social and cognitive implications (see summary in Lycett and Gowlett 2008). At one end of the interpretative spectrum are theories of minimal behavioural intention involved in the making of these tools, and minimal social learning (Tennie et al. 2016). The shapes may have resulted from use as cores, from re-sharpening, from differences in raw materials or from an inherent perceptual bias for symmetry in hominins, or they were under some genetic control (Davidson and Noble 1993; McPherron 2000; White 1998; Hodgson 2015; Corbey et al. 2016). At the other end of the interpretative spectrum are claims for symmetry signalling genetic fitness or trustworthiness of the maker to conspecifics (Kohn and Mithen 1999; Spikins 2012), and more generally as deliberately imposed and socially transmitted forms (Shipton et al. 2018).

Experimental work has demonstrated the difficulty in producing symmetrical forms, and the importance of learned skill in managing the thinness of the tool and the straightness of the edges (Lycett et al. 2016; Shipton and Nielsen 2018). This research undermines the argument that learning to make bifaces is easy and could be independently invented by trial and error during the process of alternate edge flaking (Davidson 2002; Tennie et al. 2016). The argument that hand-axe symmetry reflects increased reduction intensity has been tested quantitatively with flake scar density and symmetry found to be largely independent variables (Shipton et al. 2018). Experimental work has also shown that raw material differences are not a primary limiting factor in hand-axe form (Lycett et al. 2016; García-Medrano et al. 2019; Key 2019). An innate human perceptual bias towards symmetry (Hodgson 2015) has also been challenged through experimental work (Shipton et al. 2018). The suggestion of some genetic control of symmetry is undermined by the temporal and regional variability in the Acheulean (Hosfield et al. 2018), and the absence of hand-axes in regions populated by Homo erectus despite having suitable raw materials (Wynn and Gowlett 2018). Hand-axe dimensions and shape can change with persistent re-sharpening or thinning (McPherron 2000), but intended shape (final form) is evident on bifaces made on flakes with little subsequent shaping (Sharon 2008; Li et al. 2017; Malinsky-Buller 2016; Preysler et al. 2018), and on cobbles (façonnage) indicating knapping to a plan (García-Medrano et al. 2019).

Hand-Axes as Standardised Forms

The debate on the intentionality of biface symmetry has shifted towards a consensus that although there is regional and chronological variability in these forms, the hand-axe and cleaver were socially transmitted, learned constellations of knowledge (Shipton et al. 2018). They meet Davidson’s (2002) criterion of standardisation and are not the products of expediency or figments of archaeological typology (cf. Shea 2017). Within the constellations that separate the hand-axe form (pointed, symmetrical) from cleavers (divergent, symmetrical) are potential interpretants (signs) that linked form with meaning (see “Discussion and Conclusion”, point 5). Of particular relevance is the case made for a set of six “design imperatives” or ergonomics-based variables linked to the use of these objects as hand-held tools (Gowlett 2006) (Fig. 5): (1) a rounded base to fit the hand; (2) extension of the working edge and thinned tip to maintain balance; (3) bifacial trimming to support the working edge; (4) extension of the sides to minimize twisting during use; (5) adjustment of overall thickness to control the weight and (6) a slight adjustment of the symmetry to work with the handedness of the user. This constellation of options provides the tool-maker with scope for variation around a basic size-shape framework, with decisions about the weighting of the variables made during knapping. These geometrical concepts carry meaning that may reduce the cognitive load in what is a demanding hierarchical, multivariate process of construction (Gowlett 2006:218).

Fig. 5 Hand-axe and cleaver “design imperatives” (modified and redrawn after J.A.J. Gowlett 2006, Fig. 2, with the author’s permission). The “glob-butt” is the centre of the mass, typically at the butt end; “forward extension” provides leverage and is balanced by the weight of the butt-mass; “support for the working edge” in the extension provides a buttress for working edges in relation to the butt, and this applies to cleavers as well as hand-axes; “lateral extension” offers resistance to twisting during use, especially for long working edges; “thickness adjustment” addresses the need for adjusting the thickness of the mass and controlling edge angle

We cannot know which of the design rules signalled meaning, or if the overall symmetrical shape of the object was a bridging sign. In Peirce’s semiotic framework, a sign can be simultaneously an index, icon and symbol. Hand-axes and cleavers could be indexes of tasks to be performed (e.g. cutting, chopping); icons of one another (they represent a pattern of tool design); and symbols of the cultural values they were designed to support, such as the identity of the maker (Cole 2012), and appropriate contexts of use and discard. In Donald’s (1991) model of a gradual evolution of language, language becomes evident with the development of external forms for storing and transmitting conventional cultural knowledge. Externalised symbols require socially understood routes of access to their meaning which can be communicated through sight, touch, sound, gesture and speech (Donald 1991:131). Hand-axes and cleavers as enduring conventions of tool-making could serve as externalised storage of cultural knowledge, with the specifics of that knowledge inaccessible to the modern viewer, and not needed to interpret these forms as potential symbols.

Choice Among Ways of Making—Equifinality

The social constructionist approach to identifying social conventions seeks evidence of choices made where multiple options exist, each equally effective in satisfying an aim (Killick 2004). In the context of the Acheulean, options exist in the making of hand-axes and cleavers starting with the basic choice of reduction method. The tool can be made on a flake struck from a core (debitage) or by reducing a block or core (façonnage) (Gamble and Marshall 2001). The use of large flakes (> 10 cm) as blanks for these two tool forms appears from the very start of the Acheulean in East Africa (de la Torre and Mora 2005) and occurs widely, after ~ 1 million years ago, in Southwest Asia, India and Iberia (Sharon 2008, 2009, 2010; Shipton 2013; Preysler et al. 2018). Over this broad geographical range, Acheulean tool-makers devised as many as nine different strategies, each with multiple steps, for managing large cores to produce flake blanks (Sharon 2009; Shipton et al. 2013; Akhilesh and Pappu 2015; Li et al. 2017). These methods involve different approaches to handling three-dimensional volumes and working them hierarchically to produce blanks. The methods differ substantially enough that the decision to pursue one option precludes others, and needs to be taken early in the reduction process. There are regional variants as well with the Victoria West technique distinct to South Africa (Li et al. 2017) and the Tabelbala-Tachengit technique and the Kerzaz core method found only in small areas of North Africa (Sharon 2009). These three strategies are technically complex, with the Victoria West method, dated to approximately 1 Ma comparable in complexity of volumetric control to the Levallois technique associated with Middle Palaeolithic/Middle Stone Age technologies after 300 ka (Li et al. 2017).

The variety of strategies for meeting similar functional needs (equifinality) and their regional as well as chronological differences reflect capacities for innovation and social transmission across the Acheulean range (Sharon 2009). The complexity and standardisation of the prepared core approaches, such as Victoria West, have been interpreted as indirect evidence of technical knowledge learned through language (Sharon and Beaumont 2006). Experimental evidence from neuroimaging research supports the coevolution of neural networks that underpin language and tool-making (Uomini and Meyer 2013; Stout et al. 2015 and references within). The teaching of tool-making is hypothesised as the recurring behavioural context which coupled cognitive structures supporting communication and motor systems, leading to the evolution of language (Kolodny and Edelman 2018). We would add that the teaching of tool-making also involves the basic parent-offspring relationship of learning through physical proximity (intersubjectivity) and joint attention on a shared task (Studdert-Kennedy and Terrace 2017). Controlled experiments on learning to make stone tools provide more specific evidence that learning the nested hierarchical processes needed to make a hand-axe, such as alternate bifacial flaking, and edge and platform preparation (involving the non-dominant hand), requires teaching using language (speech and gesture) to minimise errors in transmission between expert and novice (Uomini and Meyer 2013; Putt et al. 2014; Ruck 2014; Morgan et al. 2015; Lombao et al. 2017; Ruck and Uomini in press). Gärdenfors and Högberg (2017:196, table 1) outline a hierarchy of forms of intentional teaching and levels of joint attention and theory of mind between teacher and pupil.
They link these levels to increasing difficulty of transmitting an understanding of patterning or concepts, to the extent that language is required, as in the case of learning to make an Acheulean hand-axe using soft hammer techniques. The multiplicity of production phases (sub-goals) that need to be completed to move to the next stage of production adds to the levels of knowledge (planning depth) to be transmitted and understood. In the case of bifacially thinned hand-axes, a cause-and-effect understanding of sub-goals associated with bevelling (flaking) and abrading platform edges cannot be gained from copying the actions alone; teaching with language is required (Gärdenfors and Högberg 2017:198–9) (Footnote 3). Mahaney (2014), in a detailed study of a single expert knapper, draws parallels between the complexities of soft hammer thinning of hand-axes and the production of sentences in the English language. The parallels illustrate the skill levels involved and not the kind of language or grammar required to make a hand-axe. A G1 language in our typology lacks recursion in its structure, but places no restriction on the capacity for recursive thought. As Everett (2005, 2012, 2017) and Pullum (2020) have argued, recursive thinking does not require a recursive grammar and there is no evidence for a one-to-one mapping of thought onto language (Everett 2017).

A cognitive analysis of cleaver production provides additional insights on the linkage between planning depth, expertise and the role of language in managing the cognitive demands of this craft (Herzlinger et al. 2017). Cleavers made from large flakes struck from large cores differ from hand-axes in being produced not by retouch, but by the planned management of the core before the cleaver blank is struck (Sharon 2008). The planning begins with the selection of raw material, and cleavers tend to be made more consistently on coarser-grained rocks than hand-axes. This preference occurs across the geographical and time range of the large flake tradition of blank production and arguably reflects the socially agreed functions of this tool form (Sharon 2008:1332–3). At the 780,000-year-old site of Gesher Benot Ya’aqov (GBY) (Israel), three different core and flake management strategies were used to produce wedge-shaped working edges (Levallois-like, Kombewa, and blank delineation by retouch) (Herzlinger et al. 2017). Each strategy involved a different set of hierarchical steps with sub-goals, with the choice of strategy made early in the chaîne opératoire. A technical and cognitive analysis of the production sequences of GBY cleavers draws on the concept of expert cognition (Wynn et al. 2017). Modern experts in craft tool-making share a set of characteristics that provide a template for considering the level of skilled technical cognition needed to make cleavers (and hand-axes). Craft knowledge took years to learn, and with mastery of the craft came great accuracy and reliability in production, a capacity for rapid in-depth assessments of problems and making adjustments, and a capacity to focus and retain that focus after an interruption without a loss of intention (Wynn et al. 2017:23). In the context of the GBY cleaver strategies, Herzlinger et al. (2017:11) conclude:

“The number of categories may have been fewer than one would find with a modern expert, but categories were definitely present in the minds of the GBY knappers. Further, it would seem likely, though this is impossible to know, that the GBY knappers had declarative/semantic labels for these concepts, either in the form of vocal words or perhaps gestures (we favor the former).”

This proposed linkage between the complex nested routines of cleaver-making and the use of symbols (words) as scaffolds for managing the sequencing of tasks, complements neuroimaging research on shared networks for tool-making and language (Uomini and Meyer 2013; Meyer et al. 2014; Stout et al. 2015; Putt et al. 2019), and the experimental studies showing the effectiveness of teaching with language in learning complex tool-making routines (Morgan et al. 2015; Lombao et al. 2017).

In summary, the arbitrary (conventional) forms of hand-axes and cleavers are symbols in Peirce’s triad (1998) because they bear no inherent relationship to their functions (Shipton et al. 2018). These forms are social constructs that can serve as icons, indexes and symbols depending on contexts in which they are perceived and the knowledge of the viewer. Attention to form appeared early in the Acheulean and became more common after one million years ago (below) with the development of soft hammer thinning. The complexity of biface production, in particular the process of thinning, exceeds the capacity for a novice to understand cause and effect from observation alone. Teaching with words arguably becomes a necessity to gain technical mastery (Morgan et al. 2015; Gärdenfors and Högberg 2017). Language may have evolved in the context of the needs of teaching increasingly complex coordinated actions. In such contexts, whether tool-making, foraging or hunting, simple sentences would give teachers a low-cost means of transmitting information with greater precision than possible with gestures alone (Laland 2017: 227–8). A G1 language with its linear sequencing of words would fulfil this need.

After One Million Years Ago

The Middle Pleistocene archaeological record between 1 Ma and 300 ka shows increasing behavioural variability across continents, which we argue reflects the impact of symbol-based language on cognitive evolution (encephalisation) and the evolution of an extended childhood as a period of social learning (Antón et al. 2015). Culturally transmitted conventions of tool-making and tool-use change in the Acheulean as seen in the shift in Southwest Asia by 500 ka away from the large flake tradition with its giant cores, use of coarse raw materials, and abundant cleavers towards smaller cores and finer-grained materials for making hand-axes and the discontinuation of cleavers as a tool form (Sharon 2008; Malinsky-Buller 2016). In Western Europe, subtle regional variations emerge in biface conventions among contemporary groups between 500 and 400 ka (White 1998; Ashton 2016; White and Foulds 2018; García-Medrano et al. 2019). In Britain, a distinctive range of hand-axe forms exists with some forms difficult to make and these two features are interpreted as evidence of socially transmitted norms (Shipton and White 2020).

Innovations in knapping methods also emerge after one million years ago in Africa, India, Southwest Asia and Europe, including the use of “soft” organic hammers or softer stone hammers to thin hand-axes (Clark 2001; Gallotti et al. 2010; Gallotti and Mussi 2017; Shipton 2016, 2018; Malinsky-Buller 2016; Stout et al. 2015). As discussed, soft hammer thinning requires not only an understanding of the properties of the hammer and its use, but also embedded routines linked to edge management and thinning (Mahaney 2014). Teaching with language is argued to be necessary to transmit this conceptually opaque knowledge (Csibra and Gergely 2011; Gärdenfors and Högberg 2017). From a neural perspective, the hierarchical organisation of these additional sub-routines of biface making is linked to cognitive control functions involved in processing linguistic syntax (Stout et al. 2017:586).

This understanding of the properties of other materials combined with increasingly extended production sequences would be the foundation for the invention of hafting later in the Middle Pleistocene with its added complexities of composite hierarchical constructions (Ambrose 2010; Barham 2013). Other innovations in the Acheulean include a new tool form, the “handpoint”, in East Africa and Spain (Gowlett 2013; Preysler et al. 2018), the making of blades in East Africa from ~ 550 ka (Johnson and McBrearty 2010) and the use of Levallois prepared cores for making cleaver blanks in the late Acheulean of East Africa (Tryon et al. 2006). The use of ochre also enters the archaeological record in southern Africa between 500 and 400 ka (Watts et al. 2016), adding to the diversity of recurrent, conventionalised behaviours linked to working stone.

The Life History of Bifaces

The final criterion in Davidson’s (2002) framework for recognising the use of symbol-based language is the separation of the making of tools from their use. Preysler et al. (2018) reconstruct the life history of hand-axes and cleavers at Gesher Benot Ya’aqov (Israel) and at later sites in central Spain. Common to both localities is a production sequence that starts with the selection of suitable rocks, or active quarrying, to obtain the raw material. Cores were shaped at the raw material source, and large flakes were then struck from the cores and initially shaped by retouch, with final shaping usually taking place away from the source. The tools were then transported to places of use, where some were re-sharpened, used and then discarded.

The life history sequence also includes an important option in the context of symbol use which is to store or cache unused tools in anticipation of predicted needs. Caches of raw materials and tools represent future planning (Kuhn 1992), and this behaviour has been observed among individual captive great apes (Osvath 2009; Osvath and Karvonen 2012) and in the wild (e.g. Boesch and Boesch 1984). In the case of collective caching “cooperation about detached goals requires that the inner worlds of the individuals be coordinated. It seems hard to explain how this can be done without evoking symbolic communication” (Gärdenfors 2004:6). There is tentative evidence for caching in the late Acheulean of Spain (Méndez-Quintas 2018:3) and more persuasive evidence at Gesher Benot Ya’aqov (Preysler et al. 2018:131). The latter site also provides evidence of contexts for extended social interaction necessary for transmitting knowledge, including symbols, across generations. The lake shore locality was used over a period of 100,000 years for activities including animal and plant food processing, the working of stone and wood, making fire and caching hand-axes (Goren-Inbar 2011). The caching of these large, unused tools in the landscape indicates provisioning of places rather than provisioning of individuals (Kuhn 1992:192).

Evidence for future planning, and by implication symbol-based language, also occurs early in the Acheulean of East Africa 1.4 Ma at Koobi Fora (Kenya) with the allocation of different areas of a contemporaneous landscape to separate stages in the making and use of hand-axes (Presnyakova et al. 2018). This spatial fragmentation of the life history of hand-axes extends the time depth and evidence base for H. erectus communicating shared abstractions using language. In the context of a gradualist model of language evolution, the roots of symbol use and G1 grammars may lie in shared activities such as the persistent provisioning of raw materials at Kanjera two million years ago which involved planning actions distant in time and space (Hockett 1960; Osvath and Gärdenfors 2005; Plummer and Bishop 2016).

Middle Pleistocene Seafaring

The onset of the Middle Pleistocene, roughly 900,000–780,000 years ago, marks a transition to increasingly variable and harsh climatic conditions (Head and Gibbard 2005). H. erectus is widespread by this time, having settled China and Southeast Asia, including Java. The earliest Acheulean in Java is dated to about one million years ago (Simanjuntak et al. 2010). Sea level fluctuations linked to the waxing and waning of glacial stages meant periodic isolation of some island populations. Parts of Indonesia were never linked to the Asian mainland, and the Acheulean did not spread beyond Java. East of Java on the island of Flores, however, there is an archaeological record of stone tool-making from one million years ago, primarily flakes, without hand-axes, cleavers or picks (Brumm et al. 2010).

As argued above, tools are symbols and the hand-axe and cleaver as standardised forms provide indirect evidence of cultural traditions and at least a G1 level of language. The absence or rarity of these tools in the Southeast Asian record poses a challenge in this respect for the early language hypothesis. That challenge is met by considering another aspect of the regional behavioural record that reflects extended future planning based on language. The settlement of Flores and other islands of Wallacea by H. erectus or related taxa is arguably a process that required language to collectively plan and execute the crossing of open bodies of water (Davidson and Noble 1993). Wallacea is a transitional biogeographic zone unique in having islands that were never connected to the mainland of Southeast Asia (Sunda), or to Australia/New Guinea (Sahul) (Kealy et al. 2016). Sea crossings would have been necessary for hominins to settle these islands (Bednarik 1997), and the arrival of Homo sapiens in Australia some 50–60,000 years ago is often cited as a reliable indicator of the necessity of language for planning a sea crossing of 90 km (Davidson and Noble 1992). Building a boat requires the kind of conceptualisation of an arbitrary form intended for an imagined purpose that is only possible by the use of symbols to convey such abstractions. Constructing a boat or raft involves joining multiple parts to function as a whole, a form of extended hafting. Provisioning of water and food and having the capacity to fish would be part of the planning process. By this logic, evidence for the earlier settlement of Wallacea would imply an earlier use of language.

Bednarik (1997, 1998) drew attention to the published archaeological evidence for stone tools on the island of Flores associated with fossil fauna in the Soa Basin, palaeomagnetically dated to ~ 700 ka. The tool-makers were attributed to Homo erectus based on well-known fossil evidence on Java, and Bednarik speculated on the kinds of watercraft needed for travelling between the islands. To reach Flores from Bali involved crossing two islands (Lombok, Sumbawa) and distances of 10 km of open water. Subsequent research in Wallacea has identified submerged islands that at a sea level 45 m lower than today could have been staging posts for a north-south connection between Sulawesi and Sumbawa/Flores, offering additional food resources for dispersing hominins (Kealy et al. 2016). Lower sea levels would have existed during glacial maxima in the Middle Pleistocene, and presumably other islands would have emerged as habitats for coast-adapted communities.

The radiometric dating of the archaeological record on Flores has extended a hominin presence to 1 Ma (Brumm et al. 2010), and there is fossil evidence for a hominin ancestor of Homo floresiensis on the island 800 ka (van den Bergh et al. 2016). The largest island of Wallacea—Sulawesi—is now known to have been occupied by hominins at least 200 ka (van den Bergh et al. 2016), and there is evidence for hominins in the Philippines, north of Wallacea, ~ 700 ka in the form of stone tools among the remains of a butchered rhinoceros (Ingicco et al. 2018).

Despite the uncertainty about which hominins settled these islands (Cooper and Stringer 2013), the evidence is accumulating for multiple sea crossings in the early Middle Pleistocene. The short crossings between the islands of Wallacea, though less demanding than the long crossing to Australia with no landmass apparent, also required shared awareness of a future goal, not unlike the caching of hand-axes. Language would be necessary in this context for constructing watercraft and storing provisions (food and water), and a G1 language would be sufficient to convey the information required to navigate between visible islands (Gil 2009). Ongoing experimental building and testing of rafts using local knowledge of plant resources (e.g. bamboo poles, vine bindings and rope making) has demonstrated the feasibility of crossing distances of 20 to 50 km by H. erectus using rafts with paddles (Bednarik 2014). The intentional settlement of these islands by genetically viable populations is a more parsimonious explanation than the accidental seeding of hominins on islands by tsunamis or other random natural processes (e.g. Ruxton and Wilkinson 2012).

Discussion and Conclusion

“Finally, there is the fact that many quite reasonable hypotheses in the historical behavioral sciences cannot, as a practical matter, be refuted absolutely. It is possible to choose among alternative hypotheses in terms of their relative probability…”

(Chase and Dibble 1992:50).

Throughout this paper, we have drawn evidence from a range of sources in support of the contentious claim that language evolved earlier in hominin evolution than is normally accepted (Belfer-Cohen and Goren-Inbar 1994; Sharon 2009; Goren-Inbar 2011). Homo erectus rather than Homo sapiens was the first ancestor to generate symbols, and symbols are the essential component of language, not syntax (Hurford 2004; Piantadosi and Fedorenko 2017; Studdert-Kennedy and Terrace 2017). Our conclusion derives from our reading of Peirce’s semiotic progression and its application to the archaeological record against criteria set by Noble and Davidson (1996) for the recognition of language in tools. As the work by Steels (2005) suggests, even all the later additions to the basic symbolic system and grammar of language are the filling-in of the semiotics of language (see also Everett 2017, 197ff for a discussion of how language complexity can develop over time, from a simple G1 grammar).

We outlined at the outset five questions posed by Ingold (1993:337) for those who would interpret hand-axes as evidence for early language. We respond as follows:

  1. There cannot be a modern analogue for the longevity of the Acheulean given the present is short. The longevity of the hand-axe (and cleaver) as recurrent forms is evidence of cultural norms (Hodder 1994) that reflect stabilised solutions to particular needs (Pinch and Bijker 1984; Deacon 1997) that were transmitted over generations in small-scale societies by natural pedagogy including teaching using language (Csibra and Gergely 2011; Lew-Levy et al. 2017). Small population sizes and limited rates of interaction inhibited rapid innovation (Hopkinson et al. 2013).

  2. The persistence of these forms necessitated cultural transmission given the complex hierarchical processes of manufacture (Morgan et al. 2015; Gärdenfors and Högberg 2017; Herzlinger et al. 2017), and the range (temporal and geographical) of available alternative strategies to achieve similar ends (Sharon 2009)—these are cultural choices (e.g. Killick 2004; Byrne 2007).

  3. Representational models of tool-making are being challenged (Fairlie and Barham 2016; Overmann and Wynn 2019) in recognition that the process is embodied and reflexive, with knappers responding to changing affordances rather than imposing invariant forms (Malafouris 2013), but the production of hand-axes—and especially cleavers—unfolds from decisions made early in the reduction process linked to raw material properties and to an intended end-form (Gowlett 2006; Herzlinger et al. 2017).

  4. The extended life histories of large Acheulean tools are the product of cooperative societies in which technology is entangled with daily lives as conduits and creators of meaning (Pfaffenberger 2001; Goren-Inbar 2011; Hodder 2012). The evidence for caching of hand-axes (Preysler et al. 2018) indicates the shared abstraction of future use (Hockett 1960; Gärdenfors 2004).

  5. The standardised forms and cultural selection of production processes are recurrent conventional constructs indicative of symbol-based language (Holloway 1969; Peirce 1998). The forms may have held semiotic value to those who made, used and viewed them, but we cannot know the culturally specific meanings of the signs, including interpretants, generated by these objects. The identification of recurrent ergonomic design features in hand-axes and cleavers (Gowlett 2006), however, provides a way of disentangling Peirce’s triad as applied to these forms. For objects, his theory of signs specifies a logical-causal relation between material form and the signalling of meaning as indexes (proximity, causation) and icons (resemblance), whereas symbols are conventional constructions more dependent on cultural knowledge to interpret (Wallis 2013:210). The process of making a hand-axe involves responding to raw material constraints (e.g. internal flaws) and changing opportunities (e.g. edge angles) during the production process (Mahaney 2014; Shipton 2018). Adjustments are made in response to these indexes in relation to an implicit awareness of the design imperatives (Wynn and Gowlett 2018). The form of the tool signals immediate or future actions and as such is an icon, and this association can extend to components used in the knapping process, such as hammers and cores. An element of cultural knowledge exists in indexes and icons, but symbols are essentially arbitrary constructs of meaning though ultimately linked to the material object.

The superstructure of our argument, building on Peirce, is uniformitarian in design and content. Cross-cultural observations drawn from pre-industrial societies demonstrate the centrality of tools as media for generating and transmitting meaning and value. Tools have expressive symbolic value beyond fulfilling particular functions, and in the case of hand-axes and cleavers, they may have had multiple uses (McCall 2016: Chapter 3). The ability to agree value is distinctly cultural, and we make the wider point that symbols do not have to be reserved for ritual or other rarefied activities. Peirce makes no assumptions about the association of symbols with specific behaviours, and nor do we. Objects made to arbitrary repeated forms, such as a butter knife, are the products of symbolic thought. We assume that this was also the case in the past with hand-axes and cleavers. We also argue that the development of labels (words as symbols) for the repeated forms of the hand-axe, cleaver and perhaps the pick was the most efficient way of referring to these objects where proximity was not possible (pointing as an index), and gestural images (icons) were too ambiguous to convey intention clearly (Donald 1991). Clarity of intention is also relevant in making the case for the efficacy of words in teaching to make complex tools (Morgan et al. 2015; Gärdenfors and Högberg 2017; Herzlinger et al. 2017; Laland 2017; Lew-Levy et al. 2017).

Our typology of grammars contributes to the growing gradualist approach to language evolution by highlighting the capacity of simple word order to convey meaning without the need for complex grammar (Hurford 2004; Piantadosi and Fedorenko 2017). Cross-cultural evidence for the correlation of group size with grammatical complexity (Lupyan and Dale 2010; Dale and Lupyan 2012) adds support to the contention that Homo erectus, with a language based on words as symbols with minimal grammar (a G1 language), could have created complex tools, including boats, and planned for the future by provisioning landscapes and reaching distant islands in Southeast Asia. We are not the first to attribute the capacity for symbols and language to H. erectus (e.g. Deacon 1997; Tobias 2005; Gowlett 2009), but our claim is based on a semiotic framework linked explicitly to technology and a distinct typology of syntax (G1-G3 grammars) as sufficient to underwrite language.

Human tool-making is an order of complexity greater than that of any other animal, and that is in part because language has integrated technology into all aspects of our social lives (Arthur 2009). Learned traditions of tool use and making exist in non-human primates, often focused on immediate needs with minimal attention to the form of tools (Goodall 1986), but chimpanzees show a nascent capacity to categorise tool function (Grüber et al. 2015), which suggests that the ability to partition causality existed in our last common ancestor. There are hints too of vocalisations that are referential and learned, which, if supported by observations in the wild, would add to the behavioural flexibility of that common ancestor, and to the case for a gradual and early evolution of language.

The archaeological record suggests an early awareness of icons based on intentional use of resemblance, and by two million years ago, hominins had developed a reliance on technology and a range of cooperative behaviours that exceeded those seen in other primates today (Plummer and Bishop 2016). With the emergence of the Acheulean tradition 1.7 million years ago comes the first evidence of attention given to the visual form of artefacts, in this case a large symmetrical hand-axe from Olduvai Gorge that prefigures the standardisation of the hand-axe form later in the Acheulean after 1.2 million years ago (Diez-Martín et al. 2019). The establishment of conventions of hand-axe and cleaver forms, and multiple ways of making these tools (Sharon 2009), marks the development of symbols and language.

The capacity to share abstract concepts using language was a key transition in the evolution of communication and in hominin evolution. By extending that capacity to H. erectus, we are not denying the achievements of Homo sapiens; we are simply placing them in a broader evolutionary time frame which accords with current evidence.