Part I: Culture and the development of human cognition

Introduction

The idea that culture can change cognition has been a focus of renewed interest in recent cognitive science research. While many agree that human cognition is partly learned, cultural learning views take several forms. According to Tomasello’s Cultural Intelligence Hypothesis (Herrmann et al., 2007), humans are “adapted for culture” (Tomasello 1999b; see also Tomasello 2020). We possess a suite of biological adaptations for social cognition, including imitation, a high-fidelity copying mechanism in which agents carefully reproduce the actions they see others perform (Boesch and Tomasello 1998; Fridland and Moore 2015); and adaptations for the skills and motivations for ‘cooperative communication’ (Herrmann et al. 2007; Tomasello 2020). Cooperative communication requires some uniquely human forms of theory of mind (Tomasello 2008)—if not the ability to pass explicit false belief tasks (O’Madagain and Tomasello 2021). Tomasello has argued that these early-developing aspects of social cognition are elements of our “unique skills and motivations [for] shared intentionality” (Tomasello 2020, p. 4), which are “almost certainly adaptations for life in a cultural group” (ibid., p. 3). These adaptations differentiate our minds from those of other species, and enable the acquisition of language and culture. Language and culture in turn extend the cognition of which we are capable, but only because we possess adaptations that make their acquisition possible.

While all agree that human cognition is a product of both biological adaptation and cultural forces, Heyes doubts that human minds are adapted for social life to the degree that Tomasello proposes. She argues that the socio-cognitive differences between humans and non-human great apes are more a product of cultural forces. Rather than holding that humans possess adaptations for ‘shared intentionality’, Heyes argues that humans fostered cultural behaviours that supported the development of our social learning skills. In turn these facilitated the development and acquisition of ‘cognitive gadgets’, which extended the cognition of which we are capable. While Heyes does not deny that humans may be adapted for social learning, on her account these adaptations consist primarily of socially beneficial attentional biases (Heyes 2018).

Tomasello and Heyes both focus on what makes human minds unique, posit a significant role for social learning in the evolution of human cognition, reject Chomskyan nativism about syntax, and emphasise that human cognition is partly a product of cultural evolution. Their accounts thus have much in common, and stand against thinkers in the Fodorian and Chomskyan traditions, who take human cognition to be largely innate.Footnote 1 Nonetheless, their explanations of cognitive development are substantially different.

In this paper, we consider the sorts of evidence that might shed light on the disagreements between Heyes and Tomasello. After developing the contrasts between their views in more detail ("Heyes and Tomasello: developing the contrast" section), we give a brief survey of the heuristics that are currently used to adjudicate whether traits are likely to have cultural origins ("Heuristics for determining trait origins" section). We find that while existing heuristics constrain interpretations, they can be consistent with both cultural and biological explanations. In "The immersion problem" section we consider a further obstacle to the task of determining human trait origins, the ‘immersion problem’. Given that all human populations are raised with culture, we cannot practically or ethically study how human minds would develop in the absence of culture. In "Learning (from) apes: the effects of culture on cognition" section we propose a partial solution to the immersion problem: renewed attention to studies of enculturated great apes. We argue that by re-examining studies of great apes raised with and without the influence of human culture, we can better understand both the ways in which human culture changes cognition, and the limits of great ape cognition. While these studies do not settle debates about cognitive origins, they can help us to overcome the limitations of studies of human cognitive development, and provide a foundation for potentially valuable new research.Footnote 2

Heyes and Tomasello: developing the contrast

As both Heyes and Tomasello recognise, all human cognition is the product of both biology and culture. This is partly because cognition is a phenotypic trait, rather than a genotypic one; and because the human environment is inherently cultural (Keller 2010). Moreover, the deliberate manner in which humans have constructed our environment (or ‘niche’—Day, Laland, Odling-Smee 2003; Jablonka 2011) has led to an unprecedented degree of gene-culture co-evolution in our species (Jablonka 2011). Human cultural behaviour has generated biological selection pressure for a range of traits. For example, cattle farming in Northern Europe led to biological selection for genes promoting lactase persistence in that region around 6.5kya (e.g. Henrich 2020). Neither Tomasello nor Heyes would dispute this. Nor would they dispute that human cognition could be a product of the Baldwin Effect (Baldwin 1896).

Heyes and Tomasello also agree that human cognition is to some extent a product of cultural evolution. Cultural evolution is a process in which changes to cultural practices are selected in a manner that mimics Darwinian selection. Following advantageous refinements to existing cultural practices (e.g. concerning the manufacture of tools), older variants of those practices may fall into disuse. Over time, functionally superior cultural practices emerge at the expense of the inferior variants, analogous to processes of genetic selection. Heyes (2018) has argued at length that cultural evolution can generate not only new material technologies, but also new cognitive technologies or, in her words, ‘cognitive gadgets’. This idea is also present in Tomasello’s work: for example, in his claim that certain types of reasoning are made possible only by culturally evolved forms of natural language (O’Madagain and Tomasello 2021; see also Moore 2021). Nonetheless, Tomasello and Heyes disagree about which human cognitive traits are products of cultural evolution, and which are part of our biologically inherited ‘start-up kit’ (Heyes 2018), i.e. the set of unlearned cognitive abilities that infants can recruit for learning, and which made the cultural evolution of new cognitive gadgets possible. While both agree that adult forms of uniquely human ‘Theory of Mind’ are the product of cultural evolution (and the emergence of particular forms of natural language) (Heyes and Frith 2014; O’Madagain and Tomasello 2021), Tomasello has argued that many other uniquely human cognitive traits, including simpler forms of mindreading (Tomasello 2006, 2008), are indicative of adaptations in the hominin lineage, whereas Heyes argues that the same traits are products of cultural learning.

A further disagreement concerns the origins of imitation, a high-fidelity form of social learning. Tomasello, but not Heyes, thinks imitation is an adaptation in the hominin lineage. His argument is grounded in studies of social learning in chimpanzees, and the finding that while both chimpanzees and young children can learn by emulation, only children imitate (Tennie et al. 2009). This enables children to copy behaviours that cannot easily be learned by emulation, like the arbitrary words of natural languages (Moore 2013a; Acerbi and Tennie 2016). Tomasello argues that since the hominin lineage split from our last common ancestor (LCA) with chimpanzees and bonobos, human ancestors acquired a biological adaptation for imitation (e.g. Henrich 2015; Tomasello 2020; though see "The enculturation of social attention" section). In contrast, Heyes has argued that the Mirror Neuron System (MNS), hypothesised to support imitation (Rizzolatti and Sinigaglia 2008), is learned in ontogeny via general purpose associative learning mechanisms that generate associations between observed and performed actions. She argues that humans have become uniquely skilled imitators not because of any inherited MNS but because we have adopted cultural practices that increase our opportunities to experience the co-observation of our own and others’ actions (Heyes 2018).

In the following sections we consider whether there are any general methodological principles that help to determine when cultural explanations of trait origins are appropriate. We will argue that the principles we consider do not settle disagreements between Heyes and Tomasello. Nonetheless we argue that currently under-utilised studies of enculturated great apes could help to resolve origins debates. While enculturation studies are sometimes thought unscientific and treated with suspicion (Lyn 2017), when interpreted cautiously they can tell us much about whether historical developments in hominin cognition are likely to be a product of biological, cultural, or ecological changes. Moreover, they can help us to specify the nature of the mechanisms that support uniquely human features of cognition, and the ways in which these mechanisms might have been trained through enculturation.

Heuristics for determining trait origins

The cognitive development literature already contains a number of criteria used for determining when biological and cultural explanations of cognitive and behavioural differences are appropriate. However, as we will show in the following sections, these criteria do not settle all origins debates.

Learning and learnability arguments

Poverty of Stimulus (PoS) arguments set out to show that knowledge that agents must acquire by a certain stage of development could not be learned in the available time; and consequently that this knowledge must be unlearned (and so by implication genetically encoded). One influential PoS argument is Chomsky’s argument for the origins of syntax (Chomsky 1986). Chomsky argues that since children don’t make certain kinds of syntax mistake while learning to use language, and given their limited exposure to incorrect models of syntax, knowledge of syntax must be unlearned. The application of PoS arguments also includes cases where learning might in principle be possible, but where the unreliability of learning makes it risky. For example, responses to alarm calls have been hypothesised to be evolutionarily preserved on the grounds that naive youngsters might not get a second chance to learn a call indicating a nearby predator (Tomasello 2008).

Successful PoS arguments show that some bodies of knowledge must be unlearned. Nonetheless, there is disagreement about which bodies of knowledge must be explained in this way. For example, Chomsky’s claims about the innateness of syntax are disputed (Cowie 1997; Tomasello 2003; Heyes 2018). PoS heuristics are also fallible. If a trait is present in neonates and changes little in ontogeny, that may be evidence it is substantially biologically inherited; and traits that emerge slowly and improve with training may be better explained as products of learning. However, there are exceptions.

Meltzoff and Moore’s (1977) finding that neonates stick out their tongues in response to observing adult tongue protrusions has long been interpreted as evidence of an innate mechanism for imitation in humans, on the basis of its early development. With the discovery of mirror neurons (Gallese et al. 1996), the mechanism enabling neonate imitation was thought to have been discovered. The Mirror Neuron System (MNS) was in turn hypothesised to be an adaptation for matching self and other behaviours; nature’s answer to the problem of other minds (Rizzolatti and Sinigaglia 2008). However, recent findings cast doubt on the existence of neonate imitation. Oostenbroek and colleagues found that neonates were just as likely to produce non-matching behaviours in response to demonstrations as matching ones, suggesting that infants do not copy others’ behaviours shortly after birth (Oostenbroek et al. 2016), and that humans may not possess an innate adaptation for imitation. Here, then, an apparently early-emerging behaviour turned out not to be good evidence of adaptation.

Furthermore, while a trait’s developing late in ontogeny and only after training may be evidence against its being an unlearned adaptation (as in the case of reading, for example), adapted traits can be both slow-developing and trained. Bipedal walking, for example, is uncontroversially an adaptation in the hominin lineage, but children learn to walk, do so only at around twelve months, and benefit from training. So again these criteria for identifying biologically encoded traits are imperfect: they are heuristics rather than rules.

Impairment

Another way of identifying our biologically inherited start-up kit is through the study of cognitive impairments. Some genetic defects may impair subsequent learning; and while not all genetic defects are biologically inherited, they are often so. For example, members of the KE family have been identified as possessing a defective FOXP2 gene, which causes a combination of issues including orofacial dyspraxia, and problems with both verbal articulation and sequence learning that leave “virtually every aspect of language and of grammar affected” (Fisher et al. 1998 p.168). This gene has been identified as playing a key role in language development, and perhaps especially syntax development.

While genetics can help to identify adaptations, this process may not be straightforward—not least because the science of genetics is still subject to revision. We know relatively little about how genes are expressed at the phenotypic level—such that even where genes have been identified as being correlated with particular cognitive traits, the underlying causal pathways may be poorly understood. For example, the precise function of FOXP2 remains a matter of debate. Berwick and Chomsky (2016) argue that its central function lies in the phono-articulatory system, independent of syntax, while Christiansen and Chater (2016) think its primary involvement is likely to have been in general purpose sequence learning, subsequently exapted for use in syntax. Since the existing FOXP2 data can be multiply interpreted, they do not resolve questions about adaptive functions. Moreover, since even culturally learned traits have essential genetic underpinnings that can be impaired (Heyes and Frith 2014), genetic evidence is currently unlikely to confirm whether traits are learned.

Cultural variation

Cultural variation also bears on questions about trait origins, because universality may indicate genetic origins, whereas variation can indicate cultural origins.Footnote 3

Since people from any geographical region can excel when raised in the cultural practices of any other group, cognitive differences between human populations are almost certainly products of culture, and perhaps of environmental (e.g. nutritional) differences. Nonetheless, there are genetic differences between geographically distinct populations (for example, with respect to lactase persistence and disease resistance), and cognitive differences could in principle arise in genetically isolated populations. The field of cultural genomics seeks to determine whether there may be genetic underpinnings of culture-specific cognitive traits (e.g. Chen and Moyzis 2018). However, the prospect of progress in this field is hampered by the issues considered in "Impairment" section. Even where we can identify correlations between genes and culturally variant cognitive traits, the causal pathways that support these correlations remain poorly understood. This makes it hard to determine whether culturally variant cognitive traits are supported by genetic variation, even where genetic correlations can be identified.

Potential genetic differences could be discerned through studies of how cognition changes following exposure to other groups. Genetic and cultural traits change on different timescales. While it takes many generations for even advantageous biological adaptations to seed within a population, cultural differences can be overcome within a generation. As a result, if—due to new forms of cultural exposure—genetically isolated populations can acquire new cognitive traits within a generation, the previous absence of a trait within that community is unlikely to be because they were genetically ill-equipped to acquire it. In contrast, a long-lasting inability to acquire that new trait may point to some genetic component that makes learning the relevant trait more difficult.

To illustrate with an example, it has been hypothesised (1) that the (until recently geographically isolated) Pirahã people have no number terms in their language, and that correspondingly they are poor at performing calculations involving anything but the smallest integers (Frank et al. 2008), and (2) that the Pirahã language lacks recursion (Everett 2009). While these traits could have genetic underpinnings, the Pirahã people’s increasing exposure to other communities suggests that this is not the case. One half-Pirahã individual is already fluent in Portuguese (although less fluent in Pirahã) (Caleb Everett, personal communication), which uncontroversially involves recursion. If the Pirahã can learn recursive languages like Portuguese, then the alleged absence of recursion in the culturally evolved Pirahã language is not evidence that the Pirahã people lack an innate cognitive architecture for recursion (contra Everett 2009). As the Pirahã become more integrated into Brazilian society, numeracy and fluency in Portuguese will likely follow—putting an end to the idea that they are incapable of learning otherwise common human cognitive traits.

The considerations above show that even fairly reliable heuristics for evaluating evidence about the origins of cognitive traits are imperfect, and that the evidence they draw on is not always easy to interpret. Claims about whether traits are the result of biological adaptations (or exaptations) must therefore be handled with care.

The immersion problem

The foregoing considerations show that disagreements like those between Heyes and Tomasello may not easily be solved. What we call the ‘immersion problem’ constitutes an additional obstacle to determining trait origins, because it shows the limitations of appealing to cultural variation as a method for identifying cultural features of cognition. The problem of humankind’s immersion in culture is that some human cultural traits spread so easily that they acquire features more commonly associated with biological traits—namely, their seeming universality and early development. While many cultural behaviours were once shared only within small groups, leading to variation between culturally isolated groups, increasing globalisation may have resulted in the near universalisation of cultural traits like imitation. Such cultural traits may be so easily invented and acquired, and so useful, that they are adopted by all who encounter them. They may also have been independently invented many times in human history. Thus, over time, cultural variation between human populations has diminished—giving the potentially misleading impression that traits that are cultural in origin are the product of biological adaptations in the hominin lineage.

Not only are human cultural practices pervasive, they are hard to escape. Newborns are immersed in culture from birth, and no society is culture-free. This means there may be no clear pre- and post-cultural stages in human development. The best approximation comes from neonates, who have limited exposure to culture. However, testing and interpreting the cognition of neonates is practically difficult, given the limited range of behavioural responses available to them. Moreover, as cultural immersion begins early and persists throughout ontogeny, a clear entry point may still be elusive. Cultural forces may also be invisible to us, precisely because they are universal. In this respect, children may receive cultural input (e.g., in the form of emotion regulation, imitation training, or positive reinforcement) that goes unnoticed because it is so unremarkable (see Heyes (2016) for possible examples).

An additional feature of the pervasiveness of culture makes its influence on human development yet more difficult to study. Given that no human group is culture-free, control groups of populations raised without culture are elusive. Possible candidates include so-called ‘wild children’, who have grown up deprived of ‘normal’ human interaction, either because they were orphaned and raised by animals (the ‘wild’ boy of Aveyron (Lane 1976)), disabled in ways that limited their exposure to culture (Ildefonso (Schaller 1995)), or abused (Genie (Curtiss 1977)). In such cases it can be difficult to draw conclusions about the origins of cognitive impairments because combinations of abuse, neglect, and deprivation make it hard to disentangle the cognitive consequences of children’s (lack of) enculturation from the consequences of abuse.Footnote 4 Since the most extreme cases of wild children are rare and reports of them potentially exaggerated, reported traits form an unsatisfactory basis for scientific conclusions. Since these children’s developmental environments cannot be ethically replicated, we may never know exactly how to explain the causes of their developmental differences.

The aforementioned issues point to a central, if implicit, bone of contention in disagreements between Tomasello and Heyes. Tomasello’s work on pre-verbal children, and on the scientific value of comparing the cognitive abilities of younger children and non-human great apes, has always been motivated by the concern that the effects of human culture on the cognition of older children (perhaps from the age of 3–4 years) are too pervasive to make comparisons informative (second author, in conversation). The rationale has been that, if we want to understand the true differences between humans and great apes, we need to study young children, before the effects of culture (and especially language) on cognition become so pervasive as to make underlying differences uninterpretable. In light of this rationale, most of Tomasello’s work in developmental psychology has focussed on children between the ages of 12 months and 4 years. This is practical, since such children are relatively available and easy to test in paradigms using behavioural measures. However, cultural immersion need not begin only at around 12 months. Heyes (2016, 2018), among others, has challenged this assumption by drawing attention to the ways in which social learning and cultural influences shape the behaviour of even younger infants. Cultural training starts early. For example, western caregivers imitate their infants frequently during the first year of life (Pawlby 1977).

The pervasiveness of human culture makes it hard to identify clear-cut before and after culture stages of development, prevents us from recognising some cultural inputs as such, and prevents us from relying on comparing ordinarily developing children with control groups. These issues add to the shortcomings described in "Heuristics for determining trait origins" section. Not only are heuristics insufficient to settle disagreements about trait origins; new studies of young children may also leave these disagreements unresolved.

Learning (from) apes: the effects of culture on cognition

Some of these limitations can be overcome by looking at great ape enculturation and training studies. Enculturation studies look at the abilities acquired by non-human great apes when raised in human-like environments. Enculturated apes include both highly enculturated individuals, who are raised only by humans, and semi-enculturated individuals, who experience extensive human interaction and training but still live with conspecifics in controlled environments (Henrich and Tennie 2017). Such apes are contrasted with those raised in zoos, who are ordinarily raised by their own mothers (albeit in proximity to humans), and who do not experience the same quality and frequency of interaction with humans (Call and Tomasello 1996). In what follows we use the term ‘enculturated apes’ to refer to both highly and semi-enculturated individuals, and specify the kind of apes involved in the case studies we present.

Enculturated apes may receive deliberate, task-specific training in human-like abilities, but they do not always.Footnote 5 Nonetheless, their cognitive development can still benefit from being raised among humans (Leavens et al. 2019). As the case of the enculturated bonobo Kanzi shows (see "Language and communication" section), untrained subjects may develop in unexpected ways when left to their own devices in human environments. Studies of both trained zoo apes and enculturated apes are potentially worthwhile, because they tell us what these species can achieve with different cultural inputs. By comparing wild great apes with those raised in human environments, we obtain a control group for studying the effects of human culture on cognitive development, and thereby the before- and after-culture comparisons that are not possible when studying humans. Since the development of enculturated apes and human infants can also be compared, enculturation studies also help to identify key biological differences between species. If enculturated apes fail to acquire traits that are acquired by human infants in comparable environments, this may point to underlying biological differences that inhibit their capacity to learn.

The fact that great apes can acquire certain human abilities through learning does not tell us that the same abilities are not biologically adapted in humans—e.g. via the Baldwin effect (Baldwin 1896). Thus, it may be that there are cognitive traits that can be acquired by great apes raised in human environments, which are nonetheless more easily acquired by humans in the same environment, because the latter but not the former have undergone biological adaptation for more easily acquiring that trait. Nonetheless, findings from enculturated apes can help us to determine when biological changes are needed to explain cognitive differences between species. If great apes acquire human-like cognitive traits when raised in human environments, this is evidence that we need not appeal to biological changes to explain the appearance of that trait in hominin history. Instead we might appeal to changing developmental environments. These developmental environments need not themselves be a product of genetic changes in our hominin ancestors. Rather, they might have arisen as a result of changing ecologies, which either led to the (e.g. epigenetic) expression of, or induced our ancestors to develop and rely on, abilities that previously were latent (Moore 2017b); or because of the development of new cultural practices (for example, in styles of parenting) that changed the cultural inputs to which developing infants were exposed.

In what follows we illustrate the impact of culture on great ape cognition by reference to three traits: (i) number cognition, (ii) language, and (iii) social attention, domains that we consider promising leads for future research. While the first case we present is not strictly one of enculturation but of training, it exemplifies how exposure to cultural tools not native to a species’ ecological niche can have far-reaching cognitive effects.

Before elaborating on these enculturation studies, we start by acknowledging that enculturation studies get a bad rap, often with good reason. The documentary Project Nim presented Terrace’s efforts to raise the chimpanzee Nim in a human environment as both callous and unscientific. A more recent scandal involving the removal of Savage-Rumbaugh from the (now renamed) Iowa Primate Learning Sanctuary (IPLS) over ape and staff welfare considerations further undermined confidence in enculturation projects (Hu 2018). Criticism of the scientific value of enculturation projects has also grown. As noted by Lyn (2017), in early communication studies some of this revolved around the idea that enculturated animals might be mimicking their human caregivers without understanding them. While this led to more carefully controlled follow-ups, suspicion remained. Concerns about a lack of experimental rigour and exaggeration in the documentation of enculturation research (Rivas 2005; Pepperberg 2017) have been exacerbated by the eccentric behaviour of high-profile enculturation researchers, and their willingness to make strong claims on the basis of anecdotes that cannot be independently verified or evaluated. For example, Koko, the famous (and now deceased) enculturated gorilla, appeared in a 2011 YouTube video in which she was shown choosing between hand-written descriptions of different options for expanding her family.Footnote 6 This video appeared to suggest that Koko could read, not to mention understand a series of complex counterfactual propositions. These are claims that no credible scientific evidence supports. Such behaviour fuels the suspicion that those engaged in enculturation projects are not producing scientifically credible research. Perhaps as a consequence of this scepticism, in 2017 there seemed to be no scientifically funded great ape enculturation projects anywhere in the world (Lyn 2017).

While the foregoing reflections may seem gossipy, we think it’s important to acknowledge them because of the ways in which such episodes have created a research environment in which it is easy for researchers to overlook or dismiss the significance of findings from enculturation research. Such episodes are often discussed by researchers in private but not in print. Nonetheless, these discussions can engender sceptical attitudes that lead to the quiet erasure of enculturation data from published research. In order to reassess this erasure, we think it important to acknowledge the causes of this (often reasonable) scepticism. We agree with critics on the need for caution when interpreting the abilities of enculturated apes, and are not advocating for the uncritical adoption of enculturation research. Researchers making use of these data should be attentive to the possibility that studies are poorly controlled, findings exaggerated, and anecdotal reports uncorroborated. Nonetheless, as Lyn (2017), Lloyd (2004), and Leavens et al. (2019) have argued, enculturation studies provide a unique opportunity to investigate the potential abilities of other species, where these might need training or a specific learning environment to emerge. We consider the following cases in this spirit. Although the following sections do not constitute a complete survey of the effects of enculturation on great ape cognition, they identify three broad domains of cognition (symbolic cognition, communication, and social attention) for which enculturation seems to engender significant cognitive change. The studies we now report were all carefully controlled, and can help to inform the outstanding disagreements between Tomasello and Heyes.

Sheba and number cognition

Learning numerals enables new cognitive abilities in humans (Everett 2017). Studies of Sheba the laboratory-reared chimpanzee by Boysen and colleagues suggest that this is true of chimpanzees too. Sheba was trained to associate placards bearing different numbers of magnets (from 1 to 3) with corresponding quantities of candies (Boysen 1989). Initially she was presented with a single candy and rewarded when she matched it with a placard with one magnet. After reaching stable performance she was trained with arrays of two and three magnets. Arabic numerals were then systematically substituted in ordinal sequence for the magnet arrays, and the numbers from 4 to 8 were introduced too (Boysen and Berntson 1989; Boysen 1993). Subsequently a comprehension task was introduced in which Sheba was trained to match numbers on a screen with the equivalent placard (Boysen 1993).

After Sheba had been familiarised with these tasks for 2.5 years, experimenters hid oranges in three different locations and presented Sheba with a platform containing numerals. Without any task-specific training, Sheba used the numerals to indicate the quantity of oranges in each location. Boysen and colleagues (2000) interpreted this success as evidence that Sheba’s training enabled her development of a sense of number. Like children, Sheba also spontaneously used her fingers to count before providing her answers (Boysen and Hallberg 2000). Thus learning to associate numerals with quantities enabled Sheba to master numbers and to manipulate them in novel ways.

The effects of number training were also evident in other tasks. In one task testing ordinality, Sheba and two other chimpanzees, Darrell and Kermit, were presented in training with pairs of numerals (e.g. 1–2, 2–3, etc.) and had to select the larger number. In novel tests, non-adjacent numbers (2–4) were presented. When chimpanzees had to choose the higher number, Sheba succeeded from the outset, while Darrell and Kermit performed significantly worse. Sheba could perform the task in reverse as well, systematically choosing the lower number if requested. The experimenters attributed Sheba’s performance to her training. After Darrell had received additional training on numbers, and mastered numbers from 0 to 6 and two fractions (1/2 and 1/4), his performance matched Sheba’s.

In further studies, Boysen and colleagues (Boysen and Berntson 1995; Boysen et al. 1996) tested chimpanzees in a reverse reward contingency task, where they were rewarded with the food quantity they did not choose. To perform optimally the chimpanzees had to choose the lower quantity of food. While visible food made the task difficult for chimpanzees, they succeeded once numerals were used instead—seemingly because substituting symbols helped improve the apes’ inhibitory control (Boysen and Berntson 1995; Boysen 2006). These results support the idea that using symbols helps chimpanzees to master tasks otherwise unavailable to them. Training number cognition also had unintended downstream effects. Sheba performed sums without task-specific training, and performed better in the ordinality task than untrained peers. Since Darrell achieved proficiency after training, Sheba’s performance was not a fluke. Her number training, rather than chance, best explains her success.

Since wild chimpanzees do not engage in symbolic counting, the behavioural changes in Sheba and Darrell provide an unprecedented perspective on what learning numerals can change. Since they were taught numbers but did not experience the range of counting activities familiar to western children, their emergent abilities seem to be due to training in the use of symbols. Darrell matched Sheba’s ability in the ordinality task only after he learned to manipulate numbers in a novel way—a clear effect of his training. While Sheba’s performance in the summing task would be unthinkable without training, her spontaneous summing suggests that learning numbers provided Sheba with a new cognitive gadget, the effects of which exceeded her training.

While the mechanisms at play in Sheba’s number development need to be explored, both their onset and limitations suggest some possibilities. One is that Sheba’s performance is attributable to an acquired ability to represent quantities in a more abstract way (i.e. through numerals). Acquiring a new representational format enabled Sheba and Darrell to relate quantities to each other in a novel way. This is consistent with other proposals that acquired representational formats imbue subjects with new cognitive abilities (e.g. Clark (2006); Everett (2017) on the acquisition of numeracy; and O’Madagain and Tomasello (2021), Moore (2021), and Berio (2021a, 2021b) on Theory of Mind development). In his interpretation of Boysen’s finding in the reverse reward task, Clark (2005, 2006) characterises Sheba as a case where symbolic representations change the computational burden, making difficult tasks computationally tractable.

Language and communication

Enculturation also extends great apes’ communicative abilities. While the reports of language-trained individuals can be exaggerated (Rivas 2005), they show clear differences in the communicative repertoires of enculturated and unenculturated great apes even when cautiously interpreted.

In captivity and in the wild, chimpanzees and bonobos use a relatively small number of calls and gestures, but with little evidence of syntactic structure. Chimpanzees have a repertoire of ≈50 gestures, used with ≈20 different communicative functions (Hobaiter and Byrne 2014), and five categories of distinct calls (Crockford 2019). Gestures are produced intentionally and with identifiable goals. While this was once doubted of vocalisations (e.g. Tomasello 2008), newer evidence suggests that the same is also true of at least some calls (Crockford et al. 2012). The situation in bonobos is comparable, although their gestural repertoire is slightly smaller than chimpanzees’ (≈30 gestures used with ≈15 functions). Like chimpanzees, they also have five acoustically distinct call types (Keenan et al. 2020), although the functions of these calls are less well studied. While calls and gestures are used across a variety of contexts, changing their communicative function, there is little evidence that wild apes combine them to produce more complex semantic effects. There is more to learn about the communicative repertoire of wild Panin—particularly relating to the use and interpretation of signs across contexts (Moore 2014), the question of whether and how signs and facial expressions are combined (Slocombe, Waller, and Liebal 2011), and the functions of graded vocalisations (Crockford 2019). Nonetheless, we know enough to attempt a comparison of the communicative repertoires of wild and enculturated individuals.

Studies of the communicative abilities of enculturated great apes have produced seemingly inconsistent results. Some have reported wild successes (Fouts and Mills 1997). Others present more sober findings (Terrace et al. 1979; Rivas 2005). The most successful project involves the bonobos Kanzi and Panbanisha. While Kanzi was raised by his own mother, Panbanisha was reared by humans and trained to use elements of language. Kanzi was not trained in this way but grew up observing his mother’s interactions with humans, which included Lexigram training. (A Lexigram is a board consisting of a series of symbols that are associated with words of English, creating a visual keyboard that non-verbal individuals can point to as a means of uttering words.) As a result of this unusual infancy, Kanzi was reported to have developed the ability to understand spoken English at roughly the level of a 2.5-year-old child and to communicate relatively fluently using a Lexigram consisting of around 450 signs, with perhaps 30–40 signs used on a daily basis (Savage-Rumbaugh 1986).

Kanzi produces a range of utterances consisting of simple combinations of verbs and nouns indicated using either pairs of Lexigram symbols, or a Lexigram symbol and a pointing gesture—for example, ‘eat’, plus a gesture to some food. He also crafts short multi-sign strings—e.g., by using his Lexigram to produce the utterance ‘Kanzi, ball’ as a way of asking for a ball. This is comparable to the utterances produced by sign-language trained chimpanzees (Rivas 2005) and similar in content and form to the utterances Kanzi produces when combining Lexigram signs and gestures, like points. While seemingly more complex than anything seen in wild apes, these utterances contain no evidence of syntactic complexity.

Kanzi’s comprehension of English exceeds his production abilities. In a 2010 interview, Savage-Rumbaugh described him as capable of understanding perhaps a couple of thousand words of English (although different counting methods report lower numbers (Call 2011)). Furthermore, although Kanzi does not produce grammatically structured utterances, he can track relatively subtle grammatical differences. For example, in testing he responded differentially to:

  (a) Put the tomato in the oil.

  (b) Put some oil in the tomato.

In case (a) Kanzi put the tomato in the oil, whereas in (b) he picked up the oil and poured it in a bowl with the tomato (Truswell 2017). This suggests that Kanzi tracks not only the semantic properties of utterances, but perhaps also syntax-relevant properties like linear order (ibid.). Nonetheless, Kanzi struggles with some grammatical forms that are difficult for but ultimately mastered by young children. For example, in sentences of the form ‘Fetch the tomato and the oil’ he typically brings only one or the other object, suggesting an inability to bind two nouns to the same verb (ibid.).

There are further differences between wild and enculturated chimpanzees and bonobos. While Lexigram-trained individuals use Lexigrams to refer to absent entities (Call 2011), this ability is seen infrequently in unenculturated apes—although it has been reported in captive chimpanzees (Bohn et al. 2015). Additionally, enculturated bonobos point with seemingly pro-social motives (Lyn et al. 2010; see also Leavens and Bard 2011). While some claim that captive apes do not produce or understand points with pro-social motives (Herrmann and Tomasello 2006; Tomasello 2006; Call 2011; although see Moore 2013b), such gestures have sporadically been reported in both wild chimpanzees and bonobos (Veà and Sabater-Pi 1998; Hobaiter et al. 2014). Thus, while both informative pointing and distal reference are seemingly more prevalent in enculturated than wild apes, it is unclear whether these abilities are really new.

Studies of Kanzi’s production and comprehension of English are also suggestive. His relatively fine-grained discrimination between grammatically sophisticated sentences of English suggests that whatever abilities support the comprehension of English grammar, they have some analogue in great apes. Since great apes do not seem to produce or understand syntactically complex utterances in the wild, they could not have undergone natural selection for syntax-relevant abilities in a communicative context (Lloyd 2004). More likely, these abilities were selected for non-communicative functions and recruited for use in communication only when enculturated apes found themselves in a language-rich environment.

The fact that three-year-old children’s grammatical competence soars at just the point where Kanzi’s stopped developing also tells us that there might be a limit to what even enculturated bonobos are capable of learning. This suggests a biological difference between children and non-human great apes. However, we do not currently know whether this difference is language-specific or a result of domain-general cognitive differences (for example, in working memory).

The enculturation of social attention

Humans and great apes also differ in their social learning abilities. Whereas human children are proficient imitators, zoo-reared great apes are only emulators (Tennie et al. 2009). Imitation is a high-fidelity copying mechanism, in which an agent attends to both the particular action performed by another, and the goal in pursuit of which the action is being performed, and endeavours to reproduce both (Boesch and Tomasello 1998; Fridland and Moore 2015). In contrast, emulation is when “an individual observes and learns some dynamic affordances of the inanimate world as a result of the behaviour of other animals and then uses what it has learned to devise its own behavioural strategies” (Boesch and Tomasello 1998, p. 598). Imitation differs from emulation in the fidelity with which an agent attempts to copy the behaviours they observe. This difference matters because some behaviours are useful only when copied precisely. For example, imperfectly copied knots may be useless (Tennie et al. 2009), and the imperfectly copied words of a language incomprehensible to others (Moore 2013a). The emergence of imitation in phylogeny is therefore hypothesised to be key to human development (Tomasello 2008; Henrich 2015).

Tramacere and Moore (2020) have hypothesised that imitation could have emerged in hominin history following natural selection for both fine-grained motor control in the mouth and limbs and better social attention, rather than as an adaptation in its own right. Their idea is that whereas chimpanzees look to affordances of both objects and their environment to figure out how best to use them (emulation), our ancestors underwent selection to look to one another for potential solutions to the problems that they encountered (imitation). Among other things, this might have caused them to attend more closely to others’ manual and vocal behaviours, facilitating accurate copying. Heyes (2018) agrees that natural selection for social attention and motivation likely played a role in the evolution of the human ‘starter kit’, but has argued that such behaviours were additionally fostered by social practices which reward individuals for more precise copying. These claims are consistent: imitation could be the result of adaptations for both increased motor control and social attention and cultural practices that encourage attention to self and others.

These hypotheses could be tested in studies of enculturated great apes. If enculturated great apes attend to the world in more human-like ways—and so look more to one another in problem-solving tasks, rather than to the environment—then they should perform better in imitation tasks than zoo apes. Eye-tracking studies should also reveal differences in their gaze behaviour. Confirming earlier reports of imitation in enculturated chimpanzees (e.g., Hayes and Hayes 1952; Tomasello et al. 2005), Pope and colleagues (Pope et al. 2017) trained four captive chimpanzees to reproduce demonstrated actions. Following training they also found a range of changes to areas associated with the mirror neuron system (MNS) in humans—suggesting that white matter connectivity changes in response to behavioural training. In a separate study, Pope and colleagues (Pope, Russell and Hopkins 2015) also found that imitation recognition in chimpanzees is correlated with socio-communicative competence—consistent with their having common underlying cognitive abilities.

Another way of exploring the hypothesis that social attention training makes great ape cognition more human-like comes in the domain of pointing and communication studies. A longstanding but simplistic claim has been that human children understand pointing, whereas great apes do not; and that this contributes to an explanation of why children alone acquire language (Tomasello 2008). Infants certainly excel at pointing comprehension (Behne et al. 2012), whereas zoo apes often perform poorly (Tomasello, Call and Gluckman 1997; Herrmann and Tomasello 2006). Tomasello and colleagues have argued, on this basis, that great apes do not understand the Gricean communicative intentions required for informative pointing comprehension, both because this requires grasping “the embedding of intention within another” (Tomasello 2006), and because they “do not understand communicative acts with either a helping or a sharing motive” (Herrmann and Tomasello 2006, p. 527). Since children do understand such intentions, comprehension of which is needed for language development (Tomasello 2008; Moore 2018), only children go on to master language. The same conclusions also suggest that there ought not to be great apes who can understand pointing—at least not if doing so requires using the same metarepresentational abilities (‘embedded intentions’) implicated in human communication.

This conclusion is problematic for a number of reasons. First, enculturated chimpanzees and bonobos perform well on pointing comprehension tasks (e.g., Lyn et al. 2010). Indeed, in most groups of apes, some individuals also understand informative points statistically above chance (e.g. Moore et al. 2015). Even unenculturated chimpanzees perform above chance on tasks designed to give them more time to think before making their decisions (Mulcahy and Call 2009). This makes it likely that when apes perform poorly in pointing comprehension studies, it’s because they aren’t paying attention. This conclusion is consistent with its being the case that great apes do understand communicative intentions (Moore 2016, 2017a), and that comprehension of such intentions does not require any particularly demanding metarepresentations (ibid.).

A recent finding supports this conclusion. In the first eye tracking study of great apes’ comprehension of human communication, Kano et al. (2018) looked at the effect of human communicative signals on the ways in which chimpanzees attended to identical objects. In the test condition, apes watched a video of an experimenter looking at them, calling their name, and then looking towards one of two identical objects placed in the bottom corners of the screen. The study sought to determine whether ape observers would follow the experimenter’s gaze to the target object more when he first engaged them with ostensive gaze than in a series of control conditions where ostensive gaze was replaced by a salient, non-communicative behaviour (the experimenter looking down while eating an apple). While the chimpanzees did not spend longer looking at the object cued ostensively, they spent longer examining both objects in the test condition than in the controls. The authors interpret this finding as showing that great apes did recognise ostensive gaze as communicative, but that they did not subsequently use the experimenter’s gaze cues to identify the object of his attention. Rather, they scanned the whole environment for evidence about what the experimenter might be telling them. They scanned less in the control condition because they were not looking for evidence to disambiguate the experimenter’s communicative behaviour. This finding suggests that a single explanation may account for both great apes’ imitation behaviour and their failure to use information to interpret others’ points. In both task types, captive apes look to the environment to learn about it. In contrast, humans look to other humans and use their intentional behaviour to learn about the world. If this is right, great ape enculturation may incorporate a learning process through which apes come to acquire information using more human-like search strategies. 
This is consistent with an earlier hypothesis, according to which “in growing up with humans … apes become both more competent and more motivated to pay attention to the things that humans do” (Tomasello et al. 2005, p. 113).

The training of social attention might also have implications for a range of other cognitive phenomena. For example, similar enculturation processes may explain the development of joint attention in human history (Tomasello 1999a, b)—although there is currently little evidence of joint attention even in enculturated chimpanzees (Tomasello, Carpenter, and Hobson 2005). It has also been hypothesised that captive chimpanzees’ performance in Stag Hunt tasks would improve significantly with greater experience of the ways in which attention to one’s partner can facilitate coordination (Moore 2017b). While the development of social, rather than individual, information gathering strategies may point to an adaptation in the hominin lineage, future studies should investigate the extent to which social attention can be altered through processes of training and/or enculturation. Explanations that propose that enculturation changes attention are appealing because they specify a mechanism that could, in principle, be trained—namely, which types of environmental stimuli one attends to when seeking solutions to the challenges one encounters.

Taking stock

Having explored three ways in which enculturation studies can inform our understanding of cognitive development, we now finish by clarifying how these studies can help to resolve disagreements about the origins of specific traits.

When it comes to language development and syntax, both Tomasello and Heyes deny that knowledge of syntax need be genetically encoded. However, Kanzi’s communicative abilities have implications for their shared view that neither has acknowledged. The fact that Kanzi’s linguistic development stopped relatively early in his ontogeny may suggest an important biological component of the foundations of syntax that is overlooked in pure social learning views. While we do not yet know what this component is, the salient differences between Kanzi’s performance and that of children provide an important platform for future research, and indicate that both Tomasello and Heyes may need to adjust their views of language development.

With respect to communication, we do not need to posit large scale biological changes in phylogeny to explain chimpanzees’ and bonobos’ (lack of) understanding of human communication (and pointing in particular). Rather, enculturation studies point to a significant role for changes in social attention. That enculturation changes great ape social attention does not mean that these abilities have not undergone natural selection in humans, but it suggests that it may be premature to assume that differences are wholly attributable to processes of natural selection in the hominin lineage. Cultural selection may also play a role. The same applies to imitation. Considering attention training as a key component for socio-cognitive development has the advantage of providing us with a unified explanation of the phylogenetic development of both pointing and imitation. If this is right, then Heyes’s appeals to attentional biases may be better placed to explain the emergence of imitation and pointing than Tomasello’s (2020) appeals to larger-scale adaptations.

Finally, while neither Heyes nor Tomasello talks about the origin of number cognition, Sheba’s case can give us a more general insight into the origins of symbolic cognition and the ways in which it extends human cognition. Sheba and Darrell give us good reasons to think learning numbers provided them with a new cognitive gadget. This is consistent with the predictions of both Tomasello and Heyes, since both defend the view that symbols provide us with new forms of representations that expand our cognitive repertoire (Tomasello 2014; Heyes 2018; O’Madagain and Tomasello 2021).

As a result, the case studies above present a mixed picture. On the development of pointing comprehension and imitation, Heyes’s more austere view seems to have the upper hand. Nonetheless, enculturation studies also offer support for both Tomasello’s and Heyes’s views on the role of symbols in cognition, and challenge their shared view on the absence of any need to posit biological changes to explain natural language syntax. All of these studies reaffirm the value of carefully controlled enculturation research.

Conclusions

This paper set out to consider whether there are features of traits that are either cultural or biological in origin, to which we might appeal to settle a series of local disagreements between Heyes and Tomasello (among others) about the origins of particular human cognitive traits. While the heuristics that we considered in the first parts of the paper can help to determine trait origins, they underdetermine answers to origin questions. The difficulty of identifying traits with a cultural origin is exacerbated by what we call the immersion problem—the fact that we cannot easily study the development of human cognitive traits in the absence of human culture. This problem arises because of infants’ early immersion in their cultural environment, and because studies of humans raised outside ordinary learning environments are difficult to interpret.

As a contribution to resolving outstanding questions about trait origins, we have proposed that researchers should pay more attention to studies of enculturation in great apes. This project could include both renewing attention to existing studies and reports of enculturated great apes, and new controlled studies of already enculturated great apes. When interpreted with appropriate caution, the former may turn out to be a rich and undervalued source of knowledge (although we would not expect them to answer all of the questions we might have).

We recognise that changing attitudes to the ethics of enculturation studies make future enculturation projects improbable. Therefore we do not propose the deliberate enculturation of any new great apes. Nonetheless, many previously enculturated great apes remain alive and in captivity. These individuals could also be studied systematically, so that we might better understand the ways in which their cognition differs from unenculturated peers. The rise of African sanctuaries may also provide opportunities to study great apes who have been exposed to human culture. These sanctuaries house infants who were orphaned by deforestation and the bushmeat trade, and who are consequently hand-reared by humans. Previous studies suggest that hand-reared apes would show enculturation effects in their social attention behaviour, and we foresee no particular ethical issues with testing these individuals before they are returned to the wild.

Existing and future studies present us with an opportunity to observe the effects of (aspects of) human enculturation by enabling comparisons of non-enculturated (i.e. wild and captive) and enculturated (human-reared) subjects, and thereby facilitating observations of the impact of enculturation on cognition. By looking at the limitations of great ape cognition after enculturation, we can additionally identify the limits of ape abilities after controlling for the effects of culture. While we do not claim that enculturation studies alone could settle the questions discussed in this paper, and recognise that some existing enculturation projects may contribute little of value, enculturation studies are nonetheless a potentially valuable source of evidence for understanding the origins of the human mind. They are therefore worth renewed critical scrutiny.

We conclude by noting a final motivation for this paper, which we have not previously discussed. This unstated goal is to build resistance to the assumption that, by default, cognitive differences between human and non-human ape species point to biological adaptations in the recent hominin lineage. This assumption seems to be implicit in the developmental literature (e.g. Tomasello 2008; Henrich 2015) but—particularly since hypothesised adaptations are often described only loosely—we think it holds back research in cognitive development. A key challenge for research on cognitive development lies in the proper characterisation of the mechanisms that support human cognition, and of the ways in which these mechanisms develop in ontogeny and phylogeny. Where adaptations are posited to explain cognitive differences between species, and the details of these adaptations spelled out only by reference to loosely constrained just-so stories, the appearance of progress is superficial. Rather than an account of the ways in which cognitive mechanisms develop, we are left only with a redescription of the claim that cognitive differences between species exist, and a tacit reassertion of the misguided idea that cultural evolution plays a limited role in cognitive development. When appropriately theorised, renewed interest in enculturation studies could help the field to shrug off this lazy adaptationism, and to explore more systematically the potentially important insights that we have sketched here. A more detailed account of the cognitive mechanisms that support great ape cognition, and of the ways in which these mechanisms are changed by processes of enculturation, would be a valuable project for future research.