The gestural communication of great apes has been subject to intensive study in recent years, spurred by the exciting discovery that apes gesture intentionally, something that appears to be lacking in most animal communication (Tomasello, George, Kruger, Farrar, & Evans, 1985; but see Crockford, Wittig, Mundry, & Zuberbuehler, 2012, and Schel, Townsend, Machanda, & Zuberbuehler, 2013, for evidence that chimpanzee alarm-calling also meets the accepted criteria for intentionality). Apes have extensive repertoires of gestures, and they use them to target specific individuals, whom they continue to monitor for the effect of their gesturing (chimpanzee: Hobaiter & Byrne, 2011a; bonobo: Pika, Liebal, & Tomasello, 2003; gorilla: Genty, Breuer, Hobaiter, & Byrne, 2009; orangutan: Cartmill & Byrne, 2010; Liebal, Pika, & Tomasello, 2006). If the desired result is not attained, they repeat or elaborate their gestures until it is (Leavens, Russell, & Hopkins, 2005). Gestures vary in modality, and apes choose gestures whose modality is appropriate to their audience’s state of attention (Hobaiter & Byrne, 2011a; Liebal, Pika, Call, & Tomasello, 2004). If the target audience is already looking at the signaler, a silent visual gesture is likely to be employed; if not, using a contact gesture is more likely; gestures that convey the message with a distinctive sound as well as a visual appearance are used whether the target is attending or not. There is even some evidence that apes take the knowledge of their target audience into account (Cartmill & Byrne, 2007). Using the fact that zoo-housed orangutans use gesture to request treats from their keepers, the “understanding” of a keeper was experimentally varied, from complete incomprehension to partial grasp of the ape’s intent. In the latter case, orangutans kept at it, giving the same gestures at an increased rate; but if the human seemed totally clueless, they switched to using different gestures of equivalent meaning. 
The obvious parallel is with how we change our mimes in a charades game according to whether or not our team is getting the idea.

These very human-like characteristics invite comparison with language and have tended to bolster the long-discussed idea of a gestural origin to language (Corballis, 2010; Hewes, 1973). But great ape gestures are not learned socially, like the words of a language (Call & Tomasello, 2007), nor are most gestures constructed idiosyncratically in dyadic interactions (Genty et al., 2009; Hobaiter & Byrne, 2011a). Ape gestures are species-typical signals that owe more to biology than learning, in both their form and meaning. This is not to say that apes cannot learn gestures socially, or by dyadic construction (e.g., Halina, Rossano, & Tomasello, 2013); nor is it to suggest that the innate potential to develop a gestural repertoire makes apes immune to learning from experience. But development is a matter of exploring a latent repertoire conferred by biology (Hobaiter & Byrne, 2011b), rather than acquiring gestures from scratch. The biological repertoire consists of specific gestures for specific meanings, and all apes have the innate potential to develop large repertoires. Indeed, the gesture repertoire is extensively shared between species. Over 30 gestures are the same in the chimpanzee, the gorilla, and the orangutan (Hobaiter & Byrne, 2011a); 69 are the same across all African ape species, whose total repertoires vary between 73 and 77. This repertoire overlap implies an origin in common descent; thus, gorillas share most of their repertoire with chimpanzees because they are descended from a recent shared ancestor. We can confidently predict that any great ape, more closely related to the two Pan species than to the gorilla, will include in its gestural repertoire over 60 gestures whose form and function we already know from the study of the Pan species, because of its shared biological inheritance. And one ape species fits that bill: Homo sapiens.
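As a rough check on the scale of that overlap, the figures quoted above (69 shared gestures, total repertoires of 73 to 77) can be turned into a proportion. This is only an illustrative sketch: the helper name `shared_fraction` is hypothetical, and the calculation assumes the 69 shared gestures are counted within each species' total.

```python
# Figures quoted in the text: 69 gestures shared across all African ape
# species; each species' total repertoire lies between 73 and 77 gestures.
SHARED_GESTURES = 69
REPERTOIRE_RANGE = (73, 77)


def shared_fraction(shared: int, repertoire_size: int) -> float:
    """Fraction of one species' repertoire made up of shared gestures
    (hypothetical helper, for illustration only)."""
    return shared / repertoire_size


# A larger repertoire gives the smaller fraction, so 77 sets the lower bound.
lower_bound = shared_fraction(SHARED_GESTURES, max(REPERTOIRE_RANGE))
upper_bound = shared_fraction(SHARED_GESTURES, min(REPERTOIRE_RANGE))

print(f"Shared share of repertoire: {lower_bound:.1%} to {upper_bound:.1%}")
```

On these assumptions, roughly nine in ten gestures in any African ape repertoire are shared, which is the sense in which the repertoires are "extensively shared."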

So where are all those predicted gestures? Human gesture is known to take several forms, and we will examine each in turn, to see whether they reflect an origin in shared phylogeny with nonhuman apes. Most familiar of all are the so-called co-speech gestures that we make all the time as adults, although we are generally not very aware of either producing or perceiving them. In the course of development, children gradually come to produce complex forms of co-speech gesticulation, although they first mainly use representational gestures (in 6- to 10-year-olds: Colletta, Pellenq, & Guidetti, 2010). Co-speech gesture plays crucial roles in face-to-face communication for both speaker and listener (e.g., Kendon, 2004; McNeill, 1992, 2005). Such gestures have no direct linguistic function and no meaning on their own, but they lend rhythm, emphasize speech, and sometimes serve an iconic function, which can help illustrate and disambiguate the speaker’s message (e.g., describing the size or movement of objects with the hands). Unsurprisingly, since they are intricately linked to speech, these gestures have no similarity with those made by apes.

Before the age of speech, gesture is the sole or main means of intentional communication for infants and toddlers, and these prespeech gestures persist alongside the production of the first words (Acredolo & Goodwyn, 1988). Evidence from behavioral studies in human infants has highlighted the role of communicative gestures in language development (e.g., Pizzuto & Capobianco, 2005; Rowe & Goldin-Meadow, 2009). In congenitally deaf children, prespeech gestures can develop into a language-like structured communication system, called home-sign (Goldin-Meadow & Feldman, 1977). The analysis of prespeech gesture typically involves the distinction between deictic gestures (i.e., pointing toward an object or event of reference) and motor-iconic gestures, used to represent the characteristics of objects, functions, or states (e.g., putting flapping hands at the shoulder to depict a bird). This general ability for symbolic reference, present from infancy (e.g., Namy & Waxman, 1998), has been shown to allow communities of deaf children to develop sign language, characterized by a syntactic structure (Goldin-Meadow & Morford, 1985; Goldin-Meadow & Mylander, 1998), through the emergence of spontaneous conventions among home-sign users (e.g., in Nicaragua: Senghas, Kita, & Ozyurek, 2004). Although no trace of syntactic structure has yet been detected in the natural gesturing of great apes, claims of iconicity and deictic reference have been made. Certainly, captive apes readily learn to point, and use pointing in interaction with humans.
In the wild, pointing has been described, but it appears to be very rare; Hobaiter and colleagues speculated that the lack of physical barriers to a young ape in the wild means that there is seldom a need to recruit aid by triadic interaction with the mother, and they presented evidence of pointing and joint attention by a young chimpanzee toward its mother when a social barrier blocked access to a desired food (Hobaiter, Leavens, & Byrne, 2014). Iconic reference has been claimed in both gorillas and bonobos (Genty & Zuberbuehler, 2015; Tanner & Byrne, 1996); in both cases, it remains possible that the “iconic” nature of the gesture was visible only to the researcher, and resulted from a coincidental similarity of form. In the main, the natural gestures of great apes are neither deictic nor referential, whereas human prespeech gesture does not appear to include the gestures shared among our nearest ape relatives.

Finally, all human groups develop culture-specific gestures, also referred to as emblems or symbolic gestures, which convey meanings based on conventionality or habit (e.g., nodding the head, waving goodbye) and may be given during speech or on their own. Comparisons across multiple cultural groups have shown differences in the forms of emblems for the same verbal meaning (e.g., different gestures can be related to the verbal message “come”) and, conversely, differences in meaning for the same form (e.g., the ring, in which a circle is made with the thumb and the index finger, means “A-OK” in Britain, but can be offensive in other countries). Very few culturally similar emblems have been described so far, for relatively basic meanings tied to universal physical forms related to the referents of the message (Matsumoto & Hwang, 2013). A few natural gestures of great apes have been argued to be cultural, on the basis of intercommunity differences in their presence or their usual function (e.g., among chimpanzees: Whiten et al., 1999; among orangutans: van Schaik et al., 2003). Whether or not this ascription is validated by further research, it certainly does not apply to the great majority of gestures, whose similarity across widely separated populations, and even different species, is more remarkable (Genty et al., 2009; Hobaiter & Byrne, 2011a).

Our brief overview of human gestures suggests that children gradually learn to convey intended meanings through gesture as a result of experience, either through social learning as a cultural product (especially for emblems) or through reinforcement learning from actions that initially do not involve communicative intention. The latter process has been defined as ontogenetic ritualization (Tomasello & Call, 1997) and may be illustrated with the origin of imperative pointing: Children first attempt to reach for an object, and then come to understand the communicative function of their movements thanks to the reaction of their parents, who interpret the children’s goals and give them the desired object (Cochet, Jover, Oger, & Vauclair, 2014). The gestures we referred to as preverbal gestures in the previous section thus may all develop from motor acts, turning first into basic, mimic-like gestural actions that over time may incorporate more complex forms—that is, forms that are less directly related to the initial action. These learning capacities, which are observed early in child development, may rely on skills that have been argued to be inborn and specific to the human species: the ability to engage in triadic relationships of deixis, and the ability to imitate and integrate others’ or one’s own actions in the service of communication (e.g., Tomasello, Carpenter, Call, Behne, & Moll, 2005).

What is missing from the literature on human gesture is any indication that humans take advantage of their biological inheritance, the gestures that can be expected to be part of human nature from shared ancestry with apes: It seems that 60–70 gestures are missing, somewhere! We predict that—despite appearances—humans have the potential to make and understand all of the gestures that are shared between chimpanzees, bonobos, and gorillas (of course, those species may have developed species-specific gestures in addition, since the time of their evolutionary separation from the human line). It may be that, because language is so much more powerful for conveying intentional meaning, humans normally have little need to employ gestures, and so may well never explore the potential repertoire of innate gestures that they retain from their shared ape ancestry. However, we predict that deliberate experimentation would reveal a latent ability to recognize and interpret “ape gestures” in naïve humans; given the subtlety of many of the gestures expressed by great apes, it might be necessary for humans to mimic the gesture form in somewhat exaggerated fashion. Suitable tests might vary from participant observation (e.g., miming ape gestures to a naïve subject and asking them what was meant), through cognitive experimentation (e.g., in a forced choice judgment of correctness, showing video in which a gesture is made to a target who reacts, or does not react, in accordance with the meaning known from ape studies), to neuropsychology (e.g., comparing the brain areas activated by watching mimes of ape and human gestures with the same or different natural meanings). To our knowledge, none of these tests has been attempted, but readers can make informal investigation themselves: Try out the chimpanzee gesture Hand Fling, in which the fingers are flicked toward the target with the arm extended and wrist uppermost. 
To a chimpanzee, this is a firm request to back off; from our observations, most humans seem to understand it the same way, and we suggest this is because it is part of a universal gestural potential derived from shared ancestry with the chimpanzee.

If we are right, what are the implications for the longstanding controversy (see Fitch, 2010, ch. 13) over the hypothesis of a gestural origin for language? First of all, the fact that all living nonhuman great apes have an extensive, and extensively shared, repertoire of gestures that they use intentionally on a minute-by-minute basis, whereas intentional vocal communication in apes on current evidence is restricted to the single situation of alarm in the chimpanzee, is strong evidence that the last common ancestor of Homo and Pan was also largely a gestural communicator. There is little evidence for pantomime in nonhuman apes (Russon & Andrews, 2010), and their natural repertoires of gestures are restricted in content. Nevertheless, the step to a more open system of gestural communication would be a small one (e.g., Arbib, 2005), in contrast to the challenge presented by vocal learning. The big question is how a flexible system of gestural communication became a language: by abandonment, when the acquisition of vocal learning ability in the human lineage allowed speech, or by incorporation, via an intermediate stage of gestural protolanguage? We suggest that the issue of whether our “ape legacy” of gestures proves to be alive and well in daily use, and has merely been overlooked so far, or is deeply buried in our minds and only interrogated via indirect experiment, is relevant to this question. At present, the latter possibility seems more likely, favoring a vocal origin of language, but the study of human gesturing—both preverbal and co-speech—in the light of ape gestures may yet reveal incorporation, which would be more consistent with a gestural origin of language.