Breakthroughs in science often come from the use of simple systems to examine more complex, hard-to-study processes. Laughter is an ideal simple system for the investigation of vocal evolution; contrast the austere simplicity of “ha-ha” to the thousands of grammatically complex human languages and their countless dialects. For a scientist struggling to understand the nitty-gritty of neurobehavioral evolution, the majesty and complexity of speech and language are more a cause of concern than celebration. Laughter provides an attractive alternative to the use of animal models. With laughter, we can study one of our species’ own animal calls, using readily available participants who can self-report and follow directions.

Laughter is a socially potent vocalization that has been a topic of philosophical speculation since Plato and Aristotle (Morreall, 1983), but scientific analyses of what laughter is, when we do it, what it means, and how it evolved are mostly the products of recent decades (Provine, 2000). The relative youth of laughter science means that the frontiers are near and fundamental questions remain unanswered. Laughter insights are often accessible to the simple observation of ongoing behavior, an application of human ethology termed sidewalk neuroscience (Provine, 2000, 2016).

Before I engage evolutionary issues, the facts (and misperceptions) about laughter should be considered. Although we have lifelong experience as producers and consumers of laughter, we are hardly expert; empirical analyses provide many surprises. The research program described here follows the path of discovery wherever it leads. What began as a straightforward analysis of the structure, production, and control of laughter took unanticipated turns, leading to contrasts between laughing and talking, as well as to investigations of conscious control, gender issues, social context, and utility, and ultimately to the evolution of laughter and the discovery of why humans can speak and other apes cannot, the focus of this review.

Laughter structure

Acoustic analyses of human laughter collected in informal settings have identified the vocal quantum (“ha,” etc.) at the root of laughter. Laughter is composed of a series of short, voiced and unvoiced notes, each about 1/15 s long, that are repeated at regular intervals of about 1/5 s (giving it the typical form ha-ha; Provine & Yong, 1991). (A “voiced” note has a tonal quality; unvoiced notes are noisy blasts.) The deep structure of laughter is based on this 5-Hz vocal cadence. Each voiced laugh note has strong harmonic structure (each harmonic being a multiple of the lowest, fundamental frequency). Female laughter has a fundamental frequency that is about twice as high as that of a male (502 Hz vs. 276 Hz), a pattern consistent with sex differences in speech. The laugh notes carry the information necessary to identify a vocalization as laughter; if the notes are edited out of a recording of a bout of laughter and the gaps are closed, all that remains is a long, breathy sigh. Altering the duration of the internote intervals influences whether laughs are perceived as spontaneous, voluntary, human, or nonhuman (Bryant & Aktipis, 2014). Laughter has a stereotyped (but not fixed) acoustic and rhythmic structure, the central tendency around which variations occur in different social and emotional contexts. Such deep structure must exist or we could not reliably identify a vocalization as laughter—and laughter is certainly recognizable, as the only positive vocal emotional expression identified across culturally and geographically distinct groups (Sauter, Eisner, Ekman, & Scott, 2010). Other research has confirmed laughter’s harmonic and temporal structure and context specificity (Bachorowski & Owren, 2001; Bachorowski, Smoski, & Owren, 2001; Scott, 2013; Smoski & Bachorowski, 2003; Vettin & Todt, 2004).
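The timing and harmonic relationships described above can be made concrete with a little arithmetic. The following is a minimal sketch, not an analysis from the cited studies: it assumes that the 1/5-s interval is measured from note onset to note onset, treats the approximate values in the text as exact, and uses hypothetical function names chosen only for illustration.

```python
from fractions import Fraction

# Approximate values taken from the text; treated as exact for illustration.
NOTE_DURATION = Fraction(1, 15)   # seconds per laugh note
ONSET_INTERVAL = Fraction(1, 5)   # seconds between note onsets (assumed onset-to-onset)


def laugh_note_schedule(n_notes):
    """Return (onset, offset) times in seconds for an idealized laugh bout."""
    return [(i * ONSET_INTERVAL, i * ONSET_INTERVAL + NOTE_DURATION)
            for i in range(n_notes)]


def harmonics(fundamental_hz, count):
    """Return the first `count` harmonics, each an integer multiple of the fundamental."""
    return [fundamental_hz * k for k in range(1, count + 1)]


note_rate_hz = 1 / ONSET_INTERVAL             # notes per second (the vocal cadence)
silent_gap = ONSET_INTERVAL - NOTE_DURATION   # silence between consecutive notes
duty_cycle = NOTE_DURATION / ONSET_INTERVAL   # fraction of each cycle that is sounded

if __name__ == "__main__":
    for onset, offset in laugh_note_schedule(4):
        print(f"note from {float(onset):.3f} s to {float(offset):.3f} s")
    print("note rate (Hz):", float(note_rate_hz))
    print("silent gap (s):", float(silent_gap))
    print("male harmonics (Hz):", harmonics(276, 3))
    print("female harmonics (Hz):", harmonics(502, 3))
```

Under these assumptions, each note occupies one third of its cycle and the remaining two thirds is the internote interval, the portion that Bryant and Aktipis (2014) manipulated.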

The stereotypy of laughter is, in part, the result of neuromuscular constraints of our vocal apparatus; it is difficult to laugh in other than the normal way (e.g., haaa-haaa, ha—ha), and the resulting vocalizations sound abnormal (Provine, 2000, pp. 57–62). Using electromyography, Luschei, Ramig, Finnegan, Baker, and Smith (2006) described the pattern of laryngeal muscle activity and respiration during laughter; laryngeal adductors (e.g., thyroarytenoid, lateral cricoarytenoid) show bursts of activity at a rate of about 5 Hz, which corresponds to the rhythm of ha-ha. In the brain, the periaqueductal gray region coordinates the production of such emotional vocalizations in humans and other animals (Esposito, Demeurisse, Alberti, & Fabbro, 1999; Jürgens, 2009; Wild, Rodden, Grodd, & Ruch, 2003; Zhang, Davis, Bandler, & Carrive, 1994), in conjunction with its projection to the nucleus retroambiguus (Holstege & Subramanian, 2016).

Laughter is not speech

Laughing is not the act of speaking “ha ha,” a proposition that can be tested by a simple experiment. When asked to laugh, about half of people say that they cannot do it, or gamely provide a poor imitation. Give it a try. In a reaction-time study (Provine, 2012, pp. 217–220), when subjects were asked to laugh, the response latency (2.1 s) was over twice as long as that for saying “ha ha” (0.9 s), evidence that laughter is not speech and that different mechanisms of production are involved. Clinical case studies provide further evidence: Laughter and other nonverbal emotional vocalizations are preserved in patients with bilateral damage to speech motor areas of the brain, who are otherwise unable to speak or vocalize voluntarily (Simonyan & Horwitz, 2011). Gervais and Wilson (2005) distinguished between spontaneous, emotion-driven (Duchenne) laughter and voluntarily controlled (non-Duchenne) laughter in their review and synthesis of laughter and humor evolution. Human conversation includes both types of laughter (Ruch & Ekman, 2001), but nonhuman primates may be restricted to the involuntary type. Although some individuals can produce convincing voluntary laughter (Bryant & Aktipis, 2014), laughing and speaking involve different neurological mechanisms (McGettigan et al., 2015), with laughter being under weaker voluntary control than speaking. This fact is critical for understanding laughter in daily life, where people commonly provide and act upon fictive post-hoc rationalizations about why they laugh, presuming that it is a choice, perhaps a response to “nervousness” or something “funny” (Provine, 2000). We humans are reluctant to acknowledge that we are sometimes unthinking herd animals, driven by unconscious instincts, acting out our species’ ancient biological script (Provine, 2012). During conversation, most laughter is produced spontaneously (not intentionally spoken) and is often a contagious response to hearing other people laugh (Provine, 1992). Whether in a small conversational dyad or large theatre audience, laughter stimulates laughter.

The low level of voluntary control does not preclude high levels of lawfulness and predictability of when and how we laugh.

Laughter punctuates speech

What determines where you laugh in the speech stream? Although conversation is filled with laughter, the laughs do not occur randomly. The placement of laughter in speech is akin to punctuating written text and is termed the punctuation effect (Provine, 1993). A speaker’s laughter usually occurs before and after complete statements and questions, and seldom interrupts phrase structure. For example, a speaker may announce “I’ve got to go now ha-ha,” but rarely “I’ve got to ha-ha go now.” When laughter does interrupt phrases, it usually involves “laugh-speak,” a hybrid form of laughing speech that is often an attempt to temper embarrassing circumstances and is probably under more voluntary control than most laughter. The study of punctuation does not consider laugh-speak, but focuses only on classic, discrete ha-ha-type laughs. Most of us have experienced uncontrollable fits of laughter that make speech impossible and are hard to stop, but such episodes are rare.

Punctuation is a significant neurolinguistic phenomenon that demonstrates the dominance of speech over laughter and the priority access of speech to the shared vocal apparatus. The vocal laughter of the deaf punctuates American Sign Language, a form of manual linguistic expression that does not compete with laughter for the vocal tract (Provine & Emmorey, 2006). Neither do emoticons disrupt phrases in online text messages, a nonvocal linguistic medium (Provine, Spencer, & Mandell, 2007). The presence of punctuation in these very different linguistic contexts indicates that the punctuation of speech by laughter involves a higher-level neurolinguistic mechanism, not simply a lower-level gate-keeping process that regulates access to the shared organ of vocalization. The details of this mechanism are unclear, but a general principle emerges: The recently evolved vocal act of speech overrides automatic, more phylogenetically primitive utterances. There may be top-down inhibition of midbrain and brainstem vocal centers during the act of speech.

If you are skeptical of laughter’s punctuation effect, look for it in your own conversations, and consider a wider range of punctuation phenomena. When, for example, do you breathe during conversation? Is the placement of a breath determined automatically, or via conscious choice? Consistent with the finding for laughter, breaths reliably punctuate the phrase structure of speech (McFarland, 2001), again demonstrating linguistic dominance over a more primal act.

Social and gender contexts of laughter

The essential stimulus for laughter is not a joke or other formal attempt at humor, but another person. The sociality of laughter is remarkable. Laughter diaries of college students have revealed that laughter is 30 times more frequent in social than in solitary situations (Provine & Fischer, 1989). (The media of television, radio, text, or simply responding to imaginary people in your head are sources of vicarious social stimulation.) When one is completely alone—no media, no people or thoughts about people—laughter almost totally disappears. Likewise, chimpanzees laugh more often during social than during solitary play (Davila-Ross, Allcock, Thomas, & Bard, 2011; Matsusaka, 2004). In humans, laughter promotes bonding when we “laugh with” friends, but “laughing at” outliers and adversaries (e.g., jeering, ridicule) can be aversive and an attempt to shape their behavior (Provine, 2000). Dezecache and Dunbar (2012) suggested that laughter permits “grooming at a distance,” increasing the number of “groomed” individuals and opportunities for bonding from one to three or more.

The social context of laughter was explored further in a second study that involved surreptitiously observing 1,200 instances of spontaneous laughter of human dyads in public places (Provine, 1993). For each episode of laughter, the following data were recorded: the gender of the speaker (the person speaking immediately before laughter occurred) and of the audience (the person listening to the speaker), whether the speaker and/or the audience laughed, and what the speaker had said immediately before the laughter. These simple methods provided surprising results—speakers laughed an average of 46 % more than their audience, and only 10 %–15 % of prelaugh comments were judged as being remotely humorous. Banal comments like “I’ve got to go now” or “There’s Andre” were typical. Such vapid prelaugh fare is not simply an artifact of “having to be there” to grasp its true nature. Humor, itself a fascinating but rather different topic (Hurley, Dennett, & Adams, 2011; Martin, 2007), is a recently evolved cognitive and linguistic stimulus that is not responsible for most laughter (Provine, 1993). On several levels, standup comedy, in which a nonlaughing comedian tells jokes to a laughing audience, fails as a model for everyday, conversational laughter.

There are strong gender differences in laughter patterns. In the above analysis of 1,200 instances of laughter (Provine, 1993), both sexes laughed a lot, but in cross-gender conversations, females laughed 126 % more than their male counterparts, meaning that women were the leading laughers, whereas males were the leading laugh getters. The greater success of males at stimulating laughter is not a matter of males consciously choosing not to laugh at females, because both males and females laugh more at male than at female speakers. Likewise, females do not choose to laugh more to male than to female speakers. Consistent with a common theme of laughter science, both males and females are acting out a primal script that is under minimal voluntary control.
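Statements of the form “X % more” are easy to misread, so it may help to restate the two figures above (46 % and 126 %) as ratios and shares. The following is a minimal sketch of that arithmetic, not a reanalysis of the data: it assumes that each percentage can be treated as a single aggregate ratio between the two parties, and the function names are hypothetical.

```python
from fractions import Fraction


def ratio_from_percent_more(percent_more):
    """Convert 'A laughs X % more than B' into the ratio A / B."""
    return Fraction(100 + percent_more, 100)


def share_of_total(ratio):
    """Return A's share of all laughs by A and B, given the ratio A / B."""
    return ratio / (1 + ratio)


# Figures from the text; treating them as simple aggregate ratios is an assumption.
speaker_ratio = ratio_from_percent_more(46)    # speakers relative to audience
female_ratio = ratio_from_percent_more(126)    # females relative to males, cross-gender dyads

if __name__ == "__main__":
    print("speaker / audience ratio:", float(speaker_ratio))
    print("speaker share of laughs:", round(float(share_of_total(speaker_ratio)), 3))
    print("female / male ratio:", float(female_ratio))
    print("female share of laughs:", round(float(share_of_total(female_ratio)), 3))
```

Read this way, “126 % more” means that females laughed a little over twice as often as their male partners, which under the stated assumption corresponds to roughly two thirds of the laughs in cross-gender conversations.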

Gender differences in laugh patterns suggest that laughter may be a factor in courtship. Among young German adults, Grammer and Eibl-Eibesfeldt (1990) observed that the more a woman laughed during an encounter, the greater was her self-reported interest in her male conversational partner. Conversely, the men were most interested in women who laughed in their presence. The laughter of the female, not the male, is most predictive of a relationship. Given that laughter, like vocal crying, is under weak voluntary control (Provine, 2012), it is an honest signal that reveals our true feelings about our conversational partners. But not all laughter is equally effective. Voiced laughs make a more positive impression than unvoiced pants or grunts (Bachorowski & Owren, 2001), and variability in pitch and rhythm is associated with positive ratings (Kipper & Todt, 2001, 2003). Although laughs are relatively simple in structure, the laughs of individuals and groups provide a lot of information. For example, short audio clips of joint laughter (colaughter) provided sufficient information for members of 24 diverse societies and language groups to distinguish whether the pairs of laughing people were friends or strangers (Bryant et al., 2016).

Gender differences in laughter and humor extend to the psychosexual marketplace of the print medium. An analysis of 3,745 personal ads (“in search of”) from eight major US newspapers showed that women were more likely than men to seek partners with a “good sense of humor” or its equivalent, whereas men were likely to offer it (Provine, 2000, pp. 32–35). Why are humorous men attractive? Their witty banter is a likely indicator of fitness that predicts intelligence and mating success (Greengross & Miller, 2011).

Primate laughter, bipedalism, and speech evolution

Laughter is not unique to humans (Provine, 1996, 2000, 2004). Acoustically different but functionally similar play vocalizations are produced by other mammals, including rats (Panksepp, 2007; Panksepp & Burgdorf, 2003). Since Darwin (1872/1965), it has been known that chimpanzees and other great apes produce a laugh-like sound when tickled or during play (Davila-Ross, Owren, & Zimmermann, 2009; Matsusaka, 2004; Provine, 2000; Vettin & Todt, 2005). However, with only auditory cues, the sounds of chimpanzee (Pan troglodytes) and human laughter are so different that naïve human observers were unable to identify the chimpanzee vocalization as laughter (Provine, 2000, pp. 77–79), most guessing that they were listening to panting, often of a dog, or other biological or nonbiological (e.g., sawing, sanding) sources. In contrast, everyone recognized human laughter. It is easy to understand primatologists’ references to these non-human-like ape vocalizations as laughs and chuckles, given the context of their occurrence (e.g., tickling) and their being accompanied by a cheerful-looking facial expression, variously termed the “play face,” “open-mouth display,” or “laugh face.” Van Hooff (1972) pioneered the study of chimpanzee laughter, especially regarding the shared ancestry of human laughing and smiling, but he did not attend to the essential vocal and respiratory processes considered here or the details of facial anatomy and movement (Davila-Ross, Jesus, Osborne, & Bard, 2015).

Laughter is primate onomatopoeia and may be the clearest example of the phylogeny of a human vocalization, with frolicking chimpanzees providing the essential clue. Ancestral ape laughter mimics (ritualizes) the labored breathing of vigorous play, but signals playful intent, not physical exertion. Thus, laughter is the sound of play, whether in the form of the primal chimpanzee “pant-pant” or the derivative human “ha-ha” (Provine, 1996, 2000), and its message is “This is play, I’m not attacking you.” If humans laughed like chimpanzees, this relationship would have been discovered long ago.

The comparative analysis of chimpanzee and human laughter has also provided an unanticipated discovery: why we can talk and other apes cannot. As noted above, laughing humans parse an outward breath into a series of short (1/15 s), voiced and unvoiced blasts (“ha-ha,” etc.) that repeat about every 1/5 s (Provine & Yong, 1991). In contrast, laughing chimpanzees do not, and probably cannot, parse their exhalations into neatly defined notes, producing only one, unvoiced laugh sound per outward and inward breath (Provine, 1996), as in panting. In subsequent work, Davila-Ross, Owren, and Zimmermann (2009) confirmed these general properties of human and chimpanzee laugh production, adding details about orangutans, gorillas, and bonobos. The latter great apes have a bit more vocal flexibility than the chimpanzee, sometimes producing both alternating outward–inward and outward laughter, but their laughter is of the noisy, unvoiced chimpanzee type. Bonobo laughter sometimes shows a hint of voicing.

Remarkably, nonhuman primates can learn to recognize hundreds of words (Savage-Rumbaugh et al., 1993) but can hardly learn to produce any (Hayes, 1951). Laughter analyses have revealed the reason for this discrepancy between vocal perception and production in nonhuman primates (Provine, 1996, 2000): The vocal facility of chimpanzees and other great apes is limited to only one or a few syllables per breath, because it is captive to an inflexible respiratory system that is closely tied to locomotion. Although chimpanzees can walk short distances on their hind legs, they share the vocal constraints typical of other four-legged animals, whose breathing and running are closely synchronized (one stride per breath) to brace the thorax for forelimb impacts (Bramble & Carrier, 1983). The principle is the same as with breath-holding when you are lifting a heavy weight. Without breath-holding, the thorax is a weak, air-filled bag. The evolution of bipedalism freed the thorax of its support function, incidentally conferring flexibility in the coordination of breathing, running, and vocalizing, which were events necessary for the emergence of human speech and laughter. This is the basis of the bipedal theory of speech evolution (Provine, 2000).

Further evidence for the bipedal theory has come from studies of respiration during running. A bipedal human runner, for example, may employ a variety of strides per breath: the ratio can be 4:1, 3:1, 5:2, 2:1, 3:2, or 1:1, with 2:1 being the most common (Bramble & Carrier, 1983). Freedom from the rigid 1:1 neurologically based link between stride and breath that is characteristic of quadrupeds provided a critical preadaptation; it enabled our ancestors to evolve a vocal system in which utterances are no longer tied to single breaths, permitting the subsequent selection for speech and our species’ characteristic ha-ha laugh. The neurological transformation necessary for human-type laughter and speech occurred after the evolution of bipedality. Simply standing a quadruped upright does not confer vocal competence, as is shown by chimpanzees’ inability to speak.
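The respiratory flexibility implied by these stride-to-breath ratios can be illustrated numerically. The following is a minimal sketch under stated assumptions: the running cadence of 180 strides per minute is a hypothetical value chosen only for illustration, the function name is my own, and the ratios are those listed above.

```python
from fractions import Fraction

# Stride:breath ratios reported for human runners, as listed in the text.
HUMAN_RATIOS = [(4, 1), (3, 1), (5, 2), (2, 1), (3, 2), (1, 1)]
# Quadrupeds are described as locked to one stride per breath.
QUADRUPED_RATIO = (1, 1)


def breaths_per_minute(strides_per_minute, ratio):
    """Breathing rate implied by a stride:breath ratio at a given cadence."""
    strides, breaths = ratio
    return Fraction(strides_per_minute) * breaths / strides


if __name__ == "__main__":
    cadence = 180  # strides per minute; hypothetical value, not taken from the source
    for ratio in HUMAN_RATIOS:
        rate = float(breaths_per_minute(cadence, ratio))
        print(f"{ratio[0]}:{ratio[1]} -> {rate:.0f} breaths per minute")
    locked = float(breaths_per_minute(cadence, QUADRUPED_RATIO))
    print(f"locked 1:1 coupling -> {locked:.0f} breaths per minute")
```

At a fixed cadence, the human ratios span a fourfold range of breathing rates, whereas a strict 1:1 coupling permits only one. That decoupling is the kind of flexibility that the bipedal theory treats as a preadaptation for speech.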

Routes to vocal facility in mammals are not exclusively associated with bipedality (Fitch, 2005). Cetaceans (dolphins and whales) and seals, descendants of terrestrial quadrupeds, have impressive vocal competence (Janik & Slater, 1997), having escaped the constraints of forelimb support of the thorax during locomotion via their buoyancy in an aquatic medium (Provine, 2000). Birds, especially songbirds and parrots, are bipeds notable for their vocal virtuosity, but contrasting them with mammals is complicated by their very different evolutionary trajectory, locomotor apparatus, and vocal mechanism (Fitch, 2005).

Traditionally, vocal evolution research is a neck-up affair that has focused on laryngeal and brain mechanisms and neglected the neck-down processes essential for respiratory control and sound production. Although much attention has been given to the special properties of speech and language, speech is ultimately a movement that produces a sound, and is generated by mechanisms similar to those involved in such less exalted acts as running, flying, and breathing. Research on laughter evolution has redressed this imbalance by targeting sound-producing movements and the mechanisms that control them.

Evolutionary tactics: the critical-change approach

The laughter research described here is part of a broader program that investigates neurobehavioral evolution via simple, instinctive acts, including yawning, sneezing, coughing, itching/scratching, tickling, nausea/vomiting, hiccupping, vocal crying, emotional tearing, reddening of the eyes, belching, farting, and the prenatal processes responsible for the genesis of all behavior (Provine, 2012). The program previously involved the development and evolution of flight and flightlessness in wild and domestic birds (Provine, 1984). The presumption is that, as with laughter, simple behaviors provide an opportunity to move beyond speculation about adaptive processes and discover the actual mechanisms of behavioral change.

A promising research tactic is to target recently evolved, uniquely human behaviors and compare them with their primate antecedents, seeking the neurological, glandular, or muscular source of the transformation, which I term the critical-change approach. In the case of laughter, the critical transformation involved breath control that permitted the change from panting (chimpanzee-like) laughter to the parsed exhalation of the human ha-ha, and ultimately the more virtuosic exhalation of human speech (Provine, 1996, 2000). Although the details are still being worked out, other critical changes may include changes to our spinal cord structure, given that the main difference between the spinal cords of humans and other primates is that humans have relatively more spinal cord mass in the thoracic segments, the region most involved in respiratory control and the maintenance of upright posture (MacLarnon, 1995; MacLarnon & Hewitt, 1999). Also, the greater human cord mass exclusively involves gray matter (not white), the tissue composed of cell bodies, pattern-generating circuits, and the motor neurons that send axons to the muscles. Changes may also involve the previously considered brain mechanisms for vocalization.

Emotional tearing and scleral whiteness provide informative applications of the critical-change approach in a nonvocal domain. Only humans have the neurological connection between brain and gland that permits tears of sadness, a breakthrough in emotional communication (Provine, Krosnowski, & Brocato, 2009). The apparently crude tuning of the tearing mechanism, such that tears are produced during laughing, coughing, sneezing, vomiting, and yawning, as well as to express sadness, may reflect an evolutionary process in midflight, before the details are sorted out. The uniquely human white sclera provides the ground necessary to perceive the redness caused by increased blood flow in the overlying conjunctiva, a source of information about emotion, disease, health, and age (Provine, Cabrera, & Nave-Blodgett, 2013; Provine, Nave-Blodgett, & Cabrera, 2013). (Nonhuman primates have dark sclera.) White sclera also enhance the social cue of directional gaze (Tomasello, Hare, Lehmann, & Call, 2007). Incidentally, the visual cues of both emotional tearing and scleral whiteness are enhanced by the increased face-to-face contact provided by bipedal posture.

The consequences of bipedality are broad, profound, and still being discovered. Many benefits of bipedality have been proposed (efficient locomotion, predator detection, genital display, cooling, and freeing the hands to carry infants, throw, manipulate objects, make and use tools, gesture, and forage), and now its role in facilitating speech evolution, the central theme of this review. It is fitting that bipedality, a definitive trait of hominins, is also a necessary condition for the selection of speech, a characteristic trait of Homo sapiens.