It seems a truth (while perhaps not universally acknowledged, at least widely shared) that men are funnier than women (see, e.g., Lewis, 2000). Such a view has been expressed by men and women, and often in conjunction with firm assertions that men’s humor advantage, if such it be, is not part of any general intellectual superiority (Greer, 2009; Hitchens, 2007). Presuming a reliable gender difference in humor production, various theories have been offered, including suggestions that humor, like the head butting of elk, is done to impress potential mates (Bressler, Martin, & Balshine, 2006). Consistent with such a notion, females indicate a preference for mates who make them laugh, whereas males prefer a mate who laughs at their humor (Li, Griskevicius, Durante, Jonason, Pasisz, & Aumer, 2009). There is also evidence that both genders comply, with women laughing more, and men making people laugh more (Provine, 2000, p. 27; but see Kothoff, 2006). However, this evidence does not require that men actually be more capable of being funny, but could be due to some combination of emotional responsivity, differential effort, and pity. There are no direct tests of assertions about gender differences in the ability to be funny.

In this article, we explore explanations for the impression that men are funnier than women. It could be that the stereotype exists because it is true, and people have correctly observed the world. The impression could also exist without the stereotype’s being true, if people’s view of the world is systematically biased. In two studies, we explore these two possibilities. In the first, male and female participants wrote, or at least tried to write, funny captions to accompany cartoon images, and raters, also male and female, evaluated their success. The second study examined whether, in the context of a memory experiment, people would be more likely to recall funny things as having been produced by men. Previous work has shown that stereotypes, about occupations for example, can influence memory (e.g., Marsh, Cook, & Hicks, 2006; Mather, Johnson, & De Leonardis, 1999). If people credit men with funniness because they expect men to be funny, this bias could not explain the origin of the stereotype, but it could explain the stereotype’s perpetuation.

Study 1

Phase 1: Humor production


A total of 32 undergraduates (16 male, 16 female) from the University of California, San Diego (UCSD), participated for course credit.


Twenty cartoons from The New Yorker’s caption contest were compiled, together with a questionnaire that asked about gender as well as other demographic questions.


After viewing two sample cartoons from The New Yorker with captions, participants wrote a caption for each of the 20 cartoons in their packet. The participants were encouraged to be as funny as possible and were told that others would be rating their captions. They worked alone in a quiet room and were given 45 min to produce the 20 captions. Afterward, they completed a questionnaire in which they were asked how funny they thought that others would find their captions, and also whether they thought men or women were funnier.

Phase 2: Humor rating


A group of 81 UCSD undergraduates (34 male, 47 female) received class credit for participating.


The 20 cartoon images from The New Yorker with new captions written by the 32 participants in Phase 1 were used. The stimuli were presented on a monitor, and participants responded with keyboard presses. These participants also indicated at the end whether they thought that men or women were funnier.


Participants rated the funniness of the captions using a five-round knockout tournament-style system. In a given round of a tournament, one cartoon image was displayed with 2 corresponding captions, as shown in Fig. 1. At their own pace, raters chose the funnier of the 2 captions with a keypress. Next, 2 new captions appeared. This process was repeated for all 32 captions corresponding to each cartoon. The 16 captions chosen as winners in the first round were randomly pitted against each other in the next round. By the end, participants had made 620 choices. How many rounds a writer’s caption survived before being knocked out determined the humorousness score that the writer earned during that tournament from that rater, with 1 point for each round their caption won; thus, a writer could earn from 0 to 5 points per tournament. This approach may be able to provide a more sensitive measure of relative funniness than would Likert-type ratings, especially since many captions were crowded at the not-funny-at-all end of the continuum.
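The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the study: the function name `run_tournament`, the `choose_funnier` callback (a stand-in for the rater's keypress), and the use of integers as caption identifiers are all hypothetical.

```python
import random


def run_tournament(captions, choose_funnier, rng):
    """Single-elimination tournament over a power-of-two number of captions.

    Returns (points, n_choices), where points[c] is the number of rounds
    caption c won and n_choices is the number of pairwise choices made.
    """
    if len(captions) < 2 or len(captions) & (len(captions) - 1):
        raise ValueError("number of captions must be a power of two")
    points = {c: 0 for c in captions}
    remaining = list(captions)
    n_choices = 0
    while len(remaining) > 1:
        rng.shuffle(remaining)  # winners are randomly pitted against each other
        winners = []
        for i in range(0, len(remaining), 2):
            winner = choose_funnier(remaining[i], remaining[i + 1])
            points[winner] += 1  # 1 point per round won
            winners.append(winner)
            n_choices += 1
        remaining = winners
    return points, n_choices


# Hypothetical rater who always prefers the caption with the larger id.
rng = random.Random(0)
points, n_choices = run_tournament(list(range(32)), max, rng)
print(n_choices)       # 31 choices per tournament (16 + 8 + 4 + 2 + 1)
print(n_choices * 20)  # 620 choices over the 20 cartoons
```

With 32 captions the tournament has five rounds, so the overall winner earns 5 points and the 16 first-round losers earn 0, matching the 0 to 5 range given in the text.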

Fig. 1

An example of our tournament-style rating system, in which two captions were pitted against one another and participants chose the funnier of the two. (In this case, the top caption was the overall winner, and the bottom one was most often eliminated in the first round.)

For each tournament, a preference score was computed by subtracting the average points the rater allocated to female writers from the average points allocated to male writers. An average over the 20 tournaments of these preference scores was calculated for each rater, providing a measure of the degree to which that rater preferred captions produced by male writers across the entire experiment. A score of 0 for any particular rater thus indicates an absence of average preference for either male- or female-produced humor, and any positive deviation from 0 indicates a relative preference for humor produced by males.
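As a rough illustration of this computation, the sketch below derives one rater's preference score from per-tournament points. The data layout (a dict of points per writer and a dict mapping writers to "M" or "F") and the function names are assumptions made for the example, and the four-writer data are invented purely to keep the arithmetic visible.

```python
from statistics import mean


def tournament_preference(points, writer_gender):
    """Mean points given to male writers minus mean points given to female writers."""
    male = [p for w, p in points.items() if writer_gender[w] == "M"]
    female = [p for w, p in points.items() if writer_gender[w] == "F"]
    return mean(male) - mean(female)


def rater_preference(tournaments, writer_gender):
    """Average of the per-tournament preference scores for a single rater."""
    return mean(tournament_preference(t, writer_gender) for t in tournaments)


# Toy data: four hypothetical writers, two tournaments rated by one rater.
writer_gender = {"w1": "M", "w2": "M", "w3": "F", "w4": "F"}
tournaments = [
    {"w1": 2, "w2": 0, "w3": 1, "w4": 0},  # male mean 1.0, female mean 0.5
    {"w1": 0, "w2": 1, "w3": 2, "w4": 0},  # male mean 0.5, female mean 1.0
]
score = rater_preference(tournaments, writer_gender)
print(score)  # 0.0: no average preference for either gender
```

A positive value would indicate a relative preference for male-written captions, and a negative value a preference for female-written captions.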



The participants were in broad agreement with the stereotype that men are funnier, with (aggregating across writers and raters) 89% of the women so indicating, and 94% of the men (1 female and 2 males did not respond to the question). This rate of endorsing the stereotype is not significantly different by gender, χ²(1, N = 110) = 0.58, p > .45. The caption writers also indicated that this would apply to them, with the male writers predicting an average funniness for their own creations of 2.3 (on a 1–5 scale), while the female writers predicted a significantly more modest 1.5, t(30) = 3.50, p = .001.

Humor ratings

To assess the degree to which raters exhibited a preference for captions written by males or females, a one-way analysis of variance was computed on the preference scores calculated for each rater, with rater gender as a between-subjects factor. Because an average preference score of 0 indicated no average rater preference for male or female writers, the relevant analysis was the test of the model intercept. Raters on average did show a preference for male writers, t(79) = 7.85, p < .001, allocating 0.11 more points to them (SD = 0.46, d = 0.24). Both female and male raters showed this preference for male writers, although the preference was significantly greater among male raters, t(79) = 3.63, p < .001. Female raters allocated male writers on average 0.06 more points [t(46) = 3.22, p < .01, d = 0.13]; male raters allocated them on average 0.16 more points [t(33) = 7.66, p < .001, d = 0.35].

Content analysis

Given the differences, albeit slight, in the success of male and female writers, and the differences, again slight, in their success with male and female raters, we performed a rough investigation of the content of the cartoon captions. Two research assistants independently flagged each of the 640 captions for the presence of each of 25 different categories of content, drawn from theories of humor (e.g., puns or self-deprecation, Long & Graesser, 1988; benign violations, McGraw & Warren, 2010). With the data from these two coders, aggregate measures of category usage were calculated for male and female authors. For only two categories did gender differences appear: Male authors used more sexual humor and profanity. The base usage of profanity and sexual humor was low, so these categories were combined to calculate for each of the 32 caption authors a percentage of their 20 captions that used either. Male authors used these categories of humor in more of their captions [t(30) = 2.17, p = .038; males = 4.30%, females = 1.95%]. This modest difference in usage, however, does not appear to explain the humor advantage for males, and in no follow-up analysis was there any humor advantage for such captions, nor were they preferred by male raters.


These results are consistent with the widespread belief that men are funnier than women: Both male and female raters judged captions written by men to be funnier. Males demonstrated an even stronger preference for captions written by males, indicating a uniquely strong appreciation of male humor by male raters. While males did, in a way that might surprise few, produce more profanity and sexual content, this was not the basis of their slightly greater humor success, nor of their slightly greater appeal to men.

Study 2

This experiment was designed to test the idea that memory is affected by the belief that males are funnier than females. Stereotypes can impact source attributions (e.g., Bayen, Nakamura, Dupuis, & Yang, 2000; Hicks & Cockman, 2003; Marsh et al., 2006), and here we explored whether people would tend to recognize funny things as having been produced by men rather than women. Such errors could enable people to maintain the view that men are funnier, even in a world where they actually are not.



A group of 72 UCSD undergraduates (36 male, 36 female) participated for class credit.


The study items comprised the 20 cartoons from the first study, coupled with 100 captions selected from those generated by the Study 1 participants. These were the 50 captions that the raters had rated as most humorous (25 written by females, 25 by males) and the 50 that had been rated as least humorous (again 25 written by females, 25 by males). The cartoons were presented on a monitor. Each image was presented multiple times, but captions were only presented once. Below the image and caption appeared the caption writer’s gender. In total, there were 80 targets (shown during presentation and test) and 20 lures (shown only during test) that were balanced for humorousness and the writers’ gender.


Participants were told that the same cartoon images would appear more than once and with different captions. Participants were instructed to remember the captions and the writers’ gender for a memory test. The 80 targets appeared individually for 12 s apiece, in random order. Following a 2-min distractor task comprising simple math problems, participants took the 100-item memory test (80 targets and 20 lures).

In the memory test, participants first made an old/new decision, indicating whether they believed that a given caption had been displayed before. Participants pressed the “o” key to indicate “old” (the correct response for targets) or the “n” key to indicate “new” (the correct response for lures). This allowed for an examination of whether funny captions were remembered better. Next, regardless of the old/new response, participants made a source (writer’s gender) decision, by pressing “f” for female or “m” for male. This allowed for an examination of whether funny captions were remembered as having been produced by men.


Old/new and source accuracy

Because the female and male participants did not differ significantly in their accuracy scores, the data were combined in order to analyze overall old/new and source accuracy. The proportion correct (88%) on the old/new decision was significantly above chance, t(71) = 38.04, p < .001. The average d' (a measure of sensitivity, which is zero at chance; Macmillan & Creelman, 2005) = 2.98, SD = 0.71. The proportion correct (65%) on the source decision was also significantly above chance, t(71) = 13.42, p < .001. The average d' for source decisions was 0.45, SD = 0.38.
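For readers unfamiliar with d', the sketch below shows the standard equal-variance signal detection formula, d' = z(hit rate) − z(false-alarm rate), for an old/new decision. The function name and the example rates are hypothetical; the text does not report per-participant hit and false-alarm rates, how rates of exactly 0 or 1 were handled, or how the source d' was derived, so none of that is reproduced here.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Equal-variance sensitivity index: z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    need a correction that is not specified in the text.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rater: calls 90% of targets "old" and 10% of lures "old".
example = d_prime(0.90, 0.10)
print(round(example, 2))  # 2.56

# Equal hit and false-alarm rates give d' = 0, i.e., chance performance.
chance = d_prime(0.50, 0.50)
```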

Humor effect

For both old/new judgments and source memory, there was a significant effect of humor (as measured by d'), with the funny captions (M = 2.95, SD = 0.73) remembered better than the nonfunny captions (M = 2.63, SD = 0.65), t(71) = 4.63, p < .001, and the sources of the funny captions (M = .57, SD = .54) remembered better than those of the nonfunny captions (M = .34, SD = .51), t(71) = 2.85, p = .006.

Effect of humor on source memory

For both funny and nonfunny target captions, the proportions of items for which the correct gender source was identified were calculated. This provides conditional source identification measure (CSIM) scores and is the standard way to assess response bias (Murnane & Bayen, 1996). For example, for the funny captions, the number correctly identified as male or female was divided by the actual number of funny captions that had been written by males or females. To illustrate further, say that one participant correctly identified 18 of the 20 funny caption targets that were written by females. This participant attributed 12 of those 18 to a female writer. Thus, this participant’s CSIM score for funny targets written by females would be .67. This was repeated for each participant for each humor type and gender of the caption writer. A three-way ANOVA with participants as random effects was computed using these CSIM scores, with writer gender, humor type, and participant gender as factors. As is shown in Fig. 2, participants preferentially attributed the funny captions, but not the unfunny ones, to males. The interaction was significant, F(1, 210) = 9.29, p = .003. Follow-up t tests indicated that the nonfunny captions were more often attributed to females (M = .68, SD = .14) than to males (M = .63, SD = .17), t(210) = 2.44, p = .016, and that there was a trend in attributing the funny captions to males (M = .70, SD = .14) as compared to females (M = .66, SD = .14), t(210) = 1.87, p = .063. The degrees of memory distortion were not significantly different for the male and female raters, F(1, 210) = 2.36, p = .126.
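The worked example in the paragraph above (12 correct source attributions out of 18 recognized targets) can be reproduced with a minimal sketch. The function name `csim` and its argument names are hypothetical, and the sketch covers only the single-cell arithmetic, not the ANOVA.

```python
def csim(n_recognized, n_correct_source):
    """Conditional source identification measure for one participant and one cell.

    n_recognized: targets in the cell correctly called "old"
    n_correct_source: how many of those were attributed to the correct gender
    """
    if n_recognized <= 0:
        raise ValueError("CSIM is undefined when no targets were recognized")
    if not 0 <= n_correct_source <= n_recognized:
        raise ValueError("correct attributions cannot exceed recognized targets")
    return n_correct_source / n_recognized


# The example from the text: funny targets written by females.
score = csim(n_recognized=18, n_correct_source=12)
print(round(score, 2))  # 0.67
```

In the study this calculation would be repeated for each participant in each of the four cells (funny or nonfunny captions, male or female writers).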

Fig. 2

Conditional source identification measure (CSIM) scores by humor type (with standard errors). Higher CSIM scores reflect greater attributions of authorship.

The same source attribution analysis was conducted on the 20 new items. For both funny and nonfunny lure captions, the proportions of items for which each gender source was identified were calculated. The interaction of humor type and writer gender was significant, F(1, 210) = 4.10, p = .044. Follow-up t tests indicated no difference in the attribution of nonfunny captions to females (M = .49, SD = .28) as compared to males (M = .50, SD = .26), t(210) = 0.13, p = .899, but did show that funny captions were more often attributed to males (M = .61, SD = .26) than to females (M = .47, SD = .26), t(210) = 2.99, p = .003.


Humorous items often afford a memorial benefit. Our findings replicate the humor effect (Kaplan & Pascoe, 1977; Schmidt, 1994, 2002; Schmidt & Williams, 2001; Takahashi & Inoue, 2009), with the funny captions not only remembered better, but also with the gender of their authors remembered better. The analyses also provide evidence for a humor-based retrieval bias; individuals of both genders tend to misattribute humorous captions to male writers. This was true both for misremembering captions whose author gender the participants had seen and for attributions of new captions whose author gender the participants were only guessing. This finding is consistent with previous research on the stereotypes that influence source memory decisions (e.g., Hicks & Cockman, 2003).


We explored two possible components underlying the stereotype that males are funnier, and both received some empirical support. Men, at least under the conditions and constraints of the present experimental situation, were funnier. In addition, funny captions were preferentially attributed to male authors, a bias that was present in both men and women.

The data are not entirely consistent with a view of male humor being favored evolutionarily as impressing women (Miller, 2000), because male humor (perhaps like the male sports car) appeals most especially to other men. Nor are the data consistent with humor being gender specific, because we found that, while men prefer male humor, women (albeit very slightly) also prefer it. Our data did reveal slight differences in the content of the humor produced by men and women, with men inclined slightly more to sex and profanity, but the factors that made their captions slightly more successful, and especially so to other men, were elusive.

The data do suggest that men’s view that men are funnier could be a result of their actually finding the humor they produce funnier, as well as of their biased recall of funny things as having sprung from men’s minds. Women, who were perhaps slightly less strong in their conviction that men are funnier, also showed less of an effect of actually finding them funnier, though women did show equally biased recall. Women’s laughing more at men (Provine, 2000), when the gender is known, may be largely due not to superior humor, but to more subtle social influences, which are known to impact laughter (e.g., social status; Coser, 1960; Robinson & Smith-Lovin, 2001).

Other factors will likely contribute to the impression of male funniness. In our first study, everyone was asked to be funny, and it could be that men would spontaneously regard more occasions as appropriate for humor—because they feel it is expected of them, in order to impress women, or because they are less cautious about hurt feelings. It could also be that our caption contest required a style of humor at which women are relatively adept, and that other domains would produce different results. As an argument against these points, however, we did find a dramatic difference between the male and female caption authors in their predictions of success. Male confidence, in this domain at least, does seem to outstrip male competence.