Introduction

The way in which humor is produced and understood reflects the cognitive changes from childhood to adulthood and late adulthood (Martin 2007). Throughout the lifespan, humor plays an important role in social communication, as in developing relations and shortening social distance (Graham et al. 1992) and in the enjoyment and confidence within social interactions (Nezlek and Derks 2001). Despite evidence of the lifelong importance of humor and of the changes it undergoes, very little is known about humor understanding in old age (Greengross 2013) and about the relationship between humor and the cognitive skills supporting social communication. The present study aims at paving the way towards a lifespan consideration of humor as part of social-communicative competence. Specifically, we investigated the ability to comprehend humor in older adults and the relationships of humor understanding with pragmatics and theory of mind (ToM), two of the core cognitive skills underpinning social communication (Hyter 2017).

In order to test humor comprehension, researchers have used visual and verbal tasks. A common paradigm requires participants to complete a given story (either visual or verbal) with the punchline, i.e., the ending that elicits the humorous interpretation, by selecting one of a number of alternatives. The alternatives generally include, together with the funny (correct) ending, a straightforward and an unrelated (often illogical) ending (Brownell et al. 1983; Shammi and Stuss 1999, 2003). Studies often also measure humor appreciation, i.e., the emotional responses to humor (Ruch and Hehl 1998), by means of funniness ratings on a Likert scale (Shammi and Stuss 2003; Uekermann et al. 2006).

When applied in cross-sectional studies comparing extreme age groups, i.e., younger vs. older adults, these tasks revealed a general effect of age on humor comprehension abilities. Older adults tend to make more errors than younger adults in selecting jokes’ funny endings, both in visual and in verbal tasks (Mak and Carpenter 2007; Shammi and Stuss 2003; Uekermann et al. 2006). With respect to humor appreciation, results showed that older adults perceive humorous cartoons (Barrick et al. 1990; Shammi and Stuss 2003) and verbal jokes (Uekermann et al. 2006) as funny as younger adults do. A limited number of studies have investigated humor comprehension and appreciation using continuous groups. For instance, Schaier and Cicirelli (1976) asked three groups of seniors (age 50–59, 60–69, and 70–79) to rate the funniness of a set of jokes and to explain what was funny about each joke, considering explanation as a proxy of comprehension. Results showed that humor comprehension significantly declined with age, whereas humor appreciation correlated positively with age. A more recent study compared two groups of elderly people, young-old (age 60–70) and old-old (age 71–90) (Daniluk and Borkowska 2017). Old-old individuals had more difficulties than young-old individuals in selecting the correct ending of humorous stories. In addition, when humor comprehension scores were regressed onto demographics and cognitive functioning scores (obtained through the Mini-Mental State Examination), results showed that overall cognitive level and education significantly predicted humor comprehension. The role of cognitive factors in humor comprehension was also explored in other studies, which reported relationships with working memory and verbal abstract ability (Shammi and Stuss 2003), inhibition and set shifting (Uekermann et al. 2006), and cognitive flexibility, abstract reasoning, and vocabulary (Mak and Carpenter 2007).
In light of these complex relationships, Mak and Carpenter (2007) highlighted that “[…] a more sophisticated model is needed to clarify the role of cognition in humor comprehension”. Given the important role of humor in social communication, we argue that such a model should also consider the role of the other cognitive skills supporting social communication, such as pragmatics and ToM (Hyter 2017). However, no studies to date have investigated the relationships of humor with communicative and mentalizing skills in aging. These relationships are at the core of the present investigation.

Insights into the social and communicative aspects of humor can be found in theoretical accounts that combine psychological and linguistic elements. The best-known psychological account of humor, which provides the basis for further models, is the Incongruity-Resolution theory. This account describes the mechanisms of humor comprehension as a two-step process, involving the detection of an incongruity and its resolution with a cognitive rule, as in problem-solving tasks (Suls 1972). Building on the Incongruity-Resolution account, theoretical proposals developed within the linguistic-pragmatic framework have emphasized the communicative role of humor, accounting for key aspects of humorous exchanges, such as the non-literal use of language, the role of speaker’s intentions, and context-based inferences (Hoicka 2014). At the core of the Semantic Script Theory of Humor (Raskin 1984) and the subsequent General Theory of Verbal Humor (Attardo and Raskin 1991), there is the idea that, in order to process a joke, interlocutors should abandon the fundamental assumption that speakers are cooperative in communication (Cooperative Principle in Grice 1989) and establish a new situation in which the mode of communication follows the rules of non-bona-fide cooperation (Attardo 1993). Within this communicative mode, the hearer acknowledges that the speaker’s intention is not to convey truthful information, but rather to cause a cognitive (and emotional) response (Attardo 2008; Raskin and Attardo 1994). The role of speaker’s intentions in humor comprehension has also been investigated within the Relevance Theory framework (Sperber and Wilson 1995).
In this account, humorous utterances require the identification of speaker’s intentions to generate positive cognitive effects in the perceiver: the highest possible reward in humor comprehension is reached by balancing the mental effort paid in processing the joke with the outcome of the joke’s processing (e.g., amusement) (Yus 2016). To process a joke, the hearer has to compute a series of inferences necessary to turn the joke into a fully contextualized interpretation, via disambiguation of single tokens in the discourse, concept adjustment or the derivation of implicatures from the joke’s scenario (Yus 2017).

Other accounts focus mainly on social cognition, claiming that humor depends primarily on ToM abilities (Jung 2003). Specifically, the understanding of someone else’s thoughts is considered to be necessary for humor comprehension (Howe 2002). In line with this view, studies on the lifespan highlighted the importance of social cognition in the development of humor (Hoicka and Gattis 2008; Reddy 2001) as well as in the decline of humor in aging (Uekermann et al. 2006). Recent neurophysiological evidence on young adults indicates that higher social skills are beneficial in the earliest phases of humor processing (Canal et al. 2019). Moreover, there is evidence of difficulties in humor comprehension in populations with ToM impairment, such as people with Autism Spectrum Disorder (Emerich et al. 2003; Samson and Hegenloh 2010) and schizophrenia (Bozikas et al. 2007; Corcoran et al. 1997).

Accounting for the interplay between ToM abilities and humor comprehension, previous studies investigated brain areas activated in response to different types of humorous cartoons or jokes (Chan and Lavallee 2015; Feng et al. 2014; Marjoram et al. 2006; Samson et al. 2008). Interestingly, the involvement of ToM regions (such as the medial prefrontal cortex and the temporo-parietal junction) was not general, but depended on the type of humorous stimulus, being stronger in cartoons where the viewer had to ascribe mental states or intentions to the portrayed characters than in other humorous vignettes that did not especially engage mentalizing abilities to be solved (such as for instance those based on physical similarity). Consistently, in a behavioral study participants gave more mentalistic explanations for ToM-based cartoons compared to other types of cartoons, indicating that humor types differently load on ToM skills (Samson 2012).

Taken together, these theoretical accounts and experimental findings lead us to a general hypothesis, namely that both pragmatics and ToM are fundamental in humor comprehension, yet to a different extent. On the one hand, pragmatic operations such as understanding the speaker’s communicative intention and the context-based inferences required to fill the gap between the literal meaning and the humorous meaning seem to be pivotal in every type of joke. On the other hand, the involvement of ToM is likely to depend on the specific type of joke, being higher in jokes that require reasoning about the mental states of the characters in the joke’s story. Investigating the relationship between humor comprehension and these skills might represent an important advancement in understanding the general cognitive architecture of humor. Furthermore, the role of pragmatic and ToM skills in humor understanding might be of special interest in the aging population, since both pragmatics and ToM are known to decline in old age (Cavallini et al. 2013; Henry et al. 2013; Messer 2015; Parola and Bosco 2018).

To tackle these issues, the approach adopted in the present study was based on the following key elements: a) creating a new humor task which includes jokes with different ToM loads; b) assessing older adults’ humor comprehension, as well as pragmatic and ToM abilities, with specific tests; and c) using a modeling technique to examine the theoretical distinction between different types of jokes and to estimate accurately the relationships between humor, pragmatics, and ToM. To meet a), we created the Phonological and Mental Jokes (PMJ) task, comprising jokes where the incongruity arises either at the sound level or in the character’s mental states. To meet b), a sample of 147 older adults was assessed with the PMJ task, with the comprehension section of a validated test assessing pragmatic skills (Assessment of Pragmatic Abilities and Cognitive Substrates, APACS, Arcara and Bambini 2016), and with an advanced ToM task, i.e., the Strange Stories (White et al. 2009). To meet c), we used Structural Equation Models (SEMs). The use of SEMs brings several benefits to the present study. First, it is a hypothesis-driven approach and allows researchers to test specific models. Second, a number of fit indices can be used to evaluate a specified model. Third, by providing error estimates, it enables researchers to obtain estimates closer to the ‘true’ score on the latent variables, that is, the hypothetical constructs measured through the task’s items. Fourth, it allows complex models that include a number of variables to be tested at the same time, showing unique relationships between them, i.e., after accounting for the effect of the other variables in the model (Kelloway 2017; Kline 2011).

In adopting SEM modeling, our analysis had two main aims. Our first aim was to investigate whether Phonological jokes (PJ) and Mental jokes (MJ) can be considered as different constructs in older individuals. On the basis of previous studies using different types of humorous stimuli (Chan and Lavallee 2015; Samson et al. 2008), we hypothesized that a model with two latent factors corresponding to PJ and MJ would show adequate fit to the data.

Our second aim was to investigate the role of pragmatic and ToM skills (focus variables) in relation to the two types of jokes, controlling for a number of background variables. Based on the theoretical accounts describing the pragmatic nature of humor (Attardo 2008; Yus 2016), we hypothesized that pragmatics is a significant predictor of both PJ and MJ latent factors. Furthermore, on the basis of the role of ToM and its modulation in different types of jokes (Chan and Lavallee 2015; Samson et al. 2008), we hypothesized that there is a significant relationship between ToM and MJ latent factors, but not between ToM and PJ latent factors. As background variables, we considered age and education because of their well-known effect on both pragmatics and ToM, as well as on jokes. Furthermore, given that both the PMJ task and the Strange Stories were based on the comprehension of verbal stories, we also controlled for individual differences in vocabulary.

Method

Participants

The sample was composed of 147 older adults (age: M = 69.21, SD = 5.76; range 60–85; Female: 71.4%). Information about educational attainment was collected with a short demographic questionnaire; educational level was categorized on a five-point scale (1 = elementary school; 2 = middle school; 3 = high school; 4 = bachelor’s and master’s degree; 5 = PhD and other). Participants varied in the educational degree attained: 5% elementary school, 17% middle school, 56% high school, 18% university degree, and 4% higher degree. All participants were living independently, with no history of neurological or psychiatric illness, and all of them scored above the 24-point cut-off of the Mini-Mental State Examination (MMSE), a widely used screening test for cognitive impairment (Folstein et al. 1975). All participants were native speakers of Italian. Participants were unpaid volunteers recruited through socio-cultural aggregation centers, personal contacts, word of mouth, and local advertisements.

The study was approved by the Ethics Committee of the Department of Brain and Behavioral Sciences of the University of Pavia. Informed consent was obtained from all participants.

Assessment

Tests were administered in a quiet room at recreational centers or in participants’ own homes, under the supervision of a research assistant who was instructed to explain the tasks. Tests were administered either in a paper-and-pencil format or orally (see below for the details of each test). Participants took part in two testing sessions of about 1 h each. The order of tests was randomized across participants.

Theory of mind

We used the Strange Stories task (Happé et al. 1998; White et al. 2009), an advanced ToM test commonly used in aging research. We selected six mental state stories depicting three different social scenarios: double bluff, persuasion, and misunderstanding. Stories were presented in a booklet in which participants wrote their answers. According to the scoring procedure, each answer was scored 2 if full and explicit, 1 if partially correct, and 0 if incorrect. Total scores ranged from 0 to 12.

Pragmatics

We administered the comprehension section of the Assessment of Pragmatic Abilities and Cognitive Substrates (APACS) test (Arcara and Bambini 2016), a validated tool to evaluate pragmatic skills (details on reliability and validity are described in Arcara and Bambini 2016). The APACS comprehension section includes four tasks (Narratives, Figurative Language 1, Humor, Figurative Language 2), globally assessing the ability to understand implicit and explicit aspects of a narrative text and to derive non-literal meanings. A composite score is derived from the scores of the four tasks, i.e., APACS comprehension score. All the four tasks were administered orally by a research assistant, and participants’ answers were recorded and scored later. Given that one of the four tasks used to calculate the APACS comprehension score focuses on humor and that this might lead to spurious correlations with the PMJ task, we decided to restrict the calculation of the APACS comprehension score to three out of the four tasks, namely Narratives, Figurative Language 1, and Figurative Language 2, thus excluding the Humor task. Scores from each of the three tasks were transformed into proportions and averaged, so that the total APACS comprehension scores ranged from 0 to 1.
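
As an illustration of the composite described above, the following minimal sketch converts raw task scores into proportions and averages them while leaving out the Humor task. The task keys and the maximum scores are hypothetical placeholders, not the values from the APACS manual.

```python
def apacs_comprehension(raw_scores, max_scores, exclude=("humor",)):
    """Average the proportion-correct scores of the retained tasks.

    raw_scores and max_scores are dicts keyed by task name.
    Tasks listed in `exclude` (here, the Humor task) are dropped
    before averaging, so the result lies between 0 and 1.
    """
    kept = [task for task in raw_scores if task not in exclude]
    proportions = [raw_scores[task] / max_scores[task] for task in kept]
    return sum(proportions) / len(proportions)


# Hypothetical maxima and raw scores, for illustration only.
max_scores = {"narratives": 10, "figurative_1": 20, "humor": 7, "figurative_2": 30}
raw_scores = {"narratives": 8, "figurative_1": 15, "humor": 5, "figurative_2": 30}

composite = apacs_comprehension(raw_scores, max_scores)
print(round(composite, 2))
```

Averaging proportions rather than summing raw scores gives each of the three retained tasks the same weight regardless of how many points it carries.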

Vocabulary

Vocabulary was assessed using the verbal meaning subscale from the Primary Mental Abilities battery (Thurstone and Thurstone 1963). This test requires participants to identify the correct synonym of 50 target words by choosing from four given alternatives within 8 min. Following the distinction, common in the literature on language learning, between size or breadth of vocabulary knowledge (i.e., how many words are known) and depth or quality of vocabulary knowledge (i.e., how well those words are known; see Schmitt 2014), this test can be considered an assessment of vocabulary depth in a receptive format, as it measures how well an item is integrated and linked with other items in the mental lexicon. The test was administered in a paper-and-pencil format. Total scores ranged from 0 to 50.

Phonological and Mental Jokes (PMJ) task

Construction of the Jokes

Jokes were selected and adapted from Italian magazines, books and web repositories. Structure-wise, all jokes were based on incongruity-resolution. Content-wise, we avoided sexual, black, aggressive, and scatological humor (Ruch and Platt 2012) and selected jokes with ‘no specific content’ (Carretero-Dios et al. 2009). In addition, we took care to select jokes that belonged to two different types with respect to the source of the incongruity. One type can be defined as ‘phonological jokes’ (see Footnote 1). In this type the incongruity originates from a sound similarity between the punning word and the expected non-humorous word. The other type can be defined as ‘mental jokes’. In this type there is an incongruity between the belief/thought attributed to one character and that character’s next utterance: the latter disconfirms the initial (attributed) belief/thought, which turns out to be false. Mental jokes are thus based on ToM processes such as the attribution of false belief and mental states in general (see Table 1 for examples). Jokes were initially selected and adapted to meet the criteria expressed above by the first co-author. Two other researchers acted as independent judges to check the conformity of these jokes to the study requirements. Overall the PMJ task included 12 jokes, six phonological and six mental. This number of items is in line with existing tasks assessing humor comprehension (e.g., seven items in the Humor task in the APACS test, and 10 items in the Batteria per il Linguaggio dell’Emisfero Destro (BLED) test, Rinaldi et al. 2004).

Table 1 Examples of items in the Phonological and Mental Jokes task

Furthermore, each joke was adapted to match a fixed linguistic structure consisting of three lines of text (two context-lines followed by the ending) with a dialogue between two characters. The first context-line introduced a situational context, whereas the second context-line started a dialogue, which was completed by the humorous ending. Following existing joke comprehension tasks, such as the Humor task in APACS, for each joke we created two other endings (one straightforward and one unrelated) to be used in a multiple-choice task. The three endings differed only in the target word, leaving the linguistic context constant between conditions.

Control Measures

As a control on the materials, a number of measures were taken to ensure that jokes did not vary in the main psycholinguistic variables across types (i.e., phonological and mental) and conditions (i.e., humorous, straightforward, and unrelated). The length of endings was balanced across the two types of joke. Mean frequency of target words in straightforward endings and in unrelated endings showed no significant difference between phonological and mental jokes. For the humorous condition, the difference was significant, given that phonological jokes comprised phonemic or syllabic changes that in some cases created non-real words (therefore having a null frequency, e.g., sambatical). See Table 2 for values and descriptive statistics.

Table 2 Descriptive statistics for the Phonological and Mental Jokes task

To check the humorousness of the (correctly completed) jokes in the PMJ task, we used funniness as a proxy and tested whether texts with humorous endings were judged as being funnier than texts with straightforward endings in the young population (see Footnote 2). To do so, we ran a rating study on 40 young adults (age: M = 24.48, SD = 3.27, range 19–34, Female: 75.47%), who were asked to complete an online questionnaire assessing the level of funniness of the 12 texts used in the PMJ task by means of a 7-point Likert scale (1 = not funny at all; 7 = extremely funny). Two lists were created, counterbalancing the type of ending of each text across lists, so that each list contained 6 texts with humorous and 6 texts with straightforward endings. Overall, texts with humorous endings were rated as significantly funnier than texts with straightforward endings, and this was true also when considering mental jokes only and phonological jokes only. Interestingly, humorous endings in mental jokes were considered funnier than those in phonological jokes. See Table 2 for values and statistics.

Procedure

The 12 jokes were administered in a paper and pencil format in three different randomized lists. For each joke, humorous, straightforward, and unrelated endings were randomized across lists. No cue in the written material indicated differences across types of ending or the target words. Subjects were assigned to a list and were asked to choose which ending worked best as a punchline for the joke, where punchline was explained as “the ending that makes you laugh the most”. Responses were scored for accuracy (0 for wrong straightforward and unrelated choices and 1 for correct humorous choices). Total score ranged from 0 to 6 for each subset (i.e., phonological and mental). Then, participants were asked to rate the funniness of the joke (as they completed it) on a scale ranging from 1 to 10, with an emoticon depicting an unamused face upon the lowest value and a crying laughing emoticon upon the highest value of the scale. Following Uekermann et al. (2006), funniness scores were calculated only on correctly completed jokes.
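
The scoring rule described above can be sketched as follows. This is only an illustration: the record layout is hypothetical, and averaging the ratings of correctly completed jokes is an assumption, since the text states only that funniness was computed on correct items.

```python
def score_pmj(responses):
    """Score one participant's PMJ responses.

    Each response is a dict with (hypothetical) keys:
      'subset'    -> 'phonological' or 'mental'
      'choice'    -> 'humorous', 'straightforward' or 'unrelated'
      'funniness' -> rating from 1 to 10
    Returns accuracy per subset (0-6 with six jokes each) and the mean
    funniness of correctly completed jokes (None if none were correct).
    """
    accuracy = {"phonological": 0, "mental": 0}
    ratings = {"phonological": [], "mental": []}
    for response in responses:
        correct = response["choice"] == "humorous"  # 1 point only for the funny ending
        accuracy[response["subset"]] += int(correct)
        if correct:  # funniness counts only for correctly completed jokes
            ratings[response["subset"]].append(response["funniness"])
    funniness = {
        subset: (sum(values) / len(values) if values else None)
        for subset, values in ratings.items()
    }
    return accuracy, funniness


example = [
    {"subset": "phonological", "choice": "humorous", "funniness": 6},
    {"subset": "phonological", "choice": "unrelated", "funniness": 2},
    {"subset": "mental", "choice": "humorous", "funniness": 8},
    {"subset": "mental", "choice": "humorous", "funniness": 7},
]
print(score_pmj(example))
```

Because ratings attached to wrong choices are discarded, the number of values behind each funniness score differs between participants and items.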

Statistical Analyses

Before testing our hypotheses, since the PMJ is a novel task, we investigated its psychometric properties. In order to do that, we examined internal consistency, item analysis, and concurrent validity of the PMJ accuracy scores. Concurrent validity was measured against a standardized test of humor comprehension, i.e., the Humor task included in the APACS test (Arcara and Bambini 2016).

Then, in order to test whether phonological and mental jokes were separate constructs (aim 1), we focused on PMJ accuracy scores and tested a measurement model with two latent factors. Next, we examined whether the two subsets (phonological vs. mental) significantly differed from one another in terms of accuracy and funniness scores through two repeated-measures ANOVAs. Funniness scores were further analyzed through correlational analyses. We did not use SEM modeling to analyze funniness scores because of the high number of missing values. Missing values arose because funniness was computed only on jokes correctly completed in the multiple-choice task; therefore, the number of single values contributing to the funniness score differed for each item, depending on how many participants completed it correctly.

Before addressing aim 2 (i.e., associations between jokes, pragmatics, and ToM), we preliminarily examined the Strange Stories task to ensure that the items reflect the same underlying ToM ability. Notwithstanding its common use in the aging literature (Henry et al. 2013), to our knowledge no study has examined the latent structure of the Strange Stories task in older adults (but see Devine and Hughes 2016 for a study on children). Therefore, we decided to assess the fit of a measurement model with a single latent factor.

Then, we specified a model in which we regressed the PMJ latent factors onto pragmatics, and entered covariation paths between the PMJ and ToM latent factors. Also, ToM latent factor and pragmatics were specified as covariates. The model also included three background variables, namely vocabulary, age, and education. Vocabulary was entered as predictor of both the PMJ and ToM latent factors, while it was considered as covariate for pragmatics. Finally, age and education were entered as predictors of all the other variables in the model.

In all the measurement models, the indicator with the greatest variance was selected as marker variable and its loading was fixed to 1, in order to scale the latent variables. All other factor loadings between indicators and latent variables were freely estimated. Because both PMJ and Strange Stories had categorical items, we used the mean- and variance-adjusted weighted least squares (WLSMV) estimator. Model fit was assessed using five primary criteria: the chi-square goodness of fit statistic, the comparative fit index (CFI), the Tucker-Lewis index (TLI), the root mean square error of approximation (RMSEA), and the weighted root mean square residual (WRMR). We used the following criteria to assess acceptability of our models: a non-significant chi-square test of model fit; for adequate model fit, CFI ≥ .90, TLI ≥ .90, RMSEA ≤ .08; for good model fit, CFI ≥ .95, TLI ≥ .95, RMSEA ≤ .05; WRMR < 1.00 (Brown 2015; DiStefano et al. 2018). A 90% confidence interval of the RMSEA whose lower limit is below .05 and whose upper limit is below .10 further supports the fit of the solution (Kline 2011). Beyond model chi-square and approximate fit indices, we also inspected individual parameter estimates. The significance of the individual parameters (path and factor loading estimates) was evaluated for its meaningfulness to the model. Covariance coverage values, that is, the proportion of available data, were high across all tested models (96% on average), indicating no substantial amount of missing data.
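
The cut-offs listed above can be applied mechanically, as in the hedged sketch below. The function name, the "poor" label for models meeting neither set of cut-offs, and the .05 significance level for the chi-square test are assumptions made for illustration; the input values in the example are invented.

```python
def classify_fit(chi2_p, cfi, tli, rmsea, wrmr, alpha=0.05):
    """Apply the fit cut-offs stated in the text to one model.

    Returns a label ('good', 'adequate' or 'poor') based on CFI, TLI and
    RMSEA, plus a dict of the two remaining checks (chi-square and WRMR).
    `alpha` is an assumed significance level for the chi-square test.
    """
    if cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.05:
        level = "good"
    elif cfi >= 0.90 and tli >= 0.90 and rmsea <= 0.08:
        level = "adequate"
    else:
        level = "poor"  # label not used in the text; added for completeness
    checks = {
        "chi2_nonsignificant": chi2_p > alpha,
        "wrmr_below_1": wrmr < 1.00,
    }
    return level, checks


# Invented values, for illustration only.
level, checks = classify_fit(chi2_p=0.30, cfi=0.96, tli=0.95, rmsea=0.04, wrmr=0.80)
print(level, checks)
```

Keeping the chi-square and WRMR checks separate from the label mirrors the text, where they are listed as criteria in their own right rather than folded into the adequate/good distinction.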

We used SPSS 19 (IBM Corp. Released 2010) for descriptive analyses, correlation analyses, and ANOVAs. For measurement model and structural equation model analyses, we used Mplus Version 7 (Muthén & Muthén 1998–2012).

Data Availability

The dataset analyzed during the current study, together with the syntax used in Mplus for the analyses, is available in the Open Science Framework repository, https://osf.io/7kj8h/?view_only=0b6bf3762d014df0a126c5d2b4e753f9.

Psychometric Properties of the Phonological and Mental Jokes task

First, we found that internal consistency of the whole PMJ task was acceptable, Cronbach’s α = .65. Then we performed the item analysis by: a) computing the corrected item-total correlation (CITC); b) calculating the difficulty level (i.e., number of correct responses × 100 / total responses) for each item; and c) computing the item discrimination index for each item, comparing accuracy between participants obtaining high (upper 27%) and low (lower 27%) scores in the PMJ task. See Table 3 for details. Inspecting the corrected item-total correlations, we noticed that all items showed CITCs greater than .20, except for item 12, which showed a very low coefficient, r = .08. All twelve items presented a good difficulty level (according to Quaigrain and Arhin 2017, values between 20 and 90% are considered acceptable). However, item 12 had a lower value compared to the other items (Item12 = 41.79 vs. Mitem = 80.19), suggesting that this joke was somewhat different from the others. Item discrimination indices (DI) were on average good, MDI = 0.49 (DI ≥ 0.40 indicates that the item is functioning satisfactorily; see Quaigrain and Arhin 2017). Hence, on average, the PMJ items can effectively discriminate between people with higher scores and those with lower scores.
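
The item statistics named above can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure used in the study: the data matrix is invented, and the discrimination index is taken to be the difference in proportion correct between the upper and lower 27% groups, which is the usual textbook definition but is an assumption here.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equally long lists (non-zero variance assumed)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def cronbach_alpha(data):
    """Cronbach's alpha; data is a list of participant rows of 0/1 item scores."""
    k = len(data[0])
    item_vars = [variance([row[j] for row in data]) for j in range(k)]
    total_var = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def difficulty(data, j):
    """Percentage of correct responses on item j."""
    return 100 * sum(row[j] for row in data) / len(data)


def citc(data, j):
    """Corrected item-total correlation: item j vs. total score without item j."""
    item = [row[j] for row in data]
    rest = [sum(row) - row[j] for row in data]
    return pearson(item, rest)


def discrimination_index(data, j, fraction=0.27):
    """Proportion correct on item j in the top group minus the bottom group."""
    ranked = sorted(data, key=sum)  # ascending by total score
    n = max(1, round(len(data) * fraction))
    low, high = ranked[:n], ranked[-n:]
    return mean(row[j] for row in high) - mean(row[j] for row in low)


# Invented toy data: 4 participants x 3 items.
rows = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(cronbach_alpha(rows), difficulty(rows, 0), citc(rows, 1), discrimination_index(rows, 1))
```

Correlating each item with the total of the remaining items, rather than with the full total, avoids inflating the coefficient by counting the item on both sides.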

Table 3 Item characteristics of the Phonological and Mental Jokes task

Overall, the results presented above suggest that there might be good reason to remove item 12 from the PMJ task. Internal consistency of the PMJ task without this item increases and becomes satisfactory, α = .68. Also, removing any other items does not further improve Cronbach’s α.

Concurrent validity was examined by computing Pearson’s correlation between the accuracy scores in the PMJ task and accuracy scores in the Humor task of the APACS test. Results indicated that the two tasks were moderately correlated, r = .42, p < .001, thus supporting the concurrent validity of the PMJ task. The correlation between the 11-item PMJ task and the APACS Humor task was similar, r = .43, p < .001.

Latent Structure of the Phonological and Mental Jokes task

First, we tested a measurement model with two latent factors for the 12-item PMJ task, with six items loading onto a Phonological Jokes latent factor (PJ), and six onto a Mental Jokes latent factor (MJ). Results indicated an acceptable fit to the data, χ2(53) = 61.14, p = .207, CFI = .95, TLI = .93, RMSEA = .03, 90% CI [.00, .06], WRMR = .78. When inspecting parameter estimates, all items showed significant standardized factor loadings, all ps < .001, except for one item from the mental subset (item 12), λ = .18, p = .175. Note that this item was the one identified in the item analysis as poorly correlating with the whole task and with a lower difficulty level compared to the other items. Therefore, we removed this item from the model (see Footnote 3), retaining 5 indicators for the Mental latent variable and 6 indicators for the Phonological latent variable (Table 4). The final model (Fig. 1) showed good fit to the data, χ2(43) = 47.65, p = .289, CFI = .97, TLI = .96, RMSEA = .03, 90% CI [.00, .06], WRMR = .74. The two latent factors showed significant variance: PJ, unstandardized estimate = .26, p = .038; MJ, unstandardized estimate = .33, p = .025. PJ and MJ were strongly intercorrelated, r = .87, p < .001.
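
The degrees of freedom reported above (53 and 43) can be reproduced with a simple parameter count. The sketch below rests on assumptions not spelled out in the text: each item loads on one factor only, residuals are uncorrelated, one marker loading per factor is fixed to 1, and, with categorical indicators, only the correlations among items contribute degrees of freedom because thresholds are estimated without restriction.

```python
def cfa_df(items_per_factor):
    """Degrees of freedom for a simple-structure CFA on categorical items.

    items_per_factor: list with the number of indicators of each factor.
    Assumes one marker loading per factor fixed to 1, freely estimated
    factor variances and covariances, and no cross-loadings.
    """
    p = sum(items_per_factor)          # total number of indicators
    m = len(items_per_factor)          # number of latent factors
    correlations = p * (p - 1) // 2    # unique item correlations
    free_loadings = p - m              # all loadings except the markers
    factor_variances = m
    factor_covariances = m * (m - 1) // 2
    return correlations - (free_loadings + factor_variances + factor_covariances)


print(cfa_df([6, 6]))  # 12-item, two-factor model
print(cfa_df([6, 5]))  # 11-item model after removing item 12
```

Matching the count against the reported chi-square degrees of freedom is a quick way to confirm that a model was specified as intended.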

Table 4 Parameter estimates for the measurement model for the Phonological and Mental Jokes task

Fig. 1 Standardized estimates for the final measurement model for the Phonological and Mental Jokes task. All parameter estimates were p < .001. MJ, Mental Jokes; PJ, Phonological Jokes

We then examined whether the phonological and mental subsets differ in terms of accuracy and funniness scores (see Table 5 for average scores).

Table 5 Descriptive statistics for humor, theory of mind, pragmatics and vocabulary

A repeated-measures ANOVA with subset (phonological vs. mental) as within-subject variable and accuracy as dependent variable showed that the two subsets were not statistically different, F(1, 134) = 1.94, p = .166, ηp2 = .01. A similar analysis on the funniness scores (on accurately completed jokes) revealed that mental jokes were perceived as funnier than phonological jokes, F(1, 131) = 35.40, p < .001, ηp2 = .21. Furthermore, we ran correlation analyses on funniness scores and found that they were not related to any of the investigated variables, all rs ≤ .12, all ps ≥ .161, for either phonological or mental jokes. Funniness scores for phonological and mental jokes were strongly correlated, r(137) = .83, p < .001.
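
For readers who wish to check the effect sizes, partial eta squared can be recovered from the F statistic and its degrees of freedom using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the two F values reported above; the function name is ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported in the text.
accuracy_effect = partial_eta_squared(1.94, 1, 134)
funniness_effect = partial_eta_squared(35.40, 1, 131)
print(round(accuracy_effect, 2), round(funniness_effect, 2))
```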

Associations between Variables

We then moved on to our second aim. We preliminarily tested a measurement model in which the six items of the Strange Stories loaded onto a single latent factor, i.e., ToM. This model provided adequate fit to the data, χ2(9) = 10.84, p = .287, CFI = .96, TLI = .94, RMSEA = .04, 90% CI [.00, .10], WRMR = .55. However, one item showed a non-significant standardized factor loading, λ = .16, p = .174. Therefore, we eliminated this item and examined the new model with five indicators, which showed excellent fit, χ2(5) = 2.42, p = .789, CFI = 1.00, TLI = 1.11, RMSEA = .00, 90% CI [.00, .08], WRMR = .29. The latent variable showed significant variance, unstandardized estimate = .26, p = .039 (see Table 6).

Table 6 Parameter estimates for the measurement model for the Strange Stories task

Next, we examined the associations between variables (see Table 5 for descriptive statistics). To address aim 2 (relationships between humor understanding, pragmatics, and ToM), we specified the SEM model shown in Fig. 2 (see Table 7 for unstandardized estimates). Model fit indices showed good fit, χ2(153) = 172.75, p = .143, CFI = .96, TLI = .95, RMSEA = .03, 90% CI [.00, .05], WRMR = .77. Starting from the focus variables, standardized coefficient estimates indicated that pragmatics predicted both PJ and MJ latent factors, both ps = .002. Independently of these effects, we found a significant relationship between ToM and MJ, p = .036, but not between ToM and PJ, p = .266. PJ and MJ were strongly associated, p < .001. All the associations described above were independent of variation in age, education, and vocabulary.

Fig. 2

SEM model depicting relationships between humor understanding, theory of mind, pragmatics, vocabulary, education, and age. Humor understanding was assessed with the Phonological and Mental Jokes task. Theory of mind was assessed with the Strange Stories. Pragmatics was assessed with the comprehension section of the APACS test. Vocabulary was assessed with the verbal meaning subtest of the Primary Mental Ability (PMA). Standardized coefficients are reported. SS_1 to _6, items of the Strange Stories task; J_1 to _11, items of the Phonological and Mental Jokes task. PJ, Phonological Jokes latent factor; MJ, Mental Jokes latent factor; TOM, theory of mind latent factor; VOC, vocabulary; PRAG, pragmatics; EDU, educational attainment. Dashed grey line represents paths with p > .10, solid black line represents paths with p < .05. * p < .05, **p < .01, *** p < .001

Table 7 Unstandardized parameter estimates for the SEM model examining relationships between humor understanding, theory of mind, pragmatics, vocabulary, education, and age

Concerning the background variables, we found that vocabulary predicted both ToM and MJ latent factors, p < .001 and p = .041 respectively, but not PJ, p = .115, and was significantly related to pragmatics, p < .001. Also, age exerted a significant negative effect on vocabulary, p = .017, pragmatics, p < .001, and MJ, p = .041. On a similar note, education showed a positive effect on both vocabulary and pragmatics, ps < .001. Age and education were negatively related, p < .001.

Discussion

Joke understanding is an important part of people’s social life, especially in aging (Damianakis and Marziali 2011). Older adults use humor as a coping strategy to deal with life challenges and humor helps them to maintain well-being and mental health (Ganz and Jacobs 2014; Mak and Carpenter 2007; Ruch and McGhee 2014). Yet humor comes with comprehension costs, and experimental studies have shown that physiological aging negatively affects the ability to understand jokes (Greengross 2013). Despite some research on the role of executive functions, very little has been done to examine how the cognitive skills underpinning social communication are involved in the age-related decline in humor understanding. The present study aimed at addressing this issue by examining associations between humor understanding, pragmatics, and ToM in older adults. We hypothesized that pragmatic mechanisms would be broadly involved in humor understanding, because arguably all jokes require the comprehender to shift to a special communication mode (where cooperation principles are violated, Attardo 1993, or flouted, Dynel 2008) and to inferentially fill the gap between the literal meaning and the intended humorous meaning (Yus 2016). By contrast, we expected the involvement of ToM to be specific and to vary depending on whether or not the joke requires reasoning about the mental states of the joke’s characters. In order to test these hypotheses, we started by creating a task based on the classic selection of the funny ending in a multiple-choice format, but employing novel and carefully crafted materials divided into phonological and mental jokes (Phonological and Mental Jokes task). The former are based on incongruity at the sound level and require no reasoning about the mental states of the joke’s characters; the latter are based on the attribution of false beliefs/thoughts to one character. Innovatively with respect to the previous literature, we applied a structural equation modeling (SEM) approach to examine the relationships between individual differences in ToM, pragmatics, and humor understanding in normal aging.

Our first aim was to test the possibility of distinguishing phonological and mental jokes as two different constructs. As expected, results on accuracy supported a model with two latent factors corresponding to the two different types of jokes. In detail, the phonological (PJ) and mental jokes (MJ) emerged as two separate, although related, types of jokes. These findings fit with the experimental literature that distinguished between mentalistic vs. other types of humorous stimuli (Chan and Lavallee 2015; Feng et al. 2014; Samson 2012; Samson et al. 2008). More generally, results are in line with the theoretical and linguistic literature claiming that jokes are not a unitary phenomenon but actually differ on several levels (Attardo and Raskin 1991; Dynel 2012; Yus 2017). Although the joke-type distinction employed here was designed starting from the idea of modulating the amount of ToM reasoning, it actually finds support in the taxonomies proposed for the logical structure of jokes, i.e., the logical mechanism behind their resolution: phonological jokes might be based on the logical mechanism of “cratylism” (i.e., the idea that if two words sound the same or similar they must therefore have the same or similar meanings), while mental jokes seem more in line with reasoning from false premises (Attardo et al. 2002). Our distinction between phonological and mental jokes is also compatible with the difference between discourse-centered jokes (based on conflicts in the verbal content of the joke) and frame-centered jokes (based on conflicts in the construction of the mental situation of the joke), which in turn capitalize on different pragmatic-inferential operations (Yus 2017). Furthermore, reliability and item analysis indicated that the PMJ task is a psychometrically sound measure of humor understanding in the aging population, with satisfactory properties. Based on these considerations, we believe that the PMJ task is important for at least two reasons. First, it provides researchers with a task that can be applied to study humor in adulthood and aging. Second, it supports the view of humor as a complex phenomenon that should be investigated using a fine-grained analysis, paying attention to the different logical mechanisms and inference types, as well as to the skills involved depending on the type of joke.

Notably, the two types of jokes assessed by the PMJ task were not significantly different in accuracy scores. That is, phonological and mental jokes were similar in terms of difficulty in the multiple-choice task. Conversely, we found a significant difference in the appreciation of mental vs. phonological jokes by elderly people: the mental jokes were rated as being funnier than the phonological ones. This is consistent with our preliminary rating study and with previous studies on young adults showing that higher complexity in mentalizing increases appreciation (Dunbar et al. 2016; Samson 2012). We speculate that this difference in jokes’ appreciation is, at least in part, due to the involvement of greater cognitive resources that make mental jokes more engaging than phonological jokes. For instance, as evidenced by the analysis related to the second aim of the study, understanding MJs, but not PJs, was associated with ToM. Although more research is clearly needed to confirm our speculation, the interpretation that we offered is in line with the literature showing that jokes sufficiently hard – but not too hard – are rated as the funniest, even in children (Greengross 2013; McGhee 1976). Interestingly, funniness ratings were not related to pragmatics, age, education, or vocabulary, opening future avenues to investigate which factors may account for humor appreciation.

Our second aim was to examine the association between the skills supporting social communication and humor understanding, namely ToM and pragmatics. Concerning ToM, we expected to find a relationship between ToM and MJ, but not PJ. Results confirmed our expectations, evidencing a differential involvement of ToM in verbal humor understanding. Specifically, individual differences in ToM score, as assessed through the Strange Stories, were related to individual differences in MJ, but not in PJ. Elaborating on this finding, we can conclude that jokes where characters’ thoughts and emotions are essential to detect and solve the incongruity rely on mentalizing skills more heavily than jokes based on sound patterns. This finding echoes previous works that reported greater involvement of ToM mechanisms and associated brain regions in response to humorous cartoons and jokes based on mental reasoning compared with other types of humor based for instance on visual or sound resemblance (Chan and Lavallee 2015; Corcoran et al. 1997; Marjoram et al. 2006; Samson et al. 2008). Moreover, our finding extends previous literature in at least two important ways. First, previous evidence on the differential involvement of ToM in humor understanding was obtained mainly through tasks employing humorous cartoons. Here we showed that it is possible to observe differential involvement of ToM skills also in another modality, i.e., verbal humor. Second, we evaluated participants’ ToM skills by using a task that was distinct from the task used to assess humor understanding. Previous works investigated ToM involvement in humor by asking participants to answer mentalistic questions on the jokes’ stories (Uekermann et al. 2006) or by rating the mentalistic content in cartoons’ explanation (Samson 2012): in other words, the measures of jokes and ToM understanding were extracted from the very same stimuli, possibly inflating correlations between constructs. Here, the use of independent and specific tests for humor comprehension and for ToM offers stronger evidence of the association between the two abilities.

Coming to the second focus variable, namely pragmatics, our results showed that pragmatic comprehension skills, as assessed in the APACS comprehension test, predicted humor understanding independently of the type of joke. That is, as hypothesized, the understanding of both mental and phonological jokes engages pragmatic skills in older adults. This result is in line with the theoretical literature considering humor comprehension as part of the more general pragmatic competence (Attardo 2008; Yus 2016). Although our study cannot (and was not designed to) offer direct support to the specific predictions provided in the General Theory of Verbal Humor (Attardo and Raskin 1991) or in the Relevance Theoretic Account of humor (Yus 2016), here we do provide empirical evidence in favor of the general pragmatic approach adopted by these models, pointing to the fact that in aging humor comprehension – irrespective of the different types considered here – is clearly linked to general pragmatic skills, capitalizing on the recognition of the speaker’s intention and on inferencing. Arguably, both phonological and mental jokes rely on the same pragmatic skills assessed in the APACS test, namely inferencing from narrative texts and figurative language. In this respect, it is useful to consider that breakdown in humor is reported as one aspect of the larger pragmatic language disorder observed in several adult clinical populations (Cummings 2014), as in persons with right hemisphere brain damage (Cheang and Pell 2006) and schizophrenia (Bambini et al. 2016), and is included, together with difficulties in idioms and metaphors, in the diagnostic criteria for the Social (Pragmatic) Communication Disorder according to the DSM-5 (American Psychiatric Association 2013).

To summarize, the present findings offer the first picture of how the skills of social communication are involved in joke understanding, with a focus on elderly people. Thanks to the use of a technique such as structural equation modeling, we could describe the role of pragmatics and ToM in jokes, while controlling for the effects of individual differences in vocabulary, educational attainment, and age. This means that the observed associations between jokes, pragmatics, and ToM are unique and ‘genuine’, i.e., controlled for the effect of vocabulary, education, and age. While pragmatics was broadly involved in humor comprehension, ToM showed a specific link with mental, but not phonological jokes. This finding – as well as the approach we adopted – carries important implications. First, it promotes a socio-communicative perspective on the study of humor, which has been neglected till now. Humor is a fundamental aspect of communication and interaction, and its study might disclose evidence of great importance for fields such as experimental pragmatics, for instance on the affiliative effects of language in conversation (Morisseau et al. 2017). Second, our study drew attention to a population so far mostly neglected, namely healthy older adults. Indeed, existing studies have mainly been conducted either on young adults or on clinical populations, with the consequence that we still know very little about the associations between humor and social communication in normal aging. Here we show a complex interplay which deserves further exploration in the psychology of aging.

On a more general note, our model also showed that pragmatics and ToM were not significantly related to each other. This result – coupled with the findings of a pragmatic involvement and a joke-specific ToM involvement – is interesting and fits with existing findings at the crossroads between developmental psychology and developmental pragmatics showing that ToM is engaged differently depending on the specific pragmatic phenomenon, for instance being stronger for metaphors with mental rather than physical contents (Lecce et al. 2019). This is not to say that there is not a relationship between ToM and pragmatics, but rather that ToM and pragmatics are two independent cognitive domains and their relationship becomes stronger in specific communicative situations. The debate on the relationship between pragmatics and ToM is wide and interdisciplinary, embracing theoretical as well as clinical findings. While some authors claim that pragmatics is a sub-component of ToM (Sperber and Wilson 2002), more recent studies argue for a distinction (Bosco et al. 2018; Muller et al. 2010). Our findings support the latter view, offering novel evidence on healthy older adults and on a phenomenon never explored in this perspective, namely humor.

Finally, our analysis yielded interesting results with respect to the background variables, and especially vocabulary. Vocabulary was entered in the model as predictor for mental and phonological jokes and for ToM, and as covariate for pragmatics. Findings indicate that vocabulary – and specifically vocabulary depth, i.e., how well the words are known (Schmitt 2014) – predicted mental jokes (but not phonological jokes). Moreover, vocabulary predicted ToM and it was associated with pragmatics. The role of vocabulary in jokes was already reported in the past (Mak and Carpenter 2007): its higher involvement in mental jokes might be explained by the fact that mentalistic reasoning per se is associated with vocabulary skills in adulthood (Valle et al. 2015) and in aging (Charlton et al. 2009; Lecce et al. 2017). This explanation also fits with the finding that vocabulary predicts ToM. It is indeed possible that a deeper knowledge of the meanings of words allowed for a better understanding of those jokes where mental states were verbally conveyed. The relationship between vocabulary and pragmatics is not surprising, and it matches with previous evidence reporting a link between vocabulary and pragmatic skills in typical and atypical conditions (Cappelli et al. 2018; Matthews et al. 2018). We also found that vocabulary was predicted both by age (negatively) and by education (positively), in line with previous studies (Verhaeghen 2003).

Another interesting finding concerning the background variables is that age predicted mental jokes, but not phonological jokes. The greater cognitive demands of mental jokes – i.e., the fact that mental jokes recruit additional cognitive resources such as mentalizing – might be the reason for their higher difficulty for older adults, in line with previous studies exploring the age effect in elderly people (Daniluk and Borkowska 2017; Schaier and Cicirelli 1976). The general conclusion about the findings concerning the background variables is that mental jokes were at the center of a larger set of relationships compared with phonological jokes. Not only did mental jokes engage ToM resources, but they were also predicted by vocabulary knowledge and by age.

Limitations and Future Directions

The first important caveat is that, in focusing on the skills supporting social communication, our model did not consider other cognitive functions, especially executive functions (EFs). Past studies indicated that EFs play an important role in older adults’ humor understanding (Mak and Carpenter 2007; Shammi and Stuss 2003; Uekermann et al. 2006), as well as in pragmatics (Grindrod and Raizen 2018) and ToM (Charlton et al. 2009; Johansson Nolaker et al. 2018). Therefore, future studies including EFs would be helpful in order to better elucidate the role that general cognitive functioning plays in humor understanding.

The second limitation concerns the generalizability of the current results to the larger population and to other age-groups. Here we focused on a sample of older people. On the one hand, this choice allowed us to investigate a period of life which is still overlooked, but, on the other hand, limited our understanding of the age-related differences in the interplay between humor understanding and social-communicative skills. In order to gain a complete lifespan perspective on humor as a part of social communication, future studies on adults and, perhaps, children, are clearly needed. We mention children because development is another domain where the role of pragmatics and ToM in humor is of special interest, due to the different developmental trajectories of these skills. Interestingly, a recent account describes humor as a form of communication that begins developing very early in life, before the acquisition of ToM skills, with the latter exploited later on in development for more complex forms of humor (Airenti 2016). Also, one cannot exclude that age-related differences combine with gender-related differences. Here we could not investigate gender effects because the sample was grossly unbalanced. However, in a previous study employing a task similar to the one used here, the authors reported gender differences in the young seniors population (64–70 years), with women performing better than men in joke comprehension, but not in the older age-group (Daniluk and Borkowska 2017). Interestingly, the neuroimaging literature evidenced gender differences in the neural response to humor processing and possibly in the recruitment of cognitive systems (Azim et al. 2005; Chang et al. 2018; Kohn et al. 2011). For instance, women seem to show a greater recruitment of brain regions involved in language and executive processing (Azim et al. 2005). In this perspective, future research on joke comprehension might test whether there are differences in the interplay of pragmatics and ToM across genders and age-groups.

Furthermore, in the present study we used a static, off-line humor task (i.e., written jokes). Although this is a more ecological choice compared to the use of cartoons, canned jokes are still far from reflecting the daily-life demands of humor comprehension. Indeed, in real life people are often engaged in the understanding of conversational jokes, and of the mental states behind them, not limited to the mental states of the jokes’ characters but extending to the mental states of the people producing the jokes, in dynamic social situations. Thus, future studies on the socio-cognitive underpinnings of humor should try to move forward. A potentially fruitful way to do so would be to use dynamic video tasks (see for instance Hofmann and Ruch 2017), which approximate real-life demands, or to capitalize on research on humor in interaction (Norrick 2010).

Finally, this study concentrated on humor comprehension, with only marginal observations on humor appreciation. Appreciation is a subjective matter (Dynel 2009) and – intriguingly – it seems to increase with age for certain types of humor (Ruch et al. 1990; Schaier and Cicirelli 1976; Shammi and Stuss 2003), with positive implications for life satisfaction (Ruch et al. 2010). With the PMJ task, we observed that funniness was higher for mental than phonological jokes, possibly because the former are more engaging, but funniness was not related to any of the communicative and socio-cognitive variables considered here. Did funniness rather depend on personality traits, for instance conservatism (Ruch et al. 1990), or temperament (Ruch and Hofmann 2012), or comic/humor style (Heintz and Ruch 2019; Ruch et al. 2018)? Another aspect of appreciation that we did not consider here is aversiveness, i.e., the degree to which a joke is considered inappropriate, annoying, offensive (Carretero-Dios et al. 2009). Aversiveness especially affects jokes with certain types of content (e.g., sexual; see Ruch 1992) that were avoided in this study; for jokes with no specific content such as those employed here no major variation in aversiveness is expected (Carretero-Dios et al. 2009). Yet future studies might combine the phonological vs. mental distinction employed here with other content types (e.g., sexual, black, etc.), and explore possible variations of aversiveness triggered by phonological vs. mental incongruities. We believe that a full-fledged account of humor as part of social communication skills should somehow combine the more cognitive side with the appreciation and personality side, since the latter is highly relevant for the pragmatic effects – more specifically for the so-called perlocutionary effects – that are linked to the choice of using humor in communication, such as teasing, pleasing, offending, persuading, relieving, etc.

Conclusion

It is well known that elderly people might have difficulties in understanding jokes, due to cognitive decline. In this study we focused on the cognitive demands that jokes pose to elderly people on the side of social communication skills, namely pragmatics and ToM, which also decline in aging. When presented with a joke, people need to somehow revise the rules of cooperative communication (by abandoning, Attardo 1993, or by exploiting them, Dynel 2008) and to inferentially derive a content which is not informative but funny. This capacity, which is part of our ability to use language in context (aka pragmatics), seems a general feature of joke understanding. By contrast, only certain types of jokes, such as those in which there is a misattribution of mental states between the joke’s characters, seem to engage ToM skills. Humor understanding – and arguably also difficulties in humor understanding – is thus deeply grounded in older adults’ skills of social communication.