Introduction

Throughout much of the USA, public displays of cursing constitute a crime punishable by fines or incarceration (Hudson, 2017). No other subset of English has the power to compel a parent to symbolically wash their child’s mouth out with soap or to ameliorate pain when dropping a pan on your foot. Many of us learned early in life to regard cursing as a forbidden or punishable behavior. Nevertheless, taboo words represent a frequent, expressive, and utilitarian element of our daily lives. Much remains to be learned about the structure and representation of taboo words. Toward this end, we examined prediction of (1) lexical tabooness for single words and (2) quality of combinations of novel compound taboo words (e.g., shitrocket).

The relationship between words and their referents is generally considered arbitrarily symbolic, and this property of natural language has significant implications for tabooness. Use of the word excrement, for example, is both descriptive and permissible in many contexts, whereas shit is not. This shit/excrement example marks a distinction between taboo-word usage and taboo actions. Often, but not always, this relationship between the signifier (word) and the signified (act) is highly correlated. Here we make specific reference to taboo word use.

Semantic features of English taboo words

Researchers have hypothesized various semantic classifications for American English taboo words. Jay (1992) proposed weakness of body or spirit, social deviations, animal names, ethnic slurs, body parts, and body processes and products as overarching categories of taboo words. Four alternative categories suggested by Bergen (2016) include praying, fornicating, excreting, and slurring. Furthermore, Fromkin et al. (2011) note the disproportionate amount of sexual and abusive words related to the female body relative to the male body. These particular classifications motivated our current design to include an aggregated set of semantic predictors of tabooness that included gender (male and female), body parts and products, body acts and processes, religious and spiritual terms, disease, and mental and somatic state terms. Additionally, we included socioeconomic status and monetary terms as exploratory measures.

Structural features of English taboo words

In addition to semantic distinctiveness, taboo words are marked by constraints on phonological form and morphosyntactic use. Taboo English words often flexibly assume different grammatical classes, metaphorical usage, infixation, compounding, and lexical hybridization (Bergen, 2016; Jay, 2009). Many believe that taboo words are also marked by word length (i.e., “four-letter words”). This word-length effect could derive from the prediction of Zipf’s Law that highly frequent words spontaneously shorten over time (e.g., automobile→car). Confirmation of this phenomenon is, however, challenging, given that most word-frequency corpora based upon news and subtitles underestimate the prevalence of taboo words (Brysbaert & New, 2009; Janschewitz, 2008; Jay, 1980) and that highly taboo words represent only a small subset of the English lexicon.

Bergen (2016) noted a propensity among taboo words to manifest sound-symbolic features such as closed-syllable structures (e.g., consonant-vowel-consonant) and the presence of more stop consonants (e.g., cock, shit, dick) than would be predicted by chance within a random sample of words. Specifically, phonaesthesia is a potential driver of taboo word structure. Many taboo terms (e.g., cunt, shit, twat, fuck) are composed of sequences of hard consonants nested within short, closed syllable structures, conferring a subjectively unpleasant sound structure that colors their referents as analogously foul.

What imbues taboo words with tabooness? A hypothesis

Janschewitz (2008) advanced the hypothesis that tabooness is an emergent property of negative valence and high physiological arousal. Here we offer a nuanced perspective on Janschewitz (2008), arguing for moderating roles of phonology and semantic category. Figure 1 illustrates the relationship between arousal, valence, and tabooness for a corpus of 1,195 English words. Highly taboo words tend to cluster at the higher end of arousal and negative valence. There are, however, exceptions to this trend, including words that denote socially stigmatized concepts (e.g., cancer), as well as taboo words that are not particularly negatively valenced or highly arousing (e.g., damn).

Fig. 1

Valence by arousal scatterplot. Note: Each data point reflects one of 1,206 English words plotted on a two-dimensional plane bounded by arousal (x-axis) and valence (y-axis). Red asterisks reflect words rated z > 1.96 on tabooness (see Method).
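The z > 1.96 criterion used to mark highly taboo words can be illustrated with a minimal sketch. It assumes that z-scores are computed against the mean and population standard deviation of all tabooness ratings in the corpus; the source does not state which standard deviation was used, and the function name and input format are hypothetical.

```python
from statistics import mean, pstdev

def flag_highly_taboo(ratings, z_cutoff=1.96):
    """Return the set of words whose tabooness z-score exceeds z_cutoff.

    ratings: dict mapping word -> mean tabooness rating (hypothetical format).
    Assumption: z-scores use the population SD across all rated words.
    """
    values = list(ratings.values())
    mu = mean(values)
    sd = pstdev(values)
    return {word for word, r in ratings.items() if (r - mu) / sd > z_cutoff}
```

With a corpus in which most words sit near the floor of the scale, only words far into the upper tail are flagged, which is the subset plotted with red asterisks.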

It is challenging to confirm causal claims regarding the direction of effect for valence/arousal and tabooness. Words could assume negative valence and high arousal because their use is forbidden. Alternatively, negative valence and high arousal could lead to words becoming forbidden. One way to isolate this temporal relation would be through an analysis of etymology and historical word usage. Consider, for example, the arc of a word such as abortion. A transition from socially acceptable to taboo over time along with a punctuated increase in the frequency of usage would support the Janschewitz (2008) hypothesis. Although such a historical linguistic analysis is beyond the scope of the current investigation, we revisit this hypothesis as a future direction in the General Discussion.

Compounding as a source of new taboo words

Compounding represents a novel source of tabooness in English. A recent example of this phenomenon occurred in February 2017 during heated political discourse, when Pennsylvania State Senator Daylin Leach challenged US President Donald Trump by tweeting, “Why don't you try to destroy my career you fascist, loofa-faced, shit-gibbon!” This distinctive insult garnered the interest of both the popular press and language researchers (Tessier & Becker, 2018). In a follow-up article in Slate, Zimmer (2017) traced the etymology of shitgibbon to writer David Quantic’s critiques of British pop music in the 1980s. Zimmer’s article specifically highlighted the unanswered question of why certain compounds such as shitgibbon are so effective. We hypothesize that word form and meaning interact in taboo words. In the two experiments that follow, we examined factors that predict tabooness for single words and the quality of novel taboo compounds.

Experiment 1: Prediction of tabooness for single words

Method

Participants

We enlisted 190 participants from Mechanical Turk (Amazon Inc.) to provide subjective ratings of tabooness and concreteness for subsets of the corpus (i.e., each participant rated approximately 500 words), including only experienced raters with Master-Level status. Sex distribution was roughly comparable (93F/97M), with a mean age of 38.56 years (SD = 10.05, range = 22–66). We excluded participants (n = 11) who failed to complete > 70% of the survey, or who completed the survey more than 2 standard deviations faster or slower than the mean duration. Participants provided electronic informed consent and were forewarned that they would encounter offensive terms during the course of the experiment.
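The two exclusion rules described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' script: the record keys (`id`, `completion`, `duration`) are hypothetical, and the mean and standard deviation of survey duration are assumed to be computed over all participants before any exclusion.

```python
from statistics import mean, stdev

def keep_participants(records, min_completion=0.70, sd_cutoff=2.0):
    """Return ids of participants retained under both exclusion rules.

    records: list of dicts with hypothetical keys
      'id'         - participant identifier
      'completion' - proportion of survey items answered (0-1)
      'duration'   - time taken to finish the survey, in seconds
    """
    durations = [r["duration"] for r in records]
    mu, sd = mean(durations), stdev(durations)  # sample SD (assumption)
    kept = []
    for r in records:
        # Rule 1: must have completed more than 70% of the survey.
        completed_enough = r["completion"] > min_completion
        # Rule 2: duration within 2 SD of the mean, in either direction.
        typical_speed = abs(r["duration"] - mu) <= sd_cutoff * sd
        if completed_enough and typical_speed:
            kept.append(r["id"])
    return kept
```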

Stimuli and corpus development

We first compiled a base corpus of high-frequency English words by querying the SUBTLEX word frequency database (Brysbaert & New, 2009), applying a minimum frequency threshold of > 5 per million words. We then cross-referenced the list with concreteness ratings from Brysbaert et al. (2013), eliminating all entries without published concreteness values. SUBTLEX contains relatively few taboo entries. We supplemented this corpus with stimuli drawn from the Janschewitz (2008) corpus and an additional set of socially stigmatized, high-arousal terms (e.g., welfare) generated by laboratory members. After concatenating the three corpora, we eliminated repeated items and obscure entries (e.g., British English slang). These trimming procedures yielded 1,194 total words that we subsequently coded on numerous psycholinguistic variables.
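The automated parts of this pipeline (frequency threshold, concreteness cross-reference, concatenation, and de-duplication) can be sketched as below. The input structures are hypothetical stand-ins for exports of the named databases, and the manual removal of obscure entries is not modeled.

```python
def build_corpus(freq_per_million, concreteness, supplements, min_freq=5.0):
    """Assemble a de-duplicated word list from a base corpus and supplements.

    freq_per_million: dict word -> frequency per million (hypothetical export)
    concreteness:     dict word -> published concreteness rating
    supplements:      iterable of word lists appended to the base corpus
    """
    # Base corpus: frequent words that also have a concreteness rating.
    base = [w for w, f in freq_per_million.items()
            if f > min_freq and w in concreteness]
    seen, corpus = set(), []
    for word in base + [w for words in supplements for w in words]:
        key = word.lower()
        if key not in seen:  # eliminate repeated items across the corpora
            seen.add(key)
            corpus.append(key)
    return corpus
```

Supplementary words bypass the frequency filter in this sketch, on the assumption that the taboo and stigmatized terms were added precisely because they are under-represented in frequency norms.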

Corpus coding and data analysis

The dependent variable in the multiple regression was tabooness as rated by Mechanical Turk participants on a 1–9 Likert scale (see Footnote 1). We predicted tabooness from a linear combination of 23 variables. Table 1 outlines each of the predictors, which varied on scale (continuous vs. categorical), category (semantic vs. phonological), and subjectivity (objective letter counts vs. subjective affective norms).

Table 1 Predictors of tabooness

Phonological measures were derived using a Python script that queried syllabification from Merriam-Webster’s online dictionary and converted each word to Klattese, a machine-readable version of the International Phonetic Alphabet. Items that the script transcribed incorrectly were transcribed manually.

We obtained valence and arousal ratings from the Warriner, Kuperman, and Brysbaert (2013) database, assigning missing values (N = 47) by adapting Warriner et al.’s rating scale instructions and administering to MTurk raters (N = 50). We manually derived an interaction term by multiplying the valence and arousal ratings for each word. Concreteness ratings were obtained from Brysbaert et al. (2014) with missing values imputed using MTurk rater responses using adapted scale instructions (see Online Supplemental Material). We manually coded total number of morphemes per word. We obtained word frequency values and dominant part of speech using SUBTLEX. Polysemy values were then obtained from WordNet using dominant part of speech. Eight items were missing from SUBTLEX and WordNet (Miller, 1995). We determined part of speech and senses for these items (e.g., spaz, bro, jism) by cross-referencing Wiktionary.org.

We marked dichotomous semantic distinctions (e.g., female vs. other) using two rounds of coding. In the first round, the authors convened and nominally coded each word’s semantic category by consensus (see Footnote 2). A second round of confirmatory coding was completed by blinded raters (N = 5). Rater agreement was 84.74%. We eliminated ten words with limited rater agreement between coding rounds.

Statistical design and data analysis

We first checked parametric linear regression assumptions within the original set of predictors using the “car” package of R (Fox et al., 2018). All assumptions were violated. Consequently, we completed factor reduction using a two-part procedure. We first conducted a Principal Components Analysis (PCA) to determine the optimal number of latent factors. Using an eigenvalue threshold of > 1, we specified six orthogonal latent factors. Within these factors, we employed a threshold of ± .30 as the minimum correlation for group membership. We extracted the factor loadings as new variables and entered all factors and lone variables as orthogonalized predictors in a standard parametric least-squares multiple regression predicting tabooness. Table 2 reflects the varimax-rotated factor matrix.
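The component-retention and group-membership rules can be illustrated with a short NumPy sketch. It assumes the PCA is run on the correlation matrix of the continuous predictors, and it omits the varimax rotation step, so the loadings it returns are unrotated. Function names are hypothetical, and the analysis reported here was carried out in R, not with this code.

```python
import numpy as np

def pca_loadings(X, eig_cutoff=1.0):
    """Return all eigenvalues and the unrotated loadings of retained components.

    X: array of shape (n_words, n_predictors).
    Components are retained when their eigenvalue exceeds eig_cutoff.
    """
    R = np.corrcoef(X, rowvar=False)            # predictor correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > eig_cutoff
    # Loadings are the correlations between predictors and components.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings

def members(loadings, min_loading=0.30):
    """List, per component, the predictor indices with |loading| >= min_loading."""
    return [np.flatnonzero(np.abs(loadings[:, k]) >= min_loading).tolist()
            for k in range(loadings.shape[1])]
```

Because the eigenvalues of a correlation matrix sum to the number of predictors, a threshold of 1 retains only components that explain more variance than a single standardized predictor would on its own.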

Table 2 Factor loadings based on Principal Components Analysis

Factor one (“word length”) reflects a linear combination of number of phonemes and number of syllables. Factor two (“emotion*arousal”) represents valence and the arousal-by-valence interaction term. Factor three (“concreteness”) reflects perceptual salience, which tends to be higher for nouns than verbs. Factor four (“arousal”) represents arousal. Factor five (“syllable structure”) represents a combination of consonant clusters, presence of codas, ratio of stop consonants per syllable, and total number of syllables per word. Factor six (“obstruence”) represents the phonetic distinction of cessation of airflow created by the articulation of stops and fricatives.

Results

Table 3 summarizes the linear model predicting tabooness from a combination of 14 decorrelated factors. Figure 2 displays a correlation plot reflecting all bivariate relations among predictors. The overall model was statistically significant, accounting for 43% of the variance of tabooness (r2 = .43, adjusted r2 = .43, p < .001). Individual predictors ordered by their weighted contribution to the model include: Emotion*Arousal (factor 2), Male, Arousal (factor 4), Female, Body Parts, Concreteness (factor 3), Disease, Body Acts, and Obstruence (factor 6). Included in Table 3 is the relative weight of each predictor calculated using the “lmg” method as implemented within the R package “relaimpo” (Groemping & Matthias, 2018).

Table 3 Tabooness linear regression results
Fig. 2

Correlation plots. Note: Non-significant bivariate correlations in panel A are presented as blank cells. In panel B, the Pearson correlation approximates point biserial correlation and was used to represent the bivariate associations

Interim discussion

Tabooness was moderately predicted (r2 = .43) from a linear combination of nine variables. Arousal and valence contributed the greatest weight to the model, although a range of additional predictors were also statistically significant. Taboo words are slightly more abstract than concrete and more often connote body parts, bodily acts, gender, and/or disease. Obstruence was the only statistically significant word form/phonological predictor of tabooness, accounting for minimal variance (lmg < 0.01). There was no evidence for contributions of syllable structure or word length to support intuitions of phonaesthesia or the “four-letter-word” designation. One possible reason for the lack of an observed word length effect is the inclusion of compound words (e.g., cocksucker). We examine this specific subset of the taboo lexicon further below.

Experiment 2: Interactivity between form and meaning in taboo compounding

We examined a potential source of emergent tabooness when combining extant taboo words (e.g., shit) with common nouns (e.g., gibbon) to form novel compounds (e.g., shitgibbon). Participants evaluated the subjective quality of novel taboo and common noun combinations via Likert-scale ratings (e.g., ass-rocket [plausible] vs. ass-arm [implausible]). We examined the quality of novel taboo compound words when participants made unconstrained judgments (i.e., rate the extent to which this word combines with any curse word to form a new curse word), and an exploratory measure of combinations to specified anchors (i.e., fuck, shit). We then conducted a multiple regression to examine prediction of the quality of taboo-word compounding.

Method

Participants

Participants included a combination of neurotypical young adults (n = 25) from Temple University who completed the study in the laboratory, supplemented with MTurk raters (n = 115). For the final analyses, we excluded 17 participants who showed minimal variability in their responses (see rationale to follow). Participants completed ratings for the quality of common nouns combined with any possible taboo word (i.e., unconstrained; see Footnote 3). The final sample included 87 adults with a mean age of 35.6 (SD = 8.5, range = 18–46); sex distribution was 45F/42M. Participants were nominally financially compensated and provided written informed consent. Participants were additionally forewarned that they would be making judgments of taboo words.

Stimuli

We initially searched the MRC Psycholinguistic Database (Coltheart, 1981) for common nouns, limiting output by part-of-speech (i.e., noun only) with a minimum concreteness threshold of > 560 on a 100–700 scale. From this initial list (N = 1,026) we eliminated homophones, polysemes, and low-frequency nouns using a frequency threshold of 5 per million via the SUBTLEXUS database (Brysbaert & New, 2009). Finally, we eliminated common nouns that form existing taboo compounds (e.g., hat, hole, head). These procedures netted a corpus of frequent and concrete English nouns (N = 487). Table 4 reflects psycholinguistic attributes of the stimulus set. This corpus is freely available for use via the Open Science Framework (https://osf.io/uc8k4).

Table 4 Psycholinguistic attributes of combinatorial noun corpus (N = 480)

Experimental procedures

Participants evaluated plausibility of each common noun (e.g., door) as part of a taboo compound (e.g., assdoor). We obtained these ratings via Qualtrics software (Qualtrics Inc., Provo UT) using a Likert-scale format. In the primary analysis, participants rated the quality of novel taboo words with no specified comparison anchor. That is, participants were free to choose any taboo word they felt paired well with a given common noun in either initial (e.g., assdoor) or final position (e.g., doorass). Likert scales ranged from 1 (very poor) through 4 (neutral) to 7 (outstanding) (see Footnote 4). Stimuli were fully randomized. The software automatically prompted participants to complete all choices. Participants were given unlimited time to complete the survey, with most completing within 20 min.

Corpus coding and data analysis

Phonological predictors included length in letters, length in syllables, and orthographic neighborhood density (Marian, Bartolotti, Chabal, & Shook, 2012). Two independent raters coded the following phonological variables: syllable complexity (ratio of syllables with consonant clusters to total number of syllables), number of closed syllables (similarly normalized for length), and consonant obstruence (ratio of the total number of stop and affricate consonants to length in syllables). Rater agreement was 96.8% for the phonological predictors. Two additional independent raters coded membership in the following semantic categories: animate, manufactured artifact, receptacle, body part, vehicle, human dwelling, profession. Initial rater agreement was 95.5%. Raters then reconvened and resolved item-level disagreements.
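The three rater-coded ratios defined above can be made concrete with a small sketch. It assumes each word is supplied as a list of syllables, each syllable a list of phoneme symbols. The ASCII symbol inventory below is a hypothetical simplification and not the Klattese coding used in Experiment 1, and in the study these variables were coded by human raters rather than computed.

```python
# Hypothetical, simplified phoneme inventory (assumption, not Klattese).
STOPS = {"p", "b", "t", "d", "k", "g"}
AFFRICATES = {"ch", "jh"}
VOWELS = {"a", "e", "i", "o", "u", "ae", "ah", "ih", "uh", "iy", "ey", "ow"}

def phonological_ratios(syllables):
    """Compute the three length-normalized phonological predictors.

    syllables: list of syllables, each a non-empty list of phoneme symbols.
    """
    n = len(syllables)
    # A syllable is 'complex' if it has two adjacent consonants.
    complex_count = sum(
        any(a not in VOWELS and b not in VOWELS for a, b in zip(s, s[1:]))
        for s in syllables
    )
    # A syllable is 'closed' if it ends in a consonant.
    closed_count = sum(s[-1] not in VOWELS for s in syllables)
    # Obstruence counts stops and affricates across the whole word.
    obstruents = sum(p in STOPS or p in AFFRICATES for s in syllables for p in s)
    return {
        "syllable_complexity": complex_count / n,
        "closed_syllables": closed_count / n,
        "obstruence": obstruents / n,
    }
```

Under this coding, a monosyllable such as sack (s-ae-k) is fully closed with one stop per syllable, whereas physician has no stops or affricates and only one closed syllable out of three.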

Stimuli with ratings for valence, dominance, and physiological arousal drawn from Warriner, Kuperman, and Brysbaert (2013) were included as semantic predictors.

We conducted an item-level multiple regression with each word as an independent observation. The dependent variable was quality of emergent profanity as gleaned from the average rating for each item across participants.
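Deriving the item-level dependent variable amounts to averaging each noun's ratings across participants. The sketch below assumes a nested dictionary of responses (a hypothetical format) and treats "restricted variability" as a participant giving the same rating to every item, which is one possible reading of the exclusion rule.

```python
from statistics import mean

def item_means(ratings, drop_flat=True):
    """Average each noun's quality rating across retained participants.

    ratings: dict participant_id -> dict noun -> rating on the 1-7 scale.
    drop_flat: if True, skip participants who used a single rating value
               throughout (assumed operationalization of low variability).
    """
    by_item = {}
    for responses in ratings.values():
        if drop_flat and len(set(responses.values())) <= 1:
            continue
        for noun, rating in responses.items():
            by_item.setdefault(noun, []).append(rating)
    return {noun: mean(values) for noun, values in by_item.items()}
```

Each resulting mean is one observation in the regression, so the number of observations equals the number of nouns and not the number of participants.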

Results

We eliminated data from participants who completed the survey with restricted variability (e.g., all 1’s or 4’s) and/or too rapidly (n = 17). Table 5 summarizes results of the regression.

Table 5 Multiple regressions predicting quality of combinatorial taboo words

The model was statistically significant [F(20,460) = 8.59, p < .001], accounting for 24.03% of the variance in the quality ratings (R2 = .27, adjusted R2 = .24). Statistically significant phonological predictors included syllables-per-word (B = -.22, p < .05) and consonant obstruence (B = .18, p < .01), confirming that participants judged shorter words with more stop consonants as better candidates for novel taboo terms. In addition to word form, participants were sensitive to semantic variables including emotional valence (B = -.13, p < .01), physiological arousal (B = .12, p < .01), body part (B = .28, p < .05), receptacle (B = .23, p < .05), animacy (B = .66, p < .01), and profession (B = -.79, p < .01).
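The relation between the raw and adjusted variance estimates can be checked with a short calculation. The sketch assumes that the residual degrees of freedom in the reported F test (460) equal n − p − 1 with p = 20 predictors, which implies n = 481 items; that value is inferred from the test statistic and is not stated directly in the text.

```python
def adjusted_r2(r2, n_items, n_predictors):
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1 - (1 - r2) * (n_items - 1) / (n_items - n_predictors - 1)

def f_statistic(r2, df_model, df_resid):
    """Overall F statistic for a regression, computed from R-squared."""
    return (r2 / df_model) / ((1 - r2) / df_resid)

# Values taken from the reported model; n = 481 is an inferred assumption.
adj = adjusted_r2(0.27, n_items=481, n_predictors=20)
f_val = f_statistic(0.27, df_model=20, df_resid=460)
```

Rounded to two decimals, `adj` reproduces the reported adjusted value of .24, and `f_val` lands close to the reported F once rounding of the two-decimal R2 is taken into account.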

The five strongest candidates for taboo compounding per rated quality included: sack, trash, pig, rod, and mouth. The five least acceptable candidates were fireplace, restaurant, tennis, newspaper, and physician.

Interim discussion

English is rife with taboo terms formed through combinatorial processes with religious terms (e.g., goddamn) and other extant taboo words (Hughes, 1998; Mohr, 2013). In this experiment, we investigated this idiosyncratic propensity for common noun compounding. There are many such examples in common usage today (e.g., shithead, asshat, clusterfuck), and compounding appears to be a legitimate source of new words. We explored why some common nouns form effective new curse words (e.g., shithead), whereas others (e.g., shitarm) do not. It has been suggested that taboo words tend to denote negative concepts while simultaneously having phonological structures that mark sound-symbolic patterns of aggressive and/or unpleasant sounds (Bergen, 2016). We hypothesized that both of these factors (form and meaning) interact to predict the quality of emergent taboo speech, and this was indeed the case.

The data suggest that taboo compounding is a non-random process and that the quality of novel taboo compounding is to an extent predictable by a simple linear model. This compounding process did, however, differ in several important respects relative to the single-word regression data in Experiment 1. First, participants endorsed shorter words, words with many similar-sounding neighbors, and words with higher levels of obstruence (i.e., abrupt stoppage of air during articulation) as superior candidates for taboo compounding. Second, prediction was optimized by a linear combination of these formal factors with semantic variables such as whether a word denoted a profession, dwelling, or receptacle.

It is unclear how the linguistic rules governing taboo word compounding (e.g., catdick) diverge from non-taboo compounding (e.g., catfish). We know of no previous neuropsychological reports of excessive cursing characterized by either the production or the spontaneous generation of noun compounds. Morphological decomposition of non-taboo compound words (cat + fish = catfish) is a well-studied phenomenon in language disorders such as aphasia (Rastle & Davis, 2008). However, the extent to which the constituent morphemes of taboo compound words are similarly dissociable remains an open question.

General discussion

We conducted two experiments examining whether tabooness can be algorithmically predicted from the form and meaning of a particular word. The data suggest that American English follows a recipe for tabooness both for single words and to a lesser extent for compound words. Several factors are strongly associated with tabooness. These include physiological arousal and negative emotional valence, as well as semantic factors such as gender, body relations, and disease. There was less evidence for an effect of word length among single words, possibly because of the inclusion of a diverse range of compound words (e.g., cocksucker, motherfucker), all of which counter viability of the “four-letter-word” phenomenon.

Following Janschewitz (2008), we focused on an interaction between high arousal and negative emotional valence. These factors do appear to play a deterministic role in predicting tabooness, but there exists a range of additional moderating variables. Figure 1 illustrates overlap between taboo words and non-taboo words that share space within the negative-valence and high-arousal quadrant. Words such as welfare, abortion, and sodomy have all the necessary ingredients for tabooness, and indeed appear as some of the more taboo terms in our distribution. Yet, unlike words that are universally regarded as taboo, this particular class of descriptive terms is acceptable within certain public settings (e.g., scientific and/or instructional discourse). By tracing etymology and usage statistics over time, it may be possible to observe an arc as negatively valent and highly arousing words shift from descriptive to taboo.

Concluding remarks and future directions

Our findings suggest several promising future directions for the psychology and neurology of taboo-word processing. One application involves populations who experience excessive and/or uncontrolled taboo-word usage as a result of neurological etiologies, including severe expressive aphasia, Tourette Syndrome, traumatic brain injury (TBI), and the behavioral variant of frontotemporal degeneration (bvFTD). There are some commonalities but also many differences in the respective neuropathologies that underlie coprolalia and the excessive use of profanity in these populations. Hemispheric differences in valence, arousal, propositional/non-propositional language representation, theory of mind, and general cognitive control are all possible etiologies of excessive taboo-word usage. These variables likely interact with premorbid individual differences in both receptive and expressive use of taboo words. For people who experience debilitating social consequences of uncontrolled taboo-word usage, algorithmic prediction of tabooness may hold promise for tailoring intervention. Rather than punish or prohibit the output, a focus on precipitating factors (e.g., modulating arousal, sensitivity to listener attitudes) may improve communicative outcomes.

Open Science Statement

Stimuli, computer scripts, and scale wording are freely available for download and use at https://osf.io/uc8k4