An inclusive, real-world investigation of persuasion in language and verbal behavior

Abstract

Linguistic features of a message necessarily shape its persuasive appeal. However, studies have largely examined the effect of linguistic features on persuasion in isolation and do not incorporate properties of language that are often involved in real-world persuasion. As such, little is known about the key verbal dimensions of persuasion or the relative impact of linguistic features on a message’s persuasive appeal in real-world social interactions. We collected large-scale data of online social interactions from a social media website in which users engage in debates in an attempt to change each other’s views on any topic. Messages that successfully changed a user’s views are explicitly marked by the user themselves. We simultaneously examined linguistic features that have been previously linked with message persuasiveness between persuasive and non-persuasive messages. Linguistic features that drive persuasion fell along three central dimensions: structural complexity, negative emotionality, and positive emotionality. Word count, lexical diversity, reading difficulty, analytical language, and self-references emerged as most essential to a message’s persuasive appeal: messages that were longer, more analytic, less anecdotal, more difficult to read, and less lexically varied had significantly greater odds of being persuasive. These results provide a more parsimonious understanding of the social psychological pathways to persuasion as it operates in the real world through verbal behavior. Our results inform theories that address the role of language in persuasion, and provide insight into effective persuasion in digital environments.

Introduction

Understanding persuasion (how people can fundamentally alter the thoughts, feelings, and behaviors of others) is a cornerstone of social psychology. Historically, social influence has been exceptionally difficult to study in the real world, requiring researchers to piece together society-level puzzles either in the abstract [1] or through carefully crafted field studies [2]. In recent years, technology has driven interest in studying social influence as digital traces make it possible to study how the behaviors of one individual or group cascade to change others’ behaviors [3, 4]. Nevertheless, most social processes are so complex that they are very difficult to study as they operate outside of the lab. The availability of digital data and computational techniques now provides a ripe opportunity to begin understanding the precise mechanisms by which people influence the thoughts and feelings of others.

Today, persuasion is often transacted, partially or wholly, through verbal interactions that take place on the internet [5]: a message is transmitted from one person to another through the use of language, altering the recipient’s attitude. As such, researchers have sought to identify linguistic features (Footnote 1) that are linked to a message’s persuasive appeal. A relatively sizable number of linguistic features that are important in message persuasiveness have emerged from this body of research, including features that indicate what a message conveys as well as how it is conveyed (Table 1). Models of persuasion, such as the Elaboration Likelihood Model (ELM) [6], have been used to identify these linguistic features and explain how they affect message persuasiveness.

Table 1 Summary of linguistic features and predictions

Despite the impressive corpus of studies to date, the existing literature has several limitations. Studies have largely examined the effect of linguistic features on persuasion in isolation by only focusing on a small number of linguistic features (i.e., one or two) at a time. While this body of literature has collectively identified a relatively sizable number of linguistic features that are linked to message persuasiveness, it remains unclear how these links, taken together, inform the social aspects of verbal behavior in persuasion. In other words, what do the linguistic features connected with message persuasiveness reveal about the key verbal behaviors involved in persuasion? As language provides “a rich stream of ongoing social processes” [7], synthesizing these findings can provide a more complete understanding of the social psychological pathways to persuasion.

In the same vein, real-world messages are constructed using a varied combination of linguistic features to transmit complex thoughts, emotions, and information to others. Nevertheless, studies tend to examine how a single linguistic feature (or a small set of features) correlates with persuasion without taking into account other potentially important linguistic features within a given message [8, 9]. The meaning of a given word or feature in any text depends on the context in which it is used, which can be inferred from the words and features that surround it [10, 11]. As such, the effect of any particular linguistic feature on message persuasiveness can be attenuated by the presence of other features in the message. Because features are typically studied in isolation, little is known about the relative impact of linguistic features on a message’s persuasive appeal.

Furthermore, studies that examine the effect of linguistic features on persuasion tend to focus on persuasion in terms of engaging in specific behaviors [3, 12, 13, 14] rather than changing attitudes in general. Persuading people to engage in a specific behavior is conceptually distinct from changing people’s attitude on a topic. Changes in behavior can facilitate changes in attitude, but they can also depend on attitude change (e.g., an individual may not engage in behavior change unless they believe that the behavior will result in a desirable outcome). Moreover, a change in behavior does not always indicate that attitude change has occurred (e.g., an individual may decide to ultimately receive the COVID-19 vaccine because their employer requires it and not because their views regarding vaccines have changed) [15].

Finally, many studies that investigate the effect of linguistic features on persuasion are conducted in controlled lab settings [16, 17] due to the sheer difficulty of studying persuasion as it unfolds in the real world. Given that persuasion often takes place through online social interactions [5], there is a need to study persuasion in this setting. Doing so also enables researchers to better understand how digital environments influence the process of persuasion, especially as digital environments are now progressively constructed to influence the attitudes and behaviors of users [18] and there is “little consensus on how to persuade effectively within the digital realm” [19].

We sought to address these limitations in the current study. Specifically, we collected large-scale data from r/ChangeMyView, an online public forum on the social media website Reddit where users engage in debates in an attempt to change each other’s views on any topic. Most importantly, messages that successfully changed a user’s views are explicitly marked by the user themselves. That is, individuals are exposed to several messages and explicitly identify the message(s) that actually changed their views. We simultaneously examined linguistic features that have been previously linked with message persuasiveness (Table 1), comparing persuasive and non-persuasive messages, to address the following research questions:

  1. What are the key linguistic dimensions of persuasion? Given that a relatively sizable number of linguistic features have been linked with persuasion, we first sought to determine whether these features could be meaningfully reduced to a smaller number of dimensions representing the key verbal processes of persuasion. We then assessed whether these dimensions were uniquely predictive of persuasion when controlling for the effects of the remaining dimensions.

  2. Which individual linguistic features, when assessed simultaneously, are the most essential and relevant to a message’s persuasive appeal? We then simultaneously assessed all linguistic features that have been linked with message persuasiveness in a single model to examine the relative impact of the features on a message’s persuasive appeal to identify features that were most crucial to message persuasiveness.

While theory-driven predictions can be made regarding how each linguistic feature relates to persuasion, there has been a considerable amount of variability across studies in terms of which features positively or negatively relate to persuasion, as well as studies that show mixed or inconclusive results pertaining to the effect of a given linguistic feature on persuasion (see Table 1). Given that our primary goal was to obtain a more unified understanding of the social psychological pathways to persuasion via language, the current study is guided by a jointly data-driven and exploratory approach, with results informing our understanding of the directional relationship between the linguistic features and message persuasiveness. Overall, assessing the interplay between important linguistic features on persuasion using large-scale, real-world data helps inform theories, such as ELM, that address how linguistic features influence persuasion, and provides a parsimonious and ecologically valid understanding of the social psychological processes that shape persuasion.

Although some previous studies have used r/ChangeMyView data to investigate the effect of linguistic features on persuasion, they differ from the current investigation in important ways. The types and combinations of linguistic features that have been examined vary across studies and typically feature a mix of linguistic features that have and have not been linked to persuasion. For example, Tan et al. [21] examined how some persuasion-linked linguistic features (including arousal, valence, reading difficulty, and hedges), some non-persuasion-linked features (e.g., formatting features such as use of italics and boldface), and interaction dynamics (e.g., the time a replier enters a debate) were associated with successful persuasion. Wei et al. [22] investigated how surface text features (e.g., reply length, punctuation), social interaction features (e.g., the number of replies stemming from a root comment), and argumentation-related features (e.g., argument relevance and originality) related to persuasion. Musi et al. [23] assessed the distribution of argumentative concessions in persuasive versus non-persuasive comments, and Priniski and Horne [24] examined persuasion through the presentation of evidence only in sociomoral topics. Moreover, studies tend to have greater emphasis on model building to accurately detect persuasive content online rather than interpretability and a more unified understanding of the social psychological pathways to persuasion via language. For instance, Khazaei et al. [20] assessed how all LIWC-based features varied across persuasive and non-persuasive replies and used this information to train a machine learning model to identify persuasive responses.

Method

Data collection

We used data from the Reddit sub-community (i.e., “subreddit”) r/ChangeMyView, a forum in which users post their own views (referred to as “original posters”, or “OPs”) on any topic and invite others to debate them. Those who debate the OP (referred to as “repliers”) reply to the OP’s post in an attempt to change the OP’s view. The OP will award a delta (∆) to particular replies that changed their original views.

Using data from r/ChangeMyView presents several advantages. All replies in r/ChangeMyView are written with the purpose of persuasion. The replies that successfully change an OP’s view are explicitly marked by the OP themselves, allowing for a sample of persuasive and non-persuasive replies. All OPs and repliers must adhere to the official policies (Footnote 2) of r/ChangeMyView. For instance, OPs are required to explain at a reasonable length (using 500 characters or more) why they hold their views and to interact with repliers within a reasonable time frame. Replies must be substantial, adequate, and on-topic. Because these policies are enforced by moderators, the resulting interactions are high in quality [21] and are conducted under similar conditions with similar expectations. OPs can also post their view on any topic, allowing for an examination of persuasion across a wide variety of topics.

All top-level replies (direct replies to the OP’s original statement of views) posted between January 2013 and October 2018 were initially collected from the Pushshift database [25]. We focused only on the top-level replies and omitted any additional replies that were in response to a direct reply (i.e., a direct reply’s “children”). This ensured that replies were deemed persuasive because of their contents and not because of any resulting “back-and-forth” interactions, given that deltas can also be awarded to downstream replies. We also omitted any top-level replies that were made by a post’s OP and any replies that received a delta in which the delta was not awarded by the OP. Because the data contained a substantially greater number of non-persuasive replies (99.39%) than persuasive ones, analyses were conducted on a balanced subsample that included all top-level replies that were awarded a delta and a random subsample of top-level replies that were not awarded a delta, drawn from the original posts in which at least one delta was awarded. This allowed us to compare the persuasive and non-persuasive replies from the same original post while bypassing issues associated with class imbalances [26].

As an example, consider a parent post that garnered two top-level replies that were awarded a delta, and three top-level replies that were not awarded a delta. In this case, the two top-level replies that were awarded a delta were included in the subsample and two out of the three top-level replies that were not awarded a delta would be randomly selected for inclusion in the subsample. Using the random number generator in Microsoft Excel, the 3 top-level replies that were not awarded a delta were assigned a random number between 1 and 100. Replies with the lowest two values were then selected for inclusion in the subsample. Parent posts almost always contained a greater number of top-level replies that were not awarded a delta than top-level replies that were awarded a delta. However, for the very few instances in which a parent post contained a greater number of top-level replies that were awarded a delta than top-level replies that were not awarded a delta, we included all top-level replies in the subsample (N = 9020 top-level replies; n = 4515 top-level replies that were awarded a delta; n = 4505 top-level replies that were not awarded a delta). Example persuasive and non-persuasive replies can be found in Table 2.
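The per-post matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the record fields (`post_id`, `reply_id`, `delta`) and the function name are hypothetical, and `random.sample` stands in for the Excel step of assigning random numbers and keeping the lowest values.

```python
import random
from collections import defaultdict

def balanced_subsample(replies, seed=0):
    """Match each delta-awarded reply with a randomly drawn non-delta reply
    from the same parent post. `replies` is a list of dicts with the
    hypothetical keys 'post_id', 'reply_id', and 'delta' (bool)."""
    rng = random.Random(seed)
    by_post = defaultdict(lambda: {"delta": [], "no_delta": []})
    for reply in replies:
        key = "delta" if reply["delta"] else "no_delta"
        by_post[reply["post_id"]][key].append(reply)

    sample = []
    for groups in by_post.values():
        persuasive, other = groups["delta"], groups["no_delta"]
        if not persuasive:
            continue  # keep only posts in which at least one delta was awarded
        sample.extend(persuasive)
        if len(other) <= len(persuasive):
            sample.extend(other)  # rare case: keep every non-delta reply
        else:
            sample.extend(rng.sample(other, len(persuasive)))
    return sample
```

Applied to the example above (two delta-awarded and three non-delta replies under one parent post), the function returns four replies, two of each kind.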

Table 2 Example replies

To gain an initial understanding of the types of topics that were raised for debate in the subreddit, we randomly selected 100 replies from the final dataset and manually coded their content. Six overarching topics emerged: legal and politics; race, culture, and gender; business and work; science and technology; behavior, attitudes, and relationships; and recreation. More information regarding debated topics can be found in the supplementary materials (Footnote 3).

Linguistic features

Prior to extracting linguistic features from our data, we conducted a cursory search of the psychological literature to identify prominent linguistic features reported to have a significant relationship with message persuasiveness in at least one published study. These linguistic features are listed in Table 1. Each reply in the r/ChangeMyView dataset was analyzed separately using Linguistic Inquiry and Word Count (LIWC) [27], which calculates the percentage-use of words belonging to psychologically or linguistically meaningful categories. We used LIWC to quantify word count, analytic thinking (analytical thinking formula = articles + prepositions - personal pronouns - impersonal pronouns - auxiliary verbs - conjunctions - adverbs - negations; relative frequencies are normalized within LIWC2015 to a 0-to-100 scale, with higher scores reflecting more analytical language and lower scores reflecting more informal and narrative-like language), the percentage-use of self-references (i.e., first-person singular pronouns, or “i-words”), and the percentage-use of certainty terms in each reply within our corpus. Dictionaries of terms that have been rated on emotionality (i.e., valence, arousal, and dominance; Footnote 4) from [28] were imported into LIWC to measure the percentage-use of language that scored high and low on valence, arousal, and dominance. A dictionary of hedges from [29] was also imported into LIWC to measure the percentage-use of hedges. Following [21], the use of examples was measured by occurrences of “for example”, “for instance”, and “e.g.”.
Language abstraction/concreteness was measured using the linguistic category model, with higher scores indicating higher levels of language abstraction and lower scores indicating lower levels of language abstraction (i.e., greater language concreteness; formula for calculation = [(Descriptive Action Verbs × 1) + (Interpretative Action Verbs × 2) + (State Verbs × 3) + (Adjectives × 4)]/(Descriptive Action Verbs + Interpretative Action Verbs + State Verbs + Adjectives)) [30]. Type-token ratio, the ratio between the number of unique words in a message and the total number of words in the given message [31], was used to measure lexical diversity, with higher scores indicating greater lexical diversity (type-token ratio formula = number of unique lexical terms/total number of words). Last, reading difficulty was measured via the SMOG Index, which estimates the years of education the average person needs to completely comprehend a piece of text (SMOG Index formula = 1.0430 × √(number of polysyllables × (30/number of sentences)) + 3.1291). Because a higher SMOG score indicates that more education is needed to comprehend a piece of text, higher reading difficulty scores represent text that is more difficult to read and lower scores represent text that is easier to read [32]. More information about these linguistic features and example replies that scored high and low on each linguistic feature are reported in the supplementary materials.
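To make the non-LIWC formulas above concrete, the following Python sketch implements the type-token ratio, the SMOG Index, the linguistic category model score, and the count of example markers. It is a minimal illustration rather than the authors' pipeline: the function names and the simple regular-expression tokenizer are assumptions, and the SMOG and abstraction functions take pre-computed counts because syllable counting and verb/adjective coding require tools not described here.

```python
import math
import re

def type_token_ratio(text):
    """Unique words divided by total words (assumed tokenizer: letters and apostrophes)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def smog_index(polysyllables, sentences):
    """SMOG = 1.0430 * sqrt(polysyllables * (30 / sentences)) + 3.1291."""
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291

def lcm_abstraction(dav, iav, sv, adj):
    """Weighted mean of category counts: DAV=1, IAV=2, SV=3, adjectives=4."""
    total = dav + iav + sv + adj
    return (dav * 1 + iav * 2 + sv * 3 + adj * 4) / total if total else 0.0

def count_examples(text):
    """Occurrences of 'for example', 'for instance', and 'e.g.'."""
    return len(re.findall(r"for example|for instance|e\.g\.", text.lower()))
```

For instance, "the cat saw the dog" has four unique words out of five, giving a type-token ratio of 0.8, and a text with equal counts in all four linguistic categories scores 2.5 on abstraction, the midpoint of the 1-to-4 range.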

Results

Given that a relatively sizable number of linguistic features have been linked with persuasion, we first determined whether these features could be meaningfully reduced to a smaller number of dimensions representing the key verbal processes of persuasion. Second, we determined whether these dimensions were each uniquely predictive of persuasion when controlling for the effects of the remaining dimensions. Third, we simultaneously assessed all linguistic features that have been linked with message persuasiveness in a single model to understand how linguistic features interact with one another to influence a message’s persuasive appeal and identify features most crucial to message persuasiveness. All data, analytic code, descriptive statistics, zero-order correlations between all variables, and complete analytic outputs for all analyses can be found in the supplementary materials.

To identify the key linguistic dimensions of persuasion (RQ 1), we submitted all linguistic features to a principal components analysis (PCA) with a varimax rotation. Bartlett’s Sphericity Test (p < 0.001) and the Kaiser–Meyer–Olkin metric (KMO = 0.55) suggested that our data were suitable for analysis. Features with absolute factor loadings greater than 0.50 were retained and used to quantify principal components. Three principal components were extracted that collectively accounted for 36.28% of the total variance: structural complexity, negative emotionality, and positive emotionality (see Table 3). Structural complexity had high loadings in the direction of lower lexical diversity, higher word count, and greater reading difficulty. Negative emotionality had high loadings in the direction of greater percentage-use of terms that scored low on valence and low on dominance. Positive emotionality had high loadings in the direction of greater percentage-use of terms that scored high on dominance, high on valence, and hedges.
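For readers unfamiliar with the rotation step, the sketch below shows a raw varimax rotation of an already extracted loading matrix using Kaiser's pairwise planar rotations, followed by the 0.50 retention rule. It is an assumption-laden illustration, not the authors' analytic code: it takes unrotated loadings as input (rows are features, columns are components), skips Kaiser row normalization, and all function names are hypothetical.

```python
import math

def varimax_criterion(L):
    """Sum over components of the variance of squared loadings."""
    p = len(L)
    total = 0.0
    for j in range(len(L[0])):
        sq = [row[j] ** 2 for row in L]
        total += sum(s * s for s in sq) / p - (sum(sq) / p) ** 2
    return total

def varimax(loadings, max_sweeps=100, tol=1e-10):
    """Rotate pairs of components until no rotation angle exceeds tol."""
    L = [list(row) for row in loadings]
    p, k = len(L), len(L[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                u = [r[i] ** 2 - r[j] ** 2 for r in L]
                v = [2 * r[i] * r[j] for r in L]
                A, B = sum(u), sum(v)
                C = sum(a * a - b * b for a, b in zip(u, v))
                D = 2 * sum(a * b for a, b in zip(u, v))
                # Angle that maximizes the criterion for this pair of columns.
                phi = 0.25 * math.atan2(D - 2 * A * B / p, C - (A * A - B * B) / p)
                largest = max(largest, abs(phi))
                c, s = math.cos(phi), math.sin(phi)
                for r in L:
                    r[i], r[j] = r[i] * c + r[j] * s, -r[i] * s + r[j] * c
        if largest < tol:
            break
    return L

def retained(loadings, names, cutoff=0.50):
    """For each component, list features whose absolute loading exceeds the cutoff."""
    return [[n for n, row in zip(names, loadings) if abs(row[j]) > cutoff]
            for j in range(len(loadings[0]))]
```

Because each step is an orthogonal rotation, every feature's communality (the sum of its squared loadings) is unchanged; only the distribution of loadings across components shifts toward a simpler structure.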

Table 3 Results of PCA with Varimax Rotation

To assess whether all three dimensions were uniquely important to message persuasiveness, we entered each component into a multilevel logistic regression analysis using lme4 [33]. This procedure corrects for non-independence of replies (i.e., replies to the same parent post) on the dependent variable: persuasion (delta awarded = 1, no delta awarded = 0). We included random intercepts for replies nested within parent posts and replies nested within repliers (i.e., some repliers provided replies to multiple original posts). All three components emerged as significant predictors of persuasion. For a one-unit increase in structural complexity, the odds of receiving a delta increase by a factor of 2.25, 95% CI [2.11, 2.39]. For a one-unit increase in negative emotionality, the odds of receiving a delta decrease by a factor of 0.89, 95% CI [0.85, 0.94]. For a one-unit increase in positive emotionality, the odds of receiving a delta also decrease by a factor of 0.92, 95% CI [0.88, 0.97]. Post-hoc power analyses conducted using the simr package in R (Version 1.0.5) [34] revealed that we had at least 96% power to detect a small effect (i.e., 0.15) for each of these factors on persuasion.
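The odds ratios above can be read as multiplicative changes in the odds per one-unit increase in a predictor. The short Python sketch below shows the standard conversion from a logistic regression coefficient to an odds ratio with a Wald interval, and from an odds ratio to a percent change in odds. The function names are hypothetical, and the coefficients and standard errors themselves are not given in the text, so only the reported odds ratios are used in the usage note.

```python
import math

def odds_ratio(beta, se, z=1.96):
    """Odds ratio and Wald 95% CI from a log-odds coefficient and its standard error."""
    return math.exp(beta), (math.exp(beta - z * se), math.exp(beta + z * se))

def percent_change_in_odds(ratio):
    """Percent change in the odds implied by an odds ratio (negative = decrease)."""
    return (ratio - 1.0) * 100.0
```

Under this reading, the reported odds ratio of 2.25 for structural complexity corresponds to 125% higher odds of receiving a delta per unit, while 0.89 and 0.92 correspond to roughly 11% and 8% lower odds for negative and positive emotionality, respectively.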

Next, the individual linguistic features were assessed simultaneously to identify those that were the most essential and relevant to a message’s persuasive appeal (RQ 2). A logistic least absolute shrinkage and selection operator (LASSO) regression was performed using glmmLasso [35]. A LASSO regression is a penalized regression analysis that performs variable selection to prevent overfitting by adding a penalty to the cost function (the sum of squared errors in linear regression; the negative log-likelihood in the logistic case). The penalty equals the sum of the absolute values of the coefficients multiplied by a shrinkage parameter (λ), and it results in sparse models with few nonzero coefficients. In other words, this method selects a parsimonious set of variables that best predict the outcome variable and has many advantages over other feature selection methods [36]. All linguistic features were entered into the LASSO regression model. A grid search was performed to identify the optimal shrinkage parameter based on BIC. Five features emerged with nonzero coefficients: word count, lexical diversity, reading difficulty, analytical thinking, and self-references (Table 4).
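The selection logic described above (an L1 penalty that drives some coefficients to exactly zero, with the shrinkage parameter chosen by BIC over a grid) can be illustrated with a small pure-Python sketch. It is deliberately simplified and is not a reimplementation of glmmLasso: it omits the random intercepts, uses plain proximal gradient descent, and counts nonzero coefficients as the degrees of freedom in BIC. All function names are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(x, t):
    """Shrink x toward zero by t; values within [-t, t] become zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)

def predict(row, b0, b):
    return sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, row)))

def fit_l1_logistic(X, y, lam, iters=2000):
    """Minimize mean logistic loss + lam * sum(|b_j|) by proximal gradient descent."""
    n, k = len(X), len(X[0])
    # Conservative step size from an upper bound on the curvature of the loss.
    step = 4.0 / (1.0 + sum(row[j] ** 2 for row in X for j in range(k)) / n)
    b0, b = 0.0, [0.0] * k
    for _ in range(iters):
        resid = [predict(row, b0, b) - yi for row, yi in zip(X, y)]
        b0 -= step * sum(resid) / n  # the intercept is not penalized
        for j in range(k):
            grad = sum(r * row[j] for r, row in zip(resid, X)) / n
            b[j] = soft_threshold(b[j] - step * grad, step * lam)
    return b0, b

def bic(X, y, b0, b):
    """BIC = -2 * log-likelihood + (intercept + nonzero coefficients) * ln(n)."""
    ll = 0.0
    for row, yi in zip(X, y):
        p = min(max(predict(row, b0, b), 1e-12), 1.0 - 1e-12)
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    df = 1 + sum(1 for bj in b if bj != 0.0)
    return -2.0 * ll + df * math.log(len(y))

def select_by_bic(X, y, grid):
    """Fit one model per penalty value; return (lam, b0, b) with the lowest BIC."""
    fits = [(lam,) + fit_l1_logistic(X, y, lam) for lam in grid]
    return min(fits, key=lambda f: bic(X, y, f[1], f[2]))
```

With a large penalty every coefficient is shrunk to zero; as the penalty decreases, predictors enter the model in order of how strongly they improve the fit, and BIC trades that improvement against the number of predictors retained.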

Table 4 Results of LASSO regression

These variables were subsequently entered into a multilevel logistic regression. Again, persuasion was entered as the dependent variable and we included random intercepts for replies nested within parent posts and replies nested within repliers. All five predictors emerged as significant predictors of persuasion. Specifically, for a one-unit increase in word count, the odds of receiving a delta increase by a factor of 1.23, 95% CI [1.13, 1.35]. For a one-unit increase in reading difficulty scores (i.e., greater difficulty in reading comprehension), the odds of receiving a delta increase by a factor of 1.10, 95% CI [1.04, 1.16]. For a one-unit increase in analytical thinking, the odds of receiving a delta increase by a factor of 1.10, 95% CI [1.05, 1.17]. For a one-unit increase in self-references, the odds of receiving a delta decrease by a factor of 0.92, 95% CI [0.87, 0.98]. Last, for a one-unit increase in lexical diversity, the odds of receiving a delta decrease by a factor of 0.54, 95% CI [0.50, 0.59]. Post-hoc power analyses conducted using the simr package [34] revealed that we had at least 96% power to detect a small effect (i.e., 0.15) for each of these predictors on persuasion.

Discussion

Previous studies have largely examined the effect of linguistic features on persuasion in isolation and do not incorporate properties of language that are often involved in real-world persuasion. As such, little is known about the key verbal dimensions of persuasion or the relative impact of linguistic features on a message’s persuasive appeal in real-world social interactions. To address these limitations, we collected large-scale data of online social interactions from a public forum in which users engage in debates in an attempt to change each other’s views on any topic. Messages that successfully changed a user’s views are explicitly marked by the user themselves. We simultaneously examined linguistic features that have been previously linked with message persuasiveness between persuasive and non-persuasive messages. Our findings provide a parsimonious and ecologically-valid understanding of the social psychological pathways to persuasion as it operates in the real world through verbal behavior.

Three linguistic dimensions appeared to underlie the tested features: structural complexity, negative emotionality, and positive emotionality. Each dimension uniquely predicted persuasion when the effects of the remaining dimensions were statistically controlled, with greater structural complexity exhibiting the highest odds of persuasion. Interestingly, messages marked with less emotionality had higher odds of persuasion than messages marked with more emotionality, regardless of whether it was positive or negative. Emotionality can help persuasion in specific contexts [37, 38], but emotional appeals can also backfire when audiences prefer cognitive appeals [39]. Given that OPs were publicly inviting others to debate them, it is plausible that they preferred cognitively-appealing responses—ones that include an abundance of clear and valid reasons to support an argument—rather than emotionally-appealing responses.

The linguistic features that made a message longer, more analytic, less anecdotal, more difficult to read, and less lexically diverse were most essential to a message’s persuasive appeal and uniquely predictive of persuasion. Longer messages provide more context and likely contain more arguments than shorter messages. Presenting more arguments can be more persuasive even if the arguments themselves are not compelling [40]. Longer messages likely provided more opportunities for the OP to engage with material that could potentially change their mind, thus increasing the likelihood of persuasion.

Although more readable content is easier to understand and less aversive than less readable content [41], greater reading difficulty and comprehension can engender more interest, attention, and engagement [42, 43]. It can also facilitate deeper cognitive processing that leads to greater learning and long-term retention [44, 45]. This is especially true for individuals intrinsically motivated or capable of engaging in complex and novel tasks [46]. OPs were likely capable of and intrinsically motivated to engage in content that challenged their beliefs considering they were inviting others to debate them. The interpretation of users being intrinsically motivated to challenge their beliefs is also in line with the link that emerged between greater usage of analytical language and persuasion. Similarly, messages that focused less on one’s own personal experiences may have provided more objective evidence to support a particular argument, facilitating persuasion.

Last, while greater lexical repetition may be perceived as less interesting [31, 47], it facilitated persuasion in this context. Lexical repetition provides an effective way for speakers to communicate complex topics as it keeps “lexical strings relatively simple, while complex lexical relations are constructed around them” [48]. Lexical repetitions are advantageous for navigating through the order and logic of an argument, providing “textual markers” that help readers connect important aspects of an argument together [49]. Lower lexical diversity, then, appeared to be beneficial for building arguments that are more cohesive, more coherent, and thus, more persuasive.

Altogether, our findings reveal that the linguistic features linked to persuasion fall along three dimensions pertaining to structural complexity, negative emotionality, and positive emotionality. Our findings also highlight the importance of linguistic features related to a message’s structural complexity, particularly the verbal behaviors that provide a greater amount of factual evidence in a way that enables readers to connect important aspects of the information in an appropriately stimulating manner. Although the other linguistic features that were examined in this study may contribute to message persuasiveness to some degree, our results indicate that they are relatively less important after word count, lexical diversity, reading difficulty, analytical thinking, and self-references are taken into account. These findings also seem to reflect r/ChangeMyView’s digital environment. A central feature of r/ChangeMyView is ensuring that all posts and replies meaningfully contribute to the conversations. As such, OPs and repliers must adhere to all moderator-enforced policies of interaction. In addition, users who post on r/ChangeMyView are likely individuals who are open to attitude change given that they are publicly inviting others to debate them on a topic they already have an opinion on. This suggests that, in digital environments that underscore meaningful contributions to conversations, the ability to convey more objective information while fostering engagement and a holistic understanding of an argument are most vital to the alteration of established attitudes among open-minded individuals.

Our findings also have implications for the process by which persuasion research via language is conducted. Assessing the relative importance of a linguistic feature on message persuasiveness allowed us to understand its interconnections with other linguistic features and its link to persuasion, yielding a more comprehensive and well-rounded understanding of the feature’s role in message persuasiveness. Consider word count, for example: without assessing word count’s relative importance on message persuasiveness in the current study, we would not have been able to ascertain its link to message persuasiveness via a message’s structural complexity and the importance of providing more content in a way that enables readers to connect important aspects of the information in an appropriately stimulating manner. Because the meaning of a word or linguistic feature in any text depends on the context in which it is used, understanding the social psychological pathways to persuasion via language requires researchers to account for the presence of multiple linguistic features within a given message when assessing a linguistic feature’s link to message persuasiveness. This holistic approach may also help reconcile conflicting results from previous research on language and persuasion.

Our findings also inform theories, such as ELM, that address how linguistic features influence persuasion and provide a more precise understanding of the social psychological pathways to persuasion. For example, ELM states that there are two main routes to persuasion: the central route, which focuses on the quality of the message, and the peripheral route, which uses heuristics and peripheral cues to help influence individual decisions regarding a topic [6]. Individuals are more likely to be persuaded via the central route if they have the ability and motivation to process the information. On the other hand, individuals are more likely to be persuaded via the peripheral route if involvement is low and information processing capability is diminished. OPs likely have the ability and motivation to process arguments from repliers and are thus likely persuaded via the central route given that they are publicly inviting others to debate them. Supplying more information to support a conclusion may be more likely to persuade via the central route, but this information also needs to be organized in a way that helps readers connect important aspects of the information together. A wealth of information that is structured in an incoherent manner would undoubtedly hinder comprehension, and thus, persuasion.

Strengths and limitations

Our dataset contained a large sample of replies that spanned a wide variety of topics, and provided high ecological validity given that it captured the process of persuasion as it occurred naturally without elicitation. The enforcement of rules on r/ChangeMyView yielded interactions that were conducted under similar conditions and expectations. This helped to minimize interaction variance without compromising the naturalistic character of the data. However, OPs can award deltas to responses within subtrees (the “children” of direct replies), typically as the result of “back-and-forth” interactions with repliers. These were not included in the current study as we only examined top-level responses. Our results could also differ by topic, recency of the post, and post length. It is also possible that non-linguistic features affect message persuasiveness, such as the popularity of a post, the number of “upvotes” a reply receives (i.e., the number of instances other users have registered agreement with a particular post or reply), and the number of deltas a replier has ever received. Future studies should determine if these variables moderate the findings; doing so would also address the relative importance of linguistic versus non-linguistic features to message persuasiveness.

Although it is a policy on r/ChangeMyView that OPs must post a non-neutral opinion (i.e., their post must take a non-neutral stance on a topic), and posts that violate this rule are removed by moderators, it is possible that an OP’s post did not accurately reflect their true attitude or attitude strength. Given the nature of the data, this study cannot address whether the resulting attitude changes were long-lasting, nor if the OP’s attitude strength moderated their attitude change. Longitudinal studies can assess these points. Because there were substantially more non-persuasive replies (99.39%) than persuasive ones, we constructed a balanced subsample and conducted our analyses on this balanced subsample. While this strategy limited biased outcomes stemming from a large class imbalance, it also limits the generalizability of results to posts in which no persuasion occurred. Further examinations of the class imbalance are needed to address this issue. For example, it is possible that posts in which no persuasion occurred are systematically different from posts in which persuasion occurred. Or, perhaps the class imbalance simply reflects the rigid nature of attitudes. In addition, our results may only reflect a particular population given that Reddit users tend to skew younger and male [50]. Since we did not have access to subjects’ demographic information, we cannot assert the representativeness of our sample. Future research should investigate persuasion that takes place on other debate-style forums and websites to incorporate more diverse subjects, interaction modes, and digital environments.
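To make the balancing step concrete, the following is a minimal sketch of one common way to build a balanced subsample: random undersampling of the majority class. It is an illustration only; this section does not specify the exact procedure we used, and the function name `balanced_subsample`, the `persuasive` key, and the toy data are hypothetical.

```python
import random


def balanced_subsample(replies, label_key="persuasive", seed=0):
    """Randomly undersample the larger class so both classes have equal size.

    `replies` is a list of dicts; `label_key` names a boolean field marking
    whether a reply was persuasive (hypothetical field name).
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    positives = [r for r in replies if r[label_key]]
    negatives = [r for r in replies if not r[label_key]]
    n = min(len(positives), len(negatives))
    # Draw n items without replacement from each class, then shuffle.
    sample = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(sample)
    return sample


# Toy data (not from the study): 6 persuasive replies out of 1,000.
replies = [{"id": i, "persuasive": i < 6} for i in range(1000)]
subsample = balanced_subsample(replies)
n_persuasive = sum(r["persuasive"] for r in subsample)
```

In this toy case the subsample keeps all 6 persuasive replies and draws 6 non-persuasive ones at random. The discarded majority-class replies are the reason results from a balanced subsample may not generalize to the full, imbalanced population.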

Notes

  1. We define a linguistic feature as a characteristic used to classify a word or corpus of text based on its linguistic properties. Examples include reading difficulty, words denoting high or low emotionality, hedges, etc.

  2. For all of r/ChangeMyView’s policies, visit https://www.reddit.com/r/changemyview/wiki/rules#wiki_rule_a.

  3. Supplementary materials can be found here: https://osf.io/4rj26/?view_only=5556b511084b4e75bc14808e47d15dce

  4. We adopted the Valence-Arousal-Dominance circumplex model of emotion (Bradley & Lang, 1994; Russell, 1980) and the PAD emotion state model (Mehrabian, 1980; Bales, 2001) and conceptualize valence, arousal, and dominance as the dimensions of emotion. All three dimensions have been linked to message persuasiveness (see Table 1).

References

  1. Lasswell, H. D. (1938). Propaganda technique in the world war. In P. Smith (Ed).

  2. Milgram, S., & Shotland, R. L. (1973). Television and antisocial behavior: Field experiments. Academic Press.

  3. Althoff, T., Danescu-Niculescu-Mizil, C., & Jurafsky, D. (2014, May). How to ask for a favor: A case study on the success of altruistic requests. In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 8, No. 1).

  4. Pentland, A. (2014). Social physics: How good ideas spread—the lessons from a new science. Penguin Press.

  5. Fogg, B. J. (2008). Mass interpersonal persuasion: An early view of a new phenomenon. International conference on persuasive technology (pp. 23–34). Berlin: Springer.

  6. Petty, R. E., & Cacioppo, J. T. (1986). The elaboration likelihood model of persuasion. Communication and persuasion (pp. 1–24). New York: Springer.

  7. Boyd, R. L., & Schwartz, H. A. (2021). Natural language analysis and the psychology of verbal behavior: The past, present, and future states of the field. Journal of Language and Social Psychology, 40(1), 21–41. https://doi.org/10.1177/0261927X20967028

  8. Averbeck, J. M., & Miller, C. (2014). Expanding language expectancy theory: The suasory effects of lexical complexity and syntactic complexity on effective message design. Communication Studies, 65(1), 72–95.

  9. Clementson, D. E., Pascual-Ferrá, P., & Beatty, M. J. (2016). When does a presidential candidate seem presidential and trustworthy? Campaign messages through the lens of language expectancy theory. Presidential Studies Quarterly, 46(3), 592–617.

  10. Evans, V. (2009). How words mean: Lexical concepts, cognitive models, and meaning construction. Oxford University Press.

  11. Asher, N. (2011). Lexical meaning in context: A web of words. Cambridge University Press.

  12. Mitra, T., & Gilbert, E. (2014). The Language that Gets People to Give: Phrases that Predict Success on Kickstarter. In Proc. CSCW’14.

  13. Larrimore, L., Jiang, L., Larrimore, J., Markowitz, D., & Gorski, S. (2011). Peer to peer lending: The relationship between language features, trustworthiness, and persuasion success. Journal of Applied Communication Research, 39(1), 19–37.

  14. Markowitz, D. M. (2020). Putting your best pet forward: Language patterns of persuasion in online pet advertisements. Journal of Applied Social Psychology, 50(3), 160–173.

  15. Olson, J. M., & Stone, J. (2005). The influence of behavior on attitudes. In D. Albarracín, B. T. Johnson, & M. P. Zanna (Eds.), The handbook of attitudes (pp. 223–271). Lawrence Erlbaum.

  16. Levitt, S. D., & List, J. A. (2007). What do laboratory experiments measuring social preferences reveal about the real world? Journal of Economic Perspectives, 21(2), 153–174.

  17. Matz, S. C., Kosinski, M., Nave, G., & Stillwell, D. J. (2017). Psychological targeting as an effective approach to digital mass persuasion. Proceedings of the National Academy of Sciences of the United States of America, 114(48), 12714–12719.

  18. Cyr, D., Head, M., Lim, E., & Stibe, A. (2018). Using the elaboration likelihood model to examine online persuasion through website design. Information & Management, 55(7), 807–821.

  19. Slattery, P., Simpson, J., & Utesheva, A. (2013). Online persuasion as psychological transition, and the multifaced agents of persuasion: A personal construct theory perspective. In ACIS 2013: Information Systems: Transforming the future: Proceedings of the 24th Australasian conference on information systems, pp. 1–11.

  20. Khazaei, T., Lu, X., & Mercer, R. (2017). Writing to persuade: Analysis and detection of persuasive discourse. In iConference 2017 Proceedings.

  21. Tan, C., Niculae, V., Danescu-Niculescu-Mizil, C., & Lee, L. (2016). Winning arguments: Interaction dynamics and persuasion strategies in good-faith online discussions. In Proceedings of the 25th International Conference on World Wide Web - WWW ’16.

  22. Wei, Z., Liu, Y., & Li, Y. (2016, August). Is this post persuasive? Ranking argumentative comments in online forum. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 195–200).

  23. Musi, E., Ghosh, D., & Muresan, S. (2018). ChangeMyView through concessions: Do concessions increase persuasion? http://arxiv.org/abs/1806.03223

  24. Priniski, J., & Horne, Z. (2018). Attitude change on Reddit’s change my view. In Proceedings of the 40th Annual Meeting of the Cognitive Science Society.

  25. Baumgartner, J., Zannettou, S., Keegan, B., Squire, M., & Blackburn, J. (2020). The Pushshift Reddit Dataset. http://arxiv.org/abs/2001.08435

  26. Japkowicz, N., & Stephen, S. (2002). The class imbalance problem: A systematic study. Intelligent Data Analysis, 6(5), 429–449. https://doi.org/10.3233/IDA-2002-6504

  27. Pennebaker, J. W., Boyd, R. L., Jordan, K., & Blackburn, K. (2015). The development and psychometric properties of LIWC2015. The University of Texas at Austin.

  28. Warriner, A. B., Kuperman, V., & Brysbaert, M. (2013). Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods, 45(4), 1191–1207.

  29. Hanauer, D.A., Liu, Y., Mei, Q., Manion, F.J., Balis, U.J., & Zheng, K. (2012). Hedging their mets: the use of uncertainty terms in clinical documents and its potential implications when sharing the documents with patients. In: AMIA Annual Symposium Proceedings, Vol. 2012, p. 321. American Medical Informatics Association.

  30. Seih, Y. T., Beier, S., & Pennebaker, J. W. (2017). Development and examination of the linguistic category model in a computerized text analysis method. Journal of Language and Social Psychology, 36(3), 343–355.

  31. Bradac, J. J., Konsky, C. W., & Davies, R. A. (1976). Two studies of the effects of linguistic diversity upon judgments of communicator attributes and message effectiveness. Communication Monographs, 43(1), 70–79.

  32. DuBay, W. H. (2007). Smart language: Readers, readability, and the grading of text.

  33. Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01

  34. Green, P., & MacLeod, C. J. (2016). simr: An R package for power analysis of generalised linear mixed models by simulation. Methods in Ecology and Evolution, 7(4), 493–498. https://doi.org/10.1111/2041-210X.12504

  35. Groll, A. (2017). glmmLasso: Variable selection for generalized linear mixed models by L1-penalized estimation. https://CRAN.R-project.org/package=glmmLasso

  36. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.

  37. Andrade, E. B., & Ho, T. H. (2009). Gaming emotions in social interactions. Journal of Consumer Research, 36(4), 539–552.

  38. East, R., Hammond, K., & Wright, M. (2007). The relative incidence of positive and negative word of mouth: A multi-category study. International Journal of Research in Marketing, 24(2), 175–184.

  39. Haddock, G., Maio, G. R., Arnold, K., & Huskinson, T. (2008). Should persuasion be affective or cognitive? The moderating effects of need for affect and need for cognition. Personality and Social Psychology Bulletin, 34(6), 769–778.

  40. Petty, R. E., & Cacioppo, J. T. (1984). The effects of involvement on responses to argument quantity and quality: Central and peripheral routes to persuasion. Journal of Personality and Social Psychology, 46(1), 69–81.

  41. Fang, B., Ye, Q., Kucukusta, D., & Law, R. (2016). Analysis of the perceived value of online tourism reviews: Influence of readability and reviewer characteristics. Tourism Management, 52, 498–506.

  42. Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. Plenum Press.

  43. Clifford, M. (1990). Students need challenge, not easy success. Educational Leadership, 48, 22–26.

  44. Bjork, E. L., & Bjork, R. A. (2011). Making things hard on yourself, but in a good way: Creating desirable difficulties to enhance learning. In M. A. Gernsbacher, R. W. Pew, L. M. Hough, & J. R. Pomerantz (Eds.), Psychology and the real world: Essays illustrating fundamental contributions to society (pp. 56–64). Worth Publishers.

  45. Linn, M. C., Chang, H., Chiu, J., Zhang, Z., & McElhaney, K. (2011). Can desirable difficulties overcome deceptive clarity in scientific visualizations? In A. Benjamin (Ed.), Successful remembering and successful forgetting: a Festschrift in honor of Robert A. Bjork (pp. 235–258). Psychology Press.

  46. McNamara, D. S., & Kintsch, W. (1996). Learning from texts: Effects of prior knowledge and text coherence. Discourse Processes, 22, 247–288.

  47. Crossley, S. A., Salsbury, T., & McNamara, D. S. (2015). Assessing lexical proficiency using analytic ratings: A case for collocation accuracy. Applied Linguistics, 36(5), 570–590.

  48. Martin, J. R., & Rose, D. (2003). Working with discourse: Meaning beyond the clause. Bloomsbury Publishing.

  49. Schulze, J. (2011). Writing to Persuade: A Systemic Functional View. Gist Education and Learning Research Journal, 5, 127–157.

  50. Barthel, M., Stocking, G., Holcomb, J., & Mitchell, A. (2016). Seven-in-ten Reddit users get news on the site. Berlin: Pew Research Center.

  51. O’Keefe, D. J. (1997). Standpoint explicitness and persuasive effect: A meta-analytic review of the effects of varying conclusion articulation in persuasive messages. Argumentation and Advocacy, 34(1), 1–12.

  52. O’Keefe, D. J. (1998). Justification explicitness and persuasive effect: A meta-analytic review of the effects of varying support articulation in persuasive messages. Argumentation and Advocacy, 35(2), 61–75.

  53. Calder, B. J., Insko, C. A., & Yandell, B. (1974). The relation of cognitive and memorial processes to persuasion in a simulated jury trial. Journal of Applied Social Psychology, 4(1), 62–93.

  54. Hamilton, M. A. (1998). Message variables that mediate and moderate the effect of equivocal language on source credibility. Journal of Language and Social Psychology, 17, 109–143.

  55. Wood, W., Kallgren, C. A., & Preisler, R. M. (1985). Access to attitude-relevant information in memory as a determinant of persuasion: The role of message attributes. Journal of Experimental Social Psychology, 21(1), 73–85.

  56. Toma, C. L., & D’Angelo, J. D. (2014). Tell-tale words. Journal of Language and Social Psychology, 34(1), 25–45.

  57. Slater, M. D., & Rouner, D. (1996). How message evaluation and source attributes may influence credibility assessment and belief change. Journalism & Mass Communication Quarterly, 73(4), 974–991.

  58. Ahmad, S. N., & Laroche, M. (2015). How do expressed emotions affect the helpfulness of a product review? Evidence from reviews using latent semantic analysis. International Journal of Electronic Commerce, 20(1), 76–111.

  59. Karmarkar, U. R., & Tormala, Z. L. (2010). Believe me, I have no idea what I’m talking about: The effects of source certainty on consumer involvement and persuasion. Journal of Consumer Research, 36(6), 1033–1049.

  60. Xiao, L. (2018). A message’s persuasive features in Wikipedia’s article for deletion discussions. In Proceedings of the 9th International Conference on Social Media and Society (pp. 345–349).

  61. Kaufman, D. Q., Stasson, M. F., & Hart, J. W. (1999). Are the tabloids always wrong or is that just what we think? Need for cognition and perceptions of articles in print media. Journal of Applied Social Psychology, 29(9), 1984–2000.

  62. Allport, G. W., & Postman, L. (1947). The psychology of rumor. Rinehart & Winston.

  63. Hazleton, V., Cupach, W. R., & Liska, J. (1986). Message style: An investigation of the perceived characteristics of persuasive messages. Journal of Social Behavior and Personality, 1(4), 565.

  64. Wegener, D. T., Petty, R. E., & Klein, D. J. (1994). Effects of mood on high elaboration attitude change: The mediating role of likelihood judgments. European Journal of Social Psychology, 24(1), 25–43.

  65. Hosman, L. A., & Siltanen, S. A. (2006). Powerful and powerless language forms: Their consequences for impression formation, attributions of control of self and control of others, cognitive responses, and message memory. Journal of Language and Social Psychology, 25(1), 33–46.

  66. Gibbons, P., Busch, J., & Bradac, J. J. (1991). Powerful versus powerless language: Consequences for persuasion, impression formation, and cognitive response. Journal of Language and Social Psychology, 10(2), 115–133.

  67. Hosman, L. A., Huebner, T. M., & Siltanen, S. A. (2002). The impact of power-of-speech style, argument strength, and need for cognition on impression formation, cognitive responses, and persuasion. Journal of Language and Social Psychology, 21(4), 361–379.

  68. Holtgraves, T., & Lasky, B. (1999). Linguistic power and persuasion. Journal of Language and Social Psychology, 18(2), 196–205.

  69. Blankenship, K. L., & Holtgraves, T. (2005). The role of different markers of linguistic powerlessness in persuasion. Journal of Language and Social Psychology, 24(1), 3–24.

  70. Toulmin, S. E. (2003). The uses of argument. Cambridge University Press.

  71. Baesler, E. J., & Burgoon, J. K. (1994). The temporal effects of story and statistical evidence on belief change. Communication Research, 21(5), 582–602.

  72. Doest, L., Semin, G. R., & Sherman, S. J. (2002). Linguistic context and social perception: Does stimulus abstraction moderate processing style? Journal of Language and Social Psychology, 21(3), 195–229.

  73. Schwanenflugel, P. J., & Stowe, R. W. (1989). Context availability and the processing of abstract and concrete words in sentences. Reading Research Quarterly, 24, 114–126.

  74. Seifert, L. S. (1997). Activating representations in permanent memory: Different benefits for pictures and words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(5), 1106.

  75. Douglas, K. M., & Sutton, R. M. (2003). Effects of communication goals and expectancies on language abstraction. Journal of Personality and Social Psychology, 84(4), 682–696.

  76. Hansen, J., & Wänke, M. (2010). Truth from language and truth from fit: The impact of linguistic concreteness and level of construal on subjective truth. Personality and Social Psychology Bulletin, 36(11), 1576–1588.

  77. Pan, L., McNamara, G. M., Lee, J., Haleblian, J. M., & Devers, C. E. (2017). Give it to us straight: Language concreteness and its effects on investors’ reactions. Academy of Management Proceedings, 2017(1), 12140.

  78. Goering, E., Connor, U. M., Nagelhout, E., & Steinberg, R. (2011). Persuasion in fundraising letters: An interdisciplinary study. Nonprofit and Voluntary Sector Quarterly, 40(2), 228–246.

  79. Xu, Z., Ellis, L., & Umphrey, L. R. (2019). The easier the better? Comparing the readability and engagement of online pro-and anti-vaccination articles. Health Education & Behavior, 46(5), 790–797.

  80. Bradac, J. J., Bowers, J. W., & Courtright, J. A. (1979). Three language variables in communication research: Intensity, immediacy, and diversity. Human Communication Research, 5(3), 257–269.

  81. Bradac, J. J., Desmond, R. J., & Murdock, J. I. (1977). Diversity and density: Lexically determined evaluative and informational consequences of linguistic complexity. Communication Monographs, 44(4), 273–283.

  82. Daller, H., Van Hout, R., & Treffers-Daller, J. (2003). Lexical richness in the spontaneous speech of bilinguals. Applied Linguistics, 24(2), 197–222.

Acknowledgements

We thank Haley Bader, Carolynn Boatfield, Maria Civitello, Katie Kauth, and Xinyu Wang for their assistance in data cleaning, Arthur Bousquet and Leonardo Carrico for their assistance in data analysis, and David Johnson for his helpful feedback on earlier drafts of this paper. Preparation of this manuscript was funded, in part, by grants from the Swiss National Science Foundation (#196255) and the Federal Bureau of Investigation (15F06718R0006603). The views, opinions, and findings contained in this document are those of the authors and should not be construed as position, policy, or decision of the aforementioned agencies, unless so designated by other documents.

Funding

Not applicable.

Author information

Contributions

VT developed the concept of the study, conducted data analysis, and wrote the manuscript. RLB collected the data, assisted with study development and with natural language and statistical analyses, and provided critical revisions. SS assisted with data preparation and analyses and provided critical revisions. AK, CG, AL, and LM assisted with data cleaning and literature review.

Corresponding author

Correspondence to Vivian P. Ta.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Availability of data and material

https://osf.io/4rj26/?view_only=5556b511084b4e75bc14808e47d15dce.

Code availability

https://osf.io/4rj26/?view_only=5556b511084b4e75bc14808e47d15dce.

Ethics approval

Approval granted by Lancaster University’s ethics committee (Reference #FST19067).

Consent to participate

Not applicable; data was non-identifiable and publicly available.

Consent for publication

Not applicable; data was non-identifiable and publicly available.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ta, V.P., Boyd, R.L., Seraj, S. et al. An inclusive, real-world investigation of persuasion in language and verbal behavior. J Comput Soc Sc 5, 883–903 (2022). https://doi.org/10.1007/s42001-021-00153-5

  • DOI: https://doi.org/10.1007/s42001-021-00153-5

Keywords

  • Persuasion
  • Language
  • Attitude change
  • Online interactions