Introduction

Understanding persuasion, that is, how people can fundamentally alter the thoughts, feelings, and behaviors of others, is a cornerstone of social psychology. Historically, social influence has been notoriously difficult to study in the real world, requiring researchers to piece together society-level puzzles either in the abstract [1] or through carefully crafted field studies [2]. In recent years, technology has driven interest in studying social influence, as digital traces make it possible to examine how the behaviors of one individual or group cascade to change others’ behaviors [3, 4]. Nevertheless, most social processes are complex enough that they remain very difficult to study as they operate outside of the lab. The availability of digital data and computational techniques, however, provides a ripe opportunity to begin understanding the precise mechanisms by which people influence the thoughts and feelings of others.

Today, persuasion is often conducted, in part or in whole, through verbal interactions that take place on the internet [5]: a message is transmitted from one person to another through language, altering the recipient’s attitude. As such, researchers have sought to identify linguistic features that are linked to a message’s persuasive appeal. A relatively sizable number of linguistic features important to message persuasiveness have emerged from this body of research, including features that indicate what a message conveys as well as how it is conveyed (Table 1). Models of persuasion, such as the Elaboration Likelihood Model (ELM) [6], have been used to identify these linguistic features and explain how they affect message persuasiveness.

Table 1 Summary of linguistic features and predictions

Despite the impressive corpus of studies to date, the existing literature has several limitations. Studies have largely examined the effect of linguistic features on persuasion in isolation, focusing on only a small number of linguistic features (i.e., one or two) at a time. While this body of literature has collectively identified a relatively sizable number of linguistic features linked to message persuasiveness, it remains unclear how these links, taken together, inform the social aspects of verbal behavior in persuasion. In other words, what do the linguistic features connected with message persuasiveness reveal about the key verbal behaviors involved in persuasion? As language provides “a rich stream of ongoing social processes” [7], synthesizing these findings can provide a more complete understanding of the social psychological pathways to persuasion.

In the same vein, real-world messages are constructed using a varied combination of linguistic features to transmit complex thoughts, emotions, and information to others. Nevertheless, studies tend to examine how a single linguistic feature (or a small set of features) correlates with persuasion without taking into account other potentially important linguistic features within a given message [8, 9]. The meaning of a given word or feature in any text depends on the context in which it is used, which can be inferred from the words and features that surround it [10, 11]. As such, the effect of any particular linguistic feature on message persuasiveness can be attenuated by the presence of other features in the message. Because features are typically studied in isolation, little is known about the relative impact of linguistic features on a message’s persuasive appeal.

Furthermore, studies that examine the effect of linguistic features on persuasion tend to focus on persuasion in terms of engaging in specific behaviors [3, 12,13,14] rather than changing attitudes in general. Persuading people to engage in a specific behavior is conceptually distinct from changing people’s attitude on a topic. Changes in behavior can depend on prior attitude change (e.g., an individual may not engage in behavior change unless they believe that the behavior will result in a desirable outcome), and although changes in behavior can facilitate changes in attitude, behavior change does not always indicate that attitude change has occurred (e.g., an individual may ultimately decide to receive the COVID-19 vaccine because their employer requires it and not because their views regarding vaccines have changed) [15].

Finally, many studies that investigate the effect of linguistic features on persuasion are conducted in controlled lab settings [16, 17] due to the sheer difficulty of studying persuasion as it unfolds in the real world. Given that persuasion often takes place through online social interactions [5], there is a need to study persuasion in this setting. Doing so also enables researchers to better understand how digital environments influence the process of persuasion, especially as digital environments are increasingly designed to shape the attitudes and behaviors of users [18] and there is “little consensus on how to persuade effectively within the digital realm” [19].

We sought to address these limitations in the current study. Specifically, we collected large-scale data from r/ChangeMyView, an online public forum on the social media website Reddit where users engage in debates in an attempt to change each other’s views on any topic. Most importantly, messages that successfully changed a user’s views are explicitly marked by that user: individuals are exposed to several messages and explicitly identify the message(s) that actually changed their views. We simultaneously examined linguistic features that have previously been linked with message persuasiveness (Table 1), comparing persuasive and non-persuasive messages to test the following research questions:

  1. What are the key linguistic dimensions of persuasion? Given that a relatively sizable number of linguistic features have been linked with persuasion, we first sought to determine whether these features could be meaningfully reduced to a smaller number of dimensions representing the key verbal processes of persuasion. We then assessed whether these dimensions were uniquely predictive of persuasion when controlling for the effects of the remaining dimensions.

  2. Which individual linguistic features, when assessed simultaneously, are the most essential and relevant to a message’s persuasive appeal? We assessed all linguistic features that have been linked with message persuasiveness in a single model to examine their relative impact and to identify the features most crucial to message persuasiveness.

While theory-driven predictions can be made regarding how each linguistic feature relates to persuasion, there has been considerable variability across studies in which features positively or negatively relate to persuasion, and some studies report mixed or inconclusive results for a given linguistic feature (see Table 1). Given that our primary goal was to obtain a more unified understanding of the social psychological pathways to persuasion via language, the current study is guided by a jointly data-driven and exploratory approach, with results informing our understanding of the directional relationship between the linguistic features and message persuasiveness. Overall, assessing the interplay among important linguistic features in persuasion using large-scale, real-world data helps inform theories, such as ELM, that address how linguistic features influence persuasion, providing a parsimonious and ecologically valid understanding of the social psychological processes that shape persuasion.

Although some previous studies have used r/ChangeMyView data to investigate the effect of linguistic features on persuasion, they differ from the current investigation in important ways. The types and combinations of linguistic features that have been examined vary across studies and typically include a mix of features that have and have not been linked to persuasion. For example, Tan et al. [21] examined how some persuasion-linked linguistic features (including arousal, valence, reading difficulty, and hedges), some non-persuasion-linked features (e.g., formatting features such as use of italics and boldface), and interaction dynamics (e.g., the time a replier enters a debate) were associated with successful persuasion. Wei et al. [22] investigated how surface text features (e.g., reply length, punctuation), social interaction features (e.g., the number of replies stemming from a root comment), and argumentation-related features (e.g., argument relevance and originality) related to persuasion. Musi et al. [23] assessed the distribution of argumentative concessions in persuasive versus non-persuasive comments, and Priniski and Horne [24] examined persuasion through the presentation of evidence only in sociomoral topics. Moreover, these studies tend to place greater emphasis on building models that accurately detect persuasive content online than on interpretability and a more unified understanding of the social psychological pathways to persuasion via language. For instance, Khazaei et al. [20] assessed how all LIWC-based features varied across persuasive and non-persuasive replies and used this information to train a machine learning model to identify persuasive responses.

Method

Data collection

We used data from the Reddit sub-community (i.e., “subreddit”) r/ChangeMyView, a forum in which users post their own views (referred to as “original posters”, or “OPs”) on any topic and invite others to debate them. Those who debate the OP (referred to as “repliers”) reply to the OP’s post in an attempt to change the OP’s view. The OP will award a delta (∆) to particular replies that changed their original views.

Using data from r/ChangeMyView presents several advantages. All replies in r/ChangeMyView are written with the purpose of persuasion. The replies that successfully change an OP’s view are explicitly marked by the OP themselves, allowing for a sample of persuasive and non-persuasive replies. All OPs and repliers must adhere to the official policies of r/ChangeMyView. For instance, OPs are required to explain at a reasonable length (using 500 characters or more) why they hold their views and to interact with repliers within a reasonable time frame. Replies must be substantial, adequate, and on-topic. Because these policies are enforced by moderators, the resulting interactions are high in quality [21] and are conducted under similar conditions with similar expectations. OPs can also post their view on any topic, allowing for an examination of persuasion across a wide variety of topics.

All top-level replies (direct replies to the OP’s original statement of views) posted between January 2013 and October 2018 were initially collected from the Pushshift database [25]. We focused only on the top-level replies and omitted any additional replies that were in response to a direct reply (i.e., a direct reply’s “children”). This ensured that replies deemed persuasive were persuasive because of their content and not because of any resulting “back-and-forth” interactions, given that deltas can also be awarded to downstream replies. We also omitted any top-level replies that were made by a post’s OP and any replies whose delta was awarded by someone other than the OP. Because the data contained a substantially greater number of non-persuasive replies (99.39%) than persuasive ones, analyses were conducted on a balanced subsample that included all top-level replies that were awarded a delta and a random subsample of top-level replies that were not awarded a delta, drawn from the original posts in which at least one delta was awarded. This allowed us to compare persuasive and non-persuasive replies from the same original post while bypassing issues associated with class imbalances [26].

As an example, consider a parent post that garnered two top-level replies that were awarded a delta and three top-level replies that were not. In this case, the two delta-awarded replies were included in the subsample and two of the three non-awarded replies were randomly selected for inclusion. Specifically, each of the three non-awarded replies was assigned a random number between 1 and 100 using the random number generator in Microsoft Excel, and the two replies with the lowest values were selected. Parent posts almost always contained more non-awarded than delta-awarded top-level replies. However, in the very few instances in which a parent post contained more delta-awarded than non-awarded top-level replies, we included all of its top-level replies in the subsample (N = 9020 top-level replies; n = 4515 awarded a delta; n = 4505 not awarded a delta). Example persuasive and non-persuasive replies can be found in Table 2.
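To make the subsampling procedure concrete, the following minimal R sketch reproduces the logic described above. It is illustrative only (not the authors’ analysis code) and assumes a hypothetical data frame `replies` with columns `post_id` and `delta` (1 = delta awarded, 0 = not awarded).

```r
# Illustrative sketch of the balanced subsampling described above.
# Assumes a data frame `replies` with columns post_id and delta (1/0); these
# names are placeholders, not the authors' actual variables.
set.seed(2013)  # any seed; the original selection used Excel's random numbers

balance_post <- function(post) {
  awarded     <- post[post$delta == 1, ]
  not_awarded <- post[post$delta == 0, ]
  # Keep every delta-awarded reply; randomly sample an equal number of
  # non-awarded replies (or keep all of them if there are fewer).
  n_keep <- min(nrow(awarded), nrow(not_awarded))
  if (nrow(not_awarded) > n_keep) {
    not_awarded <- not_awarded[sample(nrow(not_awarded), n_keep), ]
  }
  rbind(awarded, not_awarded)
}

# Only parent posts in which at least one delta was awarded are eligible.
eligible <- subset(replies, ave(delta, post_id, FUN = max) == 1)
balanced <- do.call(rbind, lapply(split(eligible, eligible$post_id), balance_post))
```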

Table 2 Example replies

To gain an initial understanding of the types of topics that were raised for debate in the subreddit, we randomly selected 100 replies from the final dataset and manually coded their content. Six overarching topics emerged: legal and politics; race, culture, and gender; business and work; science and technology; behavior, attitudes, and relationships; and recreation. More information regarding debated topics can be found in the supplementary materials.

Linguistic features

Prior to extracting linguistic features from our data, we conducted a cursory search of the psychological literature to identify prominent linguistic features reported to have a significant relationship with message persuasiveness in at least one published study. These linguistic features are listed in Table 1. Each reply in the r/ChangeMyView dataset was analyzed separately using Linguistic Inquiry and Word Count (LIWC) [27], which calculates the percentage of words in a text belonging to psychologically or linguistically meaningful categories. We used LIWC to quantify word count, analytic thinking, the percentage-use of self-references (i.e., first-person singular pronouns, or “i-words”), and the percentage-use of certainty terms in each reply within our corpus. The analytic thinking score is calculated as articles + prepositions − personal pronouns − impersonal pronouns − auxiliary verbs − conjunctions − adverbs − negations; relative frequencies are normalized within LIWC2015 to a 0-to-100 scale, with higher scores reflecting more analytical language and lower scores reflecting more informal, narrative-like language.

Dictionaries of terms that have been rated on emotionality (i.e., valence, arousal, and dominance) from [28] were imported into LIWC to measure the percentage-use of language that scored high and low on valence, arousal, and dominance. A dictionary of hedges from [29] was also imported into LIWC to measure the percentage-use of hedges. Following [21], the use of examples was measured by occurrences of “for example”, “for instance”, and “e.g.”.

Language abstraction/concreteness was measured using the linguistic category model (LCM), with higher scores indicating more abstract language and lower scores indicating more concrete language [30]. The LCM score is calculated as [(descriptive action verbs × 1) + (interpretative action verbs × 2) + (state verbs × 3) + (adjectives × 4)] / (descriptive action verbs + interpretative action verbs + state verbs + adjectives). Lexical diversity was measured using the type-token ratio, the number of unique words in a message divided by the total number of words in that message, with higher scores indicating greater lexical diversity [31]. Last, reading difficulty was measured via the SMOG Index, which estimates the years of education the average person needs to completely comprehend a piece of text: SMOG = 1.0430 × √(number of polysyllables × (30 / number of sentences)) + 3.1291. Because a higher SMOG score indicates that more education is needed to comprehend a piece of text, higher scores represent text that is more difficult to read and lower scores represent text that is easier to read [32]. More information about these linguistic features and example replies that scored high and low on each feature are reported in the supplementary materials.
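For illustration, the dictionary-free features above can be computed with a few lines of R. The sketch below is a simplified approximation under our own tokenization assumptions; the LIWC-based measures (word count, analytic thinking, i-words, certainty, and the imported dictionaries) require the LIWC software and its dictionaries and are not reproduced here. The per-word syllable counter is left as a placeholder.

```r
# Simplified, illustrative implementations of the dictionary-free features.
tokenize <- function(text) {
  text <- tolower(text)
  unlist(regmatches(text, gregexpr("[a-z']+", text)))
}

# Lexical diversity: type-token ratio = unique words / total words.
type_token_ratio <- function(text) {
  toks <- tokenize(text)
  length(unique(toks)) / length(toks)
}

# Reading difficulty: SMOG = 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291.
# `count_syllables` is a placeholder for any per-word syllable counter.
smog_index <- function(text, count_syllables) {
  n_sentences   <- max(1, length(unlist(strsplit(text, "[.!?]+"))))
  polysyllables <- sum(count_syllables(tokenize(text)) >= 3)
  1.0430 * sqrt(polysyllables * (30 / n_sentences)) + 3.1291
}

# Language abstraction (linguistic category model): weighted mean of coded
# word classes; the per-class counts must come from an LCM coding step.
lcm_abstraction <- function(dav, iav, sv, adj) {
  (dav * 1 + iav * 2 + sv * 3 + adj * 4) / (dav + iav + sv + adj)
}

# Use of examples, following Tan et al. [21].
example_count <- function(text) {
  text <- tolower(text)
  sum(sapply(c("for example", "for instance", "e\\.g\\."), function(p) {
    hits <- gregexpr(p, text)[[1]]
    sum(hits > 0)
  }))
}
```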

Results

Given that a relatively sizable number of linguistic features have been linked with persuasion, we first determined whether these features could be meaningfully reduced to a smaller number of dimensions representing the key verbal processes of persuasion. Second, we determined whether these dimensions were each uniquely predictive of persuasion when controlling for the effects of the remaining dimensions. Third, we simultaneously assessed all linguistic features that have been linked with message persuasiveness in a single model to understand how linguistic features interact with one another to influence a message’s persuasive appeal and to identify the features most crucial to message persuasiveness. All data and analytic code, along with descriptive statistics, zero-order correlations between all variables, and complete analytic outputs for all analyses, can be found in the supplementary materials.

To identify the key linguistic dimensions of persuasion (RQ 1), we submitted all linguistic features to a principal components analysis (PCA) with a varimax rotation. Bartlett’s test of sphericity (p < 0.001) and the Kaiser–Meyer–Olkin measure (KMO = 0.55) suggested that our data were suitable for this analysis. Features with absolute loadings greater than 0.50 were retained and used to quantify the principal components. Three principal components were extracted that collectively accounted for 36.28% of the total variance: structural complexity, negative emotionality, and positive emotionality (see Table 3). Structural complexity had high loadings in the direction of lower lexical diversity, higher word count, and greater reading difficulty. Negative emotionality had high loadings in the direction of greater percentage-use of terms that scored low on valence and low on dominance. Positive emotionality had high loadings in the direction of greater percentage-use of hedges and of terms that scored high on dominance and high on valence.
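A minimal sketch of this dimension-reduction step, using the psych package, is given below. It assumes a data frame `features` holding one column per linguistic feature for the balanced subsample; the variable name is hypothetical and the code is illustrative rather than the authors’ exact script.

```r
# Illustrative sketch of RQ 1: suitability checks and PCA with varimax rotation.
library(psych)

# Suitability of the correlation matrix for component extraction.
cortest.bartlett(cor(features), n = nrow(features))  # Bartlett's test of sphericity
KMO(features)                                        # Kaiser-Meyer-Olkin measure

# Extract three varimax-rotated components and inspect loadings above |.50|.
pca <- principal(features, nfactors = 3, rotate = "varimax")
print(pca$loadings, cutoff = 0.50)

# Component scores (e.g., structural complexity) can then be carried forward
# as predictors in the multilevel models reported below.
scores <- as.data.frame(pca$scores)
```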

Table 3 Results of PCA with Varimax Rotation

To assess whether all three dimensions were uniquely important to message persuasiveness, we entered the three components into a single multilevel logistic regression using lme4 [33]. This procedure corrects for the non-independence of replies (i.e., replies to the same parent post) in predicting the dependent variable, persuasion (delta awarded = 1, no delta awarded = 0). We included random intercepts for replies nested within parent posts and for replies nested within repliers (i.e., some repliers replied to multiple original posts). All three components emerged as significant predictors of persuasion. For a one-unit increase in structural complexity, the odds of receiving a delta increased by a factor of 2.25, 95% CI [2.11, 2.39]. For a one-unit increase in negative emotionality, the odds of receiving a delta decreased by a factor of 0.89, 95% CI [0.85, 0.94]. For a one-unit increase in positive emotionality, the odds of receiving a delta also decreased by a factor of 0.92, 95% CI [0.88, 0.97]. Post-hoc power analyses conducted using the simr package in R (Version 1.0.5) [34] revealed that we had at least 96% power to detect a small effect (i.e., 0.15) for each of these factors on persuasion.
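The multilevel logistic regression can be sketched as follows. The model formula mirrors the description above; the column names (`structural_complexity`, `post_id`, `replier_id`, etc.) are hypothetical placeholders, not the authors’ actual variable names.

```r
# Illustrative multilevel logistic regression for the three components.
library(lme4)

m_components <- glmer(
  delta ~ structural_complexity + negative_emotionality + positive_emotionality +
    (1 | post_id) +      # replies nested within parent posts
    (1 | replier_id),    # replies nested within repliers
  data = balanced, family = binomial
)

summary(m_components)
exp(fixef(m_components))  # exponentiated coefficients = odds ratios per one-unit increase
```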

Next, the individual linguistic features were assessed simultaneously to identify those that were the most essential and relevant to a message’s persuasive appeal (RQ 2). A logistic least absolute shrinkage and selection operator (LASSO) regression was performed using glmmLasso [35]. LASSO regression is a penalized regression method that performs variable selection and prevents overfitting by adding a penalty to the model’s loss function equal to a tuning parameter (λ) multiplied by the sum of the absolute values of the coefficients. This penalty shrinks some coefficients to exactly zero, producing sparse models. In other words, the method selects a parsimonious set of variables that best predict the outcome variable and has many advantages over other feature selection methods [36]. All linguistic features were entered into the LASSO regression model. A grid search was performed to identify the optimal shrinkage parameter based on the Bayesian information criterion (BIC). Five features emerged with nonzero coefficients: word count, lexical diversity, reading difficulty, analytical thinking, and self-references (Table 4).
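A sketch of the penalized model and the BIC-based grid search is shown below. It assumes placeholder column names for the features in Table 1 and factor-coded grouping variables, and illustrates the general glmmLasso workflow rather than the exact specification used in the study.

```r
# Illustrative LASSO with random intercepts and a BIC-based grid search over lambda.
library(glmmLasso)

# Grouping variables must be factors for glmmLasso.
balanced$post_id    <- as.factor(balanced$post_id)
balanced$replier_id <- as.factor(balanced$replier_id)

lambdas <- seq(100, 0, by = -5)
fits <- lapply(lambdas, function(l) {
  try(glmmLasso(
    delta ~ word_count + analytic + i_words + certainty + hedges + examples +
      valence_high + valence_low + arousal_high + arousal_low +
      dominance_high + dominance_low + abstraction + lexical_diversity +
      reading_difficulty,
    rnd = list(post_id = ~1, replier_id = ~1),
    data = balanced, lambda = l, family = binomial(link = "logit")
  ), silent = TRUE)
})

# Select the shrinkage parameter that minimizes BIC and inspect the coefficients;
# features whose coefficients are shrunk to zero are dropped.
bics <- sapply(fits, function(f) if (inherits(f, "try-error")) NA else f$bic)
best <- fits[[which.min(bics)]]
best$coefficients
```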

Table 4 Results of LASSO regression

These five features were subsequently entered into a multilevel logistic regression. Again, persuasion was the dependent variable, and we included random intercepts for replies nested within parent posts and for replies nested within repliers. All five features emerged as significant predictors of persuasion. Specifically, for a one-unit increase in word count, the odds of receiving a delta increased by a factor of 1.23, 95% CI [1.13, 1.35]. For a one-unit increase in reading difficulty (i.e., greater difficulty in reading comprehension), the odds of receiving a delta increased by a factor of 1.10, 95% CI [1.04, 1.16]. For a one-unit increase in analytical thinking, the odds of receiving a delta increased by a factor of 1.10, 95% CI [1.05, 1.17]. For a one-unit increase in self-references, the odds of receiving a delta decreased by a factor of 0.92, 95% CI [0.87, 0.98]. Last, for a one-unit increase in lexical diversity, the odds of receiving a delta decreased by a factor of 0.54, 95% CI [0.50, 0.59]. Post-hoc power analyses conducted using the simr package [34] revealed that we had at least 96% power to detect a small effect (i.e., 0.15) for each of these predictors on persuasion.
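A companion sketch for this follow-up model, including odds ratios with Wald confidence intervals and a simr-style post-hoc power check, is given below. Again, variable names are placeholders, and the fixed effect size of 0.15 follows the value reported in the text.

```r
# Illustrative follow-up model with the five LASSO-selected features.
library(lme4)
library(simr)

m_selected <- glmer(
  delta ~ word_count + lexical_diversity + reading_difficulty + analytic + i_words +
    (1 | post_id) + (1 | replier_id),
  data = balanced, family = binomial
)

# Odds ratios with 95% Wald confidence intervals for the fixed effects.
exp(cbind(OR = fixef(m_selected),
          confint(m_selected, parm = "beta_", method = "Wald")))

# Post-hoc power to detect a small effect (0.15 on the log-odds scale) for one
# predictor; simr simulates new outcomes under the assumed effect size.
fixef(m_selected)["word_count"] <- 0.15
powerSim(m_selected, test = fixed("word_count", "z"), nsim = 100)  # nsim kept small for speed
```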

Discussion

Previous studies have largely examined the effect of linguistic features on persuasion in isolation and have not incorporated properties of language that are often involved in real-world persuasion. As such, little is known about the key verbal dimensions of persuasion or the relative impact of linguistic features on a message’s persuasive appeal in real-world social interactions. To address these limitations, we collected large-scale data on online social interactions from a public forum in which users engage in debates in an attempt to change each other’s views on any topic. Messages that successfully changed a user’s views were explicitly marked by the user themselves. We simultaneously examined linguistic features that have previously been linked with message persuasiveness, comparing persuasive and non-persuasive messages. Our findings provide a parsimonious and ecologically valid understanding of the social psychological pathways to persuasion as it operates in the real world through verbal behavior.

Three linguistic dimensions appeared to underlie the tested features: structural complexity, negative emotionality, and positive emotionality. Each dimension uniquely predicted persuasion when the effects of the remaining dimensions were statistically controlled, with greater structural complexity exhibiting the highest odds of persuasion. Interestingly, messages marked by less emotionality had higher odds of persuasion than messages marked by more emotionality, regardless of whether that emotionality was positive or negative. Emotionality can help persuasion in specific contexts [37, 38], but emotional appeals can also backfire when audiences prefer cognitive appeals [39]. Given that OPs were publicly inviting others to debate them, it is plausible that they preferred cognitively appealing responses, ones that include an abundance of clear and valid reasons to support an argument, rather than emotionally appealing responses.

The linguistic features that made a message longer, more analytic, less anecdotal, more difficult to read, and less lexically diverse were most essential to a message’s persuasive appeal and uniquely predictive of persuasion. Longer messages provide more context and likely contain more arguments than shorter messages. Presenting more arguments can be more persuasive even if the arguments themselves are not compelling [40]. Longer messages likely provided more opportunities for the OP to engage with material that could potentially change their mind, thus increasing the likelihood of persuasion.

Although more readable content is easier to understand and less aversive than less readable content [41], greater difficulty in reading and comprehension can engender more interest, attention, and engagement [42, 43]. It can also facilitate deeper cognitive processing that leads to greater learning and long-term retention [44, 45]. This is especially true for individuals who are intrinsically motivated or capable of engaging in complex and novel tasks [46]. OPs were likely capable of, and intrinsically motivated to, engage with content that challenged their beliefs, considering they were inviting others to debate them. The interpretation that users were intrinsically motivated to challenge their beliefs is also in line with the link that emerged between greater use of analytical language and persuasion. Similarly, messages that focused less on one’s own personal experiences may have provided more objective evidence to support a particular argument, facilitating persuasion.

Last, while greater lexical repetition may be perceived as less interesting [31, 47], it facilitated persuasion in this context. Lexical repetition provides an effective way for speakers to communicate complex topics, as it keeps “lexical strings relatively simple, while complex lexical relations are constructed around them” [48]. Lexical repetitions are advantageous for navigating the order and logic of an argument, providing “textual markers” that help readers connect important aspects of an argument together [49]. Lower lexical diversity, then, appeared to be beneficial for building arguments that are more cohesive, more coherent, and thus more persuasive.

Altogether, our findings reveal that the linguistic features linked to persuasion fall along three dimensions pertaining to structural complexity, negative emotionality, and positive emotionality. Our findings also highlight the importance of linguistic features related to a message’s structural complexity, particularly the verbal behaviors that provide a greater amount of factual evidence in a way that enables readers to connect important aspects of the information in an appropriately stimulating manner. Although the other linguistic features that were examined in this study may contribute to message persuasiveness to some degree, our results indicate that they are relatively less important once word count, lexical diversity, reading difficulty, analytical thinking, and self-references are taken into account. These findings also seem to reflect r/ChangeMyView’s digital environment. A central feature of r/ChangeMyView is ensuring that all posts and replies meaningfully contribute to the conversations; as such, OPs and repliers must adhere to all moderator-enforced policies of interaction. In addition, users who post on r/ChangeMyView are likely individuals who are open to attitude change, given that they are publicly inviting others to debate them on a topic they already have an opinion on. This suggests that, in digital environments that underscore meaningful contributions to conversations, the ability to convey more objective information while fostering engagement and a holistic understanding of an argument is most vital to the alteration of established attitudes among open-minded individuals.

Our findings also have implications for how persuasion research via language is conducted. Assessing the relative importance of a linguistic feature for message persuasiveness allowed us to understand its interconnections with other linguistic features and its link to persuasion, yielding a more comprehensive and well-rounded understanding of the feature’s role in message persuasiveness. Consider word count, for example: without assessing its relative importance in the current study, we would not have been able to ascertain its link to message persuasiveness via a message’s structural complexity, or the importance of providing more content in a way that enables readers to connect important aspects of the information in an appropriately stimulating manner. Because the meaning of a word or linguistic feature in any text depends on the context in which it is used, understanding the social psychological pathways to persuasion via language requires researchers to account for the presence of multiple linguistic features within a given message when assessing any one feature’s link to message persuasiveness. This holistic approach may also help reconcile conflicting results from previous research on language and persuasion.

Our findings also inform theories, such as ELM, that address how linguistic features influence persuasion, and they provide a more precise understanding of the social psychological pathways to persuasion. For example, ELM states that there are two main routes to persuasion: the central route, which centers on the quality of a message’s arguments, and the peripheral route, which relies on heuristics and peripheral cues to influence individual decisions regarding a topic [6]. Individuals are more likely to be persuaded via the central route if they have the ability and motivation to process the information; they are more likely to be persuaded via the peripheral route if involvement is low and information processing capability is diminished. OPs likely have the ability and motivation to process arguments from repliers, given that they are publicly inviting others to debate them, and are thus likely persuaded via the central route. Supplying more information to support a conclusion may be more likely to persuade via the central route, but this information also needs to be organized in a way that helps readers connect its important aspects together. A wealth of information that is structured incoherently would undoubtedly hinder comprehension and, thus, persuasion.

Strengths and limitations

Our dataset contained a large sample of replies that spanned a wide variety of topics and provided high ecological validity, given that it captured the process of persuasion as it occurred naturally, without elicitation. The enforcement of rules on r/ChangeMyView yielded interactions that were conducted under similar conditions and expectations. This helped to minimize interaction variance without interfering with the naturalistic nature of the data. However, OPs can award deltas to responses within subtrees (the “children” of direct replies), typically as the result of “back-and-forth” interactions with repliers. These were not included in the current study, as we only examined top-level responses. Our results could also differ by topic, recency of the post, and post length, and it is possible that non-linguistic features, such as the popularity of a post, the number of “upvotes” a reply receives (i.e., the number of instances in which other users registered agreement with a particular post or reply), and the number of deltas a replier has ever received, may also impact message persuasiveness. Future studies should determine whether these variables moderate the findings; doing so would also address the relative importance of linguistic versus non-linguistic features for message persuasiveness.

Although it is a policy on r/ChangeMyView that OPs must post a non-neutral opinion (i.e., their post must take a non-neutral stance on a topic), and posts that violate this rule are removed by moderators, it is possible that an OP’s post did not accurately reflect their true attitude or attitude strength. Given the nature of the data, this study cannot address whether the resulting attitude changes were long-lasting, or whether the OP’s attitude strength moderated their attitude change. Longitudinal studies can assess these points. Because there were substantially more non-persuasive replies (99.39%) than persuasive ones, we constructed a balanced subsample and conducted our analyses on it. While this strategy limited biased outcomes stemming from a large class imbalance, it also limits the generalizability of the results to posts in which no persuasion occurred. Further examinations of the class imbalance are needed to address this issue. For example, it is possible that posts in which no persuasion occurred are systematically different from posts in which persuasion occurred; alternatively, the class imbalance may simply reflect the rigid nature of attitudes. In addition, our results may only reflect a particular population, given that Reddit users tend to skew younger and male [50]. Since we did not have access to subjects’ demographic information, we cannot verify the representativeness of our sample. Future research should investigate persuasion that takes place on other debate-style forums and websites to incorporate more diverse subjects, interaction modes, and digital environments.