1 Introduction

In this paper, we present and test a novel mathematical formulation of how information spreads online. Our model is based on fuzzy trace theory (FTT)—a leading account of decision under risk—which emphasizes the combined roles of mental representation, message content, social values, and individual differences. FTT posits that individuals encode multiple representations of a stimulus, such as online information, in parallel. These representations are referred to as gist—the essential meaning of the information—and verbatim—a detailed symbolic representation of the stimulus. Thus, our model incorporates factors capturing the extent to which the individual “gets the gist” (i.e., is able to extract meaningful information) and is therefore informed by the content of the message. We also incorporate a factor capturing motivation to share, such as may be triggered by material containing especially compelling media (e.g., vivid photos or surprising content). Our formulation is novel because it combines these cognitive and motivational considerations into a common computational framework.

1.1 Rationale for our approach

Our approach is motivated by the widespread effects of online misinformation and disinformation across multiple contexts. Although we focus on misinformation about vaccines in this article, online misinformation is now widely considered to be a threat in multiple domains (Grinberg et al. 2019), underlying shifts in electoral politics (Swire et al. 2017), epidemic outbreaks (Chou et al. 2018), and several other areas pertinent to national security. Indeed, social media have an especially wide reach, with Twitter as one of the most popular platforms. A recent Pew Research Center study (Perrin 2015) indicates that more people get their news from social media than from any other source. As of March 1, 2018, 24% of all online adults, and 45% of adults aged 18–24, were on Twitter (Smith and Anderson 2018). Thus, social media, and especially Twitter, dramatically increase the speed and scope of dissemination of narratives that may affect decisions and other behaviors, including decisions to share information online.

Public health professionals face challenges from this new communications environment (Chou et al. 2018). For example, the World Health Organization has recently declared vaccine hesitancy to be one of the world’s top 10 public health threats—in large part, driven by anti-vaccine sentiment on social media (Brewer et al. 2017). Indeed, the journal Vaccine devoted an entire special issue to the role social media plays in vaccination decisions (Betsch et al. 2012). Importantly, the consensus article in this special issue emphasizes the role of psychological factors, such as how online narratives are processed, in the spread of online information. Specifically, its authors state that “Narratives have inherent advantages over other communication formats...[and] include all of the key elements of memorable messages: They are easy to understand, concrete, credible...and highly emotional. These qualities make this type of information compelling...” (p. 3730). Furthermore, within the specific domain of vaccine refusal, recent studies have documented the role of both domestic and state-sponsored foreign actors using misinformative online messages about public health topics to market products and to promote political discord (Broniatowski et al. 2018; Jamison et al. 2019; Subrahmanian et al. 2016).

Although our primary focus in this paper is misinformation about vaccines, recent political developments have highlighted the popularity of “fake news” which, although factually inaccurate, may have been shared more widely online than vetted media sources (Silverman 2016; and see also Dredze et al. 2017). Findings suggest that unverified information with highly surprising, or emotionally arousing and therefore motivational, content may travel faster and farther than information containing verbatim facts (Vosoughi et al. 2018; Berger and Milkman 2012). We propose that the influence of news, and its concomitant recognition as fake or genuine, can be studied as a scientific problem (see also Pennycook et al. 2018). We therefore focus on analogues in the literature (i.e., vaccination) that provide theoretical and empirical insight into the process of influence through social media.

In this paper, we aim to model how these psychological factors drive online sharing. Studies in psycholinguistics have identified a narrative’s “coherence” as a key factor driving a story’s comprehensibility and long-term retention (Trabasso et al. 1982; Van den Broek 2010; Pennington and Hastie 1991). Although several dimensions of narrative coherence have been proposed (Reese et al. 2011; Gernsbacher et al. 1996), there is a consensus in the literature that coherent narratives often provide a causal structure for the events described (Mandler 1983; Trabasso and Sperry 1985; Gernsbacher et al. 1990; Diehl et al. 2006; Van den Broek 2010), therefore conveying the meaning, or gist of the story. In contrast, incoherent stories contain a relatively weak causal structure. According to this reasoning, therefore, online information facilitating causal coherence produces more coherent and meaningful gists, and will therefore be more influential. In contrast, official communications tend to focus on literal verbatim facts without emphasizing the causal relations among those facts in a manner that communicates a coherent gist. For example, government sites tend to focus on “how” vaccines work, whereas anti-vaccination narratives focus on providing a causal (though not necessarily accurate) explanation for “why” vaccines are harmful and are consequently more comprehensible, influential, and memorable (Trope and Liberman 2010; Fukukura et al. 2013).

The outline of this paper is as follows: In Sect. 2, we provide an overview of literature motivating the use of FTT to model the spread of online information. In Sect. 3, we provide a description of our modeling approach. Section 4 tests this model on an existing dataset of tweets about vaccines. Finally, Sect. 5, discusses the implications of these findings for future work, and concludes.

2 Literature review

2.1 Fuzzy-trace theory

According to FTT, effective messages help readers retain the meaning of the message in memory (because gist endures) and, hence, facilitate availability of the knowledge at the time of behavior. FTT can be used to explain the popularity of online messages because of the search for meaning and the tendency to interpret events even when knowledge is inadequate. FTT’s approach to online communication builds on the core concepts of gist and verbatim mental representations, adapted from the psycholinguistic literature (Kintsch 1974) but modified in light of more recent findings (see Reyna 2012). According to FTT, meaningful stimuli such as social media messages (e.g., those that communicate narratives) are encoded into memory in two forms: a verbatim representation (the objective stimulus or a decontextualized representation of what actually happened) and a gist representation, the subjective or meaningful interpretation of what happened (Reyna et al. 2016). Verbatim representations encode details, such as exact numbers. For example, an anti-vaccine message discussing the results of an isolated scientific study (Cowling et al. 2012) out of the context of the broader literature (Sundaram et al. 2013) may state that “Flu Shot Induces 4.4-fold increase in non-flu acute respiratory infections.” In contrast, a gist representation encodes the essential meaning of the sentence. Furthermore, there may be multiple gist representations. An uninformed gist that supports avoiding the vaccine might be held by a non-expert as follows: “Say no to the Flu Shot !! It’s ineffective and dangerous ...” In contrast, a gist held by an expert might emphasize “many problems w/ reporting bias & confounding” indicating that the findings of this specific study should not be considered definitive. Gist representations depend on culture, knowledge, beliefs, and other life experiences (Reyna and Adam 2003).
However, in practice, coherent gist representations have been communicated to diverse audiences. Importantly, gist interpretations, rather than verbatim facts, tend to guide decisions and behavior.

When making sense of text, gist representations reflect coherent, causal stories (Reyna 2012; Reyna et al. 2016; Pennington and Hastie 1991). These narratives “connect the dots” to offer a coherent account. More coherent stories, such as those connecting adverse health outcomes (e.g., autism) to certain behaviors (e.g., vaccination), are more likely to be accepted because they “make sense”—i.e., they provide an explanation for otherwise mysterious adverse events. Online messages are predicted to increase in popularity when similar messages from one’s friends or other trusted sources make certain ideas plausible (e.g., that the government would intentionally infect people), especially coupled with an increased prevalence of poorly understood outcomes. Thus, a story describing how children developed symptoms of autism after having gotten vaccinated might allow one to erroneously conclude that vaccines cause autism. (In fact, the symptoms of autism tend to occur around the same time as the US Centers for Disease Control and Prevention recommend that children receive vaccines.) Similar spurious correlations underlie the false claims that exposure to the larvicide pyriproxifen (Vazquez 2016) or receipt of the DTaP vaccine by pregnant mothers, rather than the Zika virus, causes birth defects (Dredze et al. 2016a).

2.1.1 Individual differences

Prior work on FTT has shown that an individual’s reliance on gist vs. verbatim representations is associated with individual differences in metacognitive monitoring and editing initial reactions to information (Broniatowski and Reyna 2018). In the domain of risky decision problems that involve numerical information, studies have found that more numerate individuals (i.e., those possessing greater mathematical ability) are less prone to framing biases (Liberali et al. 2012; Peters et al. 2006; Peters and Levin 2008; Schley and Peters 2014), suggesting an increased ability to directly compare decision options that have the same verbatim expected value (Broniatowski and Reyna 2018). (Framing biases are effects of phrasing the same outcomes differently, such as phrasing choice options in terms of saving 200 people or as 400 people dying when 600 people are expected to die if nothing is done). Similarly, subjects exhibiting high Need for Cognition (NFC) (Cacioppo et al. 1984; Cacioppo et al. 1996) tend to be more consistent across multiple exposures to framing problems, presumably because they are able to identify the common structure of these problems (Broniatowski and Reyna 2018; LeBoeuf and Shafir 2003; Simon et al. 2004; Curseu 2006). However, the effects of numeracy and NFC do not explain where framing biases come from to begin with—namely, from gist representations of meaning in context—but they do explain the tendency to inhibit gist, especially in within-subjects designs featuring different frames for the same information (LeBoeuf and Shafir 2003; Broniatowski and Reyna 2018).

Individual differences have been found in the domain of narrative comprehension (Rapp et al. 2007). For example, Linderholm et al. (2000) and Van den Broek (2010) found that more skilled readers, and those with more relevant background knowledge, were better able to extract the gist from narratives with poorly-defined causal structures. In addition, LaTour et al. (2014) observed that subjects higher in NFC were better able to identify and reject narratives whose gists were inconsistent [see also Pennycook and Rand (2018) who found that subjects scoring higher on the cognitive reflection test (Frederick 2005) were better able to distinguish between true and misinformative headlines—cognitive reflection is known to be correlated with both numeracy (Liberali et al. 2012; Cokely and Kelley 2009) and need for cognition (Pennycook et al. 2016)]. More recently, van den Broek and Helder (2017) describe evidence for a model of narrative comprehension in which multiple levels of mental representation are encoded. Specifically, the authors differentiate between readers who prefer to use coherence-building strategies relying on effortful “close-to-the-text” reading (perhaps analogous to those exhibiting high NFC) and those who utilize a more interpretive strategy that is “farther” from the text. Importantly, such interpretive processes are associated with domain expertise (Goldman et al. 2015)—a hallmark of gist processing (Reyna and Lloyd 2006). Thus, there is reason to believe that individual differences associated with systematic variation in susceptibility to framing biases may also be associated with differences in one’s ability to extract a meaningful gist from online narrative text. Furthermore, a subject’s ability to extract this meaningful gist is a function both of the subject’s characteristics and the narrative’s content—more difficult texts are likely to appeal only to those subjects possessing the willingness and ability to expend the effort to comprehend them.

2.1.2 Motivational factors

Beyond the effects of metacognitive monitoring and editing, there is evidence indicating the role of motivational factors in risky decisions. For example, reward sensitivity has been associated with risk-taking across a wide range of problem types (e.g., Reyna et al. 2011; Broniatowski and Reyna 2018; Galván 2017). Berger and Milkman (2012) examined the psychological drivers of online information diffusion with implications for the motivational factors posited by our model. Specifically, the authors examined the determinants of what makes specific news articles more likely to be shared by email. They found that “virality” can essentially be described by two classes of factors: (1) motivational factors, which they described as “how surprising, interesting, or practically useful content is (all of which are positively linked to virality)...”; and (2) emotional valence/arousal factors. Regarding the latter, positively valenced items were more likely to be shared than negatively valenced items; however, arousal plays an important role as well. Specifically, high-arousal emotions, such as awe, anger, or anxiety, were more likely to go viral, whereas low-arousal emotions, such as sadness, led to less virality. In the domain of online narrative, such factors may also include flashy media, surprising or otherwise emotionally-arousing content (Vosoughi et al. 2018; Berger and Milkman 2012) and other motivational “clickbait” designed to temporarily grab the user’s attention. Additionally, motivational factors include trust in specific sources including the government, celebrities, and other opinion leaders (e.g., Quinn et al. 2013; Swire et al. 2017), and prior associations that trigger strong impulsive reactions (e.g., appeals to emotion). Although these factors typically engender virality, their effects tend to quickly diminish (Swire et al. 2017).

2.2 Evidence for FTT’s predictions online

2.2.1 Explicit tests of FTT online

Prior work (Broniatowski et al. 2016) has examined FTT’s predictions in the context of the Disneyland Measles Outbreak which began in December 2014 at Disneyland in California and led to 111 confirmed cases of measles in seven states (as well as in Canada and Mexico). Although measles was widely considered eliminated in the United States, reduced vaccination rates in some communities, due to concerns about vaccine toxicity, ultimately called attention to the issue of herd immunity—how slight reductions in vaccination rates can lead to epidemics.

This study was conducted in the context of an ongoing debate: Does including an anecdotal narrative lead to more effective communication compared to presenting “just the facts” (Buttenheim and Asch 2016) (i.e., statistical data)? In addition to the perceived effectiveness of narratives noted above, public health officials have been hesitant to include stories in their communications due to concerns of appearing biased or paternalistic. In contrast, FTT predicts that the verbatim details of a message are incorporated separately from, but in parallel to, the gist of the message. According to FTT, narratives are effective to the extent that they communicate a gist representation of information that then better cues motivationally relevant moral and social principles.

Broniatowski et al. (2016) crowdsourced the coding of 4581 out of a collection of 39,351 outbreak-related articles published from November 2014 to March 2015, asking coders to indicate whether each article expressed statistics (a verbatim representation), a story, and/or a “bottom line meaning” (i.e., a gist). Finally, they measured how frequently these articles were shared on Facebook. Results were consistent with expectations based on FTT, enumerated below:

  1. FTT predicts that gist and verbatim representations are encoded in parallel. The authors found that both gist and verbatim types of information were associated with an article’s likelihood of being shared at least once, constituting distinct sources of variance.

  2. The effects of gist were larger than the effects of verbatim, consistent with FTT’s “fuzzy-processing preference.”

  3. Stories did not have a significant impact on an article’s likelihood of being shared after controlling for gist and verbatim, indicating that stories are only effective to the extent that they communicate a gist.

  4. Among those articles that were shared at least once, only the expression of a gist was significantly associated with an increased number of Facebook shares (articles with gists were shared 2.4 times more often, on average, than articles without gists).

  5. Articles expressing a gist that also expressed positive opinions about both pro- and anti-vaccine advocates were shared 57.8 times more often than other articles, suggesting that facts can indeed be effectively shared if concerns of those on the “opposing” side are acknowledged (while emphasizing the bottom-line meaning of the data in its cultural context).

  6. Motivational factors—e.g., presence of vivid media—were associated with an article’s likelihood of being shared at least once, but not with more than one share.

These results provide evidence supporting FTT’s expectations for online information sharing and suggest that content features should be predictive of the spread of online misinformation. Furthermore, there is evidence supporting the combined roles of meaning-making (gist) and motivation on the sharing of online information. As will be discussed below, our model incorporates both types of factors.

2.3 Content features have not traditionally incorporated gist

Although online misinformation and disinformation are relatively new problems, significant work has been performed examining the spread of ideas, such as rumors, through social networks (e.g., Rogers 2010). Most of this prior work has focused on complex contagion (Centola 2010; Mønsted et al. 2017) and homophily (Centola 2011; Bakshy et al. 2015; Grinberg et al. 2019)—both mediated by social network structure—as antecedents of information sharing. In contrast, comparatively little analysis of the psychological content of this information has been performed.

Romero et al. (2011) conducted an observational study that was explicitly designed to examine the role of content while controlling for social network factors. Specifically, they defined two content-based measures of a Twitter hashtag’s spread: (1) “stickiness”—the probability of sharing given at least one exposure to a hashtag—and (2) “persistence”—whether sharing continues after multiple exposures. Using this approach, the authors found evidence in support of variation in complex contagion by topic. Although this analysis primarily focused on hashtags rather than on true semantic content, the authors did provide some evidence suggesting that more meaningful hashtags, indexing topics such as politics, may be more persistent over time when compared to less meaningful “idiomatic” hashtags that are more motivational in nature.

In general, several studies have focused on verbatim-level features, rather than the gist-based semantic content that FTT predicts would be compelling. These studies have concluded that verbatim content features are not predictive when compared to structural features. For example, Petrovic et al. (2011) used a machine-learning approach to examine the relative predictive power of “social features” (features of the tweet’s author) compared to “tweet features” (text and statistical verbatim features of the tweet) in predicting retweets. The authors found that social features, rather than verbatim content features, were most predictive of retweets. Similarly, Cheng et al. (2014) found that the size of an information cascade could be more easily predicted based on temporal (i.e., how quickly an item was shared after having been initially posted) and structural, rather than verbatim content-based, features as the cascade grew. Tsur and Rappoport (2012) concluded that structural features of Twitter hashtags captured more variance than did verbatim content features. However, unlike prior studies, Tsur and Rappoport (2012) examined the context of tweets, finding that an interaction of content and contextual features was indeed predictive of a hashtag’s spread, adding significant predictive value above the contribution of structural features. The authors acknowledge that the cognitive/psychological attributes of their tweets were not well characterized, potentially explaining why they were unable to capture more variance with these factors. There is therefore a need to explicitly examine the role of gist factors in the spread of information online.

3 A model of information sharing online

We aim to explicitly test FTT’s core constructs using social media data. Our approach builds on a recent mathematical formalization of FTT (Broniatowski and Reyna 2018).

3.1 Parameter specification

The structure of the model is as follows: FTT posits a hierarchy of gist that is, in the domain of numbers, analogous to scales of measurement (Reyna and Brainerd 2008; Stevens 1946). Broniatowski and Reyna (2018) illustrate this hierarchy with the following example:

...consider the following choice between:

  1. Winning $180 for sure; versus

  2. 0.90 chance of winning $250 and 0.10 chance of no money.

[At the simplest level of gist], people represent this decision as a categorical choice between the following two options:

  1. Some chance of winning some money

  2. Some chance of winning some money

Given this representation, most decision makers would favor option 1 because it promises some money without the chance of no money. However, more precise, yet still qualitative, representations are also generated simultaneously, such as ordinal representations (e.g., small vs. large amount of money):

  1. More chance of winning less money

  2. Less chance of winning more money and some chance of winning no money.

This representation does not allow for a clear decision to be made because most people would prefer winning more money to winning less money, but they would also prefer more chance of winning to less chance of winning. Finally, one may choose a precise interval representation of the problem whereby one calculates the expected value of each option by multiplying its respective outcomes by their probabilities, as follows:

  1. Expected value of $180 (i.e., $180 × 1)

  2. Expected value of $225 (i.e., $250 × 0.90 + $0 × 0.10)

Given this representation, most decision makers would favor option 2.

According to FTT, in the domain of numbers, risky decisions are encoded at the categorical, ordinal, and interval levels simultaneously. Broniatowski and Reyna (2018) model the probability, P, that a subject will choose a given decision option in a risky choice gamble by the logistic function,

$$\begin{aligned} P({\mathbf {x}}) = \frac{1}{1+e^{-({\mathbf {a}} \cdot {\mathbf {x}}+b)}} \end{aligned}$$
(1)

where \({\mathbf {x}}\) is a vector containing an entry for each level of mental representation (e.g., gist and verbatim), and \({\mathbf {a}}\) is a vector of decision weights that are associated with individual differences in a subject’s ability to inhibit cognitive biases, e.g., due to a subject’s numeracy (e.g., Liberali et al. 2012; Peters et al. 2006; Peters and Levin 2008; Schley and Peters 2014) and NFC (Cacioppo et al. 1984; Cacioppo et al. 1996). In addition, b captures a subject’s overall motivation. We account for conflict between representations by adding weighted votes from each representation. As in the example above, preferences depend on the application of social values (e.g., winning money is good) to representations of options. If the categorical gist representation prefers the certain option (− 1), the ordinal representation is indifferent (0), and the expected-value representation of the problem prefers the risky option (+ 1), then \({\mathbf {x}} = [-1, 0, +1]\). Furthermore, research suggests that neurotypical adults weigh the simplest gist representation most heavily; for example, if the representational weights \({\mathbf {a}} = [2, 1, 1]\) for each of the levels of mental representation posited by our model, then

$$\begin{aligned} {\mathbf {a}} \cdot {\mathbf {x}} = -2 + 0 + 1 = -1 \end{aligned}$$
(2)

Finally, suppose we estimate b = 0.5, indicating a preference for the more rewarding, though riskier, option (\(b>0\) indicates overall risk-seeking behavior, whereas \(b<0\) indicates risk averse behavior). Under these assumptions, the probability that a randomly chosen subject from our sample will choose the risky gamble option is

$$\begin{aligned} P({\mathbf {x}}) = \frac{1}{(1+e^{-(-1+0.5)})} = 38\% \end{aligned}$$
(3)

Inputs to this model are the three parameters outlined above (\({\mathbf {x}}, {\mathbf {a}}\), and b) and the model outputs a prediction regarding a decision probability (summarized in Table 1).
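
The worked example above can be reproduced in a few lines of Python. The following is a minimal sketch of Eq. (1); the function and variable names are our own choices, not part of the model specification:

```python
import math

def choice_probability(x, a, b):
    """P(x) = 1 / (1 + exp(-(a . x + b))), per Eq. (1): the probability
    of choosing the risky option given representation votes x,
    representational weights a, and motivational bias b."""
    z = sum(ai * xi for ai, xi in zip(a, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Worked example from the text: the categorical gist prefers the sure
# option (-1), the ordinal level is indifferent (0), and the interval
# (expected-value) level prefers the gamble (+1).
x = [-1, 0, +1]
a = [2, 1, 1]   # the simplest gist is weighted most heavily
b = 0.5         # overall risk-seeking motivation

print(round(choice_probability(x, a, b), 2))  # 0.38, matching Eq. (3)
```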

Table 1 Summary of model parameters

3.1.1 Mental representation of prior knowledge

Many online audiences lack extensive prior knowledge about controversial topics. Under these circumstances, causal narratives that provide explanations for otherwise mysterious adverse events are easy to comprehend and therefore compelling. In such circumstances, individuals also rely on their social contacts for signals of the trustworthiness of online information. For example, Granovetter and Soong (1983) described the decision to adopt a behavior, such as spreading a rumor, as a function of the number of friends who had done the same. Specifically, they framed this decision as a risky binary choice: sharing when few people have done so is risky, yet doing so when many others have done so is safer. Thus, we posit that whether sharing a controversial article is perceived as risky is affected by social influence, as determined by a threshold. When the number of exposures does not exceed the threshold, sharing the article is perceived as “risky”: here, the decision-maker faces the following binary choice, analogous to a framing problem (Tversky and Kahneman 1981) (see also Broniatowski et al. 2015; Klein et al. 2017):

  A. Don’t share the online information and lose no social capital

  B. Share the article and maybe lose no social capital, but maybe lose some social capital (such as when one is criticized by friends)

In contrast, when the number of exposures exceeds the threshold (meaning that the information is now socially validated), the decision-maker faces the following choice:

  C. Don’t share the article and gain no social capital for sure

  D. Share the article and maybe gain some social capital as reflected by likes, reshares, etc., but maybe gain no social capital

The factor of representations is captured by the \({\mathbf {x}}\) vector in our model.
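
The threshold-dependent framing described in this section can be sketched as follows; the function name and the numeric threshold are illustrative assumptions, not part of the model:

```python
def sharing_frame(exposures, threshold):
    """Which binary choice the decision-maker faces under the threshold
    account above. When exposures do not exceed the threshold, sharing
    risks a loss of social capital (options A vs. B); once the threshold
    is exceeded, the information is socially validated and sharing
    offers a possible gain (options C vs. D)."""
    return "loss frame" if exposures <= threshold else "gain frame"

# Illustrative only: a user whose threshold is 3 exposures.
print(sharing_frame(exposures=1, threshold=3))  # loss frame: options A vs. B
print(sharing_frame(exposures=5, threshold=3))  # gain frame: options C vs. D
```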

3.1.2 Values associated with gist principles

To decide between the options outlined above (A vs. B or C vs. D), subjects must apply social values that they endorse, called “gist principles” in FTT because, like information, they are mentally represented in simple gist forms. For example, a subject who is seeking social approval and who perceives the possibility of positive attention from sharing will be more likely to choose to share the information. They apply the gist principle—that “positive attention is better than no attention”—to their representations of options. Similarly, one who feels that he or she is at risk of social opprobrium but perceives nil risks from not sharing would not share the information since not getting criticized is preferred to getting criticized. This factor is captured by the signs of the elements in the \({\mathbf {x}}\) vector. Naturally, the subject’s assessments of how their friends might react are central to their judgments, with culture, worldview, and social identity all informing the gist of what information will be well received when shared.

3.1.3 Weights assigned to each mental representation

Subjects differ in the degree to which they rely on categorical gist (and other gist representations) versus literal verbatim information. For example, those who are more numerate (in the sense of rote computation) can rely more on precise numerical and literal details, giving less relative weight to categorical gist, all else equal (Reyna and Brainerd 2008). Similarly, when cued with obvious equivalencies, such as when framing is manipulated within-subjects, those with higher Need for Cognition may compare between frames, giving more relative weight to verbatim tradeoffs. Individual differences in reliance on these representations are captured by the \({\mathbf {a}}\) vector. Conversely, social media content that is easier to comprehend, because it is less detailed, is likely to be more widely shared.

3.1.4 Motivational factors

Motivation and strong emotion can bias decisions. For example, articles may contain “clickbait” or other factors that are designed to trigger impulsive sharing behavior. This factor is captured by the b parameter.

Granovetter and Soong (1983, p. 167) presage these factors in their threshold model of collective action. Specifically, they associate risky decisions with motivational personality factors (“Some individuals are more daring than others”), factors associated with social values in cultural context (“some are more committed to radical causes...”), and factors associated with verbatim cost-benefit calculations (“rational economic motives”).

3.2 Dataset

In order to test our model, we must measure its key constructs on social media. The analysis that follows is based on a set of 10,000 tweets about vaccines collected between November 2014 and September 2017, tagged as relevant to vaccines using the classifier described in Dredze et al. (2016b), and containing at least one word starting with “vax” or “vacc”. This procedure yielded a dataset that was largely relevant to the online discourse about vaccine safety, although with some outliers (such as tweets pertaining to vaccinating pets and messages from fans of a band called “The Vaccines”). We chose not to remove these tweets since they were segmented by the topic model analysis (described below). Each tweet was hand-annotated by three raters as pro-vaccine, anti-vaccine, or neutral. Annotators had moderate agreement (Fleiss’ \(\kappa\) = 0.49) in the first round of annotation, and annotation rounds were conducted until raters reached consensus (typically 2–3 rounds; see Broniatowski et al. (2018) for full dataset details, annotation instructions, and procedure).
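
For readers who wish to reproduce this kind of agreement statistic, a from-scratch sketch of Fleiss’ kappa follows; the ratings shown are toy counts of our own invention, not the study’s data:

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a table of shape (items, categories), where
    table[i][j] is the number of raters assigning item i to category j."""
    n_items = len(table)
    n_raters = sum(table[0])
    # Per-item agreement: proportion of agreeing rater pairs.
    p_items = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
               for row in table]
    p_bar = sum(p_items) / n_items
    # Chance agreement from marginal category proportions.
    totals = [sum(row[j] for row in table) for j in range(len(table[0]))]
    p_cat = [t / (n_items * n_raters) for t in totals]
    p_e = sum(p * p for p in p_cat)
    return (p_bar - p_e) / (1 - p_e)

# Three raters labeling four tweets as pro-vaccine / anti-vaccine / neutral.
ratings = [
    [3, 0, 0],   # all three raters: pro-vaccine
    [0, 3, 0],   # all three raters: anti-vaccine
    [1, 1, 1],   # full disagreement
    [2, 0, 1],
]
print(round(fleiss_kappa(ratings), 2))  # 0.32
```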

3.3 Operationalizing model parameters

The factors identified above map to the key elements of the model of decision under risk upon which we build (Broniatowski and Reyna 2018). We assume that the probability that a given individual will share an item of information can be described using the logistic function described in Eq. (1) where \({\mathbf {x}}\) is a vector capturing mental representations, \({\mathbf {a}}\) is a vector capturing weights placed on each such representation, and b is a scalar capturing motivational factors.

3.3.1 P(\({\mathbf {x}}\))—the probability that a given message is shared

Our model may be rewritten as

$$\begin{aligned} \mathrm{logit}[P({\mathbf {x}})]= {\mathbf {a}} \cdot {\mathbf {x}}+b \end{aligned}$$
(4)

where

$$\begin{aligned} \mathrm{logit}[P({\mathbf {x}})] = \log \left( \frac{P({\mathbf {x}})}{1-P({\mathbf {x}})}\right) \end{aligned}$$
(5)

We operationalize P(\({\mathbf {x}}\)) by measuring the total number of retweets per follower for each message in our dataset. We use the logit transform so that we may test our predictions using linear regression models.
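As a minimal sketch of this operationalization (the counts and the helper name `logit` are ours, for illustration only, not drawn from the paper's dataset or pipeline):

```python
import numpy as np

def logit(p, eps=1e-6):
    """Log-odds transform; eps keeps exact 0 and 1 away from the asymptotes."""
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

# Hypothetical per-tweet counts, invented for illustration.
retweets = np.array([3, 0, 12])
followers = np.array([150, 40, 200])

# Operationalize P(x) as retweets per follower, mapped to the logit scale
# so that a linear model a.x + b can be fit by ordinary least squares.
y = logit(retweets / followers)
```

The clipping step is one way to handle tweets with zero retweets, whose share proportion would otherwise map to negative infinity.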

3.3.2 \({\mathbf {x}}\)—mental representation

Although we cannot directly measure the mental representations of every social media user in our sample, we may examine proxies for gist. Specifically, Griffiths et al. (2007) posited that Latent Dirichlet Allocation (LDA) (Blei et al. 2003) may be used as a measure of the gist associated with a given document. Although we agree that probabilistic topics, such as those generated by LDA, may be associated with gist, there is work demonstrating that LDA does not always yield topics that are comprehensible by humans (Chang et al. 2009)—some topics are expected to be more coherent than others. As indicated above, we expect that topics that “connect the dots”—i.e., expressing causal coherence—are more likely to capture a compelling gist.

In order to capture a proxy for gist, we fit a 50-topic LDA model to our dataset using unigram and bigram features and the scikit-learn (Pedregosa et al. 2011) and lda (Riddell 2014) Python packages. This allows us to determine the probability that any given tweet is about a given topic. These probabilities were converted into logarithmic units using a logistic transform to control for floor and ceiling effects. The top five most frequent terms associated with each topic are shown in Table 2.
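This fitting step can be sketched with scikit-learn's own LDA implementation (the paper also uses the lda package); the three toy tweets and the 2-topic setting are placeholders for the real corpus and the 50-topic model:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy stand-ins for real tweets in the dataset.
tweets = [
    "vaccines cause autism study",
    "measles vaccine saves lives",
    "vaccinate your pets against rabies",
]

# Unigram and bigram count features, as described in the text.
vectorizer = CountVectorizer(ngram_range=(1, 2))
counts = vectorizer.fit_transform(tweets)

# The paper fits 50 topics; 2 suffice for this toy corpus.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(counts)  # per-tweet topic proportions (rows sum to 1)

# Logistic (logit) transform of the proportions to control floor/ceiling effects.
eps = 1e-6
theta_clipped = np.clip(theta, eps, 1 - eps)
theta_logit = np.log(theta_clipped / (1 - theta_clipped))
```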

Table 2 Topics extracted from dataset using LDA

Topic 2, in particular, captures the gist that vaccines cause autism. Since this topic explicitly captures a causal gist, we expect that it will be associated with a higher number of retweets per follower.

3.3.3 a—representational weights

Although our data do not allow us to directly measure the weights that each sharer places on different mental representations, we are able to determine the comprehensibility of each tweet using standard metrics. We expect that tweets that are more difficult to comprehend will be shared less frequently because some individuals, perhaps those with lower literacy, will not be able to derive meaningful information from them, whereas those with higher Need for Cognition may understand them but may prefer to share something more compelling.

We assess the comprehensibility of a tweet using several standard measures contained in the textstat python package (Bansal 2018). Since several of these measures are correlated, we conducted a principal component analysis (PCA) to extract orthogonal factors associated with text comprehensibility, corresponding to readability, verbatim features, and number of sentences (see Table 3). These principal components were used as predictors.
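A sketch of this dimensionality-reduction step, with invented readability scores standing in for textstat outputs (e.g., Flesch reading ease, syllable counts, sentence counts); the same approach applies to the emotion measures below:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Hypothetical readability metrics for four tweets (columns are stand-ins
# for correlated textstat measures; these are not the paper's values).
scores = np.array([
    [72.1, 14, 1],
    [55.3, 22, 2],
    [88.0, 9, 1],
    [40.2, 30, 3],
])

# Standardize the correlated metrics, then rotate them into orthogonal
# principal components to be used as regression predictors.
components = PCA(n_components=3).fit_transform(
    StandardScaler().fit_transform(scores))
```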

Table 3 Results of PCA applied to measures of text comprehensibility

3.3.4 b—motivational factors

Many online messages contain compelling multimedia presentations, such as vivid images, movies, or sounds that are expected to be motivational. For example, previous work indicates that the presence of images on social media increases the likelihood that the message will be shared at least once (Broniatowski et al. 2016; Chen and Dredze 2018), presumably because it is more noticeable and often more emotionally arousing. Thus, we record whether a given tweet contains any such media as a proxy for its motivational power.

Additionally, we assess the emotional content of a tweet using two separate measures: (1) weighted emotion scores associated with Plutchik’s eight basic emotions (joy, trust, fear, surprise, sadness, anticipation, anger, and disgust) (Plutchik 2001)Footnote 3, and (2) mean valence, arousal, and dominance scores associated with the Affective Norms for English Words (ANEW) dictionary (Bradley and Lang 1999). We again conducted a PCA to extract orthogonal factors associated with emotion, yielding three dimensions corresponding to ANEW scores (averaged across both positive and negative valences), Plutchik’s negative emotions only, and Plutchik’s positive emotions only (see Table 4). These principal components were used as predictors.

We tentatively associate these emotional measures with the b parameter because Vosoughi et al. (2018) and Berger and Milkman (2012) speculated that such emotions were the driving force behind virality (however, see Rivers et al. 2008 for a more extensive discussion of the relationship between different definitions of emotion and decision-making). Finally, we included a dummy variable indexing whether a tweet was generated by a “verified user”—defined by Twitter as an account “of public interest”Footnote 4—and therefore a proxy for celebrity.

Table 4 Results of PCA applied to measures of emotion

3.4 Regression analyses

The aim of our analysis is twofold. On one hand, fuzzy-trace theory provides an explanation for which tweets about vaccines are shared online. We therefore seek to determine which of the theoretically-motivated factors, identified above, are significantly associated with online sharing. On the other hand, we seek a model that may be used to predict which of these tweets are more likely to be shared on new data without overfitting (for more about the distinction between prediction and explanation, see Shmueli 2010). Our model selection process is informed by these two parallel, yet complementary, goals.

3.4.1 Data segmentation

Consistent with our prior work (Broniatowski et al. 2016), we analyzed the factors driving virality (i.e., the number of retweets per follower) separately from those associated with the likelihood that a given tweet was retweeted at least once. Consequently, after removing 2254 tweets that were generated by accounts with 0 followers (meaning that we could not calculate the number of retweets per follower), we separated our sample into two segments—those that had been retweeted at least once (n = 1388), and those that had not (n = 6358).
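The segmentation amounts to a pair of filters; a minimal sketch with a toy table (column names are ours, not the dataset's schema):

```python
import pandas as pd

# Toy stand-in for the tweet table.
df = pd.DataFrame({
    "retweets":  [5, 0, 2, 3],
    "followers": [100, 50, 0, 200],
})

# Drop tweets from accounts with 0 followers: retweets per follower
# is undefined for them.
df = df[df["followers"] > 0]

# Segment into tweets retweeted at least once (linear regression on
# retweets per follower) and tweets never retweeted (logistic target).
retweeted = df[df["retweets"] > 0]
not_retweeted = df[df["retweets"] == 0]
```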

3.4.2 Linear multiple regression

Our goal was to select the best-fitting linear regression model to predict retweets per follower among those tweets that had been retweeted at least once. Consistent with our dual aims of prediction and explanation, we carried out three rounds of model fitting, each of which used two separate model selection procedures.

  1. Explanation We performed model fitting by bidirectional stepwise elimination, using the Akaike Information Criterion (AIC; Akaike 1976), which is mathematically equivalent to \(L_0\)-norm regularization, as the minimization criterion. The starting state for this stepwise procedure was a model including all main effects, but no interactions. Terms were removed or added one at a time if doing so reduced AIC.

  2. Prediction We segmented our data into thirds, holding one segment out for measuring predictive accuracy, with the remaining tweets used for training and testing. We used Least Absolute Shrinkage and Selection Operator (LASSO) regression—a technique based on \(L_1\)-norm regularization—with threefold cross-validation to determine the factors underlying the most predictive model.
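The prediction procedure can be sketched with scikit-learn's LassoCV on synthetic data (the predictors and coefficients below are invented; only features 0 and 3 carry signal):

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic predictors standing in for topic logits, PCA components, etc.
X = rng.normal(size=(300, 10))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=300)

# Hold one third out for measuring predictive accuracy, as in the text.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=1 / 3, random_state=0)

# L1-regularized fit with threefold cross-validation; coefficients shrunk
# to exactly zero drop out of the predictive model.
model = LassoCV(cv=3).fit(X_train, y_train)
selected = np.flatnonzero(model.coef_)
```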

In both cases, predictors included items associated with theoretically-motivated factors:

  1. \({\mathbf {x}}\): logit-transformed proportions of all 50 topics

  2. \({\mathbf {a}}\): the three PCA dimensions of text comprehensibility

  3. Tweet polarity: pro-vaccine, anti-vaccine, or neutral

  4. b: the three PCA dimensions of emotion, dummy variables indicating the presence or absence of vivid media, and whether or not a tweet was generated by a verified user (an explicit measure of source credibility)

In addition, first- and second-order interaction terms between topics, polarity, and comprehensibility were included to account for their multiplicative effects in our model.

In the second round of model-fitting, we constructed new ordinary least squares (OLS) regression models containing only the factors that replicated across both the bidirectional elimination and LASSO model selection methodologies. Finally, in the third round, we removed all factors that were not significant at the \(p\,<\,0.05\) level after controlling for multiple comparisons using the Holm-Bonferroni procedure.
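The round-three filtering step can be sketched directly (a step-down test at \(\alpha = 0.05\); the p-values below are invented, not the paper's results):

```python
import numpy as np

def holm_bonferroni(pvals, alpha=0.05):
    """Boolean mask of hypotheses rejected by Holm's step-down procedure."""
    p = np.asarray(pvals)
    m = len(p)
    reject = np.zeros(m, dtype=bool)
    for k, idx in enumerate(np.argsort(p)):
        # Compare the k-th smallest p-value against alpha / (m - k).
        if p[idx] <= alpha / (m - k):
            reject[idx] = True
        else:
            break  # step-down: once one test fails, all larger p-values fail
    return reject

# Hypothetical p-values from a round-two regression fit.
mask = holm_bonferroni([0.001, 0.04, 0.012, 0.30])
```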

3.4.3 Logistic regression

Following Chen and Dredze (2018) and Broniatowski et al. (2016), we also conducted an analysis designed to test our model’s predictions for whether a tweet was likely to be shared at least once, treating this as a binary classification task. We once again conducted three rounds of model fitting, with each round containing two model fits.

  1. Explanation A standard logistic regression model fit to all of the data using bidirectional stepwise elimination with AIC as the minimization criterion.

  2. Prediction A logistic regression model with \(L_1\)-norm regularization fit to two-thirds of the data using threefold cross-validation, and evaluated against the remaining third. Here, we randomly undersampled tweets with no retweets to control for class imbalance, again comparing our model’s results to “null” and “saturated” variants.

In each case, we used the same set of covariates as in the linear regression analyses, where the target variable was whether or not a given tweet had at least one retweet. Our second and third round of model fitting followed the same procedure used for the linear regression models, only substituting logistic regression for OLS regression.
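A sketch of the prediction fit on synthetic imbalanced data (features and class sizes are invented; scikit-learn's liblinear solver supports the \(L_1\) penalty):

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(1)

# Synthetic data: 1 = retweeted at least once, 0 = never retweeted.
# Only feature 0 carries signal, and positives are the minority class.
X = rng.normal(size=(600, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=600) > 1.0).astype(int)

# Randomly undersample the majority (never-retweeted) class.
pos = np.flatnonzero(y == 1)
neg = rng.choice(np.flatnonzero(y == 0), size=len(pos), replace=False)
idx = np.concatenate([pos, neg])

# L1-penalized logistic regression with threefold cross-validation.
clf = LogisticRegressionCV(cv=3, penalty="l1", solver="liblinear")
clf.fit(X[idx], y[idx])
```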

4 Results

4.1 Linear regression analysis

Table 5 shows the explanatory linear regression model resulting from our selection procedure (intermediate models are presented in the Appendix).

Table 5 Best-fitting linear regression model predicting number of retweets per follower for tweets with at least one retweet

Among the variables posited by our model, Topic 2 has the largest positive coefficient, indicating that messages containing the gist that vaccines cause autism are more likely to be shared. Messages about Topic 17, corresponding to vaccinations for pets, were also more likely to be shared. A negative interaction term between Topic 2 and verbatim features indicates that sharing for this topic decreases for longer tweets with more words. Finally, tweets from non-verified accounts were shared significantly more than tweets from verified accounts.

Beyond this explanatory analysis, we compared our model’s performance to a null model (only a constant predictor), a model containing only one feature (user verification), and a saturated model (containing all features and interactions) (Busemeyer and Wang 2000). Table 6 shows that the predictive power of the four factors shown in Table 5 improves upon simpler models, and equals or exceeds more complex models on holdout data.

Table 6 Linear regression model performance compared to null and saturated models

4.2 Logistic regression analysis

Table 7 shows the explanatory logistic regression model resulting from our selection procedure (intermediate models are presented in the Appendix). Among the variables posited by our model, and consistent with prior work (Chen and Dredze 2018; Broniatowski et al. 2016), user verification and the presence of media are both significantly associated with more sharing. Additionally, Topic 28—corresponding to a “link” between vaccines and autism—and Topic 39—corresponding to a statistical description of vaccine adverse event rates—were both associated with more sharing. Finally, Topic 12—associated with “big pharma” conspiracy theories—led to more sharing when the associated sentiment was neutral.

Table 7 Best-fitting logistic regression model predicting whether a given tweet will be shared at least once

Table 8 shows that the predictive power of the factors shown in Table 7 improves upon simpler models, and exceeds more complex models on holdout data.

Table 8 Logistic regression model performance compared to null and saturated models

5 Discussion

Results of our analysis support FTT’s implications. Topic 2, expressing a causal gist, is the strongest predictor of retweets per follower, replicating across multiple methodologies. Notably, this effect was attenuated when messages in this topic contained more difficult verbatim features, providing some evidence in favor of the role of multiple mental representations and the hierarchy of gist. Furthermore, consistent with the weaker role of verbatim representations, Topic 39, expressing verbatim statistics about vaccine-related adverse events, predicted only the likelihood of a single retweet.

We found support for several of our model’s other parameters: the role of motivational factors on the first retweet is illustrated by the significant positive effects of user verification and vivid media on whether a message is retweeted at least once. Notably, user verification has a positive effect on the likelihood that a tweet is retweeted at least once, but a negative effect on the total number of retweets per follower, indicating that, without meaningful content, tweets from verified users are even less likely to go viral than tweets from unverified users. This may be because these accounts simply tend to generate more content and have more followers. Finally, Topic 28, which perhaps expresses thematic content similar to Topic 2 without expressing a gist—i.e., it mentions a “link” between vaccines and autism, but not a causal connection—only increased the likelihood of the first share.

Our results extend our recent findings on Facebook data (Broniatowski et al. 2016), where we showed that only gist was associated with increasing numbers of Facebook shares of vaccine-related news articles, whereas gist, verbatim statistics, and vivid media all predicted at least one share. Thus, this study extends our results from the most popular social media platform (as of 2018, 68% of the US adult population is on Facebook; Smith and Anderson 2018) to multiple social media platforms.

5.1 Limitations and directions for future work

Our findings are limited by difficulties operationalizing the core constructs of FTT. Although LDA topics may be associated with gist in many cases, they do not in general capture the construct of meaning in context, which depends both on the prior knowledge of the observing subject and on the stimulus. Future work should therefore focus on methods to extract gists given a candidate set of messages in the context of the knowledge likely to be widely held within a given online community. Similarly, representational weights are expected to vary with individual social media user accounts; therefore, proxy attributes of tweet content, such as emotion word dictionaries, readability, or verbatim features, will likely be noisy. Indeed, the results in Table 4 show that nominally distinct constructs such as valence, arousal, and dominance were conflated. Similarly, discrete emotion states group into dimensions primarily reflecting overall sentiment. Future work may profitably focus on deriving relevant psychometric features given a sufficiently large set of tweets that might be used to characterize stable personality traits and other individual differences (e.g., Quercia et al. 2011; Golbeck et al. 2011). Importantly, vaccine sentiment evaluated on individual tweets may be a poor proxy for values stored in long-term memory. Indeed, those who support and oppose vaccination may agree on several relevant values, such as “saving lives is good” or “avoid harm”, while disagreeing on the specific factors that might save lives or avoid harm. Specifically, those who oppose vaccination may contend that vaccines cause harm whereas those who support vaccination may be more concerned about harms caused by viral illnesses. It is therefore not surprising that the sentiment of vaccine messages was less of a predictive feature in our model.

Our results also highlight the complex role that emotion may play in both gist and motivation. Although the effects of emotional keywords did not replicate across multiple methods, Topic 17, corresponding to vaccinations for pets, significantly increased retweets per follower and likely has both motivational and gist components. Content referring to pets and other animals [e.g., cat videos and pandas (Hsee and Rottenstreich 2004; Myrick 2015)] tends to trigger emotional responses that increase arousal, a motivational factor, while also facilitating the retrieval of gist principles that influence decisions (Rivers et al. 2008), such as the decision to share online. Thus, future work should better characterize the role of strong emotion on both meaningful and motivational factors associated with online sharing.

6 Conclusions

In this paper, we propose a formal model of information sharing online based on FTT. Our model incorporates elements into its formulation that capture motivational factors as well as factors associated with the extraction of meaning from the article’s content. These factors are predictive of online sharing, allowing us to make novel predictions.

Overall, our model provides strong support for the roles of multiple mental representations, but especially causal—i.e., meaningful—gist, combined with motivational factors. It appears that motivation aids a given tweet to be retweeted at least once; however, once retweeted, gist may be the engine underlying its virality.