1 Introduction

The COVID-19 crisis, which pushed much of social life online, has contributed to an infodemic, in which information of varying quality spreads quickly through social media networks around the world. While ideally high-quality health information from credible sources would saturate social networks, misinformation about COVID-19 poses a significant public health risk during a global pandemic (O’Connor and Murphy 2020; Barau et al. 2020). As World Health Organization Director General Tedros Adhanom Ghebreyesus remarked, “We're not just fighting a pandemic; we're fighting an infodemic” (The Lancet Infectious Diseases 2020). Twitter is a significant source of COVID-19 misinformation (Yang et al. 2020); in one analysis, almost 25% of COVID-19-related tweets contained some misinformation (Kouzy et al. 2020). Much of this misinformation spreads through bots, automated accounts that often share false or conspiracy-based information to amplify a political message. According to one analysis of COVID-19 information on Twitter, bot accounts share a high volume of tweets linking to low-credibility sources (Yang et al. 2020). Analysis has also revealed that “high bot score accounts are used to promote political conspiracies and divisive hashtags alongside with COVID-19 content” (Ferrara 2020, p. 17), while accounts likely run by humans focus more on health and public welfare.

Scholars noted the spread of misinformation on Twitter with alarm even before COVID-19 (Zubiaga and Ji 2014; Waszak et al. 2018; Sommariva et al. 2018). Reviewing the state of medical information on social media, Wang et al. (2019) conclude that “misinformation is abundant on the internet and is often more popular than accurate information” (p. 7), while Chen and colleagues (2018) found that medical misinformation spread more broadly on Twitter than accurate information. For instance, misleading information about Zika on Twitter was more popular than accurate posts (Sharma et al. 2017). With the rise of AI, bots spread much of this disinformation, often contributing significantly to the spread of low-credibility content (Shao et al. 2018). Twitter bots played “a disproportionate role in spreading and repeating misinformation” about the 2016 U.S. presidential election (Shao et al. 2017, p. 1), hold a “small but strategic role in Venezuelan political conversations” (Forelle et al. 2015, p. 1), and retweet anti-vaccination information (Broniatowski et al. 2018), especially to receptive users (Yuan et al. 2019).

Correcting Twitter misinformation remains a huge public health challenge, both in and beyond the COVID-19 crisis. The challenge exists because people are notoriously difficult to persuade once they hold false or conspiratorial beliefs (Gruzd and Mai 2020; Rice 2020) and because some analyses suggest that on Twitter, “COVID-19 misinformed communities are denser, and more organized than informed communities, with a possibility of a high volume of the misinformation being part of disinformation campaigns” (Memon and Carley 2020, p. 1). Scholars have increasingly called for research into combating misinformation online. Chou et al. (2018) called for research that develops and tests interventions in response to online misinformation. According to Pagato et al. (2019), research must address the following questions: “How does health (mis)information spread, how does it shape attitudes, beliefs and behavior, and what policies or public health strategies are effective in disseminating legitimate health information while curbing the spread of health misinformation?” (p. 1). Wei et al. (2016) describe the challenges that “undesirable users” create for using Twitter as a medium for understanding the “cultural landscape” and for helping the response to important events and crises (p. 51).

Misinformation is often defined in ways that allow for its automatic detection. Dhar et al. (2016) describe misinformation as a rumor. Pushing that definition further, Tsugawa and Ohsaki (2017) identify misinformation with the concept of “flaming,” in which falsehoods become viral when expressed in negative terms, and then use sentiment analysis to identify possible misinformation. Dewan and Kumaraguru (2017), on the other hand, focus on the motives of those who share misinformation, describing it as a tool of cybercriminals perpetuating a scam or hoax. Another approach uses automated fact-checking, which relies on a direct comparison of the message to a known, credible outside source; Thorne and Vlachos (2018), using this definition, examine the state of natural language processing and journalistic sources to identify gaps in the automated fact-checking process. Each definition supports the automated identification and tracking of misinformation to monitor the health of social networks. While the work of social network analysis and mining scholars is of great importance for addressing the COVID-19 infodemic, the second step in addressing misinformation in social networks is deciding what to do once a message (tweet, Facebook post, etc.) has been identified as a problem.

Many social network platforms like Facebook, particularly those based in societies that emphasize freedom of expression, may be reluctant to ban or censor posts outright (Kang and Isaac 2019). Instead, a common approach is to flag posts from questionable sources or to flag information known to diverge from what credible sources are saying. Yet does flagging misinformation or a questionable source sway social media users if they already believe the information being flagged?

In response to these calls and to the special theme of this issue, which asks for strategies to mitigate and fact-check COVID-19 misinformation, this article reports on a novel branching survey experiment (N = 299) that tested how participants responded to tweets featuring conspiracy theories about the official count of COVID-19 deaths in the United States. Participants first viewed a tweet that aligned with their existing opinion about the COVID-19 death tallies, then saw the same tweet with a flag indicating that it was generated by a suspected bot account, and finally saw it with an additional flag warning that it contained false information. The results suggest that both flags significantly decrease participants’ willingness to engage with tweets and may change some participants’ minds about COVID-19 misinformation. Social media platforms can use this information in their efforts to combat the COVID-19 infodemic. The finding is an important contribution to social network analysis and mining because it shows how warnings from automated detection techniques can be crafted into persuasive messages that motivate users to be cautious during the COVID-19 infodemic.

2 Literature review

2.1 Human perception of messages shared by bots

People tend to trust content attributed to AI authors less than content attributed to humans (Waddell 2018). This makes sense, as users often rely on the authority of a Twitter account to separate reliable from unreliable information (Zubiaga and Ji 2014). However, studies tend to find that people mistrust AI-generated content only under certain conditions. Readers did not assign higher credibility scores to human-written than to bot-written news articles when they did not know who wrote the story, but they considered stories labeled as written by humans more credible and readable (Graefe and Bohlken 2020). Adding low-confidence indicators to AI-generated content decreases participant trust, but high-confidence indicators do not increase it (Bruzzese et al. 2020). Research on participants who viewed tweets labeled as coming from either a CDC Twitterbot or a human working at the CDC found that “a Twitterbot is perceived as a credible source of information” (Edwards et al. 2014, p. 374). Participants gave similar credibility scores to a set of 10 Airbnb profiles regardless of whether they thought the profiles were human or computer generated; however, when participants engaged with a set of 10 profiles and were told that some were human generated and some were AI generated, they gave lower trustworthiness scores to the profiles they assumed were AI generated (Jakesch et al. 2019).

2.2 Correcting misinformation on social media

Many studies find that interventions to correct misinformation on Twitter work to reduce misperceptions. Giving people accuracy nudges before they consider sharing COVID-19-related information significantly improves their truth discernment, suggesting that “nudging people to think about accuracy is a simple way to improve choices about what to share on social media” (Pennycook et al. 2020). Labeling information as rumor caused participants to consider it less important than information labeled as news (Oh and Lee 2019). Correcting misinformation about the Zika virus on Twitter by providing a source lowered misperceptions in participants (Vraga and Bode 2017a, b), as did correcting conspiracy theories about Zika (Lyons et al. 2019). Corrections can be effective coming from either algorithms or other platform users and can even affect individuals with high levels of conspiracy beliefs (Bode and Vraga 2017). WhatsApp messages from civil society organizations in Zimbabwe can correct COVID-19 misperceptions and effect positive changes in social distancing behavior (Bowles et al. 2020). Corrections from government agencies were more effective than corrections from other users (Vraga and Bode 2017a, b; van der Meer and Jin 2020), though other research has found that comments calling Twitter content fake news were more effective coming from other users than as a disclaimer from a social media platform (Colliander 2019). In an experimental situation where participants saw a fake news story on Facebook about a nonprofit organization along with a refutation from the nonprofit, denial created higher credibility for the nonprofit than comments attacking the source of the fake news (Vafeiadis et al. 2019).

However, attempts to correct misinformation can sometimes work against their intended effect (Lewandowsky et al. 2012), especially in individuals who accept conspiracy theories (Miller et al. 2016). In two experiments designed to combat Zika and yellow fever misinformation in Brazil, Carey et al. (2020) found partial success for interventions to correct health myths but also concluded that “current approaches to combating misinformation and conspiracy theories about disease epidemics and outbreaks may be ineffective or even counterproductive” (p. 9). A meta-analysis of attempts to correct misinformation online (Wang et al. 2019) finds that “although interventions to correct misperceptions are proven effective at times, efforts to retract misinformation need to be carried out with caution in order to prevent backfiring” (p. 7).

2.3 Research gap

The experiment reported here contributes to this ongoing investigation of how best to counter and correct the spread of misinformation on social media. Specifically, we make two unique contributions to this effort. First, while most research randomly assigns participants to experimental groups, this study assigned participants to conditions based on their previous beliefs about COVID-19 misinformation. Participants who believed COVID-19 death tallies were over- or undercounted saw tweets confirming their beliefs. (Those with no opinion or who felt the counts were accurate were randomly sorted into the over- or undercounted groups.) This approach allowed us to test the impact of flags on audiences sympathetic to the misinformation in the tweets and to test more directly for the backfire effects sometimes associated with message correction. The methodology responds specifically to calls by Lewandowsky et al. (2012) to test whether retractions “fail to reduce reliance on misinformation specifically among people for whom the retraction violates personal belief” (p. 118) and by Wang et al. (2019) to understand “the role of belief systems on the intention to spread misinformation” (p. 1). Second, the experiment employs a sequence of two flags, first telling users that the tweet comes from a suspected bot account and then informing users that the tweet contains misinformation. This approach allows us to test the influence of these flags individually and together and represents a more sustained fact-checking approach.

2.4 Research questions

This study answers the following research questions:

  • RQ1 Does a flag that the tweet is written by a bot change participants’ engagement with and attitudes about the tweet?

  • RQ2 Do flags that the tweet is both written by a bot and contains misinformation change participants’ engagement with and attitudes about the tweet?

  • RQ3 Are participants capable of changing their opinion after viewing flags that a tweet was shared by a bot and contained misinformation?

  • RQ4 What personal experiences and attitudes are associated with the respondent’s willingness to change their opinion about coronavirus numbers after viewing the flagged tweets?

3 Method

3.1 Research context

Data were collected over a three-day period from September 8, 2020, through September 10, 2020. On September 10, 2020, the USA had a total of 6,366,986 cases of COVID-19, including 183,950 reported deaths from the virus (US Historical Data). At the time data were collected for this study, there was no scientific evidence indicating these numbers were incorrect. However, posts sharing false and misleading information, including suggestions that official COVID-19 numbers from the CDC were being either over- or underreported, were abundant on social media platforms such as Facebook and Twitter during this period (Ebrahimji 2020; Kouzy et al. 2020). In the month prior to the current study, the sitting President of the United States shared a tweet that Twitter removed for reporting false coronavirus statistics (Quinn 2020). The removed tweet claimed the CDC had quietly updated coronavirus numbers and suggested prior COVID-19 deaths were being overreported.

3.2 Participants

We collected a total of 332 initial responses from participants using Amazon’s Mechanical Turk (MTurk) between September 8, 2020 and September 10, 2020. We chose MTurk for recruitment because its participants have been found representative of the general US population (Levay et al. 2016; McCredie and Morey 2019; Redmiles et al. 2019), especially the general Internet-using population (Keith et al. 2017). Further, MTurk participants tend to accept and take seriously experimental conditions at roughly the same rate as lab experiment participants (Thomas and Clifford 2017). According to Ford (2017), one major problem with MTurk results comes from “speeders,” participants who rush through answers to get paid as quickly as possible. To help combat this issue, we embedded three attention check questions throughout the survey in which participants were asked to select a specific response option. We dropped 33 subjects from the dataset for failing to respond correctly to all three attention check questions, resulting in a final sample of 299 individuals.

The final sample was on average 35 years old (M = 35.49, SD = 10.03), primarily male (59.9% male, 39.5% female, 0.6% other or prefer not to say), and White (White = 75.95%, Asian = 10.0%, Black/African-American = 7.0%, Hispanic/Latinx = 2.3%, Native American = 1.7%, biracial = 2.1%, and other = 0.7%). The sample is a reasonable reflection of Twitter’s user base, in which 30.9% of users are between the ages of 25 and 34 and the majority are male (Clement 2020a, 2020b).

3.3 Survey instrument

Participants were first presented with four statements and asked to select the one that best described their view of coronavirus case reporting data from the U.S. federal government. The four statements were: (1) there is underreporting—actual numbers are higher than reported numbers; (2) there is overreporting—actual numbers are lower than reported numbers; (3) there is accurate reporting—actual numbers are consistent with reported numbers; and (4) I do not have an opinion regarding coronavirus numbers. Responses to this question were used to assign participants to either the overreporting tweet condition or the underreporting tweet condition; one novel aspect of this study is that respondents saw tweets aligned with their current beliefs. Respondents who did not have an opinion on reporting or who believed the numbers were accurate were randomly assigned to either the overreporting or the underreporting tweet condition. Participants then answered questions assessing their attitudes and behaviors before being presented with one of two fabricated tweets claiming coronavirus deaths are being misreported, either overreported (see Fig. 1) or underreported (see Fig. 2). After viewing the tweet, respondents answered items assessing their attitudes regarding the tweet’s credibility. They were then shown the tweet again, this time flagged with a statement that read “Caution: Suspected Bot Account. Learn More,” and were again asked to assess the flagged tweet’s credibility. Respondents were finally shown the tweet a third time with a second flag added that read “Caution: Tweet contains misinformation about the novel coronavirus. Learn more,” and assessed the tweet’s credibility once more. Tweets were created using the Tweetgen.com service (beta-0.3.2 2020) and used identical share and like numbers for all versions; the numbers were chosen to appear neither very low nor very high so as to keep participants focused on the content flags. Our methodology follows other recent studies (Borah and Xiao 2018; Wasike 2017; Lim and Lee-Won 2017; Oeldorf-Hirsch et al. 2020; Scott et al. 2020; Solnick et al. 2020) that test the effects of static representations of Twitter and Facebook posts.

Fig. 1 Tweets for overcounted coronavirus numbers condition

Fig. 2 Tweets for undercounted coronavirus numbers condition
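
To make the branching procedure in Sect. 3.3 concrete, the following minimal sketch (in Python; the function name and condition labels are ours, not part of the survey software) illustrates the assignment logic described above:

```python
import random

def assign_condition(belief: str) -> str:
    """Route a respondent to the tweet condition matching their stated belief.

    Respondents who selected 'accurate reporting' or 'no opinion' are
    randomized between the two misreporting conditions.
    """
    if belief == "underreporting":
        return "underreporting_tweet"  # undercount stimulus (Fig. 2)
    if belief == "overreporting":
        return "overreporting_tweet"   # overcount stimulus (Fig. 1)
    return random.choice(["overreporting_tweet", "underreporting_tweet"])
```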

3.4 Measures

Participant preventative behaviors were assessed by asking respondents how frequently (never, sometimes, often, or always) they engaged in seventeen different behaviors designed to reduce the risk of catching the coronavirus. Behaviors included avoiding nonessential shopping, frequently washing hands for 20 seconds, cleaning regularly touched surfaces with disinfectant, and limiting gatherings to fewer than 10 people.

To assess respondents’ fears related to the coronavirus, we asked respondents to indicate their level of agreement using a 5-point Likert-type scale for three statements designed to capture their coronavirus health-related concerns. These statements included: “I am scared that I might contract coronavirus,” “I am scared that someone in my family will contract coronavirus,” and “I fear that if I or someone in my family gets coronavirus, we will face complications that require hospitalization.”

Respondents were asked to report the number of hours they spent on social and news media both before the coronavirus and in the 30 days prior to completing the survey. To determine time spent on social media, respondents were asked to report the number of hours spent per day on social media, while time spent on news media was collected by asking respondents to report the number of hours they spent watching/reading news for each time period.

Respondents were presented with 22 different news sources (including a write-in “other” option) and asked to indicate where they received their news. Respondents were allowed to select multiple options from the list. Major television and social media sites were listed separately (e.g., CNN, Fox News, Facebook, Twitter), while less frequently consumed sources or regional media were presented using categories such as “Local Television News” or “Liberal News Websites (Mother Jones, the Nation).” Respondents were also given the option of selecting news information directly from President Trump through either “President Trump Tweets” or “President Trump White House briefings.”

Anomie, or the breakdown in belief in social bonds, was assessed using the nine-item GSS Anomie Scale (Smith et al. 2019). Sample items include “Most people don’t care what happens to others” and “A person must live pretty much for today.” Respondents answered using a 5-point Likert-type scale. The Cronbach’s alpha of this scale was 0.845.

To assess the extent to which respondents generally believe and engage in conspiratorial thinking we used the Conspiracy Mentality Questionnaire (CMQ; Bruder et al. 2013). The scale consisted of five items (e.g., there are secret organizations that greatly influence political decisions) and used a 5-point Likert-type scale with strongly agree and strongly disagree anchors. Cronbach’s alpha for the CMQ was 0.832.

Government trust was measured using the Citizen Trust in Government Organizations scale (Grimmelikhuijsen and Knies 2017). Respondents were presented with nine statements and asked the extent to which they agreed or disagreed with each statement using a 5-point Likert-type scale. Sample items include “the federal government is capable” and “the federal government is honest.” The overall Cronbach’s alpha for this scale was 0.959.

To assess respondent attitudes regarding tweet credibility, we adapted items used to evaluate Twitter posts first used by Vraga and Bode (2017a, b). After viewing each tweet, respondents were asked to evaluate the tweet as useful, interesting, trustworthy, credible, biased, accurate, and relevant using a 5-point Likert-type scale. Additionally, respondents were asked to indicate how they would interact with the tweet by answering four questions gauging likely behaviors with regard to the tweet: following the Twitter account, retweeting the tweet, liking the tweet, and searching for additional information related to the tweet.

After reading the tweets, respondents were presented with a cognitive dissonance measure developed by Metzger et al. (2020) to determine the impact of viewing attitude-challenging information, such as flagging a tweet that shares respondents’ beliefs, on feelings of dissonance. This was measured using a nine-item 5-point Likert-type scale. Cronbach's alpha for the scale was 0.638.

Respondent religiosity was captured using a 3-item measure developed by Barnett et al. (1996). Respondents were presented with three statements (e.g., "my religion is very important to me") and asked to respond to each statement using a 5-point Likert-type scale. The Cronbach’s alpha for this scale was 0.930.
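
Several of the scales above are summarized with Cronbach’s alpha. For reference, alpha can be computed directly from the item responses; the sketch below (Python, with simulated data standing in for the real item scores) follows the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the summed score):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) matrix of scale scores."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Example: a 299 x 9 matrix of 5-point responses (simulated, not our data)
rng = np.random.default_rng(0)
anomie_items = rng.integers(1, 6, size=(299, 9)).astype(float)
print(round(cronbach_alpha(anomie_items), 3))
```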

Given the nature of COVID-19, basic respondent health information related to the virus was also collected for this study. Respondents were asked to report whether they, someone in their household, a family member, a close friend or acquaintance, or a coworker had been diagnosed with COVID-19. Respondents were also asked to indicate whether they suffered from any of the preexisting conditions that increase the risk of severe illness from COVID-19 (Centers for Disease Control and Prevention 2020).

Demographic data including age, gender, ethnicity, and highest degree completed were collected from all respondents.

3.5 Analysis

Data were analyzed using Kruskal–Wallis tests, ANOVA, Chi-squared tests of independence, paired and independent t-tests, and Pearson correlations in IBM SPSS 26 (2020). Graphs were created in Microsoft Excel (2016).
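
The analyses were run in SPSS; for readers working with open-source tools, roughly equivalent calls (a sketch using pandas and SciPy, with hypothetical file and column names) would be:

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("responses.csv")  # hypothetical file and column names

# Kruskal-Wallis: a preventative behavior across the four belief groups
h, p = stats.kruskal(
    *[g["mask_wearing"] for _, g in df.groupby("covid_count_belief")]
)

# One-way ANOVA: conspiracy mentality score across belief groups
f, p = stats.f_oneway(
    *[g["cmq_score"] for _, g in df.groupby("covid_count_belief")]
)

# Chi-squared test of independence: belief group vs. use of a news source
chi2, p, dof, _ = stats.chi2_contingency(
    pd.crosstab(df["covid_count_belief"], df["uses_fox_news"])
)

# Independent-samples t-test (Welch): anomie for opinion changers vs. non-changers
t, p = stats.ttest_ind(
    df.loc[df["changed_mind"] == 1, "anomie"],
    df.loc[df["changed_mind"] == 0, "anomie"],
    equal_var=False,
)

# Pearson correlation: daily social media hours vs. post-flag engagement
r, p = stats.pearsonr(df["social_media_hours"], df["engagement_after_flags"])
```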

4 Results

4.1 Association between belief in COVID-19 numbers and preventative behaviors

To identify whether there is an association between participants’ beliefs about the accuracy of COVID-19 mortality figures and preventative behaviors, we performed Kruskal–Wallis tests and found statistically significant differences in hand washing (H(3) = 15.653, p = 0.001), avoiding touching the face (H(3) = 15.407, p = 0.002), avoiding using cash when making purchases (H(3) = 13.725, p = 0.003), limiting gatherings to fewer than 10 people (H(3) = 33.311, p < 0.001), working from home (H(3) = 16.313, p = 0.001), avoiding nonessential shopping (H(3) = 22.595, p < 0.001), monitoring news about coronavirus (H(3) = 11.326, p = 0.01), practicing social distancing (H(3) = 24.511, p < 0.001), using electronic communication to avoid meeting with people in person (H(3) = 27.78, p < 0.001), wearing a mask (H(3) = 43.923, p < 0.001), staying at home unless shopping for core needs (H(3) = 16.195, p = 0.001), and quarantining from others if symptoms appear (H(3) = 18.879, p < 0.001). The average preventative behavior score for each coronavirus case count group is shown in Fig. 3.

Fig. 3 Differences in preventative behaviors based on belief in COVID-19 count

Respondents who had personally experienced COVID-19 or who knew someone, such as a friend or family member, who had contracted the disease were more likely to take a clear position on the COVID-19 mortality count; the “unsure” respondents were those without personal experience (H(3) = 13.998, p = 0.003). A Kruskal–Wallis test also showed that those who fear contracting the coronavirus, fear that their family will contract the virus, or fear that they or their family may face complications were more likely to believe the COVID-19 numbers were accurate or undercounted compared to those who believe the numbers are overestimated (H(3) = 20.063, p < 0.001; H(3) = 18.732, p < 0.001; H(3) = 15.649, p = 0.001).

4.2 Impact of Twitter flags on change in belief about COVID-19 numbers

A series of paired t-tests shows how individual respondents changed their opinions about the tweets after a flag was placed on them. The first flag warned participants that the tweet was shared by a suspected bot account. After the tweet was flagged as a potential bot, participants’ perceptions of its credibility changed on every measure except bias, which remained consistent regardless of the flags. For desire to follow the Twitter account, the version with no flag was rated higher than the bot flag (t(298) = 8.638, p < 0.001) and the bot–misinformation flag (t(298) = 9.443, p < 0.001). For willingness to retweet, the unflagged tweet was rated more highly than the bot flag (t(298) = 5.165, p < 0.001) and the bot–misinformation flag (t(298) = 5.819, p < 0.001). Willingness to like the tweet followed the same pattern, declining after the tweet was flagged as a bot (t(298) = 5.862, p < 0.001) and then as misinformation (t(298) = 8.581, p < 0.001). The next set of questions dealt with perceptions of the tweet. The unflagged tweet was rated more highly for willingness to seek more information than the bot flag (t(298) = 6.177, p < 0.001) and the bot–misinformation flag (t(298) = 8.793, p < 0.001), respectively; the same was true for usefulness (t(298) = 6.113, p < 0.001; t(298) = 9.43, p < 0.001), interest (t(298) = 6.199, p < 0.001; t(298) = 8.318, p < 0.001), trustworthiness (t(298) = 6.304, p < 0.001; t(298) = 9.349, p < 0.001), credibility (t(298) = 7.977, p < 0.001; t(298) = 10.439, p < 0.001), accuracy (t(298) = 6.264, p < 0.001; t(298) = 11.581, p < 0.001), and relevance (t(298) = 6.942, p < 0.001; t(298) = 9.412, p < 0.001). The one aspect that did not change after either flag was bias (t(298) = −0.452, p = 0.652; t(298) = −0.951, p = 0.342). The average rating for each flag condition is shown in Fig. 4.

Fig. 4 Change in rating after tweets flagged
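
As an illustration of the paired comparisons reported in Sect. 4.2, the sketch below (Python/SciPy, with hypothetical column names; one attitude measure shown) compares the unflagged rating to each flagged rating for the same respondents:

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("responses.csv")  # hypothetical file and column names

# Each respondent rated the same tweet three times: unflagged, bot flag,
# and bot + misinformation flag. Compare the unflagged rating to each.
t_bot, p_bot = stats.ttest_rel(df["credibility_noflag"], df["credibility_botflag"])
t_both, p_both = stats.ttest_rel(df["credibility_noflag"], df["credibility_bothflags"])
print(t_bot, p_bot, t_both, p_both)
```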

A Chi-squared test of independence shows that exposure to the series of flagged tweets changed the opinions of some participants at a statistically significant rate (χ²(9) = 462.360, p < 0.001). Those who were unsure or believed the count was accurate were more likely to switch their perspective to match the tweet they viewed and agree with it even after it was flagged as coming from a bot and as containing misinformation. Of those who were initially unsure of the count’s accuracy, 80% remained unsure and 20% switched to saying the numbers are overcounted after seeing the overcount tweet. Of those who believed the numbers are accurate, 78% held that position, 12% changed their opinion to match the tweet they saw, and 9% adjusted their opinion against the tweet they saw. Those who believed the numbers are overcounted were most susceptible to changing their opinions after seeing the cautionary flags: 73% continued to believe the numbers were overcounted, 5% became unsure, 11% said the numbers are accurate, and 12% said the numbers are undercounted. Those who believed the numbers are undercounted were the most steadfast in their belief, with 88% stating the count is underreported, 4% saying it is accurate, and 8% saying it is overcounted.
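
The belief-change percentages above come from cross-tabulating initial against final belief; a sketch of how such a transition table could be produced (hypothetical column names) is:

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("responses.csv")  # hypothetical file and column names

# 4 x 4 table of belief before exposure vs. belief after the flagged tweets
table = pd.crosstab(df["belief_initial"], df["belief_final"])
chi2, p, dof, _ = stats.chi2_contingency(table)

# Row percentages: for each starting belief, where respondents ended up
print(table.div(table.sum(axis=1), axis=0).round(2))
```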

4.3 Characteristics based on belief in COVID-19 count accuracy

Using an ANOVA test, we found a difference between respondents' views of the COVID-19 mortality count and their scores on the cognitive dissonance scale (F(3, 295) = 3.437, p = 0.017). Those who believed the count is overstated averaged 0.33 less than those who believe the count is accurate and 0.232 less than those who believe the count is undercounted. There were also differences on the conspiracy scale (F(3, 295) = 3.21, p = 0.023) and in trust in the government (F(3, 295) = 11.068, p < 0.001). Those who believe the count is overstated had a higher average conspiracy score, by 0.308 compared to those who think the number is accurate and by 0.33 compared to those who believe the count is undercounted. Participants who believe the number is undercounted trusted the government less than the other groups, with an average difference of 0.608 from those who think the number is overstated and 0.77 from those who think the number is accurate.

An additional ANOVA test revealed that religiosity and political affiliation were also associated with differences in belief. On a seven-point scale, the overcount participants were on average 1.1 points more conservative than the undercount participants, and the accurate-count participants were 1.14 points more conservative than the undercount participants (F(3, 295) = 8.227, p < 0.001). Those who believe the numbers are accurate or overcounted also agreed more strongly with the statements “I am very religious,” “My religion is very important to me,” and “I believe in God” than those holding the undercount position (F(3, 295) = 9.610, p < 0.001).

4.4 Characteristics for changed opinion

Anomie, or the breakdown in belief in social bonds, was higher in those who changed their mind (3.38) than in those who did not (3.12) (t(297) = −2.147, p = 0.033).

Trust in government was higher in those who changed their mind (3.57) than in those who did not (2.99) (t(100.028) = −4.18, p < 0.001).

Those who changed their mind had more formal education (5.21) than those who did not (4.68) (t(118.965) = −4.111, p < 0.001).

Those willing to change their mind also had more preexisting conditions (2.54) than those unwilling to change their mind (1.7) (t(297) = −2.197, p = 0.029).

Lastly, those willing to change their mind were less concerned about the economy being negatively impacted by the coronavirus (3.16) than those who did not change their mind (3.8) (t(297) = 3.9, p < 0.001).

4.5 News media consumption

A Chi-squared test of independence showed that consumption of certain news media outlets was associated with what participants believed. The number of participants from each group who read or viewed each news source is shown in Fig. 5. Fox News (χ² = 12.191, p = 0.007), One America News Network (χ² = 13.379, p = 0.004), National Newspapers (χ² = 11.495, p = 0.009), Liberal News Websites (χ² = 8.641, p = 0.034), Conservative News Websites (χ² = 13.863, p = 0.003), and Facebook (χ² = 14.977, p = 0.002) all showed differences. Consumption of White House briefings, President Trump’s Twitter feed, Instagram, Twitter, Reddit, satire, general news websites, news magazines, local and national radio, local television news, BBC, MSNBC, or CNN was not associated with a difference in views on COVID-19 mortality figures.

Fig. 5 Differences in news media consumption on opinions about COVID-19 death count

To further understand the effect of news media consumption on our participants' responses to the flagged tweets, we performed Mann–Whitney tests to identify whether there were differences for regular consumers of Fox News, national newspapers, and news sourced from Facebook. Fox News consumers differed significantly from those who do not consume Fox News, showing a continued willingness to engage with the tweet by following the account (2.97), retweeting (2.92), and liking the tweet (2.85), compared to those who do not view Fox News, who were less likely to follow (2.08), retweet (2.09), and like (2.17) (U = 7103, p < 0.001; U = 7370, p < 0.001; U = 8129, p < 0.001). The same pattern held for attitudes toward the tweets, with Fox News viewers rating the tweet’s usefulness, interest level, trustworthiness, credibility, accuracy, and relevance more highly, even with bot and misinformation flags, than those who do not watch Fox News. Seeking news from Facebook also led to resistance to the flags, with those participants continuing to engage and keeping their attitude ratings higher than those who do not use Facebook for news. National newspaper readers, on the other hand, were distinct from those who do not read national newspapers in that they decreased their engagement with the tweets after seeing the flags: they were less likely to follow (2.14) or like the tweet (2.22) compared to those who do not read national newspapers (follow: 2.67, like: 2.59) (U = 7945, p = 0.004; U = 8501, p = 0.038).
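
For reference, one of these comparisons could be reproduced as follows (a sketch with hypothetical column names; the Fox News follow-intention comparison is shown):

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("responses.csv")  # hypothetical file and column names

fox = df[df["uses_fox_news"] == 1]
no_fox = df[df["uses_fox_news"] == 0]

# Willingness to follow the account after viewing the flagged tweet
u, p = stats.mannwhitneyu(
    fox["follow_after_flags"],
    no_fox["follow_after_flags"],
    alternative="two-sided",
)
print(u, p)
```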

4.6 Hours on social media and news media

A Pearson correlation identified a trend in how participants’ ratings changed after viewing the tweets with a flag for a suspected bot account and with flags for both the bot account and misinformation (n = 299, p < 0.001). The more hours participants spent on social media, both before and during the pandemic, the more likely they were to continue engaging with the tweet through follows, retweets, or likes and to rate the tweet as having higher usefulness, interest, trust, credibility, and accuracy. While those who spent fewer hours on social media lowered their engagement and perception ratings after seeing the cautionary flags, the high-volume users kept their scores higher (see Fig. 6).

Fig. 6 Hours spent on social media correlated with higher tweet rating despite flags

The trend was the same for hours spent consuming news media both before and during the pandemic. A Pearson correlation (n = 299, p < 0.001) showed that those who spent more hours consuming news media kept their ratings of the tweet’s usefulness, interest, trustworthiness, credibility, and accuracy high; they were also more likely to continue to engage through follows, retweets, and likes. The cautionary flags had less impact on regular news media consumers, who continued to rate and engage at higher rates (see Fig. 7).

Fig. 7 Higher news consumption in hours correlated with higher tweet rating despite flags

5 Discussion

5.1 Bot flags change participants’ engagement and attitudes about tweets

Our results strongly suggest that Twitter message flags negatively affect participants’ perceptions of unreliable tweets. After seeing a warning that the tweet came from a suspected bot account, participants in both groups decreased their willingness to engage with the tweet and lowered their opinions of how useful, interesting, trustworthy, credible, helpful, accurate, and relevant they found it. (Participants rated the unlabeled tweet as highly biased and did not increase their bias ratings after seeing the flags.) These responses suggest that efforts to flag bot content may mitigate bots’ ability to spread misinformation on Twitter. The results align with previous studies finding that people consider identified or suspected bots less trustworthy than human-generated content (Waddell 2018; Bruzzese et al. 2020; Graefe and Bohlken 2020; Jakesch et al. 2019). Additionally, because “the author is the feature that led the users to the most accurate perceptions” about the veracity of information (Zubiaga and Ji 2014), it is not surprising that learning that Twitter posters are bots lowers users’ attitudes toward the tweet.

5.2 Misinformation flags change participants’ engagement and attitudes about tweets

Results similarly showed that flagging the tweet as containing misinformation further lowered participants’ willingness to engage with it and negatively affected their perceptions, especially regarding the tweet’s trustworthiness, accuracy, and credibility. These responses are encouraging indications that fact-checking can help combat the COVID-19 infodemic. This experiment reinforces similar findings from other studies about the effectiveness of identifying online health misinformation (Oh and Lee 2019; Kim and Dennis 2019; Bode and Vraga 2018), and our results show that merely identifying misinformation, without offering corrections, can lower participants’ opinions of the misinformation source. Unlike the findings of Colliander (2019), these results demonstrate that misinformation flags from social media companies can lower participants’ attitudes about inaccurate tweets. Twitter reported on the impact of labeling tweets that contained false information during the 2020 U.S. presidential election (Gadde and Beykpour 2020). During a 16-day period around the election, Twitter labeled 300,000 tweets as containing disputed or misleading information. A subset of these tweets (n = 456) was flagged and locked so that only limited user engagement was allowed: these 456 tweets could only be “quote tweeted,” meaning they could be retweeted only with an original comment from the user retweeting them. Twitter reported that tweets with a misleading-information flag experienced a 29% decrease in quote tweets. Further, our research suggests that multiple flags work better than one; flagging tweets for both suspected bot authorship and misinformation decreased positive attitudes more strongly than a bot flag alone, indicating that social media platforms may want to provide multiple flags identifying various issues with unreliable content.

However, misinformation flags did not affect all participants equally. People who reported spending more time on social media showed more resistance to both flags, suggesting that these participants may have greater trust in online information or may have grown immune to warnings from social media companies about information veracity. These results connect with Allington et al. (2020), who found a link between the use of social media for COVID-19 information and a propensity toward COVID-19 conspiracy beliefs. Additionally, research has connected trust in social media with the likelihood of sharing fake COVID-19 news online (Laato et al. 2020). Participants who spent more time watching news also showed more resistance to the flags overall, though differences also emerged based on the news source. Participants who watched network news were more responsive to the flags, a result that evokes a correlation Allington et al. (2020) found between watching network news and participating in COVID-19-related health-protective behaviors. Conversely, participants who reported getting news from Fox News or Facebook had more positive reactions to the tweets even after viewing the flags. Dhar et al. (2016) proposed a rumor control model in which an “authenticated news agency” can flood a social network with counter statements that dilute the effects of misinformation (p. 56). Our study shows the limitations of counter statements in practice when individual users pick and choose whom they believe to be an authentic news source.

5.3 Flags change some participants’ minds about COVID-19 misinformation

The flags also showed some effectiveness at changing people’s opinions about the COVID-19 death tolls, more notably among participants who believed the numbers were overcounted rather than undercounted. Twenty-one participants who initially felt that death tolls were overcounted changed their minds, though only 10% (8) from this subset changed their minds to believe the counts were accurate. Meanwhile, only sixteen participants who initially believed that death counts were undercounted changed their minds, and only around 4% (8) from that subset changed their minds to believe the counts were accurate. It appears that the flags more effectively changed minds about individual tweets than about people’s opinions overall. However, the fact that some participants changed their minds, combined with their lowered perceptions of the tweets, suggests that the flags did not activate a backfire effect, in which people respond to fact-checking by becoming further entrenched in their original viewpoints. These results are especially encouraging because participants were sorted into groups to see tweets that aligned with their previously held beliefs, allowing us to test more directly for backfire effects.

Exposure to the tweets also changed the minds of several participants who initially felt that U.S. COVID-19 counts were accurate. Sixteen participants, 23% of those who started the experiment believing the counts were accurate, ended believing that the COVID-19 counts were under- or overreported. Seven ended up disagreeing with the narrative they saw in the flagged tweets, while nine ended up agreeing with it. Exposure to misinformation, even when paired with warning flags about its authorship and accuracy, may therefore cause some participants to adopt more conspiratorial views. Though the flags lowered perceptions of the tweets regardless of participants’ views of the COVID-19 death counts, the misinformation may overpower the flags in some instances. Additionally, flags may work more effectively for participants who align with the misinformation than for those who hold differing or uncommitted beliefs, whose response after viewing the flagged tweet is just as likely to be agreement as disagreement.

In addition to observing that participants can change their minds after viewing the flagged tweets, we found that individual differences also influenced the likelihood that participants would do so. Attitudes such as anomie (the view that there is a societal breakdown in norms) and trust in the government correlated with a greater likelihood that users would change their minds by the end of the experiment. Further, individuals changed their minds more frequently when they had higher levels of education and more preexisting conditions that make them vulnerable to COVID-19. Despite this effect of preexisting conditions, the experiment showed no effects of COVID-19 anxiety on either responses to the tweets or the likelihood that participants would change their minds, a finding that aligns with Laato et al. (2020).

5.4 Limitations and future research

Studying the impact, spread, and prevention of misinformation on social media is necessary because modern society relies on belief and trust in authorities and public health information to respond collectively to global crises and tragedies. While this study is novel in routing respondents to different versions of the survey instrument depending on their existing beliefs, it has several limitations. First, the window of data collection, September 8–10, 2020, was less than sixty days before a polarized national election in the United States. Given Twitter’s tendency to heighten partisan political communication, the ecosystem that Twitter users experienced during this window was likely more intense than normal, which could affect how our participants responded to the flagged tweets. Additionally, the spread of Twitter information is a complex contagion (Monsted et al. 2017), so it may be necessary to test multiple exposures to conspiracy theories and flagged content. While the current study did facilitate multiple exposures to a single tweet, future studies could add longer periods of time between exposures to measure the persistence of the effects identified. Future designs might also include full interactivity, such as the ability to see who shared and commented on a tweet, and full functionality, offering participants other ways to respond. Other studies might examine the source of the flags, whether the government, the social media company, or other users, to see which source is most effective at correcting and changing belief in conspiracy theories. Additional research might address the psychological perception of the flags as visual cues and how different colors, symbols, or language affect participants. Flags that are more forceful in offering correct information may also help identify the bright line for preventing the backfire effect (Wang et al. 2019). Finally, a realistic experiment that uses social network mining to monitor the real-world behavior of those exposed to flagged tweets would provide a glimpse of true behavior (follows, shares, tweets, and retweets) rather than self-report data alone.

6 Conclusion

Identifying solutions for the COVID-19 infodemic requires careful consideration of technical challenges alongside human ones. The current study helps clarify how human users respond to and interact with flagging techniques when content is identified as misinformation or as propagated by a bot account. There is evidence that flags can change engagement behaviors and most user attitudes toward misinformation, suggesting that they should be paired with automated fact-checking strategies. Challenges remain unresolved, however, with more regular users of social media showing resistance to the flags. The stakes are high: Gruzd and Mai (2020) found that “power users” had an outsized impact on spreading the #FilmYourHospital conspiracy theory that COVID-19 is a hoax. When these power users (conservative politicians and right-wing political activists) encouraged their followers to film empty hospital rooms and waiting rooms, social media misinformation created an immediate threat to public safety. Multiple content and source flags may have lowered the spread and persuasive power of #FilmYourHospital messages. To ensure that everyone receives high-quality public health and safety information from credible and authoritative sources during large-scale crises, multiple strategies should be used to protect the integrity of our social media networks.