Introduction

Online social networks (OSN) emerged in the early 2000s with a business model built on advertising revenue. With billions of users worldwide sharing their personal information and views on various issues, the information value of such platforms was soon discovered by companies, organizations, states, and individuals. Because anyone can post content free of fact-checking filters [1] or source verification [2] and without legal consequences, comment on other posts, avoid face-to-face interaction, and remain anonymous, such platforms have become the ground for spreading misinformation and extreme opinions by individuals, organizations, and states with various goals.

OSN have become the primary channel for discussing political opinions. Their political content, and the way it spreads throughout the platform, has the power to alter public opinion [3, 4]. Yet they were not invented with this aim in mind: OSN were developed to generate revenue through advertising. The more users there are, the more time they spend on the platform, and the more they engage (posting, liking, commenting, etc.), the higher the advertising revenue. Under this business model, user satisfaction [5] and engagement translate to profit. In the business model, users are customers whose satisfaction is important, while in the underlying political discourse, users are citizens [6].

OSN give citizens the opportunity not only to be the audience of news and opinions but to engage in discussion, express their views by reacting and commenting [7], and participate in shaping and directing the political discourse. This new type of interactivity has exposed political campaigns and other political influencers, benign or malign, to a new set of opportunities and adversities. Because of this two-way interactivity and the possibility for citizens to enter the discussion and express their opinions, TV and newspapers are slowly ceding their place as the primary source of news to OSN, websites, and blogs [8,9,10]. This has not gone unnoticed by political campaigns seeking election, by political activists and commentators, and by malign actors who attempt to influence and distort the news and its spread toward their own ends. Twitter, Facebook, and Instagram have seen an immense presence of political figures and candidates who deploy OSN platforms to broadcast their activities and opinions on a wide range of national and international matters [11,12,13,14].

Misinformation refers to false information that misleads readers, and to its unintentional spread; disinformation refers to the deliberate spread of false information (misinformation) by an entity in order to mislead. If false information is in line with someone’s existing views, they are vulnerable to believing it without questioning its source or factuality [15], because people tend to consider their perception of reality as truth [16]. Disinformation operations are conducted on OSN by individuals and states to influence internal matters in their own country or to reach directly to the citizens of another country, equip them with false information [3, 4], and thus change the direction of their political discourse, amplify their problems, or sow mistrust and animosity among them. Disinformation operations do not target physical infrastructure. They target democracy’s soft spot at its heart: free speech. Democracies rely on people, and disinformation feeds people with misinformation. Its impacts are slow, invisible, and complicated, unlike wars, whose impacts are swift, highly visible, and easily understood.

Disinformation operations have been conducted by domestic actors to impugn President Obama’s religion and birthplace [17, 18], turn public opinion against the Affordable Care Act [19, 20], misrepresent the evidence regarding Iraq’s role in the 9/11 attacks and its weapons of mass destruction [21, 22], and undermine the factuality of climate change [23, 24], and by foreign actors to cast doubt on the credibility of the U.S. political system and the 2016 federal election and to polarize U.S. citizens [25, 26]. In 2018, Cambridge Analytica sent personalized political messages to U.S. citizens to influence their opinion about U.S. internal policies [27]. Disinformation operations have been deployed in other countries as well [28,29,30]. Piña-García and Espinoza [31] exposed how coordinated campaigns (i.e. astroturfing) were used to influence and manipulate public opinion during the coronavirus health crisis in Mexico and provided insight into how they were detected.

All this highlights democracy’s vulnerability to disinformation operations and the importance of studies aimed at understanding them. This study focuses on visualizing the geographical distribution of political terms, parties, misinformation, extremism, and topics among Tweets during the USA 2020 presidential election, and attempts to answer the question of whether there is any correlation between the aforementioned classes of Tweets and the election results. To this end, 1,349,373 original Tweets were collected in real time from April 2020 until January 2021, based on four terms: Trump, Biden, Democrats, and Republicans. Of these Tweets, 40,000 were manually labeled based on political affiliation, the Tweet’s topic, the factuality of the information within, and the presence of extremism. A long short-term memory (LSTM) network was then trained to automatically classify the entire set of Tweets into these classes. Since almost all Tweets lack a geographical tag, the location description of the user posting each Tweet was dissected using natural language processing (NLP) methods to identify the USA state in which the user resides. Geographical information system (GIS) capabilities were deployed to tie the geographical location to the predicted information and to visualize the distribution of misinformation, extremism, political affiliation, and topics across the country. Finally, correlation coefficients were calculated between the number of Tweets in the aforementioned classes in each state and the number of votes for Trump and Biden. It is shown that there is a correlation between the size of these classes of Tweets and the election results; for instance, a higher ratio of Tweets affiliated with one party is correlated with a higher ratio of votes for that party’s candidate.

The following section reviews literature related to this work. Sect. ‘‘Data description’’ describes the data collection process. Sect. ‘‘Classification of the textual content of tweets’’ outlines the automatic classification of Tweets based on their political affiliation, topic, and the presence of misinformation and extremism. Sect. ‘‘Identifying the geographical location’’ explains how each Tweet is associated with a state in the USA. Sect. ‘‘Geographical visualization of labeled tweets and their correlation with election results’’ provides geographical visualizations of the distribution of misinformation, extremism, and topics, along with their political affiliation, across different states in the USA and investigates the correlation between these classes and the USA 2020 presidential election results. Sect. ‘‘Conclusions and future directions’’ concludes this paper and provides future research directions.

Related work

Desouza et al. [27] enumerated the following factors as to why OSN play a key role in shaping the political discourse: (a) the volume of data and the diversity of data sources, (b) analytical methods that extract semantic knowledge from large volumes of data, (c) automatic algorithms that learn citizens’ personal views and preferences, (d) advancements in behavioral science that provide tactics for persuading humans toward particular actions [32], and (e) the ability to test and modify the aforementioned techniques on OSN at a relatively low cost.

Most OSN users are passive, i.e. they read the content [33] without contributing to it [34]. Consequently, active and hyperactive users shape the OSN content. Hyperactivity is one of the information operation tactics for influencing discussions by contributing intensively to OSN content [35, 36]. This can be done by both human and automated accounts. Automated accounts (also known as bots, short for robots) are autonomous software whose tasks resemble those of a human on Twitter, such as liking, tweeting, and retweeting, but executed for specific goals and on a large scale, through the Twitter API. Hyperactive users are those who distribute their political opinions on OSN disproportionately compared to regular users, by liking, commenting, tweeting, or other possible means. By overrepresenting the political issues and opinions that are important to them, hyperactive users deform the actual picture of public opinion on OSN and distort public communication and discussion toward their own ends. Papakyriakopoulos et al. [37], Thieltges et al. [38], and Shao et al. [39] showed that the recommendation systems in OSN not only fail to prevent this distortion but also magnify hyperactive users’ interests. OSN recommendation algorithms promote the most popular or most-liked information; under the advertising business model, this translates to more revenue, because popular content encourages users to engage more and spend more time on the platform.

Additionally, these algorithms on platforms such as Facebook, Twitter, and YouTube are designed to offer users content deemed likely to engage them. They do not offer citizens a neutral space for conversation, but rather highly contrived, personalized media experiences designed to serve the needs of advertisers [1]. This, in turn, feeds users who are drawn to extremist narratives with continually more of the same inflammatory content, leading them down a rabbit hole of extremism [1, 40, 41].

Bots (fake accounts) on OSN have been used extensively to influence political campaigns by inflating a candidate’s number of followers, inflating a post’s or hashtag’s popularity, and posting positive or negative comments on other posts [42]. As a second-order effect, this also distorts the statistics reported by mainstream news media about public sentiment toward different candidates and opinions.

A few methods have been proposed to stop or slow down online disinformation operations on OSN and search engines. Costine [43] proposed flagging and down-ranking inaccurate claims on OSN, as detected by outside fact-checkers. Garrett and Weeks [44] proposed bringing contextual awareness by involving some users from the poster’s social network in the fact-checking process, in other words, allowing users to correct their peers. Google is introducing two features, claim accuracy and source reliability, to be embedded in its search engine to fight disinformation operations. One focuses on informing the user that a piece of information or a retrieved result is false by displaying the word false next to that specific search result [45, 46]; the other focuses on down-ranking retrieved information when its source has a low trustworthiness rating [47]. For the former approach, Google is encouraging outside fact-checking organizations to make their results machine-readable, so that its search engine can automatically embed the fact-checking results into search results [45, 46]. For the latter approach, Google is working on algorithms to rate the trustworthiness of the sources of search results [48].

Garrett [49] proposed containing the spread of messages and posts that carry extreme anger, outrage, and distrust as another approach to reducing the exposure of a mass audience on OSN to socially harmful falsehood. This approach is based on the fact that disinformation operations often take advantage of emotional extremity to influence how people respond and react to their information environment. An exploratory analysis by Shu et al. [50] on multiple OSN political and entertainment datasets showed that:

  • Users who post real news tend to have longer account ages than those who post false news, implying that fresh accounts are more likely than old ones to have been created intentionally for spreading false news,

  • Real news draws more neutral replies than positive or negative ones, whereas false news draws more negative replies,

  • Real news tends to have a higher ratio of likes and replies, whereas false news tends to have a higher ratio of Retweets, and

  • The number of Retweets for real news increases steadily over time, whereas for false news that number jumps suddenly at the beginning and then remains constant over time.

Some attempts have been made to automatically detect political misinformation on OSN. Among the features used in machine learning models to detect false political information are: features extracted from the textual content [2, 50,51,52]; sentiment [53], polarity [52], subjectivity [52], and disagreement [52]; hashtags [2]; the number of replies [52]; the number of images, videos, question marks, exclamation points, first/second/third-person pronouns, and smile emoticons in the Tweet thread (conversation tree) [52]; account age [52]; user engagement features (i.e. the number of replies, Retweets, and likes) [50, 51]; features extracted from the user friendship network [51, 52]; user profile features (e.g. user credibility and political affiliation) [51]; the topology of the diffusion network [53]; and features extracted from the content of the URLs mentioned in the Tweet [2]. Among these, the textual content, hashtags, the number of images, videos, and smile emoticons in the Tweet thread, user engagement features, the user friendship network, diffusion networks, and the content of the URLs mentioned in the Tweet have been shown to be the most effective in identifying false political information.
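To make these feature families concrete, the following is a minimal sketch of extracting a few of the surface features listed above from a Tweet’s text; it is our illustration rather than a reproduction of any cited model, and the function name and exact feature set are hypothetical.

```python
import re

def surface_features(text: str) -> dict:
    """Compute a few illustrative surface features of the kind cited above."""
    tokens = text.split()
    return {
        "n_hashtags": sum(t.startswith("#") for t in tokens),
        "n_urls": len(re.findall(r"https?://\S+", text)),
        "n_question_marks": text.count("?"),
        "n_exclamation_points": text.count("!"),
        "n_first_person": sum(t.lower().strip(".,!?") in {"i", "we", "me", "us", "my", "our"}
                              for t in tokens),
    }

print(surface_features("Why would they hide this?! Read it yourself: https://example.com #wakeup"))
```

In practice, such surface features would be concatenated with engagement, network, and profile features before training a classifier.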

Our work stands out because it uses both the user’s location description and the Tweet’s content, combining the power of deep learning, NLP, and GIS to visualize the distribution of misinformation, extremism, and topics, along with their political affiliation, across the USA during the USA 2020 presidential election. It also investigates correlations between these classes of Tweets and the election results.

Data description

Twitter was chosen for this project because not only is it a prominent example of an OSN [54], with over 330 million monthly active users [55], but it is also a public forum where everyone’s posts are publicly available, and it plays an exceptional role in spreading political misinformation [27]. Original Tweets written in English and containing at least one of the following four terms: Trump, Biden, Democrats, and Republicans were collected by our server in real time from the beginning of April 2020 to the end of January 2021, using Python [56]. This resulted in 1,349,373 original Tweets (not Retweets or Replies). Table 1 shows the percentage of Tweets containing each of the four keywords. While 605,225 different Twitter accounts published these Tweets, 74% of these accounts posted only one Tweet, 12% posted only two Tweets, and the remaining 14% posted more than two Tweets. In other words, 14% of the Twitter accounts posted 44% of our collected Tweets. Table 2 shows the accounts that posted the largest number of Tweets in our collection.

Table 1 Percentage of 1,349,373 original Tweets collected from April 2020 to January 2021 containing each keyword
Table 2 Twitter accounts that posted the largest number of Tweets in our collection, along with their number of Tweets
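For illustration, the collection logic can be sketched as follows, assuming the Tweepy 3.x library against Twitter’s v1.1 streaming endpoint that was available during the collection period (both have since been retired); the credentials and file name are placeholders, and this is a sketch of the approach rather than the exact collection script.

```python
import json
import tweepy  # Tweepy 3.x against the v1.1 streaming API (since retired)

KEYWORDS = ["Trump", "Biden", "Democrats", "Republicans"]

class OriginalTweetListener(tweepy.StreamListener):
    def on_status(self, status):
        # Keep original Tweets only: skip Retweets and Replies.
        if hasattr(status, "retweeted_status") or status.in_reply_to_status_id is not None:
            return
        with open("tweets.jsonl", "a") as f:
            f.write(json.dumps(status._json) + "\n")

    def on_error(self, status_code):
        # Returning False disconnects the stream on rate limiting (HTTP 420).
        return status_code != 420

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")  # placeholder credentials
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
stream = tweepy.Stream(auth=auth, listener=OriginalTweetListener())
stream.filter(track=KEYWORDS, languages=["en"])
```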

Classification of the textual content of tweets

We studied 40,000 Tweets with regard to their relevance to the USA 2020 presidential election. Each relevant Tweet was manually classified in three different ways: whether or not it contains misinformation, whether or not it contains extreme opinion, and whether it is Rightist, Leftist, or Neutral. Misinformation Tweets are those propagating false information and news, conspiracy theories, and lies, as long as their falsehood can be determined through valid sources. Extreme opinion Tweets aim to incite violence, radicalize people based on their political party, religion, race, etc., or undermine the country’s political system and federal and local organizations. Leftist Tweets favor Democrats, while Rightist Tweets favor Republicans [57].

Additionally, Tweets were classified based on their topic. The classes are not exclusive, and a Tweet might fit more than one topic. The two main topics among the 40,000 manually investigated Tweets are the Coronavirus pandemic (30.12% of Tweets) and politicians (74.95%). There are other topics, such as government policies (9.4%), USA institutions (9.58%), and elections (10.51%), but because of their small size, the model could not be trained sufficiently to classify them automatically.

A recurrent neural network (RNN) makes it possible to classify a text as a whole while taking into account both the sequence of a stream of textual information of arbitrary length and its contextual information. An RNN captures more contextual and structural detail from the text than term frequencies alone, enabling it to distinguish between documents that a term-frequency classifier might not. The classification steps of an RNN are as follows [58] (a code sketch follows the listed steps):

Algorithm 1: RNN steps for text classification

1. Tokenization

The Tweet’s text is broken into individual terms

2. Constructing a feature vector for each token

A feature vector is created for each token using word embeddings

3. Classifying the first token

The RNN receives the feature vector of the first token and produces a class label as output. However, this is not yet considered the class label of the entire Tweet

4. Classifying the next tokens, one by one, in the same order, until the last token in the Tweet

The RNN receives the feature vector of the second token in the Tweet and produces a class label as output. When the RNN attempts to classify the second token, it has in its memory how and why it assigned a specific class label to the first token, and it applies that knowledge. In other words, while the first token is classified independently, the second token is classified based not only on its own feature vector but also on how and why the previous token was classified

The RNN then receives the feature vector of the third token in the Tweet. When it attempts to classify this feature vector, it also remembers and applies the knowledge of how it classified all the previous tokens. All tokens are fed to the RNN, one by one, until the last token

If a second Tweet needs to be classified, the RNN first wipes its memory of how it classified the tokens of the previous Tweet. In other words, the RNN classifies the first token of the second Tweet independently, regardless of the previous Tweet’s classification.
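The steps above map directly onto standard deep learning tooling. Below is a minimal sketch using Keras; the placeholder data, vocabulary size, sequence length, and layer sizes are illustrative and are not the values used in this study.

```python
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

texts = ["fake ballots everywhere, wake up", "both campaigns held rallies today"]  # placeholder Tweets
labels = np.array([2, 1])  # e.g. 0 = Leftist, 1 = Neutral, 2 = Rightist

# Step 1: tokenization; each Tweet becomes a sequence of token indices.
tokenizer = Tokenizer(num_words=20000)
tokenizer.fit_on_texts(texts)
X = pad_sequences(tokenizer.texts_to_sequences(texts), maxlen=60)

model = Sequential([
    Embedding(input_dim=20000, output_dim=100),  # Step 2: a feature vector per token
    LSTM(128),                                   # Steps 3-4: tokens consumed one by one, with memory carried across them
    Dense(3, activation="softmax"),              # output layer: one unit per class
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(X, labels, epochs=3)
```

Note that Keras resets the LSTM’s internal state between Tweets by default, which corresponds to the memory wiping described in the last step.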

The long short-term memory (LSTM) network [59] is a state-of-the-art RNN architecture for classifying (i.e. labeling) a Tweet’s textual content. An LSTM network (shown in Fig. 1) is composed of:

  • An input layer (shown as x_t in Fig. 1), where the number of neurons in the input layer equals the number of explanatory variables in the feature space,

  • One or more hidden layers (the gray area marked as memory cell in Fig. 1), which produce a hidden state (shown as h_t in Fig. 1) at every time step, and

  • A multi-layer perceptron (MLP) with a softmax output layer, which receives the hidden state generated by the memory cell (h_t in Fig. 1) and produces a class label. The output layer of this MLP has as many units as there are classes; each class is locally represented by a binary target vector with one non-zero component.

Fig. 1 Architecture of the LSTM memory cell

Hidden layers, also referred to as memory cells, are the main characteristic of LSTM networks. The structure of a memory cell (or hidden layer) in an LSTM is illustrated in Fig. 1. Each memory cell contains three gates, whose values range from 0 to 1 and act as filters:

  • The forget gate (f_t) specifies which information is erased from the cell state,

  • The input gate (i_t) controls which information is added to the cell state, and

  • The output gate (o_t) decides which information from the cell state is passed to the hidden state (h_t).

At every time step t, each of the three gates is presented with the input vector (x_t) at time step t as well as the output vector of the memory cells at the previous time step (h_{t-1}).
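For reference, in the standard LSTM formulation the gate activations and state updates described above are computed as follows, where σ is the logistic sigmoid, ⊙ denotes element-wise multiplication, c_t is the cell state, and the W, U, and b are learned weights and biases:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```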

Identifying the geographical location

The Twitter server returns Tweets as JSON objects with multiple fields, of which the textual content of the Tweet is only one. Four fields relate to geographical location: the geographical coordinates of the Tweet, the geographical coordinates of the Twitter account owner (referred to as the user), a description of the place of the Tweet, and the description of the account owner’s location on their profile. The first two location fields are filled automatically if the user permits and remain empty otherwise. The other two are free-form texts written by the user, which can be left empty. Table 3 shows the number and percentage of Tweets whose location fields are not empty, out of the 1,349,373 original Tweets we collected.

Table 3 The number and percentage of Tweets whose location fields are not empty, out of 1,349,373 original Tweets

According to this table, the only field that can meaningfully be used to determine the location of a considerable portion of Tweets is the user location description (the last column in the table). However, writing a program that automatically extracts the state or city from this field faces multiple challenges: (a) there are duplicates and overlaps in city names, i.e. cities with the same or similar names in different states or countries, (b) the field is free-form text that follows no standard or format, (c) users sometimes spell cities or states in creative or unusual ways, and (d) the user can write anything in this box, which might not necessarily describe their location. Notwithstanding these challenges, a program was written to look for certain patterns in this field and identify, from the text, the state in which the user resides. These patterns include full state names, state name abbreviations, and city names unique to each state. The program was adjusted multiple times to ensure that states are not misidentified due to, for instance, cities having the same name in two states or countries, city names (e.g. Charlotte) that have an extended version which is the name of another city in another state (e.g. Charlottesville), and state abbreviations or city names that can be part of another word.
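A simplified sketch of this pattern matching is given below; the pattern table is an illustrative subset of the real one, which covers all states, their abbreviations, and unique city names, with many manually tuned exceptions. Note how word boundaries prevent ‘Charlotte’ from matching inside ‘Charlottesville’ and state abbreviations from matching inside other words.

```python
import re

# Illustrative subset of the pattern table; the actual mapping is far larger.
STATE_PATTERNS = {
    "Virginia":       re.compile(r"\b(Virginia|VA|Charlottesville)\b"),
    "North Carolina": re.compile(r"\b(North Carolina|NC|Charlotte)\b"),
    "Texas":          re.compile(r"\b(Texas|TX|Houston)\b"),
}

def identify_state(location_description: str):
    """Return the first state whose pattern matches the free-form text, or None."""
    for state, pattern in STATE_PATTERNS.items():
        if pattern.search(location_description):
            return state
    return None

print(identify_state("Charlottesville, VA"))  # Virginia
print(identify_state("Charlotte, NC"))        # North Carolina
print(identify_state("somewhere on earth"))   # None
```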

It is noteworthy that the stricter the rules, the higher the precision (correctness) of the identified states, but the lower the recall (i.e. no state is identified at all in many cases). By revising and adjusting the rules, we were able to identify the state for 486,969 Tweets (recall: 53.90%), out of the 903,546 Tweets whose user location description was not empty, with a precision (correctness) of 99%. This 99% precision was established by manually inspecting 5000 automatically identified states.

Geographical visualization of labeled tweets and their correlation with election results

Table 4 shows the ten-fold cross-validation accuracy of the LSTM network in classifying Tweets. For political affiliation, only the overall accuracy is reported, since it is a three-way classification: Leftist, Neutral, and Rightist. The highest F1 score was achieved for Topic 1 (Coronavirus pandemic), followed by misinformation, Topic 2 (politicians), and extreme opinion. Next, the entire set of 1,349,373 original Tweets was classified using the trained model.

Table 4 Ten-fold cross-validation: overall accuracy, precision, recall, and F1 score in predicting misinformation, extreme opinion, political affiliation, and topics
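For reference, ten-fold cross-validation scores of this kind can be computed with scikit-learn as in the sketch below; a simple bag-of-words classifier stands in for the LSTM network purely to keep the example short, and the data are placeholders.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline

# Placeholder data: 1 = misinformation, 0 = not misinformation.
texts = ["the election was stolen by microchips", "polls open at 7 am on Tuesday"] * 10
labels = np.array([1, 0] * 10)

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
scores = cross_validate(model, texts, labels, cv=10,
                        scoring=["accuracy", "precision", "recall", "f1"])
for metric in ("accuracy", "precision", "recall", "f1"):
    print(metric, scores[f"test_{metric}"].mean())
```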

Figure 2 shows, for each state, the number of times the word ‘Trump’ appeared in Tweets divided by the number of times the word ‘Biden’ appeared. Figure 3 shows, for each state, the collective frequency of the words ‘Coronavirus’, ‘Corona’, and ‘Covid’ in Tweets divided by the collective frequency of the entire lexicon; in other words, the percentage of all words in Tweets that are ‘Coronavirus’, ‘Corona’, or ‘Covid’.

Fig. 2 The number of times that the word ‘Trump’ appeared in Tweets divided by the number of times that the word ‘Biden’ appeared in Tweets in each state, from April 2020 to January 2021

Fig. 3 Percentage of the total number of words in Tweets that are ‘Coronavirus’, ‘Corona’, or ‘Covid’ in each state, from April 2020 to January 2021
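Choropleth maps such as those in Figs. 2 and 3 can be produced with standard GIS tooling; the sketch below uses GeoPandas with a hypothetical states shapefile (assumed to contain a ‘NAME’ column) and placeholder per-state values.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd

states = gpd.read_file("us_states.shp")  # hypothetical shapefile with a 'NAME' column
ratios = pd.DataFrame({
    "state": ["Texas", "California", "Florida"],  # placeholder rows
    "trump_biden_ratio": [1.4, 0.9, 1.2],         # placeholder values
})

# Join the per-state statistic onto the geometries and draw a choropleth.
merged = states.merge(ratios, left_on="NAME", right_on="state")
merged.plot(column="trump_biden_ratio", cmap="RdBu_r", legend=True)
plt.title("Frequency of 'Trump' / frequency of 'Biden' per state")
plt.axis("off")
plt.show()
```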

Figure 4 shows what percentage of all Tweets were automatically labeled as misinformation in each state. Figure 5 shows the automatically detected political affiliation of misinformation Tweets in each state. Figure 6 shows what percentage of all Tweets were automatically labeled as extreme opinion in each state. Figure 7 shows the automatically detected political affiliation of extremist Tweets.

Fig. 4 Percentage of all Tweets that were automatically labeled as misinformation in each state, from April 2020 to January 2021

Fig. 5 Automatically detected misinformation Tweets in each state, from April 2020 to January 2021, colored based on political affiliation

Fig. 6 Percentage of all Tweets that were automatically labeled as extreme opinion in each state, from April 2020 to January 2021

Fig. 7 Automatically detected extreme opinion Tweets in each state, from April 2020 to January 2021, colored based on political affiliation

Figure 8 shows what percentage of all Tweets were automatically labeled as related to the Coronavirus pandemic in each state. Figure 9 shows the automatically detected political affiliation of the Coronavirus pandemic Tweets in each state. Figure 10 shows what percentage of all Tweets were automatically labeled as related to politicians in each state. Figure 11 shows the automatically detected political affiliation of Tweets about politicians in each state.

Fig. 8 Percentage of all Tweets that were automatically labeled as related to the Coronavirus pandemic in each state, from April 2020 to January 2021

Fig. 9 Automatically detected Tweets about the Coronavirus pandemic in each state, from April 2020 to January 2021, colored based on political affiliation

Fig. 10 Percentage of all Tweets that were automatically labeled as related to politicians in each state, from April 2020 to January 2021

Fig. 11 Automatically detected Tweets about politicians in each state, from April 2020 to January 2021, colored based on political affiliation

We obtained the number of votes for Trump and Biden in each state in the USA 2020 presidential election from [60]. Table 5 shows the correlation coefficient between the relative number of automatically classified Tweets in each category (the class size divided by the total number of Tweets in each state) and the relative number of votes for each candidate (the number of votes for each candidate divided by the total number of votes in each state). Correlation coefficients between −0.2 and 0.2 are not shown. The most significant observation is the relationship between misinformation and the election results. While Leftist misinformation has a positive correlation with Biden votes and a negative correlation with Trump votes, Rightist misinformation has a positive correlation with Trump votes and a negative correlation with Biden votes. The ratio of Rightist to Leftist misinformation Tweets has a 0.67 correlation coefficient with the ratio of Trump votes to Biden votes. This highlights that political misinformation on social media is fairly correlated with who people ultimately vote for. A similar observation is made with regard to extremist Tweets, but with a much lower correlation coefficient: the ratio of Rightist to Leftist extremist Tweets has a 0.33 correlation coefficient with the ratio of Trump votes to Biden votes.

Table 5 Correlation coefficient between Tweet analytics (rows) and USA 2020 presidential election results (columns); in this correlation matrix darker brown indicates higher positive correlation and darker blue indicates higher negative correlation

The correlation between Tweet topics and votes is also noteworthy. While the topic of the Coronavirus had a positive correlation with Biden votes, the topic of politicians had a positive correlation with Trump votes. The number of Tweets about the Coronavirus pandemic has a −0.42 correlation coefficient with the ratio of Trump votes to Biden votes, while the number of Tweets about politicians has a 0.26 correlation coefficient with that ratio. Considering the political affiliation of Tweets provides a clearer picture of how they are correlated with the election results: Rightist Tweets on either topic have a positive correlation with Trump votes, and Leftist Tweets on either topic have a positive correlation with Biden votes. The ratio of Rightist to Leftist Tweets about the Coronavirus pandemic has a 0.53 correlation coefficient with the ratio of Trump votes to Biden votes, and the ratio of Rightist to Leftist Tweets about politicians has a 0.43 correlation coefficient with that ratio.
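The entries of Table 5 reduce to pairwise correlations over the per-state values; a minimal sketch, assuming Pearson correlation (the pandas default) and using hypothetical column names and placeholder values:

```python
import pandas as pd

# One row per state; the column names and values are placeholders.
df = pd.DataFrame({
    "rightist_vs_leftist_misinfo": [2.1, 0.8, 1.5, 0.6],
    "trump_vs_biden_votes":        [1.3, 0.7, 1.1, 0.8],
})

# Correlation between one Tweet-class ratio and the vote ratio.
r = df["rightist_vs_leftist_misinfo"].corr(df["trump_vs_biden_votes"])
print(round(r, 2))

# The full correlation matrix across all columns, as in Table 5.
print(df.corr())
```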

Conclusions and future directions

In this paper, machine learning, natural language processing, geographical visualization, and statistical tools were used to understand the relationship between political misinformation, extremism, topics, and their political affiliation in conversations on social media on the one hand, and the USA 2020 presidential election results on the other. After automatic classification of Tweets into the aforementioned classes, it was shown that there is a correlation between these factors and the election results. The strongest correlation was that the ratio of Rightist to Leftist misinformation Tweets has a 0.67 correlation coefficient with the ratio of Trump votes to Biden votes. A similar result, but with a correlation coefficient of 0.33, was obtained for extremism. Rightist Tweets about the Coronavirus pandemic or politicians were found to have a positive correlation with Trump votes, and Leftist Tweets on either topic had a positive correlation with Biden votes. The prevalence of Tweets about the Coronavirus pandemic had a positive correlation with Biden votes, and the prevalence of Tweets about politicians had a positive correlation with Trump votes. This indicates that there is a correlation between what happens on Twitter and how people vote; however, this is not a causal inference. In other words, it is not known whether topics, misinformation, and extremism on social media drive how people vote or vice versa. Was it the misinformation and extremism on social media that convinced people to vote one way or the other, or had people already made up their minds before engaging in online conversations? Did the topics of Tweets on social media cause people to vote one way or the other, or was it people’s already decided votes that led them to engage in certain conversations on social media? While these questions are deferred to another study, we showed that the political affiliation of topics and the extent of misinformation and extremism on social media are correlated with the election results to some degree.