Introduction

The COVID-19 pandemic has affected, and continues to affect, lives worldwide in an unprecedented way. At the same time, the amount of information that has been generated during the pandemic is unprecedented. Social media users have created large amounts of publicly available communications that capture their views, opinions, concerns, thoughts, and knowledge about the pandemic. Our research investigates the text content of some of that social media, focusing on Twitter posts (tweets), but also on Google and Wikipedia searches, to study human communications during the pandemic.

We use text analysis to extract and classify opinions and to study how internet search data predict Twitter tweet sentiment. Using both text analysis tools and manual assessment, Twitter tweets are analyzed for their content and expressions of sentiment and psychological content. We study how characteristics of the social media (e.g., pictures or no pictures) lead to different text concepts within Twitter messages. We investigate the relationship between Google and Wikipedia searches and the sentiment of Twitter messages, creating a model relating search and social media. We also examine how a dictionary of key COVID concepts discussed in Twitter tweets is related to the extent to which social media messages get retweeted and add to the sender’s reputation in the context of social media. Throughout, we use text analysis because it provides insight into the social media text provided by Twitter users [1, 2]. Accordingly, we use social media data, in the form of Twitter tweets and Google and Wikipedia searches, all of which were collected during the pandemic. Our specific objectives are to: analyze the sentiment and emotional content in Twitter tweet texts using both computer-based and manual methods; study the differences in text concepts identified in different types of Twitter tweets and retweets; develop a model of “search and tweet” and examine the ability of Google and Wikipedia searches to predict the sentiment in Twitter tweets; investigate the impact of the use of COVID-19-specific concepts (words) on human communication through Twitter; and identify general implications of our findings beyond the COVID-19 pandemic.

This research makes several contributions. First, the analysis shows that the text sentiment and emotional content of Twitter messages with and without video and pictures are statistically significantly different. Second, the text sentiment and emotional content in COVID-19-based tweets and retweets were statistically significantly different. Third, Google searches and Wikipedia page views are predictive of the percentages of positive and negative sentiment tweets. Fourth, a variable representing a dictionary of words capturing COVID-19 concepts was statistically significant in models of Twitter influence and retweets, indicating the impact on human communication. In addition, we propose, as the basis for needed further analysis, a possible COVID-19 health management life cycle and further types of analysis that might prove useful with respect to different types of sentiment (e.g., neutrality, ambivalence).

The remainder of this paper proceeds as follows. The “Data and Methodology” section provides an overview of the data and the methods used. The “Twitter Text Analysis: Tweets vs. Retweets and Pictures vs. No Pictures” section discusses text analysis in Twitter tweets and investigates different levels of emotions in different types of tweets. The “Behavioral Links Between Internet Search and Twitter Sentiment” section examines the behavioral link between internet search, using Google and Wikipedia, and Twitter sentiment. The “Impact of COVID-19 Vocabulary Use on Twitter Reputation and Retweets” section considers the impact of COVID-19-based communications on Twitter influence and Twitter retweets. The “Manual Analysis of Ad Hoc Tweets” section manually analyzes the tweets using a sentiment ontology. The “COVID-19 Management Life Cycle, Emerging Pandemic Issues, and Computational Extensions” section presents the notion of a COVID-19 life cycle and discusses additional approaches and extensions, such as Word2Vec, ensemble methods, and notions of ambivalence. Finally, the “Summary, Contributions, and Conclusion” section summarizes the paper, discusses the implications of our findings and related research, and proposes further extensions.

Data and Methodology

In response to the pandemic, people sought to gain information about COVID-19 from personal interaction and the sharing of stories and content. They shared their concerns, questions, opinions, and knowledge on social media [3]. Accordingly, we use such user-provided content as the data for our research, especially Twitter tweets.

The Emotional Content of Twitter Tweets’ Text

Twitter tweets are often recognized as capturing the sentiment and emotional content of the crowd [4]. With millions of tweets generated daily, Twitter generally is perceived as a useful platform for research. Researchers have investigated many related issues using tweets, including tracking disease propagation, anticipating election results, and predicting sports outcomes [5]. Further recognizing the value of Twitter data, Banda et al. [6] created a large database of tweets related to COVID-19 and made it available to researchers, to provide substantial opportunities for investigation [7]. Other research has also used Twitter data for analyzing issues related to users’ attitudes towards COVID-19 [8,9,10].

Since people express their opinions and ideas in this user-generated content, this online text can be mined to extract the corresponding sentiment in the textual disclosures [11]. There are different approaches to text analysis [12, 13]. Text analysis efforts related to COVID-19 vaccinations have used a wide range of AI-enabled social media analysis on large data sets to accommodate the unstructured nature of the data [14]. As an example, Leibowitz et al. [15] used Linguistic Inquiry and Word Count (LIWC) to investigate the text of Twitter tweets generated by emergency medicine Twitter users and found that approximately 34% of the tweets were positive and 31% were negative; 76.5% focused on the present.

Review of COVID Data Sets and a Data Timeline

This research focuses on studying the early part of the pandemic. We, therefore, track some of the key events from early in the pandemic. Figure 1 shows a timeline of significant COVID-19-related events as the pandemic was identified and progressed during its initial stages when there were predictions that it might end by August 2020.

Fig. 1

Progression of coronavirus into COVID-19 pandemic

During the early phases of the COVID-19 pandemic, many people turned to Twitter as a platform to both post and retrieve information on the virus. Twitter tweets collected throughout this timeframe became the source of data for this research. We collected sets of data from Twitter, divided into roughly two periods: as the pandemic was identified and progressed to countries shutting down, and as countries started to reopen while cases continued to rise. These periods were, approximately, before early to mid-June 2020 and after mid-June 2020.

LIWC

The LIWC text analysis program [16, 17] was applied to the tweets. LIWC is perhaps the leading software for capturing information regarding psychological concepts from text. LIWC uses a psychology-based bag of words approach to analyze text. Tausczik and Pennebaker [18] provide a history of LIWC and the bag of words approach, which is derived from Freud and others and has a long history in psychology. Different concepts are represented within LIWC, such as “positive emotion” and “negative emotion,” but also related concepts such as “anger” and “power.” For each concept, a dictionary of words is included in LIWC. The software is then used to identify the relative frequency of occurrence of these words in a body of text (e.g., Twitter tweets), thus providing a systematic approach to text analysis. Representative concepts examined in this paper are summarized in Table 1.

Table 1 Summary of selected concepts and categories from LIWC
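LIWC’s dictionaries themselves are proprietary, but the bag of words scoring it performs can be sketched as follows; the two tiny category word lists here are illustrative stand-ins, not LIWC’s actual dictionaries.

```python
import re
from collections import Counter

# Illustrative mini-dictionaries standing in for LIWC categories;
# the real LIWC dictionaries are proprietary and far larger.
CATEGORIES = {
    "positive_emotion": {"good", "hope", "safe", "recover"},
    "negative_emotion": {"sick", "fear", "death", "worry"},
}

def category_frequencies(text):
    """Return each category's share of total words, as a percentage."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    return {
        cat: 100.0 * sum(counts[w] for w in vocab) / total
        for cat, vocab in CATEGORIES.items()
    }

scores = category_frequencies("I worry my friend is sick but I hope she will recover")
```

The summary variables (analytic, clout, authenticity, tone) would then map such raw frequencies onto LIWC’s empirical percentile distributions, which are not public.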

LIWC has two different types of measures: “summary variables” and “categories of words.” Categories of words, for example, “anger” or “power,” are analyzed comparatively, typically within the existing sample, based on their relative occurrence. In contrast, four concepts (analytic, clout, authenticity, and tone) have been established as “summary variables,” which are “standardized composites, based on previous research” [19]. The four summary variable concepts aggregate the frequency of word occurrences from other categories. LIWC’s summary variables are not reported as raw occurrence rates, as the category variables are; instead, the number of occurrences is mapped onto an empirical distribution and reported as a percentile ranging from 0 to 100. The summary variables therefore allow us to make statements about these measures independent of additional comparisons, such as in-sample comparisons.

Although we primarily focus on the summary variables, we also chose the categories of affiliation, anger, health, negative emotion, positive emotion, power, and social, for various reasons: health, because the coronavirus is an issue associated with health; anger, because we expect an angry reaction to the pandemic; and positive and negative emotion, because we expect that tweets about the coronavirus would be emotional. We expected affiliation to be an important distinguishing variable as people connect with each other in a friendship type of gesture. Similarly, we expect that the tweets provide a social outlet for the tweeter. Finally, we included power because the process of tweeting is likely to provide the tweeter with a certain extent of perceived power over a situation.

Our Approach and Use of Data

Users communicate using Twitter, allowing us to capture and analyze conversations in text format. We can use dictionaries to capture concepts, such as sentiment, emotions, or COVID-19, or use other types of words. By focusing on Twitter, we study human communication and behavior during the pandemic. Figure 2 provides an overview of the analysis conducted in this research (Footnote 1).

Fig. 2

Analysis of Twitter posts related to COVID-19

The data sets included: Twitter posts from early March to mid-June 2020 (2200 posts); a set of 900 posts collected in mid-July; and approximately 22,000 online posts collected daily from July–August 2020. These posts were collected by scraping tweets using tools available on the internet, by manual collection, and by using https://birdiq.net/. The first data set was the most specific to COVID-19 and was collected by reading and searching tweets based on keywords such as symptom, infection, fever, fatigue, sick, COVID, and coronavirus. It also included mentions of family members or friends who might have had contact with the virus. The targeted collection was intended to identify the sentiment and emotions of people who were dealing with COVID-19 in a concrete way. The second data set was intended to provide an overall sense of the attitude towards the virus over time. These tweets were collected automatically in July 2020 and cleaned (Footnote 2). They served as input to a text analysis tool. Daily tweets were collected from mid-July until August. Additionally, we collected data from Google searches and Wikipedia page views to study their predictive relationship with Twitter tweets (Footnote 3). A third data set employs additional data drawn from Hussain et al. [14].
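The specific cleaning steps are not detailed above; a minimal sketch, assuming typical tweet noise (URLs, @mentions, hashtag markers, and “RT” prefixes), might look like:

```python
import re

def clean_tweet(text):
    """Minimal tweet-cleaning sketch: drop the 'RT' prefix, URLs,
    and @mentions, strip '#' from hashtags, and collapse whitespace.
    The exact cleaning steps used for the study are assumptions here."""
    text = re.sub(r"^RT\s+", "", text)          # retweet prefix
    text = re.sub(r"https?://\S+", "", text)    # links
    text = re.sub(r"@\w+", "", text)            # mentions
    text = text.replace("#", "")                # keep hashtag words
    return re.sub(r"\s+", " ", text).strip()
```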

Twitter Text Analysis: Tweets vs. Retweets and Pictures vs. No Pictures

In this section, we analyze a random sample of 20,000 tweets gathered in July 2020, chosen based on whether they contained the term coronavirus, in order to study different forms of communication through tweets. We removed the tweets that were not in English, resulting in 14,352 tweets. That set of tweets was analyzed using different comparative approaches: tweets vs. retweets, and tweets with picture or video content vs. tweets without. Of the 14,352 tweets, there were 1627 original tweets, 11,667 retweets, and the rest were replies. We did not analyze replies. Of the 14,352 tweets, 12,021 did not include any picture or video; 2331 did include pictures or videos. We analyzed both the two sets of tweets and the entire dataset using LIWC.
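A sketch of how such a partition might be computed, assuming each collected tweet is a record with hypothetical fields such as lang, retweeted_status, in_reply_to, and media (the actual field names depend on the collection tool used):

```python
# Partition collected tweets into the comparison groups used above.
# The dict keys here are assumptions about the collection tool's output.
def partition(tweets):
    english = [t for t in tweets if t.get("lang") == "en"]
    originals = [t for t in english
                 if not t.get("retweeted_status") and not t.get("in_reply_to")]
    retweets = [t for t in english if t.get("retweeted_status")]
    with_media = [t for t in english if t.get("media")]
    without_media = [t for t in english if not t.get("media")]
    return originals, retweets, with_media, without_media
```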

Text Analysis of the Entire Sample of Tweets: Analytic, Clout, Authenticity, and Tone

Applying LIWC to the entire sample, we found that the tweets averaged an “analytic” score of 74.02, which generally suggests more logical thinking. In addition, the tweets averaged a “clout” score of 66.60, which suggests more confidence than average. However, the tweets averaged an “authenticity” score of only 23.97, which is in the lower quarter of such scores. Finally, the “tone” averaged only 35.46, which reveals anxiety, sadness, or hostility. These results suggest that, on average, tweets about the coronavirus were relatively analytic, and came from a position of some clout. However, the tweets were not authentic sounding, suggesting guarded positions, and, generally, were more negative than positive.

Text Analysis of Tweets Using Pictures and Video Versus No Pictures and No Videos

Tweets can appear as text only or supplemented with pictures and videos. An important consideration is the extent of the impact of the pictures and videos on the text messages. Does the text differ if the tweeter includes pictures or videos? Do people express different text emotions when they supplement their text with videos or pictures? If they do, what does that mean? Do tweeters expect the pictures and video to tell the story? This section investigates some of these issues within the context of the coronavirus pandemic, while raising questions for future research.

We conducted a text analysis of the differences between the two groups of tweets: those that did not have pictures or videos versus those that did. We used a two-sample t-test with unequal variances (Welch’s t-test) to test the differences between the two populations for each of our variables. The results are summarized in Table 2. It is interesting to note that, for each text variable except health, there is a statistically significant difference between the average values for the two categories.

Table 2 Analysis of tweets with and without pictures and video
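The unequal-variance (Welch) t-test used for these comparisons can be computed directly; a minimal sketch:

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic and degrees of freedom
    (unequal variances), as used to compare the two tweet groups."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb   # variance of each mean
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df
```

In practice a statistics package (e.g., SciPy’s ttest_ind with equal_var=False) would return the p value as well.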

The “no picture and no video” tweets showed statistically significant differences, with more of both positive and negative emotion and more affiliation. Those tweets also contained comparatively more social context words and were presented with greater power. Finally, the text in the no-picture and no-video messages was also statistically significantly more “authentic”; however, the authenticity rating was still approximately in the lower 25%, suggesting a guarded presentation. In contrast, the messages with pictures and videos showed statistically significantly greater analytic vocabulary, clout, and tone.

This analysis suggests different sentiment and emotional content in the two sets of tweets. It does not establish whether: the use of pictures and video leads to changes in the sentiment and emotion in the text; the type of problem that leads to using a picture or video leads to a different type of text; or if people who communicate using video or pictures use different text than those who do not. Alternatively, it may be some combination of the three. Regardless, these are general issues in human communication and behavior that are subjects for future research.

Text Analysis of Retweets Versus Tweets

We analyzed how the text of tweets that were retweeted differs from that of original tweets, with the results summarized in Table 3. There was no statistically significant difference for the variables affiliation and anger, whereas, for each of the other variables, there was a statistically significant difference. The retweets had greater values for clout, health, affiliation, power, social, negative emotion, and cognitive, whereas the original tweets had larger values for the categories of analytic, authenticity, tone, and positive emotion. Thus, the retweets were more negative in tone, suggested cognitive aspects to the tweet, focused more on health, and approached the tweet from a position of power.

Table 3 Analysis of tweets versus retweets

Thus, the results show a statistically significant difference between the text of retweeted tweets and original tweets. What is not clear is whether the sentiment and emotions are related to the likeliness of a tweet to be retweeted. This is a topic for future research, perhaps using behavioral research. Further, although we find a difference in these pandemic-based tweets, whether the same relationship will hold between non-COVID-19 tweets requires future investigation.

General Progression of Sentiment and Emotions Expression

Figure 3 summarizes the findings from the analysis of the Twitter tweets during the early phases of COVID-19. It provides a general timeline of the results from these three data sets, starting with the early manually collected and analyzed tweets and progressing to the LIWC analysis of the automatically collected tweets and retweets (Footnote 4).

Fig. 3

Timeline of sentiment and emotions expression

Behavioral Links Between Internet Search and Twitter Sentiment

In a recent research paper, Hussain et al. [14] analyze the sentiment of Twitter tweets during the COVID pandemic (Footnote 5). For their analysis, the researchers collected 10% of the Twitter tweets in a database of COVID pandemic tweets [20]. As part of their analysis, they determined the relative percentages of COVID-related tweets that had positive, negative, or neutral sentiment over a 37-week period during 2020. For the same period, we gathered both Google searches (from the USA; Footnote 6) and all Wikipedia page views, since country-by-country page view data are not available. We then used those searches and page views to predict the percentages of positive, negative, and neutral sentiment tweets.

Our analysis is based on the behavioral model that people search for information (for example, using Google or Wikipedia) and, after gathering their information, potentially issue a Twitter tweet. In that model, shown in Fig. 4, we would expect the numbers of Google searches and Wikipedia page views to be predictive of Twitter tweets.

Fig. 4

General behavioral model of users’ search for online content and reaction

We used a set of variables to capture the relative percent of tweets that had “positive” sentiment (Pos Sent), “negative” sentiment (Neg Sent), and “neutral” sentiment (Neutral Sent). We computed the correlations of these tweet percentages with the numbers of Google searches and Wikipedia page views (for coronavirus and COVID-19). As seen in Table 4, both the numbers of Google searches and Wikipedia page views are statistically significantly related to the percentages of tweets with positive and with negative sentiment. The numbers of searches and page views are negatively related to the percent of positive sentiment tweets and positively related to the percent of negative sentiment tweets. The percentage of neutral tweets is not statistically significantly related to Google searches or Wikipedia page views.

Table 4 Correlations and p values for models of sentiment

Next, we construct lagged (1-week) variables to test the ability to forecast the relative percentages of positive and negative sentiment tweets. The relative number of Google searches from Google Trends is the lagged variable “Google-1.” We capture the two different sets of lagged Wikipedia page views as “Wiki-Coronavirus-1” and “Wiki-COVID-19-1,” based on two different sets of pages (coronavirus and COVID-19). The correlation results for the lagged variables are summarized in Table 5 and the regression models in Table 6.

Table 5 Correlation and p values for models of sentiment—predictive
Table 6 Predictive regression models of sentiment based on Google and Wikipedia searches
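The construction of the one-week-lagged predictors and their correlation with tweet sentiment can be sketched as follows (pure-Python Pearson correlation; the lag mirrors the “Google-1” style variables above):

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def lagged_corr(search_counts, sentiment_pct, lag=1):
    """Correlate week t-lag search volume with week t tweet sentiment,
    i.e., pair each week's sentiment with the prior week's searches."""
    return pearson(search_counts[:-lag], sentiment_pct[lag:])
```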

As can be seen from Tables 4 and 5, the Google search variables and the Wikipedia page view variables are each highly correlated. Unfortunately, that correlation makes using both variables in the same regression equation difficult because of multicollinearity. Accordingly, we used only one of each of the variables in each regression equation, as reported in Table 6.

This analysis showed that both Google searches and Wikipedia page views were statistically significantly predictive of both positive and negative sentiment, but not neutral sentiment, in Twitter tweets. Further, both Google searches (lagged one period) and Wikipedia page views of “COVID-19” and “coronavirus” (lagged one period) are predictive of the percent of Twitter messages with positive and negative sentiment. The percent of Twitter messages associated with positive sentiment was negatively correlated with both Google searches and Wikipedia page views, whereas the percent associated with negative sentiment was positively correlated with both.

As a result, it appears that more Google searches and Wikipedia page views, ultimately, are related to more negative sentiment Twitter messages. This is an interesting behavioral finding that should be examined in other settings to determine if the same relationships hold. This is important because it provides a basis for a potential behavioral link between searches for information (Google searches and Wikipedia page views) and statements or positions issued through social media (Twitter).

Impact of COVID-19 Vocabulary Use on Twitter Reputation and Retweets

This section investigates, as an alternative view, the impact of COVID-19 on Twitter use: that is, whether the use of COVID-19 terms in Twitter tweets is statistically significantly related to an “influence measure” of Twitter users, and whether the use of those COVID-19 words is related to retweets. Doing so allows us to study the pandemic’s effects on human communication by tracking the relationship between occurrences of COVID-19-related words and both social media influence and the reuse of messages. Unfortunately, there has been limited research studying the impact of COVID-19 on these issues (Footnote 7). We, therefore, study the impact using text mining, supported by the generation and use of a dictionary of COVID-19 words. We then investigate the relationships between the dictionary words and Twitter “influence” scores, and between those words and two measures of retweeting. This allows us to gain insight into the impact of COVID-19-based words on communications.

Creating a Coronavirus Dictionary

To assess whether a tweet used COVID-19 concepts, we first generated a dictionary of COVID-19 terms, broadly based on words related to symptoms and controlling the spread of the disease. As a result, we focus on subcategories of detecting the coronavirus (currently have or had in the past), preventing the coronavirus, curing (e.g., a vaccine), symptoms (e.g., fever), and different names for the virus.

There are potentially many different terms that can be used to measure the extent to which text contains information about the coronavirus pandemic. We chose our terms based on the following process. First, we obtained a list of words and phrases that occurred in a list of coronavirus tweets and ranked them by the number of occurrences. Second, words not related to the disease were discarded; e.g., stop words were removed. Third, we reviewed the frequently occurring words to identify the subcategory to which each belonged. Finally, additional words and phrases from the authors were added. Our focus was on developing a set of words aimed at isolating text related to the coronavirus. Table 7 shows the coronavirus dictionary words.

Table 7 Coronavirus dictionary words derived from Twitter
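The first two steps of the dictionary-building process (frequency ranking and stop-word removal) can be sketched as follows; the stop-word list here is illustrative, as the text does not specify which one was used.

```python
import re
from collections import Counter

# Illustrative stop-word list; the actual list used is an assumption.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "i", "my"}

def candidate_terms(tweets, top_n=20):
    """Rank non-stop-words by frequency across a tweet corpus, as a
    starting point for the manual curation described above."""
    counts = Counter()
    for tweet in tweets:
        words = re.findall(r"[a-z']+", tweet.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)
```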

We aggregated all of the occurrences of these dictionary words under the category “coronavirus.” As with LIWC, we use a bag of words approach, counting the number of words from the dictionary occurring in the tweets, to measure the effects of COVID-19 words on Twitter communications. As a verification of the importance of the dictionary words in coronavirus text, we conducted a joint quoted Google search to determine each word’s co-occurrence with “coronavirus,” as reported in Table 8.

Table 8 Joint number of occurrences with “coronavirus”

The symptom words (e.g., lost sense of taste) had the lowest co-occurrence in our Google search. However, the symptoms of lost taste and smell in Table 8 seem particularly distinctive of the coronavirus. Despite their potentially low occurrence rate, we assessed that they would allow us to isolate and characterize coronavirus discussions.
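Scoring a tweet against the dictionary, including multi-word phrases such as “lost sense of taste,” can be sketched as follows (the entries shown are a small illustrative subset of Table 7):

```python
# A few illustrative entries; Table 7 lists the full dictionary.
CORONAVIRUS_DICT = ["covid", "coronavirus", "fever", "lost sense of taste"]

def coronavirus_score(text):
    """Count occurrences of dictionary words and phrases in a tweet,
    aggregated into the single 'coronavirus' category used above."""
    text = text.lower()
    return sum(text.count(term) for term in CORONAVIRUS_DICT)
```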

Linguistic Inquiry and Word Count for Control Variables

We used LIWC to provide the control variables for our analysis of influence and retweeting. LIWC was used to capture both structural and semantic information about tweets in order to study the impact of our coronavirus dictionary. LIWC “structural” variables are the number of words in the text (WC) and items related to the style or difficulty of the text, such as the number of six-letter or longer words (Sixltr) and the number of words per sentence (WPS). We chose those control variables for several reasons. Word count provides a measure of the actual length of the message. Both WPS and Sixltr capture the “level” at which the tweets are made: more words per sentence and more six-letter (or longer) words likely indicate a more “educated” tweeter. Alternatively, together, WPS and Sixltr provide measures of “readability” or “ease of readability.” These three variables are measured based on the number of occurrences. Table 9 summarizes the structural variables used in this research.

Table 9 LIWC structural variables used

Although LIWC provides several semantic word sets, we focus on the “summary words” that provide measures of the occurrences of some concepts in the text. They can provide control variables over which to normalize the impact of the text content on our dependent variables, in order to test the specific effects of our COVID dictionary.

Dependent Variables: Influence Score, Retweeted Status User Listed Count, and “Is a Retweet”

We chose three dependent variables: the Twitter influence score, the retweeted status user listed count, and whether the particular tweet was retweeted (“Is a Retweet”), using a dataset generated from https://birdiq.net/twitter-search. Twitter influence scores capture substantial information about the use of Twitter [21]. Canals [22] indicates that the influence score is a joint function of the number of followers, the number following, and the number of posts in the Twitter account, and is heavily historical. The “retweeted status user listed count” and whether a tweet is retweeted (“Is a Retweet”) provide measures of users’ current interest in the information.

Data

We collected a random sample of 900 Twitter tweets from https://birdiq.net/twitter-search during July 2020 using seed words of COVID and coronavirus. Of those 900, we eliminated the ones that were not written in English, bringing the final number of tweets used to 770. It was important to eliminate non-English terms because our analysis used an English dictionary and is dependent on being able to count the numbers of English words in each category. We identified approximately 85% English tweets and 15% non-English, largely Spanish and French.

Empirical Analysis: Correlation and Regression Analysis

We used both correlation analysis and regression analysis to analyze our data. Since “Is a Retweet” is a nominal variable, we investigated it using logistic regression. In the regression analysis, we used variance inflation factors (VIFs) to determine the extent of multicollinearity among our independent variables. Our largest VIF score did not exceed 1.3 and, thus, was well below the standard threshold of 4 in the literature [23], suggesting very limited multicollinearity.
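For a predictor matrix, the VIFs can be computed as the diagonal of the inverse of the predictors’ correlation matrix; a minimal sketch:

```python
import numpy as np

def vifs(X):
    """Variance inflation factors for the columns of predictor matrix X:
    the diagonal of the inverse of the predictors' correlation matrix.
    A VIF of 1 means a column is uncorrelated with the others."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))
```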

Empirical Findings

Our analysis took four different approaches. First, we conducted a correlation analysis between each of our continuous variables. Second, we investigated the Twitter influence score in two steps: first, with the structural control variables and the summary text content variables; and second, with the control variables, the summary text variables, and the coronavirus variable. This allowed us to assess the direct impact of the words in our dictionary. Third, we conducted a regression analysis of the continuous variable, retweeted status user listed count, with all our variables. Fourth, we performed a logistic regression on the nominal variable “Is Retweet,” with our control and dictionary variable.

Correlation Analysis

This section uses correlation analysis to investigate the relationships between our variables with particular emphasis on influence score. The correlations are summarized in Table 10 and the p values in Table 11.

Table 10 Correlation analysis
Table 11 p-Values for correlation analysis

As can be seen from Table 11, the coronavirus variable is statistically significantly correlated with the influence score. In addition, two of the structural control variables, WC and WPS, are statistically significantly correlated with the influence score. Finally, two of the summary text variables, clout and tone, are also statistically significantly correlated with influence score. For those statistically significantly correlated variables, the signs on word count and coronavirus were negative, whereas the signs on the other three were positive.

Regression Analysis of Tweeter Influence Score

Table 12 shows that a model including the structural variables and the summary text variables generates an R-square value of 0.075. Table 13 summarizes the model variables. The two structural control variables, word count and words per sentence, and the summary text variables analytic, clout, and tone were statistically significant. We, therefore, conclude that, for this set of tweets, the influence score is statistically significantly related to the analytic, clout, and tone text variables. Each of the variance inflation factors (VIFs) is less than 4, suggesting minimal multicollinearity (Hair et al. [24] and others).

Table 12 Regression model fit for control variables and summary text variables, without coronavirus dictionary
Table 13 Regression model of influence score for control variables without coronavirus dictionary

Finally, in Tables 14 and 15, we add our new coronavirus dictionary variable. The R-square increases to 0.101, a statistically significant increase. In addition, the coefficients on the same control variables (WC and WPS) and on one of the summary text variables (tone) remain statistically significant. The coefficient on our coronavirus dictionary variable is also statistically significant and negative, as in the correlation matrix.

Table 14 Regression model fit for control, summary text, and coronavirus variables
Table 15 Regression model of influence score for control, summary text, and with coronavirus variable
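Testing whether an R-square increase from adding a predictor is statistically significant corresponds to a standard incremental F-test, which can be sketched as:

```python
import numpy as np

def r_squared(X, y):
    """R-squared of an OLS fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def incremental_f(X_base, X_full, y):
    """F statistic for the R-squared increase from adding predictors,
    e.g., adding the coronavirus variable to the base model."""
    n = len(y)
    q = X_full.shape[1] - X_base.shape[1]   # number of predictors added
    k = X_full.shape[1] + 1                 # parameters in the full model
    r2b, r2f = r_squared(X_base, y), r_squared(X_full, y)
    return ((r2f - r2b) / q) / ((1 - r2f) / (n - k))
```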

Analysis of “Retweeted Status User Listed Count”

We use the same model as in the previous analysis of influence score to study one aspect of the impact of retweeting: the retweeted status user listed count (RSULC) of those re-tweeters. The measures of fit for the equation are in Table 16 and the regression model in Table 17.

Table 16 Regression model fit for retweeted status user listed count model
Table 17 Regression model of retweeted status user listed count model for control, summary text, and coronavirus variables

These results suggest that the vocabulary in the coronavirus dictionary is positively related to the RSULC. Further, each of the structural variables had p values that were statistically significant.

Analysis of “Is Retweet”—Logistic Regression

In this section, we discuss the analysis of the dependent variable “Is Retweet,” which takes on two values: true and false. Because the variable is nominal, we use logistic regression to analyze the data. The measures of fit are summarized in Table 18 and the coefficients in Table 19, for the model of “Is Retweet” with the control variables, summary text variables, and coronavirus variable. The p values on the coefficients of the semantic control variables WC and WPS were statistically significant, as they were in the estimation of influence score. However, whereas tone was statistically significant in the estimate of influence score, in the case of “Is Retweet,” the p values on the coefficients for clout and authentic were statistically significant. Finally, the p value on coronavirus was also statistically significant in estimating the variable “Is Retweet,” as it was in the model of influence score. The results in Table 19 indicate that only the coefficients on word count and coronavirus were positive, whereas those on words per sentence, clout, and authentic were negative.

Table 18 Logistic regression measures of fit for “Is Retweet” model
Table 19 Logistic regression coefficients for structural, summary text, and coronavirus variables estimate of “Is Retweet” model
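A minimal logistic-regression sketch for a binary outcome like “Is Retweet” follows, fit by gradient ascent in NumPy. The features and coefficients are illustrative stand-ins; the synthetic signs (positive on WC and coronavirus, negative on WPS, clout, and authentic) merely echo the pattern reported in Table 19.

```python
# Sketch: logistic regression for a binary "Is Retweet" outcome.
# Synthetic data; feature names and coefficients are illustrative only.
import numpy as np

rng = np.random.default_rng(2)
n = 2000
# Columns: WC, WPS, clout, authentic, coronavirus
true_beta = np.array([0.5, -0.4, -0.3, -0.2, 0.6])
X = rng.normal(size=(n, len(true_beta)))
p = 1 / (1 + np.exp(-(X @ true_beta)))
y = rng.binomial(1, p)

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Maximum-likelihood fit by gradient ascent, with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(steps):
        mu = 1 / (1 + np.exp(-(A @ beta)))        # predicted probabilities
        beta += lr * A.T @ (y - mu) / len(y)       # log-likelihood gradient
    return beta

beta_hat = fit_logistic(X, y)
print(np.round(beta_hat[1:], 2))  # estimated slopes (intercept dropped)
```

With enough data, the estimated slopes recover the signs of the generating coefficients, which is the property the paper's interpretation of Table 19 relies on.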

Summary

This section compares the results across the three dependent variables. The p values for word count and words per sentence are statistically significant throughout this section. In the regression model of the influence score, the coefficient on the word count has a negative sign. However, for the logistic regression model of the retweeting dependent variable, the sign is positive. Words per sentence is statistically significant in each of the three models. The coefficient of the text variable tone is positive and has a statistically significant coefficient in the regression model of influence score. However, the coefficients on clout and authentic were negative and statistically significant in the model of retweets. Finally, the p values for our coronavirus dictionary are statistically significant for all three dependent variables. The statistically significant results on our coronavirus dictionary suggest that the coronavirus pandemic has created a vocabulary of its own, including the previously unknown term COVID-19, and that vocabulary influences human communication as captured in Twitter.

Why does our dictionary have a negative sign in the estimation of the influence score and a positive sign in the estimation of the retweet measures? We conclude that this is an indication of change in the information being diffused, and that such information needs a “new dictionary” to identify it. Influential Twitter users have an established set of followers, posts, and topics. As a result, there is likely to be a consistency in their tweets. However, a new and important topic, such as the coronavirus pandemic, can attract great interest and result in retweets. Our results suggest that those creating the coronavirus tweets are a different set of users, diffusing a different set of information than more established Twitter users normally would.

These results should be important for information technology research. The coronavirus dictionary allowed us to track the changes in vocabulary use in Twitter communications. This comparison between the impact on influence score and retweets allows us to monitor these changes. Special emerging technology dictionaries could be used with a range of technologies to capture and measure the information diffusion associated with such technologies.

Manual Analysis of Ad Hoc Tweets

In addition to the computer-based analysis, we manually analyzed the posts at a finer granularity, by adopting the work of Scherer [25], who identifies 36 ontological categories that deal with affect: admiration/awe, amusement, anger, anxiety, being touched, boredom, compassion, contempt, contentment, desperation, disappointment, disgust, dissatisfaction, envy, fear, feeling of affection/love, gratitude, guilt, happiness, hatred, hope, humility, interest/enthusiasm, irritation, jealousy, joy, longing, lust, pleasure/enjoyment, pride, relaxation/serenity, relief, sadness, shame, surprise, and tension/stress. These terms help to identify sentiment in natural language [26, 27].

We read each of the tweets and classified them as factual or emotional, based on Scherer’s [25] categories. Approximately 55% of the posts were factual, simply referring to a fact (without emotion or sentiment) intended to be true at the time of posting. Other tweets were factual with some type of emotion and included direct reporting of patient experiences (5%). The remaining tweets reflected only a few emotions (anger, anxiety, desperation, disgust, hope, and surprise).

An example of a tweet classified as factual (which could be falsified later) was: “The spread of #COVID19 by an asymptomatic or someone who is not showing any symptoms appears to be less likely, said #WHO (@WHO) in the recently published summary of transmission of COVID-19 including symptomatic, pre-symptomatic and asymptomatic patients.” An example of the emotion fear was: “A patient with symptoms of a heart attack refused treatment after reading on Facebook that she would die if she went to hospital during the COVID-19 crisis.” Another tweet (again subject to later falsification), but intended to be factual, was: “@CDCgov issued some very useful current best estimates:—About 1/3 of COVID-19 infections are asymptomatic.—40% of transmission is occurring before people feel sick.—Time from exposure to symptom onset: ~ 6 days on average.”

Data

Using an online tool, https://birdiq.net/Twitter-search, we manually collected tweets based on keywords, such as COVID-19, CDC, and WHO. These tweets (over 2000) were reviewed to identify sentiments and insights that would not be possible to extract using an automated tool, again, to gain insight into user behaviors. We strove to show the value of manual mining, recognizing that this type of analysis is not feasible on a large scale. Table 20 provides examples of tweets and classifies them based on their actual or assumed significance.

Table 20 Data set of ad hoc tweets

From this sample, the most likely keyword, COVID-19, revealed a variety of expected tweets on: the spread of the virus, the seriousness of it based on experiences and testimonials, testing, innovative ways of approaching testing and treatments, and others. The tweets shown from the WHO and CDC relate to advice and awareness.

Searches for Nuggets

The notion of the wisdom of the crowds implies that sometimes the crowd is able to perform better than individuals [28]. We investigated whether the content, as provided by the user community of Twitter (the crowd), could provide insights that might be helpful to the general public, or perhaps even medical professionals. The types of insights we were looking for required a human to identify what might be useful content and extract ideas at the tweet level of analysis [29]. Therefore, we reviewed approximately 2000 tweets posted throughout the pandemic. We attempted to identify nuggets; that is, pieces of information with the potential to have real value or use, beyond just a post. Examples of potentially influential tweets are given below. The first is a best practices suggestion.

Tweet (factual/sharing of best practices): #itvnews Many German patients were given oximeters in the community back in April. Other places have also recommended this. https://www.thailandmedical.news/news/COVID-19-tips-oximeters,-a-potential-home-tool-to-monitor-progress-of-COVID-19-symptoms-from-mild-to-moderate-and-to-detect-COVID-19-pneumonia-early

The following tweet shows passing on blood type information from a legitimate news source. Such information could be useful for someone assessing their own risk.

Tweet (blood types). This study finds COVID patients with type A blood are at much higher risk of developing life-endangering symptoms, patients with type O blood experience a “protective effect” https://www.nytimes.com/2020/06/03/health/coronavirus-blood-type-genetics.html

However, a later study by Harvard showed that people who were symptomatic and had blood types of B+ or AB+ were more likely to have a positive COVID-19 test than people who were symptomatic with blood type O.Footnote 8

The following tweet is factual. Knowing how long the illness might last would be useful to anyone concerned with whether they are experiencing a typical duration.

Swiss TV news (factual): Half of patients (500/1000) contacted by a COVID 19 follow-up service report symptoms after 6 weeks https://www.rts.ch/play/tv/19h30/video/le-virus-recule-et-le-nombre-de-gueris-du-COVID-sont-tres-nombreux--mais-cette-nouvelle-maladie-laisse-parfois-des-traces-?id=11370777

The following tweet reports on a medical study and would be useful for anyone concerned with how seriously the virus might infect them.

Factual: Low levels of the prognostic biomarker suPAR are predictive of mild outcome in patients with symptoms of COVID-19 - a prospective cohort study. Authors: jesper eugen-olsen, Izzet Altintas, Jens Tingleff, Marius St... http://medrxiv.org/cgi/content/short/2020.05.27.20114678

The following two tweets show associations of patient characteristics and occurrence of the disease. These posts are interesting in the sense that the associations being made are non-intuitive. However, they serve as examples of the types of tweets that might trigger self-reporting of whether a person falls into one of these categories, which, in turn, could lead to further investigation to uphold or falsify the conclusions from the reports.

Factual (implications true or falsified later): In one report, dermatologists evaluated 88 COVID-19 patients in an Italian hospital and found 1 in 5 had some sort of skin symptom, mostly red rashes over the trunk. https://inq.news/COVID-toes

Factual (implications true or falsified later): Most #coronavirus patients had no hair https://www.hulldailymail.co.uk/news/uk-world-news/bald-men-could-risk-more-4194866

The tweets below could be important because they provide information on the virus itself, as well as a potential treatment, but are not scientifically proven.

COVID-19 maybe mutating but it’s for the good. Doctors in Italy have claimed that the symptoms of COVID-19 and their intensity is less than what they experienced with the first wave of patients. This suggests that COVID-19 gets weaker as it spreads. https://elemental.medium.com/could-the-coronavirus-be-weakening-as-it-spreads-928f2ad33f89

A new drug, #famotidine, available over-the-counter for relieving #heartburn, has shown promising results in treating the symptoms of #COVID19 https://www.firstpost.com/health/heartburn-drug-famotidine-may-reduce-symptoms-of-non-hospitalised-COVID-19-patients-suggests-case-series-8452421.html

As these tweets illustrate, they provide useful, or potentially useful, information when so much is unknown about this global crisis. Human judgment is needed to assess the validity of the claims in the tweets with scientific study clearly required for some of them. However, the potential value of the information contained within a tweet could not easily be obtained by software.

Twitter Use

People generally turned to Twitter as a platform to make sense of the pandemic. The tweets showed that people also wanted to provide useful information for others, sharing their opinions and knowledge. There were many posts expressing compassion, triggered by personal situations.

Example (desperation/disgust): My father, 62 yr suffering from high fever (103-104) from 9 days with no other apparent symptoms. He tested negative on COVID 19. He has history of CABG in 2006. Our family doctor advised to get him admitted. No hospital is accepting patient with fever. Pls help #caremongers

Example (factual/tension/stress): My friend is a nurse & finally broke her silence. She said she’s seeing COVID-19 patients leaving the hospital after COVID with kidney damage. Others will suffer with COPD like symptoms for the rest of their lives. It’s very scary.

Example (disgust) MILD. There’s a huge amount at stake in term mild – for gov actors, health service planners, clinicians, patients, carers...In the days when my own ‘mild’ #COVID19 symptoms have been manageable (Day 52 now), I reflected on mild COVID-19 for @somatosphere https://t.co/rQ9wFdcSQ7?amp=1

This mining revealed a great many posts with different perspectives. Many posts were intended to provide useful information. However, some of the posts that reported information considered to be factual (e.g., that there was no need to wear masks) had the potential to later be proven false.

Ideally, the mining for nuggets could produce insights for the management of the virus. For example, some cases reported on successful convalescent plasma treatments, leading to requests for plasma donations from recovered patients. Other tweets reported some members of a family getting the virus while others living at the same location did not. Such reports might be of interest to researchers trying to find commonalities in these cases. The Appendix contains tweets mined from an additional data set. They reveal a combination of medical innovations (attempted or actual), health information, sentiment and personal reports, opinions, and creative comments. The tweets reveal a need to contribute to an ongoing crisis by providing medical information, contributing to the global conversation on COVID-19, or seeking help.

Themes—Summary

The use of Twitter as a critical social media tool in times of major communication needs was obvious, with Twitter text providing valuable insights into users’ opinions and attitudes. The same held true as other world events unfolded; for example, the Arab Spring and Japan’s earthquakes [6]. For COVID-19, the sentiment analysis revealed a change over time as the pandemic progressed. The most notable trend was that tweet content progressed from providing, and seeking, factual information to expressing emotions, including anger. Prior research found that Twitter, along with other social media, could be used as a predictor of COVID-19 cases and other threats to community health [30]. It is likely there will be continued use of these platforms. The development of large databases of tweets or other user-generated content should, thus, continue to provide substantial research opportunities to investigate COVID-19 or other issues related to global challenges of such magnitude [7].

Content themes emerged. The tweets emphasized information on testing, treating, reporting of well-known figures who tested positive, warnings about the severity of the disease, and other health-related information. Additional themes related to politics, reported scientific breakthroughs (some of which were later shown to be false), economics, reopening of schools and businesses, and others.

We attempted to understand the content of the tweets using sentiment analysis. Many tweets were factual; others showed predictable sentiments of anger, desperation, and hope. Of interest was how Twitter might be used to identify information nuggets, in the traditional sense of a valuable idea. This involved manual inspection and mining. One nugget was a relatively early suggestion that a hospital in India collect the blood of patients who had recovered from COVID-19. Later, the identification of blood type was scientifically investigated as an indicator of the risk of experiencing the disease. However, there does not appear to be a way for a computer program to connect these two, demonstrating the limitations of tools to extract inherent information in text data [12].

In the same way, there is much intentional or non-intentional sharing of misinformation, often referred to as fake news [31, 32]. A computer program that can deal with sentiment well might be able to identify tweets with specific content and others with opposite or contradictory content. We did not, for example, investigate tweets that suggested the COVID-19 pandemic should not be taken seriously. Instead, we considered reasons why people elected to share content. Representative examples are provided in Table 21.

Table 21 Sharing of Twitter content

Many other investigations are possible. For example, could the impact of international protests be factored into the sentiment analysis? Is it possible to identify a “tipping” point where people realized the importance of being vigilant (wearing masks, etc.) based on posts reported by infected people relaying the seriousness of the disease to others?

Twitter has been used for social debates and expressions of public opinion (e.g., [33]). No doubt, it will continue to be used in this manner for topics of large, public impact. However, with millions of tweets being generated each day, our study has involved a limited number of texts, restricted to those written in English. It would be useful to expand the categories of sentiment we use, as well as to determine whether there were any age group or gender differences in the negative tweets. Finally, finding the true nuggets will, no doubt, require a large-scale, semi-automated approach, but doing so might help to identify insights that could lead to the development of better sentiment analysis tools.

What we learn from Twitter as a platform is its potential to reach a large audience and provide much information, informative or otherwise. Of course, many claims are made, but it is not possible to verify them without scientific experimentation and reporting of actual numbers. For example, at one point, based on data from Italy and the UK, a website reported men as having approximately twice the number of deaths as women.Footnote 9 Finally, not all insights can be obtained using existing sentiment analysis tools, and only a limited amount of insight can be obtained from manual mining.

COVID-19 Health Management Life Cycle, Emerging Pandemic Issues, and Computational Extensions

The authors would like to thank the guest editor for suggesting some of the content in this section.

This paper has used data collected early in the COVID-19 life cycle. At the time of data collection, it was unimaginable that multiple COVID variants would emerge. Nor was it foreseeable that, after 2 years, the end of the pandemic would not be in sight. However, these realizations suggest that the COVID-19 pandemic has a sustained life cycle with many events that also could be investigated. Because that life cycle has many implications, we examine the basic notion of a COVID-19 life cycle and some of its implications. In addition, we examine some computing extensions for using bags of words to address issues beyond the psychological concerns addressed in this paper. We examine the potential use of Word2Vec and other approaches in future models and generate a list of business-based COVID-19 words. We also investigate the potential opportunities for the application of symbolic and sub-symbolic AI for sentiment analysis of COVID-19 as well as other emerging trends.

COVID-19 Health Management Life Cycle and Related Problems

As the COVID-19 pandemic continues, with new variants such as “omicron” emerging and no solid end in sight, it is clear there is a life cycle to the pandemic (e.g., [34]) that affects healthcare planning, management, and resource allocation. The emergent COVID-19 health management life cycle has much in common with technology life cycles, such as the maturity curve, the hype curve, the adoption curve, and others (e.g., [35]). Unfortunately, what is not clear is the specific set of concerns, markers, or events within that pandemic life cycle. Some of the emerging activities within the life cycle appear to include issues such as managing new outbreaks of COVID, integrating health management efforts across multiple countries, and freeing up and allocating resources, along with other concerns whose difficulties and solutions likely have not been established completely because the disease, literally, has been emerging, diffusing, and evolving. A health management life cycle model could be useful for identifying problems associated with each stage as the disease works through its life cycle. The beginning of one such view of a life cycle is provided in Fig. 5, which includes potential life cycle stages on the horizontal axis and potential problems associated with those stages on the vertical axis. As a companion to this approach, it is easy to imagine a COVID-19 version of the hype cycle that traces the COVID-19 technologies (vaccines, antivirals, infusions, etc.) over time and over stages such as the “Peak of Inflated Expectations,” the “Trough of Disillusionment,” and the “Slope of Enlightenment” [36].

Fig. 5

Potential Health Management model of a COVID-19 life cycle and related problems

This life cycle model could be helpful in text analysis by providing insights in a number of directions. For example, this research is concerned with psychological issues of communication in social media, suggesting the importance of a text mining approach centered in psychology and helping us choose the tool, LIWC. Across the life cycle, there are likely different psychological problems that might be identified from analysis of text communications. However, analysts may be concerned with other stages and other problems in the life cycle, requiring a different context than a psychological one. In those settings, analysts may need to generate different bags of words to gather meaning from different contexts about different problems, such as outbreak management or integrating efforts across countries.

Generating Bags of Words in Alternative Contexts for Alternative Problems

A number of approaches could be used to generate bags of words for different contexts in the COVID life cycle. Word2Vec [37] provides two approaches that allow the generation of words that are similar to a seed word: the “Continuous Bag of Words” (CBOW) model and the “Continuous Skip Gram” (Skip Gram) model. Word2Vec identifies words that are similar to a seed word or words within the text from which they are gathered. For example, as Mikolov et al. [37, p. 5] noted in their analysis of text, the approach can capture relationships between similar words, such as how “… France is to Paris as Germany is to Berlin ….” Using both CBOW and Skip Gram, we generated a set of 39 words drawn from a business corpus, presented in Table 22.

Table 22 List of words using Word2Vec
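The analogy property cited from Mikolov et al. can be illustrated without training a model: trained Word2Vec embeddings support vector arithmetic plus cosine similarity, which the hand-made four-dimensional vectors below merely imitate. The vectors and vocabulary are toy stand-ins, not real embeddings.

```python
# Toy illustration of the Word2Vec analogy "France is to Paris as
# Germany is to ?": vector arithmetic plus cosine similarity.
# The embedding vectors below are hand-made stand-ins.
import numpy as np

emb = {
    "france":  np.array([1.0, 0.1, 0.9, 0.0]),
    "paris":   np.array([1.0, 0.9, 0.9, 0.0]),
    "germany": np.array([0.0, 0.1, 0.9, 1.0]),
    "berlin":  np.array([0.0, 0.9, 0.9, 1.0]),
    "economy": np.array([0.2, 0.2, 0.1, 0.2]),
}

def most_similar(target, exclude):
    """Return the vocabulary word whose vector is closest to `target`."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in emb if w not in exclude),
               key=lambda w: cos(emb[w], target))

# paris - france + germany should land near berlin
query = emb["paris"] - emb["france"] + emb["germany"]
answer = most_similar(query, exclude={"paris", "france", "germany"})
print(answer)  # berlin
```

In practice, a library such as gensim exposes the same operation over trained embeddings; the sketch only shows the underlying geometry.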

An analysis of those words reveals the business concerns, e.g., “downturn,” “slowdown,” “economy,” and “recession,” captured in the corpus. In addition, the list includes other related risks to business, such as “BREXIT” and “fears.” Further, some of these words, although related to COVID-19, are not uniquely associated with the pandemic, such as “economy” and “abates.”

Poria et al. [38] and Araque et al. [39] suggest that text analysis should employ ensemble approaches. Interestingly, there are two different approaches within Word2Vec, so its use inherently provides the perspective of an ensemble of methods. In addition, other approaches, such as GloVe [19], can be included with the two approaches within Word2Vec to broaden the ensemble of methods. Each of these algorithms could be used to generate sets of words from different seed words. However, using multiple methods generates redundancies and words that, in general, may not be directly related to the seed word(s) in the sense with which the analysis is concerned. As a result, it is important to include a “human-in-the-loop” in the ensemble approach when generating a bag of words about a concept.
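One simple way to realize this ensemble-with-human-review idea is to merge the candidate lists from each method and rank words by how many methods proposed them, then hand the ranked list to a human reviewer. The word lists below are illustrative stand-ins; real lists would come from each model's nearest-neighbour query on the seed word.

```python
# Sketch: merging candidate word lists from several embedding methods
# (e.g., CBOW, Skip Gram, GloVe) before a human review pass.
# The lists are illustrative only.
from collections import Counter

cbow     = ["pandemic", "lockdown", "economy", "recession", "brexit"]
skipgram = ["pandemic", "quarantine", "economy", "slowdown"]
glove    = ["lockdown", "downturn", "economy", "fears"]

# Words proposed by more methods are stronger candidates for the dictionary;
# the ranked list is then pruned by a human-in-the-loop.
counts = Counter(cbow + skipgram + glove)
candidates = [w for w, _ in counts.most_common()]
print(counts.most_common(3))
```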

Finally, Cambria et al. [40] investigated the application of symbolic and sub-symbolic AI for sentiment analysis. Their approach to capturing meaning integrates top-down and bottom-up computing, employing both sub-symbolic computational approaches and symbolic logic and semantic network approaches. In so doing, they built a new version of SenticNet. Future research can focus on integrating this approach to build better word sets that match the domain-specific needs of particular locations in a COVID-19 life cycle, generating the word sets for the problems as needed.

Ambivalence

Recently, researchers have begun to explore additional approaches to measuring neutral sentiment. Although there are approaches to capturing neutral sentiment, as discussed in the “Impact of COVID-19 Vocabulary Use of Twitter Reputation and Retweets” section and in the Python natural language toolkit, Wang et al. [43] recently developed a more fine-grained approach to measuring ambivalence. This is important because, although much COVID-related activity is emotionally charged, resulting in demonstrations worldwide, some issues apparently garner ambivalence.Footnote 11 For example, Peng and Chen [42] investigate emotional ambivalence and luxury good consumption during the COVID pandemic. However, as noted by Craig et al. [41], capturing ambivalence can depend on the specific issues being considered and the way in which questions are worded, further emphasizing the importance of specific words.
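The distinction between neutral text (no affect) and ambivalent text (mixed affect) can be sketched with a toy lexicon: a tweet matching both positive and negative words is ambivalent, one matching neither is neutral. The word lists are illustrative only; calibrated tools such as NLTK's VADER provide the production-grade version of the neutral score.

```python
# Sketch: separating neutral (no affect) from ambivalent (mixed affect)
# text with a toy lexicon. Word lists are illustrative stand-ins.
POS = {"hope", "relief", "grateful", "recovered"}
NEG = {"fear", "anger", "scary", "died"}

def affect_label(text):
    """Label text as positive, negative, neutral, or ambivalent."""
    words = set(text.lower().split())
    pos, neg = len(words & POS), len(words & NEG)
    if pos and neg:
        return "ambivalent"   # mixed affect, not merely absent affect
    if pos:
        return "positive"
    if neg:
        return "negative"
    return "neutral"

print(affect_label("cases reported today"))                  # neutral
print(affect_label("scary week but grateful she recovered")) # ambivalent
```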

Summary, Contributions, and Conclusion

As of 6th May 2020, there were almost 4 million known, confirmed cases of COVID-19 worldwide. By mid-July, that number had more than tripled. By September, it had reached 30 million cases, with close to one million deaths.Footnote 12 By December 2021, over 250 million cases and 5 million deaths had been documented. By May 2022, those numbers had grown to over 530 million cases and over 6 million deaths. Many global efforts are still being taken to combat the virus. As ordinary people seek to understand the virus and learn how to protect themselves, they frequently turn to online platforms, such as Twitter, which is often regarded as a good resource from which to analyze opinions from user-generated content [44].

Contributions—Human Communication

This paper makes several contributions to knowledge about human communication using social media, couched in the use of Twitter within the COVID-19 pandemic, which lead to interesting questions for future research. First, we found that the text sentiment and emotions of Twitter messages with and without video and pictures are statistically significantly different. Although it is not clear why, the findings suggest important differences. Is it a general characteristic of human communication that using pictures results in different text sentiment than if pictures are not used?

Second, the sentiment and emotions of the retweets and the original tweets in the pandemic were statistically significantly different. Future research should investigate the extent to which this finding can be generalized. Is there something in the sentiment and emotions of a tweet that makes it likely to be retweeted? Is it a general characteristic of human communication to use this particular type or amount of sentiment?

Third, Google searches and Wikipedia page views are predictive of the percentage of positive and negative sentiment tweets, suggesting that humans perform internet searches and then communicate the results via social media. Future research should investigate the extent to which this phenomenon occurs in non-pandemic settings and can be considered a general model of human behavior.

Fourth, a variable representing a dictionary of words capturing COVID concepts was statistically significant in models of Twitter influence and retweets. It appears that communication of new topics pursued by new users results in retweets, in contrast to tweets from those with large influence scores derived from established pools of followers and topics. This finding can be used to support future research on new user groups or on technology use during a major event (e.g., [45]).

Relationship to Previous Research with LIWC

We have not discovered similar research against which to benchmark the findings in our analyses. However, very recently, other researchers have used LIWC for various types of related research into COVID-19-based concerns. Silva et al. [46], for example, used LIWC and Twitter tweets to investigate issues associated with misinformation. Barnes [47] used LIWC in an analysis of “terror management theory.” Safa et al. [48], similarly, analyzed the detection of symptoms of depression in Twitter tweets. Ebeling et al. [49] used LIWC to investigate the impact of political polarization during the COVID-19 pandemic. Mosleh et al. [50] used LIWC to analyze correlations with behavior. These efforts support our use of LIWC, although they are limited in their application to the issues examined in this paper.

Conclusion

The COVID-19 pandemic continues to be a topic of much global interest for both health and economic reasons, as new variants evolve. This research has analyzed text data from Twitter to gain an understanding of human communication based on user-supplied content during the pandemic. Twitter tweets were analyzed manually and using a text analysis tool. The results show changes in user participation over time from information seeking to expressions of anger or other emotions. Users retweeted different content with clout and were most concerned with health. Tweets that include pictures and movies have different text than those that do not. The percentage of positive versus negative sentiments found in COVID-19 tweets could be predicted by Google searches and Wikipedia page views. This research can also be considered as an analysis of human communication where new concepts are discussed using text and images, which provide a firm foundation from which to analyze the implications of events or situations that have wide-spread consequences, such as a pandemic or a natural disaster.