A semantic network approach to measuring sentiment

Abstract

Sentiment research is dominated by studies that assign texts to positive and negative categories. This classification is often based on a bag-of-words approach that counts the frequencies of sentiment terms from a predefined vocabulary, ignoring the contexts for these words. We test an aspect-based network analysis model that computes sentiment about an entity from the shortest paths between the sentiment words and the target word across a corpus. Two ground-truth datasets in which human annotators judged whether tweets were positive or negative enabled testing the internal and external validity of the automated network-based method, evaluating the extent to which this approach’s scoring corresponds to the annotations. We found that tweets annotated as negative had an automated negativity score that was nearly twice as strong as their positivity score, while positively annotated tweets were six times stronger in positivity than negativity. To assess the predictive validity of the approach, we analyzed sentiment associated with coronavirus coverage in television news from January 1 to March 25, 2020. Support was found for the four hypotheses tested, demonstrating the utility of the approach. H1: broadcast news expresses less sentiment about coronavirus, panic, and social distancing than non-broadcast news outlets. H2: there is a negative bias in the news across channels. H3: sentiment increases are associated with an increased volume of news stories. H4: sentiment is associated with uncertainty in news coverage of coronavirus over time. We also found that as the type of channel moved from broadcast network news to 24-h business, general, and foreign news, sentiment increased for coronavirus, panic, and social distancing.

Introduction

Problem

With the growth of social media over the past 15 years, society has witnessed a plethora of online opinion expression and even polarization (Del Vicario et al. 2016; Mohammad et al. 2015). Accordingly, a stream of research has emerged regarding sentiment expressed in posts (Nanli et al. 2012; Cambria et al. 2017; Mäntylä et al. 2018). Although there are numerous commercial applications of sentiment analysis (Rambocas and Pacheco 2018), academic researchers have primarily studied sentiment in political contexts. Sentiment analysis of media tone, agenda-setting, election forecasting, and candidate evaluations has developed (Rudkowsky et al. 2018) in different political contexts (Kim and Krishna 2018; Doroshenko et al. 2019; Fogel-Dror et al. 2019).

The dominant methods for sentiment analysis (Kharde and Sonawane 2016) seek to classify messages as positive or negative for use in machine or deep learning using neural network models (Zhang et al. 2018). Less common are methods that measure the degree of positivity or negativity in texts. Classification of textual content into categories of positive or negative (Liu and Zhang 2012; Mäntylä et al. 2018) counts frequencies of sentiment words in a lexicon, a predefined list or dictionary of positive and negative words. Counting individual word frequencies is referred to as a “bag-of-words” model because the approach treats all of the words in textual units of observation in a disaggregated way, jumbled together with no relations among them. The proximity of words in the text is ignored. The bag-of-words sentiment scores are simple nominal counts of positive and negative words out of context.

In communication science, rather than classification, content analysis of messages to measure the degree of positive and negative sentiment associated with a target is often the goal. This content analysis requires a different measurement model than bag-of-words, one based on a network approach. Although most social network analyses are of relationships among entities such as individuals, groups, organizations, or nations (Rogers 1987; Monge et al. 2003; Borgatti et al. 2009), a network model has also been useful in treating words in the text as nodes and their proximate co-occurrences as links, forming a semantic network (Danowski 1982, 1993, 2009; Carley 1993; Corman et al. 2002).

Some recent examples of semantic network analysis include work by Danowski and Park (2014), Jiang et al. (2016), Calabrese et al. (2019a, b), and Danowski and Riopelle (2019). Semantic network analysis covers a wide range of aspects of meaning (Osgood et al. 1957). An important advantage of semantic network analysis is that it illustrates the relationships among words in the text, thus generating insights about the structures and meanings of the entire text. Here we are concerned only with sentiment, which is just one dimension among many that semantic network analysis can index in the study of texts. Nevertheless, we present a sentiment analysis approach building on word relationships and embeddedness in texts. The method can potentially be applied to other dimensions of texts, as long as researchers are interested in looking for the strength of relationships between a target word and a particular category of words.

Research goal

Our main goal is to propose a semantic network-based measure of sentiment with respect to a target (the name of a person, organization, group, brand, etc.) by identifying the shortest paths connecting the target with sentiment words. To evaluate the semantic network method, we compare the measure to ground-truth data, sentiment judgements made by human annotators. We examine whether the network-based sentiment scores for texts they classified as positive or negative have the expected higher sentiment valence with respect to a target. For example, if we take the texts classified as negative and compute the strength of association of a target with negative sentiment words, we would expect that our network method produces sentiment ratios with higher negativity than positivity. Likewise, texts classified as positive should show higher positivity than negativity. This would be evidence for its internal validity. External validity is assessed based on testing hypotheses about sentiment in television news coverage of the coronavirus pandemic using the sentiment metrics.

The key to moving beyond atomistic bag-of-words counts to a semantic network model is to consider word proximities: how words are paired with one another as a window slides through the text, centering on one word after another and tabulating all of the word pairs appearing within the window (Danowski 1993). The window stops at the end of a sentence and restarts at the beginning of the next sentence. The stochastic slide produces a continuous stream of bigrams representing words in context. The result is an adjacency list of edges, where each row is a pair of words followed by their co-occurrence frequency, which is the basis for semantic network analysis. Once we have the network, we trace the shortest paths from sentiment words to a target word and from a target to the sentiment words to produce ratios of positivity to negativity.
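The window procedure can be illustrated with a minimal Python sketch. It is not the WORDij implementation: the function name `word_pairs`, the forward-looking window, the default window size of 3, and the simple punctuation-based sentence splitting are all assumptions made for illustration.

```python
import re
from collections import Counter

def word_pairs(text, window=3):
    """Count directed word pairs (first word precedes second) within
    `window` words of each other, never crossing a sentence boundary."""
    counts = Counter()
    # Assumption: sentences end at ., ! or ? (a real tokenizer would be more careful).
    for sentence in re.split(r"[.!?]+", text.lower()):
        tokens = re.findall(r"[a-z0-9']+", sentence)
        for i, first in enumerate(tokens):
            # Pair the current word with the next `window` words in the sentence.
            for second in tokens[i + 1 : i + 1 + window]:
                counts[(first, second)] += 1
    return counts

# Hypothetical two-sentence text; each key is an edge, each value its frequency.
pairs = word_pairs("The flight was late. The crew was rude.", window=2)
```

The resulting `Counter` plays the role of the adjacency list described above: each key is a directed word pair and each value is its co-occurrence frequency.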

With the identification of word pairs in a text and their co-occurrences, network analysis of these bigrams enables measurement of the distances between words based on the shortest paths linking them. As shown in Fig. 1, to decide sentiment for the target word, we first look for sentiment words in the network and the closest path between the target and the sentiment words. In this case, the target links to three sentiment words. The path may originate from the target to the sentiment word (e.g., Target to Sentiment Word 1), from the word to the target (e.g., Sentiment Word 2 to Target), or in both directions (between Target and Sentiment Word 3). The shortest path between Target and Sentiment 1 passes through Word 2. The shortest path between Target and Sentiment Word 1 is a direct link between the two and so on. To illustrate this with actual data, in the IMDB review database,Footnote 1 the shortest path linking “film” with “suspense” is: film → good → action → suspense. Another example is from media coverage after the April 2010 Deepwater Horizon Gulf Oil Spill. Using “spill” as a seed and “responsible” as a target resulted in this strongest path: spill → gulf → mexico → oil → giant → bp → responsible. This shows how shortest paths can reveal the strings that tie concepts together, based on the contexts of words.

Fig. 1 A graphic illustration of the network-based approach

Given a lexicon of positive and negative words, the network-based sentiment analysis then measures the closeness of the sentiment words to a target word of interest. In this way, we have a more precise micro-level analysis of sentiment concerning a target, compared to simple non-specific sentiment scores based on frequency counts of positive and negative words appearing anywhere in the textual unit of observation. This kind of sentiment analysis is an example of “aspect-based” sentiment analysis (Pontiki et al. 2016).

Without the proximity information about word co-occurrences, sentiment can only be measured at the level of the whole textual unit, such as a speech, report, news article, tweet, etc., and not for a particular target word of interest. The target-specific scoring of sentiment is particularly suited to communication perspectives. Strategic communication typically seeks to strengthen or weaken associations of attributes linked with target concepts or issues, or to introduce new ones.

Shortest paths

The network approach to target-specific sentiment scoring enables the application of a fundamental network concept, that of the shortest paths between nodes in the network (Dijkstra 1959). We can compute for each sentiment word its distance from the target in terms of the geodesic, the number of links in the shortest path. The effect of distance on the association between words is modeled on the inverse-square law of physics: transmission of energy in a medium is inversely proportional to the square of the distance from the source.

In network models, distance is a function of the number of links in a path. Shorter paths mean the target is closer to the sentiment words in the network. Closeness is not simply a matter of the raw number of steps in the path. There is a decay function, which is the square of the number of these edges. Although a direct edge between two words is the strongest, each time an intermediate edge lengthens the path, there is a drop-off of network effects from the start node. This is the denominator in our sentiment strength formula. The numerator is the sum of the edge weights along the shortest path. Once we have the numerator and denominator, we can compute the sentiment strength of the path, dividing the sum of the strengths across the path by the square of the number of edges. This computation enables our calculation of a set of sentiment metrics, positivity, negativity, and the sentiment ratio.
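The verbal definition above can be written compactly. The symbols are our own shorthand and do not appear elsewhere in the text: p is a shortest path, k is its number of edges, and w_i is the co-occurrence frequency of its i-th edge.

```latex
S(p) = \frac{\sum_{i=1}^{k} w_i}{k^{2}}
% Worked example with hypothetical edge weights (not taken from the data):
% film -> good -> action -> suspense, with w = (12, 5, 3) and k = 3:
%   S(p) = (12 + 5 + 3) / 3^2 = 20 / 9 \approx 2.22
% A direct edge (k = 1) with weight 4 gives S(p) = 4 / 1^2 = 4.
```

Under these assumed weights, a direct link with modest frequency outweighs a three-step path with a larger total frequency, which is the drop-off the decay function is meant to capture.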

Literature review

Sentiment analysis gauges the attitudes, opinions, and emotions of people based on textual data such as online reviews and blog posts (Liu 2012). In this section, we discuss the major methods used to conduct sentiment analysis and review their advantages and disadvantages.

Bag-of-words approaches to sentiment analysis

Lexicon-based measures

The simplest and most common approach to sentiment analysis uses a predefined lexicon or dictionary containing sentiment words, their affective orientations, and sometimes the strength of their orientations. Following the bag-of-words approach, lexicon-based approaches first break down a body of text into independent words. They then count the frequency of sentiment words (which are defined by the lexicon used) that appear in the text and compute a sentiment score for the text based on the word count.
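A minimal sketch of such a lexicon count follows. The six-word lexicon, the function name `bag_of_words_score`, and the net score (positive count minus negative count) are illustrative assumptions, not any published lexicon or scoring rule.

```python
import re

# Tiny hypothetical lexicon; real lexicons contain thousands of entries.
POSITIVE = {"good", "great", "happy"}
NEGATIVE = {"bad", "late", "rude"}

def bag_of_words_score(text):
    """Return (positive count, negative count, net score) for one text unit."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    return pos, neg, pos - neg

score = bag_of_words_score("Great crew, but the flight was late and the food was bad.")
negated = bag_of_words_score("not good")
```

The sketch returns one score per text unit, whatever the number of entities mentioned, and it counts “not good” as positive because word order and context play no role.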

Commonly used sentiment lexicons include Linguistic Inquiry and Word Count (LIWC) (Tausczik and Pennebaker 2010), SentiWordNet (Baccianella et al. 2010), and the Bing lexicon. Whereas some lexicons, such as Bing, contain words in binary categories, others, like SentiWordNet, provide a ratio indicating the strength of the words’ orientation. Lexicon-based sentiment analysis is an unsupervised method that is easy to apply and not domain-dependent. It can be highly accurate if used appropriately (Kundi et al. 2014; Asghar et al. 2014; Khoo and Johnkhan 2018). However, a limitation of lexicon-based measures based on the bag-of-words approach is that they focus only on the frequency of single, tokenized words. They omit the contexts of the words based on their co-occurrences in the texts that are critical to sense-making. In other words, the analysis yields just one score for an entire text regardless of the number of persons, organizations, or brands mentioned.

Machine learning classification

The machine learning approach utilizes supervised learning, which starts by extracting features from a body of text (Liu 2012). Machine learning algorithms applying the bag-of-words approach treat single words as semantic features. Based on the features and outcomes (e.g., annotated sentiment of texts) learned from the training text data, the algorithm classifies texts into different sentiment categories such as positive, negative, and neutral. The key to the performance of machine learning lies in the effectiveness of the features it extracts. The machine learning approach has the edge over the lexicon-based measures as it acquires information directly from the text body rather than a standard lexicon (D’Andrea et al. 2015). Therefore, it is better customized to the text data.

Several software vendors, including IBM (Watson), Google (Cloud Natural Language), Amazon (Comprehend), and Microsoft (Azure) have developed their own proprietary machine learning algorithms for sentiment analysis. These algorithms are relatively easy to use but are not transparent (since they are proprietary) and can be expensive for researchers. This is because a machine learning-based sentiment analysis can be costly to develop. It requires a considerable amount of text data to train an accurate classification algorithm and may need human coders to annotate the training texts. It may also work better for long documents rather than short reviews or tweets so that there are more words to serve as textual features for classification (Khoo and Johnkhan 2018). Machine learning classification using a bag-of-words approach also shares the same limitation with the lexicon-based approach as it only uses independent words and ignores the contexts in which the words are embedded.

Word embeddings for sentiment analysis

A relatively recent development in natural language processing is word embeddings (Mikolov et al. 2013). Word embeddings are techniques that map words in a text into numeric vectors in a vector space. Instead of treating words as independent, as the bag-of-words approach does, word embeddings often operate based on a sliding window and extract features from a sequence of words co-occurring in a body of text. The approach thus is in alignment with the semantic network perspective and takes into consideration word contexts. Based on how words appear with one another, word embedding algorithms represent the words in the vector space, in which words used in similar ways are closer to one another.

Word embeddings may be applied in two ways in sentiment analysis. The first is by extracting words and their relations in the texts as features for sentiment classification (Kumar and Zymbler 2019). Researchers have also applied pre-trained word embedding corpora to classify texts. So when target texts contain words that did not appear in the training dataset, the algorithm can make judgments about text sentiment based on how close the new words are to the words that appeared before in the relational corpora (Rudkowsky et al. 2018). Just like other machine learning models, training with word embeddings requires the dataset to be large to produce an accurate mapping of words in a text. If pre-trained word embedding corpora are used, then the algorithm does not directly learn from the body of text being analyzed and thus may not precisely capture the local context of the texts under scrutiny.

Aspect-based sentiment analysis

Based on the unit of analysis, sentiment analysis can also be classified as either subjectivity/objectivity identification or feature/aspect-based. Subjectivity/objectivity identification, as used by studies cited above (e.g., Kumar and Zymbler 2019; Rudkowsky et al. 2018), classifies the sentiment of an entire text. By contrast, aspect-based sentiment analysis takes a more fine-grained approach, aiming to determine sentiment in parts of texts (e.g., opinions regarding different attributes of a product or service) (Pontiki et al. 2016; Thet et al. 2010; Wang and Liu 2015). For example, when analyzing an online review of a hotel, subjectivity/objectivity identification estimates the general sentiment of the review, whereas aspect-based sentiment analysis may examine how positive the review is toward the location, service, room, and food of the hotel. Therefore, the first step of aspect-based sentiment analysis involves parsing texts into different linguistic components through automated algorithms such as topic modeling (Thet et al. 2010). After the texts are broken down, researchers can then choose to apply the sentiment analysis methods discussed above to measure the aspect-specific sentiments. Aspect-based sentiment analysis thus provides more detailed and accurate information regarding the sentiment in texts, which can be particularly useful when one needs to understand opinions about specific features.

Summary

Existing sentiment analysis methods commonly apply the bag-of-words approach, breaking texts down to independent words without considering word contexts. The more recently proposed word embeddings approach is gaining traction, but machine learning using the approach requires a large amount of data. Using pre-trained word embeddings makes judgments based on previously collected data rather than the body of texts being analyzed, and therefore risks missing critical information in the local word context. The semantic network-based approach to sentiment analysis proposed in the current study complements the above approaches. It overcomes the limitations of the bag-of-words approach by gauging the contexts of words in texts based on word sequence and co-occurrence. It has an advantage over machine learning approaches as it does not need a large amount of data and measures sentiments based on the local information in a given text.

Moreover, it allows fine-grained sentiment analysis at the aspect or feature level like aspect-based sentiment analysis does. Instead of relying on unsupervised learning algorithms such as topic modeling to identify features in a text, our approach enables researchers to name the target word of interest (the name of a person, organization, event, or brand, e.g., iPhone) and generates a score indicating sentiment toward this specific target. Thus, the sentiment network method can generate sentiment scores for multiple targets of interest in the same text, which enables a comparison of the results among them.

Sentiment network approach

The sentiment network approach measures target-specific sentiment based on shortest paths between the target and sentiment words in the semantic network. The approach has five major advantages over bag-of-words classification approaches. (1) A key advantage is that the network method measures sentiment concerning targets, which is possible because the basic unit of analysis is the word pair in a sentence, not a document. The more micro-level word pairs are links in a chain, forming shortest paths that extend across text units, enabling tracing the closeness of sentiment words to a target word. (2) A further advantage is that the sentiment network approach can compare multiple targets in the same corpus, which expands the scope of hypotheses that can be tested.

(3) The network method includes a way to deal with sentiment ambiguity such as negation. Some text units have a mixture of positive and negative sentiment (e.g., “He was happy that the evil one died.”) and negations (e.g., “not good”). The bag-of-words approach is not good at dealing with negations since it treats words as independent. Human annotations are perhaps a better solution for judging mixed sentiment texts and negations, although measurement error can be high, particularly when annotators are forced to classify texts into either positive or negative categories. In contrast, the network approach enables better management of this error. By shifting the handling of such ambiguity from the individual text unit level to the aggregated paths of words across the corpus of texts, mixed sentiment strings are tractable. We look at the empirical nature of mixed sentiment in the results section.

(4) A further advantage of our approach is that we avoid problems with some common practices of natural language processing, such as stemming (Porter 1980) or lemmatization (Plisson et al. 2004). Stemming began as an attempt to improve information retrieval (Lovins 1968; Porter 1980). When the goal was finding relevant articles, morphological word endings were considered noise in identifying the important concepts. Reducing morphological variations of unique word strings down to root words by removing suffixes increases coverage but lowers measurement precision.

In non-retrieval applications, stemming may be useful when the quantities of text are small, so a wider net is needed to increase counts to avoid those less than five, or when the goal is classification into linguistic categories. Nevertheless, stemming obscures important aspects of meaning carried by languages such as tense, singularity/plurality, and the nature of relationships among words. The reduction of linguistic variance obscures finer-grained semantic relationships.

(5) An additional advantage of sentiment network analysis over bag-of-words approaches is that it better informs message design for communication campaigns. One could move to a level below the summary scores and identify which particular sentiment words have the highest and lowest strengths concerning a target. These could inform campaign message design, enabling messages to have the same expressions as those found in the texts. The retention of the raw morphological forms of words enables a greater correspondence of the natural language in the texts and in the messages created. This framing of messages in the language of targeted groups is likely to increase message effectiveness (Scheufele and Tewksbury 2007). The analysis of directed links as message pairs retains embedded syntactical information such that one could select a target as the start word, and a sentiment word as the end word, extracting the strings of words along the path to form optimal campaign messages or for summarizing features of the semantic network (Danowski 1982, 1993).

The results of an experiment (Danowski 1993) led to the conclusion that if the goal is to reinforce the dominant associations in the text, one would select the most central and strongly linked sentiment words, while if the goal were to attract attention to an innovation, one would select the central words with less frequent co-occurrences, whose novelty arouses attention and engagement (Danowski 1982, 1993). This process could be repeated for different sentiment words to produce messages that have multiple statements. With the core points sketched out in this manner, one could edit these optimal message strings to be grammatically correct and fill in function words dropped in the network analysis of the text. WORDij’s Opticom module produces the shortest paths between a seed and target word.

Shortest path algorithms

The network approach to sentiment finds the shortest paths between target and sentiment seeds. There are a number of shortest path algorithms (Golden 1976). For any two words in the text, one can trace a sequence of edges that connect them. Two of the most widely used shortest path algorithms are Dijkstra (Dijkstra 1959) and Bellman–Ford (Bellman 1958; Ford 1956). When dealing with millions of nodes and long computing times, parallelization is desirable. Bellman–Ford is suited to parallelization (Hajela and Pandey 2014) because its search for shortest paths can be independently performed for all the nodes. As well, the search for shortest paths beginning with a fixed node, called a single source shortest path, can also be made in parallel. In contrast, the Dijkstra algorithm needs to compare all the nodes to find out the minimum distance values. This topological structure cannot be parallelized.

To efficiently deal with the high volumes of words common in social media data, our method uses the Bellman–Ford algorithm in a four-step process. First, we use Bellman–Ford to find the shortest unweighted paths in the network (which means we trace the links from one word to another without taking into account their word pair frequencies). Second, we invert the path lengths and square them so that the shortest paths have the highest numerical values. Third, we sum the co-occurrence counts along each path. Fourth, we multiply the path statistic by the sum of the weights to measure the magnitude of association, the strength of the path.
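The steps can be sketched in Python under several stated assumptions. Breadth-first search is swapped in for Bellman–Ford, since on unweighted edges both return paths with the same minimum number of links (although ties between equally short paths may be broken differently). The edge list, the function names, the use of only the target-to-sentiment direction, and the summing of path strengths into a single score are hypothetical choices for illustration and not the WORDij implementation.

```python
from collections import deque

def shortest_path(edges, source, target):
    """Step 1: fewest-edge directed path from source to target, or None."""
    neighbors = {}
    for first, second in edges:
        neighbors.setdefault(first, []).append(second)
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None

def path_strength(edges, path):
    """Steps 2-4: (1 / edges^2) times the summed co-occurrence counts."""
    hops = len(path) - 1
    inverse_square = 1.0 / hops ** 2                            # step 2
    total = sum(edges[(a, b)] for a, b in zip(path, path[1:]))  # step 3
    return inverse_square * total                               # step 4

def sentiment_toward(edges, target, lexicon):
    """Assumed aggregation: sum path strengths over all reachable lexicon words."""
    score = 0.0
    for word in lexicon:
        path = shortest_path(edges, target, word)
        if path is not None and len(path) > 1:
            score += path_strength(edges, path)
    return score

# Hypothetical edge list: (first word, second word) -> co-occurrence count.
edges = {
    ("film", "good"): 12, ("good", "action"): 5,
    ("action", "suspense"): 3, ("film", "boring"): 2,
}
positivity = sentiment_toward(edges, "film", {"good", "suspense"})
negativity = sentiment_toward(edges, "film", {"boring"})
```

With these made-up counts, the direct link film → good contributes 12, the three-step path to “suspense” contributes 20/9, and the single negative link contributes 2, so positivity toward “film” exceeds negativity. A fuller version would also trace paths from sentiment words to the target.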

Note that shortest path algorithms use costs as edge weights and find the lowest cost path to traverse from node A to node B, while our measure of edge strength is word pair co-occurrence frequencies, the inverse of costs. This edge weight inversion is the major pivot from transportation routing networks to semantic and social network analysis, where the edge weights are indicators of cohesion rather than costs.

Sentiment network measures and ground-truth

To establish the internal validity of our network measures of sentiment, we align the automated procedures with human annotators’ judgments as to whether a text unit is positive or negative. The comparison is rather crude in that the humans make a categorical classification judgment just like automated bag-of-words methods. The more precise continuous sentiment network measures are compared to nominal classification, where the human must make a decision for a whole text unit, not giving a continuous rating. Many messages contain both positive and negative features, so the categorical classifications that annotators make may contain errors. Ground-truth is, in this sense, a partial truth.

Nevertheless, it is useful to examine the extent to which semantic network sentiment scoring for targets corresponds to the annotation classifications. We expect to find that texts classified by annotators as negative have higher negativity toward targets, while positive ones have higher positivity. This pattern would be evidence supporting the internal validity of the semantic network approach.

Hypotheses

Although there is no way to statistically compare a network method to bag-of-words, because the latter cannot measure target-specific sentiment, we can test hypotheses about whether the network approach is consistent with the annotation results. Our null hypothesis is that the network positivity scores for targets based on positive annotations are no different than negativity scores, and likewise that the network negativity scores based on the negative annotations are no different than the positivity scores. Our main hypothesis is that the positivity scores are higher than negativity scores when positive annotations are analyzed and negativity scores are higher than positivity scores when negative annotations are analyzed. Accordingly, we test for the internal validity of the sentiment network measures by way of their alignment with the ground truth categorizations in two datasets: (1) tweets about airlines and (2) a sample of tweets regardless of topic. Then, we examine external validity in which we test hypotheses about sentiment, panic, and social distancing in TV news about the coronavirus.

Methods

Data

We use two ground-truth datasets in this research. A third dataset is for assessing the predictive validity of the sentiment network method.

Dataset 1

The dataset includes tweets about five US airlines: United, US Airways, Southwest, Delta, and Virgin America, scraped during February of 2015. The dataset contains 14,845 tweets, comprising 1.5 MB, which workers classified as positive, negative, or neutral. We selected the positive and negative categories for analysis. The data are available on the crowdsourcing website Kaggle.Footnote 2 Rane and Kumar (2018) reported an analysis of the dataset.

Dataset 2

A second dataset is the ground-truth tweet dataFootnote 3 analyzed by Hutto and Gilbert (2014) in their testing of the VADER sentiment analysis method. Rather than relying on several annotators, this work used 20 Amazon Mechanical Turk workers who rated each tweet from − 4 to + 4 in terms of negative to positive sentiment.

Nevertheless, to compare sentiment network scores to the annotations, we converted sentiment scores into four categories: highly negative, negative, positive, and highly positive. Another aspect of the data required a modification of our procedures compared to the analysis of dataset 1. Because the tweets were a representative sample from the Twitter API and not selected based on topics, the target we chose was a generic word referring to an object, “it.”

Dataset 3

To assess the external or predictive validity of the sentiment network measures, we analyzed data on television news coverage of the coronavirus. Data were available for January 1 through March 26, 2020, from the GDELT project.Footnote 4 There are 14 news outlets included. Broadcast outlets were ABC, CBS, and NBC. Among the 24-h networks, business news channels included Bloomberg, CNBC, and Fox Business Channel. General news networks were CNN, CSPAN, Fox News, and MSNBC, while BBC, Al Jazeera, Deutsche Welle, and RT were foreign channels. The data were keyword-in-context snippets of text in which the center word was ‘coronavirus (or covid-19)’ and words appearing 150 characters before and after were provided.
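For readers unfamiliar with keyword-in-context data, the following sketch shows how snippets of this shape could be cut from a transcript. It is an illustration only: the function name `keyword_in_context` and its parameters are hypothetical, and the procedure GDELT actually used to produce its snippets is not described here.

```python
import re

def keyword_in_context(text, keyword, radius=150):
    """Return one snippet per keyword match, with up to `radius`
    characters of context on each side of the match."""
    snippets = []
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(0, match.start() - radius)
        end = min(len(text), match.end() + radius)
        snippets.append(text[start:end])
    return snippets

# Hypothetical transcript sentence, with a small radius for readability.
snippets = keyword_in_context(
    "Officials said the coronavirus outbreak is spreading.",
    "coronavirus",
    radius=10,
)
```

Because each snippet is centered on the keyword, the target word is guaranteed to appear in every text unit, which suits a target-specific measure.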

Hypotheses

We tested four hypotheses and addressed a research question about the coronavirus topic in television news:

H1

Broadcast news expresses less sentiment about coronavirus, social distancing, and panic than non-broadcast news outlets.

The rationale is that with more limited time slots for news, broadcast news has a smaller programming window than 24-h cable news networks. The average time per story for the nightly network news was 2 min and 23 s, according to the Pew Research Center.Footnote 5 This time factor affects the content presented, leading to a more summary treatment. The inverted pyramid style that places the who, what, where, why, and how at the top of the story may not leave much room for more than initial expressions of sentiment. Another factor may be that these TV news shows have the longest history, beginning in an era of objective journalism. The traditional orientation may continue to some extent in the current era of advocacy journalism.

Social distancing

The goal of “flattening the curve” for the spread of the disease is treated in the mediaFootnote 6 as accomplished through increased social distancing.Footnote 7 Epidemic management models (Glass et al. 2006; Valdez et al. 2012) have identified social distancing as essential to controlling the spread of infectious disease. In the coronavirus case, governments have promoted social distancing, emphasizing its positive effects in saving the lives of vulnerable segments of the population.Footnote 8

It is interesting to consider the inversion of polarity associated with social distancing. Typically, social distancing has been considered a negative concept (Westphal and Khanna 2003; Swim et al. 1999; Polansky and Gaudin 1983), with norms favoring social cohesion (Forrest and Kearns 2001) and the reduction of social differences. Differences between social groups (Verba and Nie 1987) were considered harmful to the health of a democratic society. As well, individuals who socially distanced themselves were conceptualized as loners with less stable psychological and social functioning (Hojat 1983). Because social distancing is advocated as a temporary effort in the context of epidemics, when these processes subside, the polarity inversion in public communication about social distancing is likely to recede. Nevertheless, study of short-term and longer-term effects of social distancing on psychological and social variables is warranted.

Panic

When widespread crises occur, there is often accompanying mass panic (Mawson 2005). Mention of panic in the news is likely associated with negativity. Normally, negative information increases attention, information seeking, reasoning, and decision-making. Nevertheless, with increasing negativity, once a threshold is crossed negativity is no longer a stimulus to these rational responses to negative information. Rather, such higher level cognitive processes are short-circuited as panic sets in. With mass panic, herd behavior moves individuals to follow others without critical and rational thinking, operating at a more animalistic level.

H2

There is a negative bias in the coronavirus news across channels.

Many observers of media news assert that news media have a negative bias. Research supports this notion (Hofstetter 1976; Hackett 1984). News media strive to attract the largest audiences. Consider their negative bias in light of the findings of laboratory research. Individuals’ brains have a negative bias, more quickly processing negative than positive information (Taylor 1991). The brain’s negative bias may explain the media’s negative bias.

H3

Uncertainty in news coverage of coronavirus increases sentiment over time.

Laboratory experiments show that when uncertainty increases, both negative and positive affect intensify (Bar-Anan et al. 2009). Uncertainty and sentiment mutually increase one another, which results in a synchronous correlation of the two. Nevertheless, there may be some lagged effects such that one of the two concepts leads the other.

H4

Sentiment increases are associated with an increased volume of news stories.

In a study of sentiment and media coverage of the BP Deepwater Horizon Gulf Oil Spill (Danowski and Riopelle 2019) we found that increases in sentiment were synchronously correlated with increased media coverage. Our theory suggested that sentiment increases attention, information seeking, analysis, decision-making and broadens the view of the situation (Taylor 1991). Here we repeat the hypothesis test with the coronavirus coverage. The rationale is that the theory specifies that the sentiment and media volume relationship is general across topics. Nevertheless, certain topics develop the effects of sentiment more rapidly.

RQ1

Are 24-h business news, general news, and foreign news channels different in the sentiment expressed toward coronavirus, social distancing, and panic?

The fact that there are four different types of news channels: broadcast, 24-h business channels, general news, and foreign news, enables us to examine their differences in sentiment, in addition to H1’s expectation that broadcast will have less sentiment than the non-broadcast channels. The content differences between business and general 24-h news channels may be associated with differences in sentiment. As well, because of the different cultural contexts in which domestic and foreign channels are embedded, these may affect the sentiment they express.

Preprocessing the corpora

Our procedures for preprocessing the corpora and normalizing the data were as follows.

  1. Drop words on a stop-word list.

  2. Do no stemming.

  3. Remove punctuation (except sentence endings).

  4. Extract aggregate word pair counts using a sliding window three words wide on either side of each word. [Stop pairing at the end of a sentence, marked by a period “.”, exclamation point “!”, or question mark “?”.]Footnote 9 Drop word and word pair frequencies less than 3. In computational linguistics, dropping frequencies of 1 and 2 is a common practice because these low-frequency pairs do not add value to the results (Church and Hanks 1990). Word and word pair frequencies follow a power-law distribution, meaning that most words and word pairs occur fewer than 3 times. Dropping them produces a more normal distribution. Another consideration is that including words or pairs appearing only once or twice does not add explanatory power, yet it increases the computational load.
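The preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration, not the WORDij implementation: the stop-word list is a tiny placeholder, the function names are hypothetical, and we assume that stop words are removed before windowing and that each pair is recorded in its order of appearance (a directed pair).

```python
import re
from collections import Counter

# Illustrative stop-word list only; a real analysis would use a full list.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "was"}
WINDOW = 3    # words paired on either side of each word (step 4)
MIN_FREQ = 3  # drop words and pairs occurring fewer than 3 times


def sentences(text):
    """Split on sentence endings so that windows never cross a sentence."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def tokens(sentence):
    """Lowercase, strip other punctuation, drop stop words; no stemming."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def count_pairs(text, window=WINDOW, min_freq=MIN_FREQ):
    """Return {(word_a, word_b): count} for directed pairs within the window."""
    word_counts = Counter()
    pair_counts = Counter()
    for sent in sentences(text):
        toks = tokens(sent)
        word_counts.update(toks)
        for i, w in enumerate(toks):
            # Pairing each word with its followers covers both sides of
            # every word once, in order of appearance (assumption).
            for v in toks[i + 1 : i + 1 + window]:
                pair_counts[(w, v)] += 1
    kept = {w for w, c in word_counts.items() if c >= min_freq}
    return {
        pair: c
        for pair, c in pair_counts.items()
        if c >= min_freq and pair[0] in kept and pair[1] in kept
    }
```

For example, in the text "Flight delayed again. Flight delayed today. Flight delayed badly!" only the pair (flight, delayed) reaches the frequency threshold of 3; all other pairs occur once and are dropped.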

Sentiment lexicon development

Sentiment analysis typically uses a lexicon or dictionary that contains positive and negative sentiment words. To construct a lexicon for this research, we began with the positive and negative emotion dictionary from the LIWC program (Pennebaker et al. 2007). It stems words, which reduces them to their roots, removing morphological variants. For example, “walked, walking, walks, walker” are converted to “walk.” Although this is good for small corpora to increase word counts, it limits linguistic precision, because there are different meanings for various morphological endings. These nuances are lost with stemming to root words.

Accordingly, we de-stemmed the LIWC 266 positive and 346 negative stems, looking up each in the AGID list of inflections (Atkinson 2011). We also included the positive and negative word lists from Loughran and McDonald (2016), who analyzed SEC financial report text. Also, we added positive and negative lexicons developed by Liu (2010),Footnote 10 as well as sentiment lexicons from Khoo and Johnkhan (2018). After removing duplicates, the positive lexicon numbers 4485 words, and the negative lexicon contains 6466 words.

Sentiment network analysis procedures

  1. Read in edge lists of word pairs and co-occurrence weights. Read in text files containing lists of positive and negative seed words.

  2. Run the Bellman–Ford shortest path algorithm to identify the paths linking targets with sentiment seed words, then tabulate, for each file of seed words, the sum of the weighted shortest paths from the seed words to a specified target word, and from the target word to the seed words. This bi-directional tabulation ensures we capture the positive and negative seed words before and after a target word. So that larger numbers indicate closer ties, the path lengths are inverted and squared based on the Inverse Square Law. Then the sum of the co-occurrence weights along the path is multiplied by these inverse squared shortest path values. The result is the measurement of sentiment toward a target, with a value for positivity and negativity.

  3. Output positivity and negativity metrics:

    (a) normalized negativity, the sum of weighted shortest paths to negative seeds, each multiplied by the square of the inverse of the number of edges along the path, divided by the number of possible negative seeds;

    (b) normalized positivity, the sum of weighted shortest paths to positive seeds, each multiplied by the square of the inverse of the number of edges along the path, divided by the number of possible positive seeds; and

    (c) the ratio of normalized positivity to negativity. Ratios less than one indicate that there is more negativity than positivity. For example, if the score were .50, this would indicate that there was twice as much negativity as positivity. Scores above one indicate increasing positivity, without an upper bound.
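The scoring procedure above can be illustrated with a small Python sketch. It is not the SENET code. We assume here that every edge has a cost of 1, so that the shortest path is the one with the fewest edges, that edges are directed, and that ties between equally short paths are broken by edge order. The function names and the toy edge list are hypothetical.

```python
INF = float("inf")


def bellman_ford(edges, source):
    """Fewest-edge paths from source over directed edges (u, v, weight).

    Assumption: each edge costs 1; co-occurrence weights are used only
    for scoring, not for finding the path.
    """
    nodes = {source}
    for u, v, _w in edges:
        nodes.update((u, v))
    dist = {n: INF for n in nodes}
    pred = {}
    dist[source] = 0
    for _pass in range(len(nodes) - 1):
        changed = False
        for u, v, _w in edges:
            if dist[u] + 1 < dist[v]:
                dist[v] = dist[u] + 1
                pred[v] = u
                changed = True
        if not changed:
            break
    return dist, pred


def path_to(pred, source, node):
    """Rebuild the path from source to node using the predecessor map."""
    path = [node]
    while path[-1] != source:
        path.append(pred[path[-1]])
    return path[::-1]


def path_strength(path, weight):
    """Sum of co-occurrence weights times the inverse-squared edge count."""
    n_edges = len(path) - 1
    total = sum(weight[(a, b)] for a, b in zip(path, path[1:]))
    return total * (1.0 / n_edges) ** 2


def sentiment_score(edges, target, seeds):
    """Normalized positivity or negativity of target for one seed list."""
    reverse = [(v, u, w) for u, v, w in edges]
    total = 0.0
    # The forward graph finds seeds after the target; the reversed graph
    # finds seeds before it (the bi-directional tabulation).
    for edge_list in (edges, reverse):
        weight = {(u, v): w for u, v, w in edge_list}
        dist, pred = bellman_ford(edge_list, target)
        for seed in seeds:
            if seed != target and dist.get(seed, INF) < INF:
                total += path_strength(path_to(pred, target, seed), weight)
    return total / len(seeds)
```

With the toy edge list [("plane", "wifi", 4), ("wifi", "delayed", 4)], the target "plane", and the single negative seed "delayed", the path has two edges and a weight sum of 8, so the score is 8 × (1/2)² = 2.0. The positivity/negativity ratio is then the positive score divided by the negative score.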

Computing edge lists

One can produce the edge lists for sentiment network analysis in a variety of ways. There are Python and R packages, such as word2vec (Mikolov et al. 2013) or gensim (Řehůřek and Sojka 2011), which can produce adjacency or edge lists. We used WORDij software (Danowski 2013)Footnote 11 to produce these lists of word pairs and weights. Analysis in WORDij is automated and does not require coding or the loading of various packages.

We call the program we developed for sentiment network analysis SENET, an acronym derived from “SEntiment NETworks.” The program is coded in C++, and the shortest path procedures are parallelized to enable the code to run efficiently on large adjacency lists. SENET takes as input a list of word pairs and co-occurrence frequencies. The edge list has a row for each pair of words found through the word windowing process: [word A string, word B string, numeric co-occurrence value]. A file containing 2 million word pairs and 200,000 unique words, derived from 1 Gb of text, runs in several minutes on a common laptop.

Results

Prior to producing the results in this study, we analyzed negation in terms of pure and mixed sentiment paths. We report these results first.

Pure versus mixed sentiment paths

When a shortest path contains only one kind of sentiment word, positive or negative, this is a pure sentiment path. Mixed sentiment paths have a combination of positive and negative words. Of the shortest sentiment paths for the two ground-truth datasets, 47% were pure positive and 39% were pure negative; 9% of the paths linking targets with positive lexicon words also included negative lexicon words, while 6% of paths linking targets to negative words had some positive words. To see if these mixed paths could be converted to negative or positive, we had three coders judge a random sample of 40 mixed paths as negative, positive, or questionable.
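The distinction between pure and mixed paths can be expressed as a simple check. The sketch below is illustrative only: the function name is hypothetical, and the small lexicons stand in for the full positive and negative word lists.

```python
def classify_path(path, seed_polarity, positive, negative):
    """Label a shortest path as pure or mixed.

    path: list of words linking a target and a seed word
    seed_polarity: "positive" or "negative", the lexicon of the seed
    positive, negative: sets of lexicon words (placeholders here)
    """
    has_pos = any(word in positive for word in path)
    has_neg = any(word in negative for word in path)
    kind = "mixed" if (has_pos and has_neg) else "pure"
    return kind + " " + seed_polarity


# Placeholder lexicons for illustration
positive = {"safely", "good"}
negative = {"frustrating", "delayed"}

pure = classify_path(
    ["plane", "wifi", "usairways", "frustrating"], "negative", positive, negative
)
mixed = classify_path(
    ["plane", "good", "delayed", "frustrating"], "negative", positive, negative
)
```

Here `pure` is "pure negative", because the path contains only negative lexicon words, and `mixed` is "mixed negative", because a positive word appears along a path that ends in a negative seed.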

The reliability was .69, below the conventional standard of .80. The negative and positive recodes were accompanied by a relatively high percentage of questionable paths. Of mixed paths, 56% were judged as negative, 29% as positive, and 20% as questionable. Coders reported that they were not confident of their judgments on the mixed paths, so recoding them would introduce considerable error. Recall might have increased, but precision would have suffered. Since the percentage of mixed paths was relatively low compared to pure sentiment paths, averaging 7.5%, we decided to drop the mixed paths.

Nevertheless, it would be desirable to recode word pairs such as “not good” to negative sentiment. Additional work is needed to handle negation when “no, not, never” precedes a sentiment word, which occurs for some of the mixed sentiment paths. Most of the mixed paths, however, do not involve clear negation, where the last edge of the path includes a negation word preceding the sentiment word. The appearance of negative words anywhere along the path to a positive word, and vice versa, typically results in ambiguity.

Accordingly, when such ambiguity occurs, the net effect of a combination of positive and negative terms for many paths is neutralization. Calabrese, et al. (2019a) observe that a large majority of tweets are neutral because of a lack of sentiment words. We consider the absence of sentiment words as outside the domain of aggregate target-specific sentiment. Nevertheless, if one were classifying text units, neutral would be a useful category. Moreover, future research may use the network path data to improve the classification of texts.

In our testing for the validity of the network approach using ground-truth data, we examine tweets categorized as neutral in terms of their relative positivity and negativity. Since our method produces a ratio of positivity to negativity, researchers can empirically observe neutrality with respect to a target, where the ratio is close to 1, although neutrality appears to have a considerable range. In the results section to follow, there is evidence of a positivity bias in texts annotated as neutral, extending neutrality above the ratio of 1.0 to near 2.0 with positivity near 5.0.

Dataset 1

To begin, we show an example of a semantic network about airlines from dataset 1, illustrating the contexts for sentiment words. The graph (Fig. 2) shows word pairs occurring 25 times or more. Embedded in this network are shortest paths connecting targets to sentiment words. For example, here is a negative path with “plane” as the target: plane → wifi → usairways → frustrating, and a positive path with “flight” as target: flight → southwestair → made → safely.

Fig. 2 Semantic network

We computed positivity and negativity for three target words in the airline tweets data file: “airline,” “flight,” and “plane.” Table 1 presents the summary results. The findings show that the ratio of positivity to negativity averages .63 for the negative annotations, showing that negativity is about 1.6 times stronger than positivity. For neutral annotations, the ratio is 2.13, which indicates that positivity is about 2 times stronger than negativity, while for positive annotations, it is 3.94, with positivity about 4 times stronger than negativity. This finding supports the internal validity of the method.

Table 1 Sentiment variables

Note that the two sources of sentiment data are not perfectly aligned. The sentiment network scores are based on more micro-level text elements than the whole-text annotations. Nevertheless, the whole-text ratings are the closest ground-truth data available. Even though the two sources of sentiment ratings are different, their comparison is useful. Future research may have annotators rate word pairs.

The ratio is closer to 1 for the negative annotations than for the positive annotations. This pattern suggests a positivity bias in expressions of sentiment, and scanning the last column of Table 1 shows that the pattern occurs for both datasets. One reason for this may be that people typically identify a mix of positive and negative features when they evaluate an object (Houwer 2009). Human assessments of any objects or issues are likely to include some negative and some positive aspects, as people perhaps take a mental ratio and reach a binary summary judgment of whether they like or do not like the entity. This process may account for our findings.

Another possible reason is that media messages frequently contain advice that positivity is more efficacious than negativity. For example, the Mayo Clinic lists on their websiteFootnote 12 the following effects of positivity: increased life span, lower levels of distress, greater resistance to the common cold, better psychological and physical wellbeing, better cardiovascular health, and better coping skills during times of stress.

Such positive statements about the effects of positivity are indicative of the strong encouragement provided in the media for it. Even when feeling negative, people may temper their negative messages with some positivity. Despite the effects on the media of the removal of the Fairness Doctrine (Ruane 2009) that broadcasters must present a balance of opposing views to any opinions advocated, which appears to have unleashed advocacy journalism (Waisbord 2009) with a higher degree of negativity, the population may still resonate with the idea that one is more socially acceptable and credible if one leavens negativity with some positivity. Another consideration is cultural relativism (Donnelly 1984), a perspective that considers that moral issues are not black and white based on universalism but have a range of grays depending on whose value system is used. The effects of this cultural relativity may stimulate the embedding of positive sentiment with the negative even when the negative dominates.

Figures 3 and 4 illustrate the level of negativity, positivity, and the positivity/negativity ratio across three target words and different sentiment levels annotated by humans. Neutral annotations have approximately one-half as much positivity as positive annotations, showing a pattern consistent with expectations that neutral posts would have less positivity than positive posts. Comparing the magnitude of the ratios, we observe that negative tweets contain a higher proportion of positivity relative to negativity than expected. Moreover, neutral annotations have a 2–1 positivity ratio, adding further support to the positivity bias interpretation.

Fig. 3 Airline tweets: positivity and negativity for negative, neutral, and positive annotations

Fig. 4 Airline tweets: positivity and negativity ratios for negative, neutral, and positive annotations

Dataset 2

Because the tweets were a representative sample selected by API, and not specific to a topic, we selected a generic word referring to targets: “it,” so we did not use a stopword list. Figure 5 shows the positivity and negativity associated with the target by four levels of annotations: high negative, low negative, low positive, and high positive. Figure 6 compares the ratio of positivity to negativity, showing that the ratio increases across the levels. Table 1 shows that the two high-sentiment categories are each dominated by the expected polarity. High negatives have a positivity ratio of .23, which indicates roughly 4 times as much negativity as positivity. High positives have 31 times as much positivity as negativity. The lower sentiment categories show a bias toward positivity, with both low negative and low positive having approximately 2.5 times more positivity than negativity. Perhaps these low sentiment categories are better considered as neutral yet reflecting a positivity bias. The distribution of sentiment parallels the findings for the neutral category in dataset 1.

Fig. 5 Vader: positivity and negativity for low and high positive and negative annotations

Fig. 6 Vader positivity/negativity ratios for high and low positive and negative annotations

Hypothesis test

Our null hypothesis is that positivity and negativity scores do not differ across the annotation categories. We expect that network-based positivity is higher than negativity when analyzing texts annotated as positive, and that negativity is higher than positivity when analyzing texts annotated as negative. For this test, we computed the average sentiment ratios for negative texts and for positive texts across the comparisons in datasets 1 and 2. For negativity, the mean ratio was .25 with a standard deviation of .29, while the positivity mean ratio was 6.52 with a standard deviation of 21.43. We converted the negative ratio to 1 by multiplying it by 4 (a ratio of .25 shows that negativity is four times higher than positivity) and did likewise for the positivity ratio. The comparison tested is for the significance of the difference between a value of 1 and of 26.08. This difference was statistically significant (t = 147.47, df = 10,929, p < .0001). The hypothesis is supported.
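The rescaling used in this test is simple arithmetic on the two reported mean ratios, written out here for clarity:

```latex
\[
0.25 \times 4 = 1.00, \qquad 6.52 \times 4 = 26.08
\]
```

Both ratios are multiplied by the same constant, so their relative size is unchanged; only the reference point moves to 1.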

Dataset 3

Predictive validity: sentiment, panic, and social distancing in TV news coverage of coronavirus

Given the internal validity found for the sentiment network approach, it is useful to consider external or predictive validity, the extent to which we find support for hypotheses based on the method. For this assessment we examined positivity and negativity in the early coverage of the 2020 coronavirus pandemic, from January 1 to March 25, 2020, the time of our data collection. Figure 7 shows the total number of mentions of coronavirus. We tested four hypotheses and addressed one research question about the coronavirus topic in television news.

Fig. 7 Coronavirus mentions across the media channels over time

Shortest path statistics

Number of paths

We identified the features of the shortest paths across all media outlets and across time. Pure positive paths numbered 3786, while there were 5112 pure negative paths, 352 mixed positives, and 646 mixed negatives. Mixed paths were 10% of the total.

Edge length

Across each type of path, the longest had 4 edges. The modal path length was 3 edges, comprising 67% of paths.

Path strength

Pure positive paths had a sentiment strength ranging from .75 to 9.0, while pure negative paths had a range of −.75 to −9.1. The most common sum of the co-occurrence frequencies along the path was 9, with values ranging from 3 to 215. Mixed positive paths’ range was .75 to 4.75, while mixed negative paths ranged from −.75 to −6.25 in sentiment strength.

Hypothesis tests

H1

Broadcast news expresses less sentiment about coronavirus, social distancing, and panic than non-broadcast news outlets.

For each of the three topics, the broadcast news networks, ABC, CBS, and NBC, were compared to the other channels. Negativity and positivity variables were summed for a total sentiment score. Treatment of coronavirus by broadcast outlets was found to have a mean of .23, while the remaining outlets had a mean of .92. This suggests that broadcast coverage contains less overall sentiment (positive and negative) than non-broadcast TV news. A t test found this difference to be significant at p < .001. For social distancing the mean for broadcast was .73 while for the others it was .88, p < .06. Panic had a mean for broadcast of .18 while for the other channels it was .81, p < .003. These results support the hypothesis.

H2

There is a negative bias in the news across channels.

The negative bias was tested by dividing the positivity score by the negativity score. This ratio averaged .84. The Z-test for proportions comparing this value to 1.00 for balance of negativity and positivity found them significantly different at p < .0001 with negativity higher by 16%.

H3

Sentiment is associated with uncertainty in news coverage of coronavirus over time.

To test the hypothesis that changes in uncertainty are associated with increases in sentiment, we began by running WORDij’s WordLink with an include list of 297 uncertainty words (Loughran and McDonald 2016) on the news data, by day, aggregated across channels. Following the procedure used in Danowski and Riopelle (2019), we then factor analyzed these words over time. Taking the first principal component, we identified 51 words that loaded above .60 on the first dimension. Next, we created a string replacement file that converted each of the uncertainty words to a new aggregated uncertainty index. After rerunning WordLink with the string replace file for uncertainty terms, we extracted the counts for the news channels by day. This enabled a time-series analysis of sentiment and uncertainty.

Differencing of 1 was used to remove autocorrelation. Lags of − 7 to + 7 were computed in cross-correlations between the sum of negativity and positivity and uncertainty. The strongest association was for the contemporaneous period with a correlation of .66, p < .00001. There were no significant lags. The synchronous correlation supported the hypothesis (Table 2).
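The differencing and lagged cross-correlation steps can be sketched as follows. This is a minimal illustration under our own assumptions, not the software used for the reported analysis: the function names are hypothetical, and the sign convention (a positive lag k pairs the first series at time t with the second series at time t + k) is an assumption.

```python
from math import sqrt


def difference(series):
    """First difference: x[t + 1] - x[t], used to remove autocorrelation."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


def cross_correlations(x, y, max_lag=7):
    """Correlate differenced x and y at lags -max_lag..+max_lag.

    Assumed convention: lag k > 0 pairs x[t] with y[t + k] (x leads y);
    lag k < 0 pairs x[t] with y[t - |k|] (y leads x).
    """
    dx, dy = difference(x), difference(y)
    n = len(dx)
    result = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            a, b = dx[: n - k], dy[k:]
        else:
            a, b = dx[-k:], dy[: n + k]
        result[k] = pearson(a, b)
    return result
```

Applied to daily series of total sentiment and uncertainty, the lag-0 entry of the result corresponds to the contemporaneous correlation reported above, and the remaining entries show whether either series leads the other.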

Table 2 Television news negativity and positivity for coronavirus, social distancing, and panic

H4

Sentiment increases are associated with an increased volume of news stories.

Figure 8 shows evidence of two phases in media mentions of coronavirus. The first period runs from January 1 to February 21, while the second period is from February 22 to March 25. Period 1 negativity was .007 and positivity was .007, while period 2 negativity increased to .0204 and positivity to .0249, an increase of 3.9 times. A t-test of this difference was significant at p < .002. The hypothesis was supported.

Fig. 8 Negativity and positivity about coronavirus by channel

RQ1

Are 24-h business news, general news, and foreign news channels different in the sentiment expressed toward coronavirus, social distancing, and panic?

Means were tested for differences among the four groups. Although broadcast network news was treated in hypothesis 1, we included it in this analysis. Table 3 contains the ANOVA results. For each topic, coronavirus, panic, social distancing, and uncertainty, there were significant differences. As the type of news channel proceeds from broadcast network news to business news, general news, and foreign news, sentiment increases for each of the topics. Figure 9 shows sentiment about social distancing mentions, whose distribution is similar to that for coronavirus.

Table 3 Sentiment by type of news: broadcast, business, general, and foreign about coronavirus, bias, panic, social distancing

Fig. 9 Negativity and positivity about social distancing by channel

In summary, this research has found with two ground-truth datasets that the network metrics are internally valid. Moreover, the analysis of coronavirus coverage shows evidence of external validity. This predictive validity of the sentiment network measures is seen in the support for hypotheses about sentiment in media coverage of coronavirus.

Discussion

This study demonstrated that network-based measures of text sentiment have internal construct validity and external predictive validity. The sentiment network approach has several major advantages. It measures sentiment concerning targets, which is possible because the basic unit of analysis is the word pair in a sentence, not an entire document, as in bag-of-words approaches. The network approach enables better management of sentiment ambiguity. Mixed sentiment issues, a small percentage compared to pure sentiment paths, are shifted to an aggregate level where they are clearly identified and removed. Moreover, the sentiment network measures produce a continuous ratio, expanding the scope of useful statistical procedures. As well, the measure better informs message design for communication campaigns.

This research found ground-truth validity for the sentiment network measures, based on the tests with two datasets where human judges classified airline tweets as positive-neutral-negative, and general tweets as high negative-low negative-low positive-high positive. The results confirmed the validity of the sentiment network metrics in terms of human annotations. Nevertheless, this work has illustrated a distinction between ground-truth and partial truth. In placing the results in context, consider that “ground-truth” is best thought of as “partial truth” in that humans judged the entire text unit, not pairs of words, resulting in positive, neutral, and negative annotations having considerable levels of both positivity and negativity within them. Despite the error embedded in such ground-truth data, it is considered the best available standard against which to evaluate automated methods.

The results show evidence of a positivity bias in texts. For the posts that annotators judged, we found that at the subtext level of word bigrams, there was a combination of positivity and negativity, although in the direction of the annotation label. When evaluating entities, people consider both positive and negative attributes, yet reach an overall conclusion about whether the object is good or bad (Houwer 2009). These judgments may also be due to the frequent advice in self-help, positive psychology (Fredrickson 2001), and medical sources that people should be more positive and less negative. As well, there may be a social desirability bias to appear balanced in opinions, which was the norm in public media until the end of the Fairness Doctrine (Ruane 2009), and before the current era of advocacy journalism (Waisbord 2009).

It is important to note that the fundamental basis for assessing sentiment is the lexicon used in the network approach. The lists of approximately 11,000 words, 6446 for negativity and 4485 for positivity, are comprehensive, including morphological variants of words. Nevertheless, future work on comparing different lexicons is needed, finding which ones produce the most consistent results. Lexicon tuning and pruning may look at the contribution of each word to sentiment metrics. Words that contribute to more mixed sentiment paths would be candidates for removal. Words that have the lowest ambiguity would be retained. This lexicon work would contribute to improving sentiment measures.

A key component of this study was the application of the method to examine sentiment in the coronavirus coverage in television news. We found predictive validity through support for four hypotheses:

H1

Broadcast news expresses less sentiment about coronavirus, panic, and social distancing, than non-broadcast news outlets.

H2

There is a negative bias in the news across channels.

H3

Sentiment is associated with uncertainty in news coverage of coronavirus over time.

H4

Sentiment increases are associated with an increased volume of news stories.

It was also found that as the type of channel moved from broadcast network news to 24-h business, general, and foreign news sentiment increased for coronavirus, panic, and social distancing.

Public health campaigns to mitigate epidemics reverse the polarity of social distancing. They advocate social distancing for the social good, to dampen the spread of disease, and to protect vulnerable segments of the population. Such a meaning shift from negative to positive is interesting. Future research that examines short-term and possibly longer-term effects of social distancing campaigns on social behaviors and public opinion is warranted.

The utility this research has demonstrated for the sentiment network metrics suggests that the approach has promise in a wider range of applications. For example, one could use lexicons to measure variables such as uncertainty, teamwork, innovation, and resilience. All that is needed is a list of words that exemplify and instantiate the concept. One approach could be to take an existing scale, such as from a fixed-choice questionnaire, deconstruct it into a list of words, have experts further articulate the list, expand it by finding synonyms and antonyms in WordNet,Footnote 13 then statistically validate the semantic scaling against the fixed-choice metrics. Or, one could create new concept lexicons starting from mining text from news, social media, or customer reviews, finding the co-occurrences of concepts in a list, extracting principal components, then using the words with the highest loadings on the dimensions to create seed files for semantic network scaling (Danowski and Riopelle 2019). The network methods enable automated filtering and measurement of an endless variety of text, targets, and seeds to navigate streams of natural language.

In conclusion, this research found that a novel sentiment network approach has construct validity. The positivity and negativity scores it produced aligned well with ground-truth annotations in two datasets having three and four sentiment categories. Also, our testing of hypotheses about television news coverage of the coronavirus pandemic found evidence of predictive validity. Accordingly, the strong support we found for both internal and external validity demonstrates an improvement over bag-of-words approaches that merely count occurrences of lexicon words to classify whole text units. The higher precision and specificity of the sentiment network approach enables going beyond artificial intelligence-based classification with its hidden layers of nodes in black-box neural networks, moving the network into the foreground illuminated by theory.

Notes

  1. https://www.kaggle.com/orgesleka/imdbmovies.

  2. The original dataset constructed from this analysis is available on Kaggle: https://www.kaggle.com/crowdflower/twitter-airline-sentiment.

  3. https://github.com/cjhutto/vaderSentiment.

  4. https://blog.gdeltproject.org/a-new-dataset-for-exploring-the-coronavirus-narrative-on-television-news/.

  5. https://www.journalism.org/2012/07/16/video-length/#_ftn2.

  6. https://blog.gdeltproject.org/flatten-the-curve-has-lept-into-our-lexicon/.

  7. https://blog.gdeltproject.org/social-distancing-is-the-new-coronavirus-buzzword/.

  8. https://www.whitehouse.gov/briefings-statements/remarks-president-trump-address-nation/.

  9. A reviewer cautioned that grammar rules are often violated in unedited material by lay writers, and that stopping the window at the end of a sentence and restarting with the next sentence could be problematic when punctuation is missing. Because we do not classify texts and instead identify aggregate patterns across them, such occurrences are distributed uniformly across comparisons and do not bias the findings. Moreover, when preprocessing text, natural language analysis typically removes all punctuation, so this issue is not considered important in these models.

  10. For more information, visit: https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#lexicon.

  11. WORDij can be downloaded from: http://wordij.net/.

  12. https://www.mayoclinic.org/healthy-lifestyle/stress-management/in-depth/positive-thinking/art-20043950.

  13. https://wordnet.princeton.edu/.

References

  1. Asghar, M.Z., Khan, A., Ahmad, S., Kundi, F.M.: A review of feature extraction in sentiment analysis. J. Basic Appl. Sci. Res. 4(3), 181–186 (2014)

  2. Atkinson, K.: Automatically generated inflection database (AGID). Retrieved from https://github.com/en-wl/wordlist (2011)

  3. Baccianella, S., Esuli, A., Sebastiani, F.: SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: LREC, vol. 10, pp. 2200–2204 (2010)

  4. Bar-Anan, Y., Wilson, T.D., Gilbert, D.T.: The feeling of uncertainty intensifies affective reactions. Emotion 9(1), 123 (2009)

  5. Bellman, R.: On a routing problem. Q. Appl. Math. 16(1), 87–90 (1958)

  6. Borgatti, S.P., Mehra, A., Brass, D.J., Labianca, G.: Network analysis in the social sciences. Science 323(5916), 892–895 (2009)

  7. Calabrese, C., Anderton, B.N., Barnett, G.A.: Online representations of “genome editing” uncover opportunities for encouraging engagement: a semantic network analysis. Sci. Commun. 41(2), 222–242 (2019a)

  8. Calabrese, C., Ding, J., Millam, B., Barnett, G.A.: The uproar over gene-edited babies: a semantic network analysis of CRISPR on Twitter. Environ. Commun. (2019b). https://doi.org/10.1080/17524032.2019.1699135

  9. Cambria, E., Das, D., Bandyopadhyay, S., Feraco, A. (eds.): A Practical Guide to Sentiment Analysis. Springer, Cham (2017)

  10. Carley, K.: Coding choices for textual analysis: a comparison of content analysis and map analysis. Sociol. Methodol. (1993). https://doi.org/10.2307/271007

  11. Church, K.W., Hanks, P.: Word association norms, mutual information, and lexicography. Comput. Linguist. 16(1), 22–29 (1990)

  12. Corman, S.R., Kuhn, T., McPhee, R.D., Dooley, K.J.: Studying complex discursive systems. Hum. Commun. Res. 28(2), 157–206 (2002)

  13. Danowski, J.A.: A network-based content analysis methodology for computer-mediated communication: an illustration with a computer bulletin board. Commun. Yearb. 6, 904–925 (1982)

  14. Danowski, J.A.: Network analysis of message content. In: Barnett, G., Richards, W. (eds.) Progress in communication sciences XII, pp. 197–222. Ablex, Norwood (1993)

  15. Danowski, J.A.: Inferences from word networks in messages. In: Krippendorff, K., Bock, M.A. (eds.) The content analysis reader, pp. 421–429. Sage, Thousand Oaks (2009)

  16. Danowski, J.A.: WORDij Version 3.0: Semantic Network Analysis Software. University of Illinois at Chicago, Chicago (2013)

  17. Danowski, J.A., Park, H.W.: Arab spring effects on meanings for Islamist web terms and on web hyperlink networks among Muslim-majority nations: a naturalistic field experiment. J. Contemp. East. Asia 13(2), 15–39 (2014)

  18. Danowski, J.A., Riopelle, K.: Scaling constructs with semantic networks. Qual. Quant. 53(5), 2671–2683 (2019)

  19. D’Andrea, A., Ferri, F., Grifoni, P., Guzzo, T.: Approaches, tools and applications for sentiment analysis implementation. Int. J. Comput. Appl. 125, 26–33 (2015)

  20. Del Vicario, M., Vivaldo, G., Bessi, A., Zollo, F., Scala, A., Caldarelli, G., Quattrociocchi, W.: Echo chambers: emotional contagion and group polarization on Facebook. Sci. Rep. 6, 1–12 (2016)

  21. Dijkstra, E.W.: A note on two problems in connexion with graphs. Numer. Math. 1(1), 269–271 (1959)

  22. Donnelly, J.: Cultural relativism and universal human rights. Hum. Rts. Q. 6, 400 (1984)

  23. Doroshenko, L., Schneider, T., Kofanov, D., Xenos, M.A., Scheufele, D.A., Brossard, D.: Ukrainian nationalist parties and connective action: an analysis of electoral campaigning and social media sentiments. Inf. Commun. Soc. 22(10), 1376–1395 (2019). https://doi.org/10.1080/1369118x.2018.1426777

  24. Fogel-Dror, Y., Shenhav, S.R., Sheafer, T., Van Atteveldt, W.: Role-based association of verbs, actions, and sentiments with entities in political discourse. Commun. Methods Meas. 13(2), 69–82 (2019). https://doi.org/10.1080/19312458.2018.1536973

  25. Ford Jr., L.R.: Network Flow Theory. Rand Corp, Santa Monica (1956)

  26. Forrest, R., Kearns, A.: Social cohesion, social capital and the neighbourhood. Urban Stud. 38(12), 2125–2143 (2001)

  27. Fredrickson, B.L.: The role of positive emotions in positive psychology: the broaden-and-build theory of positive emotions. Am. Psychol. 56(3), 218 (2001)

  28. Glass, R.J., Glass, L.M., Beyeler, W.E., Min, H.J.: Targeted social distancing designs for pandemic influenza. Emerg. Infect. Dis. 12(11), 1671 (2006)

  29. Golden, B.: Shortest-path algorithms: a comparison. Oper. Res. 24(6), 1164–1168 (1976)

  30. Hackett, R.A.: Decline of a paradigm? Bias and objectivity in news media studies. Crit. Stud. Media Comm. 1(3), 229–259 (1984)

  31. Hajela, G., Pandey, M.: Parallel implementations for solving shortest path problem using Bellman–Ford. Int. J. Comput. Appl. 95(15), 2–6 (2014)

  32. Hofstetter, C.R.: Bias in the news: network television coverage of the 1972 election campaign. The Ohio State University Press (1976)

  33. Hojat, M.: Comparison of transitory and chronic loners on selected personality variables. Br. J. Psychol. 74(2), 199–202 (1983)

  34. Houwer, J.D.: How do people evaluate objects? A brief review. Soc. Pers. Psychol. Compass 3(1), 36–48 (2009)

  35. Hutto, C.J., Gilbert, E.: Vader: a parsimonious rule-based model for sentiment analysis of social media text. In: Eighth International AAAI Conference on Weblogs and Social Media (2014)

  36. Jiang, K., Barnett, G.A., Taylor, L.D.: Dynamics of culture frames in international news coverage: a semantic network analysis. Int. J. Commun. 10, 27 (2016)

  37. Kharde, V., Sonawane, P.: Sentiment analysis of twitter data: a survey of techniques. arXiv preprint arXiv:1601.06971 (2016)

  38. Khoo, C.S., Johnkhan, S.B.: Lexicon-based sentiment analysis: comparative evaluation of six sentiment lexicons. J. Inf. Sci. 44(4), 491–511 (2018)

  39. Kim, S., Krishna, A.: Unpacking public sentiment toward the government: how citizens’ perceptions of government communication strategies impact public engagement, cynicism, and communication behaviors in South Korea. Int. J. Strateg. Commun. 12(3), 215–236 (2018)

  40. Kumar, S., Zymbler, M.: A machine learning approach to analyze customer satisfaction from airline tweets. J. Big Data 6, 62 (2019)

  41. Kundi, F.M., Ahmad, S., Khan, A., Asghar, M.Z.: Detection and scoring of internet slangs for sentiment analysis using SentiWordNet. Life Sci. J. 11(9), 66–72 (2014)

  42. Liu, B.: Sentiment analysis and subjectivity. In: Handbook of natural language processing, vol. 2, pp. 627–666 (2010)

  43. Liu, B.: Sentiment analysis and opinion mining. Synth. Lect. Hum. Lang. Technol. 5, 1–167 (2012)

  44. Liu, B., Zhang, L.: A survey of opinion mining and sentiment analysis. In: Aggarwal, C.C., Zhai, C.X. (eds.) Mining Text Data, pp. 415–463. Springer, Boston (2012)

  45. Loughran, T., McDonald, B.: Textual analysis in accounting and finance: a survey. J. Account. Res. 54(4), 1187–1230 (2016)

  46. Lovins, J.B.: Development of a stemming algorithm. Mech. Translat. Comp. Linguist. 11(1–2), 22–31 (1968)

  47. Mäntylä, M.V., Graziotin, D., Kuutila, M.: The evolution of sentiment analysis: a review of research topics, venues, and top cited papers. Comput. Sci. Rev. 27, 16–32 (2018)

  48. Mawson, A.R.: Understanding mass panic and other collective responses to threat and disaster. Psychiatry Interpers Biol. Process. 68(2), 95–113 (2005)

  49. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

  50. Mohammad, S.M., Zhu, X., Kiritchenko, S., Martin, J.: Sentiment, emotion, purpose, and style in electoral tweets. Inf. Process. Manag. 51, 480–499 (2015)

  51. Monge, P.R., Contractor, N.S.: Theories of Communication Networks. Oxford University Press, Oxford (2003)

  52. Nanli, Z., Ping, Z., Weiguo, L., Meng, C.: Sentiment analysis: a literature review. In: 2012 International Symposium on Management of Technology (ISMOT), pp. 572–576. IEEE (2012)

  53. Osgood, C.E., Suci, G.J., Tannenbaum, P.H.: The Measurement of Meaning. University of Illinois Press, Champaign (1957)

  54. Pennebaker, J.W., Booth, R.J., Francis, M.E.: Linguistic Inquiry and Word Count: LIWC, p. 135. LIWC, Austin (2007)

  55. Plisson, J., Lavrac, N., Mladenic, D.: A rule based approach to word lemmatization. In: Proceedings of IS, vol. 3, pp. 83–86 (2004)

  56. Polansky, N.A., Gaudin Jr, J.M.: Social distancing of the neglectful family. Soc. Serv. Rev. 57(2), 196–208 (1983)

  57. Pontiki, M., Galanis, D., Papageorgiou, H., Androutsopoulos, I., Manandhar, S., AL-Smadi, M., Eryiğit, G.: SemEval-2016 Task 5: aspect based sentiment analysis. In: 10th International Workshop on Semantic Evaluation (SemEval 2016). Retrieved from https://hal.archives-ouvertes.fr/hal-02407165 (2016)

  58. Porter, M.F.: An algorithm for suffix stripping. Program 14(3), 130–137 (1980)

  59. Rambocas, M., Pacheco, B.G.: Online sentiment analysis in marketing research: a review. J. Res. Interact. Mark. 12(2), 146–163 (2018)

  60. Rudkowsky, E., Haselmayer, M., Wastian, M., Jenny, M., Emrich, Š., Sedlmair, M.: More than bags of words: sentiment analysis with word embeddings. Commun. Methods Meas. 12(2–3), 140–157 (2018)

  61. Rane, A., Kumar, A.: Sentiment classification system of Twitter data for US airline service analysis. In: 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC), vol. 1, pp. 769–773. IEEE (2018)

  62. Řehůřek, R., Sojka, P.: Gensim—statistical semantics in python. Statistical semantics; gensim; Python; LDA; SVD (2011)

  63. Rogers, E.M.: Progress, problems and prospects for network research: investigating relationships in the age of electronic communication technologies. Soc. Netw. 9(4), 285–310 (1987)

  64. Ruane, K.A.: Fairness doctrine: history and constitutional issues. J. Curr. Issues Crime Law Law Enforc. 2(1), 75–89 (2009)

  65. Scheufele, D.A., Tewksbury, D.: Framing, agenda setting, and priming: the evolution of three media effects models. J. Commun. 57(1), 9–20 (2007)

  66. Swim, J.K., Ferguson, M.J., Hyers, L.L.: Avoiding stigma by association: subtle prejudice against lesbians in the form of social distancing. Basic Appl. Soc. Psychol. 21(1), 61–68 (1999)

  67. Taylor, S.E.: Asymmetrical effects of positive and negative events: the mobilization-minimization hypothesis. Psychol. Bull. 110(1), 67 (1991)

  68. Tausczik, Y.R., Pennebaker, J.W.: The psychological meaning of words: LIWC and computerized text analysis methods. J. Lang. Soc. Psychol. 29, 24–54 (2010)

  69. Thet, T.T., Na, J.-C., Khoo, C.S.G.: Aspect-based sentiment analysis of movie reviews on discussion boards. J. Inf. Sci. 36, 823–848 (2010)

  70. Valdez, L.D., Macri, P.A., Braunstein, L.A.: Intermittent social distancing strategy for epidemic control. Phys. Rev. E 85(3), 036108 (2012)

  71. Verba, S., Nie, N.H.: Participation in America: Political Democracy and Social Equality. University of Chicago Press, Chicago (1987)

  72. Waisbord, S.: Advocacy journalism in a global context. In: Wahl-Jorgensen, K., Hanitzsch, T. (eds.) The Handbook of Journalism Studies, pp. 371–385. Routledge, Abingdon (2009)

  73. Wang, B., Liu, M.: Deep learning for aspect-based sentiment analysis. Retrieved from: https://cs224d.stanford.edu/reports/WangBo.pdf (2015)

  74. Westphal, J.D., Khanna, P.: Keeping directors in line: social distancing as a control mechanism in the corporate elite. Adm. Sci. Q. 48(3), 361–398 (2003)

  75. Zhang, L., Wang, S., Liu, B.: Deep learning for sentiment analysis: A survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 8(4), e1253 (2018)

Author information

Corresponding author

Correspondence to James A. Danowski.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Stopword list

The stopword list was accumulated over a number of studies that used different text sources. Most of the items are function words, while some of the terms result from boilerplate material in Lexis-Nexis:

a editor latter somehow
a’s editorial latterly someone
aa edt least something
able edu leisure sometime
about ee length sometimes
above e.g. less source
according eight lest sources
accordingly either let specified
across else let’s specify
actually elsewhere letter specifying
after english like sports
afterwards enough liked ss
again entirely likely staff
against especially limited still
ain’t est listings sub
all et little such
allow etc. ll summary
allows even loaddate Sunday
allrightsreserved ever look sup
almost every looking sure
alone everybody looks t
along everyone ltd t’s
already everything m tends
also everywhere magazine th
although ex mainly than
always exactly many thanx
am example map that
among except mar that’s
amongst f March thats
an far May the
and feb maybe theater
another February me their
any few mean theirs
anybody ff meanwhile them
anyhow fifth merely themselves
anyone final metro then
anything first metropolitan thence
anyway five might there
anyways FM mm there’s
anywhere followed Monday thereafter
apart following more thereby
appear follows moreover therefore
appended for most therein
appreciate former mostly theres
appropriate formerly much thereupon
apr forth must these
April four my they
are Friday myself they’d
aren’t from n they’ll
around further name they’re
article furthermore namely they’ve
arts g nd third
as get nevertheless this
aside gets news those
ask getting newspaper though
asking gg newspapers three
associated give newyorktimes through
at given next throughout
aug gives nine thru
August gmt nn Thursday
available go nobody thus
away goes noone to
awfully going nor too
b gone nov toward
bb got novel towards
be gotten November tt
became graph nytimes Tuesday
because graphic o twice
become greetings obit two
becomes guardian newspapers limited obituaries type
becoming h obituary u
been had obviously un
before hadn’t oct under
beforehand happens October unfortunately
behind hardly of unless
being has off until
believe hasn’t oh unto
below have ok up
beside haven’t okay upon
besides having old url
best he on us
better he’ll once usatoday
between he’s one use
beyond hello ones used
both help only useful
brief hence onto uses
but her oo using
by her’s op usually
byline here oped uu
c here’s or uucp
c’mon hereafter other v
c’s hereby others value
came herein otherwise various
can hereupon ought very
can’t hers our via
cannot herself ours viz
cant hh ourselves vs
caption hi out vv
cause highlight over w
causes him overall was
cc himself p washington post
cdt his page we
certain hither pages we’d
certainly hopefully particular we’ll
changes how particularly we’re
chart howbeit per we’ve
clearly however perhaps weather
co http pg Wednesday
column httpwwwnytimescom photo well
com i photograph went
come i’d photographs were
comes i’ll photos weren’t
company i’m picture what
concerning i’ve placed what’s
consequently i.e. plus whatever
consider if pm when
considering ignored pp whence
contain ii presumably whenever
containing immediate publication where
contains in publication type where’s
copyright inasmuch q whereafter
correction inc qq whereas
correctiondate indeed que whereby
corresponding indicate quite wherein
could indicated qv whereupon
couldn’t indicates r wherever
course inner rather whether
crossword insofar rd which
cst instead re while
currently into regarding whither
d inward regards who
date is relatively who’s
dateline isn’t reserved whoever
dd it respectively whom
dec it’d review whose
December it’ll reviews will
definitely it’s rights with
described its rr within
desk itself s won’t
despite j said words
diary jan same would
did January Saturday wouldn’t
didn’t jj saw writer
different jul say ww
digest July saying www
do jun says wwwnytimescom
document June secondly x
documents just section xx
does k sep y
doesn’t keep September yet
doing keeps series you
don’t kept seven you’d
done kk several you’ll
down know shall you’re
downwards known she you’ve
drawing knows she’ll your
during l since yours
e language six yourself
each last so yourselves
ed lately some yy
edition later somebody z
    zz
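
As a minimal sketch of how a stopword list of this kind is typically applied before word networks are built, the Python example below drops stopwords from a tokenized text. The function name `remove_stopwords`, the simple letters-only tokenizer, and the nine-word excerpt of the list are assumptions made for illustration; they are not the preprocessing code used in this study, and contractions such as "don’t" would need additional handling.

```python
import re

# Hypothetical excerpt of the stopword list above; a full run would load every entry.
STOPWORDS = {"a", "about", "and", "byline", "copyright", "is", "of", "the", "was"}


def remove_stopwords(text, stopwords=STOPWORDS):
    """Lowercase the text, split it into alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in stopwords]


kept = remove_stopwords("The panic about the coronavirus is spreading")
print(kept)
```

Only the content words remain after filtering, so function words and source boilerplate do not take part in any later word-pair counts.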

Uncertainty lexicon: principal component

List of 51 uncertainty terms loading above .60 on the first principal component extracted from the larger 297-word uncertainty list from Loughran and McDonald (2016):

could depends assume assuming
may depends suggested cautious
possible roughly uncertain suggest
maybe sometimes predicted precautions
almost nearly confusing appears
risk somewhere risks anticipate
might depending suddenly predicting
probably somewhat uncertainty approximately
seems exposure anticipating predictions
perhaps depend indefinite appear
believe confusion anticipated doubt
possibility suggesting sometime volatility
possibly believes risking varying
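
One simple way to apply a lexicon like this is to compute the share of tokens in a text that are uncertainty terms. The Python sketch below illustrates that idea only: the appendix does not specify the scoring procedure, so the function `uncertainty_share`, the letters-only tokenizer, and the seven-term excerpt are hypothetical choices rather than the study's method.

```python
import re

# Hypothetical excerpt of the 51-term list above; a full run would use every term.
UNCERTAINTY_TERMS = {"could", "may", "maybe", "might", "possible", "risk", "uncertain"}


def uncertainty_share(text, lexicon=UNCERTAINTY_TERMS):
    """Return the fraction of alphabetic tokens that appear in the lexicon."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0  # avoid division by zero on empty input
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


share = uncertainty_share("The virus could spread and the risk may grow")
print(round(share, 3))
```

Dividing by the token count keeps the measure comparable across texts of different lengths, for example across news segments from different channels.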

Uncertainty lexicon: full list

Larger list of 297 uncertainty terms from Loughran and McDonald (2016):

abeyance deviate presuming suggests
abeyances deviated presumption susceptibility
almost deviates presumptions tending
alteration deviating probabilistic tentative
alterations deviation probabilities tentatively
ambiguities deviations probability turbulence
ambiguity differ probable uncertain
ambiguous differed probably uncertainly
anomalies differing random uncertainties
anomalous differs randomize uncertainty
anomalously doubt randomized unclear
anomaly doubted randomizes unconfirmed
anticipate doubtful randomizing undecided
anticipated doubts randomly undefined
anticipates exposure randomness undesignated
anticipating exposures reassess undetectable
anticipation fluctuate reassessed undeterminable
anticipations fluctuated reassesses undetermined
apparent fluctuates reassessing undocumented
apparently fluctuating reassessment unexpected
appear fluctuation reassessments unexpectedly
appeared fluctuations recalculate unfamiliar
appearing hidden recalculated unfamiliarity
appears hinges recalculates unforecasted
approximate imprecise recalculating unforseen
approximated imprecision recalculation unguaranteed
approximately imprecisions recalculations unhedged
approximates improbability reconsider unidentifiable
approximating improbable reconsidered unidentified
approximation incompleteness reconsidering unknown
approximations indefinite reconsiders unknowns
arbitrarily indefinitely reexamination unobservable
arbitrariness indefiniteness reexamine unplanned
arbitrary indeterminable reexamining unpredictability
assume indeterminate reinterpret unpredictable
assumed inexact reinterpretation unpredictably
assumes inexactness reinterpretations unpredicted
assuming instabilities reinterpreted unproved
assumption instability reinterpreting unproven
assumptions intangible reinterprets unquantifiable
believe intangibles revise unquantified
believed likelihood revised unreconciled
believes may risk unseasonable
believing maybe risked unseasonably
cautious might riskier unsettled
cautiously nearly riskiest unspecific
cautiousness nonassessable riskiness unspecified
clarification occasionally risking untested
clarifications ordinarily risks unusual
conceivable pending risky unusually
conceivably perhaps roughly unwritten
conditional possibilities rumors vagaries
conditionally possibility seems vague
confuses possible seldom vaguely
confusing possibly seldomly vagueness
confusingly precaution sometime vaguenesses
confusion precautionary sometimes vaguer
contingencies precautions somewhat vaguest
contingency predict somewhere variability
contingent predictability speculate variable
contingently predicted speculated variables
contingents predicting speculates variably
could prediction speculating variance
crossroad predictions speculation variances
crossroads predictive speculations variant
depend predictor speculative variants
depended predictors speculatively variation
dependence predicts sporadic variations
dependencies preliminarily sporadically varied
dependency preliminary sudden varies
dependent presumably suddenly vary
depending presume suggest varying
depends presumed suggested volatile
destabilizing presumes suggesting volatilities

Cite this article

Danowski, J.A., Yan, B. & Riopelle, K. A semantic network approach to measuring sentiment. Qual Quant 55, 221–255 (2021). https://doi.org/10.1007/s11135-020-01000-x

Keywords

  • Sentiment
  • Semantic networks
  • Coronavirus news