Introduction

The World Health Organization (WHO) declared novel Coronavirus Disease 2019 (COVID-19) as a pandemic in March 2020 [16]. It has now spread across the world and there have been more than 6.29 million confirmed cases and more than 380,000 people died because of the virus until 4 June 2020. Almost all countries in the world are battling against COVID-19 to prevent it from the spread as much as possible. The impact of the COVID-19 outbreak has been so huge that it is said to be the most serious epidemic in the past hundred years comparable to pandemics of the past like Spanish flu of 1918, or the Black Death in the mid-1300s. The outbreak has caused scares across the globe affecting millions of people either through infection or through increased mental health issues, such as disruption, stress, worry, fear, disgust, sadness, anxiety (a fear for one’s own health,and a fear of infecting others), and perceived stigmatisation [3, 13]. These mental health issues can even occur in people not at high risk of getting infected, in the face of a virus with which the common people may be unfamiliar [13]. These mental health issues can cause severe emotional, behavioral and physical health problems. Complications linked to mental health include: unhappiness and decreased enjoyment of life, family conflicts, relationship difficulties, social isolation, problems with tobacco, alcohol and other drugs, self-harm and harm to others (such as suicide or homicide), weakened immune system, and heart disease as well as other medical conditions.

Furthermore, lockdown is a commonly completed measure to stop the spread of virus in the community. For example, countries such as China, Italy, Spain, and Australia were fighting with the COVID-19 pandemic by taking strict measures of national-wide lockdown or by cordoning off the areas that were suspected of having risks of community spread throughout the period of the pandemic, expecting to “flatten the curve”. Therefore, it is critical to learn the status of the community mental health especially during the pandemic period so that corresponding measures can be taken in time.

On the other hand, social media such as Twitter represent a relatively real-time large-scale snapshot of messages reflecting people’s thoughts and feelings. Every tweet is a signal of the social media users’ state of mind and state of being at that moment [5]. Aggregation of such digital traces may make it possible to monitor health behaviours at a large-scale [9]. While during the lockdown of COVID-19, people have taken social media to express their feelings and find a way to calm themselves down. Therefore, social media have created possibilities of analysing people’s sentiment and its dynamics during the pandemic period. The most recent work have used Twitter to analyse public sentiment during the pandemic [2, 3]. These work largely focus on the public sentiment in country-level or even multiple countries for a specific period. However, a granular analysis of sentiment of a single city or even a suburb is more operable for related parties such as governmental departments to take corresponding actions. Furthermore, the sentiment dynamics over time instead of sentiment of a specific period will help related parties understand the effectiveness of any measures implemented.

This paper aims to examine community sentiment dynamics due to COVID-19 pandemic in Australia. Twitter data are collected and analysed to extract sentiment scores which may be affected by COVID-19 and related events during the COVID-19 period. Instead of the sentiment examination of the whole country, this paper investigates a fine-grained analysis of sentiment in local government areas of a state in Australia, which can help government to take corresponding measures more objectively if necessary. Furthermore, through analysing topic key words labelled by hashtag in Twitter, this paper examines community dynamics because of different events, government policies and programs or others with visual analytics.

Related Work

Social Media Sentiment Analysis

Sentiment analysis is a research topic that analyzes people’s sentiments, opinions, mental states, and emotions from data resources such as texts, images and videos that generated by human beings [11]. With the rapidly development of social media platforms such as Twitter and Facebook [17], it becomes possible to collect large-scale of text data from millions of users for the social medial sentiment analysis [1]. Most of the methods take the concept of supervised machine learning for social media sentimental analysis [6, 7, 10, 12]. In those methods, the supervised machine learning models classify a tweet into predefined sentimental categories such as positive and negative. Specifically, the supervised machine learning models first need to be trained with data that contain the labels for different sentiments. Then, the trained models can be used for sentiment prediction. For instance, Go et al. [6] adopted the distant supervision strategy to create a labeled tweet dataset for training supervised machine learning models such as Naive Bayes, Maximum Entropy, and support vector machine. The emoticons such as :), :( are served as noise labels to indicate the sentiment of the tweet publisher as positive or negative. With the superior advantages of automatically feature selection from raw data, deep learning techniques have also been adopted for sentiment analysis. Jiang et al. [10] adopted the the latest techniques of deep neural network as the sentimental classifier for more accurate sentiment classification. Except the machine learning-based methods for sentiment analysis, there some lexicon-based methods that exploiting the word polarity to compute the sentimental score for a given text [8, 14, 15]. For example, Ortega et al. [14] proposed to exploit multiple steps of different techniques for Twitter sentimental analysis. First, the pre-processing for tweets was performed to clean the raw texts. Then, an existing polarity detection method were adopted to compute the sentimental score of each word. Finally, they adopted a rule-based classification to classify the sentiments for tweets. The adopted polarity detection and rule-based classification methods were based on existing knowledge bases: WordNet and SentiWordNet.

Sentiment Analysis for COVID-19 Based on Twitter

More recently, with the pandemic of COVID-19 spreading in the world, there are some work that exploiting sentiment analysis as a tool to investigate the people’s reactions. For instance, Barkur et al. [2] analysed public sentiment of Indians after the lockdown announcements were made. The social media platform Twitter was used for the sentiment analysis. The analysis showed that Indians have taken the fight against COVID-19 positively and majority are in agreement with the government for announcing the lockdown to flatten the curve. However, this work focused on the overall public sentiment of a specific period but did not show the dynamics of sentiment during the study period. Bhat et al. [3] analysed sentiments expressed globally through Twitter and found that the perception of Twitter users was mostly positive or neutral during the COVID-19 pandemic period. It indicated that even though Twitter users were quarantined or staying at home, yet they were hopeful and experiencing a different unique socialization opportunity with family. Dubey [4] analysed tweets from twelve countries during a selected period of the pandemic, aiming to understand how the citizens of different countries are dealing with the situation. The results showed that while majority of people were taking a positive and hopeful approach, there were instances of fear, sadness and disgust exhibited worldwide. Furthermore, four countries, France, Switzerland, the Netherland and USA have shown signs of distrust and anger on a bigger scale as compared to the remaining eight countries.

However, these previous work on COVID-19 sentiment analysis largely focuses on the public sentiment in country-level or globally for a specific period. However, little work is found to analyse dynamics of community sentiment over time for different regions especially when government policies are applied or significant events are occurred during the pandemic period.

Data and Methods

Study Location

For this study, we focus on the state New South Wales (NSW) in Australia. NSW has a large population of around 8.1 million based on the census in September 2019 from Australian Bureau of Statistics.Footnote 1 The state’s capital city Sydney is Australia’s most populated city with a population of over 5.3 million people. The Local Government Areas (LGAs) of New South Wales are the third tier of government in the Australian state (the three tiers are federal, state, and local government). There are 128 LGAs in NSW. In this study, the Twitter data were collected for each LGA separately so that the sentiment dynamics can be analysed and compared among LGAs.

Measuring Community Sentiment

Sentiment analysis, also known as opinion mining or emotion AI, is a sub-field of Natural Language Processing (NLP) that tries to identify, extract, quantify, and study affective states and subjective information within a given text. We utilize the VADER (Valence Aware Dictionary for sEntiment Reasoning) [8] to analyse sentiments implied in tweets. VADER is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media.

In this study, VADER is used to compute the sentiment value of each tweet. The sentiment of all tweets in an LGA is then aggregated for each LGA. In this study, sentiment of each LGA is aggregated with the Eq. 13 to get the dominant mean sentiment of LGA, which is also called community sentiment:

$$\begin{aligned} S_{\text {p}}&= \frac{\sum _{i}^{n_{\text {p}}}s_{{\text {p}}}^{i}}{N}, \end{aligned}$$
(1)
$$\begin{aligned} S_{\text {g}}&= \frac{\sum _{i}^{n_{\text {g}}}s_{{\text {g}}}^{i}}{N}, \end{aligned}$$
(2)
$$\begin{aligned} S_{\text {f}}&= \left\{ \begin{matrix} S_{\text {p}}, &{} if \left| S_{\text {p}} \right| >= \left| S_{\text {g}} \right| \\ S_{\text {g}}, &{} if \left| S_{\text {p}} \right| < \left| S_{\text {g}} \right| , \end{matrix}\right. \end{aligned}$$
(3)

where \(s_{{\text {p}}}^{i}\) is the sentiment value of tweet i which has positive sentiment, \(n_{\text {p}}\) is the number of tweets which have positive sentiment, where \(s_{{\text {g}}}^{i}\) is the sentiment value of tweet i which has negative sentiment, \(n_{\text {g}}\) is the number of tweets which have negative sentiment, N is the number of all tweets in an LGA, and \(S_{\text {f}}\) is the final sentiment of an LGA.

Tables 1 and 2 show examples of tweets with positive and negative sentiment values computed with VADER.

Table 1 Examples of tweets with positive sentiment
Table 2 Examples of tweets with negative sentiment

Analytical Approach

In this study, various analytical approaches are applied to examine community sentiment dynamics before the outbreak of COVID-19, during the COVID-19 period, and post-period of the COVID-19. The community sentiment may be affected by various significant events or policies implemented by the government during the COVID-19 period such as the state lockdown or restrictions to movement.

The community sentiment of overall NSW state is first aggregated and analysed. We then split the community sentiment analysis for each LGA of NSW to see the sentiment difference among LGAs. The daily and weekly dynamics of these community sentiment are examined to find how various factors affect community sentiment.

Datasets

To analyse the dynamics of sentiment during the COVID-19 pandemic period in a fine-grained level, we collected tweets from Twitter users that live in the different LGAs of NSW in Australia. The time span of the collected tweets is from 1 January 2020 to 22 May 200 which covers dates that the first confirmed case of coronavirus was reported in NSW (22 January 2020) and the first time that the NSW premier announced the relaxing for the lockdown policy (10 Mary 2020). Table 3 shows the statistics of the collected tweet dataset. We totally collected 94,707,264 tweets with averagely 739,901 tweets for each LGA during the study period. Datasets of COVID-19 tests and confirmed cases in NSW were collected from DATA.NSW.Footnote 2

Table 3 Statistics of the collected Twitter dataset

Results

Figure 1a shows the overview of the number of tests and confirmed cases of COVID-19 in NSW over the COVID-19 period. It was found that there usually had test peaks at the beginning of each week and had less numbers at the weekend, which well aligns with the people’s living habits in Australia. It shows that the outbreak peak of COVID-19 in NSW was on 26 March 2020 and tests were significantly increased after 13 April 2020. It also shows that most of confirmed cases were originally related to overseas.

Fig. 1
figure 1

The COVID-19 spread and community sentiment in NSW

Sentiment Dynamics in NSW

Figure 1b presents the overall community sentiment dynamics in NSW during the study period. It shows that the public had a highly positive sentiment on the new year day. There was a significant decrease until 5 January. This was maybe because of the continued bushfire in NSW at that period. After that, the sentiment of the public was increased and then kept relatively stable until 8 March. From 8 March, the overall sentiment was decreased significantly and kept in low level continuously. This is well aligned with the overall trend of COVID-19 spreading in NSW as shown in Fig. 1a, which shows that the number of confirmed cases of COVID-19 was increased significantly from 8 March. Overall, the community showed a dominant positive sentiment during the study period despite the COVID-19 spread. This observation is also aligned with findings from other researchers who got the similar conclusions from one country or several countries [2, 3]. Furthermore, it shows that the COVID-19 spread did affect community sentiment and decreased the community sentiment during the COVID-19 pandemic, which can be clearly seen from Fig. 1a.

Sentiment Dynamics in LGAs

This subsection further analyses community sentiment dynamics in LGAs in NSW. Figure 2 shows an example of community sentiment dynamics in LGAs in NSW on different 2 days: 10 March 2020 and 16 March 2020. It shows that the community sentiment were different across different LGAs on each day. Furthermore, the community sentiment of each LGA was changed on different days. When we zoom in the sentiment map to LGAs around Sydney City areas as shown in Fig. 3, we can see more details of sentiment changes of LGAs around Sydney City areas on these 2 days. For example, Ryde is an LGA close to Sydney City. In an aged care centre in this LGA, a nurse and an 82-year-old elderly resident were tested positive for coronavirus at the beginning of March.Footnote 3 After that, a number of elderly residents in this aged care centre were tested positive for COVID-19 or even died. At the same time, a childcare centre and a hospital in this LGA have been reported positive COVID-19 cases in March. Many staff from the childcare centre and the hospital were asked to test virus and conduct home isolation for 14 days. All these maybe resulted in public sentiment changes significantly as shown in Fig. 3. For example, the sentiment in Ryde LGA was changed from high positive to high negative on 10 March 2020 and 16 March 2020.

Fig. 2
figure 2

The community sentiment map in LGAs in NSW on 10 March 2020 (top) and 16 March 2020 (bottom)

Fig. 3
figure 3

The community sentiment dynamics in selected LGAs around Sydney City areas on 10 March 2020 (left) and 16 March 2020 (right)

Figure 4 presents the community sentiment dynamics in Ryde LGA over the study period. It shows that the community experienced frequent negative sentiment in March, and recovered to positive dominantly in April. This is maybe because the COVID-19 spread situation in March as mentioned previously in this LGA and the spread was controlled in April, which helped people boost positive sentiment.

Fig. 4
figure 4

The community sentiment dynamics in Ryde LGA

Dynamics of Twitter Topics

We analysed hot topics in tweets labelled with hashtag. The sentiment of each tweet is got using the VADER tool as described previously. If the sentiment of a tweet is positive, the topic key words labelled with hashtag in this tweet are then labelled as positive sentiment. The topic key words with the negative sentiment are got with the similar approach. Different topic key words are then analysed statistically to understand the sentiment dynamics. Fig. 5 shows the dynamics of top 20 Twitter topics during the study period in NSW. Y-axis represents the count of different topic key words occurred in tweets. Positive numbers represent count of topic key words with positive sentiment, and negative numbers represent count of topic key words with negative sentiment. This is same for other figures from Figs. 6, 7, 8 and 9.

Fig. 5
figure 5

The dynamics of top 20 Twitter topics during the research period

Topic of COVID-19

Figure 5 shows that COVID is the dominant topic in the study period from the week of 2 March. When we drill down into details of this figure, we can see that the occurrence of the topic of COVID appeared from the week of 20 January, and began to increase significantly from the week of 2 March. It reached the peak in the week of 23 March. After that, the occurrence of the topic of COVID decreased gradually and kept stable from the week of 20 April. This trend is well aligned with the trend of COVID-19 situation in NSW as shown in Fig. 1a. Furthermore, Fig. 5 shows that the dominant sentiment around the topic of COVID was positive. This conclusion is aligned with findings in previous research [2, 3]. This is maybe because that people had a positive belief to the combat against COVID-19 by the government and the society.

Topic of Lockdown

Figure 6 shows the dynamics of Twitter topic of lockdown. From this figure, we can see that the topic of lockdown was started from the week of 9 March and reached to the peak point at the week of 23 March. On that week, NSW government officially announced the state lockdown on 30 March and the restrictions were begun from 31 March.Footnote 4 After that, this topic was kept stable until the week of 27 April. From the week of 4 May, there was a small peak of the topic occurrences. Maybe this was because that the NSW government announced the ease of restrictions on 10 May.Footnote 5 Overall, the community kept a positive sentiment dominantly to lockdown despite negative sentiment existing. This was maybe because that some people were not accustomed to the restrictions which caused negative sentiment.

Fig. 6
figure 6

The dynamics of Twitter topic of “lockdown”

Topic of Social-Distancing

Figure 7 presents the dynamics of Twitter topic of social-distancing. It shows that the topic of social-distancing appeared from the week of 9 March and significantly increased to a peak after a week. This is maybe because that the confirmed cases of COVID-19 were increased significantly and the government encouraged people to increase social-distancing, which caused a significant increase of public conversation on social-distancing on Twitter. After that week, the occurrence of the topic of social-distancing was gradually decreased until the week of 6 April and then kept stable relatively. Overall, the community showed the positive sentiment dominantly from the beginning of social-distancing until the week of 11 May. This is maybe because that the NSW government announced the ease of restrictions on 10 May as mentioned previously, which resulted in the negative sentiment to the social-distancing from the community.

Fig. 7
figure 7

The dynamics of Twitter topic of “Social-distancing”

Topic of Jobkeeper

The Australian Government introduced the Jobkeeper wage subsidy program at the end of March, allowing businesses to claim a fortnightly payment of $1500 per eligible employee from 30 March 2020, for a maximum of 6 months.Footnote 6 The starting time of jobkeeper topic was also clearly demonstrated in the dynamics of Twitter topic of jobkeeper as shown in Fig. 8. This figure shows that the public kept a positive sentiment dominantly to jobkeeper program from the beginning of its introduction until the week of 27 April. From the week of 4 May, people showed a negative sentiment dominantly. This is maybe because that people had received negative news on operations of jobkeeper program. For example, it was reported that the number of people on jobkeeper was revised down by 3 million due to the errors at the beginning of the program.Footnote 7

Fig. 8
figure 8

The dynamics of Twitter topic of “Jobkeeper”

Sentiment Dynamics Affected by Events

Community sentiment may also be affected by big events during the COVID-19 period. For example, the Ruby Princess Cruise docked in Sydney Harbour on 18 March 2020. Despite having passengers with COVID-19 symptoms, about 2700 passengers were allowed to disembark on 19 March without isolation or other measures, which was considered to create a coronavirus hotbed in Australia. The Ruby Princess Cruise has been linked to at least 662 confirmed cases and 22 deaths of COVID-19.Footnote 8

Figure 9 presents the community sentiment dynamics related to the Ruby Princess. It shows that the topic of the Ruby Princess appeared from the week of 16 March and was increased significantly after that. It reached a peak at the week of 30 March and kept high occurrences until the beginning of May. This trend is well aligned with the actual timeline of the public reporting of confirmed cases and deaths as well as other events (e.g. police in NSW announced a criminal investigation into the Ruby Princess cruise ship debacle on 5 April) related to the Ruby Princess\(^{8}\). Overall, the community showed a dominant negative sentiment to Ruby Princess throughout the event period.

Fig. 9
figure 9

The dynamics of Twitter topic of “Rubyprincess”

Conclusions and Future Work

This study conducted a comprehensive analysis of the sentiment dynamics in the state of New South Wales (NSW) in Australia due to the COVID-19 pandemic. Instead of the country-level study, the sentiment in this work was analysed at local government areas (LGAs) level based on more than 94 million tweets collected from Twitter for a 5-month period started from 1 January 2020. The results of the dynamical sentiment analysis showed that the overall sentimental polarity was positive in NSW and the positive sentiment was decreased from the beginning of March to May 2020 due to the significant increase of COVID-19 confirmed cases from 8 March and the further carried out lockdown policies. When we drilled down into LGAs, it was found that different LGAs showed different sentiment polarity scores during the study time, and each LGA may have different sentiment polarity scores on different days. Despite the dominant positive sentiment most of days during the study period, some LGAs experienced significant sentiment dynamics maybe because of the serious COVID-19 infections in those LGAs. This study also analysed the sentimental dynamics delivered by the hot topics such as government policies (e.g. the Australia JobKeeper program) and the focused social events (e.g. the Ruby Princess Cruise). The results showed that the implemented policies or occurred events affected people’s overall sentiment differently, and at the different stage, people showed different sentiment.

This paper presented a case study of the sentiment dynamics due to COVID-19. More interesting topics can be explored based on the current study in the future. For example, the topic modelling techniques can be applied to conduct more comprehensive analysis to get people’s topic level sentiments. The individual level sentiment dynamics can also be analysed to help government and local communities to locate the people that may suffer from the negative sentiments. Besides the sentiment polarity, other mental health issues due to COVID-19 such as depression and anxiety will also be analysed in Twitter in our future work.