1 Introduction

The COVID-19 pandemic has become the most significant challenge humanity has faced since World War 2 (WW2). COVID-19 is reported to have caused more deaths in the United States of America (USA) than both the attack on Pearl Harbor and the September 11 terror attacks (Haltiwenger, 2020). COVID-19 is highly infectious and mutates rapidly into different varieties, with six strains of active coronaviruses reportedly widespread worldwide. By late July 2020, it had infected more than 17 million people worldwide, whereas in early March 2020 the total number of infected cases had not yet reached 100,000 (WHO, 2020). The rapid spread of this global pandemic and the resulting mortality have led to it being identified as one of the deadliest pandemics of the last two centuries.

The outbreak of the novel virus was abrupt and rapid. Governments worldwide were forced to develop and implement preventive measures such as strict social distancing policies. These measures are known as a ‘lockdown’ in some countries and as a government-enforced curfew or emergency in others (Lin et al., 2020). They required individuals to remain in their homes and to go out only in exceptional circumstances, such as shopping for essential items like groceries. The abrupt change in daily life has had a multifold, compromising effect on critical sectors of society, including the financial, health, social, and environmental sectors. Each of these changes has had ripple effects on the mental wellbeing of individuals. Along with medical practitioners, technology has emerged as a strong pillar of support for managing the pandemic. Consequently, immense research efforts and technology developments are under way to improve COVID-19 symptom detection (Wang et al., 2020; Ai et al., 2020; Wang et al., 2020; Sheng et al., 2020; Abdel-Basset et al., 2020a), to understand the infection curve (Hu et al., 2020; Abdel-Basset et al., 2020b; Kalsi et al., 2018), and to enhance infrastructure management (Reeves et al., 2020; Keesara et al., 2020).

The pandemic has also prompted research into its impact on the sentiments of the public and healthcare staff (Rajkumar, 2020; Caleo et al., 2018). Evidence suggests that anxiety, depression, and stress are common and expected reactions to the COVID-19 pandemic (Rajkumar, 2020). Studies of the effect of similar pandemics on the population indicate that the factors that contributed most to reducing the psychological impact of isolation at home were the receipt of clear and consistent information (Caleo et al., 2018; Cava et al., 2005; Lau et al., 2010), well-defined explanations of the need for social isolation (DiGiovanni et al., 2004), social, moral, and economic support for those seeking it, and the absence of new contagions (Desclaux et al., 2017). Additionally, providing information to the wider population is viewed as important because it reduces their perception of risk from an epidemic (Rolison & Hanoch, 2015; Singh et al., 2021). A unique aspect of this pandemic is the availability and accessibility of Online Social Networks (OSNs) such as Facebook, Twitter, and Instagram. OSNs have been viewed as a communication channel that can be both positive and negative. They are positive when they share correct, valid, and important information (Gao et al., 2020). Conversely, they are negative when they act as sources of misinformation and fake news (Park et al., 2020).

Research suggests that understanding the positive, negative, and neutral sentiments of a particular cluster of people or locations from sources such as OSNs is not enough to grasp the overall picture of society’s mindset during a global pandemic (Fenwick et al., 2020). As COVID-19 is a worldwide phenomenon, global data should be used, where possible, to examine significant emotional variations in individuals (ibid). Doing so can provide much-needed top-level insights. The study of lockdowns can also be a management studies issue, as the pandemic impacts businesses and individuals alike. Management research has a long tradition of comparing countries, as this allows commonalities and differences in dominant national management ‘paradigms’ or ‘recipes’ to be identified and understood (Parry et al., 2020; Walker et al., 2014). A major challenge when performing such a comparison manually is processing the substantial amounts of big data generated. In such instances, Artificial Intelligence (AI) techniques such as Machine Learning (ML) and Deep Learning (DL) can be used for analysis and understanding, and they are proving to be very beneficial.

Twitter was selected as the OSN platform from which the data for this study were collated. It is among the most prominently used microblogging platforms for sharing thoughts and is one of the important data sources for researchers analyzing public emotions and sentiments (Giachanou & Crestani, 2016). It also offers timeline-based conversations (in a question and multiple answer format), which make it easier to analyze the context of discussions and to identify sentiments and emotions. To ensure that the correct selection was made, Facebook was also considered, as it offers conversations in the form of posts shared either publicly or privately. Facebook also supports replies in the form of likes and comments, but these attributes hinder data collation and cause logical problems in data analysis. In terms of privacy and ease of data collection, Twitter has fewer privacy restrictions, and its Application Programming Interface (API) offers easier data downloading than Facebook’s API. A drawback of Twitter and similar OSN platforms is the accuracy of the data. Researchers have found that this problem can be reduced by ensuring that tweets are pre-processed, sorted, and cleansed before they are used (Liu & Shi, 2019), a strategy that we adopted.

A literature review of the pandemic’s outcomes and of the measures taken to prevent and cure the disease found that research on the pandemic and preventive measures such as the lockdown is rare, and that global studies are rarer still. In terms of emotions and AI, sentiment analysis is the closest technique used in research. Therefore, this research team searched for studies of sentiment analysis or text-based emotion analysis associated with the pandemic and lockdown measures. The search revealed very few studies of this nature, which motivated the team to address the gap by forming the following aim: to identify, explore, and understand globally the emotions expressed during the earlier months of the COVID-19 pandemic by utilizing Deep Learning and Natural Language Processing (NLP). To fulfil this aim, the authors employed the text-based Emotion Analysis (EA) technique, which is a rarity in the AI and analytics arena. Emotion detection and recognition from text is a recent field of research that is closely related to sentiment analysis. Sentiment analysis aims to detect positive, neutral, or negative feelings in a text, whereas emotion analysis aims to detect and recognize the types of feelings expressed in a text, such as anger, disgust, fear, happiness, sadness, and surprise. Emotion detection may have useful applications, such as gauging the happiness of citizens or understanding the perceptions of consumers (Yean, 2015).

By pursuing the stated aim, the intention is to offer an insight into the emotions of individuals around the globe during this challenging era. Sentiment analysis has recently been applied to Twitter data via different methodologies, such as lexicon-based (Zhang et al., 2011), emoticon-based (Liu et al., 2012; Go et al., 2009), machine learning-based (Neethu & Rajasree, 2013; Kaur et al., 2021; Mendon et al., 2021), and deep learning-based (Dos Santos & Gatti, 2014) approaches. To analyze multiple emotions in generalized tweets at a finer level, this paper proposes using RoBERTa (Liu et al., 2019), a state-of-the-art pre-trained NLP language model developed by Facebook. RoBERTa provided improved results in text emotion analysis compared to existing pre-trained models such as BERT and DistilBERT (Sanh et al., 2019), as reported in some recent research studies (Delobelle et al., 2020; Møller et al., 2020). Studies using text-based emotion analysis are rare within this challenging pandemic era, so applying such a model in this area is the novel contribution that we offer to AI data analytics.

For academia, this study’s benefit lies in applying DL and the state-of-the-art NLP model RoBERTa to examine and understand public reactions during the pandemic. ML and DL are innovations that are presently of immense interest; thus, this study offers insights into their application during the pandemic. Academic studies of emotion analysis are scarce, those concerning the pandemic are scarcer, and those utilizing AI to understand it are scarcer still; therefore, this study offers an understanding of emotional wellbeing from a global perspective that is missing from the literature. For industry, the benefit of this study is deep insight into emotional wellbeing issues that an organization’s workforce could be facing without the organization being aware of them. For policymakers, the results of this study reflect the impact of policy implementations and their acceptance by the public.

The remainder of this paper is organized as follows. Section 2 provides an overview of earlier sentiment analysis and COVID-19 research. Section 3 briefly describes the research methodology and explains the process used to generate the dataset and the details of its attributes. Section 4 explains the analysis of this study, and Section 5 presents the findings from applying text-based emotion analysis to the pandemic dataset. Section 6 discusses the findings and the implications of this study. Section 7 closes the paper with the conclusion, limitations, and future directions of this study.

2 Related Work

2.1 Preventive Measures and COVID-19

Social distancing is a relatively old concept; however, its enactment has varied with the changing times. Early studies of social distancing as a preventive measure found that it is a multi-faceted intervention, where facets or stages unfold as the pandemic impacts society (Kwon et al., 2020). These facets may change in subsequent studies as deeper insights into COVID-19 evolve. Using the keywords ‘social distancing, Covid 19, and Twitter’, several studies related to this one were found. For example, Kwon et al. (2020) identified the following facets of social distancing. (1) Purpose and justification: social distancing is a disruptive nationwide behavioral measure that is being used extensively to bring the pandemic to levels that healthcare systems can manage. (2) Implementation: social distancing is used not only to avoid mass gatherings but also to maintain a 6-ft distance between individuals; governments also closed non-essential businesses, restaurants and, in the earlier phase of COVID-19, schools, colleges, and universities. (3) Social activity disruptions: travel restrictions are imposed and less face-to-face human interaction is emphasized. (4) Adaptation: individuals accept a new way of life and conduct daily activities virtually, such as online schooling, working remotely through teleconferencing, online food shopping, telehealth-based visits, and online entertainment through platforms such as Netflix. (5) Positive emotions and (6) negative emotions: these facets are associated with the emotional response to social distancing. These facets could potentially measure the levels of distress that build up over time due to the disruption of social behaviors and activities that are usually associated with mental and emotional wellbeing.

Studies of COVID-19, social distancing, and Twitter are few. A similar study is that of Saleh et al. (2020), who used English-only tweets posted between March 27 and April 10, 2020, that matched two trending social distancing hashtags, #socialdistancing and #stayathome. The tweets were analyzed using NLP and ML models, and sentiment analysis was employed to identify emotions and polarity. From a sample of 574,903 tweets, the study identified positive, negative, and objective polarity. Approximately half (50.4%) of the tweets primarily expressed joy, and one-fifth (20%) expressed fear and surprise. Fenwick et al. (2020) found that, initially and contrary to the view that COVID-19 has led to dangerous misinformation that needs to be regulated more strictly, social media and Twitter triggered a more effective policy response based around social distancing, lockdown, and containment. Ahmed et al. (2020) evaluated the #FilmYourHospital conspiracy theory on Twitter and attempted to understand the drivers behind it. Twitter data related to the #FilmYourHospital hashtag were retrieved and analyzed using social network analysis across a 7-day period from April 13–20, 2020. The dataset consisted of 22,785 tweets and 11,333 Twitter users. The Botometer tool was used to identify accounts with a higher probability of being bots. The most important drivers of the conspiracy theory were ordinary citizens; one of the most influential accounts was a Brexit supporter. YouTube was the information source most linked to by users. The most retweeted post belonged to a verified Twitter user, indicating that the user may have had more influence on the platform. There were a small number of automated accounts (bots) and deleted accounts within the network.

Doogan et al. (2020) identified tweets about COVID-19 Non-Pharmaceutical Interventions (NPIs) in six countries and compared the trends in public perceptions of and attitudes towards NPIs across these countries. They aimed to identify factors that influenced public perceptions of and attitudes towards NPI regimes during the early phases of the COVID-19 pandemic. The team analyzed 777,869 English-language tweets about COVID-19 NPIs in six countries: Australia, Canada, New Zealand, Ireland, the United Kingdom (UK), and the United States of America (USA). The relationship between tweet frequencies and case numbers was assessed using a Pearson correlation analysis. Topic modeling was used to isolate tweets about NPIs, and a comparative analysis of NPIs between countries was conducted. The New Zealand dataset displayed the greatest attention to NPIs, and the USA dataset the lowest. Topic modeling produced 131 topics relating to one of 22 NPIs, grouped into seven NPI categories: Personal Protection (n = 15), Social Distancing (n = 9), Testing and Tracing (n = 10), Gathering Restrictions (n = 18), Lockdown (n = 42), Travel Restrictions (n = 14), and Workplace Closures (n = 23). While less restrictive NPIs gained widespread support, more restrictive NPIs were perceived differently between countries. Four characteristics of these regimes were seen to influence public adherence to NPIs: timeliness of implementation, NPI campaign strategies, inconsistent information, and enforcement strategies.

Wicke and Bolognesi (2020) analyzed the discourse around #Covid-19 in a large number of tweets posted on Twitter during March and April 2020. They used topic modeling to identify the topics into which the discourse could be classified. They found that the WAR framing was used to refer to specific topics, such as the virus treatment, but not others, such as the effects of social distancing on the population. The WAR frame was then measured and compared to three alternative figurative frames (MONSTER, STORM, and TSUNAMI) and a literal frame used as a control (FAMILY). The results revealed that while the FAMILY frame covers a broader portion of the corpus, among the figurative frames, WAR, a highly conventional one, is the frame used most frequently. Yet, this frame does not seem apt for elaborating the discourse around some aspects of the current situation. Therefore, it was concluded that a plethora of framing options, or a metaphor menu, might facilitate the communication of the various aspects involved in COVID-19-related discourse on social media, thereby supporting individuals in expressing their feelings, opinions, and beliefs during the current pandemic.

Having identified related studies of COVID-19, social distancing, and Twitter, we present in the next sub-section other studies that utilized sentiment and emotion analysis. By considering these studies, we identified the contribution of this study, which is applying deep learning to text-based emotion analysis.

2.2 Sentiment and Emotion Analysis Studies

With the advent of digital technologies and the availability of sizeable amounts of online data about individuals, researchers have begun to study human thought processes and sentiments to enhance the consumption of technology-based services (Anderson, 2012; Kolekar et al., 2016). Such research has applications in various fields, such as affective computing, information sciences, psychology, and marketing management. However, automating the process of emotion detection with good accuracy is generally still challenging. The two primary sources used for emotion detection are text and facial expressions or speech. Since the pandemic is still spreading and more data is emerging, we could not pursue a fully-fledged study. We also wanted to ensure that text-based emotion analysis could be applied to emerging and novel data; therefore, we adopted an exploratory stance. Social media platforms are among the essential data sources for text-based emotion detection. For this study, we employed tweets, as Twitter is the most sought-after platform due to its opinion-sharing structure. Since 2008, several researchers have presented insights into various techniques for text-based sentiment analysis. These studies used keywords, lexicons, emoticons, deep learning algorithms, and ensemble models (Agrawal & An, 2012; Kaur & Gupta, 2013; Yadollahi et al., 2017; Medhat et al., 2014; Zhang et al., 2018). The research projects considered varied text inputs, such as tweets, Facebook comments, product reviews, blogs, and post texts. Depending on these inputs, various sentiment analysis techniques have been applied to detect emotions, opinions, views, sarcasm, and sentiments.

Binali et al. (2010) described keyword-based sentiment analysis as a technique that finds a correlation between the arrangements of words in a given text to understand the emotion depicted. Al-Ayyoub et al. (2015) proposed using an unsupervised lexicon-based methodology to analyze the sentiment polarity of user feedback and reviews on specific events, which they developed into a tool. Researchers (Wood & Ruder, 2016; Purver & Battersby, 2012; Suttles & Ide, 2013; Dhaoui et al., 2017; Vashishtha & Susan, 2019) have also proposed hashtags, emoticons, and emoji as a very effective basis for supervised sentiment learning from social media text. They showed that, used along with keywords or lexicons, these add-ons helped to increase classification performance.

Most of the proposed techniques for Twitter-based sentiment analysis have employed AI classifiers that are trained using different tweet features. Classifiers such as Support Vector Machines (SVM), Logistic Regression (LR), Random Forest (RF), Naïve Bayes (NB), and Conditional Random Field (CRF) have been used extensively, together with unigram, bigram, and n-gram feature sets (Davidov et al., 2010; Stojanovski, 2015). Most sentiment analysis studies use Machine Learning algorithms, as they work well with a labeled dataset; however, such a dataset is difficult to generate with manual annotations every time a new emotion detection domain is considered. Additionally, with the large datasets available, performing a detailed analysis that learns from multiple layers of data representations is more difficult with ML algorithms than with DL algorithms, which was a further reason for employing DL. A survey study revealed that sentiment analysis can be performed at the document level, the sentence level, and the aspect level (Zhang et al., 2018). Document-level analysis helps to identify the opinions of individuals, such as a review of a service or a product. Sentence-level analysis helps to identify the positive, negative, or neutral sentiment depicted by each sentence of the given document. Aspect-based sentiment analysis considers the given text from the perspective of the entities it describes and the feedback on them. For efficient sentiment analysis at a fine-grained level, multiple researchers have proposed DL techniques, which led this team to consider using a DL technique too. Within DL, various supervised architectures such as CNN, RNN, and bi-directional LSTM have been employed (Severyn & Moschitti, 2015; Araque et al., 2017; Sohangir et al., 2018). Going beyond sentiments, Bollen et al. (2011) recommended analyzing the “public mood” via Twitter data. This mood analysis included the classification of six emotional states: happy, alert, sure, vital, kind, and calm. Subsequently, immense research has been conducted in this domain, in which six emotions were identified and classified from Twitter data: happiness, anger, fear, anxiety, sadness, and joy. Table 1 summarizes the techniques and identified emotions found in the literature.

Table 1 Comparative analysis of emotions analysis classifying techniques

Table 1 identifies multiple techniques that have been employed to determine a combination of emotions from social media data. During the COVID-19 pandemic, several research contributions have been made in the last few months to detect misinformation about COVID-19 preventions and cures, to classify the red zone areas where the pandemic was prevalent, to track human mobility, to optimize resource utilization, or to develop vaccines (Choudrie et al., 2020). However, the emotional wellbeing of professionals and individuals in society has been studied only minimally. Since this study sought an understanding at a deep and fine-grained level, a DL technique, rather than an ML technique, was viewed as most suitable. Table 2 shows some of the significant recently published studies that relate to COVID-19 and emotion analysis.

Table 2 Published emotion analysis studies related to COVID-19

The most relevant paper for this proposed work was “TwitterBERT: Framework for Twitter Sentiment Analysis Based on Pre-trained Language Model Representations,” published by Azzouza et al. (2019). That paper presented the idea of using the pre-trained BERT model to produce sentence representations for sentiment analysis. The model’s accuracy was tested by implementing various classification algorithms, such as CNN and LSTM, and it achieved an F1 score of 71.82%. Azzouza et al. trained their model with BERT to classify sentiments, which inspired the authors of this study to implement a similar idea for emotion detection from tweets. Having covered the theoretical aspects of this study, the next section explains how the dataset used for this study was created.

3 Research Methodology and Dataset Creation

The purpose of the proposed research was to develop a deep learning model that automatically identifies the emotional tone of a tweet, in order to discover the variations in emotion among the masses. No labeled COVID-19-related emotion dataset was available to train such a model. Therefore, the concept of transfer learning was employed: the model was trained with a readily available generalized emotion dataset and then tested on the curated COVID-19 tweet dataset. The details of the datasets used for training and testing are given below.

3.1 Emotion Dataset

For the emotion analysis, an emotion-labeled dataset was needed to train and test our emotion analyzer. The dataset “Emotion in Text,” published by CrowdFlower, was readily available and was used to train the AI model. “Emotion in Text” is currently one of the largest open-source emotion datasets available. Being open source, it is accessible to everyone without payment and is therefore much easier to access than a paid dataset. The dataset contained 39,740 tweets categorized into 13 categories: ‘anger’, ‘boredom’, ‘empty’, ‘enthusiasm’, ‘fun’, ‘happiness’, ‘hate’, ‘love’, ‘neutral’, ‘relief’, ‘sadness’, ‘surprise’ and ‘worry’. The dataset was imbalanced, with many tweets belonging to the ‘neutral’, ‘anger’, and ‘worry’ categories. Thus, of the 39,740 tweets, 3250 were selected such that each category had exactly 250 tweets. Tweets were chosen only if they had a complete sentence structure and were not just a random collection of words.
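The class-balancing step described above can be sketched as follows. This is only an illustration under stated assumptions: the paper selected tweets by inspecting their sentence structure, whereas the sketch substitutes seeded random sampling, and the function name `balanced_sample` and the toy rows are hypothetical.

```python
import random
from collections import defaultdict

def balanced_sample(rows, per_class, seed=0):
    """Return exactly `per_class` (text, label) rows for every label.

    Random sampling is an assumption made for this sketch; the paper
    describes a manual selection of well-formed tweets instead.
    """
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # seeded so the sample is reproducible
    sample = []
    for label in sorted(by_label):
        items = by_label[label]
        if len(items) < per_class:
            raise ValueError(f"label {label!r} has only {len(items)} rows")
        sample.extend(rng.sample(items, per_class))
    return sample

# Toy, deliberately imbalanced data (hypothetical, not the CrowdFlower data).
sizes = {"neutral": 40, "worry": 30, "relief": 8}
rows = [(f"example {i} for {label}", label)
        for label, n in sizes.items() for i in range(n)]

sample = balanced_sample(rows, per_class=5)
```

With 13 categories and `per_class=250`, the same procedure would yield the 3250 tweets reported in the text.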

Along with this, it was ensured that each tweet portrayed the emotion with which it was tagged. Since users’ tweets during the pandemic were being categorized, the important emotion of depression was added to the dataset. For this data collection, reference was made to Reddit, a social discussion website, where a subreddit is a topic-specific community. User posts from the subreddit ‘r/depression’ were collected. This is a community where users watch out for one another and offer support to anyone suffering from depression. Several posts were read, and a total of 250 user posts that portrayed the ‘depressed’ emotion were collected. The posts had a maximum of 280 characters, which is also a tweet’s character limit. The sampled CrowdFlower dataset and the selected Reddit posts were then combined to provide a textual dataset containing tagged emotions. This completed our emotion dataset of 3500 user posts, each labeled with one of 14 emotions. The dataset was used to create our NLP model, which classified tweets according to their emotions. All this was possible due to the power of transfer learning with state-of-the-art NLP language models, whereby a model acquires knowledge from related tasks and data and performs very well on the target task. This data analytics process was also the novel contribution to AI and machine learning that this study provided.
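As a rough illustration of how the two sources could be merged, the sketch below keeps only posts that fit within the 280-character limit and appends them to the sampled tweets under a fourteenth label. The function name `combine_datasets`, the label string `"depression"`, and the synthetic rows are assumptions for illustration, not the authors' code or data.

```python
MAX_TWEET_CHARS = 280  # tweet character limit stated in the text

def combine_datasets(tweet_rows, reddit_posts, extra_label="depression"):
    """Append length-filtered Reddit posts to the labeled tweets.

    `extra_label` is a hypothetical name for the added emotion class.
    """
    short_posts = [(post, extra_label)
                   for post in reddit_posts
                   if len(post) <= MAX_TWEET_CHARS]
    return list(tweet_rows) + short_posts

# Synthetic stand-ins: 13 classes x 250 tweets, plus 250 short posts
# and one over-long post that the length filter should drop.
labels = [f"emotion_{i}" for i in range(13)]
tweet_rows = [(f"tweet {j}", lab) for lab in labels for j in range(250)]
reddit_posts = [f"post {j}" for j in range(250)] + ["x" * 300]

dataset = combine_datasets(tweet_rows, reddit_posts)
n_labels = len({label for _, label in dataset})
```

On these stand-ins the result has 3500 rows across 14 labels, matching the sizes reported above.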

3.2 COVID-19 Twitter Dataset and Analysis

Twitter data was collected using Twitter’s API and Tweepy, a Python library. Tweepy helps in creating Twitter bots, which are small automated programs that help with fetching tweets from the Twitter API. Tweepy also uses the OAuth authentication interface, which authorizes users when they seek to download tweets from Twitter. Next, the keywords for downloading the tweets were selected. These were determined from diverse news websites and organizations such as BBC.co.uk, CNN.com, the WHO, the NHS, and other healthcare sector organizations. The team ensured that authentic news websites and other recognized organizations were used for this step. The final set of keywords used to collect the tweets was: #Coronavirus, #Corona, #Wuhan, #COVID, #Social Distancing, #Pandemic, #Lockdown, #Epidemic.

To identify the periods of the COVID-19 pandemic that the study should cover, reference was made to different web sources, of which Wikipedia offered the most in-depth information. This led to the formation of Table 3, which illustrates the growth (or otherwise) of the pandemic. An added reason for selecting the periods was that Twitter data revealed a sudden rise in the number of daily coronavirus-related tweets in the first five months. This suggested to the team that individuals were still keen to monitor events and to express their opinions about them.

Table 3 The selected countries for this study. Source: Wikipedia. Available at: https://en.wikipedia.org/wiki/COVID-19_pandemic_cases

Table 3 reveals that awareness of the nature of COVID-19 largely commenced in January/February 2020, and in many of these countries, the volume of tweets increased as the number of COVID-19 cases grew. The countries for this study were selected based on their numbers of internet users, as shown in Table 4.

Table 4 Percentage of the population covered by a mobile cellular network. Source: https://www.itu.int/net4/ITU-D/icteye/#/topics/2001

Following the selection of the countries, data from the Twitter API was collated. It was known beforehand that the API limits requests for Twitter data: tweets are collated in 15-min windows, and with a free developer account each window allows a maximum of 180 requests. Within these limitations, 2 million tweets were collated at regular intervals from February to June 2020. A convenience sampling approach was used, which involved selecting tweets that were written in English and contained the selected keywords, for input to the NLP model. Tweets with geo-tags from the list of countries selected for this study were also incorporated.
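To make the effect of the rate limit concrete, the following sketch works through the arithmetic. The window length and request cap come from the text; the figure of 100 tweets per request is an assumption about the page size of the search endpoint and is not stated in the paper.

```python
WINDOW_MINUTES = 15          # from the text
REQUESTS_PER_WINDOW = 180    # from the text (free developer account)
TWEETS_PER_REQUEST = 100     # assumption: maximum page size per request

def seconds_between_requests():
    """Spacing that spreads the allowed requests evenly over one window."""
    return WINDOW_MINUTES * 60 / REQUESTS_PER_WINDOW

def max_tweets(minutes):
    """Upper bound on tweets retrievable in `minutes` of continuous collection."""
    full_windows = minutes // WINDOW_MINUTES
    return full_windows * REQUESTS_PER_WINDOW * TWEETS_PER_REQUEST

pause = seconds_between_requests()   # one request every 5 seconds
per_window = max_tweets(15)          # ceiling for a single window
per_day = max_tweets(24 * 60)        # ceiling for 96 consecutive windows
```

Under these assumptions the ceiling is 18,000 tweets per window. In practice the yield is lower, because requests can return fewer tweets than the page size and many returned tweets are later filtered out.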

Twitter’s tweets were considered because Twitter shares individuals’ thoughts in short text messages known as tweets. Compared to other OSNs such as Facebook or Instagram, Twitter carries fewer images, less video data, and fewer indirect thought-sharing methods such as likes or comments, which makes its content easier to read and understand, particularly when analyzing the data. Further, Twitter’s conversations are timeline-based (in a question and multiple answer format), which makes it easier to analyze the context of discussions and identify the sentiments towards them.

Once the tweets were downloaded, the exploratory data analysis phase was pursued. Initially, tweets were reviewed for suitability and accuracy, and were discarded if they contained only random words, incomplete sentences, or just two or three words. The team discarded such tweets because they may not offer an in-depth view of the emotions expressed, thereby leading to flawed results. This led to the overall selection criterion that tweets containing fewer than five words were removed to ensure uniformity and clarity, which significantly reduced the number of collected tweets. Retweets were also removed to avoid duplication. This left approximately 1.5 million tweets for February, March, April, May, and June 2020. The tweets were then further processed to remove all HTML text, ‘@’ mentions, URL links, and #hashtags. Emoticons were also considered at this stage. In Informal Text Communication (ITC), emoticons can express complex emotions such as sarcasm, irony, or non-textual humor by simulating facial cues and going beyond the text (Kelly, 2015; Wolny, 2016). For example, an emoticon placed at the end of a text that expresses the exact opposite emotion of the text allows users to convey sarcasm or irony. Such sarcasm is topic-dependent and contextual, and an algorithm needs additional information to classify it correctly (Poria et al., 2016). Thus, analyzing a tweet containing emoticons for its genuine emotion was extremely difficult (Asghar et al., 2018; Yang et al., 2019). Consequently, emoticons were removed entirely from all the tweets during the pre-processing phase. Following these steps, datasets consisting of clean, pre-processed tweets with the metadata required for this research aim were ready to be analyzed, as discussed in the next section.
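A minimal sketch of the cleaning steps listed above is given below, using only Python's standard library. The regular expressions, the `RT @` retweet check, and the function names are assumptions chosen for illustration; the paper does not specify how each step was implemented, and the emoticon pattern covers only a few common ASCII forms.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")                    # leftover HTML tags
URL_RE = re.compile(r"https?://\S+|www\.\S+")      # URL links
MENTION_RE = re.compile(r"@\w+")                   # '@' mentions
HASHTAG_RE = re.compile(r"#\w+")                   # hashtags
EMOTICON_RE = re.compile(r"[:;=][\-o\*']?[\)\]\(\[dDpP/\\|]")  # e.g. :) ;-( =D
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")   # common emoji ranges

def clean_tweet(text):
    """Strip HTML, URLs, mentions, hashtags, emoticons, and emoji."""
    text = html.unescape(text)  # turn entities such as &amp; back into characters
    # URLs are removed before emoticons so that '://' is never read as an emoticon.
    for pattern in (TAG_RE, URL_RE, MENTION_RE, HASHTAG_RE, EMOTICON_RE, EMOJI_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.split())  # collapse the whitespace left behind

def preprocess(tweets, min_words=5):
    """Drop retweets, clean the rest, and keep tweets with at least `min_words` words."""
    kept = []
    for raw in tweets:
        if raw.startswith("RT @"):  # assumed retweet convention
            continue
        cleaned = clean_tweet(raw)
        if len(cleaned.split()) >= min_words:
            kept.append(cleaned)
    return kept

sample = [
    "RT @who: Wash your hands often to stay safe",
    "Staying home with <b>family</b> today :) https://t.co/abc #lockdown @friend",
    "So bored #COVID",
    "I am really worried about my grandparents during this lockdown &amp; beyond \U0001F61F",
]
cleaned = preprocess(sample)
```

In this toy run, the retweet and the two-word tweet are dropped and the remaining two tweets are kept in cleaned form. Whether the five-word threshold is applied before or after cleaning is not stated in the text; the sketch applies it afterwards.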

4 Analysis: The Classification Method

Figure 1 shows the overall analytical process of this study. Text-based multi-class emotion analysis is an analytical method that has been used for various data mining purposes. To contribute to the deep learning arena, the authors propose a deep learning-based implementation framework that involves preprocessing the collected tweets, fine-tuning the pre-trained RoBERTa model on an emotion dataset through transfer learning, and then identifying the text-based emotions in the self-created Covid 19 dataset. The novelty and contribution of this study therefore lie in applying the RoBERTa model with transfer learning to multi-class emotion classification of Covid 19 data, together with visualizations of the results.

Fig. 1 Emotion classification implementation pipeline

Initially, the transfer learning technique was implemented to train the model. This involved training the model on an available, similar dataset and then applying it to a separately curated dataset. The similar dataset used was the standard emotion dataset “Emotion in Text” (Mohammad & Kiritchenko, 2018). Once it was confirmed that the model was applicable, it was used to analyze users’ tweets in the Twitter Covid 19 dataset. Four deep learning classifiers were considered for the multi-class classifier: LSTM, Bi-directional LSTM, Google’s BERT, and Facebook’s RoBERTa. RoBERTa was chosen as it performed well compared with the other three classifiers. It was also found that research studies using this classifier were scant. Using RoBERTa, eight emotion classes were identified: anger, depression, enthusiasm, hate, relief, sadness, surprise, and worry. The classified tweets were then grouped by month, by emotion, and by country. The details of the dataset and classification method are presented in the following section.
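The grouping of classified tweets by month, emotion, and country can be sketched as follows. The record fields (`month`, `country`, `emotion`) and the per-country user counts are hypothetical; the normalization to tweets per million users follows the unit used in the figure captions of this paper, but the exact computation used by the team is an assumption.

```python
from collections import Counter

# The eight emotion classes named in the text.
EMOTIONS = ("anger", "depression", "enthusiasm", "hate",
            "relief", "sadness", "surprise", "worry")


def count_by_month_country_emotion(records):
    """Count classified tweets per (month, country, emotion) triple."""
    counts = Counter()
    for rec in records:
        if rec["emotion"] not in EMOTIONS:
            raise ValueError("unknown emotion: " + rec["emotion"])
        counts[(rec["month"], rec["country"], rec["emotion"])] += 1
    return counts


def per_million_users(counts, users_by_country):
    """Normalize counts to tweets per million users of each country.

    users_by_country is a hypothetical mapping from country to its
    number of Twitter users; the source of such numbers is not stated.
    """
    return {
        key: n * 1_000_000 / users_by_country[key[1]]
        for key, n in counts.items()
    }
```

Keeping the three keys together in one counter means the month-wide, emotion-wide, and country-wide views can all be derived from the same table by summing over the keys that are not of interest.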

Determining emotions from a piece of text can be achieved using multi-class text classification. For this study, transfer learning was utilized to build a classifier that detected emotions from tweets. Transfer learning is a technique that starts from a deep learning model pre-trained on a very large dataset, typically a massive amount of unlabeled text such as Wikipedia. The pre-trained model is then fine-tuned on a small dataset to perform a specific task. For this study, RoBERTa, a pre-trained model developed by Facebook AI based on Google’s Bidirectional Encoder Representations from Transformers (BERT), was used. BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of NLP tasks, including text classification. BERT was pre-trained using the MaskedLM and Next Sentence Prediction objectives (Devlin et al., 2018) on BooksCorpus and English Wikipedia. Facebook retrained BERT with a few modifications: training the model for a longer period and with more data, removing the next sentence prediction objective, training on longer sequences, and dynamically changing the masking pattern applied to the training data. Additionally, RoBERTa was trained using the CommonCrawl News dataset and OpenWebText, an open-source corpus of web text extracted from URLs shared on Reddit with at least 3 upvotes (Liu et al., 2019).
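To illustrate what dynamically changing the masking pattern means, the following is a simplified sketch rather than RoBERTa's actual implementation. It assumes the 15% masking rate reported for BERT and RoBERTa, replaces every selected token with a mask token (the original scheme also leaves some selected tokens unchanged or swaps them for random tokens), and uses a hypothetical function name.

```python
import random

MASK_TOKEN = "<mask>"
MASK_PROB = 0.15  # masking rate reported for BERT and RoBERTa


def dynamic_mask(tokens, rng, mask_prob=MASK_PROB):
    """Return a freshly masked copy of tokens and the masked positions.

    Calling this every time a sequence is fed to the model gives dynamic
    masking; static masking would call it once and reuse the result.
    """
    if not tokens:
        return [], []
    n_mask = max(1, round(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK_TOKEN if i in positions else tok
              for i, tok in enumerate(tokens)]
    return masked, sorted(positions)


if __name__ == "__main__":
    sentence = "stay safe and keep washing your hands every single day".split()
    rng = random.Random(0)
    for epoch in range(3):
        masked, _ = dynamic_mask(sentence, rng)
        print(epoch, " ".join(masked))
```

With static masking the model would see the same masked positions for a sentence in every epoch; re-sampling on each pass exposes it to more varied prediction targets from the same data.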

To create the model to predict text-based emotions, the pre-trained RoBERTa model was fine-tuned using an emotions dataset. Fine-tuning can be carried out in several ways: training the entire architecture; training some layers while freezing others; or freezing the entire architecture, attaching a few new neural network layers, and training only these new layers. Since the emotion dataset was very small compared with the data used for pre-training, the team decided to freeze the entire architecture so that the pre-trained weights were not updated during fine-tuning. To implement transfer learning using RoBERTa, Transformers, a state-of-the-art natural language processing library developed by HuggingFace Inc. (Wolf et al., 2019), was employed. The RoBERTa-base model, which has 12 layers and approximately 125 M parameters, was selected for this study. The default arguments were used to train the model for emotion analysis. The RoBERTa-base-uncased tokenizer was utilized to tokenize, generate sentence embeddings, and encode the data. AdamW was used as the optimizer for the neural network’s weights, as it is an improved version of the Adam optimizer (Loshchilov & Hutter, 2017). The model was then trained for ten epochs on the emotion dataset with a learning rate of 1e-5, and early stopping was used to prevent overfitting the model to the data. The dataset had 3500 labelled tweets and was divided in a 75/25 split, such that 2625 tweets formed the training dataset and 875 tweets formed the testing dataset.
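The 75/25 split and the early stopping rule mentioned above can be sketched in plain Python. The random seed, the patience value, and the function names are assumptions made for illustration; the text does not state how the split was drawn or which stopping criterion was used.

```python
import random


def train_test_split(items, test_fraction=0.25, seed=42):
    """Shuffle items and split them into (train, test) lists.

    The seed is an arbitrary choice that makes the sketch repeatable.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_fraction)
    return items[n_test:], items[:n_test]


def early_stopping_epoch(eval_losses, patience=2):
    """Return the 1-based epoch at which training would stop.

    Training stops once the evaluation loss has failed to improve for
    `patience` consecutive epochs (patience is an assumed setting).
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(eval_losses, start=1):
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(eval_losses)


if __name__ == "__main__":
    train, test = train_test_split(range(3500))
    print(len(train), len(test))
```

Applied to 3500 labelled tweets, the split yields the 2625 training and 875 testing examples reported in the text.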

The data processing architecture of RoBERTa is very similar to that of BERT, except that the tokenizers, pretraining schemes, and training periods differ. RoBERTa uses a byte-level BPE tokenizer with a dynamically masked language model, and it is trained for a longer period and for more iterations than other currently used pre-trained models. The architecture of the model is shown in Fig. 1, and to operationalize it, the following steps are needed:

  1) Each tweet from the curated Covid 19 dataset was passed as input to the tokenizer module, where byte pair encoding was applied as the first step. Extra tokens were added at the start ([CLS]) and the end ([SEP]) of each sentence to make the classification task easier and to provide separators in the token sequence. The intermediate tokens were then passed through word and position embedding to obtain the final vector representation of each token. Position embeddings ensure that the order of the input sentence is also considered when analyzing its meaning. Note: the RoBERTa model requires a leading space, added as a prefix after the start token.

  2) The final input entered into the model was: [CLS] + prefix_space + tokens + [SEP] + padding

  3) The tokens were then passed to the pre-trained RoBERTa model to complete the model training phase. The model was trained on longer sequences and with larger mini-batch sizes, with a new dynamic masking pattern generated each time an input sentence was fed into the model.

  4) The model consisted of the following layers:

    i. The input layer, which converted each tweet into the representation given in step 2 and then into a numerical vector representation.

    ii. The attention masking layer, which prevented the attention heads from considering the padded tokens.

    iii. A dropout layer, which helped ensure that the model was not overfitted.

    iv. Two Conv1D layers, used for the start and the end scores, respectively.

    v. A softmax activation function, applied to get the index of the selected text.

  5) Thereafter, the input was fed to the classifier, which performed the multi-class emotion classification and identified a particular emotion class for each tweet. A confusion matrix with the categories true positives, false positives, true negatives, and false negatives was also obtained to evaluate the model performance.

  6) The accuracy of the implemented deep learning model was evaluated and compared with other standard implemented models by considering the following parameters:

    i. MCC Coefficient: The Matthews correlation coefficient (MCC) is a measure of the quality of two-class classifications, computed as the correlation between the predicted and observed classification values. It ranges from −1 to +1 and takes a high value only if the model prediction achieves good results in all four confusion matrix categories.

    ii. Training Loss: Training loss is a number that indicates how bad the model’s prediction was on a single tweet example. The loss value varied depending upon the weights and biases set in the model, and the aim was to achieve the lowest possible loss value. The loss values were used to arrive at the most optimized model.

    iii. Evaluation Loss: This loss value was calculated at the validation stage using the validation portion of the dataset. It is similar to the training loss, but it was not used to update the weights and biases; instead, it showed the model performance during the testing phase.

    iv. Accuracy: Accuracy is a measure of how often the model made correct predictions. It was calculated from the confusion matrix as (true positives + true negatives) / total samples

    v. Precision: The exactness of the model is expressed as precision, which shows how often the model was correct when it assigned a given class label to an input. It was calculated as true positives / (true positives + false positives)

    vi. Recall: The completeness of the model is expressed as recall, which shows how often the model correctly identified the inputs that belong to a given class, keeping the count of false negatives to a minimum. It was calculated as true positives / (true positives + false negatives)

    vii. F1-Measure: F1 is the harmonic mean of precision and recall, and it shows how well the model minimized both false positives and false negatives. It was calculated as 2 * ((precision * recall) / (precision + recall))
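The evaluation measures listed above can be computed from the four confusion matrix counts, as in the following minimal sketch. The function names are illustrative, and treating each emotion one-vs-rest to obtain two-class counts from multi-class labels is an assumption; the text does not state how the per-class counts were derived.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, F1, and MCC from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


def one_vs_rest_counts(y_true, y_pred, label):
    """Confusion counts (tp, fp, tn, fn) for one emotion against the rest."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == label and truth == label:
            tp += 1
        elif pred == label:
            fp += 1
        elif truth == label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


if __name__ == "__main__":
    y_true = ["worry", "worry", "anger", "relief"]
    y_pred = ["worry", "anger", "anger", "relief"]
    counts = one_vs_rest_counts(y_true, y_pred, "worry")
    print(binary_metrics(*counts))
```

The guards against zero denominators return 0.0 by convention when a class is never predicted or never present; other conventions are possible.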

5 Findings

For this study, two types of analysis were employed: model analysis and text-based emotional data analysis. Figure 2 shows the model performance analysis completed to evaluate the implemented workflow and to compare its results with other existing techniques and similar studies. The model achieved an overall accuracy of 80.33%, with an F1-measure score of 75.25 and an MCC coefficient value of 0.78 (Fig. 3). This result is novel when set against previous studies on similar topics, which were consulted for comparison. Similar academic studies are shown in Fig. 4 and Table 5, where the differences in results and the novelty of this study’s model are identified. This also implies that the developed model classified 80.33% of the test tweets correctly. Thus, an academic team that wants to make predictions about the impact of the lockdown using this study’s model can expect a comparable level of accuracy on similar data (Fig. 5).

Fig. 2 RoBERTa model performance: MCC coefficient, training loss, and testing loss values for various epochs

Fig. 3 Performance comparison of multiple implemented models for emotional classification

Fig. 4 The RoBERTa model performance metrics: precision, recall, and F1-measure values for the various emotions

Table 5 Comparing the proposed model’s results with previous similar studies’ models

Fig. 5 Proposed model comparison with existing implemented models

Fig. 6 Word cloud for emotions

The analysis also used generated graphs to study the impact of COVID 19 on the emotional state of mind. For this purpose, examining Twitter sentiments from a single perspective was not enough. Instead, an overall view of worldwide Twitter-based sentiments was taken for each month, considering various countries and the emotions expressed in them. This multi-faceted analysis helped to understand the impact on various individuals and the ways individual countries managed the pandemic situation from February to June 2020. By focusing mainly on worry, depression, anger, sadness, surprise, relief, hate, and enthusiasm, an indicator was created. The reference for selecting emotions was taken from the National Research Council (NRC) Hashtag Emotion Lexicon (Mohammad et al., 2013; Mohammad & Kiritchenko, 2015), which presents a list of words and their association with 8 different emotions (Fig. 6). The indicator considered the emotions that were pertinent during the pandemic. To ensure that the results of this study’s model can be applied to form an understanding aligned with daily lives, i.e., to offer a socio-technical perspective, the data findings were analyzed from three different perspectives:

5.1 Visualizing all the Emotions for Various Countries for every Month

This part of the study’s analysis considered the emotions that individuals displayed in various countries every month between February and June 2020. This insight helped in understanding the variations in emotions occurring each month and in each country. In each month, various countries exhibited different emotional aggregations. Some of the crucial observations in this analysis were:

Referring to Table 4 and Fig. 7, ‘worry’ was extreme in February, particularly in China, where Covid 19 cases had peaked at 11,821, which was not the case in the other countries. In March, however, the other countries also began to report Covid 19 cases, as is particularly evident for Italy (1128) in Table 4. Fig. 8 does not represent this clearly because the relevant tweets were largely not in English and were hence discounted. What was also found is that in February, China exhibited all the major emotions of worry, sadness, depression, surprise, and enthusiasm. With 11,821 cases in February (Table 4), far more than the other countries, citizens were worried about the increase in Covid 19 cases and were sad and depressed at the same time. Surprise was exhibited because citizens were at first (February 2020) surprised by the rapid transmission of the virus. In March, as some news sites mentioned, China had managed to control the virus, which led to enthusiasm (Kathirgugan, 2020).

Fig. 7 Emotion-wide tweets per million users per country in February

Fig. 8 Emotion-wide tweets per million users per country in March

In March 2020, Fig. 8 shows that countries such as the USA, UK, Ireland, Australia, New Zealand, and South Africa exhibited the major emotions of worry, surprise, and hate. Table 4 indicates a rise in Covid 19 cases in these countries, which is consistent with these emotions. In March, tweets from China were few, which surprised the team, but Kleinberg (2020) also found that tweets from China had reduced. This was also confirmed by our model’s results, as shown in Fig. 8: compared with February (Fig. 7), China accounts for a smaller share. Figure 8 also clearly shows the shares of the other countries increasing as the virus spread to them, which was confirmed by Table 4.

For April, Fig. 9 showed that India mainly exhibited the emotions of worry, enthusiasm, and surprise (see Table 4). However, as Table 4 shows, Covid 19 cases had also increased in the USA and UK, which suggested that these countries were moving towards the grey emotions of depression, hate, and sadness. Along with the pandemic, India faced some unexpected events during this month, such as the Tablighi Jamaat center gathering of devotees, which caused a ‘super spreading’ of the virus (Slater et al., 2020).

Fig. 9 Emotion-wide tweets per million users per country in April

In May, the UK faced a spike in Covid 19 cases, as identified in Table 4: from 2514 to 171,257. This is also confirmed by our model’s results in Fig. 10, where the UK’s emotions were worry, hate, anger, and surprise. May also witnessed a decline in the magnitude of all the grey emotions worldwide. This was due to countries beginning to inform citizens of the end of the lockdowns and social distancing (Brueck, 2020). Therefore, emotions like anger, depression, and sadness began reducing (Fig. 11).

Fig. 10 Emotion-wide tweets per million users per country in May

Fig. 11 Emotion-wide tweets per million users per country in June

In June, the percentage of ‘worry’ emotions increased again. Almost all the countries exhibited this emotion on the higher side. Nevertheless, it was observed that, rather than the fear of catching COVID’19, the economic slowdown, employment prospects (Domm, 2020), and the blurred hope of returning to normality were the main reasons for representing this emotion in this month.

The next part of the analysis considered how each emotion varied across the months and countries included in this study.

5.2 Visualizing Emotions Incurred during the Five Months in all the Countries (Figs. 12, 13, 14, 15, 16, 17, 18, 19)

For this part of the analysis, the overall variations in the emotions over the five months in the selected countries were considered. For instance, it was found that during the pandemic, the otherwise positive emotion of “Surprise” shown in Fig. 19 had a negative shade to it. The emotion of enthusiasm shown in Fig. 17 emphasized the natural human behavior of not losing hope, motivating each other, and continuing the battle to overcome COVID 19. Further, this emotion was majorly impacted by how political leaders and caregivers had interacted with the citizens. For this section, various countries’ monthly emotions were represented in graphs. Some of the crucial observations from the analysis were:

Fig. 12 Tweets portraying the emotion ‘depressed’ in various countries on a monthly basis

Fig. 13 Tweets portraying the emotion ‘worry’ in various countries on a monthly basis

Fig. 14 Tweets portraying the emotion ‘anger’ in various countries on a monthly basis

Fig. 15 Tweets portraying the emotion ‘hate’ in various countries on a monthly basis

Fig. 16 Tweets portraying the emotion ‘relief’ in various countries on a monthly basis

Fig. 17 Tweets portraying the emotion ‘enthusiasm’ in various countries on a monthly basis

Fig. 18 Tweets portraying the emotion ‘sadness’ in various countries on a monthly basis

Fig. 19 Tweets portraying the emotion ‘surprise’ in various countries on a monthly basis

The emotions of depression and sadness shown in Figs. 12 and 18 peaked in April, when COVID 19 began to affect individuals. This was particularly evident in the UK, India, USA, and New Zealand (Osborne, 2020; FE Online, 2020; CBS, 2020). After April, these emotions began to subside, with many countries showing them moderately in May before a slight increase in June. Therefore, the procedure of unlocking revealed a positive sentiment amongst citizens, despite individuals realizing the risk of the pandemic numbers increasing rapidly.

The emotion of worry exhibited in Fig. 13 appeared prominently in February and March, then moderately in April, with a drastic reduction in May. The magnitude of worry closely followed the infection trend in each country: China had most of its worry tweets in February, the USA in March, and India in April. In addition, the way the public was kept informed about the situation and the major steps taken by governments affected this emotion to a certain extent.

The emotions of anger and hate shown in Figs. 14 and 15 appeared strongly in March and April. The most logical reason behind these emotions in these months was the financial distress that everyone started feeling after two months of lockdown in February and March. In addition, anger was expressed either in countries with a large population, which caused healthcare service mismanagement, or in countries that lacked good healthcare services and pandemic management arrangements. The emotion of hate was one way of expressing anger against various events occurring during the COVID 19 pandemic. However, the emotion of anger had subsided immensely by June.

The emotions of relief and enthusiasm shown in Figs. 16 and 17 were portrayed equally, but moderately, by citizens during March, April, and May. This was interpreted as citizens, notably those in disadvantaged and lower-income communities, being grateful for government support and assistance. Additionally, mortality rates began to decline, and the numbers of recovered cases were on the rise. Notably, India illustrated the emotion of enthusiasm during March, April, and May, which the team associated with the “Modi Effect.”

5.3 Visualizing the Variations in Emotions for all the Five Months for Particular Continents and Countries (Figs. 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37)

This study considered the emotional voyage of each individual continent and country along with the month-wide and emotion-wide analysis. This data analysis is helpful for countries in various continents when studying the emotional variation of their citizens. It can assist with developing future strategies for preparing their citizens for the “new normal.” These graphs offered an analysis of the five months and depicted the journey of each emotion over this time. Some of the crucial observations from the analysis were:

Fig. 20 Month-wide tweets portraying various emotions per million users of the Oceania continent

Fig. 21 Month-wide tweets portraying various emotions per million users of the African continent

Fig. 22 Month-wide tweets portraying various emotions per million users of the European continent

Fig. 23 Month-wide tweets portraying various emotions per million users of the Asian continent

Fig. 24 Month-wide tweets portraying various emotions per million users of the North American continent

The majority of the selected countries exhibited the emotion of worry consistently until March, after which it began to reduce (Figs. 25 to 37). This was particularly evident in the USA (Fig. 25), UK (Fig. 27), Canada (Fig. 26), Australia (Fig. 30), Italy (Fig. 34), and UAE (Fig. 35). However, in countries like India (Fig. 28), France (Fig. 33), Germany, and New Zealand (Fig. 31), worry declined from April but emerged again in June. Further, China (Fig. 29) displayed a sharp decline in worry from February 2020, after which the emotion virtually disappeared through to June 2020. Brazil (Fig. 32) exhibited a small decline in worry from February, after which it remained constant until a slight peak in June. A slight exception was South Africa (Fig. 37), where worry increased sharply from February and remained high in April. From April through to June, worry declined, but not by much.

Besides worry, the next strongest emotions portrayed by the USA, India, UK, Australia, Canada, Italy, South Africa, and UAE were hate and surprise, which peaked either in March or in April. This indicated that individuals began to fear the pandemic, which made them worried. Concurrently, the emotion of hate emerged due to unexpected events or human behavior, such as not accepting some of the restrictions, not using face masks, or not observing social distancing, which instilled a feeling of negative surprise and displayed individuals’ struggle to accept the situation they faced.

The emotions of hate, sadness, depression, and anger increased in April and May (Fig. 19), which is also confirmed by the data in Table 4. This was attributed to individuals’ displeasure towards various government measures for managing the pandemic and to the pain of losing dear ones. At the same time, however, the feelings of relief and enthusiasm were also present, which displayed the positive and collaborative sentiments that individuals expressed when supporting each other during these challenging times.

After this, many of the countries displayed moderate fluctuation in two grey emotions, i.e., depression and sadness, and two positive emotions, i.e., relief and enthusiasm. This reflects basic human instincts and our way of living life, which is “to get up and fight together.” Therefore, on the one hand, people were feeling lonely and depressed, but on the other, they were trying to support each other and give strength to one another.

The least expressed emotion in almost all the countries was “anger.” This indicates that even though members of the public were experiencing discomfort and expressing hate, there was still some sympathy within individuals for the role of government and the limitations that governments faced.

To obtain a more generic perspective of the emotions and countries, the countries were grouped into continents. For the emotion of worry, the Oceania, European, and North American continents clearly displayed a peak in March 2020. Worry peaked for Africa in April 2020. Hate was the second most felt emotion and fluctuated in all the continents (Figs. 20 to 24). Surprise and depression were the next most felt emotions. An emotion that did not stand out was anger, which is a revelation, as media sources usually referred to anger within individuals towards Covid 19 as it affected individuals’ daily lives, relationships, and livelihoods (Smith et al., 2020).

The three perspectives of analysis presented above were:

  1) Month-wide analysis

  2) Emotion-wide analysis

  3) Continent-wide analysis

6 Discussion of this Study

Previous deep learning studies focused on emotions have used observations (Chen et al., 2020; Montemurro, 2020; Thakur & Jain, 2020), Twitter APIs, or Tweepy APIs (Dubey, 2020; Venigalla et al., 2020; Arolfo et al., 2020). Our study used deep learning-based NLP techniques with RoBERTa (Liu et al., 2019), a pre-trained language model developed by Facebook, to analyze tweets collected through Twitter APIs. This strategy was pursued after our study found from previous literature that many studies utilize sentiment analysis and machine learning techniques such as supervised or unsupervised learning for the understanding of emotions, but that they are usually based on certain contexts and on sentiment analysis; e.g., Arabic tweets were analyzed with sentiment analysis to identify rumors (Alzanin & Azmi, 2019). In our study, the novelty lies in identifying multiple emotions over several months and in numerous countries in order to analyze the impact of an unexpected exogenous shock event on those countries. Therefore, our study offers novelty for AI and machine learning by focusing not only on the implementation methodology but also on the information obtained by pursuing this type of text-based emotion analysis. By employing this technique, a novel predictive theory about the impact of the pandemic in a multi-country context was developed, which offers immense depth and understanding of the pandemic’s impact. For instance, the framework could determine the emotions expressed in certain countries and continents in particular months, rather than focusing only on the negative or positive stance that sentiment analysis offers.

Using RoBERTa and the Covid 19 context, our study also offered more emotions than the existing emotions that were initially found in archives by researchers such as Jain et al. (2017) or Goel and Thareja (2018). The emotions that they used for their studies were Happiness, Sadness, Anger, Disgust, Surprise, Fear, Tired, Afraid, Sleepy, Relaxed, Bored, Excited. The authors utilized some of these for the study but found some emotions more specific to the pandemic time period.

6.1 Implications of this Study

In academia, AI is becoming increasingly important as it allows the understanding and exploration of large volumes of data. Our study has shown that, using deep learning and NLP concepts drawn from AI, an understanding of the emotions that were expressed in various countries can be obtained. By utilizing a comparative, multi-country study, it was learned that worry is an emotion that was constantly displayed in the months of February to June, with the Asian continent being most worried. This was also supported by a report found during our research: “Since the outbreak of the coronavirus (and the disease it causes, COVID-19) began, reports of racism toward East Asian communities have grown apace”, which caused Asian citizens to worry about the repercussions resulting from these actions (Serhan & Mclaughlin, 2020). Therefore, employing AI with a multi-country perspective can offer novel insights into various countries’ cultural aspects and citizens.

From the results, an added emotion that was noted is that of surprise. Individuals were surprised by the impact of the pandemic. Although governments emphasized extreme caution, individuals did not expect the pandemic to impact their livelihoods and lifestyle. For instance, the UN World Tourism Organization (UNWTO) found that travel and tourism were among the most affected sectors of almost every economy due to a massive fall in international demand amid global travel restrictions, including many fully closed borders, imposed in order to contain the virus (UNWTO, 2020). Employment also became a major source of concern: “Many people have lost their jobs or seen their incomes cut due to the coronavirus crisis. Unemployment rates have increased across major economies as a result” (BBC, 2020). An added implication and novelty of this study is the application of RoBERTa to analyze tweets in order to identify the impacts of the preventive measures taken against Covid 19.

For industry, this research implies that organizations can identify and understand the impact of the lockdown in their respective countries. Further, for organizations seeking information about individuals’ reactions to the lockdown, our study provides graphical and detailed textual information, which is missing from sentiment analysis and Covid 19 studies. Sentiment analysis has been used previously in several ways, including identifying the determinants of usage satisfaction of mobile payments that could enhance service adoption in India, a country that is using mobile devices extensively (Kar, 2020). Chang (2019) utilized sentiment analysis to devise a model of social influence to help organizations discover influential individuals on social media.

For policymakers, our results have revealed that there is a direct correlation between the pandemic’s outbreak, its management strategies, and the emotional state of society. Such an analysis can assist government policymakers in planning future policies and activities for the wellbeing of society, particularly when unexpected exogenous shocks such as the pandemic emerge.

7 Conclusions

The aim of this study was to identify, explore, and understand globally the emotions expressed during the earlier months of the COVID 19 pandemic by utilizing Deep Learning (DL) and Natural Language Processing (NLP). To address the aim, this study utilized and analyzed a global Twitter microblog dataset to explore and understand how individuals’ emotions changed across the globe between February and June 2020. It also produced an emotion classification using a state-of-the-art DL technique. The implemented model showed improved performance when classifying multiple emotions from Twitter texts, with increased classification accuracy. With the classification model, this study presented emotion trend analysis from three different perspectives, i.e., month-wide, emotion-wide, and country-wide.

Due to the COVID 19 pandemic, individuals experienced an unusual amalgamation of emotional energies and thought processes, which this study captured and examined using tweets. Twitter was utilized as it is amongst the most prominently used microblogging OSN platforms for thought sharing. It is also one of the important data sources for researchers to analyze public emotions and sentiments. As Twitter is useful for analyzing emotions and thought sharing, the authors utilized eight emotions to understand the various emotions of individuals across diverse continents. The emotion that stood out in all the continents was worry, which featured more prominently in the Asian continent from February to June 2020.

Overall, this study applied a combined transfer learning and RoBERTa deep learning approach that achieved an improved accuracy of 80.33% and a Matthews correlation coefficient (MCC) score of 0.78. Using this model, the emotion of worry was found to be more prominent in the earlier months of lockdown, i.e., February and March. The later months of April and May witnessed an increased intensity in the emotions of hate, anger, and depression, which reflected the impact of financial and emotional stress on individuals. What also became clear was that the month of June began witnessing a peak in the emotion of worry, which represented the anxiety among individuals regarding the extended lockdown protocol. Throughout the lockdown period, the emotions of enthusiasm and surprise were evident in varying proportions, which the team attributed to human beings’ positive, resilient spirit. Based on the tweets, the countries whose users most frequently expressed opinions and feelings were the USA, UK, India, Australia, and Ireland. Having identified the conclusions of this study, the next section discusses the limitations and future directions of this study.
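For readers less familiar with the two reported metrics, the sketch below shows how accuracy and the multiclass MCC can be computed from true and predicted emotion labels. It uses the standard multiclass generalization of the MCC and is not the authors’ evaluation code; the label lists and function names are hypothetical toy values chosen for illustration and do not reproduce the reported 80.33% or 0.78.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient.

    With s samples, c correct predictions, t_k true occurrences of class k
    and p_k predictions of class k:
        MCC = (c*s - sum_k t_k*p_k)
              / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    """
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    labels = set(t_counts) | set(p_counts)

    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    spread_true = s * s - sum(t_counts[k] ** 2 for k in labels)
    spread_pred = s * s - sum(p_counts[k] ** 2 for k in labels)
    denominator = sqrt(spread_true * spread_pred)

    # By convention, return 0.0 when the score is undefined
    # (for example, when only one class is ever predicted).
    return numerator / denominator if denominator else 0.0


# Hypothetical toy labels, for illustration only.
y_true = ["worry", "worry", "anger", "hate", "anger", "worry"]
y_pred = ["worry", "anger", "anger", "hate", "anger", "worry"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"MCC      = {multiclass_mcc(y_true, y_pred):.4f}")
```

Unlike accuracy, the MCC takes the full distribution of true and predicted classes into account, which makes it a useful complement when emotion classes are imbalanced.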

7.1 Limitations and Future Directions

This study relied on Twitter tweets, which are deemed essential for a study of this large-scale, multi-country, global nature. Twitter has approximately 160 million daily users, and its subscriber base is increasing daily; a very small percentage of these users are spammers. While Twitter regularly updates its security measures to remove spammers, spam remains a problem that can potentially affect the outcome of any Twitter-based research. Therefore, future research should be mindful of spammers when using tweets and should take care of data privacy (Shaikh & Patil, 2018a; Shaikh & Patil, 2018b).

A further limitation is that the tweet datasets obtained did not provide detailed information such as demographics (e.g., age or gender). This implies that errors such as information incompleteness and representativeness problems could be present and could lead to ethical and data security concerns, and thereby to biased results. To prevent such issues, future studies could apply an ethics framework as suggested by Chang (2021), thus offering an ethically compliant dataset.

At the time this study was completed, only nine hashtags were considered when collating the tweets; these were the most used hashtags from February to June 2020. However, as the pandemic has entered a second phase (October 2020), several new hashtags have appeared on a daily or weekly basis. Due to the time limit that the authors had placed on this study, tweets with these hashtags were beyond its scope and were therefore not considered. Future studies could consider these tweets and offer deeper insight into the findings.

Statista (2020) found that 30.9% of Twitter users are aged 25 to 34 years, compared to 12% who are aged 50 years and above. Therefore, a large proportion of Twitter’s global user population is less than 50 years old. Thus, of the 0.5 million collected tweets, a substantial number come from users in a younger demographic. To ensure that a response bias does not arise from focusing only on younger adults, older adults’ emotions should also be considered in future studies.

This study has mapped the emotion variation to each month or country by considering the occurrence of generalized events. However, the analysis can be narrowed further to determine the impact of particular economic, financial or social events on public emotions, e.g., the impact of the financial slowdown, salary reductions, or enhanced work/life balance situations on individuals’ emotions.

Finally, a social media analytics organization revealed that from the beginning of 2020, there was a daily rise in the number of tweets focused on individuals’ emotions during the pandemic (Tweetbinder, 2020). From the downloaded tweets, it was found that the maximum number of expressed emotions occurred in March 2020. This suggested that individuals were increasingly using Twitter to express their feelings. The variations in emotions during these first five months were also more evident because many individuals were facing such a situation for the first time in their lives, so their enthusiasm to closely monitor surrounding events and express their thoughts was also on the rise. Therefore, this study analyzed the Twitter data of the first lockdown, which lasted for five months (February 2020 to June 2020). As the pandemic has spread, more lockdowns have been imposed globally, and we propose that future research consider these subsequent lockdowns and the emotions expressed during them.