Introduction

After a period of rapid global spread, the World Health Organization (WHO) declared the coronavirus disease 2019 (Covid-2019) outbreak a global pandemic on 11 March 2020 (World Health Organization 2020). The first cases were observed in China in December 2019, and Covid-19 had infected 9,457,902 people worldwide by May 2020, causing an estimated 482,247 deaths (John Hopkins University 2020). The pandemic has put serious strain on healthcare systems given the high infection and transmission rates, the lack of efficient treatment options and the clinical severity of Covid-19 (Lipsitch et al. 2020). To prevent transmissions, several countries implemented a range of non-pharmaceutical interventions aiming at restricting social contacts, such as social distancing rules or lockdowns (Hsiang et al. 2020; Cheng et al. 2020). Countries in Central Europe (e.g. Germany, Austria, Switzerland) introduced comprehensive restrictions over a short period from the mid to end of March 2020, which largely prohibited social contacts, forced people to stay at home and led to nation-wide shutdowns of educational institutions and the public business sector (Hale et al. 2020). The psychological and emotional correlates of such comprehensive restrictions of social contact remain largely unexplored.

Adverse emotions experienced in response to a pandemic have been associated with increased psychological burden, unhealthy behaviours, and biased risk perception (Asmundson and Taylor 2020; Lin et al. 2020; Saltzman et al. 2020). Thus, understanding the emotional implication of the Covid-19 pandemic for the public is important for tailoring public health messages and service provision to people’s needs. Anxiety has been identified as a central emotional response to the high risk of infection and the uncertainty related to the unpreceded circumstances during the Covid-19 pandemic. Several studies suggest that rates of anxiety, depression and stress are elevated in populations affected by Covid-19 (Luo et al. 2020). In a German survey, 60% and 74% of respondents reported being anxious and worried because of Covid-19 at the time the restrictions of social contact were announced (Betsch et al. 2020). Efforts aimed at limiting the spread of Covid-19, such as lockdowns and quarantine, often severely restrict people’s capacities to access social and economic resources, which may result in anger and frustration (Brooks et al. 2020; de Las Heras-Pedrosa et al. 2020). Other study results indicate that positive emotions may emerge from increased feelings of support, e.g. from friends and family (Zhang and Ma 2020).

Information available on social media platforms such as Twitter constitute an important resource for monitoring emotional responses at different stages of the pandemic. A number of studies have analysed large bodies of Twitter posts (‘tweets’) to advance knowledge regarding public opinion, experiences of stigma or emotions related to Covid-19 (Budhwani and Sun 2020; Jimenez-Sotomayor et al. 2020; Rufai and Bunce 2020; Han et al. 2020). Findings of a study of tweets from English-language users suggests that common topics discussed on Twitter in the earlier phase of the pandemic include the origin of the virus, its socio-economic impact and risk of infection (Abd-Alrazaq et al. 2020). Most of the topics were characterized by positive emotions. In another analysis of English-language tweets, Lwin et al. found that emotions reflected in narratives on Twitter were dominated by fear related to disease spread in the beginning of the pandemic, and shifted to anger associated with social isolation as the pandemic progressed (Lwin et al. 2020). Language indicative of sadness was used in tweets related to loss of loved ones due to a Covid-19 infection, while positive emotion were expressed in tweets associated with gratitude, hope and resilience (Lwin et al. 2020).

These studies highlight the potential of using social media data to discover changing emotions and concerns of the public over the course of the Covid-19 pandemic. However, many available studies have analysed tweets posted by twitter users globally, representing a heterogeneous population with regard to exposure to infection risk and public health efforts to control the spread of the disease. Examining region-level social media discourse may allow for a more nuanced assessment of psychological and emotional responses of the public at different stages of the pandemic. A study that compared social media posts originating from regions of China and Italy before and after the lockdown found different patterns of language-use reflecting different emotional responses (Su et al. 2020). Anxiety appeared to decrease in Italy following the introduction of social contact restrictions, whereas negative emotions remained unchanged in China (Su et al. 2020).

Several countries in Central Europe such as Germany, Austria and Switzerland simultaneously introduced comparable social contact restrictions over a short period from the mid to end of March 2020 (Hale et al. 2020). Public responses to these measures can be readily identified in social networks such as Twitter by focusing on German language use, which is used only in a confined geographical area in Central Europe covering mainly the populations of Germany, Austria and Switzerland (Eberhard et al. 2019). Thus, the aim of this study was to examine trends and topics associated with emotions reflected in German Covid-19-related discourse on Twitter following the introduction of social contact restrictions in countries of German-speaking Central Europe. Results can be instructive for a more effective coordination of policy and health system response to the psychological outcomes of the Covid-19 pandemic.

Methods

Data collection and cleaning

Twitter’s standard search API, which searches against a sample of tweets published in the past 7 days, was used for data collection. In a series of consecutive searches, German-language tweets that (a) contained the terms ‘#corona’, ‘#coronavirus’, ‘#covid19’ or ‘#covid’; and (b) were published between 2020/03/18 and 2020/04/24 were collected. User-tags and geo-location of tweets were inspected to determine region of tweet origin.

Only tweets containing a self-reference (i.e. ‘I’, ‘my’, ‘mine’, ‘me’) were selected for further analysis to increase the probability of capturing expressions of individual emotions. Tweets were prepared for analyses in several steps: (1) Duplicates were removed in a two-stage process. First, duplicates were identified by inspecting the ‘is_retweet’ label provided by Twitter for each tweet. Second, as some retweets may not have been correctly labelled for technical reasons, the cosine similarity of tweets was additionally calculated to identify duplicate tweets; (2) The position-of-speech of words within tweets was determined; (3) Basic cleaning of tweets was conducted by removing URLs, usernames and converting all characters to lower case.

Data collected in this study were made publicly available by Twitter users, and were thus treated as data within the public domain (Stevens et al. 2015). User anonymity was ensured by presenting data in aggregated form only. Therefore, no further ethical approval was deemed necessary for this study. Data usage and processing in this study adhered to Twitter’s Terms of Service and Developer’s Agreement and Policy.

Materials

The German version of Linguistic Inquiry and Word Count (LIWC) dictionary was used to categorize words within tweets into four emotional categories based on previous research: positive emotion, anger, anxiety and sadness (Meier et al. 2019; Pennebaker et al. 2015; Lwin et al. 2020; Brooks et al. 2020; Luo et al. 2020). The LIWC dictionary has been specifically developed to capture an individual’s psychological states as reflected in language use. It contains selected words which are organized into a set of psychological categories (e.g. positive emotions, anxiety). For a given document, the rate of word counts in the respective LIWC category to the total number of words (%) is calculated, with higher rates reflecting more pronounced emotion. Reliability of the German version of the LIWC has been demonstrated and meaningful associations with related psychological constructs in a number of studies indicate validity (Meier et al. 2019).

Data analysis

Sample characteristics were summarized using descriptive statistics (median, mean [M] and standard deviation [SD]).

Hashtags were removed before calculating rates of emotions as they are not a regular part of general language syntax and may bias word counts. To estimate changes in prevalence of emotions over times, tweets published on the same day were aggregated into a single document and rates of emotions were calculated. Time trends were modelled using logistic regression with day as independent and daily rate of emotion as the dependent variable.

In order to determine associations between emotions and tweet content, bi-term topic models were used to discover clusters of tweets with related content (Yan et al. 2013). A bi-term model (BTM) is based on the underlying assumption that word co-occurrences (i.e. bi-terms) within a set of documents (e.g. tweets) are generated by a mixture of probability distributions of unobserved topics (Yan et al. 2013). BTMs are especially suited to extract topics from short texts with sparse word co-occurrences like tweets (Yan et al. 2013). During the modelling process, words are probabilistically assigned to topics and these topic-word distributions allow for qualitative interpretation of the content of topics.

Topic models based on nouns-only have been shown to produce results which are more interpretable (Martin and Johnson 2015). Thus, only nouns, proper nouns and hashtags that (a) occurred in at least ten tweets; and (b) consisted of at least three characters were included in the topic modelling process. Additionally, we inspected the 100 most frequent words of this reduced corpus. We removed or merged terms that were judged to carry poor discriminative meaning and would thus occur in many topics (see S1 for sample flowchart). To guide decisions on the number of topics to extract, we calculated semantic coherence and exclusivity as metrics of topic quality as proposed in Roberts et al. (2019). Semantic coherence measures how well words assigned to a topic reflect the empirical co-occurrence of words within a set of documents (e.g. tweets) (Mimno et al. 2011). Exclusivity is based on a variant of the FREX metric (Bischof and Airoldi 2012), which assigns higher scores to topics containing words that are both frequent and exclusive to a given topic. We used formulas and code as provided by Roberts et al. (2019).

We ran a series of models with k-topic (k = 10,20,30,40,50,60,70,80,90,100) and selected the final model based on assessment of semantic coherence and exclusivity. Parameters estimated by the final model were used to assign each tweet its most likely topic. Associations between each topic and rates of emotions were calculated using logistic regressions with topic as dependent variable and rates of emotion as independent variables. Only topics showing a positive association with any emotion were retained. Finally, topics were labelled and interpreted by inspecting the topic-word distributions returned by the model. Two metrics were used to identify top words associated with each topic: probability and FREX. Probability is the statistical probability for a word to belong to a given topic, whereas the FREX metric weights the probability of words by their exclusivity for a given topic. Words which are both probable and exclusive score higher on the FREX metric (Roberts et al. 2019). Additionally, we inspected most likely tweets associated with each topic.

All data analysis was conducted using R (R Core Team 2019). Results with a p value < .05 or a confidence interval that did not contain the null hypothesis were considered significant.

Results

Sample characteristics

Sample characteristics are displayed in Table 1. A total of 608,737 tweets were collected (no data collected on 2020-04-17). After removing duplicates and selecting tweets with self-reference, 79,760 tweets were included in the analysis (see S1 for sample flowchart). The mean number of tweets posted per day was M = 2156 (SD = 914). Tweets were posted by 37,020 unique users with the mean number of tweets posted by a single user being M = 2.15 (SD = 3.88). Information on user-tagged country of origin of tweets was available for 3.8% (n = 3034) of the sample. Of these, 97.2% were tagged with Germany (85.4%, n = 2590), Switzerland (6.1%, n = 185) or Austria (5.7%, n = 173) as the country of origin. Geo-location was enabled for 0.2% of all tweets only.

Table 1 Sample characteristics

Rates of emotions

Figure 1 shows the rates of different emotions by day and the fitted trend lines. Time was significantly associated with decreasing rates of positive emotion (b = −0.003, p < .001), anxiety (b = −0.005, p < .001) and sadness (b = −0.001, p = .040). Relative changes were largest for anxiety, which decreased by 18.2% (positive emotion −9.4%, sadness −5.4%, anger 3.5%).

Fig. 1
figure 1

Rates of emotions in Covid-19 related tweets by day. Trends lines and 95% confidence interval fitted with logistic regressions

Topic labels and associations with topics

Inspection of semantic coherence and exclusivity values suggested that a topic model with k = 30 topics provided an adequate fit to the data (see S2). Rates of emotion were significantly increased in a total of 16 topics (sees Tables 2, 3 and S3 for an overview of topics). These topics were labelled based on the 15 top words identified by the probability and FREX metric. For example, the topic characterized by the words ‘measures’ and ‘government’ (probability metric) and the words ‘fundamental rights’ and ‘citizen’ (FREX metric) was subsequently labelled ‘Political measures and implications for basic rights’ (see Table 2). The labelled topics could be categorized into four broader themes (see Table 3).

  1. 1)

    Measures to prevent the spread of Covid-19: Topics in this theme included tweets relating to the impact of political measures to prevent disease spread (e.g. lockdown) on basic rights, social contact restrictions and compliance with these measures, the opening and closing of schools and communication and actions of political leaders.

  2. 2)

    Life during lockdown: Topics in this theme included tweets associated with caring for loved ones, homeoffice and leisure at home, information and digital technologies as well as environment and nature.

  3. 3)

    Infection-related issues: Topics in this theme were generated by tweets covering symptoms, testing and quarantine related to Covid-19, infection risk as well as gratitude for healthcare workers.

  4. 4)

    Impact of pandemic on public and private life: One topic in this theme comprised tweets related to a relatively broad discussions about the impact of the pandemic on the economy and society in conjunction with other global crises (e.g. climate change). Other topics related to panic buying, conspiracy theories and (mis)information, religion as well as the pandemic situation in other world regions and populations, e.g. refugee camps.

Table 2 Words associated with topics
Table 3 Topics and associations with emotions

Topics relating to the theme ‘Measures to prevent spread of Covid-19’ were mostly related to anger and anxiety, although discussions of political leaders were associated with positive emotions. The majority of topics belonging to the theme ‘life during lockdown’ were associated with positive emotions. Only tweets belonging to the topic of caring for loved ones were associated with anxiety and sadness. Topics included in the theme ‘infection’ were mostly associated with anxiety and sadness, except for the topic gratitude for healthcare workers, which was associated with positive emotions. Most topics covered by the theme ‘impact of pandemic on public and private life’ were associated with anxiety, anger or sadness.

Topics most strongly associated with either anger, anxiety, sadness or positive emotion comprised 18.6%, 29.7%, 5.3% and 26.5% of all tweets, respectively. The remaining 19.9% of tweets were assigned to topics not significantly related to any emotion.

Discussion

Non-pharmaceutical interventions such as social contact restrictions and lockdowns have proven effective in slowing the spread of Covid-19; however, little is known about the psychological responses of the public to the pandemic and efforts to control it (Hsiang et al. 2020). We found that rates of anxiety, sadness and positive emotion in Twitter statements decreased over a 1-month period following the introduction of social contact restrictions in German-speaking countries of Central Europe, suggesting reduced polarity of emotional expressions in tweets. Several topics emerged in Twitter discourse that were related to implications of the Covid-19 pandemic in general and to the introduction of social contact restrictions specifically.

Cautious interpretation of these results is warranted, as an individual’s use of words may not precisely reflect underlying psychological states. While several findings show that measures of subjective well-being or mental health are associated with differences in language use, other studies did not find a relationship between psychological measures and language use (Meier et al. 2019; Sonnenschein et al. 2018). Thus, results of linguistic analyses such as this study should be interpreted in conjunction with other indicators of public well-being. In this study, rates of words related to anxiety showed the largest decrease over time. This result may suggest a possible link between governmental efforts to control the pandemic and decreasing anxiety among the public (Luo et al. 2020). In Germany, for example, public approval of social contact restrictions was high at the time of their introduction and Twitter statements concerning political leaders were associated with more positive emotion in this study (Teufel et al. 2020). However, no causal link between governmental interventions and rates of emotions in Twitter statements can be established due to the observational design of this study and the lack of a control group.

It is also possible that decreasing rates of anxiety-related words in Twitter statements reflect a habituation or emotional numbness to pandemic stressors. Topics related to anxiety were most prevalent in this sample. These results corroborate other research showing heightened levels of anxiety in populations affected by Covid-19 (Wang et al. 2020). Congruent with findings from other studies, sources of anxiety and sadness were mainly related to the impact of the pandemic on family and social life, education, as well as infection, pointing to an need to address these public concerns (Mertens et al. 2020; Abd-Alrazaq et al. 2020). With regard to social contact restrictions, anxiety was associated with topics discussing potential negative impacts of both implementation (e.g. with regard to basic rights) and relaxation of social contact restrictions (e.g. re-opening of schools), which suggests nuanced public risk perceptions.

Personal statements expressing positive emotions declined steadily following the introduction of social contact restrictions. Psychological strain resulting from enduring social isolation in addition to the absence of directly observable effects of prevention efforts (e.g. avoiding increasing rather than reducing case-fatality rates) may have inhibited positive emotions. However, topics relating to positive emotions comprised approximately one-quarter of all tweets, indicating that social networks constitute an important source for people to share positive messages of coping with staying at home or solidarity (e.g. with healthcare workers).

Expressions of anger did not change in tweets posted after the introduction of social contact restrictions. Anger was related to social contact restrictions and compliance with these measures, but it remains unclear whether Twitter users expressed anger about having to comply with restrictions or about other people not complying. Anger was also related to statements surrounding the credibility of Covid-19-related information. Misinformation on social media has been identified as a major barrier for public health efforts to prevent disease spread, promoting unhealthy behaviour and erroneous practices that can ultimately harm physical and mental health (Tasnim et al. 2020). The association with anger observed in this study may indicate increasing public awareness and rejection of misinformation, but also highlights the need for concerted efforts strengthening the position of authentic sources of information.

Limitations

Several limitations to this study need to be considered. The target population of this study were Twitter users living in a country in German-speaking Central Europe, but we were not able to determine the exact origin of the majority of tweets. However, user-tagged country of origin indicated that the vast majority of tweets originated from users in Germany, Switzerland or Austria. The following issues specifically limit the generalizability of results: First, we used one of Twitter’s free data access point for which Twitter does not provide all tweets matching a certain search query but samples a collection of tweets with undisclosed underlying sampling scheme. Second, we used a hashtag-based search to identify relevant tweets, seeking to increase the relevance of tweets obtained via Twitter’s data access point. However, samples of tweets collected by hashtag-based searches may not be representative of discourse related to a specific topic (Bruns and Stieglitz 2014). Third, compared to the general population, Twitter users have been shown to be younger, higher educated and more politically liberal (Mellon and Prosser 2017). Last, there was indication that the vast majority of the tweets were posted by German users, followed by users from Switzerland and Austria (97.2% for the three countries combined), which limits the generalizability of results to other countries. Although the timing and stringency of social contact restrictions within Germany, Switzerland and Austria were largely comparable (Hale et al. 2020), different experiences within these populations may have also reduced generalizability of results across these countries.

Furthermore, the word count approach we used to estimate the prevalence of emotions in our sample did not allow identifying negations and other contextual shifters of word meanings (e.g. sarcasm). However, we calculated rates of emotions based on aggregated tweets only (i.e. at the overall or daily level), and thus single misclassifications are unlikely to bias overall results. Observed changes in rates of emotion in Twitter discourse may represent lagged effects of events other than the introduction of social contact restrictions. We were not able to collect data for 1 day in the study period. However, bias resulting from this minimal lack of data is likely to be negligible.

Conclusion

Polarity of emotions expressed by Twitter users decreased following the introduction of social contact restrictions in countries of German-speaking central Europe, mainly driven by decreases in anxiety. Findings further revealed nuanced emotional responses to different aspects of social contact restrictions and the pandemic in general. Providing the public with accessible rationales for both implementation and relaxation of social contact restrictions as well as reliable information on epidemiological and medical aspects of the virus may buffer negative emotions. Continuous monitoring of social media activity at a regional level can contribute to adaptive understanding and responses to changing public concerns over the course of the pandemic.