1 Introduction

Mental health significantly impacts individuals and society, with approximately 18.5% of adults in the U.S. experiencing some form of mental illness annually (Center for Behavioral Health Statistics and Quality, 2021). Understanding these impacts requires a multifaceted approach. Although mental health has long been considered as an individual health problem, previous research shows that some external factors can affect the mental health of a larger population collectively. For example, mental health is closely related to the places and neighborhoods where someone lives and works (Curtis, 2016), and health disparities among different races and neighborhoods are related to the mental health of the residents (Lee, 2009). Therefore, examining mental health factors beyond an individual’s socioeconomic factors through the lens of geography has been applied in previous studies.

Unexpected events causing negative emotions can impact a larger population, and certain groups of people may be affected more than others. For example, a tragic event such as a shooting can lead to post-traumatic stress disorder (PTSD) and negatively impact the mental health of not only the immediate victims but also the general population in the surrounding area. Studies have found that the stress levels of students and other individuals in the community can be affected by such events (Saha & De Choudhury, 2017; Bassett & Taberski, 2020). Furthermore, global events may impact mental health collectively, as seen during the COVID-19 pandemic (Moreno et al., 2020; Abbott, 2021; Boden et al., 2021; Li et al., 2022).

The U.S. Department of Health and Human Services and the Centers for Disease Control and Prevention (CDC) define mental health as including emotional, psychological, and social well-being. It is important to note that this research refers specifically to emotional well-being or the psychological aspect of mental health.

Emotional well-being plays a crucial role in overall health and well-being. Particularly, various studies in health psychology have found that negative emotions are associated with poor health, while positive emotions are associated with better health (Cacioppo, 2003; Smith et al., 2004; Richman et al., 2005). Furthermore, emotions also affect work productivity and other outcomes, which can have an impact on society. Therefore, decreased emotional well-being can lead to mental health concerns such as stress, depression, and anxiety, and emotional wellness is closely tied to both physical and mental health. Ultimately, emotional wellness is important for both individuals and society.

Mental health is a broad, complex, and intangible concept, making it difficult to measure (Chandu et al., 2020). Currently, mental health is often evaluated through self-report surveys, which consist of structured questionnaires. However, these surveys can be expensive and time-consuming and are typically only conducted on specific groups due to limited resources. Additionally, the data collected through these surveys may be outdated, making it difficult to make comparisons across large groups or cities.

Social media platforms such as Twitter, Facebook, and Instagram offer rich sources of data that researchers can analyze to study public perceptions and social phenomena (Salathe & Khandelwal, 2011; Tomeny et al., 2017). The language people use on these platforms can convey their feelings and opinions on various topics, including politics, social issues, and public health. Additionally, the real-time nature of social media data makes it useful for situation awareness, monitoring, and identifying behavior patterns and news trends. Numerous studies have shown that social media data can reflect a user’s health behaviors and can provide mental health information at a finer spatial and temporal resolution (Gruebner et al., 2017; Angskun et al., 2022; Shaughnessy et al., 2018; Aebi et al., 2021).

While many studies have used social media data to measure emotions or infer mental health status, most have focused on individual-level analysis (Chancellor & De Choudhury, 2020). That is, they have looked at the emotions and behaviors of individuals rather than groups or communities. Additionally, while social media has been widely used as a tool for social sensing, relatively little research has focused on developing analytic frameworks for quantifying emotions and behaviors at the group or community level. More research is needed in this area to build a more comprehensive understanding of how social media data can be used to study mental health and other social phenomena.

This study focuses on creating a daily-updated social sensing index at the city level using Twitter data to understand sentiments related to mental health. Sentiment and emotions have long been studied in relation to mental health. It is acknowledged that negative emotions, especially prolonged negative sentiment, can be indicators of mental health challenges. While mental health encompasses a broad spectrum, emotional well-being is undeniably a component of it. In our effort to create a social sensing index for mental health, we gauge negative sentiments expressed in tweets. This study therefore aims to provide insights into the emotional well-being of people living in different cities and how it changes over time, which ultimately can help to identify where attention is needed for preventive action on mental health. To achieve this, the study addresses three research questions:

RQ1. How could social media data be used for generating a daily-updated city-level social sensing index for monitoring collective mental health in a city?

RQ2. How could this social sensing index find where and when negative emotions are lingering?

RQ3. How could this social sensing index detect large emotional events automatically for helping public health professionals to take preventive action?

2 Related work

2.1 Emotion and health

Emotion, or feeling, is a natural, instinctive state of mind derived from one’s circumstances, mood, or relationships with others and includes happiness, sadness, anxiety, disgust, fear, surprise, and anger (Plutchik, 1991; Ekman and Cordaro, 2011). As a part of mental health, emotional well-being refers to the emotional quality of an individual’s everyday experiences (Kahneman & Deaton, 2010).

The emotional state of an individual can have both a direct and an indirect impact on their physical and mental health (Cacioppo, 2003; Smith et al., 2004; Richman et al., 2005). For example, emotions like stress, anxiety, and depression can contribute directly to a wide range of health problems such as a compromised immune system, high blood pressure, heart disease, insomnia, digestive issues, diabetes, and more. Moreover, negative emotions can cause other diseases by contributing to poor health behaviors such as smoking and excessive alcohol consumption.

Emotional well-being is influenced by demographic, environmental, and situational factors. Where people live and work also plays a role in their mental health, as our environment and our neighborhood can have a significant impact on our emotional well-being. For example, people who live near green spaces are more likely to have better mental health (Nutsford et al., 2013; Beyer et al., 2014). Guzman et al. (2021) demonstrated positive associations between green neighborhoods and better mental health and well-being during the COVID-19 pandemic. Conversely, studies have found that living in gentrified or segregated areas can cause more stress and can negatively impact mental health (Gibbons & Barton, 2016; Lim et al., 2017; Tran et al., 2020). Gibbons and Barton (2016) observed that the health of Black residents of gentrifying neighborhoods was more likely to be poorer compared to that of residents in other neighborhoods. Additionally, changes in neighborhoods, such as a rise in foreclosures and changes in collective efficacy, can have a negative impact on mental health. Downing’s (2016) study found that rapid neighborhood changes during the housing crisis, which resulted from an increasing number of foreclosures, have been linked to declines in mental health. Also, Ahern and Galea (2011) suggest that lower collective efficacy in community districts is associated with a higher prevalence of depression.

Local and global social events can affect people’s emotions collectively (Thelwall et al., 2011; Paltoglou, 2016; Yang & Ma, 2020). McCann and Pearlman (1990) define the term “vicarious trauma” to explain how therapists may experience their clients’ trauma. The cognitive beliefs of these therapists may be negatively affected and generally disrupted (Dunkley & Whelan, 2006). This indicates that even people who do not directly experience the event causing a negative impact may be similarly affected by the event.

Local events may affect their immediate communities more than other communities, and people in the local community are generally more negatively influenced. For example, an unexpected tragic event may affect not only those who experience it directly but also those who are not directly affected (Gaspar et al., 2016; Saha & De Choudhury, 2017). In this case, where the event happened or how a person is related to the event (for example, by belonging, gender, or race) may affect their mental health. Saha and De Choudhury (2017) examined how college students were affected by tragic incidents involving gun violence on college campuses; their study shows that the stress levels of college students increased after the shootings. Similarly, Bassett and Taberski (2020) suggest that, after a tragic shooting, many people’s stress levels changed and they struggled.

Global events may impact people’s mental health collectively. In January 2020, the first case of COVID-19 was reported in the United States. Needless to say, the COVID-19 pandemic has had a significant impact on mental health. The uncertainty and fear caused by the virus, as well as by the restrictions imposed by lockdowns, have contributed to increased levels of stress, anxiety, and depression among people all over the world. Many studies have confirmed this and have suggested that the pandemic has had a negative impact on mental well-being (Moreno et al., 2020; Abbott, 2021; Boden et al., 2021; Geirdal et al., 2021; Płomecka et al., 2020; Chandu et al., 2020; Wang et al., 2022; Mengin et al., 2022; Bailon et al., 2020).

Moreover, stress and negative emotions that accumulate over time can have a significant impact on long-term health. Studies have shown that people who experience large amounts of stress and negative emotions for a prolonged period are more likely to have health issues (Leger et al., 2018, 2020). For example, Leger et al. (2018) analyzed a nationwide survey and their results indicate that lingering negative feelings over daily stressors may impact people’s long-term health. Given the longer-term mental health effects of COVID-19, community monitoring and mental health screening in selected groups is crucial (Moreno et al., 2020). This highlights the importance of monitoring and addressing negative emotions before they develop into mental disorders. Understanding the link between emotions and health can help in the development of interventions and strategies for promoting emotional well-being and for preventing or managing health problems. It is important to manage and reduce negative emotions to maintain good mental and physical health. Therefore, there is a need to develop creative ways to monitor the level of well-being in the population and to identify early signs of negative sentiments and emotion in a community.

2.2 Social sensing via social media data

The term ‘social sensing’ is used to refer to the approach of collecting and analyzing data from diverse devices to gain insights regarding human behaviors and trends (Wang et al., 2015). With the development of smartphones and other mobile sensors in the Big Data era, individuals work as sensors by generating traits of human behaviors and socio-environmental information (Goodchild, 2007; Liu et al., 2015). Social sensing studies using social media data to examine urban dynamics and human behaviors have flourished in the last decade. Social media data can contain a broad range of topics from ordinary communication to trending social issues. Additionally, the spatial information of social media data can be obtained either directly through its metadata or indirectly through the user’s profile or text (Tsou, 2015). This user-generated data provides great potential to understand our society, especially in urban areas since it reflects human behaviors. Therefore, it can be used to support current methods of information gathering such as surveys for investigating diverse aspects of society.

Social media data captures diverse perspectives on public opinions, making it a valuable tool for understanding social issues in a city. Additionally, social media can provide real-time information and serve as a social sensing tool. For example, many studies have shown that citizens act as sensors in the response to natural disasters such as earthquakes (Sakaki et al., 2010; MacEachren et al., 2011), hurricanes (Ukkusuri et al., 2014), fires (Craglia et al., 2012; Wang, Ye, & Tsou, 2016), and floods (Fuchs et al., 2013; Fohringer et al., 2015; Li et al., 2018), informing recovery and emergency response efforts.

Additionally, social media data has been used to shed light on various topics in public health, including health monitoring (such as detecting disease outbreaks), public perception of health phenomena, environmental factors related to health status, mental health, and subjective well-being (SWB). Studies have used social media data to identify outbreaks of diseases such as flu (Lampos et al., 2010; Nagel et al., 2013), H1N1 influenza (Signorini et al., 2011), and dengue fever (Chunara et al., 2012; Gomide et al., 2011). These studies show a correlation between tweets that mention illness symptoms and the actual incidence. Considering that the existing influenza surveillance system may take up to two weeks to identify an outbreak, Twitter data can be timelier than traditional surveillance methods. In another study, Eichstaedt et al. (2015) analyzed language usage on Twitter to predict heart disease mortality.

Regarding public perception of health phenomena, two studies show that public perception of and sentiment toward vaccination correlate with the demographic composition or the vaccination coverage rate of geographical regions (Salathe & Khandelwal, 2011; Tomeny et al., 2017). Kim et al. (2017) investigated how social media played a role in public perception and risk communication using the 2015 Middle East respiratory syndrome (MERS) case in South Korea. Their findings suggest that both social networks and spatial proximity to where users live are relevant to spatiotemporal patterns of public interest.

A person’s use of language conveys their feelings and emotions directly and this can be used as an estimation of mental health conditions. Coppersmith et al. (2014) utilized the Linguistic Inquiry and Word Count (LIWC) tool to assess language use on social media. Their findings revealed statistically significant differences in language use between control users and those diagnosed with depression. Notably, the depression group exhibited a higher frequency of negative emotion words compared to the control group. Similarly, De Choudhury, Gamon, et al. (2013) observed amplified negative emotions in individuals with depression.

Therefore, many studies have aimed to quantify mental health by extracting emotional descriptions from social media and other text data. These studies have used methods such as machine learning, sentiment analysis, and Natural Language Processing (NLP) to analyze social media data and gain insights into people’s mental and emotional states, showing the potential of social media data for investigating quality of life, SWB, and mental health. For example, Yang and Mu (2015) analyzed tweet contents and spatial patterns to detect users’ depression symptoms and found that using social media for detecting depression can be more efficient than current questionnaires or self-report methods since its language occurs in a natural setting. Similarly, Wang, Hernandez, et al. (2016) analyzed weekly trends in work stress and emotions and found that work stress and negative emotion expressed on Twitter drop on Fridays. Other relevant studies, such as Hao et al. (2014), Coppersmith et al. (2014), Gaspar et al. (2016), and Shaughnessy et al. (2018), have used different methods and techniques to quantify and analyze social media data and provide insights into mental health and well-being. More recently, Aebi et al. (2021) demonstrated that big data such as social media data can be used for monitoring the mental health consequences of COVID-19 despite existing ethical, legal, and validity concerns regarding big data.

While the majority of studies are focused on individual-level analysis, one of the few examples of using Twitter data to measure the sentiment of a larger population is Hedonometer (Dodds et al., 2011). Hedonometer uses NLP and sentiment analysis to assign a happiness score to each tweet and then aggregates these scores to create a daily happiness index for a given population. Hedonometer can be used to show daily patterns in happiness and detect relevant events that cause spikes in happiness. Mitchell et al. (2013) measured happiness across the U.S. and Schwartz et al. (2013) investigated variation in well-being and detected which words can indicate life satisfaction by using Hedonometer. The authors found that word use gives additional predictive accuracy above the socioeconomic and demographic variables (age, sex, ethnicity, income, and education) in predicting life satisfaction.

Unlike Hedonometer, which uses only positive words to measure happiness, this study focuses on measuring negative emotions. A previous study demonstrated that people express negative emotions more predominantly on Twitter (De Choudhury et al., 2012). Moreover, in terms of scale, Hedonometer collects data from all over the world and presents overall global trends. Therefore, it does not provide information about local areas such as cities. Given that local events may affect the emotions of people in the surrounding area, it is worth looking at sentiment trends in smaller areas. Similarly, De Choudhury, Counts, and Horvitz (2013) developed a social media depression index (SMDI) by using a prediction model on Twitter data. SMDI uses diverse features such as emotions, language, social relationships, and behaviors on Twitter to determine whether a post is indicative of depression. The result of the model is correlated with actual CDC statistics. This research is valuable in that the authors considered diverse perspectives of depressed users and applied diverse features to build the model. However, it focuses more on the accuracy of the model than on situational awareness of what may be causing negative responses and how they change over time. Relevant research by Nguyen et al. (2016) aggregated tweets that indicated happiness, diet, and physical activity into census tracts and compared them with socioeconomic information and demographic data.

Overall, the studies mentioned above demonstrate that social media data can be a valuable resource for investigating individual- and population-level health behaviors and mental health conditions more quickly and cost-effectively than traditional survey-based methods.

3 Study area and data

3.1 Study area

Data was collected from four cities in the U.S. (Atlanta, Denver, San Diego, and Seattle), chosen for their geographical locations and demographics. Each city has its own history, culture, and demographic composition, and these all play a role in the overall response to an event and thus affect certain communities differently.

According to the U.S. Census 2020, the largest ethnic group in Atlanta, GA is Black or African American (48.20%), while White alone (Not Hispanic) (54.00%) is the largest ethnic group in Denver, CO. In San Diego, CA, Hispanic or Latino (30.10%) is the second largest ethnic group following White alone (Not Hispanic), while in Seattle, Asian (16.30%) is the second largest group (see Table 1).

Table 1 Population of each city by race (%)

3.2 Social media data

3.2.1 Data collection

Twitter data was collected using a set of mental health-related keywords between April 15, 2020 and June 03, 2020 from four cities. For each city, we set a centroid point and a 20-mile radius covering most of the urban area, and the search API collected tweets within this area. Following Shatte et al.'s (2019) comprehensive review of machine learning applications in mental health research, we focused on keywords related to mental health, with a particular emphasis on depression, stress, and anxiety. These conditions were chosen due to their close association with negative emotions, in contrast to conditions like schizophrenia or dementia. Our set of mental health-related keywords encompasses general mental health and symptoms associated with stress and depression, including 'mental health, mentalhealth, depression, depress, depressed, lonely, sad, anxiety, stress, stressed, stressed out, feel drained, mentally exhausted, sleep disorder.' We conducted keyword testing to ensure the collection of relevant tweets. As a result, a total of 1,501,018 tweets were collected across the four cities over the period between April 13, 2020, and June 03, 2020.

3.2.2 Data filtering and preprocessing

Social media data analysis requires data filtering to reduce noise that can affect the results. In this study, retweets were excluded, and tweets generated via third-party platforms were filtered out, following the method used in previous studies (Tsou et al., 2018; Han et al., 2019; Park & Tsou, 2020). The ‘source’ field in Twitter data indicates the platform through which a tweet was created. According to Tsou et al. (2017) and Han et al. (2019), tweets that are not meaningful for the analysis, such as job postings and traffic and weather broadcasts, are mostly generated by non-human users (i.e., robots). Third-party platforms often offer automatic generating functions, which can be utilized to create tweets. Therefore, tweets generated via third-party platforms are likely to be promotional or company tweets, and these can be excluded by restricting the ‘source’ field. In this study, only tweets generated from Twitter applications such as ‘Twitter for iPhone’ are analyzed.

Table 2 shows the number of identified tweets of each type and the number of unique users in each city. Approximately 6% of tweets on average were excluded from each city after filtering, resulting in a total of 1,444,474 tweets for analysis.

Table 2 Number of tweets and Unique users in cleaned dataset

After the filtering steps mentioned above, NLP is used to preprocess the text for sentiment analysis, including removing handles, URLs, and tweets with fewer than two words.
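The filtering and preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the record fields (`text`, `source`, `is_retweet`), the `ALLOWED_SOURCES` whitelist, and the regular expressions are assumptions introduced here, since the text names only ‘Twitter for iPhone’ as an example of an accepted source.

```python
import re

# Hypothetical whitelist of official Twitter clients (assumption); the study
# names only 'Twitter for iPhone' as an example of an accepted source.
ALLOWED_SOURCES = {"Twitter for iPhone", "Twitter for Android", "Twitter Web App"}

HANDLE_RE = re.compile(r"@\w+")                 # user handles such as @name
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # http(s) and www links


def clean_text(text):
    """Remove URLs and handles, then collapse repeated whitespace."""
    text = URL_RE.sub(" ", text)
    text = HANDLE_RE.sub(" ", text)
    return " ".join(text.split())


def filter_and_preprocess(tweets):
    """Keep non-retweets from allowed sources that retain at least two words."""
    kept = []
    for tweet in tweets:
        if tweet.get("is_retweet"):
            continue  # retweets are excluded
        if tweet.get("source") not in ALLOWED_SOURCES:
            continue  # third-party platforms are filtered out
        text = clean_text(tweet.get("text", ""))
        if len(text.split()) < 2:
            continue  # drop tweets with fewer than two words
        kept.append({**tweet, "text": text})
    return kept
```

Restricting by source before cleaning the text keeps the more expensive text processing to the tweets that will actually be analyzed.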

4 Methodology

4.1 Sentiment Analysis

Sentiment analysis methods can be categorized into three classes: lexicon-based methods, supervised machine learning methods, and deep learning methods (Beigi et al., 2016; Ribeiro et al., 2016). Lexicon-based methods such as LIWC (Pennebaker et al., 2001), Affective Norms for English Words (ANEW) (Bradley and Lang, 1999), and AFINN, which was developed by Finn Årup Nielsen (Nielsen, 2011), use a dictionary of words to assign sentiment scores. Unlike supervised machine learning methods, lexicon-based methods do not require labeled data. Among the many supervised machine learning methods, Support Vector Machine, Random Forest, and Naive Bayes have been widely used in previous studies (Chancellor & De Choudhury, 2020). More recently, several deep learning methods have been applied. The prediction performance of sentiment analysis can vary significantly with the method and the data quality (Beigi et al., 2016; Shatte et al., 2019). Certain lexicon-based methods, such as VADER and TextBlob, perform better with social media data (Felmlee et al., 2020; Mujahid et al., 2021). Moreover, these methods are not affected by a training dataset, making them domain non-specific (Sanyal & Barai, 2021). In this research, we employ TextBlob, a Python library, for sentiment analysis. This lexicon-based method assigns sentiment values as polarity ranging from -1 to 1, with 1 being the most positive, 0 representing neutral sentiment, and -1 indicating the most negative sentiment (Loria et al., 2018).

4.2 Computing a daily city-level sentiment and changes

4.2.1 Considering users tweet multiple times

To ensure a more balanced daily sentiment score, we consider the Pareto principle, which indicates that a few users can dominate social media engagement. Without this consideration, all tweets from the same user would be counted individually. Therefore, tweets from a single user would each carry equal weight, potentially overemphasizing their emotions. For example, if someone posted several tweets with negative feelings in one day, that does not mean multiple users feel that way; it is just that person's fluctuating emotions. Recognizing this, we adjust our methodology: if a user tweets multiple times in a day, we compute their average sentiment for that day. To derive the overall average daily sentiment score, we then average the sentiment scores of all individual users. This approach ensures that active tweeters do not disproportionately skew the overall sentiment for the day.

As illustrated in Table 3, user 'A' tweeted three times on May 16, 2020. Without accounting for these multiple tweets from this user, the daily average sentiment score stands at -0.5924. In contrast, our adjusted method produces a score of -0.6426.
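The two-step averaging can be sketched as below. The polarity values are hypothetical and are not the values in Table 3; the function name `daily_sentiment` and the `(user_id, polarity)` input shape are likewise assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def daily_sentiment(tweets):
    """Average per user first, then across users.

    `tweets` is an iterable of (user_id, polarity) pairs for one city and
    one day (an assumed input shape, not the study's data format).
    """
    per_user = defaultdict(list)
    for user_id, polarity in tweets:
        per_user[user_id].append(polarity)
    user_means = [mean(scores) for scores in per_user.values()]
    return mean(user_means)


# Hypothetical day: user A tweets three times, users B and C once each.
day = [("A", -0.8), ("A", -0.2), ("A", -0.5), ("B", -0.7), ("C", -0.9)]
naive = mean(polarity for _, polarity in day)   # every tweet weighted equally
adjusted = daily_sentiment(day)                 # every user weighted equally
```

With these made-up values the tweet-weighted mean is roughly -0.62 and the user-weighted mean roughly -0.70, which shows how a single active user can pull the unadjusted score toward their own average.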

Table 3 Example of tweets and daily average sentiment

4.2.2 Categorizing daily sentiment into 7 levels

We categorize the daily sentiment scores into seven levels based on their deviation from the mean, measured in standard deviations, adapting the method originally used by Dodge et al. (2009) for trajectory segmentation. Level 4 covers scores within 0.5 standard deviations of the mean. Levels 3 and 5 cover scores between 0.5 and 1.5 standard deviations above and below the mean, respectively. Level 2 ranges from 1.5 to 2.0 standard deviations above the mean, and Level 6 from 1.5 to 2.0 standard deviations below it. Level 1 comprises scores more than two standard deviations above the mean, and Level 7, the most extreme negative sentiment, comprises scores more than two standard deviations below it. This approach enables a nuanced classification of sentiment intensity and is used in visualizing the social sensing index for mental health (see Fig. 5).
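A minimal sketch of this seven-level binning is given below. How scores that fall exactly on a boundary are assigned is not stated in the text, so the inclusive and exclusive comparisons here are an assumption, as is the function name `sentiment_level`; the mean and standard deviation are passed in so that the caller decides which period they are computed over.

```python
def sentiment_level(score, mu, sigma):
    """Map a daily sentiment score to a level from 1 (most positive) to 7.

    `mu` and `sigma` are the mean and standard deviation of the daily
    scores. Boundary handling is an assumption of this sketch.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (score - mu) / sigma  # deviation from the mean in standard deviations
    if z > 2.0:
        return 1
    if z > 1.5:
        return 2
    if z > 0.5:
        return 3
    if z >= -0.5:
        return 4
    if z >= -1.5:
        return 5
    if z >= -2.0:
        return 6
    return 7
```

Expressing the thresholds in standard deviations rather than raw polarity keeps the levels comparable across cities whose scores have different spreads.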

4.2.3 Categorizing degree of sentiment change into 5 levels

Sentiment scores fluctuate daily, but significant changes can indicate an event arousing emotion. These sudden changes can be detected simply by looking at the day-to-day variation in sentiment scores over time. However, the absolute value of sentiment analysis results can vary depending on the tool used. Therefore, we use changes in daily sentiment scores to detect periods where negative emotions are high. Two components of sentiment change are considered: direction and degree. The direction refers to whether the sentiment score has increased or decreased from the day before, while the degree refers to the magnitude of the change. We visualize changes using circles on a line graph (see Figs. 2 and 3). The size of the circle represents the degree of change, with larger circles indicating a bigger change. The color of the circle indicates the direction of change, with blue meaning an increase and red meaning a decrease. To compute entropy, we categorize sentiment changes into five levels based on the mean and standard deviation of sentiment change, as shown in Fig. 4. Changes within the mean are classified as Level 3, indicating a moderate degree of change. Changes that exceed the mean but fall below one standard deviation above it are classified as Level 2 when sentiment increases and Level 4 when it decreases. The most pronounced shifts, those extending beyond one standard deviation above the mean, are classified as Level 1 for increases and Level 5 for decreases. Therefore, Level 1 (green) represents large increases in sentiment, while Level 5 (red) indicates large decreases. This system provides a structured interpretation of sentiment fluctuations.
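One way to read the five-level scheme is sketched below. It assumes that the mean and standard deviation are taken over the magnitudes (absolute values) of the day-to-day changes and that the population standard deviation is used; the text and Fig. 4 may define the thresholds differently, so this should be treated as an illustration only. The function name `change_levels` is hypothetical.

```python
from statistics import mean, pstdev


def change_levels(daily_scores):
    """Classify each day-to-day change into levels 1 to 5.

    Assumption: thresholds are the mean and mean + one standard deviation
    of the absolute changes. Requires at least two daily scores.
    """
    changes = [b - a for a, b in zip(daily_scores, daily_scores[1:])]
    magnitudes = [abs(c) for c in changes]
    mu = mean(magnitudes)
    sigma = pstdev(magnitudes)

    levels = []
    for change in changes:
        size = abs(change)
        if size <= mu:
            levels.append(3)                        # within the mean
        elif size <= mu + sigma:
            levels.append(2 if change > 0 else 4)   # moderate rise or fall
        else:
            levels.append(1 if change > 0 else 5)   # large rise or fall
    return levels
```

For example, a series that hovers around zero and then drops sharply yields mostly Level 3 changes followed by a single Level 5 on the day of the drop.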

4.3 Entropy

Entropy was originally defined in physics by Rudolf Clausius to explain the behavior of thermal systems, and was later introduced into information theory by Claude Shannon (Cabral et al., 2013). Shannon’s entropy measures uncertainty and randomness in a message or system. Entropy has been used in diverse domains such as geography, urban studies, and machine learning. In geography and urban studies, the entropy index is used to measure the diversity or heterogeneity of land use within a region and to monitor urban sprawl (Kockelman, 1997; Cabral et al., 2013). In data analysis, entropy is often used as a measure of the amount of information contained in a dataset, or the degree to which the data is organized or predictable. Accordingly, entropy is used in machine learning as a measure of the impurity of a set of data. In a decision tree algorithm, entropy determines which attribute should be used to split the data at each node, with the split that most reduces entropy being chosen (Du et al., 2011; Hu et al., 2011).

This study adopts the concept of entropy for anomaly detection in the degree of sentiment changes. Sentiment fluctuates and small degrees of change are frequent while big changes are rare. Entropy can be used to detect unusual events in a dataset by measuring the amount of information and randomness in the data. When an unusual event occurs, it is likely to introduce new and unexpected information into the data, which can be detected by an increase in entropy. By measuring the entropy of the degree of sentiment change on each day, it is possible to detect sudden changes or anomalies that indicate the occurrence of an unusual event.

4.3.1 Computing entropy

In the context of sentiment changes, we measure the level of uncertainty or unpredictability of the change in sentiment over time by using entropy. Shannon’s entropy formula is:

$$H(x)=-\sum_{i=1}^{n}P\left({x}_i\right)\,{\log}_2P\left({x}_i\right)$$

where H is the entropy of the system, \(P\left({x}_i\right)\) is the probability of a single event \({x}_i\), the summation runs over all n possible events (i = 1 to n), and \({\log}_2\) is the base-2 logarithm. We compute entropy based on the degree of sentiment changes, where \(P\left({x}_i\right)\) represents the probability of a specific degree of sentiment change occurring. High entropy values indicate a high degree of unpredictability or randomness in sentiment change, while low entropy values indicate more predictable changes. When a predictable event happens, sentiment scores may not change significantly; the outcome is unsurprising, and entropy is low. However, a highly unexpected event may cause large changes in sentiment scores, which are surprising and therefore yield higher entropy values.
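As an illustration of the formula, the sketch below computes Shannon entropy from a sequence of change levels, estimating each probability as the relative frequency of that level in the sequence. The text does not spell out the set of days over which the probabilities are estimated, so the input is simply whatever sequence of levels the caller supplies; the function name `shannon_entropy` is an assumption.

```python
from collections import Counter
from math import log2


def shannon_entropy(levels):
    """Shannon entropy (in bits) of the empirical distribution of `levels`.

    Each probability P(x_i) is estimated as the share of observations
    that fall in level x_i.
    """
    counts = Counter(levels)
    total = len(levels)
    return -sum((n / total) * log2(n / total) for n in counts.values())
```

A sequence in which every change has the same level has zero entropy, while a sequence spread evenly over four levels has an entropy of 2 bits, matching the interpretation that more varied degrees of change are less predictable.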

By examining periods where entropy is low, indicating that there are no large changes in sentiment, it is possible to detect where negative emotions persist. This information can be obtained by looking at the category of the average sentiment scores in these periods.

In this study, the focus is on the change in sentiment scores over time rather than the actual sentiment scores themselves. The sentiment scores may vary depending on the method used to measure sentiment, and therefore, the goal of the analysis is to examine the daily changes in sentiment and measure the unpredictability or randomness of these changes using entropy.

4.4 Generating a social sensing index for mental health

The entropy is based on the probability of the degree of sentiment changes in the dataset and it does not indicate whether the changes are positive or negative. Therefore, the social sensing index for mental health comprises both the entropy value and the sentiment level value. We define a social sensing index for mental health \({SI}_d\) as a vector composed of two components, entropy and sentiment level on a particular day \(d\). The index \({SI}_d\) is defined as follows:

$${SI}_d=\left[{E}_d,{S}_d\right]$$

where \({E}_d\) represents the entropy value for the degree of sentiment change on day \(d\), and \({S}_d\) represents the sentiment level value from 1 to 7 on day \(d\). The vector \({SI}_d\) provides a single summary measure that captures the overall entropy of degree of sentiment change and sentiment level for that particular day. This index can be used to track changes in the entropy and sentiment level values over time, identify patterns or trends in the data, and compare the values across different days and cities.
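A minimal sketch of how the two-component index could be represented in code is given below. The class name, function name, day keys, and numeric values are all hypothetical and serve only to show the pairing of an entropy value with a sentiment level from 1 to 7.

```python
from typing import NamedTuple


class SocialSensingIndex(NamedTuple):
    """Two-component index for one day: SI_d = [E_d, S_d]."""
    entropy: float        # E_d: entropy of the degree of sentiment change
    sentiment_level: int  # S_d: sentiment level, 1 to 7


def build_index(entropy_by_day, level_by_day):
    """Pair each day's entropy with its sentiment level."""
    index = {}
    for day, entropy in entropy_by_day.items():
        level = level_by_day[day]
        if not 1 <= level <= 7:
            raise ValueError(f"sentiment level out of range on {day}: {level}")
        index[day] = SocialSensingIndex(entropy, level)
    return index


# Hypothetical values for two days in one city.
entropy_by_day = {"day-1": 0.68, "day-2": 3.0}
level_by_day = {"day-1": 4, "day-2": 6}

index = build_index(entropy_by_day, level_by_day)
```

Keeping the two components separate, rather than collapsing them into one number, preserves the distinction between how unusual a change is and how negative the sentiment is.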

5 Results

5.1 Temporal variation of sentiment

Figure 1 displays the changes over time in the daily average sentiment for each city. The fluctuations observed in the daily average sentiment scores of all cities are possibly related to particular events, such as the arrests of Ahmaud Arbery’s murderers on May 7, protests and riots following George Floyd’s death on May 25, and the riots in several cities on May 30. The daily average sentiment scores in all cities have declined since George Floyd died. These patterns suggest that daily average sentiment scores can indicate the sentiment surrounding these events.

Fig. 1

Daily average sentiment change among four cities (Atlanta, Denver, San Diego, and Seattle)

Figure 1 reflects a dip in sentiment observed in San Diego on May 16, the date of the Open San Diego riot. This suggests the occurrence of a local incident or response that was unique to San Diego and not reflected in other cities. The absence of this sentiment in other cities indicates that local events may influence sentiment differently across various locations. This highlights the importance of incorporating the local context when analyzing sentiment, to gain a more accurate understanding of people’s perspectives in different locations.

5.2 The degree of sentiment changes

Of the four cities, the results of Atlanta and San Diego are presented for specific reasons. First, we prioritize Atlanta, which has a substantial Black population, given that the victims in two major events—the arrest of Ahmaud Arbery's murderers and the tragic death of George Floyd—were Black individuals. Second, we select San Diego because it was the site of a locally detected event, the Open San Diego riot, which was not found in other cities.

Figures 2 and 3 illustrate the fluctuations in sentiment and the extent of changes observed in Atlanta and San Diego. On the right side of each figure, the color bar represents sentiment levels categorized into seven levels: Level 1 indicates the least negative sentiment and Level 7 indicates the most negative sentiment. For each date, blue and red dots indicate positive and negative changes in sentiment, respectively. The size of each dot represents the magnitude of change.

Fig. 2

Average daily sentiment and degree of sentiment change in Atlanta (Red dots represent a decrease in sentiment, while blue dots represent an increase. The size of the dots reflects the degree of change, with larger dots indicating a higher degree of change and smaller dots representing a smaller degree of change. The sentiment level is represented by the color bar on the right side)

Fig. 3

Average daily sentiment and degree of sentiment change in San Diego (Red dots represent a decrease in sentiment, while blue dots represent an increase. The size of the dots reflects the degree of change, with larger dots indicating a higher degree of change and smaller dots representing a smaller degree of change. The sentiment level is represented by the color bar on the right side)

As illustrated in Figs. 2 and 3, several significant incidents greatly impact sentiment in various cities on specific dates. On May 7, the arrest of Ahmaud Arbery's murderers correlates with a noticeably more negative shift in sentiment in Atlanta than in San Diego. Then on May 16, in response to a local reopen riot, San Diego experienced a more pronounced negative sentiment than other cities; this is represented by a larger and redder dot. Moreover, on May 26, both cities recorded a significant decline in sentiment, a likely reaction to George Floyd's death on May 25, which elicited strong negative responses worldwide.

Beyond the dates of these specific events, there are days with notable fluctuations not tied to a single event. These variations often reflect a mix of global happenings, local incidents, or reactions to political views, especially sentiments related to President Trump. Notably, these spikes tend to be more pronounced during weekends, possibly indicating heightened feelings of sadness or stress due to quarantine measures and event cancellations. For instance, on April 19, multiple events led to pronounced public reactions. In San Diego, the closure of walking trails and beaches sparked widespread dissent, resulting in protests advocating for their reopening. At the same time, a tragic shooting in Nova Scotia, Canada added to the significant spikes observed on that day.

The heatmap shown in Fig. 4 displays the degree of sentiment change by level. Levels 1 and 2 reflect a rise in sentiment, while Levels 4 and 5 signify a decline. Level 3 occurs more frequently than the other levels, indicating that sentiment changes were typically small. On May 25, there was a substantial decline (Level 4) in sentiment in Atlanta. On May 26, this decline extended to cities in the western and Pacific regions, with Level 4 and Level 5 sentiment changes observed in Denver, San Diego, and Seattle. Note, however, that the heatmap represents only the degree of sentiment change per day and does not provide information on the original sentiment scores.

Fig. 4

Heatmap of daily degree of sentiment change (The color of the bars indicates Sentiment Change Levels 1 through 5, with Level 1 (green) indicating a highly increased sentiment, Level 3 (yellow) indicating a small increase or decrease, and Level 5 (red) indicating a large decrease)

5.3 Social sensing index for mental health

Examining daily average sentiment scores can help detect unusual events, but this method may not be sufficient to identify periods when sentiments remain low for several days. Figure 5 shows the Social Sensing Index for mental health. This index has two components: entropy and sentiment level. The bar graph displays the entropy of the degree of sentiment changes in Atlanta and San Diego.

Fig. 5

Social sensing index for mental health (The bar graph displays entropy values, and the sentiment levels are indicated by different colors. The darker color of Level 7 represents the most negative sentiment)

Sentiments fluctuate, and therefore, small degrees of change are common. Entropy values are lower for changes that are more predictable and higher for those that are less predictable. Therefore, a higher entropy value on a specific day compared to the value on a previous day indicates that the degree of change in sentiment is unusual, or less predictable.

By analyzing the entropy value along with the colors in the bar graph that represent the sentiment level on a given day, one can determine whether the event that caused an increase in entropy is linked to negative emotions. This information can be useful in identifying significant events and in understanding their impact on the sentiment in a city.

Examining entropy values with sentiment level can be even more beneficial in detecting persistent negative emotions. Higher entropy with darker bar colors can indicate an unusual change that aroused severe negative sentiment. Following these unusual events, low entropy with darker colors indicates lingering negative sentiments.
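The reading rule described above (a rise in entropy with a dark bar, followed by low entropy with dark bars) can be sketched as a simple labeling pass. This is an illustrative sketch only: the function name, the label strings, and both thresholds (`negative_level` and `low_entropy`) are assumptions introduced here and are not values reported in this study.

```python
def label_days(index, negative_level=5, low_entropy=0.5):
    """Label each day from (entropy, sentiment_level) pairs in day order.

    negative_level and low_entropy are hypothetical thresholds.
    """
    labels = []
    seen_spike = False
    prev_entropy = None
    for entropy, level in index:
        negative = level >= negative_level
        rising = prev_entropy is not None and entropy > prev_entropy
        if rising and negative and entropy > low_entropy:
            # Entropy rose versus the previous day while sentiment is dark.
            labels.append("unusual-negative")
            seen_spike = True
        elif seen_spike and negative and entropy <= low_entropy:
            # Low entropy with dark sentiment after an earlier spike.
            labels.append("lingering-negative")
        else:
            labels.append("none")
            if not negative:
                seen_spike = False  # sentiment recovered; end the episode
        prev_entropy = entropy
    return labels


# Hypothetical five-day sequence of (entropy, sentiment level).
days = [(0.4, 3), (2.0, 6), (0.3, 6), (0.3, 6), (0.4, 2)]
labels = label_days(days)
```

In practice the thresholds would need to be chosen from the observed distribution of entropy values and sentiment levels in each city.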

The Social Sensing Index operates at the city level and is designed to compare sentiments across cities and to identify events specific to each city. As previously mentioned, this index effectively pinpoints a riot in San Diego on May 16, a unique event not observed in other cities; in Fig. 5, this event is indicated by the heightened entropy exclusively in San Diego. The dark color of the bar highlights the strong negative sentiment associated with this isolated event. Furthermore, the Social Sensing Index can identify nationwide issues. For instance, on May 26, higher entropy was observed in both cities; after May 26, relatively negative sentiments persisted with low entropy values for several days in both cities. George Floyd's death and the subsequent protests against police brutality may be linked to this lingering negative emotion.

6 Discussion and conclusions

In this study, we propose a social sensing index for mental health using Twitter data. We compute the daily changes in sentiment and the entropy of these changes to generate a daily updated city-level social sensing index for mental health; we visualize the social sensing index for mental health using the entropy values and sentiment categories.

We find that unusual events such as the death of George Floyd, a riot in San Diego, and protests against police brutality arouse negative emotions that result in higher entropy values and indicate severe negativity. This is particularly true of the Open San Diego riots, which the index did not detect in other cities. Therefore, the index helps detect unusual events, both local and across multiple cities. Additionally, it enables us to detect the persistence of negative sentiment after such events, as with that persisting since the death of George Floyd.

We acknowledge the limitations of this study, including the limited representativeness of social media users, which can result in biased user groups. Twitter users in the U.S. tend to be younger, more racially diverse, more educated, more likely to be Democrats, and to have higher incomes than the general public (Pew Research Center, 2012; Pew Research Center, 2019).

Moreover, data collection using a limited number of keywords results in limited data. Validation is another challenge inherent to social media data, which differ from traditional data. Further, researchers find patterns in massive amounts of data that were not previously available. The dynamic and fast-changing nature of the index in this study, coupled with its suitability for long-term analysis, makes it difficult to validate. Furthermore, the impact of negative emotions on mental health may not present immediately, and it is unclear how long negative emotions must persist to affect mental health.

Therefore, future research should address validity. Potential datasets such as suicide rates, hospitalizations, and emergency department visits related to mental health can be compared to show correlation with the index.
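As a rough sketch of what such a comparison might look like, the snippet below computes a Pearson correlation between a weekly summary of the index and an external weekly series. Both series are invented for illustration; the variable names, the weekly aggregation, and the choice of Pearson correlation are assumptions rather than a validation procedure carried out in this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sd_x * sd_y)


# Hypothetical weekly data: days flagged as negative by the index, and
# mental-health-related emergency department visits in the same city.
weekly_negative_days = [1, 0, 4, 5, 3, 2]
weekly_ed_visits = [110, 105, 140, 150, 131, 120]

r = pearson_r(weekly_negative_days, weekly_ed_visits)
```

Because health outcomes may lag behind sentiment, a real analysis would likely also examine lagged versions of the external series.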

Regardless of limitations, this study makes three key contributions to the growing body of research that uses social media data to examine mental health at the city level: First, unlike most previous studies, this study focuses on mental health at a city level rather than the individual level. A city-level index that is updated daily can provide a broader perspective on the mental health of a population. This is particularly important for public health professionals, who need to understand the mental health of a population as a whole to plan appropriate interventions for communities.

Second, the study uses entropy to generate a daily updated social sensing index for mental health. A key challenge in using social media data to study mental health is the quantification of emotions. To address this challenge, this study applies entropy analysis to the degree of sentiment changes. The use of entropy provides a unique way to detect unusual sentiment changes and to monitor overall sentiment changes in a city. By using color bars to indicate seven categories of sentiment along with entropy values, the index highlights not only areas where unusual events arouse negative emotions but also where negative emotions persist.

Third, this study has practical value as it provides a daily updated index for monitoring changes in emotional well-being over time. The Social Sensing Index detects not only spikes in negative emotions but also cities where these sentiments linger. The lengthening of the band of negative emotions over time may indicate an increased risk of developing more severe mental illnesses. This is crucial because a prolonged negative sentiment after specific events can indicate a broader, collective emotional response that may necessitate intervention. Public health professionals can leverage this index as a tool for monitoring the mental health of a population and for promptly identifying emerging mental health concerns. This proactive approach can help prevent the escalation of mental health issues by allowing early recognition, intervention, and implementation of effective strategies to support the well-being of the population. Furthermore, educational institutions can act on this information by recognizing that students in certain cities may need extra help or resources because they continue to feel these strong emotions. By looking at these patterns, public health experts and educators can choose when to run programs that promote mental health, so that issues are addressed when they matter most.

Using social media data to generate a social sensing index for mental health has important implications for understanding and addressing mental health issues at the city level. In conclusion, the index generated in this study can be a valuable tool for monitoring the mental health of a population and for planning appropriate interventions.