Introduction

Social media has become a fruitful source for social science research in our increasingly connected world. The growing number of users on all platforms (Facebook, Twitter, etc.) creates new communication networks every day; thus, it is not surprising that topics such as protests, elections, and polarization have been analyzed within online environments. In this study, we use the digital footprint of Twitter to track the social stages of information sharing in the aftermath of natural disasters.

Specifically, this work explores communication dynamics to understand the evolution of information sharing in the aftermath of large-scale upheavals, extending to social media the offline framework of social stages defined by Pennebaker [1]. Pennebaker is a pioneer in linguistic inquiry, with a strong emphasis on using word counts and dictionaries (i.e., LIWC). His social stage model [1] presents a theoretical framework based on the rate (count) of talking people maintain in the aftermath of large-scale upheavals. Being a pioneer in automated sentiment analysis, Pennebaker also speculated about the role of emotions in such situations, leading to questions about their transmission and how the process evolves considering the dependencies between emotions over time. Armed with much more data and new tools, we revisit this research agenda and explore how modern computational methods can contribute.

We collected data from social media for three independent earthquakes: Southern California, USA, on July 5, 2019; Oaxaca, Mexico, on June 23, 2020; and the Aegean Sea, Turkey, on October 30, 2020. The assessment of emotions was conducted using Natural Language Processing tools. The analysis was completed by applying a combination of change point analysis, analysis of variance, time series clustering, and Granger-causality tests. As a result, we aim to develop a deeper understanding of the social information sharing process and communicated emotions on social media, which can provide valuable information for disaster management efforts and the general knowledge of social response following large-scale upheavals.

Literature review

Toward a social media stage model in the aftermath of large-scale upheaval

To explain how individuals personally deal with the process of coping in the aftermath of traumatic events, different stage models have been proposed [2,3,4]. In essence, stage models establish that every person faces a progression of phases through time to handle the trauma. For example, Kübler-Ross [2] argued that individuals evolve through five stages in the aftermath of experiencing the death of someone close: denial, anger, bargaining, depression, and acceptance. However, these psychological stage models for individuals do not consider that human beings also face responses to traumatic events in a social context. Undoubtedly, the emotional communication process after a traumatic event is shaped by people’s social world. This reasoning inspired Pennebaker [1] to propose a model of social temporal stages to understand the communicational dynamics in the aftermath of large-scale upheavals.

To elaborate his model, Pennebaker surveyed people in the aftermath of two traumatic events, the earthquake of Loma Prieta in 1989 that struck the San Francisco Bay Area and the beginning of the Gulf War in 1991. He asked the participants about the number of conversations and thoughts related to the incidents. The resulting model comprised three social stages: emergency, inhibition, and adaptation (For details, see Fig. 9.4 in Pennebaker [1] (p. 216)). The emergency phase is characterized by heightened anxiety and an elevated level of reported talks and thoughts about the event. The inhibition phase shows a drop in discussions about the issue and a more constant number of thoughts. Finally, the adaptation phase reveals a return to normalcy with a low activity level related to the incident, signaling that the event is psychologically over for most community members.

The crisis and risk communication field has also proposed models with different stages to describe communication processes in the aftermath of large-scale upheavals [5,6,7]. The two types of communication differ in that risk communication addresses events that could become a crisis but have not yet reached that point. According to the study of Spence et al. [8], a stage model that can be relevant to the study of social media technology is the model proposed by Fink [5]. Fink’s model divides a “crisis life cycle” into four stages. First, the prodromal stage comprises a period of buildup with hints and clues about an impending crisis. Second, the acute stage is produced by a trigger event. During this stage, damage can be caused to vulnerable publics or organizations. Third, in the chronic stage, the reputation of organizations or communities can suffer for a period in which they struggle to return to normalcy. Finally, the termination stage corresponds to when the crisis resolves and the original situation becomes irrelevant to the actors involved. Another model used in social media studies and crisis communication is the Crisis and Emergency Risk Communication model (CERC) [9, 10]. The CERC model [6] assumes that crises develop in predictable ways evolving from risk to crisis, recovery, and evaluation. It comprises five stages: pre-crisis, initial, maintenance, resolution, and evaluation. Compared with Fink’s model, the CERC model seems more comprehensive because it considers the need to educate the public about the risks in the pre-crisis stage and adds an evaluation stage that allows for assessing responses, including communication effectiveness.
In contrast with Pennebaker’s model, crisis and risk communication approaches describe the lifecycle of a critical event until it is resolved, while Pennebaker studies the dynamics of conversations and thoughts right after the event, not focusing on the managerial aspect of disasters.

Building on Pennebaker’s work, the first goal of this study is to analyze to what extent the model replicates in social media. Pennebaker [1] suggests that “thinking and talking about a trauma tend to dissipate at different rates over time” (p. 215). As such, the sharing process is characterized by an emergency phase with a high number of conversations, followed by inhibition and adaptation phases in which the quantity of discussions drops. By using digital footprint data, we focus on the rate of talking of the model to answer the following:

RQ1

Can we identify different stages of social information sharing on social media in the aftermath of a catastrophic event?

Emotions and large-scale upheavals in social media

The quantitative formalization of Pennebaker [1] focused first and foremost on different stages in terms of the “rate of thoughts and talking” (p. 216). We can only speculate that this reasoning might be linked to his methodological focus on the word count (i.e., LIWC). However, he also reflects on the role of feelings during “these emotion-laden interviews” [1] (p. 208). In a review of the psychological meaning of words, Tausczik and Pennebaker [11] establish that “language is the most common and reliable way for people to translate their internal thoughts and emotions into a form that others can understand” (p. 25). As such, emotions conveyed in messages play an essential role in transmitting ideas by mediating personal cognitive information understanding and interactive social behaviors of individuals [12].

Collective events such as natural disasters, protests, or terrorist attacks are relevant cases of social sharing of emotions. In these events, people face mutual experiences that they share later with others in conversational situations [13, 14]; the more intense the emotion a person experiences, the more likely they are to speak about it [15]. Furthermore, when mass and social media get involved in the coverage of collective events, they elicit a spread of social sharing of emotions in every direction, reminiscent of what happens in a nuclear reactor [15].

Previous studies analyzing social media during critical events have shown that individuals appear to use these platforms more for affective display than information seeking [16,17,18]. Thanks to the availability of information on social media and the development of computational methods for text analysis, sentiment analysis, which classifies text based on its positive or negative valence, has produced many studies on the aftermath of natural disasters. These studies include, but are not limited to, the automatic processing and classification of sentiment after catastrophic events [19,20,21,22], identification of crisis-related information for disaster management [23, 24], understanding the role of emergency responders and organizations [25,26,27], or determining the origin of events based on users’ posts and geolocated data [28, 29]. Even though the concepts of sentiment and emotion are sometimes treated as equivalent, in this work we consider emotion analysis as the classification of messages in different emotional categories based on previous theoretical definitions, while sentiment analysis is the assessment of the valence (positive, negative, or neutral) of the content.

Regarding the expression of emotions in the aftermath of large-scale upheavals, it has been shown that users exhibit sympathy for people affected by the event, share personal experiences [30], and praise people and organizations that provide support and help [31]. Moreover, people use social media to cope with traumatic experiences. For example, studying Facebook in the aftermath of typhoon Haiyan, Tandoc and Takahashi [32] found collective coping strategies that help alert family and friends about survival, develop a social construction of the experience, and manage feelings. Similarly, Nilsen et al. [33] detected that survivors of terror attacks use online environments to mourn publicly and perform symbolic actions. Social media can be considered platforms where positive emotions contribute to emotional bonding, support-seeking, and therapeutic channels in the aftermath of critical events [34].

Considering the importance of emotions in the information sharing process in the aftermath of large-scale upheavals, we would like to expand the comprehension of Pennebaker’s model by studying the expressed emotions across the model stages. Therefore, we formulate a second research question:

RQ2

Can we distinguish different stages of communicated emotions on social media in the aftermath of a catastrophic event?

RQ2 describes the communicated emotions using Pennebaker’s model, yet it is also of interest for this work to understand the dynamics of emotions in the process. Previous studies on how emotions evolve in the aftermath of large-scale upheavals have shown mixed results. For example, Spence et al. [8] found that relevant information becomes less prevalent during a crisis, and messages predominantly express negative emotions later. In comparison, Garcia and Rimé [35] observed that individuals could change from negative to positive expressions of comfort and support in the aftermath of terrorist attacks. Considering these results, we want to understand if an emotion X can influence a later surge of an emotion Y (X → Y). Moreover, if such influence happens in one stage of the model and a new Y → Z effect follows it in the next stage, it would be possible to establish the chain X → Y → Z to describe a stage model of emotions that presents emotional influence through time. We analyze this scenario by exploring a third research question.

RQ3

Is it possible to identify the influence of emotions through stages in the aftermath of a large-scale upheaval? Does this influence generate chains of emotions?

Events of study

In this study, we have focused on one type of natural disaster: large earthquakes with a magnitude greater than or equal to 7.0 Mw on the moment scale. By focusing on earthquakes, we can compare our analysis with one of the natural disasters used by Pennebaker when he created his model. Moreover, a triplicated analysis allows us to understand commonalities and differences between the same type of disasters when studying Pennebaker’s model and avoid the omnipresent threat of the replication crisis [36, 37].

The first event corresponds to an earthquake of magnitude 7.1 Mw that struck Ridgecrest, California (LA), on July 5, 2019, at 8:19 pm (UTC-7). This earthquake was the most powerful in the state in 20 years. Its effects were perceived across many areas in California, parts of Arizona, and Nevada.

The second natural disaster incorporated in this study is the earthquake that struck the state of Oaxaca, Mexico, on June 23, 2020, at 10:29 am (UTC-5) with a magnitude of 7.4 Mw. This event also caused damage to hundreds of houses in the affected area, where at least ten deaths were reported.

Finally, the third earthquake we analyze in this work occurred in the Aegean Sea, Turkey, on October 30, 2020, at 2:51 pm (UTC + 3) with a magnitude of 7.0 Mw. Even though the Greek island of Samos was the closest to the epicenter, the Turkish city of Izmir was the most affected. It has been estimated that more than 700 residential structures were damaged or destroyed, and at least 117 people died in the Turkish province of Izmir.

Method

Applying a combination of computational methods to process social media data using time series analysis, we analyzed data from three independent events with similar characteristics to replicate our work. First, we coded emotional expressions for five emotions using a deep neural network system for semantic analysis. Then, time series of the number of messages and each emotion were created to study the temporal structure of the data and the relationship and evolution between them. Next, to analyze RQ1, we determined the stages in the communicated emotion cycle using a change point analysis procedure. Later, we performed ANOVA and clustering analyses to differentiate the stages of communicated emotions to answer RQ2. Finally, to examine the existence of a chain of emotional stages of RQ3, we employed the framework of Granger causality for time series.

Data collection

We collected three datasets from Twitter following the large earthquakes of LA, Mexico, and Turkey described previously. For the first two cases, the data collection was performed using Twitter API v1.1, while for the third case, Twitter API v2.0 was used.

Considering our interest in information shared on a social scale, we used historical archives of Twitter [38, 39] to identify the most relevant trending topics related to the events in the country where the earthquakes struck. We selected the first and second most relevant keywords for our data collection from the trending topics identified for each event. As a result, we gathered the following keywords: for the LA case, #EarthquakeLA and #californiaearthquake; for the Mexico case, #Mexicoearthquake and #cdmsismo; and for the Turkey case, #Izmir and #deprem. From this initial request using the Twitter APIs, we filtered out all the tweets classified as undetermined language by the metadata provided by Twitter. We constrained our analysis to 24 h after the events. All the tweets resulting from this process were incorporated into the assessment of emotions in the following step.

Classification of emotions

Tweets’ emotions were analyzed using the IBM Watson Natural Language Understanding (NLU) system [40]. We selected NLU because it uses deep learning to extract semantic features and metadata from texts. Its emotions module has been trained to detect the presence of anger, disgust, fear, joy, and sadness [41]. Moreover, NLU has been demonstrated to be an effective text analysis tool used in different studies with Twitter data [42, 43]. Its results have also been shown to outperform those of similar available systems [44]. Finally, following the validation process for automatic content analysis advised by Grimmer and Stewart [45], Hilbert et al. [46] validated the NLU emotions module against human coders.

The NLU emotions module only processes text in English; therefore, we used Google Translate to translate all the non-English tweets in our datasets. Google Translate has been shown to be a viable and accurate tool for translating non-English language [47, 48], making this tool suitable for our study. Once the translation process was completed, we removed all the tweets that Google could not translate. The final step before assessing the emotions of the tweets in NLU was to remove the web links in the text. As NLU uses semantic analysis, eliminating words such as stop words from the tweets was unnecessary.
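The link-removal step described above can be illustrated with a short Python sketch. The paper does not say how links were detected, so the regular expression, the function name `strip_links`, and the sample text are all assumptions made for illustration only.

```python
import re

# Hypothetical pattern (not from the paper): matches http(s) links and bare "www." links.
URL_PATTERN = re.compile(r"(https?://\S+|www\.\S+)")


def strip_links(text: str) -> str:
    """Remove web links and collapse the whitespace they leave behind."""
    without_links = URL_PATTERN.sub("", text)
    return " ".join(without_links.split())


# Made-up example text, used only to show the call.
example = "Strong shaking felt here https://t.co/abc123 stay safe"
print(strip_links(example))
```

Stop words are deliberately left in place, in line with the statement that removing them is unnecessary for a semantic analysis system.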

For the assessment of emotions, NLU assigns values between 0 and 1 to the presence of anger, disgust, fear, joy, and sadness. These five emotions are known as basic emotions in the literature on the categorical classification of emotions [49]. After processing the level of emotions for each case of study, we obtained \({N}_{LA}=144{,}095\), \({N}_{MX}=24{,}679\), and \({N}_{TR}=286{,}115\) tweets for the LA, Mexico, and Turkey earthquakes, respectively.

Considering that joy is the only positive emotion given by NLU and that we are working with natural disasters, we reviewed tweets with joy in more detail. Assuming that tweets with high levels of joy show a more accurate measurement of that emotion, from each dataset, we extracted the set of tweets with joy greater than or equal to 0.75. From the lists created, we manually examined a random sample of 100 tweets for each case to understand the NLU assessment of joy and how it relates to other positive emotions. We based our review of the relationship between joy and other positive emotions on the work of Hu et al. [50], which found that joy correlates significantly with amusement, hope, inspiration, and interest. Following the procedure of Hu et al. [50], we used the definitions of amusement, hope, inspiration, and interest in Fredrickson [51] for manually re-coding the sample of tweets extracted. The results of the recodification process showed that the NLU assessment of joy could be reclassified as interest and hope 28% and 32% of the time, respectively. The case of amusement is interesting because, in the LA earthquake, it represents up to 42% of the recodification, while its percentage is negligible for Turkey. Finally, inspiration does not represent relevant percentages for any earthquake (< 10%). Considering these results, we decided to change the denomination of “joy” to “interest/hope” when analyzing positive emotions related to earthquakes in this work (for simplicity, we refer to it as interest in the text). We understand that the dynamics of positive emotions are more complex. Still, the name redefinition also makes sense when classifying emotions in the bi-dimensional space of valence and activation [52], where joy, interest, and hope have positive valence with low activation levels.

Time series

Stage models explain how the variables of interest evolve in time, and time series analysis demands equidistant bins in time. Considering the amount of data and its origin from social media posts, we binned tweets into one-minute timeframes, which provide an adequate tradeoff between sample size and granularity. We adopted this timeframe because smaller timeframes (seconds) generated many data points with no information, and longer timeframes (> 1 min) summarized a considerable amount of information at the initial stages of the process. Moreover, as time series analysis requires the use of lagged information in time, a timeframe of one minute seems adequate because it can be interpreted as a standard measure of time in social media consumption. The resulting time series had 1440 consecutive data points (24 h × 60 min). When there was a whole minute without information retrieved, a data point with zero values was added to the database. We added 0 data points for the case of LA, 156 for Mexico, and 2 for Turkey. We created seven time series for each case based on the one-minute binning. All five emotions (anger, disgust, fear, interest, and sadness) were assessed for each tweet, and scores were summed within one minute. The resulting group of time series quantifies the total intensity of each emotion within the one-minute timeframe. Additionally, we calculated two more time series, one counting the number of tweets per minute, representing the total of messages shared about the event, and another with the sum of all five emotion scores, representing the total emotional intensity. Table 1 presents a summary of the datasets.
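As a rough illustration of the binning just described, the following Python sketch builds the seven per-minute series (five emotions, tweet count, and total emotional intensity) and leaves empty minutes at zero. The input layout (a timestamp paired with a dictionary of scores), the function name `bin_by_minute`, and the sample values are assumptions; the paper does not describe its data structures.

```python
from datetime import datetime, timedelta

EMOTIONS = ("anger", "disgust", "fear", "interest", "sadness")


def bin_by_minute(tweets, start, minutes=1440):
    """Return one row per minute with the tweet count and summed emotion scores.

    tweets: iterable of (timestamp, scores), where scores maps emotion -> value in [0, 1].
    Minutes without tweets keep zero values, mirroring the zero-filling in the text.
    """
    rows = [{"count": 0, **{e: 0.0 for e in EMOTIONS}} for _ in range(minutes)]
    for timestamp, scores in tweets:
        index = int((timestamp - start).total_seconds() // 60)
        if 0 <= index < minutes:  # keep only the 24 h window
            rows[index]["count"] += 1
            for emotion in EMOTIONS:
                rows[index][emotion] += scores.get(emotion, 0.0)
    for row in rows:
        row["total_emotion"] = sum(row[e] for e in EMOTIONS)
    return rows


# Made-up tweets relative to a hypothetical event time.
start = datetime(2019, 7, 5, 20, 19)
tweets = [
    (start + timedelta(seconds=10), {"fear": 0.8, "sadness": 0.1}),
    (start + timedelta(seconds=50), {"fear": 0.4}),
    (start + timedelta(seconds=130), {"interest": 0.6}),
]
rows = bin_by_minute(tweets, start)
print(rows[0]["count"], rows[1]["count"], rows[2]["count"])
```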

Table 1 Summary of datasets for the three earthquakes studied

Analytical procedure

Pennebaker’s social sharing model refers to the number of times people talk and think about the preceding events. Using social media data, we can only focus on the talking rate as we cannot access people’s thoughts.

To analyze RQ1, following Pennebaker’s reasoning, the aggregated variable of the number of tweets collected for each point in the time series in our data represents how much people talk online about the event. Stage models are based on changes that happen over time. Still, because of the social nature of information sharing in the aftermath of large-scale upheavals, it seems unlikely that the extension of the stages is the same for all events. However, if an underlying process exists, its structure should change similarly. We performed a changepoint analysis in the time series to explore the existence of structural changes asked in RQ1. Changepoint analysis identifies points within the data where statistical properties change [53]. Methodologically, it is linked to the logic of stationarity, common in econometrics [54], which demands that general statistics do not change within the time series. In other words, the dynamic persists over time. If the basic statistics change, we can distinguish different ‘stages’ within the data. We used the library changepoint in R to compute the changepoints based on a change in the variance for the time series (for details, see Sect. 5 of Supplemental Material).
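To make the idea of a change in variance concrete, here is a simplified Python sketch that finds a single changepoint by comparing Gaussian segment costs. It is not the procedure of the R changepoint package used in the study, which can detect multiple changepoints; the cost function, the `min_size` parameter, and the synthetic series are assumptions chosen for illustration.

```python
import math


def segment_cost(values):
    """Gaussian negative log-likelihood of one segment, up to additive constants."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    # Guard against log(0) for perfectly flat segments.
    return n * math.log(max(variance, 1e-12))


def best_variance_changepoint(series, min_size=5):
    """Return the split index that minimises the summed cost of the two segments."""
    best_index, best_cost = None, float("inf")
    for tau in range(min_size, len(series) - min_size + 1):
        cost = segment_cost(series[:tau]) + segment_cost(series[tau:])
        if cost < best_cost:
            best_index, best_cost = tau, cost
    return best_index


# Synthetic series: 30 high-variance points followed by 30 low-variance points.
series = [10 * (-1) ** i for i in range(30)] + [(-1) ** i for i in range(30)]
print(best_variance_changepoint(series))
```

Applying the same search recursively to each resulting segment would give a crude multi-stage split, although a penalty term would then be needed to avoid over-segmentation.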

After identifying stages based on the number of tweets, we examined RQ2, which asks if it is possible to distinguish between different stages of communicated emotions. We started the analysis with a descriptive approach to showcase the level of each emotion in different stages. The initial process allowed us to visualize the most prevalent emotions for the different stages and commonalities among cases. Then, we used two analytical methods to explore RQ2 in more detail: a differential and a relational analysis. First, we analyzed the differences of means using ANOVA with the post-hoc Tukey HSD test to determine if there are significant pairwise differences in the average emotional intensity within stages (for details, see Sect. 6 of Supplemental Material). Second, we looked at the relationship between emotions considering the time structure because even if the average intensity between emotions is not statistically different, it could be that those emotions do not group within stages. For grouping emotions, we used hierarchical clustering for the time series. This model-free approach allows us to measure the proximity between time series considering the closeness of their values at specific points in time [55]. We completed the following procedure to find the clusters within the stages found in RQ1. Initially, we normalized the time series for each emotion. Then, we computed a metric of dissimilarity based on the autocorrelation function (ACF) with the TSclust package in R. We used ACF because it represents the correlation between a time series and a lagged version of itself or other series; therefore, it accounts for the time structure of the data. Finally, the hierarchical clusters were computed using a complete agglomeration method implemented in R. The number of optimal clusters (\(k=2\)) was determined using the silhouette coefficient (for details about the clustering process, see Sect. 7 of Supplemental Material).
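The clustering step can be sketched in Python as follows. The sketch assumes a plain Euclidean distance between autocorrelation vectors and a hand-written complete-linkage agglomeration; the exact TSclust options used in the study may differ, and the lag count, function names, and synthetic series are illustrative only.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation at lags 1..max_lag (series must not be constant)."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / denom
        for lag in range(1, max_lag + 1)
    ]


def acf_distance(a, b, max_lag=10):
    """Euclidean distance between the ACF vectors of two series."""
    return math.dist(acf(a, max_lag), acf(b, max_lag))


def complete_linkage(series_by_name, k=2, max_lag=10):
    """Merge the closest clusters until k remain, using the farthest-pair distance."""
    clusters = [[name] for name in series_by_name]

    def linkage(c1, c2):
        return max(
            acf_distance(series_by_name[x], series_by_name[y], max_lag)
            for x in c1 for y in c2
        )

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Synthetic series: two smooth signals and two rapidly alternating ones.
data = {
    "sadness": [math.sin(t / 10) for t in range(100)],
    "interest": [math.sin(t / 10 + 0.5) for t in range(100)],
    "fear": [(-1) ** t for t in range(100)],
    "anger": [2 * (-1) ** t for t in range(100)],
}
clusters = complete_linkage(data, k=2)
print(clusters)
```

Because the autocorrelation is unaffected by rescaling, series that differ only in magnitude end up at distance zero, which is why the grouping reflects temporal structure rather than intensity.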

To examine the influence between emotions through time and the existence of a chain of emotions for RQ3, it is necessary to determine the dependence between time series. We use the Granger causality test [56] to assess the predictive structure between emotions. In this case, the term ‘causality’ can be a bit misleading; even so, it is commonly used, and Granger, who introduced the concept, received the 2003 Nobel Prize in Economics. In essence, Granger causality tests if a time series \({Y}_{t}\) is useful to forecast another time series \({X}_{t}\) by comparing the model that contains \({Y}_{t}\) and \({X}_{t}\) versus another that only contains \({X}_{t}\). Therefore, it does establish directionality but is silent on the deeper question of a causal mechanism. However, it reveals how much one variable allows predicting another. The assumptions to perform Granger causality tests were restrictive in the original formulation [56]; a strong assumption was that the set of time series must be stationary. A less restrictive extension to test for Granger causality was elaborated later by Toda and Yamamoto [57], which allows the use of non-stationary and cointegrated time series to perform the test. The procedure to perform the causality test in this work incorporates the following steps: test time series stationarity using the Augmented Dickey-Fuller and KPSS tests; determine the maximum order of integration by fitting an ARIMA model; define the number of lags for each model by examining the Akaike information criterion (AIC), Bayesian information criterion (BIC), and Hannan-Quinn information criterion (HQ); review the stability and correlation of the residuals in the model; and, finally, perform the test using the Toda-Yamamoto procedure. We followed the implementation of Lukito [58] in R to test Granger causality using Toda-Yamamoto (see Sect. 8 of Supplemental Material).
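The model comparison behind the test can be written out explicitly. The following is a standard textbook formulation rather than the exact specification estimated in this work; the symbols \(c\), \({a}_{i}\), \({b}_{i}\), the lag order \(p\), and the maximum order of integration \({d}_{max}\) are notation introduced here for illustration.

```latex
% Granger causality from Y to X: compare a restricted and an unrestricted model.
\[
\begin{aligned}
\text{Restricted:}\quad & X_{t} = c + \sum_{i=1}^{p} a_{i} X_{t-i} + u_{t} \\
\text{Unrestricted:}\quad & X_{t} = c + \sum_{i=1}^{p} a_{i} X_{t-i} + \sum_{i=1}^{p} b_{i} Y_{t-i} + \varepsilon_{t} \\
H_{0}:\quad & b_{1} = b_{2} = \dots = b_{p} = 0
\end{aligned}
\]

% Toda-Yamamoto: fit p + d_max lags, but test only the first p coefficients of Y.
\[
\begin{aligned}
X_{t} &= c + \sum_{i=1}^{p+d_{max}} a_{i} X_{t-i} + \sum_{i=1}^{p+d_{max}} b_{i} Y_{t-i} + \varepsilon_{t} \\
H_{0}&:\quad b_{1} = b_{2} = \dots = b_{p} = 0
\end{aligned}
\]
```

Rejecting \({H}_{0}\) indicates that past values of \({Y}_{t}\) improve the forecast of \({X}_{t}\). In the Toda-Yamamoto version, the additional \({d}_{max}\) lags are estimated but left out of the Wald test, which is what allows the series to be non-stationary or cointegrated.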

Results

RQ1: stage model

RQ1 asks if we can identify a stage model in the communication process on social media in the aftermath of a catastrophic event. In Pennebaker’s model [1], the variable used to describe the social process was the number of times people talked about the event. In this study, the variable representing how much people talked about the event corresponds to the number of tweets (count) posted online.

To find the structural changes in the social talking process on social media, we analyze the variance structure of tweets count. Using changepoint analysis, we split the time series into chunks with time frames of stable variance. The structural changes we found are also used to separate the time series of the different emotions for subsequent analysis. We based this decision on the fact that the time series of emotions were created as the sum of emotions in tweets for each minute; therefore, a high and positive correlation between the time series of tweets count and emotions should be expected. As a reference, the smallest correlations for the LA and Mexico cases were between tweets count and anger, with values of r (1438) = 0.97, p < 0.001, and r (1438) = 0.94, p < 0.001, respectively, while for the Turkey earthquake, the smallest correlation occurred between tweets count and disgust, r (1438) = 0.95, p < 0.001. These results show that the number of tweets is a good proxy whose structural changes can be used to divide the emotions’ time series (correlation results are available in Sect. 4 of Supplemental Material).
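For readers who want to see how the reported statistic is formed, this small Python sketch computes a Pearson correlation and the associated degrees of freedom, which for a 1440-point series equal 1440 − 2 = 1438, matching the r (1438) notation above. The function names and the toy data are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def degrees_of_freedom(n_points):
    """Degrees of freedom reported alongside r for n paired observations."""
    return n_points - 2


# Toy data: a count series and an emotion series that is roughly proportional to it.
counts = [120, 95, 60, 40, 22, 15]
anger = [30.5, 24.0, 16.2, 9.8, 6.1, 3.9]
print(round(pearson_r(counts, anger), 3), degrees_of_freedom(1440))
```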

Figure 1 shows the different timeframes defined by the changepoints and their intersection with Pennebaker’s model. The LA case is divided into five timeframes, while the Mexico and Turkey cases are in six. The review of variance reveals that LA (\({SD}_{LA,1}=260.44, {SD}_{LA,2}=63.55, {SD}_{LA,3}=28.29,{SD}_{LA,4}=13.50, {SD}_{LA,5}=13.89\)) and Mexico (\({SD}_{MX,1}=39.88, {SD}_{MX,2}=6.81, {SD}_{MX,3}=5.26,{SD}_{MX,4}=4.41, {SD}_{MX,5}=3.31, {SD}_{MX,6}=1.28\)) maintain a structure that decreases variability, whereas Turkey (\({SD}_{TK,1}=68.55, {SD}_{TK,2}=76.47, {SD}_{TK,3}=41.25,{SD}_{TK,4}=45.68, {SD}_{TK,5}=27.86,{SD}_{TK,6}=17.84\)) is more variable. To map the results of the change point analysis onto Pennebaker’s model, we rely on the characteristics of the stages proposed by the author. First, the emergency phase is characterized by a high number of reported talks about the event. In our analysis, we observe in the three earthquakes that after the peak in the number of tweets, a changepoint is detected when the quantity and variability of tweets decrease; therefore, by Pennebaker’s model, we identify this initial period as the emergency. Second, to differentiate between inhibition and adaptation phases, the original definition says that the former shows a drop in the level of discussions while the latter returns to normalcy with low activity levels. Based on the timeframes defined by the changepoints, we noticed a decline after the emergency phase and, at the end of the process, a constant activity level in our data. To separate between the inhibition and adaptation phases, we looked at the data’s linear trends, noting that small slopes with low activity levels and variability respond to the definition of adaptation. Thus, we incorporate the adaptation phase timeframes with the slopes of linear trends close to 0.
In the case of LA, timeframes four (\({m}_{LA,4}=-0.36\)) and five (\({m}_{LA,5}=-0.03\)), for Mexico timeframes four (\({m}_{MX,4}=0.01\)), five (\({m}_{MX,5}=-0.01\)), and six (\({m}_{MX,6}=0.00\)), and for Turkey, timeframes five (\({m}_{TK,5}=0.02\)) and six (\({m}_{TK,6}=-0.06\)) were part of the adaptation phase. The previous definition left two timeframes in the inhibition phase for each case. Considering that the earthquake in Turkey shows a different behavior in the first part of the inhibition phase, we divided this phase into two.
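The slope criterion can be illustrated with a least-squares trend computed per timeframe. This Python sketch is only an approximation of the idea: the paper does not state a numeric cutoff for "close to 0", so the `tolerance` argument is a hypothetical parameter that the caller must choose, and the sample timeframes are made up.

```python
def linear_trend_slope(values):
    """Ordinary least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    denominator = sum((t - mean_t) ** 2 for t in range(n))
    return numerator / denominator


def has_flat_trend(values, tolerance):
    """True when the absolute slope is within a caller-chosen (hypothetical) tolerance."""
    return abs(linear_trend_slope(values)) <= tolerance


# Made-up tweets-per-minute values for two timeframes.
declining = [40, 36, 31, 27, 24, 18]
stable = [9, 10, 9, 11, 10, 9]
print(linear_trend_slope(declining), linear_trend_slope(stable))
```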

Fig. 1

Changepoint analysis and Pennebaker’s model

At first sight, changepoints for the cases of LA and Mexico line up quite nicely, while the case of Turkey seems to differ. A closer look suggests that this difference is not in intensity but rather in the length of the stages. Figure 2 shows that the drop in the number of tweets from the average of the emergency stage to the average of the consecutive stage is between 30 and 60%, with a decline of another 32–48% from the second to the third stage, and another 70–80% from the third to the fourth stage. While the progression of the Turkish case is still distinct from the strong alignment of the LA and Mexico cases, the general tendency corroborates Pennebaker’s proposal.
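The percentages discussed above are relative drops between the mean tweet counts of consecutive stages. A minimal Python sketch of that arithmetic follows; the stage means in the example are invented to show the calculation and are not the values behind Fig. 2.

```python
def percent_reduction(stage_means):
    """Percentage drop from each stage mean to the next one."""
    return [
        100.0 * (previous - current) / previous
        for previous, current in zip(stage_means, stage_means[1:])
    ]


# Invented stage means (tweets per minute) for four consecutive stages.
example_means = [200.0, 100.0, 60.0, 15.0]
print(percent_reduction(example_means))
```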

Fig. 2

Percentage of reduction in the average number of tweets between consecutive stages

Methodologically, we conclude that our replication suggests a pattern with some variation, especially concerning the length of the stages. Therefore, for this work’s subsequent analysis, we divided the stages into emergency, inhibition A, inhibition B, and adaptation (see labels in Fig. 2).

Our changepoint analysis reveals that information sharing on social media in the aftermath of catastrophic events resembles Pennebaker’s model, with a rapidly increasing beginning followed by steadily decreasing communication intensity. However, while all three cases show similar patterns in terms of intensity, our data suggest that these stages do not necessarily have the same length. Still, we can differentiate them in terms of their average differences and variability between stages.

RQ2: stage model of communicated emotions

We perform a differential (ANOVA) and relational analysis of emotions (clustering) within the identified stages to analyze how emotions change over time in the aftermath of a traumatic event. First, we identify the importance of each emotion by determining the percentage they represent in each stage and their similarities on average. Second, using the ACF, we grouped emotions in clusters according to their similarity.

The relative percentages for each emotion in the stages of the social sharing process, displayed in Fig. 3, indicate that sadness and interest are predominant in all cases. Moreover, when positive and negative emotions are aggregated and compared, positive expressions represent 33% on average.
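As a small worked example of these relative percentages, the sketch below computes the share of each emotion within one stage and the resulting positive share. The scores are hypothetical, and treating interest as the only positive emotion follows the valence classification used in the Discussion; the function name `emotion_shares` is an assumption of the example.

```python
def emotion_shares(stage_scores):
    """Percentage that each emotion represents within one stage."""
    total = sum(stage_scores.values())
    return {emotion: 100 * score / total
            for emotion, score in stage_scores.items()}


# Hypothetical aggregated emotion scores for a single stage.
scores = {"sadness": 35, "interest": 30, "fear": 15, "anger": 12, "disgust": 8}
shares = emotion_shares(scores)

# Interest is the only positive-valence emotion in this set.
positive_share = shares["interest"]
negative_share = sum(v for k, v in shares.items() if k != "interest")
print(shares, positive_share, negative_share)
```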

Fig. 3 Proportion of emotions by stages

In Fig. 4, the ANOVA’s post-hoc Tukey HSD test for pairwise comparisons reveals that, for the LA earthquake, the means of sadness and interest are not statistically different in any stage. Moreover, the LA case shows no significant differences between anger and fear across the process, between disgust and fear during the first three stages, or between disgust and anger during the final three stages. The LA case can therefore be characterized by two groups of emotions with similar means: sadness and interest on the one hand, and fear, anger, and disgust on the other. In the case of Mexico, the means of disgust and anger are not statistically different in any stage. Sadness and fear do not differ at the beginning, and the same holds for fear and anger in the last two stages. For Mexico, then, fear, anger, and disgust are similar, while interest is not similar to any other emotion. The case of Turkey shows only two instances in which emotions are not statistically different: disgust and fear in the initial stage, and sadness and interest in the inhibition B stage. Overall, the descriptive analysis shows a common pattern that separates the most expressed from the least expressed emotions. Sadness and interest are the most expressed emotions and show similar mean levels throughout the process, whereas fear, anger, and disgust are expressed less and show occasional similarities in their means (details about the ANOVA and Tukey HSD test for each stage are given in Sect. 6.5 of the Supplemental Material).
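The omnibus step that precedes the pairwise comparisons can be sketched as a one-way ANOVA F statistic computed within a single stage. The scores below are hypothetical and in arbitrary units, the function name is an assumption of the example, and the Tukey HSD post-hoc step (which requires the studentized range distribution) is not reproduced here.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA across groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variability of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical emotion scores within one stage.
stage_scores = {
    "sadness": [5, 6, 7],
    "fear": [2, 3, 4],
    "anger": [1, 2, 3],
}
f_stat = one_way_anova_f(list(stage_scores.values()))
print(f_stat)
```

A large F value indicates that at least one emotion mean differs within the stage; the pairwise test then identifies which pairs are not statistically different.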

Fig. 4 Tukey HSD test with a pairwise comparison of emotion means within stages for the three earthquakes

The previous analysis gives an initial understanding of what emotions are more relevant on a relative scale. Considering that we are working with time series, we also incorporate information from the time structure. The ACF is a measure that helps to understand how the present value of a time series is related to its past values. To describe the process, we use hierarchical clustering to group emotions based on their ACFs.
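An ACF-based dissimilarity of the kind used as input for the clustering can be sketched as follows. The Euclidean distance between ACF vectors, the number of lags, and the two series are assumptions of this example rather than the exact configuration of our analysis.

```python
def acf(series, max_lag):
    """Sample autocorrelation of a series at lags 1..max_lag."""
    n = len(series)
    m = sum(series) / n
    dev = [x - m for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + lag] for t in range(n - lag)) / denom
            for lag in range(1, max_lag + 1)]


def acf_distance(a, b, max_lag=3):
    """Euclidean distance between the ACF vectors of two series."""
    ra, rb = acf(a, max_lag), acf(b, max_lag)
    return sum((x - y) ** 2 for x, y in zip(ra, rb)) ** 0.5


# Hypothetical series: a steady trend and an alternating pattern.
trend = [1, 2, 3, 4, 5, 6, 7, 8]
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
rescaled_trend = [2 * x + 1 for x in trend]

d_different = acf_distance(trend, alternating)
d_same = acf_distance(trend, rescaled_trend)
print(d_different, d_same)
```

Because the ACF is invariant to shifts and rescaling, two emotions with very different mean levels can still be close under this measure, which is why the clustering can group emotions that the ANOVA separates.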

Figure 5 presents the results of the hierarchical clustering method. In the LA earthquake, the emotions representing the highest percentages (sadness, interest) are closely grouped during the emergency stage. Later, they remain part of the same cluster, but their association also includes anger and disgust. Anger groups with every other emotion at some stage of the process. Fear becomes a separate branch in the final two stages of the process. Disgust is initially associated with fear and anger and later with sadness. Compared with the previous analysis, we can observe that the relationship between sadness and interest also has a component of autocorrelation in time. In contrast, fear is a more independent emotion that does not cluster with others during the inhibition B and adaptation stages. This last result suggests that emotional expressions are not focused on alarming feelings in a geographical area known for facing many earthquakes.

Fig. 5 Hierarchical clustering of emotions by stage. Black and grey colors represent different clusters. The dissimilarity measure was based on ACF, and the optimal number of clusters (\(k=2\)) was determined using the silhouette coefficient

The case of Mexico also shows a close relationship between sadness and disgust, which are grouped closely during the inhibition and adaptation stages of the process. Moreover, they form a separate cluster in the adaptation period, which is the longest stage. Interest, the emotion most expressed according to the descriptive analysis, evolves from an isolated cluster to an association with fear. Compared with the ANOVA, we observe that, even though interest was significantly different in terms of mean, it develops associations with other emotions when the ACF is considered. As for sadness, we notice that it can be grouped with disgust toward the end of the process. The sadness-disgust relationship was also observed in the LA case in the adaptation stage.

Lastly, for the Turkey earthquake, similar to Mexico, we observe that disgust and sadness are part of the same cluster throughout the process, although they are not always the most similar emotions. Interest, important in terms of mean, is isolated in its own branch of the dendrogram during the two final stages, forming its own cluster. In the final stages, we observe a separation between positive and negative emotions, which may indicate differences in emotional expression between places accustomed to earthquakes, such as LA and Mexico, and other areas where these events seem to create a more significant social impact.

Overall, the hierarchical clustering results reveal that general patterns across stages are difficult to define. However, we observed that sadness and disgust show a significant association across all the cases, which is consistent with the share of the total that these emotions represent. An interesting contrast is that while interest is a more isolated emotion in the aftermath of the Turkey earthquake, it has more significant associations with other emotions in the LA and Mexico cases. Conversely, fear, which is more separated in the LA case, forms different clusters in the Turkey case.

RQ3: chain of emotional stages

To analyze the existence of a chain of emotions, we study whether emotions extracted from social media messages can produce predictable sequences. The relationship X → Y represents the shortest possible sequence, in which an emotion X influences the expression of emotion Y. When this predictive relationship does not exist, we can assume that emotion Y is better explained/predicted only by its own past. To establish a chain of emotions in Pennebaker’s work, we need to find a sequence that evolves through the stages in the model. We fitted vector autoregressive (VAR) models to the emotion time series for each of the stages and then ran Granger causality tests to determine the relationships. Table 2 shows the statistically significant results (for details about all Granger causality test results and the lags used in each test, refer to Tables 3, 4, 5, and 6 in the appendix).
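The logic of testing X → Y can be sketched with a simplified bivariate Granger F test, which compares a model of Y on its own past with a model that also includes the past of X. This is an illustration on synthetic data, not the full multivariate VAR specification used in the study; the series names, the lag of 1, and all helper functions are assumptions of the example.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rss(rows, y):
    """Residual sum of squares of an OLS fit of y on the given rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def granger_f(x, y, lag=1):
    """F statistic for the hypothesis that past x helps predict y."""
    times = range(lag, len(y))
    target = [y[t] for t in times]
    # Restricted model: intercept plus own lags of y.
    restricted = [[1.0] + [y[t - j] for j in range(1, lag + 1)] for t in times]
    # Unrestricted model: additionally includes lags of x.
    full = [row + [x[t - j] for j in range(1, lag + 1)]
            for row, t in zip(restricted, times)]
    rss_r, rss_u = rss(restricted, target), rss(full, target)
    dof = len(target) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic series in which "interest" leads "anger" by one step.
rng = random.Random(0)
interest = [rng.gauss(0, 1) for _ in range(200)]
anger = [0.0] + [0.8 * interest[t - 1] + 0.1 * rng.gauss(0, 1)
                 for t in range(1, 200)]

f_interest_to_anger = granger_f(interest, anger)
f_anger_to_interest = granger_f(anger, interest)
print(f_interest_to_anger, f_anger_to_interest)
```

A large F statistic in one direction and a small one in the other corresponds to a single arrow X → Y; a chain of emotions would require such arrows to link up across stages.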

Table 2 Granger causality tests results

Granger causality tests do not show a common pattern among the three cases; thus, we cannot determine the existence of a chain of emotions. Another outcome is that the emotion that most frequently predicts others is interest in the LA and Mexico earthquakes, whereas, for Turkey, the most predictive emotion is sadness. These results match the ACF clustering analysis, where interest becomes isolated in the Turkey case. On the other hand, the emotion most often predicted by others is sadness in the case of LA, while anger is the most frequently predicted emotion for Mexico and Turkey. Overall, emotions with low activation levels (sadness, disgust, interest), regardless of their valence, are the most relevant for predicting other emotions. Conversely, on most occasions, the predicted emotions present high activation levels (anger, fear). Only two sequences of three emotions are present through different stages of the social sharing process: fear → sadness → disgust in the two final stages of the LA case, and fear → interest → fear in the initial stages of Mexico. We can also notice that relationships between emotions appear more frequently in the adaptation phase, predicted primarily by interest and sadness.

Conclusions

Our findings suggest that, even though the duration of the stages varies between cases, the structural changes in the information shared on social media in the aftermath of catastrophic events allow the stages of Pennebaker’s model to be identified. Two of our three cases are impressively similar, while our third case warns about the possibility of deviation in further replications. After analyzing emotions, we notice that sadness and interest are the two emotions most expressed online. They have different valence (negative and positive, respectively) but a similarly low activation level. Furthermore, we observe significant differences between the types of emotions shared in the US and Mexico cases in contrast with Turkey. Finally, the search for influence between emotions shows that interest and sadness are good predictors of other emotions in the long run (adaptation stage).

Discussion

New data in old theoretical bottles

Our findings demonstrate that the rate of talking in Pennebaker’s offline model can be extended to social media. Moreover, this extension was made by linking the change between stages to structural changes in time series data. We also noted that the duration of the emergency, inhibition, and adaptation phases was different for each case. Still, the stages were identifiable, which shows that what matters is the change in intensity between them rather than a fixed duration. This outcome is in line with previous analyses of stage models at the personal level. For example, studying people who faced the death of a spouse, child, or other tragedies, Wortman and Silber [59] determined that only about 30% of the participants processed their trauma according to stage models, while most showed variations from the theoretical models. The same authors reconfirmed these results years later when they revisited their conclusions [60].

In the case of Turkey, we observe that the inhibition phases were longer than in the other cases; looking in more detail, the amount of activity during the inhibition stage generates a surplus compared to the other cases. A possible explanation is that in LA and Mexico, because of their closeness to the Pacific and Cocos tectonic plates, people are more used to experiencing earthquakes and therefore tend to talk less about them. In contrast, Turkey is part of Europe, which does not experience earthquakes often. In fact, in the datasets of LA and Mexico, we find tweets mainly in English and Spanish, but for Turkey, in addition to English and Spanish, we also find Latvian, Romanian, Portuguese, French, and, obviously, Turkish, which shows a broad geographical interest within Europe in talking about the event. Another explanation might be the saturation of the network. Using the number of retweets as a proxy for information saturation, we notice that the average number of retweets for the Turkey case during inhibition A and B together is \(M_{TK} = 11.22\), while LA and Mexico have means of \(M_{LA} = 38.25\) and \(M_{MX} = 17.02\), respectively.

Our emphasis on replication showed that the Turkey case deviates from the pattern of LA and Mexico. While the analysis of three cases seems helpful in terms of replication, it is important to note that it is still far from being representative in terms of statistical regularity; therefore, stages of different lengths should be expected. The addition of dozens of future cases might lead to the emergence of a larger picture, taking this case replication study toward large-scale statistical regularity.

What emotions tell us

The study of emotions presents interesting outcomes. We observed that interest and sadness are the most expressed emotions in all cases across the stages of Pennebaker’s model. Considering the two-dimensional decomposition of emotions into their fundamental activation and valence components [52], sadness and disgust express negative valence with low activation, fear and anger transmit negative valence with high activation, and interest conveys positive valence with low activation. The analysis of interest and sadness shows that a low activation level is relevant in social communication in the aftermath of earthquakes, regardless of valence. Moreover, interest and sadness are good predictors of other emotions, especially in the model’s adaptation phase. A possible explanation is that more active emotions, such as anger and fear, are more intense right after earthquake shocks and that, even though they subsequently diminish in intensity, they can be predicted by less active emotions. This prediction might result from frustration in the population after expressing sadness and interest for an extended interval. In line with this rationale, it has been shown that emotions like fear are relevant at the initial stages of a natural disaster [8]. At the same time, sadness typically presents messages of compassion that attract either positive-compassionate or angry responses [61].

The presence of positive emotions in the aftermath of catastrophic events has been associated with demonstrations of kindness, prayers, gratitude, and hero-praising, among others [31, 34, 61]. We find similar expressions in our dataset, for example, LA: “No damage, here. You all good? #earthquake #EarthquakeLA #californiaearthquake”, MX: “Life puts us right moments of tension to value and enjoy those moments of happiness, tranquility, and well-being #sismocdmx #CDMX #sismo”, or TK: “For those who cannot enter their house, free soup will be served in the garden of BUENAS BISTRO in Bornova during the night #earthquake #izmir #canimizmir Aegean Sea.” However, when we compare the dynamics of interest using the cluster and Granger causality analyses, we notice that in the case of Turkey, interest evolves without grouping with or predicting other emotions. This finding reinforces that positive feedback loops exist in social conversations about critical topics [61, 62].

In the broader context of crisis communication, our analysis can be situated in the initial stage of a crisis and can shed light on actions to implement during this period. For example, we observe that after a disaster occurs, Twitter trending discussions are intense. Still, they tend to disappear fast, a circumstance that disaster management authorities should consider if one of their goals is to spread information through this platform. Regarding emotions, in the last stages of the process, sadness and interest are predictors of emotions such as fear or anger. We argue that these emotional expressions hours after the main event can result from frustration. Then, as the CERC model proposes, rapid communication interventions should be established to reduce uncertainty and address this kind of emotional turmoil [6] because, as research suggests, formal leaders play a relevant role in helping people interpret disruptive events [63].

Complementary methods

Regarding the methodological analysis, we observe that imposing stricter requirements on the relationships between emotions yields a better understanding of the phenomenon, but with some obstacles.

Initially, using the ANOVA and post-hoc test of pairwise comparisons to evaluate general statistics of emotions gave us an initial descriptive approach to understanding our data. Later, hierarchical clustering with the ACF as a measure of dissimilarity provided a straightforward procedure to generate associations between emotions without the need to verify assumptions. However, because time structures are complex, the clustering results cannot reveal how the correlation between lagged versions of the time series, including information from each emotion’s past, represents interactions between them. Finally, a more detailed result was obtained using VAR models to test Granger causal structures. Yet, as with other statistical models, the complication lies in fitting a stable model that satisfies the assumptions needed to draw correct statistical inferences. For example, the analysis of residuals for the LA and Mexico cases in the adaptation phase shows a slight deviation from normality, but the fitted model was stable; therefore, we used the results of the Turkey case to compare and review the consistency of the results in the other two cases. We consider that by mixing classic differential analysis, model-free clustering, and model-based analyses, we were able to complete our study with different and complementary methodologies.

Limitations

This study also has limitations. First, we collected information from the most relevant hashtags about the events; therefore, we missed information related to the issue when it did not contain the hashtag, and we did not reach the full extent of the conversation on social media. Moreover, how Twitter generates trending topics is a black box we cannot access. It is impossible to tell how many people posted content about the issue because it was already popular or because they were concerned about the subject. Second, Pennebaker’s model was created with evidence from human beings sharing their experiences. In social media environments, it is known that behind the user profiles there are not only humans but also automated accounts (bots) publishing content. As we did not eliminate bots, our data is partly contaminated with information produced by them. However, when users access the hashtag interface on Twitter, there is no evident distinction between content posted by people and content posted by bots. Third, to assess emotions in the data collected, we used IBM Watson NLU, a Machine Learning as a Service (MLaaS) tool. The main characteristic of MLaaS tools is that they are offered by cloud computing providers. For this reason, NLU is not an open-source tool; as such, we cannot know how it works. Fourth, unlike other natural disasters, earthquakes are not a single event; a series of aftershocks follows them. We did not incorporate a time series of the aftershocks, which could have clarified nuances in the different dynamics of the process.

Summing up

We have shown the extension of a theoretical offline social model to online communication using observational data from social media. To do so, we used three datasets to achieve consistency and avoid the replication problem. We also shed light on how emotions evolve and relate to one another when people talk in the aftermath of large-scale upheavals, and on how quickly the process develops, within a few hours. Our results provide better emotional context for disaster management authorities in charge of crisis communication. Future directions of this work can include the use of different types of large-scale upheavals, such as other natural disasters, terrorist attacks, or breaking news.