Introduction

The human approach to risk is a complex and multifaceted issue. Psychosocial research has shown that, to better understand how it functions, it is crucial to investigate the psychological and cultural factors involved, such as individual characteristics, emotions, feelings, psychological distance, denial, and moral disengagement [1,2,3,4,5,6]. These factors include psychosocial processes that can be expressed in natural language as a means of communication [7] and provide insight into how individuals perceive and respond to risks. This assumption underlies, for example, sentiment analysis in its many applications [8], the identification of psychosocial processes beyond emotions/emotional states, such as morality in Linguistic Inquiry and Word Count (LIWC) [9, 10], and even behavior patterns [11]. The massive amount of data available on social media presents an opportunity to study how these processes are manifested in language in a worldwide dataset.

However, automatically identifying psychosocial processes in text is very complex and remains an open challenge due to the ambiguous and context-dependent nature of human language. Two main approaches exist for this purpose: dictionary-based methods and machine learning-based methods. Dictionary-based methods are the most widely used in the field of social psychology. In these methods, the presence or absence of a process is given by an index based on the occurrence of words from a predefined list that together define the process. Although they have proven useful, their limitations arise from an inability to capture the nuances of language use in context, leading to a coarse-grained identification of psychosocial processes. In contrast, machine learning-based methods have demonstrated the ability to overcome these limitations [12]; however, they require significant domain-specific knowledge and effort to develop a training set that allows automatic identification of the processes [11, 13].
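The dictionary-based approach can be illustrated with a minimal sketch (in Python for brevity; the word list below is a hypothetical example, not one of the validated dictionaries used in the cited literature):

```python
# Minimal sketch of a dictionary-based index: the score for a text is the
# share of its tokens found in the word list operationalizing the process.
import re

# Hypothetical mini-dictionary for a "denial" process (illustrative only).
DENIAL_WORDS = {"hoax", "fake", "invented", "lie", "nonsense"}

def dictionary_index(text: str, lexicon: set) -> float:
    """Proportion of tokens in `text` that appear in `lexicon`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in lexicon)
    return hits / len(tokens)

score = dictionary_index("The coronavirus is a hoax, a complete lie", DENIAL_WORDS)
# 2 dictionary words out of 8 tokens -> 0.25
```

Such an index treats every occurrence identically regardless of context, which is exactly the coarse-graining limitation discussed above.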

This paper proposes a model that operationalizes the psychosocial processes defining stance towards risk, uses machine learning to capture their specific features, and examines how they are expressed in natural language, providing a training set for their automatic classification on social media. The model categories for text classification focus on moral disengagement, distancing, and denial in relation to environmental and health risks (climate change and the Covid-19 pandemic).

Psychosocial processes related to stance towards risk

The following paragraphs describe certain psychosocial processes which, according to pertinent literature, are relevant to risk perception and provide a more or less effective coping mechanism in the face of risk events. They are processes of moral disengagement, psychological distance (construal level theory), and states of denial. The decision to focus on these psychosocial processes over others—linked nonetheless to perception of risk and willingness to act—is based on the fact that these in particular concern the role of individuals and their acknowledgement or otherwise (at different levels) of the risk in question (summarized here as risk stance). This enables their joint consideration and comparison.

Processes of moral disengagement

Moral disengagement [5] is grounded in the agentic perspective of social cognitive theory. It refers to the mechanisms through which people selectively disengage their moral self-sanctions from their detrimental behavior. In general, it defines the role of the individual with respect to an event or an issue, in this instance a risk. According to the author, various processes work together at both individual and social levels, and can be identified based on where the moral self-sanctions are focused: on behavior, agency, or effect [5]. When the focus is on behavior, the processes are moral, social, and economic justification, used to attribute honorable purposes to practices regarded as harmful but in pursuit of a just end (from a moral, social, or economic point of view). Similarly, the process of advantageous comparison frames an issue as the lesser of two evils, making it not only acceptable but even morally right. When the focus is on agency, the process involved is the displacement of responsibility, that is to say, obscuring or understating one's true agentive role in causing harm: the agentive role is attributed to someone else. Finally, when the focus is on effects, moral disengagement is accomplished by downplaying or discrediting evidence of harm (disregarding, minimizing, and disputing detrimental effects).

As regards risk, mechanisms of moral disengagement have been studied in the context of terrorism or environmental sustainability [5], and of risk behaviors related to health [14, 15]. Several studies show that engaging in these processes leads to justifying one's ineffective risk management behavior, for example not observing prevention rules during the Covid-19 pandemic [15] or being unwilling to make changes that could mitigate the harm inflicted on future generations in the case of environmentally degrading practices [5].

Construal-level theory of psychological distance

The process of psychological distance formulated within the Construal Level Theory [16, 17] is used to understand how people represent and manage a concept that cannot be experienced in the here and now. The reference point for psychological distance “is the self in the here and now, and the different ways in which an object might be removed from that point—in time, in space, in social distance, and in hypotheticality—constitute different distance dimensions” [17]. According to the authors, the process of abstracting an object or event (or a risk, in this instance) involves both a simplification of and an attribution of new meanings to that object or event, related to the cognitive processes involved in the selfsame process (e.g., categorization). The ability to abstract and thus imagine an object or event is needed if one is to manage it, and thus orient oneself toward prediction, evaluation, and behavior [16]. Thus, imagining an object as more or less distant determines one’s position relative to the object in terms of both space and disposition and, consequently, coping with it. In this sense, psychological distance is related to risk stance. Four dimensions of psychological distance are identified, related to each other but considered as distinct [17]. The temporal dimension refers to when an event occurs (near or remote in time). The spatial dimension refers to where it occurs (near or remote in space); the social dimension, to whom it occurs (to me, or someone else from a social group to which I do or do not belong). Finally, the hypothetical dimension refers to whether or not an event occurs. So, the shorter the distance, the more likely its imagined occurrence; the greater the distance, the less likely.

The link between psychological distance, risk perception, and willingness to act when faced with such risk has been widely studied. Many studies have focused on climate change [18], others more recently on the Covid-19 pandemic [19]. Research has shown that perceiving a risk as distant can be linked to low concern and readiness to act [4, 19]. Recent experimental studies have shown the importance of reducing psychological distance to promote effective responses and concern over a risk. For example, Jones et al. [20] show that climate communications framed in such a way as to reduce psychological distance can increase public engagement with climate change. According to Chu & Yang [21], distance framing influenced the readiness of people to evaluate—and react to—climate change messages: distancing the impact of a phenomenon would increase ideological polarization, whereas reducing the distance has the opposite effect.

States of denial

According to Cohen [22, 23], the process of denial involves cognitive, emotional, and action components. Cohen classifies denial into three types. The first is literal denial: the assertion that something did not happen or is not true. This form aligns with the dictionary definition and corresponds to non-acknowledgment of facts. The second is interpretive denial, in which the facts are acknowledged but given a different meaning from what others perceive. The third, implicatory denial, acknowledges the facts but denies psychological, political, or moral responsibility, implying an inability to do anything about the situation.

The above-mentioned forms of denial have been referenced in the literature, by the same author [23] and others, mainly in the context of studies on atrocities and war crimes. However, the denial construct has been studied extensively in relation to risk in the field of social psychology, representing perhaps one of the best-known and most intuitive antecedents of ineffective coping in the face of risk events. Risk denial has been extensively studied in relation to climate change [6, 24], health [25, 26] and, more generally, science denial [27,28,29].

Toward an integrated model of psychosocial processes defining stance towards risk

Stance classification and detection is a frequently explored area in Natural Language Processing, often entailing the categorization of attitudes towards the object of study as positive, negative, or sometimes neutral. However, the operationalization of stance often varies across research studies, resulting in a lack of consistency [30]. Currently, there is no unique, standard method for operationalizing this construct, let alone one that incorporates a psychosocial perspective. Consequently, using stance classification systems designed for other purposes in a specific study can pose challenges. This underscores the importance of clarifying the theoretical perspective guiding the operationalization of the construct. As previously explained, in this paper the construct of risk stance refers to the role of individuals and their acknowledgment or lack thereof of the risk in question. It encompasses the psychosocial processes described above, namely moral disengagement, psychological distance (construal level theory), and states of denial, as explained below.

All the above-mentioned psychosocial processes were investigated in the cited studies, mainly using validated measurement scales for the construct. There have been only a few attempts to identify some of these processes in language. For example, Snefjella and Kuperman [31] show that abstractness (references to geographically more distant cities, or chronologically more remote times in the past or future) can be a proxy for geographic and temporal distance in language. However, there is no systematic guidance on how to identify all these processes in relation to risk in natural language. The first step in pursuing this purpose is to create a reference model that defines the categories pertaining to the psychosocial processes.

In short, all of the above-mentioned psychosocial processes: (I) concern the acknowledgement or otherwise of the risk; (II) concern one's own role with respect to the risk; (III) are indicative of willingness to act when faced with a risk event; and (IV) can be expressed in natural language. Some processes may be applicable conceptually to multiple theories, and thus overlap. For example, hypothetical distance (i.e., the question of whether or not a risk exists) overlaps with literal denial. Again, the attitudes of disregarding, minimizing, and disputing detrimental effects that typify moral disengagement can also be considered expressions of interpretive denial; likewise the displacement of responsibility and implicatory denial processes, which both question the responsibility of the individual in dealing with the risk.

To create an integrated model that allows the identification of stance towards risk from a psychosocial perspective, the processes indicated above have been systematized, highlighting the tendency to overlap and the specific features of each one (Table 1). In the model, stances towards risk imply acknowledgement or otherwise and role of the individual with respect to the risk, and have been conceptualized as different stances in a continuum from consciousness to denial (Co-De).

Table 1 Macro and micro categories of the risk Co-De model and the corresponding socio-psychological process

The model is composed of four macro-categories containing nine micro-categories (i.e., the psychosocial processes). At the first pole of the continuum, there is Consciousness, i.e., acknowledgement that the risk in question exists. Next in sequence is the Justification macro-category. This category implies acknowledgment of the risk, but the disinclination to deal with it (i.e., one’s own role) is justified through three different processes (a unique label was applied where there are several overlapping processes from different theories, as summarized in Table 1): Moral, Social, and Economic Justification; Advantageous Comparison; and Displacement of Responsibility. The third macro-category, Distance, implies acknowledgement of a risk, albeit one that does not concern the person or related groups since it is considered remote in terms of time (Temporal Distance), space (Geographical Distance), or affected population/group (Social Distance). At the opposite end of the continuum to Consciousness, finally, there is Denial, which indicates lack of awareness (i.e., no reason to take up any risk-aware stance) and comprises two micro-categories: Interpretive Denial, in which the risk is defined differently from “what is generally perceived”; and Literal Denial, in which the very existence of the risk is called into question.

Testing the risk Co-De model

The fundamental concept underlying this study is that if psychosocial processes can be articulated using language, then they can be discerned in natural language. The challenge lies in determining whether it is feasible to automate the identification process that an expert would employ. If the risk Co-De model suggested here is effective, it is expected that the distinct categories, namely the psychosocial processes, can be distinguished through the use of machine learning, because these processes/categories should be characterized by specific combinations of linguistic features that can be automatically identified by machine learning algorithms. To test the model's effectiveness in automatically identifying the proposed categories (i.e., the psychosocial processes), the steps outlined below were followed (Fig. 1).

Fig. 1 Representation of the steps followed to test the model

The first step was to create a training set: a set of texts (tweets in this case) pertaining to the issue in question, namely risk, labeled according to the proposed categories. Tweets pertaining to Covid-19 and climate change were collected (Phase 1, Fig. 1), then selected and labeled manually according to the risk Co-De model categories (Phase 2, Fig. 1). Each category was then explored to observe its peculiarities as reflected in co-occurrences of the most used nouns, adjectives, and (sometimes) verbs, and in terms of prototypical tweets classified as a whole (Phase 3b, Fig. 1). Finally, starting from the training set, machine learning algorithms were trained to test the effectiveness of the risk Co-De model, i.e., whether, on the basis of the previously classified tweets, it would be possible to identify recurring linguistic features that would in turn allow automatic classification according to the risk Co-De model. To this end, the corpus constituting the training set was divided into training and validation sets to see whether different algorithms could make accurate predictions with respect to the pre-labeled categories (Phase 3a, Fig. 1).

Data collection

To create the training set, Italian tweets pertaining to risk were gathered using the rtweet R package [32]. Given that the model is concerned with categorizing stance towards risk, the most prominent risks in online discourse at the time of collection were selected, namely the Covid-19 pandemic and climate change, and tweets were collected based on trending and relevant hashtags. In a multi-risk perspective, various risks coexist and interact within the same context [33], and are therefore not necessarily separated in online discourse. In fact, even though some tweets were collected using hashtags related to climate change, these same tweets discuss Covid-19 (and vice versa) and, sometimes, the Russian–Ukrainian conflict. All of these risks coexisted in online discourse in Italy at the time of collection although, as evident from the marked difference in the number of tweets collected, the attention paid to them was not uniform.

The Twitter scraping process resulted in 216,624 tweets related to the Covid-19 pandemic and 6,087 tweets regarding climate change, based on their respective hashtags. The Covid-19 tweets covered two time periods (end of January/February 2020 and March/April 2022), while the climate change tweets spanned March-September 2022. From this initial collection, a subset of 29,304 tweets was selected for manual labeling. Given the disparity in the number of tweets collected for each topic, the sample comprised all 6,087 climate change tweets plus a randomly selected subset of 23,217 tweets related to the Covid-19 pandemic. This sample was then analyzed and categorized manually according to the risk Co-De model, resulting in a final corpus of 2,381 classified tweets (8.1% of the analyzed sample).

Analyses

Manual labeling and exploration of the categories

The sub-sample of collected tweets was divided into four parts and analyzed manually by four independent expert judges (social and clinical psychologists instructed by the author; Footnote 1). Their task was to label each tweet with the appropriate category from the risk Co-De model, discarding those that were not pertinent. If any of the psychosocial processes could be identified in a tweet, the entire tweet was labeled accordingly; otherwise, it was excluded. The tweets to be analyzed were assigned in several successive stages, to keep track of the ratio of analyzed to attributed tweets and to balance the categories found as far as possible. During the analysis, the judges debated the wording of the categories until they reached agreement. The total set of labeled tweets (2,813) was then rechecked manually by the author and further debated. Among these, potentially ambiguous categorizations (432 tweets) were discarded, and for 274 tweets the labels were re-discussed and changed by mutual agreement. In total, 2,381 tweets were classified, divided into macro- and micro-categories as shown in Table 2.

Table 2 Number of labelled tweets in macro- and micro-category

The labeled tweets constituting the training set corpus were cleaned by removing links and punctuation, and normalized by replacing uppercase with lowercase letters. The entire corpus comprises 67,025 word tokens (total number of running words) and 10,685 word types (distinct word forms). The average number of word tokens per tweet is 28.15.
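The cleaning, normalization, and token/type counting described above can be sketched as follows (a Python illustration of the R pipeline; the example tweets are invented):

```python
import re

def clean_tweet(text: str) -> str:
    """Remove links and punctuation, then lowercase, as done for the training corpus."""
    text = re.sub(r"https?://\S+", "", text)   # strip links
    text = re.sub(r"[^\w\s]", " ", text)       # strip punctuation
    return re.sub(r"\s+", " ", text).strip().lower()

corpus = [clean_tweet(t) for t in [
    "Covid-19 is NOT over! https://example.org",
    "Climate change is real. Act NOW.",
]]
tokens = [w for doc in corpus for w in doc.split()]
word_tokens = len(tokens)        # total running words (11 here)
word_types = len(set(tokens))    # distinct word forms (10 here: "is" repeats)
```

The token/type distinction matters because the ratio between the two gives a rough sense of lexical variety in the corpus.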

To characterize what the categories include, so that the results of the manual analysis could be presented, the most recurrent nouns and adjectives and the associations between them were identified. The R package UDPipe [34] was used to lemmatize the texts. Parts of speech (PoS) were tagged with the pre-trained model PoSTWITA (2022-11-15 v2.11) [35], built on Universal Dependencies treebanks and made up of annotated Italian tweets. Co-occurrences of lemmas (adjectives, nouns, and, where explicative, verbs) were then identified and represented. Co-occurrences make it possible to see how relevant lemmas (selected on the basis of the PoS tagging) are used in the same sentence or next to each other. By observing relevant lemmas in their context of use (i.e., the whole tweet; concordance analysis), it was possible to display relevant tweets classified in each category.
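The co-occurrence counting can be sketched as follows (a Python illustration; the input is assumed to be already lemmatized and PoS-filtered, the lemma list is invented, and the 3-word window matches the distance used in the figures):

```python
from collections import Counter

def cooccurrences(lemmas: list, window: int = 3) -> Counter:
    """Count unordered lemma pairs appearing within `window` words of each other."""
    pairs = Counter()
    for i, a in enumerate(lemmas):
        for b in lemmas[i + 1 : i + 1 + window]:
            if a != b:  # skip self-pairs
                pairs[tuple(sorted((a, b)))] += 1
    return pairs

# Toy lemma sequence after lemmatization and PoS filtering (invented example).
pairs = cooccurrences(["climate", "change", "irreversible", "climate", "crisis"])
# ("change", "climate") and ("climate", "irreversible") each occur twice.
```

In the paper these counts feed the network plots, where edge thickness is proportional to pair frequency.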

Model training

Starting from the training set corpus (cleaned and normalized, but not lemmatized), the training procedures designed to test the risk Co-De model were performed. The training is presented in detail only for the four macro-categories, but was also conducted for the micro-categories (Footnote 2), following the same procedure; however, the micro-categories still contain too few classified texts to allow reliable analyses.

The analyses carried out in this paper start from a vectorial representation of texts commonly used in bag-of-words approaches, that is to say, a numerical representation of a set of text documents, typically beginning with a document-term matrix (DTM). In this representation, each document corresponds to a row of the matrix, and each unique term present across all documents constitutes a column. The DTM representation allows documents to be compared on the basis of the words they share, as in the case of the co-occurrences calculated to explore the categories. Starting from more complex matrices, e.g., by replacing the columns (i.e., the terms) with other text features, such as linguistic features, further representations of texts can be obtained. To this end, before employing machine learning techniques to test the effectiveness of the risk Co-De model, the corpus constituting the training set was turned into a document-feature matrix, as accomplished in previous studies with the same final aim of classification through machine learning [36, 37].
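A document-term matrix of the kind described can be built in a few lines (an illustrative Python sketch; real pipelines use a text-mining package, and the two example documents are invented):

```python
from collections import Counter

def document_term_matrix(docs: list):
    """Build a DTM: one row per document, one column per unique term
    across all documents; cells hold raw term counts."""
    vocab = sorted({w for d in docs for w in d.split()})
    rows = []
    for d in docs:
        counts = Counter(d.split())
        rows.append([counts.get(w, 0) for w in vocab])
    return vocab, rows

vocab, dtm = document_term_matrix([
    "the virus is a hoax",
    "the climate crisis is real",
])
# 8 unique terms; both rows share the columns "the" and "is".
```

Replacing the term columns with other feature columns (n-grams, in the next subsection) turns this DTM into the document-feature matrix used for training.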

The procedure, implemented in R, uses the document representation known as Authors Multilevel N-gram Profiles (AMNP) [37], originally developed for authorship attribution; here it is applied to the attribution of the categories that constitute psychosocial processes. The underlying assumption of the procedure is, as already intimated, that the corpus can be represented by a matrix of documents (in this application, text chunks made up of merged tweets) by features. The AMNP procedure combines information (i.e., features) from different linguistic levels (semantic, syntactic, morphological, phonological), taking into account the complementary nature of the information at character and word levels: instead of considering a single type of feature, it uses a combined vector of both character and word n-grams of different sizes. This solution has proven very efficient compared to a single n-gram feature [37].

The aim of applying machine learning is to find out whether the process of abstraction employed by experts in identifying psychosocial processes in tweets can be replicated automatically. This involves identifying recurring linguistic features, and combinations thereof, that can be used to classify tweets automatically according to the categories of the risk Co-De model. The underlying assumption is that if a process (i.e., a category) is expressed in language, it can be represented as a combination of linguistic features. Therefore, by representing the text as a document (text chunk)-feature matrix, machine learning can discern how this expression occurs by detecting the features and their combinations in the given categories, and then identify them automatically.

The training set corpus was divided into the four macro-categories of the risk Co-De model by merging the tweets of each category and dividing the resulting text into chunks of equal size (400 words). As tweets are particularly short and heterogeneous texts, performance has been shown to improve when homogeneous samples of merged tweets are used [37].
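The chunking step can be sketched as follows (a Python illustration; the paper does not specify how a trailing remainder shorter than 400 words was handled, so this sketch simply drops it to keep chunks homogeneous):

```python
def chunk_words(merged_text: str, size: int = 400) -> list:
    """Split the merged tweets of one category into equal chunks of `size`
    words; a trailing remainder shorter than `size` is dropped here."""
    words = merged_text.split()
    return [" ".join(words[i:i + size])
            for i in range(0, len(words) - size + 1, size)]

# 1000 synthetic words -> two full 400-word chunks (200-word remainder dropped).
chunks = chunk_words(" ".join(["word"] * 1000), size=400)
```

Each resulting chunk, not the individual tweet, is the unit of observation in the document-feature matrix.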

Accordingly, the training set corpus was divided into text chunks of 400 words, the unit used for identifying linguistic features. The linguistic features comprise the 400 most frequent word unigrams, the 400 most frequent word bigrams, the 400 most frequent character bigrams, and the 400 most frequent character trigrams (corresponding to the different linguistic levels). These features were extracted from the text chunks and arranged in a matrix of 163 rows (the text chunks; Footnote 3) by 1600 columns (the linguistic features). A normalized frequency was calculated for all features to avoid text-length bias in subsequent calculations, and features with low variance were excluded to reduce noise in the classification process.
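The feature extraction can be sketched in simplified form (a Python illustration of the logic only; the AMNP implementation cited in [37] is in R, and the toy "chunks" below are far shorter than 400 words):

```python
from collections import Counter

def top_ngrams(chunks, n, unit="word", k=400):
    """The k most frequent n-grams over the whole corpus (word- or character-level)."""
    counts = Counter()
    for chunk in chunks:
        seq = chunk.split() if unit == "word" else list(chunk)
        counts.update(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    return [g for g, _ in counts.most_common(k)]

def feature_row(chunk, features, n, unit="word"):
    """Relative frequency of each selected n-gram in one chunk
    (normalized by the chunk's n-gram count to avoid length bias)."""
    seq = chunk.split() if unit == "word" else list(chunk)
    grams = Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    total = max(sum(grams.values()), 1)
    return [grams[f] / total for f in features]

# Toy corpus of two tiny "chunks" (invented); word bigrams only for brevity.
chunks = ["the risk is real the risk is here", "no risk at all"]
feats = top_ngrams(chunks, n=2, unit="word", k=400)
row = feature_row(chunks[0], feats, n=2, unit="word")
```

In the actual pipeline, four such feature sets (word unigrams/bigrams, character bigrams/trigrams, 400 each) are concatenated into the 1600-column matrix.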

Two machine learning algorithms for classification were then applied: Support Vector Machines (SVM) [38, 39] and Random Forest (RF) [40, 41]. SVMs (discriminative classifiers) fit a margin-maximizing hyperplane between classes; they have been shown to be less prone to overfitting and to generalize well [12]. RF is an ensemble method that builds a multitude of randomized, uncorrelated decision trees and is considered a good choice for social media datasets [12]. Both algorithms were evaluated with fivefold cross-validation: training was conducted on 4/5 of the available training set and the remaining 1/5 was used as a validation set, repeating the process five times with a different set of text chunks held out each time. The final outcome of the cross-validation procedure was the arithmetic mean of the results from the five iterations.
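The fivefold cross-validation mechanics can be sketched as follows (a Python illustration of the splitting logic only; the actual SVM and RF models were trained in R):

```python
def kfold_indices(n_items: int, k: int = 5):
    """Yield (train, validation) index lists for k-fold cross-validation:
    each of the k folds serves exactly once as the validation set."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, validation

# With the paper's 163 chunks and k = 5, each validation fold holds
# 32 or 33 chunks, i.e., 32.6 on average, matching Table 3.
splits = list(kfold_indices(163, k=5))
mean_valid_size = sum(len(v) for _, v in splits) / len(splits)
```

Averaging the per-fold results, as done here for accuracy and the confusion matrix cells, reduces the dependence of the estimate on any single train/validation split.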

Results

Manual labeling and exploration of categories

The exploration of manually categorized tweets provides insight into the abstraction process employed by expert judges to identify how categories are expressed and what they encompass, as well as the characteristics that define these categories.

Consciousness

The category of consciousness primarily comprises tweets that acknowledge the existence of the risk in question, stating the situation and describing what to do (Fig. 2). For instance, a tweet stating “Deforestation in the Amazon and pollution of the sea are irreversibly changing the climate” falls under this category.

Fig. 2 Co-occurrences of nouns and adjectives (lemmas) within sentences in the Consciousness macro/micro-category (3-word distance). The thickness and intensity of the line is proportional to the frequency with which the words appear together. English translation by the author

Justification

In the Justification macro-category there are tweets expressing awareness of the risk, but in a way that suggests there is no need to counter it. In the specific context of the micro-category of Moral, Social, and Economic Justification, it is argued that individuals should accept the risk (e.g., not adhering to the safety measures imposed during the pandemic crisis) to mitigate more significant problems such as the repercussions on the economy (Fig. 3). An example of a tweet classified under this micro-category is the following: “Remove all restrictions on Covid now or it will be economic carnage now that there is a war.”

Fig. 3 Co-occurrences of nouns and adjectives (lemmas) within sentences in the Moral, Social, and Economic Justification micro-category (3-word distance). The thickness and intensity of the line is proportional to the frequency with which the words appear together. English translation by the author

In the micro-category of Advantageous Comparison, apparent disregard for a risk is justified by the existence of other problems seen as more pressing or significant (Fig. 4). An example of a tweet classified under this category is “These idiots then cry panic over the coronavirus. It would be better to control the deaths caused by pollution”, or “Italy: 0 victims per day due to coronavirus, 6 victims per day due to femicide. Open your mind, please. #coronavirus”.

Fig. 4 Co-occurrences of nouns and adjectives (lemmas) within sentences in the Advantageous Comparison micro-category (3-word distance). The thickness and intensity of the line is proportional to the frequency with which the words appear together. English translation by the author

In the Displacement of Responsibility micro-category, the responsibility for doing something to counter the risk is attributed to others, often politicians or, in general, the government (Fig. 5). An example of a tweet classified under this micro-category is “You dedicate a day to the earth because in reality you do nothing but tweet #EarthDay #WorstGovernment #MilitarySpending”.

Fig. 5 Co-occurrences of nouns and adjectives (lemmas) within sentences in the Displacement of Responsibility micro-category (3-word distance). The thickness and intensity of the line is proportional to the frequency with which the words appear together. English translation by the author

Distance

In the Distance category, there are tweets in which the risk is presented as remote from the author of the tweet, in whatever sense. From a temporal perspective (Temporal Distance micro-category), the distance is expressed mainly through statements indicating that the risk is over, such as "at least Covid19 has disappeared". In this category, the passing of the risk (mainly Covid-19 in this case) is expressed through specific verbs such as "end" and "decrease", which have been retained for clarity in the representation of co-occurrences (Fig. 6).

Fig. 6 Co-occurrences of nouns, adjectives, and verbs (lemmas) within sentences in the Temporal Distance micro-category (3-word distance). The thickness and intensity of the line is proportional to the frequency with which the words appear together. English translation by the author

In the Geographical Distance micro-category, the risk is placed elsewhere, affecting someone else somewhere else, for example in China, where the first cases of Covid-19 occurred, or in Africa, in connection with migration flows, in which case the solution would be simply to close airports and seaports (Fig. 7). An example tweet classified under this category is “Covid 19 will never arrive in Italy from China.”

Fig. 7 Co-occurrences of nouns and adjectives (lemmas) within sentences in the Geographical Distance micro-category (3-word distance). The thickness and intensity of the line is proportional to the frequency with which the words appear together. English translation by the author

In the Social Distance micro-category, the risk does not concern the writer but others, belonging to different social categories, such as the elderly or people with health problems, in the case of Covid-19 (Fig. 8). Reference is made to categories considered typically or supposedly vulnerable, as in the tweet “Covid-19 kills the elderly.”

Fig. 8 Co-occurrences of nouns and adjectives (lemmas) within sentences in the Social Distance micro-category (3-word distance). The thickness and intensity of the line is proportional to the frequency with which the words appear together. English translation by the author

Denial

The Denial macro-category includes the micro-category of Literal Denial, reflected in expressions that describe the risk as something invented, and explicitly deny the existence of the risk in question, such as “coronavirus is all a hoax” or “there is no climate crisis”. Words such as “hoax” and “nonsense” feature in this category (Fig. 9). Several frequently occurring words in this category are highly critical of the government (e.g., “criminal government”). In fact, many tweets classified within this category assert that the government has lied about the existence of the risk in question, which—according to the tweets’ authors—does not exist.

Fig. 9 Co-occurrences of nouns and adjectives (lemmas) within sentences in the Literal Denial micro-category (3-word distance). The thickness and intensity of the line is proportional to the frequency with which the words appear together. English translation by the author

The micro-category of Interpretive Denial contains expressions that reinterpret the situation by stating the "real truth", or by minimizing or discrediting the alleged risk through euphemism, irony, etc. For example, the event that defines the risk is framed as a tool serving other purposes, such as SARS-CoV-2 described as a virus created ad hoc to distract people, or climate change used as an excuse for the convenience of others. Consequently, messages contain words such as "truth", or indicators of what is "really" being talked about, for example a (normal) "flu" (Fig. 10). Tweets classified within this category include "Covid19, the biggest weapon of mass distraction ever created: finally, the truth", "Influenza mistaken for pandemic", and "what they secretly want in accordance with the EU: expropriate homes to give them to banks, insurance companies, entities, etc. and reduce everyone to renting via the land registry and green bomb Greta Thunberg, but when will Italians wake up?".

Fig. 10
figure 10

Co-occurrences of noun and adjective lemmas within sentences in the Interpretive Denial micro-category (3-word window). The thickness and intensity of each line are proportional to the frequency with which the words appear together. English translation by the author

Model training

To verify the effectiveness of the risk Co-De model, SVM and RF algorithms were trained on the training-set corpus. For the four macro-categories (Consciousness (C); Denial (De); Distance (Di); Justification (J)), the confusion matrix obtained after training shows an average classification accuracy of 86% with SVM and 70% with RF (Table 3). The final values used for the SVM model were degree = 1, scale = 0.1, and C = 0.25; the final value used for the RF model was mtry = 1600. Training with SVM performs considerably better than with RF (Fig. 11).
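The tuning parameters reported above (degree, scale, C; mtry) suggest a caret-style polynomial-kernel SVM and random forest; the exact pipeline and feature space are not specified here. A hedged sketch of an analogous setup with scikit-learn, using invented placeholder texts and rough parameter equivalents (caret's svmPoly scale corresponds approximately to SVC's gamma, and mtry to max_features), could be:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical labeled text chunks standing in for the four macro-categories
texts = ["risk is real and serious", "it is all a hoax",
         "it only happens elsewhere", "the economy comes first"] * 10
labels = ["C", "De", "Di", "J"] * 10

# Polynomial-kernel SVM with the reported degree=1, C=0.25 (gamma stands in
# for caret's scale); RF with default feature sampling as an illustration
svm = make_pipeline(TfidfVectorizer(),
                    SVC(kernel="poly", degree=1, gamma=0.1, C=0.25))
rf = make_pipeline(TfidfVectorizer(),
                   RandomForestClassifier(n_estimators=500, random_state=0))

# Fivefold cross-validation, mirroring the evaluation in Table 3
for name, model in [("SVM", svm), ("RF", rf)]:
    scores = cross_val_score(model, texts, labels, cv=5, scoring="accuracy")
    print(name, round(scores.mean(), 2))
```

On real data, the per-fold confusion matrices would then be averaged to obtain a table of the form shown in Table 3.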

Table 3 Cross-Validated (fivefold) Confusion Matrix for SVM and RF trained with the four macro-categories. The sum of the text chunks in each column reflects the number of text chunks present in the validation set within the corresponding category and is proportional to the training set. The sum of all cells corresponds to the size of the validation set, i.e., 1/5 of the total text chunks (32.6 chunks out of a total of 163). Each cell reports the mean number of text chunks classified across the five iterations
Fig. 11
figure 11

Comparison of performance of developed models (four categories) with SVM and RF

With SVM, the proportion of correct predictions across all predictions (accuracy) is > 90% in all four categories (Table 4). The F1 score, the harmonic mean of precision and recall, balances how many of the classifier's positive predictions are correct (precision) against how many of the actual positive cases the classifier identifies (recall). The F1 score is highest for the Consciousness category and lowest for the Distance category, because the Distance category is often misclassified into other categories (low recall). It is worth noting that this category has the lowest number of classified tweets (and thus text chunks). With RF the results are similar, but performance is inferior (Table 5).
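The relation between these metrics can be made concrete with a small worked example using hypothetical counts (not figures from Tables 4 or 5):

```python
# Hypothetical counts for a single category
tp, fp, fn = 27, 3, 5  # true positives, false positives, false negatives

precision = tp / (tp + fp)   # share of positive predictions that are correct
recall = tp / (tp + fn)      # share of actual positives that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(round(precision, 2), round(recall, 2), round(f1, 2))
```

A category like Distance, with many false negatives (positives absorbed by other categories), would show exactly this pattern of low recall dragging down the F1 score.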

Table 4 Classification results for SVM (four categories)
Table 5 Classification results for RF (four categories)

The training was also conducted with the micro-categories,Footnote 4 obtaining an overall accuracy of 56% for SVM and 65% for RF, with F1 scores of 0.48 and 0.42 respectively.Footnote 5 The tweets classified in the training set (and thus the text chunks) are probably too few to achieve satisfactory performance with this larger number of categories; it is nevertheless interesting to note that in this case the better overall accuracy is achieved with RF.

Discussion

The purpose of this study is to provide a training set for training machine learning algorithms to identify stance towards risk. It is based on a model (the risk Co-De model) created from psychosocial processes related to risk perception that are described in the existing literature [5, 17, 23]. The risk Co-De model comprises four macro-categories (Consciousness; Justification; Distance; Denial); these can be divided into nine more detailed micro-categories (Consciousness; Moral, Social, and Economic Justification; Advantageous Comparison; Displacement of Responsibility; Temporal Distance; Geographical Distance; Social Distance; Interpretive Denial; Literal Denial).

The premise of the study is that if psychosocial processes can be expressed in language, then they can be detected in language. The risk Co-De model provides a framework for identifying specific psychosocial processes in text, based on their theoretical conceptualization. The challenge lies in determining whether the identification process that an expert would employ can be replicated automatically. Given the complexity of this process, a training set was generated to enable its implementation via machine learning. For this task, machine learning-based approaches have proven more efficient than dictionary-based methods [12], as they are capable of capturing the nuances of language use in context. Therefore, for the development of a new classification system and theoretical model, it was decided to proceed with the former approach. The results show that when considering the four macro-categories, the performance is quite satisfactory, even compared to other studies that have used categories related to psychosocial processes [11, 13], especially with SVM (86% overall accuracy, macro-average F1 score 0.85), confirming the effectiveness and usefulness of the risk Co-De model. This result demonstrates that the categories can be distinguished on the basis of linguistic patterns, and therefore that these processes, as combinations of linguistic features, can be identified automatically in natural language. Moreover, this result highlights the reliability of the manual labelling process and of the creation of the training set.

However, the training set is still too limited to give satisfactory results when considering the micro-categories; consequently, it is not yet possible to train reliably with all nine micro-categories. The unsatisfactory classification is likely due more to certain categories containing few tweets than to the actual content of the tweets. Considerations regarding potential redefinitions of the micro-categories can therefore only be made with a larger training set. The effort required to obtain labeled data for a specific domain and the complexity of the proposed categories must be taken into account. The overall result can thus be considered promising and suggests that the proposed line of work could be pursued further.

Some limitations emerge that must be evaluated in future developments of the study. The data collected for manual tagging is probably too general, as the ratio of items discarded to items classified is very high (only approximately 10% of the tweets were retained). However, since the aim was to create a training set, potentially ambiguous tweets were deliberately discarded to maximize the effectiveness of the training. In addition, the distribution of categories found is not balanced. Although attention was paid to this aspect during manual labeling, in some cases it was not possible to find sufficient classifiable tweets within certain categories (e.g., Temporal Distance). This may be due to the data collection periods, which cover only certain phases of the Disaster Risk Management (DRM) cycle. Collecting data that takes the DRM cycle into account could help in this respect, as different psychosocial processes may or may not be expressed depending on the phase of the cycle. For example, temporal distance from the risk related to the Covid-19 pandemic may be less likely to appear during the crisis phase.

In addition, the dataset is unbalanced towards one type of risk despite the intention to consider a multi-risk perspective. Indeed, during the data collection period, despite the evident coexistence of different risks, attention was focused more on one risk, for example the Covid-19 pandemic compared to climate change, due both to media exposure and to its being more directly experienced [42, 43], which may influence classification. This observation also highlights another important point: the framework underlying the classification system of the risk Co-De model adopts a multi-risk approach. Consequently, the analyzed tweets do not always pertain exclusively to a single risk event, and the training dataset should encompass the various interconnected topics that correspond to these risk events. However, it is important to note that the training set has not yet been tested against a separate test set. Such testing is essential to evaluate potential overfitting and, together with further analyses, for cross-topic validation (where the topics considered, in this case, are different risk events) [44], in alignment with the multi-risk approach that forms the basis of the risk Co-De model.

Finally, with this type of classification, psychosocial processes are treated as mutually exclusive, attributing the presence or absence of the corresponding category to the tweet as a whole. Further developments of the study could consider the simultaneous presence of different processes in the same tweet, labeling not the entire tweet but the specific expression of the process [45]. Despite the limitations identified, the precision and expertise with which the training set was constructed have led to very promising results, which are difficult to find elsewhere in the literature and can be used in further studies.

Concluding remarks

The automatic identification of psychosocial processes in natural language is an open and highly complex challenge. Given the large volume of data available through social media, and the role of the latter in risk perception and management, such identification can be crucial for studying and understanding complex phenomena such as the human approach to risk, as well as for informing policy and building tailored communication. This study presents a model (the risk Co-De model) capable of addressing this challenge, albeit in its simplest form, with four macro-categories. With the aid of the risk Co-De model, it becomes possible to automatically identify different stances towards risk on social media, ranging from consciousness to denial, as expressed in natural language. Now that the functionality of the risk Co-De model has been demonstrated, the training set can be expanded and developed in different languages.